Installing Slurm 20.02.5 on CentOS 7.8 by Building from Source: Master Node | IT运维网

CentOS | yvan | 2020-09-25 | 6 comments

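Introduction to Slurm

Slurm is an open-source, fault-tolerant, highly scalable cluster management and job scheduling system for Linux clusters of any size. It requires no kernel modifications to operate and is relatively self-contained. As a cluster workload manager, Slurm has the following features:
1. It allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for a period of time so that they can perform work.
2. It provides a framework for starting, executing, and monitoring work (usually parallel jobs) on the set of allocated nodes.
3. It arbitrates contention for resources by managing a queue of pending work.
4. It provides tools for job accounting statistics and job state diagnostics.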

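Environment

System: CentOS minimal install; update the packages and the kernel; disable SELinux and the firewall.
Dedicated Slurm account (slurm): use the same ID for this account on the Master and on every Node; ID 200 is suggested.
If the Slurm Master needs to support the GUI command (sview), install the graphical environment (Server with GUI).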
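
The prerequisites above can be sketched as a small shell script. This is only a sketch under stated assumptions: the group name slurm with GID 200, the /sbin/nologin shell and the absence of a home directory are illustrative choices, not something the article specifies. By default the script only prints the commands; set DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/sh
# Sketch of the environment preparation. Prints the commands by default;
# set DRY_RUN=0 (as root) to really run them.

SLURM_ID=200   # ID suggested in the article; must be identical on Master and Nodes

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_host() {
    # Dedicated slurm account with a fixed UID/GID (shell and -M are assumptions).
    run groupadd -g "$SLURM_ID" slurm
    run useradd -u "$SLURM_ID" -g slurm -s /sbin/nologin -M slurm
    # Disable SELinux for the running system and for later boots.
    run setenforce 0
    run sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
    # Stop the firewall and keep it from starting at boot.
    run systemctl stop firewalld
    run systemctl disable firewalld
}

prepare_host
```

Running the same script on each Node keeps the slurm UID consistent across the cluster.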

Slurm Master Installation

0. Install the EPEL repository: yum install -y epel-release && yum makecache

[root@localhost ~]# yum install -y epel-release && yum makecache
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
base                                                                                                                               | 3.6 kB  00:00:00
epel                                                                                                                               | 4.7 kB  00:00:00
extras                                                                                                                             | 2.9 kB  00:00:00
updates                                                                                                                            | 2.9 kB  00:00:00
(1/3): epel/x86_64/updateinfo                                                                                                      | 1.0 MB  00:00:00
(2/3): updates/7/x86_64/primary_db                                                                                                 | 4.5 MB  00:00:01
(3/3): epel/x86_64/primary_db                                                                                                      | 6.9 MB  00:00:02
...... output omitted ......
(5/9): updates/7/x86_64/other_db                                                                                                   | 318 kB  00:00:00
(6/9): updates/7/x86_64/filelists_db                                                                                               | 2.4 MB  00:00:01
(7/9): base/7/x86_64/filelists_db                                                                                                  | 7.1 MB  00:00:03
(8/9): epel/x86_64/other_db                                                                                                        | 3.3 MB  00:00:04
(9/9): epel/x86_64/filelists_db                                                                                                    |  12 MB  00:00:04
Metadata Cache Created

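1. Install the graphical environment needed by the GUI command (sview), then reboot: yum groups install -y "Server with GUI" && reboot    # Installing the GUI package group is enough; the default boot target does not need to be changed.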

[root@localhost ~]# yum groups install -y "Server with GUI" && reboot
Loaded plugins: fastestmirror
There is no installed groups file.
Maybe run: yum groups mark convert (see man yum)
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
Warning: Group core does not have any packages to install.
Resolving Dependencies
--> Running transaction check
---> Package ModemManager.x86_64 0:1.6.10-3.el7_6 will be installed
--> Processing Dependency: ModemManager-glib(x86-64) = 1.6.10-3.el7_6 for package: ModemManager-1.6.10-3.el7_6.x86_64
--> Processing Dependency: libqmi-utils for package: ModemManager-1.6.10-3.el7_6.x86_64
--> Processing Dependency: libmbim-utils for package: ModemManager-1.6.10-3.el7_6.x86_64
...... output omitted ......
  xorg-x11-font-utils.x86_64 1:7.5-21.el7                                      xorg-x11-fonts-Type1.noarch 0:7.5-9.el7
  xorg-x11-proto-devel.noarch 0:2018.4-1.el7                                   xorg-x11-server-common.x86_64 0:1.20.4-10.el7
  xorg-x11-server-utils.x86_64 0:7.7-20.el7                                    xorg-x11-xkb-utils.x86_64 0:7.7-14.el7
  yajl.x86_64 0:2.0.4-4.el7                                                    yelp-libs.x86_64 2:3.28.1-1.el7
  yelp-xsl.noarch 0:3.28.0-1.el7                                               zenity.x86_64 0:3.28.1-1.el7

Complete!

2. Set the hostname: hostnamectl set-hostname slurm-master    # Reconnect to the host for the new name to take effect.

[root@localhost ~]# hostnamectl set-hostname slurm-master
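
Slurm daemons address each other by hostname, so the names set in this step need to resolve on every machine. Where no DNS is available, /etc/hosts entries are one option. The helper below is a hypothetical sketch: the function name add_host_entry and the 192.0.2.x addresses are placeholders, not values from the article.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Appends "IP<TAB>NAME" to FILE unless NAME already appears as a hostname in it.
add_host_entry() {
    file=$1 ip=$2 name=$3
    if grep -Eq "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        return 0   # already mapped, nothing to do
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Usage as root (placeholder addresses, replace with the real ones):
#   add_host_entry /etc/hosts 192.0.2.10 slurm-master
#   add_host_entry /etc/hosts 192.0.2.11 slurm-node1
```

Because the function checks before appending, it can be re-run safely on the Master and on each Node.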

3. Configure the time service and synchronize the clock: CentOS 7 uses Chrony as its time service by default.

[root@slurm-master ~]# systemctl status chronyd.service
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2020-09-25 12:54:49 CST; 8min ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 1048 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 1003 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 1013 (chronyd)
    Tasks: 1
   CGroup: /system.slice/chronyd.service
           └─1013 /usr/sbin/chronyd

Sep 25 12:54:49 localhost.localdomain systemd[1]: Starting NTP client/server...
Sep 25 12:54:49 localhost.localdomain chronyd[1013]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYN... +DEBUG)
Sep 25 12:54:49 localhost.localdomain chronyd[1013]: Frequency -7.864 +/- 1.309 ppm read from /var/lib/chrony/drift
Sep 25 12:54:49 localhost.localdomain systemd[1]: Started NTP client/server.
Sep 25 12:54:55 localhost.localdomain chronyd[1013]: Selected source 119.28.206.193
Hint: Some lines were ellipsized, use -l to show in full.
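
The status output shows that chronyd is running and has selected a source. To check from a script that the clock is actually synchronized, the "Leap status" line of `chronyc tracking` can be inspected. A minimal sketch, assuming that a leap status of "Normal" is a good enough signal; the function name is_time_synced is illustrative.

```shell
#!/bin/sh
# Reads `chronyc tracking` output on stdin.
# Succeeds only if the "Leap status" line reports "Normal".
is_time_synced() {
    grep -Eq '^Leap status[[:space:]]*:[[:space:]]*Normal[[:space:]]*$'
}

# Usage on the Master (requires a running chronyd):
#   chronyc tracking | is_time_synced && echo "clock is synchronized"
```

Munge credentials carry timestamps, so running the same check on the Nodes is worthwhile before starting the Slurm daemons.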

4. Deploy Munge: the Munge version currently installed from the online repositories is 0.5.11.
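
One common way to carry out this step is sketched below; it is not necessarily the author's exact procedure. It assumes the EPEL packages munge, munge-libs and munge-devel and the create-munge-key helper that ships with Munge 0.5.11. The script prints the commands by default; set DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/sh
# Sketch of a Munge deployment on the Master (assumed steps, see lead-in).

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

deploy_munge() {
    run yum install -y munge munge-libs munge-devel
    # Generate /etc/munge/munge.key; the same key must later be copied to every Node.
    run /usr/sbin/create-munge-key
    # The key must be owned by munge and readable by nobody else.
    run chown munge:munge /etc/munge/munge.key
    run chmod 400 /etc/munge/munge.key
    run systemctl enable munge
    run systemctl start munge
}

deploy_munge

# Quick local check once the service is running:
#   munge -n | unmunge
```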

5. Install the required components: yum install -y rpm-build bzip2-devel openssl openssl-devel zlib-devel perl-DBI perl-ExtUtils-MakeMaker pam-devel readline-devel mariadb-devel python3 gtk2 gtk2-devel

[root@slurm-master ~]# yum install -y rpm-build bzip2-devel openssl openssl-devel zlib-devel perl-DBI perl-ExtUtils-MakeMaker pam-devel readline-devel mariadb-devel python3 gtk2 gtk2-devel
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
Package rpm-build-4.11.3-43.el7.x86_64 already installed and latest version
Package 1:openssl-1.0.2k-19.el7.x86_64 already installed and latest version
Package perl-DBI-1.627-4.el7.x86_64 already installed and latest version
Package gtk2-2.24.31-1.el7.x86_64 already installed and latest version
Resolving Dependencies
--> Running transaction check
---> Package bzip2-devel.x86_64 0:1.0.6-13.el7 will be installed
---> Package gtk2-devel.x86_64 0:2.24.31-1.el7 will be installed
--> Processing Dependency: pango-devel >= 1.20.0-1 for package: gtk2-devel-2.24.31-1.el7.x86_64
--> Processing Dependency: glib2-devel >= 2.28.0-1 for package: gtk2-devel-2.24.31-1.el7.x86_64
--> Processing Dependency: cairo-devel >= 1.6.0-1 for package: gtk2-devel-2.24.31-1.el7.x86_64
--> Processing Dependency: atk-devel >= 1.29.4-2 for package: gtk2-devel-2.24.31-1.el7.x86_64
--> Processing Dependency: pkgconfig(pangoft2) for package: gtk2-devel-2.24.31-1.el7.x86_64
...... output omitted ......
  mesa-khr-devel.x86_64 0:18.3.4-7.el7_8.1                                       mesa-libEGL-devel.x86_64 0:18.3.4-7.el7_8.1
  mesa-libGL-devel.x86_64 0:18.3.4-7.el7_8.1                                     ncurses-devel.x86_64 0:5.9-14.20130511.el7_4
  pango-devel.x86_64 0:1.42.4-4.el7_7                                            pcre-devel.x86_64 0:8.32-17.el7
  perl-ExtUtils-Install.noarch 0:1.58-295.el7                                    perl-ExtUtils-Manifest.noarch 0:1.61-244.el7
  perl-ExtUtils-ParseXS.noarch 1:3.18-3.el7                                      perl-devel.x86_64 4:5.16.3-295.el7
  pixman-devel.x86_64 0:0.34.0-1.el7                                             pyparsing.noarch 0:1.5.6-9.el7
  python3-libs.x86_64 0:3.6.8-13.el7                                             python3-pip.noarch 0:9.0.3-7.el7_7
  python3-setuptools.noarch 0:39.2.0-10.el7                                      systemtap-sdt-devel.x86_64 0:4.0-11.el7

Complete!
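
Before building, it can help to confirm that none of the build dependencies was skipped. The sketch below prints every package that `rpm -q` does not find. The function name missing_packages and the RPM_QUERY override (used only so the function can be exercised without rpm) are hypothetical.

```shell
#!/bin/sh
# Prints each package from the argument list that is not installed.
# RPM_QUERY is a hypothetical hook for testing; it defaults to rpm.
missing_packages() {
    for pkg in "$@"; do
        "${RPM_QUERY:-rpm}" -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Usage on the Master:
#   missing_packages rpm-build bzip2-devel openssl openssl-devel zlib-devel \
#       perl-DBI perl-ExtUtils-MakeMaker pam-devel readline-devel \
#       mariadb-devel python3 gtk2 gtk2-devel
```

Empty output means all listed packages are present.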

6. Deploy Slurm

At this point the Slurm Master deployment is essentially complete; use the sinfo or sview command to view and manage the cluster.
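
A build-based deployment usually means turning the release tarball into RPMs with rpmbuild and installing those. The following is a sketch of that route, not necessarily the author's exact steps. The download URL is an assumption (older releases may have moved), as is the default rpmbuild output directory. The script prints the commands by default; set DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/sh
# Sketch: build Slurm RPMs from the release tarball and install them.

SLURM_VER=20.02.5
SLURM_TARBALL="slurm-${SLURM_VER}.tar.bz2"
# Assumed download location; adjust if the release has been moved.
SLURM_URL="https://download.schedmd.com/slurm/${SLURM_TARBALL}"
# Default rpmbuild output directory for the invoking user (assumption).
RPM_DIR="${RPM_DIR:-$HOME/rpmbuild/RPMS/x86_64}"

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_and_install_slurm() {
    run curl -fLO "$SLURM_URL"
    # -ta builds binary RPMs straight from the tarball's bundled spec file.
    run rpmbuild -ta "$SLURM_TARBALL"
    run yum localinstall -y "$RPM_DIR"/slurm-*.rpm
    # slurmctld needs a valid slurm.conf before it can start (not shown here).
    run systemctl enable slurmctld
    run systemctl start slurmctld
}

build_and_install_slurm
```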

[root@slurm-master slurm]# sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up   infinite      1   unk* slurm-node1
[root@slurm-master slurm]# sview
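
In the sinfo output above, the state `unk*` reads as follows: `unk` is short for unknown, and the trailing `*` marks a node that is currently not responding, presumably because the Node side has not been deployed yet. A small sketch that lists such nodes from the default sinfo output; the function name unresponsive_nodes is illustrative.

```shell
#!/bin/sh
# Reads default `sinfo` output on stdin (columns: PARTITION AVAIL TIMELIMIT
# NODES STATE NODELIST) and prints the NODELIST of every row whose STATE
# ends with "*", i.e. nodes that are not responding.
unresponsive_nodes() {
    awk 'NR > 1 && $5 ~ /\*$/ { print $6 }'
}

# Usage on the Master:
#   sinfo | unresponsive_nodes
```

Once slurmd is running on the node and can reach the Master, the `*` should disappear from the state.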

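IT运维网, all rights reserved | Original content unless otherwise noted | This site is licensed under BY-NC-SA; when reposting, please credit "Installing Slurm 20.02.5 on CentOS 7.8 by Building from Source: Master Node"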
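6 comments
  1. It would be good to also have an explanation of the srun command's parameters.
    Pioneer, 2020-12-10 11:18
    • yvan
      The usage of the srun command will be covered in a later update. Thanks for the support!
      yvan, 2020-12-11 09:54
  2. Please add the database configuration, and how to run a python3 script in parallel across multiple node machines.
    Pioneer, 2020-12-10 11:17
    • yvan
      My own use of Slurm does not go that deep, so the database part may take a while, sorry. As for running a script in parallel across multiple nodes: if that means submitting a job and having several nodes work on it at the same time, then submitting the job with the srun command as usual should do it.
      yvan, 2020-12-11 10:00
  3. Author, you did not configure the database, did you?
    Pioneer, 2020-12-10 11:16
    • yvan
      What I have published so far only covers deploying the basic environment and does not go that deep, so the database is not set up here, sorry.
      yvan, 2020-12-11 09:56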