System: CentOS 7.9 minimal installation; apply the latest software patches; disable SELinux and the firewall.
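The original does not list the exact preparation commands; a minimal sketch (assumed, adjust to your environment) would be:
yum update -y                                                          # apply the latest patches
setenforce 0                                                           # disable SELinux for the current session
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config    # keep SELinux disabled across reboots
systemctl disable --now firewalld                                      # stop and disable the firewall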
1. Local system pre-configuration (perform all of the following steps on every node)
a) Set the hostname on each Ceph node:
hostnamectl set-hostname ceph1    # on node 1
hostnamectl set-hostname ceph2    # on node 2
hostnamectl set-hostname ceph3    # on node 3
b) Configure the hosts file (/etc/hosts) with local resolution records for all nodes:
192.168.80.245 ceph1
192.168.80.246 ceph2
192.168.80.247 ceph3
[root@ceph1 ~]# cat >> /etc/hosts <<EOF
> 192.168.80.245 ceph1
> 192.168.80.246 ceph2
> 192.168.80.247 ceph3
> EOF
[root@ceph1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.80.245 ceph1
192.168.80.246 ceph2
192.168.80.247 ceph3
[root@ceph2 ~]# cat >> /etc/hosts <<EOF
> 192.168.80.245 ceph1
> 192.168.80.246 ceph2
> 192.168.80.247 ceph3
> EOF
[root@ceph2 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.80.245 ceph1
192.168.80.246 ceph2
192.168.80.247 ceph3
[root@ceph3 ~]# cat >> /etc/hosts <<EOF
> 192.168.80.245 ceph1
> 192.168.80.246 ceph2
> 192.168.80.247 ceph3
> EOF
[root@ceph3 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.80.245 ceph1
192.168.80.246 ceph2
192.168.80.247 ceph3
c) Configure time synchronization (unsynchronized clocks will trigger Ceph cluster health warnings). CentOS 7.9 ships with the chrony time-sync service installed by default.
vim /etc/chrony.conf    # edit the chrony configuration file
# These servers were defined in the installation:
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server ntp.aliyun.com iburst    # comment out the four default servers above and point to a public NTP server
# Allow NTP client access from local network.
#allow 192.168.0.0/16
allow 192.168.80.0/24    # copy the allow line and set it to the internal subnet that is allowed to sync time from this server
[root@ceph1 ~]# systemctl restart chronyd.service    # restart chrony so the new configuration takes effect (repeat on the other nodes)
[root@ceph1 ~]# systemctl status chronyd.service     # check the service status to make sure chrony is running
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2021-03-14 22:24:35 CST; 4s ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 1663 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 1660 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 1662 (chronyd)
   CGroup: /system.slice/chronyd.service
           └─1662 /usr/sbin/chronyd
Mar 14 22:24:35 ceph1 systemd[1]: Starting NTP client/server...
Mar 14 22:24:35 ceph1 chronyd[1662]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK ...UG)
Mar 14 22:24:35 ceph1 chronyd[1662]: Frequency 13.537 +/- 2.008 ppm read from /var/lib/chr...ift
Mar 14 22:24:35 ceph1 systemd[1]: Started NTP client/server.
Hint: Some lines were ellipsized, use -l to show in full.
[root@ceph1 ~]# chronyc sources    # check communication with the upstream NTP server (an asterisk in the S column means it is in sync)
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 203.107.6.88                  2   6    17     3   +358us[ +283us] +/-   18ms
[root@ceph1 ~]# chronyc clients    # check communication with the other nodes acting as NTP clients (an entry in the Hostname column means they are syncing)
Hostname                      NTP   Drop Int IntL Last     Cmd   Drop Int  Last
===============================================================================
ceph2                           4      0   1   -    32       0      0   -     -
ceph3                           4      0   1   -    18       0      0   -     -
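As an additional check not shown in the original, each of the other nodes can report its own sync status with chrony's tracking command:
chronyc tracking    # run on ceph2/ceph3; "Leap status : Normal" and a small system time offset indicate the clock is in sync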
d) Install Python 3: yum install -y python3
[root@ceph1 ~]# yum install -y python3
[root@ceph2 ~]# yum install -y python3
[root@ceph3 ~]# yum install -y python3
e) Install the Docker service; refer to the article linked below:
Install Docker CE
1. Install some required system utilities: yum install -y yum-utils devi […]
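The excerpt above is truncated; for reference, a typical Docker CE installation on CentOS 7 looks roughly like this (the Aliyun mirror URL is an assumption here, adapt to your environment):
yum install -y yum-utils device-mapper-persistent-data lvm2    # required system utilities
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo    # add the Docker CE repo (Aliyun mirror assumed)
yum install -y docker-ce docker-ce-cli containerd.io           # install the Docker engine
systemctl enable --now docker                                  # start Docker and enable it at boot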
2. Install cephadm (perform all of the following steps on Ceph node 1)
a) Fetch the script: curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
[root@ceph1 ~]# curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
[root@ceph1 ~]# ll    # confirm the file actually downloaded (it sometimes fails due to network issues; simply retry until it succeeds)
total 220
-rw-------. 1 root root   1456 Jan 15  2019 anaconda-ks.cfg
-rw-r--r--  1 root root 219622 Mar 14 22:49 cephadm
b) Make the script executable: chmod +x cephadm
[root@ceph1 ~]# chmod +x cephadm
c) Run the installation script:
./cephadm add-repo --release octopus    # add the Ceph repository
cp /etc/yum.repos.d/ceph.repo{,.bak}    # back up the repo file
sed -i 's#download.ceph.com#mirrors.aliyun.com/ceph#' /etc/yum.repos.d/ceph.repo    # switch the repo to the Aliyun mirror in China
./cephadm install
[root@ceph1 ~]# ./cephadm add-repo --release octopus
Writing repo to /etc/yum.repos.d/ceph.repo...
Enabling EPEL...
[root@ceph1 ~]# cp /etc/yum.repos.d/ceph.repo{,.bak}
[root@ceph1 ~]# sed -i 's#download.ceph.com#mirrors.aliyun.com/ceph#' /etc/yum.repos.d/ceph.repo
[root@ceph1 ~]# ./cephadm install
Installing packages ['cephadm']...
d) Verify that cephadm has been added to the PATH (you may need to log out and back in):
[root@ceph1 ~]# which cephadm
/usr/sbin/cephadm
3. Deploy the Ceph cluster (perform all of the following steps on Ceph node 1)
a) Bootstrap the cluster: cephadm bootstrap --mon-ip 192.168.80.245
[root@ceph1 ~]# cephadm bootstrap --mon-ip 192.168.80.245
Creating directory /etc/ceph for ceph.conf
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 12837782-84d6-11eb-a474-00505622b20c
Verifying IP 192.168.80.245 port 3300 ...
Verifying IP 192.168.80.245 port 6789 ...
Mon IP 192.168.80.245 is in CIDR network 192.168.80.0/24
Pulling container image docker.io/ceph/ceph:v15...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr not available, waiting (4/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 5...
Mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host ceph1...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 13...
Mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:
             URL: https://ceph1:8443/
            User: admin
        Password: 0m4lylrdco
You can access the Ceph CLI with:
        sudo /usr/sbin/cephadm shell --fsid 12837782-84d6-11eb-a474-00505622b20c -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Please consider enabling telemetry to help improve Ceph:
        ceph telemetry on
For more information see:
        https://docs.ceph.com/docs/master/mgr/telemetry/
Bootstrap complete.
PS: this command performs the following actions:
1. Creates a monitor and a manager daemon for the new cluster on the local host.
2. Generates a new SSH key for the Ceph cluster and adds it to the root user's /root/.ssh/authorized_keys file.
3. Writes a minimal configuration file needed to communicate with the new cluster to /etc/ceph/ceph.conf.
4. Writes a copy of the client.admin administrative (privileged!) secret key to /etc/ceph/ceph.client.admin.keyring.
5. Writes a copy of the public key to /etc/ceph/ceph.pub.
b) Inspect the configuration files that were created:
[root@ceph1 ~]# ll /etc/ceph/
total 12
-rw------- 1 root root  63 Mar 14 23:01 ceph.client.admin.keyring
-rw-r--r-- 1 root root 179 Mar 14 23:01 ceph.conf
-rw-r--r-- 1 root root 595 Mar 14 23:02 ceph.pub
c) Check the pulled container images and the running containers:
[root@ceph1 ~]# docker image ls
REPOSITORY           TAG       IMAGE ID       CREATED         SIZE
ceph/ceph            v15       5553b0cb212c   2 months ago    943MB
ceph/ceph-grafana    6.6.2     a0dce381714a   9 months ago    509MB
prom/prometheus      v2.18.1   de242295e225   10 months ago   140MB
prom/alertmanager    v0.20.0   0881eb8f169f   15 months ago   52.1MB
prom/node-exporter   v0.18.1   e5a616e4b9cf   21 months ago   22.9MB
[root@ceph1 ~]# docker container ls
CONTAINER ID   IMAGE                        COMMAND                  CREATED         STATUS         PORTS   NAMES
bef4209bd542   prom/node-exporter:v0.18.1   "/bin/node_exporter …"   2 minutes ago   Up 2 minutes           ceph-12837782-84d6-11eb-a474-00505622b20c-node-exporter.ceph1
d062d4b9fc13   ceph/ceph-grafana:6.6.2      "/bin/sh -c 'grafana…"   2 minutes ago   Up 2 minutes           ceph-12837782-84d6-11eb-a474-00505622b20c-grafana.ceph1
3b81b1145e40   prom/alertmanager:v0.20.0    "/bin/alertmanager -…"   2 minutes ago   Up 2 minutes           ceph-12837782-84d6-11eb-a474-00505622b20c-alertmanager.ceph1
6442a8ae6fe8   prom/prometheus:v2.18.1      "/bin/prometheus --c…"   2 minutes ago   Up 2 minutes           ceph-12837782-84d6-11eb-a474-00505622b20c-prometheus.ceph1
839e0a93f289   ceph/ceph:v15                "/usr/bin/ceph-crash…"   2 minutes ago   Up 2 minutes           ceph-12837782-84d6-11eb-a474-00505622b20c-crash.ceph1
fe8d6084aad4   ceph/ceph:v15                "/usr/bin/ceph-mgr -…"   3 minutes ago   Up 3 minutes           ceph-12837782-84d6-11eb-a474-00505622b20c-mgr.ceph1.usdqnp
180e32d34660   ceph/ceph:v15                "/usr/bin/ceph-mon -…"   3 minutes ago   Up 3 minutes           ceph-12837782-84d6-11eb-a474-00505622b20c-mon.ceph1
PS: the following components are now running:
ceph-mgr: the Ceph manager daemon
ceph-monitor: the Ceph monitor daemon
ceph-crash: the crash report collection module
prometheus: the Prometheus monitoring component
grafana: the dashboard that visualizes the monitoring data
alertmanager: the Prometheus alerting component
node_exporter: the Prometheus per-node metrics collector
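Besides docker container ls, these daemons can also be confirmed from the host by asking cephadm itself; a minimal check (not shown in the original) would be:
cephadm ls    # lists, as JSON, every daemon cephadm has deployed on this host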
d) Using the URL printed at the end of the bootstrap, open the Dashboard in a browser (https://ceph1-IP:8443), change the password, and log in to the Ceph Dashboard. There is also a Grafana page that shows the cluster status in real time (https://ceph1-IP:3000).
e) Enable the ceph commands:
e1) Standard usage: by default the basic ceph commands are not available on the local host; run cephadm shell to enter a dedicated shell where they work (type exit to leave).
[root@ceph1 ~]# ceph -s
-bash: ceph: command not found
[root@ceph1 ~]# cephadm shell
Inferring fsid 12837782-84d6-11eb-a474-00505622b20c
Inferring config /var/lib/ceph/12837782-84d6-11eb-a474-00505622b20c/mon.ceph1/config
Using recent ceph image ceph/ceph@sha256:37939a3739e4e037dcf1b1f5828058d721d8c6de958212609f9e7d920b9c62bf
[ceph: root@ceph1 /]# ceph -s
  cluster:
    id:     12837782-84d6-11eb-a474-00505622b20c
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum ceph1 (age 42m)
    mgr: ceph1.usdqnp(active, since 41m)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

[ceph: root@ceph1 /]# exit
exit
[root@ceph1 ~]#
e2) Install the ceph-common package so the basic ceph commands work directly on the local host: cephadm install ceph-common
[root@ceph1 ~]# cephadm install ceph-common
Installing packages ['ceph-common']...
[root@ceph1 ~]# ceph -s
  cluster:
    id:     12837782-84d6-11eb-a474-00505622b20c
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum ceph1 (age 48m)
    mgr: ceph1.usdqnp(active, since 47m)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
f) Add the other hosts to the cluster
f1) Install the cluster's public SSH key on the other Ceph nodes: ssh-copy-id -f -i /etc/ceph/ceph.pub root@Hostname
[root@ceph1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/etc/ceph/ceph.pub"
The authenticity of host 'ceph2 (192.168.80.246)' can't be established.
ECDSA key fingerprint is SHA256:2Eo2WLWyofiltEAs4nLUFLOcXLFD6YvsuPSDlEDUZGk.
ECDSA key fingerprint is MD5:3c:b0:5f:a8:af:6a:15:45:eb:a9:2a:b0:20:21:65:04.
Are you sure you want to continue connecting (yes/no)? yes
root@ceph2's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@ceph2'"
and check to make sure that only the key(s) you wanted were added.

[root@ceph1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph3
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/etc/ceph/ceph.pub"
The authenticity of host 'ceph3 (192.168.80.247)' can't be established.
ECDSA key fingerprint is SHA256:2Eo2WLWyofiltEAs4nLUFLOcXLFD6YvsuPSDlEDUZGk.
ECDSA key fingerprint is MD5:3c:b0:5f:a8:af:6a:15:45:eb:a9:2a:b0:20:21:65:04.
Are you sure you want to continue connecting (yes/no)? yes
root@ceph3's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@ceph3'"
and check to make sure that only the key(s) you wanted were added.
f2) Add the new nodes to the Ceph cluster: ceph orch host add Hostname
[root@ceph1 ~]# ceph orch host add ceph2
Added host 'ceph2'
[root@ceph1 ~]# ceph orch host add ceph3
Added host 'ceph3'
[root@ceph1 ~]# ceph orch host ls    # list all hosts now managed by Ceph
HOST   ADDR   LABELS  STATUS
ceph1  ceph1
ceph2  ceph2
ceph3  ceph3
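Optionally (not part of the original walkthrough), hosts can also be labeled at this point to steer daemon placement later, for example:
ceph orch host label add ceph1 mon    # attach an arbitrary label (here "mon") to a host for placement rules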
[root@ceph1 ~]# ceph -s
  cluster:
    id:     12837782-84d6-11eb-a474-00505622b20c
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:    # note that three mon daemons (including the first node) and two mgr daemons have now been deployed automatically
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 50s)
    mgr: ceph1.usdqnp(active, since 60m), standbys: ceph2.qopzlo
    osd: 0 osds: 0 up, 0 in

  task status:

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
4. Deploy OSDs (perform all of the following steps on Ceph node 1)
a) Automatically consume any available and unused storage device (this mode prints no feedback, which can make later problems harder to trace, so choose it at your discretion): ceph orch apply osd --all-available-devices
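Before choosing either mode, it can help to check which devices the orchestrator actually considers usable; a minimal check (not in the original) would be:
ceph orch device ls    # list the storage devices on every managed host and whether they are available for OSDs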
b) Here we choose to add the OSDs manually instead:
b1) Run lsblk on each node to identify the device name that will be turned into an OSD:
[root@ceph1 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  100G  0 disk
├─sda1   8:1    0    1G  0 part /boot
└─sda2   8:2    0   99G  0 part /
sdb      8:16   0  100G  0 disk
[root@ceph2 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  100G  0 disk
├─sda1   8:1    0    1G  0 part /boot
└─sda2   8:2    0   99G  0 part /
sdb      8:16   0  100G  0 disk
[root@ceph3 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  100G  0 disk
├─sda1   8:1    0    1G  0 part /boot
└─sda2   8:2    0   99G  0 part /
sdb      8:16   0  100G  0 disk
b2) Once confirmed, add each device with: ceph orch daemon add osd Hostname:/dev/sdx
[root@ceph1 ~]# ceph orch daemon add osd ceph1:/dev/sdb
Created osd(s) 0 on host 'ceph1'
[root@ceph1 ~]# ceph orch daemon add osd ceph2:/dev/sdb
Created osd(s) 1 on host 'ceph2'
[root@ceph1 ~]# ceph orch daemon add osd ceph3:/dev/sdb
Created osd(s) 2 on host 'ceph3'
[root@ceph1 ~]# ceph -s
  cluster:
    id:     12837782-84d6-11eb-a474-00505622b20c
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 6m)
    mgr: ceph2.qopzlo(active, since 6m), standbys: ceph1.usdqnp
    osd: 3 osds: 3 up (since 11s), 3 in (since 11s)    # after a short while the newly added OSDs all show as up and in, meaning they took effect correctly

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 297 GiB / 300 GiB avail
    pgs:     1 active+clean
[root@ceph1 ~]# ceph -s
  cluster:
    id:     12837782-84d6-11eb-a474-00505622b20c
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 12m)
    mgr: ceph2.qopzlo(active, since 12m), standbys: ceph1.usdqnp
    osd: 3 osds: 3 up (since 6m), 3 in (since 6m)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 297 GiB / 300 GiB avail
    pgs:     1 active+clean
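As a final check beyond what the original shows, the OSD layout and the daemons managed by the orchestrator can be inspected with:
ceph osd tree    # show the CRUSH tree; each host should carry one OSD, all marked up
ceph orch ps     # list every daemon the orchestrator is running across the three nodes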