Adding a Ceph Monitor Fails

哓竹
Published 2017/08/02 16:00

#1. Add the mon

Current state of the Ceph cluster:

# ceph -s
    cluster f4833745-d220-407b-82ea-72eb6297d435
     health HEALTH_OK
     monmap e3: 3 mons at {dlw1=172.16.40.11:6789/0,dlw2=172.16.40.12:6789/0,dlw3=172.16.40.13:6789/0}
            election epoch 14, quorum 0,1,2 dlw1,dlw2,dlw3
     osdmap e26: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v9695: 352 pgs, 6 pools, 45725 kB data, 20 objects
            253 MB used, 584 GB / 584 GB avail
                 352 active+clean

The cluster currently has three mons: dlw1, dlw2 and dlw3. The goal is to add a fourth mon, dlw4.

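A fair question: why four mons, when an even number gains nothing under Paxos majority voting? Because once dlw4 has been added as a mon I will remove the mon on dlw1, so the whole exercise amounts to migrating a mon. That is not the point here, though. The point is the errors hit while adding the mon and how they were resolved, written down as a record.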
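
The remark about four mons comes down to majority arithmetic. Here is a minimal sketch, assuming only that a quorum needs a strict majority of the mons in the monmap. The helper names `majority` and `tolerated_failures` are made up for illustration and are not part of any Ceph tool.

```python
def majority(mon_count: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    return mon_count // 2 + 1


def tolerated_failures(mon_count: int) -> int:
    """How many monitors may be down while a majority is still reachable."""
    return mon_count - majority(mon_count)


for n in (3, 4, 5):
    print(f"{n} mons: quorum needs {majority(n)}, tolerates {tolerated_failures(n)} down")
```

Three mons and four mons both tolerate a single failure, which is why the fourth mon only makes sense here as a temporary step in a migration.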

The quickest and simplest way to add it is with ceph-deploy:

# ceph-deploy --overwrite-conf mon create dlw4
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.37): /usr/bin/ceph-deploy --overwrite-conf mon create dlw4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : True
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1dd2d88>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['dlw4']
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0x1c93de8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  keyrings                      : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts dlw4
[ceph_deploy.mon][DEBUG ] detecting platform for host dlw4 ...
[dlw4][DEBUG ] connected to host: dlw4 
[dlw4][DEBUG ] detect platform information from remote host
[dlw4][DEBUG ] detect machine type
[dlw4][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.2.1511 Core
[dlw4][DEBUG ] determining if provided host has same hostname in remote
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] deploying mon to dlw4
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] remote hostname: dlw4
[dlw4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[dlw4][DEBUG ] create the mon path if it does not exist
[dlw4][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-dlw4/done
[dlw4][DEBUG ] create a done file to avoid re-doing the mon deployment
[dlw4][DEBUG ] create the init path if it does not exist
[dlw4][INFO  ] Running command: systemctl enable ceph.target
[dlw4][INFO  ] Running command: systemctl enable ceph-mon@dlw4
[dlw4][INFO  ] Running command: systemctl start ceph-mon@dlw4
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[dlw4][WARNIN] monitor: mon.dlw4, might not be running yet
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[dlw4][WARNIN] dlw4 is not defined in `mon initial members`
[dlw4][WARNIN] monitor dlw4 does not exist in monmap
[dlw4][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[dlw4][WARNIN] monitors may not be able to form quorum


[root@dlw1 opt]# ceph-deploy --overwrite-conf mon add dlw4 
  
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.37): /usr/bin/ceph-deploy --overwrite-conf mon add dlw4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : True
[ceph_deploy.cli][INFO  ]  subcommand                    : add
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0xf08d88>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['dlw4']
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0xdcade8>
[ceph_deploy.cli][INFO  ]  address                       : None
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mon][INFO  ] ensuring configuration of new mon host: dlw4
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to dlw4
[dlw4][DEBUG ] connected to host: dlw4 
[dlw4][DEBUG ] detect platform information from remote host
[dlw4][DEBUG ] detect machine type
[dlw4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host dlw4
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 172.16.40.9
[ceph_deploy.mon][DEBUG ] detecting platform for host dlw4 ...
[dlw4][DEBUG ] connected to host: dlw4 
[dlw4][DEBUG ] detect platform information from remote host
[dlw4][DEBUG ] detect machine type
[dlw4][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.2.1511 Core
[dlw4][DEBUG ] determining if provided host has same hostname in remote
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] adding mon to dlw4
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[dlw4][DEBUG ] create the mon path if it does not exist
[dlw4][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-dlw4/done
[dlw4][DEBUG ] create a done file to avoid re-doing the mon deployment
[dlw4][DEBUG ] create the init path if it does not exist
[dlw4][INFO  ] Running command: systemctl enable ceph.target
[dlw4][INFO  ] Running command: systemctl enable ceph-mon@dlw4
[dlw4][INFO  ] Running command: systemctl start ceph-mon@dlw4
[dlw4][WARNIN] No data was received after 7 seconds, disconnecting...
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[dlw4][WARNIN] dlw4 is not defined in `mon initial members`
[dlw4][WARNIN] monitor dlw4 does not exist in monmap
[dlw4][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[dlw4][WARNIN] monitors may not be able to form quorum
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[dlw4][WARNIN] monitor: mon.dlw4, might not be running yet

#2. The error

Both `mon create` and `mon add` were tried. Both fail, and with the same error: the asok (admin socket) file cannot be found.

The file is missing because the mon service on dlw4 failed to start:

[root@dlw4 ceph]# systemctl status ceph-mon@`hostname`
● ceph-mon@dlw4.service - Ceph cluster monitor daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Wed 2017-08-02 16:04:20 CHOST; 11min ago
  Process: 20119 ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
 Main PID: 20119 (code=exited, status=1/FAILURE)
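
Since the admin socket only exists while the daemon is running, checking for it is a quick way to tell "mon not started" apart from other failures. This is a rough sketch using the socket path from the ceph-deploy log above. The function name `admin_socket_ready` is my own, not something Ceph provides.

```python
import os
import stat


def admin_socket_ready(path: str) -> bool:
    """Return True only if `path` exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file, missing directory, or no permission to stat it.
        return False
    return stat.S_ISSOCK(mode)


asok = "/var/run/ceph/ceph-mon.dlw4.asok"  # path taken from the ceph-deploy log
if admin_socket_ready(asok):
    print(asok, "is present, the mon daemon should be answering")
else:
    print(asok, "is missing, check whether ceph-mon is actually running")
```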

#3. Troubleshooting

##3.1. Check the log

On dlw4:

[root@dlw4 ceph]# cd /var/log/ceph/
[root@dlw4 ceph]# ls
ceph.log  ceph-mon.dlw4.log 
[root@dlw4 ceph]# tail -f ceph-mon.dlw4.log 
2017-08-02 16:04:09.972349 7f0232f61640  0 set uid:gid to 167:167 (ceph:ceph)
2017-08-02 16:04:09.972374 7f0232f61640  0 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185), process ceph-mon, pid 20119
2017-08-02 16:04:09.972410 7f0232f61640  0 pidfile_write: ignore empty --pid-file
2017-08-02 16:04:09.998672 7f0232f61640  1 leveldb: Recovering log #30
2017-08-02 16:04:10.003399 7f0232f61640  1 leveldb: Delete type=0 #30

2017-08-02 16:04:10.003445 7f0232f61640  1 leveldb: Delete type=3 #29

2017-08-02 16:04:10.003694 7f0232f61640  0 mon.dlw4 does not exist in monmap, will attempt to join an existing cluster
2017-08-02 16:04:10.003795 7f0232f61640 -1 no public_addr or public_network specified, and mon.dlw4 not present in monmap or ceph.conf

The last line stands out: neither public_addr nor public_network is specified, and mon.dlw4 is not present in the monmap or in ceph.conf.

##3.2. Change the configuration and add the mon again

On dlw1, add the parameter public_network=172.16.40.0/24 to ceph.conf and add dlw4 to mon_initial_members and mon_host. Then push ceph.conf to all the other nodes:

# ceph-deploy --overwrite-conf config push dlw2 dlw3 dlw4
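
Before pushing, it can help to confirm that the edited file really contains all three settings. The sketch below is only an illustration: the post never shows the original ceph.conf, so the `BEFORE` and `AFTER` texts are assumed reconstructions, and `missing_for_new_mon` is a hypothetical helper. It checks only the `[global]` section and treats `mon host` and `mon_host` as the same key.

```python
import configparser

# Assumed contents: the real ceph.conf is not shown in this post.
BEFORE = """\
[global]
fsid = f4833745-d220-407b-82ea-72eb6297d435
mon_initial_members = dlw1, dlw2, dlw3
mon_host = 172.16.40.11,172.16.40.12,172.16.40.13
"""

AFTER = """\
[global]
fsid = f4833745-d220-407b-82ea-72eb6297d435
public_network = 172.16.40.0/24
mon_initial_members = dlw1, dlw2, dlw3, dlw4
mon_host = 172.16.40.11,172.16.40.12,172.16.40.13,172.16.40.9
"""


def split_list(value: str) -> list:
    """Split a comma-separated option value into trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def missing_for_new_mon(conf_text: str, name: str, addr: str) -> list:
    """List the [global] settings that still lack what the new mon needs."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    raw = parser["global"] if parser.has_section("global") else {}
    # Normalise "mon host" style keys to the underscore spelling.
    opts = {key.replace(" ", "_"): value for key, value in raw.items()}
    missing = []
    if not opts.get("public_network"):
        missing.append("public_network")
    if name not in split_list(opts.get("mon_initial_members", "")):
        missing.append("mon_initial_members")
    if addr not in split_list(opts.get("mon_host", "")):
        missing.append("mon_host")
    return missing


print(missing_for_new_mon(BEFORE, "dlw4", "172.16.40.9"))
print(missing_for_new_mon(AFTER, "dlw4", "172.16.40.9"))
```

An empty list for the edited file means the three settings named above are in place. It says nothing about whether the keyring matches, which turns out to matter later.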

Add the mon again:

# ceph-deploy --overwrite-conf mon create dlw4
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.37): /usr/bin/ceph-deploy --overwrite-conf mon create dlw4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : True
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1ca3d88>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['dlw4']
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0x1b64de8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  keyrings                      : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts dlw4
[ceph_deploy.mon][DEBUG ] detecting platform for host dlw4 ...
[dlw4][DEBUG ] connected to host: dlw4 
[dlw4][DEBUG ] detect platform information from remote host
[dlw4][DEBUG ] detect machine type
[dlw4][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.2.1511 Core
[dlw4][DEBUG ] determining if provided host has same hostname in remote
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] deploying mon to dlw4
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] remote hostname: dlw4
[dlw4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[dlw4][DEBUG ] create the mon path if it does not exist
[dlw4][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-dlw4/done
[dlw4][DEBUG ] create a done file to avoid re-doing the mon deployment
[dlw4][DEBUG ] create the init path if it does not exist
[dlw4][INFO  ] Running command: systemctl enable ceph.target
[dlw4][INFO  ] Running command: systemctl enable ceph-mon@dlw4
[dlw4][INFO  ] Running command: systemctl start ceph-mon@dlw4
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][DEBUG ] ********************************************************************************
[dlw4][DEBUG ] status for monitor: mon.dlw4
[dlw4][DEBUG ] {
[dlw4][DEBUG ]   "election_epoch": 0, 
[dlw4][DEBUG ]   "extra_probe_peers": [
[dlw4][DEBUG ]     "172.16.40.11:6789/0", 
[dlw4][DEBUG ]     "172.16.40.12:6789/0", 
[dlw4][DEBUG ]     "172.16.40.13:6789/0"
[dlw4][DEBUG ]   ], 
[dlw4][DEBUG ]   "monmap": {
[dlw4][DEBUG ]     "created": "2017-08-02 10:43:08.448472", 
[dlw4][DEBUG ]     "epoch": 0, 
[dlw4][DEBUG ]     "fsid": "f4833745-d220-407b-82ea-72eb6297d435", 
[dlw4][DEBUG ]     "modified": "2017-08-02 10:43:08.448472", 
[dlw4][DEBUG ]     "mons": [
[dlw4][DEBUG ]       {
[dlw4][DEBUG ]         "addr": "172.16.40.9:6789/0", 
[dlw4][DEBUG ]         "name": "dlw4", 
[dlw4][DEBUG ]         "rank": 0
[dlw4][DEBUG ]       }, 
[dlw4][DEBUG ]       {
[dlw4][DEBUG ]         "addr": "0.0.0.0:0/1", 
[dlw4][DEBUG ]         "name": "dlw1", 
[dlw4][DEBUG ]         "rank": 1
[dlw4][DEBUG ]       }, 
[dlw4][DEBUG ]       {
[dlw4][DEBUG ]         "addr": "0.0.0.0:0/2", 
[dlw4][DEBUG ]         "name": "dlw2", 
[dlw4][DEBUG ]         "rank": 2
[dlw4][DEBUG ]       }, 
[dlw4][DEBUG ]       {
[dlw4][DEBUG ]         "addr": "0.0.0.0:0/3", 
[dlw4][DEBUG ]         "name": "dlw3", 
[dlw4][DEBUG ]         "rank": 3
[dlw4][DEBUG ]       }
[dlw4][DEBUG ]     ]
[dlw4][DEBUG ]   }, 
[dlw4][DEBUG ]   "name": "dlw4", 
[dlw4][DEBUG ]   "outside_quorum": [
[dlw4][DEBUG ]     "dlw4"
[dlw4][DEBUG ]   ], 
[dlw4][DEBUG ]   "quorum": [], 
[dlw4][DEBUG ]   "rank": 0, 
[dlw4][DEBUG ]   "state": "probing", 
[dlw4][DEBUG ]   "sync_provider": []
[dlw4][DEBUG ] }
[dlw4][DEBUG ] ********************************************************************************
[dlw4][INFO  ] monitor: mon.dlw4 is running
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status

No error this time, so I assumed the mon had been added. But ceph -s still shows only three mons:

# ceph -s
    cluster f4833745-d220-407b-82ea-72eb6297d435
     health HEALTH_OK
     monmap e3: 3 mons at {dlw1=172.16.40.11:6789/0,dlw2=172.16.40.12:6789/0,dlw3=172.16.40.13:6789/0}
            election epoch 14, quorum 0,1,2 dlw1,dlw2,dlw3
     osdmap e26: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v9697: 352 pgs, 6 pools, 45725 kB data, 20 objects
            253 MB used, 584 GB / 584 GB avail
                 352 active+clean

Switching to dlw4 to check the mon service shows that it started normally. Running mon add once more gives the same result.

[root@dlw4 ceph]# systemctl status ceph-mon@`hostname`
● ceph-mon@dlw4.service - Ceph cluster monitor daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-08-02 16:24:36 CHOST; 2min 29s ago
 Main PID: 20208 (ceph-mon)
   CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@dlw4.service
           └─20208 /usr/bin/ceph-mon -f --cluster ceph --id dlw4 --setuser ceph --setgroup ceph

Aug 02 16:24:36 dlw4 systemd[1]: Started Ceph cluster monitor daemon.
Aug 02 16:24:36 dlw4 systemd[1]: Starting Ceph cluster monitor daemon...
Aug 02 16:24:36 dlw4 ceph-mon[20208]: starting mon.dlw4 rank -1 at 172.16.40.9:6789/0 mon_data /var/lib/ceph/mon/ceph-dlw4 fsid f4833745-d220-407b-82ea-72eb6297d435

##3.3. Check the status

Check the mon status of the Ceph cluster:

# ceph mon_status | jq .
{
  "name": "dlw1",
  "rank": 0,
  "state": "leader",
  "election_epoch": 14,
  "quorum": [
    0,
    1,
    2
  ],
  "outside_quorum": [],
  "extra_probe_peers": [
    "172.16.40.9:6789/0",
    "172.16.40.12:6789/0",
    "172.16.40.13:6789/0"
  ],
  "sync_provider": [],
  "monmap": {
    "epoch": 3,
    "fsid": "f4833745-d220-407b-82ea-72eb6297d435",
    "modified": "2017-08-01 19:00:04.795921",
    "created": "2017-07-20 12:38:26.592488",
    "mons": [
      {
        "rank": 0,
        "name": "dlw1",
        "addr": "172.16.40.11:6789/0"
      },
      {
        "rank": 1,
        "name": "dlw2",
        "addr": "172.16.40.12:6789/0"
      },
      {
        "rank": 2,
        "name": "dlw3",
        "addr": "172.16.40.13:6789/0"
      }
    ]
  }
}

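Note: jq is a tool for pretty-printing JSON and has to be installed separately (it is available in the EPEL repository). ceph also has its own option for formatted output: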

# ceph mon_status -f json-pretty

Check the mon status of dlw4:

[root@dlw4 ceph]# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
{
    "name": "dlw4",
    "rank": 0,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "outside_quorum": [
        "dlw4"
    ],
    "extra_probe_peers": [
        "172.16.40.11:6789\/0",
        "172.16.40.12:6789\/0",
        "172.16.40.13:6789\/0"
    ],
    "sync_provider": [],
    "monmap": {
        "epoch": 0,
        "fsid": "f4833745-d220-407b-82ea-72eb6297d435",
        "modified": "2017-08-02 10:43:08.448472",
        "created": "2017-08-02 10:43:08.448472",
        "mons": [
            {
                "rank": 0,
                "name": "dlw4",
                "addr": "172.16.40.9:6789\/0"
            },
            {
                "rank": 1,
                "name": "dlw1",
                "addr": "0.0.0.0:0\/1"
            },
            {
                "rank": 2,
                "name": "dlw2",
                "addr": "0.0.0.0:0\/2"
            },
            {
                "rank": 3,
                "name": "dlw3",
                "addr": "0.0.0.0:0\/3"
            }
        ]
    }
}

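Querying the asok file created when the mon service started shows that dlw4 is in its own local monmap (epoch 0, with the other three mons listed at placeholder 0.0.0.0 addresses), but its state is probing. The other three mons are in state leader or peon (the boss and the workers), while dlw4 is still probing. So the mon daemon on dlw4 is running normally, but it is not taking part in the cluster's mon election. In other words, it has not connected to the cluster yet.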
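
The same reading of `mon_status` can be done by a script. This is a small sketch that works on trimmed copies of the two outputs above. The rule it applies (state is leader or peon, and the mon's own rank appears in the quorum list) is my interpretation of those outputs, and `in_quorum` is a hypothetical helper.

```python
import json

# Trimmed copies of the mon_status outputs shown above.
DLW1_STATUS = """
{
  "name": "dlw1",
  "rank": 0,
  "state": "leader",
  "election_epoch": 14,
  "quorum": [0, 1, 2],
  "outside_quorum": []
}
"""

DLW4_STATUS = """
{
  "name": "dlw4",
  "rank": 0,
  "state": "probing",
  "election_epoch": 0,
  "quorum": [],
  "outside_quorum": ["dlw4"]
}
"""


def in_quorum(status_json: str) -> bool:
    """True if this mon reports itself as an elected member of the quorum."""
    status = json.loads(status_json)
    elected = status["state"] in ("leader", "peon")
    return elected and status["rank"] in status["quorum"]


for text in (DLW1_STATUS, DLW4_STATUS):
    name = json.loads(text)["name"]
    print(name, "in quorum" if in_quorum(text) else "NOT in quorum")
```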

##3.4. Check the logs

On dlw1 (172.16.40.11):

# tail -f ceph-mon.dlw1.log
2017-08-02 16:42:52.877734 7fbfc2333700  0 -- 172.16.40.11:6789/0 >> 172.16.40.9:6789/0 pipe(0x7fbfd9c1c800 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x7fbfd7360700).accept: got bad authorizer
2017-08-02 16:42:54.879024 7fbfc2333700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
2017-08-02 16:42:54.879028 7fbfc2333700  0 mon.dlw1@0(leader) e3 ms_verify_authorizer bad authorizer from mon 172.16.40.9:6789/0
2017-08-02 16:42:54.879034 7fbfc2333700  0 -- 172.16.40.11:6789/0 >> 172.16.40.9:6789/0 pipe(0x7fbfd93c6000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x7fbfd7360400).accept: got bad authorizer
2017-08-02 16:42:55.076972 7fbfc06d0700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2017-08-02 16:42:55.076981 7fbfc06d0700  0 -- 172.16.40.11:6789/0 >> 172.16.40.9:6789/0 pipe(0x7fbfd9c1b400 sd=19 :55595 s=1 pgs=0 cs=0 l=0 c=0x7fbfd8b0ab80).failed verifying authorize reply
2017-08-02 16:42:56.885648 7fbfc2333700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
2017-08-02 16:42:56.885657 7fbfc2333700  0 mon.dlw1@0(leader) e3 ms_verify_authorizer bad authorizer from mon 172.16.40.9:6789/0

On dlw4 (172.16.40.9):

[root@dlw4 ceph]# tail -f ceph-mon.dlw4.log 
2017-08-02 16:43:14.890240 7fc9b6483700  0 -- 172.16.40.9:6789/0 >> 172.16.40.13:6789/0 pipe(0x7fc9cb59e800 sd=24 :40309 s=1 pgs=0 cs=0 l=0 c=0x7fc9cb3fad00).failed verifying authorize reply
2017-08-02 16:43:14.919426 7fc9b6584700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2017-08-02 16:43:14.919456 7fc9b6584700  0 -- 172.16.40.9:6789/0 >> 172.16.40.12:6789/0 pipe(0x7fc9cb59d400 sd=15 :34233 s=1 pgs=0 cs=0 l=0 c=0x7fc9cb3fab80).failed verifying authorize reply
2017-08-02 16:43:15.693119 7fc9b5b81700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
2017-08-02 16:43:15.693129 7fc9b5b81700  0 mon.dlw4@0(probing) e0 ms_verify_authorizer bad authorizer from mon 172.16.40.13:6789/0
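
The `bad authorizer` lines scroll past quickly under `tail -f`, so a short script that tallies them per peer makes the pattern easier to see. This sketch runs on a few lines copied from the two logs above. The regular expression is fitted to those lines only, and I have not checked it against other Ceph versions.

```python
import re
from collections import Counter

# Lines copied from ceph-mon.dlw1.log and ceph-mon.dlw4.log above.
SAMPLE_LINES = [
    "2017-08-02 16:42:54.879024 7fbfc2333700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190",
    "2017-08-02 16:42:54.879028 7fbfc2333700  0 mon.dlw1@0(leader) e3 ms_verify_authorizer bad authorizer from mon 172.16.40.9:6789/0",
    "2017-08-02 16:42:56.885657 7fbfc2333700  0 mon.dlw1@0(leader) e3 ms_verify_authorizer bad authorizer from mon 172.16.40.9:6789/0",
    "2017-08-02 16:43:15.693129 7fc9b5b81700  0 mon.dlw4@0(probing) e0 ms_verify_authorizer bad authorizer from mon 172.16.40.13:6789/0",
]

# Captures the local mon name and the peer address of each rejection.
BAD_AUTHORIZER = re.compile(r"mon\.(\w+)@.*bad authorizer from mon (\S+)")


def tally_bad_authorizers(lines) -> Counter:
    """Count rejected authorizers, keyed by (local mon, peer address)."""
    counts = Counter()
    for line in lines:
        match = BAD_AUTHORIZER.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


for (local, peer), count in tally_bad_authorizers(SAMPLE_LINES).items():
    print(f"mon.{local} rejected {count} authorizer(s) from {peer}")
```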

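Comparing the two logs confirms the earlier guess. On one side, the authorizer that dlw4 presents is rejected. On the other, dlw4 fails authorization when it connects to the other mons. Authentication in Ceph is handled by cephx, and cephx authenticates with shared secret keys: the client and the monitor cluster each hold a copy of the client's secret key. The protocol lets both parties authenticate each other without revealing the key, so the cluster is sure the user holds the key and the user is sure the cluster holds a copy of it.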
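
To make the shared-key idea concrete, here is a deliberately simplified sketch. It is not the cephx protocol. cephx works with encrypted tickets, as the `could not decrypt ticket info` lines in the logs suggest, while this example swaps in a plain HMAC challenge-response. The point is only that two sides holding different keys cannot verify each other, even though neither key is ever sent. The two key strings are the ones from the keyrings shown in section 3.5. `prove` and `verify` are invented names.

```python
import hashlib
import hmac
import os


def prove(key: bytes, challenge: bytes) -> bytes:
    """Answer a challenge with an HMAC, which reveals nothing about the key."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify(key: bytes, challenge: bytes, proof: bytes) -> bool:
    """Check a proof against our own copy of the shared key."""
    return hmac.compare_digest(prove(key, challenge), proof)


CLUSTER_KEY = b"AQDiJHBZAAAAABAAiAz+B0XamXqLSemUudvStA=="  # dlw1, dlw2, dlw3
DLW4_KEY = b"AQCIUIBZAAAAABAASHOxkpYwK6BlD4ITbuIrkQ=="     # freshly generated on dlw4

challenge = os.urandom(16)  # random nonce chosen by the verifying side

same_key_ok = verify(CLUSTER_KEY, challenge, prove(CLUSTER_KEY, challenge))
other_key_ok = verify(CLUSTER_KEY, challenge, prove(DLW4_KEY, challenge))
print("peer with the cluster key accepted:", same_key_ok)
print("peer with dlw4's own key accepted:", other_key_ok)
```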

##3.5. Fix the key

Check the keyring of every mon in the cluster. dlw1, dlw2 and dlw3 all have the same keyring:

# cat keyring 
[mon.]
        key = AQDiJHBZAAAAABAAiAz+B0XamXqLSemUudvStA==
        caps mon = "allow *"

while the keyring on dlw4 is:

# cd /var/lib/ceph/mon/ceph-dlw4/
# cat keyring 
[mon.]
        key = AQCIUIBZAAAAABAASHOxkpYwK6BlD4ITbuIrkQ==
        caps mon = "allow *"

So I edited the keyring on dlw4 by hand, changing the key in the file to dlw1's key, and then restarted the mon service on dlw4:

[root@dlw4 ceph-dlw4]# systemctl restart ceph-mon@`hostname`
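
Comparing keyrings by eye is error-prone with base64 strings, so here is a small sketch that extracts the `key` value and compares it. The two keyring texts are the ones shown above. The parsing is a plain regular expression that I assume is sufficient for this simple `[mon.]` layout, and `keyring_key` is a hypothetical helper.

```python
import re

DLW1_KEYRING = """\
[mon.]
        key = AQDiJHBZAAAAABAAiAz+B0XamXqLSemUudvStA==
        caps mon = "allow *"
"""

DLW4_KEYRING = """\
[mon.]
        key = AQCIUIBZAAAAABAASHOxkpYwK6BlD4ITbuIrkQ==
        caps mon = "allow *"
"""

# Matches the "key = <base64>" line, ignoring leading indentation.
KEY_LINE = re.compile(r"^\s*key\s*=\s*(\S+)\s*$", re.MULTILINE)


def keyring_key(text: str):
    """Return the first key value in a keyring text, or None if absent."""
    match = KEY_LINE.search(text)
    return match.group(1) if match else None


if keyring_key(DLW1_KEYRING) == keyring_key(DLW4_KEYRING):
    print("mon keys match")
else:
    print("mon keys differ: dlw4 cannot authenticate to the other mons")
```

In practice the two texts would be read from the `keyring` file in each mon's data directory, for example `/var/lib/ceph/mon/ceph-dlw4/keyring` on dlw4.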

##3.6. Check the status

# ceph -s
    cluster f4833745-d220-407b-82ea-72eb6297d435
     health HEALTH_OK
     monmap e4: 4 mons at {dlw1=172.16.40.11:6789/0,dlw2=172.16.40.12:6789/0,dlw3=172.16.40.13:6789/0,dlw4=172.16.40.9:6789/0}
            election epoch 16, quorum 0,1,2,3 dlw4,dlw1,dlw2,dlw3
     osdmap e26: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v9700: 352 pgs, 6 pools, 45725 kB data, 20 objects
            253 MB used, 584 GB / 584 GB avail
                 352 active+clean

The cluster now has four mons.

[root@dlw4 ceph-dlw4]# ceph quorum_status  -f json-pretty

{
    "election_epoch": 16,
    "quorum": [
        0,
        1,
        2,
        3
    ],
    "quorum_names": [
        "dlw4",
        "dlw1",
        "dlw2",
        "dlw3"
    ],
    "quorum_leader_name": "dlw4",
    "monmap": {
        "epoch": 4,
        "fsid": "f4833745-d220-407b-82ea-72eb6297d435",
        "modified": "2017-08-02 16:54:32.549853",
        "created": "2017-07-20 12:38:26.592488",
        "mons": [
            {
                "rank": 0,
                "name": "dlw4",
                "addr": "172.16.40.9:6789\/0"
            },
            {
                "rank": 1,
                "name": "dlw1",
                "addr": "172.16.40.11:6789\/0"
            },
            {
                "rank": 2,
                "name": "dlw2",
                "addr": "172.16.40.12:6789\/0"
            },
            {
                "rank": 3,
                "name": "dlw3",
                "addr": "172.16.40.13:6789\/0"
            }
        ]
    }
}
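
As a final check, the `quorum_status` output can be compared against its own monmap to make sure no mon is left out. The sketch below uses a trimmed copy of the output above with the addresses omitted. `mons_outside_quorum` is a made-up helper name.

```python
import json

# Trimmed copy of the quorum_status output above.
QUORUM_STATUS = """
{
  "election_epoch": 16,
  "quorum": [0, 1, 2, 3],
  "quorum_names": ["dlw4", "dlw1", "dlw2", "dlw3"],
  "quorum_leader_name": "dlw4",
  "monmap": {
    "epoch": 4,
    "mons": [
      {"rank": 0, "name": "dlw4"},
      {"rank": 1, "name": "dlw1"},
      {"rank": 2, "name": "dlw2"},
      {"rank": 3, "name": "dlw3"}
    ]
  }
}
"""


def mons_outside_quorum(quorum_status_json: str) -> list:
    """Names of mons that are in the monmap but missing from the quorum."""
    status = json.loads(quorum_status_json)
    in_quorum_names = set(status["quorum_names"])
    return [
        mon["name"]
        for mon in status["monmap"]["mons"]
        if mon["name"] not in in_quorum_names
    ]


missing = mons_outside_quorum(QUORUM_STATUS)
print("all mons in quorum" if not missing else f"outside quorum: {missing}")
```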

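#4. Summary

Two problems came up while adding this mon.

The first: adding a mon requires public_network to be set.

The second: because the mon was first added without public_network, a keyring different from the original cluster's was generated. The mons therefore could not authenticate each other with cephx, and the new mon could not join the cluster's mon election.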

© Copyright belongs to the author
