Ceph: Adding a Monitor Fails

哓竹
Published 2017/08/02 16:00

#1. Adding the mon

Current state of the Ceph cluster:

# ceph -s
    cluster f4833745-d220-407b-82ea-72eb6297d435
     health HEALTH_OK
     monmap e3: 3 mons at {dlw1=172.16.40.11:6789/0,dlw2=172.16.40.12:6789/0,dlw3=172.16.40.13:6789/0}
            election epoch 14, quorum 0,1,2 dlw1,dlw2,dlw3
     osdmap e26: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v9695: 352 pgs, 6 pools, 45725 kB data, 20 objects
            253 MB used, 584 GB / 584 GB avail
                 352 active+clean

There are currently three mons: dlw1, dlw2 and dlw3. Now add a fourth mon, dlw4.

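Question: why four mons? An even count does not suit Paxos majority voting, since a fourth mon adds no extra fault tolerance. The reason is that after adding dlw4 as a mon I will remove the mon on dlw1, which amounts to migrating the mon. That is not the point here, though. The point is to record the errors hit while adding the mon and how they were solved.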
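
The quorum arithmetic behind that remark can be sketched in a few lines of shell. This is only an illustration, assuming the usual majority rule (more than half of the monitors in the monmap must agree). The function names mon_majority and mon_tolerated_failures are made up for this example and are not Ceph commands.

```shell
# Hypothetical helpers, not part of Ceph: majority-quorum arithmetic.

# Smallest number of monitors that forms a strict majority of n.
mon_majority() {
    echo $(( $1 / 2 + 1 ))
}

# How many monitors can fail while a majority still survives.
mon_tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
    echo "mons=$n majority=$(mon_majority "$n") tolerated_failures=$(mon_tolerated_failures "$n")"
done
```

Going from three to four mons raises the required majority from 2 to 3 while still tolerating only one failure, which is why the fourth mon here is just a stepping stone toward removing dlw1.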

Add it the simplest and quickest way, with ceph-deploy:

# ceph-deploy --overwrite-conf mon create dlw4
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.37): /usr/bin/ceph-deploy --overwrite-conf mon create dlw4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : True
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1dd2d88>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['dlw4']
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0x1c93de8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  keyrings                      : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts dlw4
[ceph_deploy.mon][DEBUG ] detecting platform for host dlw4 ...
[dlw4][DEBUG ] connected to host: dlw4 
[dlw4][DEBUG ] detect platform information from remote host
[dlw4][DEBUG ] detect machine type
[dlw4][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.2.1511 Core
[dlw4][DEBUG ] determining if provided host has same hostname in remote
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] deploying mon to dlw4
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] remote hostname: dlw4
[dlw4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[dlw4][DEBUG ] create the mon path if it does not exist
[dlw4][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-dlw4/done
[dlw4][DEBUG ] create a done file to avoid re-doing the mon deployment
[dlw4][DEBUG ] create the init path if it does not exist
[dlw4][INFO  ] Running command: systemctl enable ceph.target
[dlw4][INFO  ] Running command: systemctl enable ceph-mon@dlw4
[dlw4][INFO  ] Running command: systemctl start ceph-mon@dlw4
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[dlw4][WARNIN] monitor: mon.dlw4, might not be running yet
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[dlw4][WARNIN] dlw4 is not defined in `mon initial members`
[dlw4][WARNIN] monitor dlw4 does not exist in monmap
[dlw4][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[dlw4][WARNIN] monitors may not be able to form quorum


[root@dlw1 opt]# ceph-deploy --overwrite-conf mon add dlw4 
  
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.37): /usr/bin/ceph-deploy --overwrite-conf mon add dlw4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : True
[ceph_deploy.cli][INFO  ]  subcommand                    : add
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0xf08d88>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['dlw4']
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0xdcade8>
[ceph_deploy.cli][INFO  ]  address                       : None
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mon][INFO  ] ensuring configuration of new mon host: dlw4
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to dlw4
[dlw4][DEBUG ] connected to host: dlw4 
[dlw4][DEBUG ] detect platform information from remote host
[dlw4][DEBUG ] detect machine type
[dlw4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host dlw4
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 172.16.40.9
[ceph_deploy.mon][DEBUG ] detecting platform for host dlw4 ...
[dlw4][DEBUG ] connected to host: dlw4 
[dlw4][DEBUG ] detect platform information from remote host
[dlw4][DEBUG ] detect machine type
[dlw4][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.2.1511 Core
[dlw4][DEBUG ] determining if provided host has same hostname in remote
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] adding mon to dlw4
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[dlw4][DEBUG ] create the mon path if it does not exist
[dlw4][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-dlw4/done
[dlw4][DEBUG ] create a done file to avoid re-doing the mon deployment
[dlw4][DEBUG ] create the init path if it does not exist
[dlw4][INFO  ] Running command: systemctl enable ceph.target
[dlw4][INFO  ] Running command: systemctl enable ceph-mon@dlw4
[dlw4][INFO  ] Running command: systemctl start ceph-mon@dlw4
[dlw4][WARNIN] No data was received after 7 seconds, disconnecting...
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[dlw4][WARNIN] dlw4 is not defined in `mon initial members`
[dlw4][WARNIN] monitor dlw4 does not exist in monmap
[dlw4][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[dlw4][WARNIN] monitors may not be able to form quorum
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[dlw4][WARNIN] monitor: mon.dlw4, might not be running yet

#2. Spotting the error

Both add and create were tried here. Both fail, and with the same error: the asok file cannot be found.

The file cannot be found because the mon service on dlw4 failed to start:

[root@dlw4 ceph]# systemctl status ceph-mon@`hostname`
● ceph-mon@dlw4.service - Ceph cluster monitor daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Wed 2017-08-02 16:04:20 CHOST; 11min ago
  Process: 20119 ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
 Main PID: 20119 (code=exited, status=1/FAILURE)
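
A check along these lines can confirm that the admin socket is really absent before digging into the service itself. It is a sketch: the path layout /var/run/ceph/{cluster}-mon.{id}.asok is taken from the ceph-deploy output above, while the helper names mon_asok_path and asok_state are invented for illustration.

```shell
# Build the admin socket path for a mon, following the layout seen in the
# ceph-deploy output: /var/run/ceph/<cluster>-mon.<id>.asok
mon_asok_path() {
    printf '/var/run/ceph/%s-mon.%s.asok\n' "$1" "$2"
}

# Report whether a Unix socket exists at the given path.
asok_state() {
    if [ -S "$1" ]; then
        echo present
    else
        echo missing
    fi
}

asok=$(mon_asok_path ceph dlw4)
echo "$asok: $(asok_state "$asok")"
```

If the socket is missing, the daemon most likely never came up, and the systemctl status output and the mon log are the places to look next.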

#3. Working through the fix

##3.1. Check the logs

On dlw4:

[root@dlw4 ceph]# cd /var/log/ceph/
[root@dlw4 ceph]# ls
ceph.log  ceph-mon.dlw4.log 
[root@dlw4 ceph]# tail -f ceph-mon.dlw4.log 
2017-08-02 16:04:09.972349 7f0232f61640  0 set uid:gid to 167:167 (ceph:ceph)
2017-08-02 16:04:09.972374 7f0232f61640  0 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185), process ceph-mon, pid 20119
2017-08-02 16:04:09.972410 7f0232f61640  0 pidfile_write: ignore empty --pid-file
2017-08-02 16:04:09.998672 7f0232f61640  1 leveldb: Recovering log #30
2017-08-02 16:04:10.003399 7f0232f61640  1 leveldb: Delete type=0 #30

2017-08-02 16:04:10.003445 7f0232f61640  1 leveldb: Delete type=3 #29

2017-08-02 16:04:10.003694 7f0232f61640  0 mon.dlw4 does not exist in monmap, will attempt to join an existing cluster
2017-08-02 16:04:10.003795 7f0232f61640 -1 no public_addr or public_network specified, and mon.dlw4 not present in monmap or ceph.conf

The last line stands out: no public_addr or public_network is specified, and mon.dlw4 is not present in the monmap or in ceph.conf.
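
The condition the mon complains about can be checked before deploying. The sketch below only greps a config file for a public_network or public_addr key. The function name conf_has_public_net is hypothetical, and the check assumes the key sits on its own line, spelled either with underscores or with spaces.

```shell
# Succeeds if the given ceph.conf sets public_network or public_addr.
# Option names may be spelled with spaces or underscores.
# -s keeps grep quiet if the file does not exist.
conf_has_public_net() {
    grep -Eqs '^[[:space:]]*public[_ ](network|addr)[[:space:]]*=' "$1"
}

# Default path is an assumption; pass another file as the first argument.
conf=${1:-/etc/ceph/ceph.conf}
if conf_has_public_net "$conf"; then
    echo "$conf: public network/address is set"
else
    echo "$conf: neither public_addr nor public_network is set"
fi
```

This only looks at the file text. It does not tell whether the mon's address actually falls inside the configured network.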

##3.2. Change the configuration and add the mon again

On dlw1, add the parameter public_network=172.16.40.0/24 to ceph.conf and add dlw4 to mon_initial_members and mon_host. Then push ceph.conf to all nodes:

# ceph-deploy --overwrite-conf config push dlw2 dlw3 dlw4

Add the mon again:

# ceph-deploy --overwrite-conf mon create dlw4
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.37): /usr/bin/ceph-deploy --overwrite-conf mon create dlw4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : True
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1ca3d88>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['dlw4']
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0x1b64de8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  keyrings                      : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts dlw4
[ceph_deploy.mon][DEBUG ] detecting platform for host dlw4 ...
[dlw4][DEBUG ] connected to host: dlw4 
[dlw4][DEBUG ] detect platform information from remote host
[dlw4][DEBUG ] detect machine type
[dlw4][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.2.1511 Core
[dlw4][DEBUG ] determining if provided host has same hostname in remote
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] deploying mon to dlw4
[dlw4][DEBUG ] get remote short hostname
[dlw4][DEBUG ] remote hostname: dlw4
[dlw4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[dlw4][DEBUG ] create the mon path if it does not exist
[dlw4][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-dlw4/done
[dlw4][DEBUG ] create a done file to avoid re-doing the mon deployment
[dlw4][DEBUG ] create the init path if it does not exist
[dlw4][INFO  ] Running command: systemctl enable ceph.target
[dlw4][INFO  ] Running command: systemctl enable ceph-mon@dlw4
[dlw4][INFO  ] Running command: systemctl start ceph-mon@dlw4
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
[dlw4][DEBUG ] ********************************************************************************
[dlw4][DEBUG ] status for monitor: mon.dlw4
[dlw4][DEBUG ] {
[dlw4][DEBUG ]   "election_epoch": 0, 
[dlw4][DEBUG ]   "extra_probe_peers": [
[dlw4][DEBUG ]     "172.16.40.11:6789/0", 
[dlw4][DEBUG ]     "172.16.40.12:6789/0", 
[dlw4][DEBUG ]     "172.16.40.13:6789/0"
[dlw4][DEBUG ]   ], 
[dlw4][DEBUG ]   "monmap": {
[dlw4][DEBUG ]     "created": "2017-08-02 10:43:08.448472", 
[dlw4][DEBUG ]     "epoch": 0, 
[dlw4][DEBUG ]     "fsid": "f4833745-d220-407b-82ea-72eb6297d435", 
[dlw4][DEBUG ]     "modified": "2017-08-02 10:43:08.448472", 
[dlw4][DEBUG ]     "mons": [
[dlw4][DEBUG ]       {
[dlw4][DEBUG ]         "addr": "172.16.40.9:6789/0", 
[dlw4][DEBUG ]         "name": "dlw4", 
[dlw4][DEBUG ]         "rank": 0
[dlw4][DEBUG ]       }, 
[dlw4][DEBUG ]       {
[dlw4][DEBUG ]         "addr": "0.0.0.0:0/1", 
[dlw4][DEBUG ]         "name": "dlw1", 
[dlw4][DEBUG ]         "rank": 1
[dlw4][DEBUG ]       }, 
[dlw4][DEBUG ]       {
[dlw4][DEBUG ]         "addr": "0.0.0.0:0/2", 
[dlw4][DEBUG ]         "name": "dlw2", 
[dlw4][DEBUG ]         "rank": 2
[dlw4][DEBUG ]       }, 
[dlw4][DEBUG ]       {
[dlw4][DEBUG ]         "addr": "0.0.0.0:0/3", 
[dlw4][DEBUG ]         "name": "dlw3", 
[dlw4][DEBUG ]         "rank": 3
[dlw4][DEBUG ]       }
[dlw4][DEBUG ]     ]
[dlw4][DEBUG ]   }, 
[dlw4][DEBUG ]   "name": "dlw4", 
[dlw4][DEBUG ]   "outside_quorum": [
[dlw4][DEBUG ]     "dlw4"
[dlw4][DEBUG ]   ], 
[dlw4][DEBUG ]   "quorum": [], 
[dlw4][DEBUG ]   "rank": 0, 
[dlw4][DEBUG ]   "state": "probing", 
[dlw4][DEBUG ]   "sync_provider": []
[dlw4][DEBUG ] }
[dlw4][DEBUG ] ********************************************************************************
[dlw4][INFO  ] monitor: mon.dlw4 is running
[dlw4][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status

No error this time, so it looked as if the mon had been added. But ceph -s still shows only 3 mons:

# ceph -s
    cluster f4833745-d220-407b-82ea-72eb6297d435
     health HEALTH_OK
     monmap e3: 3 mons at {dlw1=172.16.40.11:6789/0,dlw2=172.16.40.12:6789/0,dlw3=172.16.40.13:6789/0}
            election epoch 14, quorum 0,1,2 dlw1,dlw2,dlw3
     osdmap e26: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v9697: 352 pgs, 6 pools, 45725 kB data, 20 objects
            253 MB used, 584 GB / 584 GB avail
                 352 active+clean

Switching to dlw4 to check the mon service shows that it started normally. Running mon add once more gives the same result.

[root@dlw4 ceph]# systemctl status ceph-mon@`hostname`
● ceph-mon@dlw4.service - Ceph cluster monitor daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-08-02 16:24:36 CHOST; 2min 29s ago
 Main PID: 20208 (ceph-mon)
   CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@dlw4.service
           └─20208 /usr/bin/ceph-mon -f --cluster ceph --id dlw4 --setuser ceph --setgroup ceph

Aug 02 16:24:36 dlw4 systemd[1]: Started Ceph cluster monitor daemon.
Aug 02 16:24:36 dlw4 systemd[1]: Starting Ceph cluster monitor daemon...
Aug 02 16:24:36 dlw4 ceph-mon[20208]: starting mon.dlw4 rank -1 at 172.16.40.9:6789/0 mon_data /var/lib/ceph/mon/ceph-dlw4 fsid f4833745-d220-407b-82ea-72eb6297d435

##3.3. Check the status

Check the mon status of the Ceph cluster:

# ceph mon_status | jq .
{
  "name": "dlw1",
  "rank": 0,
  "state": "leader",
  "election_epoch": 14,
  "quorum": [
    0,
    1,
    2
  ],
  "outside_quorum": [],
  "extra_probe_peers": [
    "172.16.40.9:6789/0",
    "172.16.40.12:6789/0",
    "172.16.40.13:6789/0"
  ],
  "sync_provider": [],
  "monmap": {
    "epoch": 3,
    "fsid": "f4833745-d220-407b-82ea-72eb6297d435",
    "modified": "2017-08-01 19:00:04.795921",
    "created": "2017-07-20 12:38:26.592488",
    "mons": [
      {
        "rank": 0,
        "name": "dlw1",
        "addr": "172.16.40.11:6789/0"
      },
      {
        "rank": 1,
        "name": "dlw2",
        "addr": "172.16.40.12:6789/0"
      },
      {
        "rank": 2,
        "name": "dlw3",
        "addr": "172.16.40.13:6789/0"
      }
    ]
  }
}

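Note: jq is a pretty-printing tool that has to be installed separately (it is available in the EPEL repository). Ceph's own format option can pretty-print the output as well: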

# ceph mon_status -f json-pretty

Check the mon status of dlw4:

[root@dlw4 ceph]# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status
{
    "name": "dlw4",
    "rank": 0,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "outside_quorum": [
        "dlw4"
    ],
    "extra_probe_peers": [
        "172.16.40.11:6789\/0",
        "172.16.40.12:6789\/0",
        "172.16.40.13:6789\/0"
    ],
    "sync_provider": [],
    "monmap": {
        "epoch": 0,
        "fsid": "f4833745-d220-407b-82ea-72eb6297d435",
        "modified": "2017-08-02 10:43:08.448472",
        "created": "2017-08-02 10:43:08.448472",
        "mons": [
            {
                "rank": 0,
                "name": "dlw4",
                "addr": "172.16.40.9:6789\/0"
            },
            {
                "rank": 1,
                "name": "dlw1",
                "addr": "0.0.0.0:0\/1"
            },
            {
                "rank": 2,
                "name": "dlw2",
                "addr": "0.0.0.0:0\/2"
            },
            {
                "rank": 3,
                "name": "dlw3",
                "addr": "0.0.0.0:0\/3"
            }
        ]
    }
}

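Querying the asok file created when the mon service started shows that dlw4 is in its own local monmap (epoch 0, with placeholder 0.0.0.0 addresses for the other mons), but its state is probing. The other three mons are in the leader and peon states (leader and worker), while dlw4 is still probing. In other words, the mon on dlw4 is running normally but is not part of the cluster's mon election: it has not connected to the cluster yet.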
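
To watch for the moment a mon leaves probing, the state field can be pulled out of the mon_status output. This is a rough sketch that assumes the layout shown above, with a single "state": "..." pair in the output. mon_state is a hypothetical helper, and a real JSON parser such as jq is the more robust choice.

```shell
# Read mon_status JSON on stdin and print the value of its "state" field.
mon_state() {
    sed -n 's/.*"state":[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Sample input copied from the dlw4 output above, trimmed to a few fields.
sample='{
    "name": "dlw4",
    "rank": 0,
    "state": "probing",
    "election_epoch": 0
}'
printf '%s\n' "$sample" | mon_state
```

On a live node the input would come from the admin socket instead: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.dlw4.asok mon_status | mon_state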

##3.4. Check the logs

On dlw1 (172.16.40.11):

# tail -f ceph-mon.dlw1.log
2017-08-02 16:42:52.877734 7fbfc2333700  0 -- 172.16.40.11:6789/0 >> 172.16.40.9:6789/0 pipe(0x7fbfd9c1c800 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x7fbfd7360700).accept: got bad authorizer
2017-08-02 16:42:54.879024 7fbfc2333700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
2017-08-02 16:42:54.879028 7fbfc2333700  0 mon.dlw1@0(leader) e3 ms_verify_authorizer bad authorizer from mon 172.16.40.9:6789/0
2017-08-02 16:42:54.879034 7fbfc2333700  0 -- 172.16.40.11:6789/0 >> 172.16.40.9:6789/0 pipe(0x7fbfd93c6000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x7fbfd7360400).accept: got bad authorizer
2017-08-02 16:42:55.076972 7fbfc06d0700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2017-08-02 16:42:55.076981 7fbfc06d0700  0 -- 172.16.40.11:6789/0 >> 172.16.40.9:6789/0 pipe(0x7fbfd9c1b400 sd=19 :55595 s=1 pgs=0 cs=0 l=0 c=0x7fbfd8b0ab80).failed verifying authorize reply
2017-08-02 16:42:56.885648 7fbfc2333700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
2017-08-02 16:42:56.885657 7fbfc2333700  0 mon.dlw1@0(leader) e3 ms_verify_authorizer bad authorizer from mon 172.16.40.9:6789/0

On dlw4 (172.16.40.9):

[root@dlw4 ceph]# tail -f ceph-mon.dlw4.log 
2017-08-02 16:43:14.890240 7fc9b6483700  0 -- 172.16.40.9:6789/0 >> 172.16.40.13:6789/0 pipe(0x7fc9cb59e800 sd=24 :40309 s=1 pgs=0 cs=0 l=0 c=0x7fc9cb3fad00).failed verifying authorize reply
2017-08-02 16:43:14.919426 7fc9b6584700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2017-08-02 16:43:14.919456 7fc9b6584700  0 -- 172.16.40.9:6789/0 >> 172.16.40.12:6789/0 pipe(0x7fc9cb59d400 sd=15 :34233 s=1 pgs=0 cs=0 l=0 c=0x7fc9cb3fab80).failed verifying authorize reply
2017-08-02 16:43:15.693119 7fc9b5b81700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
2017-08-02 16:43:15.693129 7fc9b5b81700  0 mon.dlw4@0(probing) e0 ms_verify_authorizer bad authorizer from mon 172.16.40.13:6789/0

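Comparing the two logs confirms the earlier guess. On one side, the existing mons reject the authorizer that dlw4 presents when it connects. On the other side, dlw4 fails to verify the replies it gets when it connects to the other mons. Ceph authenticates with cephx, and cephx uses shared secret keys: the client and the monitor cluster each hold a copy of the client's secret key. With such a protocol both parties can authenticate each other without revealing the key. The cluster is sure the user holds the key, and the user is sure the cluster holds a copy of it.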

##3.5. Fix the key

Check the key of every mon in the cluster. dlw1, dlw2 and dlw3 all have the same keyring:

# cat keyring 
[mon.]
        key = AQDiJHBZAAAAABAAiAz+B0XamXqLSemUudvStA==
        caps mon = "allow *"

The keyring on dlw4, however, is:

# cd /var/lib/ceph/mon/ceph-dlw4/
# cat keyring 
[mon.]
        key = AQCIUIBZAAAAABAASHOxkpYwK6BlD4ITbuIrkQ==
        caps mon = "allow *"

So I edited dlw4's keyring by hand, changing the key in the file to dlw1's key, and then restarted the mon service on dlw4:

[root@dlw4 ceph-dlw4]# systemctl restart ceph-mon@`hostname`
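
A mismatch like this is easy to detect mechanically. The sketch below extracts the key = ... value from two keyring files and compares them. keyring_key and keyrings_match are hypothetical names, and the script assumes both files are already on one machine (for example copied over with scp), which the steps above did not do.

```shell
# Print the value of the first "key = ..." line in a keyring file.
keyring_key() {
    awk '$1 == "key" && $2 == "=" { print $3; exit }' "$1"
}

# Succeed only if both keyrings contain the same non-empty key.
keyrings_match() {
    a=$(keyring_key "$1")
    b=$(keyring_key "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Demo with the two keys shown above, written to temporary files.
old=$(mktemp)
new=$(mktemp)
printf '[mon.]\n        key = AQDiJHBZAAAAABAAiAz+B0XamXqLSemUudvStA==\n        caps mon = "allow *"\n' > "$old"
printf '[mon.]\n        key = AQCIUIBZAAAAABAASHOxkpYwK6BlD4ITbuIrkQ==\n        caps mon = "allow *"\n' > "$new"
if keyrings_match "$old" "$new"; then
    echo "keys match"
else
    echo "keys differ"
fi
rm -f "$old" "$new"
```

Running the same comparison before starting a new mon would have flagged the problem without a trip through the logs.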

##3.6. Check the status

# ceph -s
    cluster f4833745-d220-407b-82ea-72eb6297d435
     health HEALTH_OK
     monmap e4: 4 mons at {dlw1=172.16.40.11:6789/0,dlw2=172.16.40.12:6789/0,dlw3=172.16.40.13:6789/0,dlw4=172.16.40.9:6789/0}
            election epoch 16, quorum 0,1,2,3 dlw4,dlw1,dlw2,dlw3
     osdmap e26: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v9700: 352 pgs, 6 pools, 45725 kB data, 20 objects
            253 MB used, 584 GB / 584 GB avail
                 352 active+clean

There are now 4 mons.

[root@dlw4 ceph-dlw4]# ceph quorum_status  -f json-pretty

{
    "election_epoch": 16,
    "quorum": [
        0,
        1,
        2,
        3
    ],
    "quorum_names": [
        "dlw4",
        "dlw1",
        "dlw2",
        "dlw3"
    ],
    "quorum_leader_name": "dlw4",
    "monmap": {
        "epoch": 4,
        "fsid": "f4833745-d220-407b-82ea-72eb6297d435",
        "modified": "2017-08-02 16:54:32.549853",
        "created": "2017-07-20 12:38:26.592488",
        "mons": [
            {
                "rank": 0,
                "name": "dlw4",
                "addr": "172.16.40.9:6789\/0"
            },
            {
                "rank": 1,
                "name": "dlw1",
                "addr": "172.16.40.11:6789\/0"
            },
            {
                "rank": 2,
                "name": "dlw2",
                "addr": "172.16.40.12:6789\/0"
            },
            {
                "rank": 3,
                "name": "dlw3",
                "addr": "172.16.40.13:6789\/0"
            }
        ]
    }
}

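#4. Summary

Adding the mon ran into two problems in total.

The first problem: adding a mon requires public_network to be set.

The second problem: because the mon was first added without public_network, a keyring different from the original cluster's was generated. The mons therefore could not authenticate each other through cephx, and the new mon could not join the cluster's mon election.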

© Copyright belongs to the author
