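Kubernetes clusters are usually set up with kubeadm (the officially recommended tool), from binaries, or with minikube. This article uses kubeadm.
I. Base environment configuration
1. Prepare four virtual machines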
Kubernetes Master01 192.168.100.11 kube-master-01 master
Kubernetes Minion01 192.168.100.12 kube-minion-01 minion
Kubernetes Minion02 192.168.100.13 kube-minion-02 minion
Kubernetes Minion03 192.168.100.14 kube-minion-03 minion
2. Configure the hosts file
cat >> /etc/hosts<<EOF
192.168.100.11 kube-master-01
192.168.100.12 kube-minion-01
192.168.100.13 kube-minion-02
192.168.100.14 kube-minion-03
EOF
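Running the heredoc above twice leaves duplicate lines in /etc/hosts. A minimal sketch of an idempotent alternative follows; the function name add_host_entry and its optional third argument (a file path, so the function can be tried on a copy) are illustrative and not part of the original steps.

```shell
# Append "IP NAME" to a hosts file unless NAME is already listed there.
# The third argument is the file to edit; it defaults to /etc/hosts.
add_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # Look for NAME as an exact field on any non-comment line.
    if awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (as root, on every machine):
#   add_host_entry 192.168.100.11 kube-master-01
#   add_host_entry 192.168.100.12 kube-minion-01
```

Calling it again with the same name is a no-op, so the step can be re-run safely.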
3. Set the hostname (on each machine, using the name from the table above)
sudo hostnamectl set-hostname <newhostname>
4. Disable the system firewall
systemctl stop firewalld
systemctl disable firewalld
5. Disable swap
swapoff -a
echo "swapoff -a" >>/etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local
Note: alternatively, disable swap at boot by editing /etc/fstab and commenting out the swap partition entry.
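The /etc/fstab edit mentioned in the note can be scripted. The sketch below assumes a conventional fstab layout in which the third field is the filesystem type; the function name comment_out_swap is illustrative.

```shell
# Comment out active swap entries in an fstab-style file.
# A backup copy is written next to the file with a .bak suffix.
comment_out_swap() {
    file=${1:-/etc/fstab}
    # Match uncommented lines whose third field (filesystem type) is "swap"
    # and prefix them with "#".
    sed -i.bak -E \
        's/^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' \
        "$file"
}

# Usage (as root): comment_out_swap /etc/fstab
```

With the fstab entry commented out, the rc.local line is no longer needed; swapoff -a is still required for the current boot.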
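6. Disable SELinux
To disable it temporarily, run setenforce 0. To disable it permanently, edit /etc/selinux/config and change SELINUX=enforcing to SELINUX=disabled; the permanent change takes effect only after a reboot. The commands are:
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
7. Let iptables process bridged IPv4/IPv6 traffic
echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward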
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
Run sysctl --system to apply the configuration.
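Before running sysctl --system it can help to confirm that the file really contains all three settings. The following is a small sketch; the function name missing_sysctl_keys is illustrative, and it only inspects the file, not the live kernel values.

```shell
# Print every required key that is not set to 1 in a sysctl-style file.
missing_sysctl_keys() {
    file=$1
    for key in net.bridge.bridge-nf-call-ip6tables \
               net.bridge.bridge-nf-call-iptables \
               net.ipv4.ip_forward; do
        # Strip blanks so that "key = 1" and "key=1" compare equally.
        if ! tr -d ' \t' < "$file" | grep -qxF -- "$key=1"; then
            echo "$key"
        fi
    done
}

# Usage: missing_sysctl_keys /etc/sysctl.d/k8s.conf
```

Empty output means every key is present with the value 1.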
8. Synchronize the system time
yum -y install ntp
systemctl start ntpd
systemctl enable ntpd
II. Cluster environment configuration
1. Install Docker
Configure the repository:
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
Install the docker-ce container engine:
yum -y install docker-ce-18.06.1.ce-3.el7
Check the Docker version with docker --version and the detailed information with docker info.
Enable Docker at boot and start it: systemctl enable docker && systemctl start docker
Modify the Docker daemon parameters:
cat > /etc/docker/daemon.json <<EOF
{
"registry-mirrors": ["https://yywkvob3.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
Edit Docker's systemd unit file docker.service and add the following line under the [Service] section:
ExecStartPost=/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT
After editing, reload systemd and restart Docker: systemctl daemon-reload && systemctl restart docker
2. Install the Kubernetes components
Configure the repository:
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
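Install the components: yum install -y kubelet kubeadm kubectl
To match the v1.16.0 images used below, the packages can be pinned, for example yum install -y kubelet-1.16.0 kubeadm-1.16.0 kubectl-1.16.0.
3. Configure and enable kubelet
Configure kubelet to use the pause image pulled from the domestic mirror (re-tagged as k8s.gcr.io/pause:3.1 in a later step) and set kubelet's cgroup driver.
The cgroup driver must match Docker's; check it with docker info.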
cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd --pod-infra-container-image=k8s.gcr.io/pause:3.1 --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"
EOF
Apply the configuration: systemctl daemon-reload
Enable kubelet at boot: systemctl enable kubelet
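The requirement that kubelet and Docker use the same cgroup driver can be checked mechanically. This is a hedged sketch: the two helper names are illustrative, and the parsing assumes docker info prints a "Cgroup Driver: <name>" line and that the kubelet file carries a --cgroup-driver=<name> flag as configured above.

```shell
# Read `docker info` output on stdin and print the cgroup driver name.
cgroup_driver_from_info() {
    awk -F': *' '/Cgroup Driver/ { print $2; exit }'
}

# Print the value of --cgroup-driver from a kubelet sysconfig file.
kubelet_cgroup_driver() {
    sed -n 's/.*--cgroup-driver=\([^" ]*\).*/\1/p' "${1:-/etc/sysconfig/kubelet}"
}

# Usage (on a host with Docker running):
#   [ "$(docker info 2>/dev/null | cgroup_driver_from_info)" = "$(kubelet_cgroup_driver)" ] \
#       && echo "cgroup drivers match"
```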
4. Configure the master node
On the master node, create the initialization script: vi /etc/kubernetes/kubeadm-init.sh
kubeadm init \
--kubernetes-version=v1.16.0 \
--pod-network-cidr=10.244.0.0/16 \
--apiserver-advertise-address=192.168.100.11
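Make the script executable: chmod +x /etc/kubernetes/kubeadm-init.sh
5. Initialize the master node
Initialization pulls images from k8s.gcr.io, which is not reachable from mainland China, so pull them manually from a domestic mirror.
First run kubeadm config images list to see which images need to be pulled manually.
On the master node, create the image pull script: vi /etc/kubernetes/kubeadm-pull.sh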
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.15-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.2
docker pull quay.io/coreos/flannel:v0.11.0-amd64
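Make the script executable: chmod +x /etc/kubernetes/kubeadm-pull.sh
Run the image pull script: /etc/kubernetes/kubeadm-pull.sh
Once the images are pulled, re-tag them under k8s.gcr.io so that initialization finds them locally instead of failing to pull.
On the master node, create the image tagging script: vi /etc/kubernetes/kubeadm-tags.sh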
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.16.0 k8s.gcr.io/kube-apiserver:v1.16.0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.16.0 k8s.gcr.io/kube-controller-manager:v1.16.0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.16.0 k8s.gcr.io/kube-scheduler:v1.16.0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.16.0 k8s.gcr.io/kube-proxy:v1.16.0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.15-0 k8s.gcr.io/etcd:3.3.15-0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.2 k8s.gcr.io/coredns:1.6.2
docker tag quay.io/coreos/flannel:v0.11.0-amd64 k8s.gcr.io/flannel:v0.11.0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.16.0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.16.0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.16.0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.16.0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.15-0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.2
docker rmi quay.io/coreos/flannel:v0.11.0-amd64
Make the script executable: chmod +x /etc/kubernetes/kubeadm-tags.sh
Run the image tagging script: /etc/kubernetes/kubeadm-tags.sh
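The pull and tag scripts above repeat the same three commands for every image. As a sketch under stated assumptions, the commands can instead be generated from the output of kubeadm config images list. The function name mirror_commands and the MIRROR variable are illustrative; the sketch assumes every listed image exists under the same name and tag in the Aliyun mirror, and it does not cover the flannel image, which comes from quay.io.

```shell
# Mirror registry that hosts copies of the k8s.gcr.io images (assumption).
MIRROR=${MIRROR:-registry.cn-hangzhou.aliyuncs.com/google_containers}

# Read image names such as k8s.gcr.io/kube-proxy:v1.16.0 on stdin and print
# the docker commands that fetch each one from the mirror and re-tag it.
mirror_commands() {
    while IFS= read -r image; do
        [ -n "$image" ] || continue
        name=${image##*/}    # strip the registry prefix, keep name:tag
        echo "docker pull $MIRROR/$name"
        echo "docker tag $MIRROR/$name $image"
        echo "docker rmi $MIRROR/$name"
    done
}

# Usage: review the generated commands first, then pipe them to sh.
#   kubeadm config images list --kubernetes-version=v1.16.0 | mirror_commands
#   kubeadm config images list --kubernetes-version=v1.16.0 | mirror_commands | sh
```

Printing the commands rather than running them directly keeps the step reviewable before anything is pulled.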
Run /etc/kubernetes/kubeadm-init.sh to initialize the master node.
Note: if initialization fails, reset with the following commands:
kubeadm reset
rm -rf /var/lib/cni/ $HOME/.kube/config
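A successful initialization ends with a join command like the following:
kubeadm join 192.168.100.11:6443 --token orpb71.4ntdi3oq3ct9fmap \
    --discovery-token-ca-cert-hash sha256:c392f20abfc6f58da1140a7112a68bf29e68322bb96397c2ffdb7589079bc512
The other nodes use this command to join the cluster. Save it; it is needed later.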
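One way to keep the join command is to save the kubeadm init output to a file and extract the command from it later. The sketch below assumes the output was captured with something like kubeadm-init.sh | tee kubeadm-init.log (the log file name is hypothetical) and that the join command may be wrapped with a trailing backslash; extract_join_command is an illustrative name.

```shell
# Read saved `kubeadm init` output on stdin and print the join command
# on a single line, merging backslash-continued lines.
extract_join_command() {
    awk '
        /^[[:space:]]*kubeadm join/ { on = 1 }
        on {
            line = $0
            # more is 1 when the line ended with a continuation backslash
            more = sub(/\\[[:space:]]*$/, "", line)
            out = out " " line
            if (!more) exit
        }
        END {
            gsub(/[[:space:]]+/, " ", out)
            sub(/^ /, "", out)
            if (out != "") print out
        }'
}

# Usage: extract_join_command < kubeadm-init.log
```

Bootstrap tokens expire by default, so if the saved command no longer works, kubeadm token create --print-join-command on the master prints a fresh one.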
To manage the cluster with kubectl on the master, run the following commands:
rm -rf $HOME/.kube
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
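Use kubectl get nodes to view the newly initialized master node.
The master node is in the NotReady state because no network plugin is installed yet, so configure the flannel network plugin.
Download the flannel manifest:
wget -P /etc/kubernetes/conf \
    https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Edit the downloaded kube-flannel.yml, remove the parts that are not needed, and specify the network interface (typically through flanneld's --iface argument).
Start flannel: kubectl apply -f /etc/kubernetes/conf/kube-flannel.yml
For security reasons, a cluster initialized with kubeadm does not schedule Pods onto the master node. To let the master node take workloads, run:
kubectl taint nodes --all node-role.kubernetes.io/master-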
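After flannel starts, the node should move from NotReady to Ready. A small sketch for spotting nodes that have not made that transition follows; not_ready_nodes is an illustrative name, and it assumes the default column layout of kubectl get nodes (NAME first, STATUS second).

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print the names
# of nodes whose STATUS column is anything other than exactly "Ready".
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}

# Usage: kubectl get nodes --no-headers | not_ready_nodes
```

Note that a cordoned node reports a status such as Ready,SchedulingDisabled and is therefore listed as well.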
6. Join the worker nodes
First pull the images on each node:
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
docker pull quay.io/coreos/flannel:v0.11.0-amd64
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.16.0 k8s.gcr.io/kube-proxy:v1.16.0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1
docker tag quay.io/coreos/flannel:v0.11.0-amd64 k8s.gcr.io/flannel:v0.11.0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.16.0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
docker rmi quay.io/coreos/flannel:v0.11.0-amd64
Once the images are pulled on every node, run the join command on each of them:
kubeadm join 192.168.100.11:6443 --token orpb71.4ntdi3oq3ct9fmap \
    --discovery-token-ca-cert-hash sha256:c392f20abfc6f58da1140a7112a68bf29e68322bb96397c2ffdb7589079bc512
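7. Deploy the Kubernetes web UI (Dashboard)
Download the manifest from the official Kubernetes GitHub repository:
wget -P /etc/kubernetes/conf https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
Manually pull the image from the Aliyun registry on every node: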
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1 k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1
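Change the Service type in the manifest to NodePort.
Start the web UI: kubectl apply -f /etc/kubernetes/conf/kubernetes-dashboard.yaml
Check the dashboard pod status: kubectl get pods -n kube-system
Check the port mapping: kubectl get svc -n kube-system
Then open https://192.168.100.11:31080/ in a browser. Port 31080 is the node port in this example; use the port that kubectl get svc reports.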
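The node port appears in the PORT(S) column of kubectl get svc in the form servicePort:nodePort/protocol. A minimal sketch for pulling it out of that value follows; node_port is an illustrative name, and the usage line assumes the Service is called kubernetes-dashboard, as in the v1.10.1 manifest.

```shell
# Print the node port from a PORT(S) value such as "443:31080/TCP".
# Returns non-zero when the value has no node port (no colon).
node_port() {
    ports=$1
    case $ports in
        *:*) ;;
        *) return 1 ;;
    esac
    port=${ports#*:}      # drop the service port and the colon
    echo "${port%%/*}"    # drop the protocol suffix
}

# Usage:
#   ports=$(kubectl get svc kubernetes-dashboard -n kube-system --no-headers | awk '{print $5}')
#   echo "https://192.168.100.11:$(node_port "$ports")/"
```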
The login page offers two sign-in methods, kubeconfig and token. The following steps set up both.
Create the YAML file for the dashboard user:
vi /etc/kubernetes/conf/kubernetes-dashboard-user.yaml
# Create Dashboard Service Account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin-user
  namespace: kube-system
---
# Create ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin-user
  namespace: kube-system
Then run kubectl apply -f /etc/kubernetes/conf/kubernetes-dashboard-user.yaml
When it completes, run kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep dashboard-admin-user | awk '{print $1}') to view the token.
Paste the token into the login page to sign in:
eyJhbGciOiJSUzI1NiIsImtpZCI6IkF3QmIxYmVOYUcweXIxODVTdXhxYmZaZG5aQ2FFTzVod2V3bDlzUS1XeFkifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tcTZwejgiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZmM4MGU3MTYtODc3Ny00MmZjLTk2MjQtYmU0NWY5YTI5MjZmIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.NZ7zswny6DO1VkUbXB54b2CFZyNz-IB0nVX9yGOgJP8scAcFU5f6Mvg6AeFnT5Tmw6vdm_B6aXuJouAQEhDwVsYqpa3sI0zzyAfequYs5utWwz_R96gWCLBsrktKNxBpQG2r6JawzWOC3P-vdt1YYgN9jpU5gLo3uyyg0wKYM7KemSPmevqAncXUrm73N-L-4ubKRnYHjuJey1EVnzlSBe0_brV_KRrF5jFiy7Te3ziTmQUa4Z_wgK_yQ_eUoOEMIyu2qNlNfTEr6qdqqczCQo879EXGW4boTHopGQsjlSoI-GUbmrhA9H3H597qKbhmz7cgfA_6lgHpsOeSrBWi0g
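A service account token is a JWT, so its middle segment can be decoded to confirm which service account and secret it belongs to before pasting it into the login page. This is a sketch only: jwt_payload is an illustrative name, it relies on base64 -d as provided by GNU coreutils, and it does not verify the signature.

```shell
# Print the decoded JSON payload (second segment) of a JWT.
jwt_payload() {
    # Take the middle segment and map base64url characters to standard base64.
    seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
    # base64url omits padding; restore it so base64 -d accepts the input.
    case $(( ${#seg} % 4 )) in
        2) seg="$seg==" ;;
        3) seg="$seg=" ;;
    esac
    printf '%s' "$seg" | base64 -d
}

# Usage: jwt_payload "$TOKEN"
```

The payload includes fields such as kubernetes.io/serviceaccount/service-account.name, which should name the account the token was created for.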
Logging in at this point produces the error: the server could not find the requested resource
Check the pod log: kubectl logs kubernetes-dashboard-7c54d59f66-lcz2g -n kube-system
The log shows that the lookup of the heapster service failed. The dashboard depends on heapster to display chart data, so deploy heapster:
Download the heapster YAML files:
wget -P /etc/kubernetes/heapster https://raw.githubusercontent.com/kubernetes-retired/heapster/master/deploy/kube-config/influxdb/grafana.yaml
wget -P /etc/kubernetes/heapster https://raw.githubusercontent.com/kubernetes-retired/heapster/master/deploy/kube-config/influxdb/heapster.yaml
wget -P /etc/kubernetes/heapster https://raw.githubusercontent.com/kubernetes-retired/heapster/master/deploy/kube-config/influxdb/influxdb.yaml
wget -P /etc/kubernetes/heapster https://raw.githubusercontent.com/kubernetes-retired/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml
Check which images need to be deployed (run from /etc/kubernetes/heapster):
cat grafana.yaml | grep image
cat heapster.yaml | grep image
cat influxdb.yaml | grep image
Manually pull and re-tag the heapster images:
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-amd64:v1.5.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-influxdb-amd64:v1.5.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v5.0.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-amd64:v1.5.4 k8s.gcr.io/heapster-amd64:v1.5.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-influxdb-amd64:v1.5.2 k8s.gcr.io/heapster-influxdb-amd64:v1.5.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v5.0.4 k8s.gcr.io/heapster-grafana-amd64:v5.0.4
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-amd64:v1.5.4
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-influxdb-amd64:v1.5.2
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v5.0.4
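Modify the YAML files:
Newer Kubernetes releases changed the API versions, so in the four YAML files above change apiVersion: extensions/v1beta1 to apiVersion: apps/v1.
Because kubelet only serves HTTPS requests on port 10250, change - --source=kubernetes:https://kubernetes.default in heapster.yaml to:
- --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
In the Deployment spec of these YAML files, add a selector (spec.selector.matchLabels) whose labels match the labels of the Pod template; apps/v1 requires this field.
Then deploy from the directory that contains the heapster manifests: kubectl apply -f .
Note: if the deployment fails, roll back with kubectl delete -f .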
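The apiVersion change described above can be applied with sed instead of editing each file by hand. The sketch below assumes extensions/v1beta1 is used only by Deployment objects in these manifests; the function name use_apps_v1 is illustrative. It does not add the selector, which still has to be edited in manually.

```shell
# Replace the removed extensions/v1beta1 API version with apps/v1 in the
# given manifest files. A .bak copy of each file is kept next to it.
use_apps_v1() {
    for f in "$@"; do
        sed -i.bak \
            's|^\([[:space:]]*apiVersion:[[:space:]]*\)extensions/v1beta1[[:space:]]*$|\1apps/v1|' \
            "$f"
    done
}

# Usage (from /etc/kubernetes/heapster):
#   use_apps_v1 grafana.yaml heapster.yaml influxdb.yaml
```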
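After a successful deployment, the Heapster components show as Running in kubectl get pods -n kube-system.
Next, generate the kubeconfig file used for kubeconfig login:
# Decode the token stored in the secret from base64 and keep it in a variable.
# Replace the secret name with the one found by the kubectl get secret command above.
DASH_TOKEN=$(kubectl get secret -n kube-system dashboard-admin-token-q6pz8 -o jsonpath='{.data.token}' | base64 -d)
# Define the cluster
kubectl config set-cluster cluster-admin --server=https://192.168.100.11:6443 --kubeconfig=/etc/kubernetes/conf/dashboard-admin.conf
# Define a cluster user that carries the service account token
kubectl config set-credentials dashboard-admin-user --token=$DASH_TOKEN --kubeconfig=/etc/kubernetes/conf/dashboard-admin.conf
# Define a context that ties the cluster name to the user name
kubectl config set-context dashboard-admin-user@cluster-admin --cluster=cluster-admin --user=dashboard-admin-user \
    --kubeconfig=/etc/kubernetes/conf/dashboard-admin.conf
# Select the context that the file uses by default
kubectl config use-context dashboard-admin-user@cluster-admin --kubeconfig=/etc/kubernetes/conf/dashboard-admin.conf
You can then log in with either the token or the generated file.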
8. Verify the cluster status
Use kubectl get nodes -o wide to list the nodes.
Use kubectl get pods -n kube-system -o wide to list the pods.
Use kubectl get svc -n kube-system -o wide to list the services.
III. Troubleshooting
1. During cluster initialization: /proc/sys/net/ipv4/ip_forward contents are not set to 1
Problem: kubeadm init fails with the error /proc/sys/net/ipv4/ip_forward contents are not set to 1.
Solution: echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward
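2. After the network plugin is installed, the node and coredns are still not ready
Problem: after installing the flannel network plugin, the node is still NotReady and coredns is Pending; systemctl status kubelet shows a cni-flannel version problem.
Solution: vi /etc/cni/net.d/10-flannel.conflist and add "cniVersion": "0.2.0",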
3. Version error when deploying heapster
Problem: deploying heapster reports no matches for kind "Deployment" in version "extensions/v1beta1".
Solution: newer Kubernetes releases changed the API versions; change extensions/v1beta1 in the affected YAML files to apiVersion: apps/v1.
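4. Selector error when deploying heapster
Problem: deploying heapster reports missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec.
Solution: in the Deployment spec of the heapster YAML files, add a selector (spec.selector.matchLabels) that matches the Pod template labels.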
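5. After deployment, accessing the dashboard returns the server could not find the requested resource (404)
This happens because this dashboard version does not support the API of the newly installed Kubernetes 1.16.0. Either wait for a dashboard release that supports it or downgrade the cluster to 1.15.x first.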