biglittleant

Installing a Kubernetes Cluster with kubeadm

Cluster environment

Hostname  IP             Role
node1     192.168.66.11  master
node2     192.168.66.12  worker
node3     192.168.66.13  worker

Upgrade the kernel to 4.4

The 3.10.x kernel that ships with CentOS 7.x has known bugs that make Docker and Kubernetes unstable, so upgrade the kernel first.

rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm

yum --enablerepo=elrepo-kernel install -y kernel-lt kernel-lt-devel kernel-lt-headers

# After installation, check that the new kernel's menuentry in /boot/grub2/grub.cfg
# contains an initrd16 line; if it does not, run the installation again.
grep "initrd16" /boot/grub2/grub.cfg
# Match "(4.4." literally so that versions such as 3.10.0-514.4.1 are not picked up
kernel_version=$(grep "^menuentry" /boot/grub2/grub.cfg | cut -d "'" -f2 | grep -F "(4.4.")

# Boot from the new kernel by default
grub2-set-default "$kernel_version"

# Confirm the change
grub2-editenv list
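
If the grep in the snippet above matches nothing, grub2-set-default is called with an empty string. A minimal sketch of a guard follows, assuming the usual CentOS 7 grub.cfg layout in which the title sits between the first pair of single quotes. The helper name pick_menuentry is hypothetical and not part of GRUB.

```shell
# pick_menuentry FILE PATTERN
# Print the title of the first menuentry in FILE whose title contains the
# literal string PATTERN. Returns non-zero when nothing matches.
# (Hypothetical helper for illustration.)
pick_menuentry() {
    local cfg="$1" pattern="$2" title
    title=$(grep "^menuentry" "$cfg" | cut -d "'" -f2 | grep -F -- "$pattern" | head -n 1)
    [ -n "$title" ] || return 1
    printf '%s\n' "$title"
}

# Usage, in place of the plain kernel_version assignment:
#   kernel_version=$(pick_menuentry /boot/grub2/grub.cfg "(4.4.") \
#       || { echo "no 4.4 kernel entry found" >&2; exit 1; }
#   grub2-set-default "$kernel_version"
```

Taking only the first match also keeps the value a single line if several 4.4 kernels are installed.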

Disable NUMA

cp /etc/default/grub{,.bak}
vim /etc/default/grub   # append numa=off to the GRUB_CMDLINE_LINUX line, as the diff below shows
diff /etc/default/grub.bak /etc/default/grub
6c6
< GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet"
---
> GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet numa=off"
cp /boot/grub2/grub.cfg{,.bak}
grub2-mkconfig -o /boot/grub2/grub.cfg

Reboot the server and confirm that the kernel upgrade took effect.
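
The check after the reboot can be scripted. This is a rough sketch that takes the kernel release and the kernel command line as arguments, so it only inspects strings. The function name kernel_ok is made up for this example.

```shell
# kernel_ok RELEASE CMDLINE
# Succeeds when RELEASE is a 4.4.x kernel and CMDLINE contains numa=off.
# (Hypothetical helper for illustration.)
kernel_ok() {
    case "$1" in
        4.4.*) ;;
        *) echo "unexpected kernel: $1" >&2; return 1 ;;
    esac
    # Pad with spaces so numa=off is matched as a whole word
    case " $2 " in
        *" numa=off "*) ;;
        *) echo "numa=off missing from the kernel command line" >&2; return 1 ;;
    esac
}

# Usage on a rebooted node:
#   kernel_ok "$(uname -r)" "$(cat /proc/cmdline)" && echo "kernel and NUMA setting look right"
```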

Initial configuration

Set the hostnames

hostnamectl set-hostname node1   # run on node1
hostnamectl set-hostname node2   # run on node2
hostnamectl set-hostname node3   # run on node3

# On every node, add the cluster hosts to /etc/hosts
cat >> /etc/hosts<<EOF
192.168.66.11 node1
192.168.66.12 node2
192.168.66.13 node3
EOF
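
Because the heredoc above appends, running it a second time duplicates the three entries. One possible way to make the step repeatable is sketched below. The helper name add_host_entry is an assumption of this example.

```shell
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless that exact line is already present.
# (Hypothetical helper for illustration.)
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    grep -qxF -- "$ip $name" "$file" 2>/dev/null \
        || printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage:
#   add_host_entry /etc/hosts 192.168.66.11 node1
#   add_host_entry /etc/hosts 192.168.66.12 node2
#   add_host_entry /etc/hosts 192.168.66.13 node3
```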

Disable services the system does not need

systemctl stop postfix && systemctl disable postfix

Install dependencies

yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget vim net-tools git lrzsz

Configure the firewall

systemctl stop firewalld && systemctl disable firewalld
yum -y install iptables-services && systemctl start iptables && systemctl enable iptables
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && service iptables save

Disable swap and SELinux

swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
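
The sed expression above only comments out fstab lines that contain " swap " with a space on each side, so a tab-separated entry would be left active and swap would come back after a reboot. It may be worth confirming both the running state and fstab. The sketch below uses two hypothetical helpers, active_swaps and fstab_swaps.

```shell
# active_swaps [FILE]
# Print the number of active swap areas listed in FILE (default /proc/swaps).
# The first line of /proc/swaps is a header, so it is skipped.
active_swaps() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "${1:-/proc/swaps}"
}

# fstab_swaps [FILE]
# Print the number of uncommented entries in FILE (default /etc/fstab)
# whose filesystem type (third field) is "swap".
fstab_swaps() {
    awk '$1 !~ /^#/ && $3 == "swap" { n++ } END { print n + 0 }' "${1:-/etc/fstab}"
}

# Usage:
#   [ "$(active_swaps)" -eq 0 ] && [ "$(fstab_swaps)" -eq 0 ] || echo "swap is still configured" >&2
```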

Tune kernel parameters

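# br_netfilter must be loaded, otherwise the net.bridge.* keys do not exist
modprobe br_netfilter

# Comments go on their own lines: sysctl treats everything after "=" as the value
cat > /etc/sysctl.d/kubernetes.conf <<EOF
# Required: pass bridged traffic through iptables and ip6tables
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
# Required: enable IP forwarding
net.ipv4.ip_forward=1
# Required: disable IPv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv4.tcp_tw_recycle=0
# Avoid swapping; swap is used only when the system is about to run out of memory
vm.swappiness=0
# Do not check whether enough physical memory is available
vm.overcommit_memory=1
# Do not panic on OOM; let the OOM killer run
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
EOF

sysctl -p /etc/sysctl.d/kubernetes.conf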
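
sysctl -p prints what it applied, but a key that failed to apply is easy to miss in the output. As a sketch, the file can be compared against the running values afterwards. sysctl_mismatches is a hypothetical helper name; it relies only on `sysctl -n KEY`, and it assumes the file has no trailing comments on value lines.

```shell
# sysctl_mismatches FILE
# For each "key=value" line in FILE (comment and blank lines are skipped),
# print the keys whose running value, as reported by `sysctl -n`, differs.
# (Hypothetical helper for illustration.)
sysctl_mismatches() {
    local key want have
    grep -Ev '^[[:space:]]*(#|;|$)' "$1" | while IFS='=' read -r key want; do
        # Trim whitespace around the key and the value
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        want=$(printf '%s' "$want" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        have=$(sysctl -n "$key" 2>/dev/null)
        [ "$have" = "$want" ] || printf '%s: want %s, have %s\n' "$key" "$want" "${have:-<unset>}"
    done
}

# Usage:
#   sysctl_mismatches /etc/sysctl.d/kubernetes.conf
```

No output means every key in the file matches the running kernel.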

Set the system time zone

# Set the system time zone to Asia/Shanghai
timedatectl status                       # show the current time settings
timedatectl set-timezone Asia/Shanghai   # repoints /etc/localtime -> ../usr/share/zoneinfo/Asia/Shanghai
timedatectl set-local-rtc 0              # keep the hardware clock in UTC
# Restart the services that depend on the system time
systemctl restart rsyslog
systemctl restart crond

Configure rsyslogd and systemd journald

mkdir -p /var/log/journal   # directory for the persistent journal
mkdir -p /etc/systemd/journald.conf.d
cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
[Journal]
# Persist logs to disk
Storage=persistent
# Compress archived logs
Compress=yes
SyncIntervalSec=5m
RateLimitInterval=30s
RateLimitBurst=1000
# Maximum disk usage: 10G
SystemMaxUse=10G
# Maximum size of a single journal file: 200M
SystemMaxFileSize=200M
# Keep logs for 2 weeks
MaxRetentionSec=2week
# Do not forward logs to syslog
ForwardToSyslog=no
EOF
systemctl restart systemd-journald

Install Docker

# Step 1: install the required system tools
yum install -y yum-utils device-mapper-persistent-data lvm2
# Step 2: add the Docker CE repository
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Step 3: refresh the cache and install Docker CE
yum makecache fast
yum -y install docker-ce

# Create /etc/docker and configure the daemon
mkdir -p /etc/docker

cat > /etc/docker/daemon.json <<EOF
{
  "registry-mirrors": ["https://it8jkcyv.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
# Reload systemd, then restart Docker and enable it at boot
systemctl daemon-reload && systemctl restart docker && systemctl enable docker
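
daemon.json above sets the systemd cgroup driver. A quick way to confirm that the restarted daemon picked it up is to query `docker info` with a format template. The wrapper name docker_cgroup_driver_ok below is hypothetical.

```shell
# docker_cgroup_driver_ok
# Succeeds when the running Docker daemon reports the systemd cgroup driver.
# (Hypothetical helper for illustration.)
docker_cgroup_driver_ok() {
    local driver
    driver=$(docker info --format '{{.CgroupDriver}}' 2>/dev/null)
    if [ "$driver" = "systemd" ]; then
        return 0
    fi
    echo "cgroup driver is '${driver:-unknown}', expected 'systemd'" >&2
    return 1
}

# Usage:
#   docker_cgroup_driver_ok && echo "Docker uses the systemd cgroup driver"
```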

Install the Kubernetes components

Before kube-proxy can use IPVS mode, IPVS (LVS) must be installed and its kernel modules loaded.

# ipvsadm was already installed in the dependencies step above.

modprobe br_netfilter
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF

chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
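
The lsmod | grep at the end of the command above prints whatever matched, so a module that failed to load simply does not appear. A small sketch that names the missing modules instead is shown below. missing_modules is a hypothetical helper; it reads /proc/modules, which does not list modules compiled into the kernel.

```shell
# missing_modules FILE MODULE...
# Print each MODULE that is not listed in FILE (normally /proc/modules).
# (Hypothetical helper for illustration.)
missing_modules() {
    local file="$1" mod
    shift
    for mod in "$@"; do
        # Module names are followed by a space, so "ip_vs " does not match ip_vs_rr
        grep -q "^$mod " "$file" || echo "$mod"
    done
}

# Usage:
#   missing_modules /proc/modules br_netfilter ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4
```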

Install kubeadm (on the master and the worker nodes)

# Add the yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Install the packages
yum -y install kubeadm kubectl kubelet
# Enable kubelet at boot
systemctl enable kubelet.service
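
The unversioned yum install above takes whatever release is newest in the repository, which is why the init step asks for v1.18.0 while the node listing later reports v1.18.5. If matching versions are wanted, the packages can be pinned with yum's name-version syntax. This is only a sketch; k8s_packages is a hypothetical helper and 1.18.5 is taken from the node listing, not from the install command.

```shell
# k8s_packages VERSION
# Print one yum package spec per line for the given Kubernetes release.
# (Hypothetical helper for illustration.)
k8s_packages() {
    # printf repeats its format for each pair of arguments
    printf '%s-%s\n' kubelet "$1" kubeadm "$1" kubectl "$1"
}

# Usage (word splitting of the output is intended):
#   yum -y install $(k8s_packages 1.18.5)
```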

Initialize the Kubernetes environment with kubeadm init.

kubeadm init --apiserver-advertise-address=192.168.66.11 --kubernetes-version=v1.18.0 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 | tee kubeadm-init.log

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
--discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81

Run the commands suggested in the output above

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

kubectl get nodes
NAME STATUS ROLES AGE VERSION
node1 NotReady master 4m5s v1.18.5
# NotReady: the pod network is not installed yet; install flannel next

Deploy the pod network

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

kubectl apply -f kube-flannel.yml

Join node2 and node3 to the cluster

[root@node2 ~]# kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
--discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81

[root@node3 ~]# kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
--discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81
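
The token printed by kubeadm init is a bootstrap token and expires after 24 hours by default, so the join lines above stop working for nodes added later. `kubeadm token create --print-join-command` on the master prints a fresh one. The sketch below wraps that in a hypothetical join_workers helper and assumes root ssh access from the master to the workers, which the original does not set up (it logs in to each node by hand).

```shell
# join_workers HOST...
# Run on the control-plane node. Creates one fresh join command and runs it
# on each HOST over ssh. Root ssh access to the workers is an assumption.
# (Hypothetical helper for illustration.)
join_workers() {
    local join_cmd host
    join_cmd=$(kubeadm token create --print-join-command) || return 1
    for host in "$@"; do
        echo "joining $host"
        ssh "root@$host" "$join_cmd" || return 1
    done
}

# Usage:
#   join_workers node2 node3
```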

Confirm that the nodes are Ready

kubectl get nodes
NAME STATUS ROLES AGE VERSION
node1 Ready master 22h v1.18.5
node2 Ready <none> 21h v1.18.5
node3 Ready <none> 21h v1.18.5
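
Nodes usually take a little while to move from NotReady to Ready after the network plugin starts. Instead of rerunning kubectl get nodes by hand, the check can be polled. This is a rough sketch; not_ready_nodes and wait_nodes_ready are hypothetical names, and the retry count and delay are arbitrary defaults.

```shell
# not_ready_nodes
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Returns 2 if kubectl itself fails.
not_ready_nodes() {
    local listing
    listing=$(kubectl get nodes --no-headers) || return 2
    printf '%s\n' "$listing" | awk 'NF && $2 != "Ready" { print $1 }'
}

# wait_nodes_ready [TRIES] [DELAY_SECONDS]
# Poll until every node is Ready, or give up after TRIES attempts.
wait_nodes_ready() {
    local tries="${1:-30}" delay="${2:-10}" pending
    while [ "$tries" -gt 0 ]; do
        if pending=$(not_ready_nodes) && [ -z "$pending" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    echo "nodes not Ready: ${pending:-unknown}" >&2
    return 1
}

# Usage:
#   wait_nodes_ready 30 10 && echo "all nodes are Ready"
```

A cordoned node reports "Ready,SchedulingDisabled" and is treated as not ready by this comparison.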

Confirm that all pods are running

kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-66bff467f8-6wtcf 1/1 Running 3 21h
kube-system coredns-66bff467f8-zqt6t 1/1 Running 3 21h
kube-system etcd-node1 1/1 Running 3 21h
kube-system kube-apiserver-node1 1/1 Running 3 21h
kube-system kube-controller-manager-node1 1/1 Running 3 21h
kube-system kube-flannel-ds-amd64-88z8h 1/1 Running 2 20h
kube-system kube-flannel-ds-amd64-9jmjr 1/1 Running 2 20h
kube-system kube-flannel-ds-amd64-rk9kj 1/1 Running 4 21h
kube-system kube-proxy-77z8g 1/1 Running 2 20h
kube-system kube-proxy-h76lv 1/1 Running 3 21h
kube-system kube-proxy-pvpdr 1/1 Running 2 20h
kube-system kube-scheduler-node1 1/1 Running 3 21h

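Troubleshooting

Issue 1: no containers start after a server reboot

Cause: the initialization was run through Ansible and the swap change did not take effect, so swap was still enabled after the reboot.
Fix: disable swap and reboot. This resolved the problem.

Issue 2: node2 cannot connect to the API server

Cause: the default rules that ship with iptables-services were not flushed after it was installed.

Fix: iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && service iptables save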

References

https://blog.csdn.net/qq_40806970/article/details/97645628