Installing KubeSphere v3.4.0, an Excellent K8s Container Management Platform

Introduction

KubeSphere is a container platform for managing Kubernetes across multi-cloud, data-center, and edge environments.

KubeSphere's vision is to build a cloud-native distributed operating system with Kubernetes as its kernel. Its architecture makes it easy to integrate third-party applications with cloud-native ecosystem components in a plug-and-play fashion, and it supports unified distribution and operations management of cloud-native applications across multiple clouds and clusters. The figure below shows the product architecture.

[Figure: KubeSphere product architecture]

Installation

Prerequisites

  • A default StorageClass is ready (a quick check follows below)
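
To check, look for the (default) marker in the NAME column of the output; if no class is marked, one can be annotated as default (the class name below is a placeholder for your own):

kubectl get storageclass
kubectl patch storageclass <your-storageclass> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'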

Install KubeSphere on an existing Kubernetes cluster (v1.29.2)

kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.4.0/kubesphere-installer.yaml

kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.4.0/cluster-configuration.yaml

Watch the installation progress:

kubectl  logs ks-installer-6bdfffb7d-9kb2j -n kubesphere-system -f
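
The installer pod name carries a random suffix, so the official docs suggest a label-selector form that avoids looking the name up first:

kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f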

Once installation succeeds, log in to the KubeSphere console at NodeIP:30880; the default account is admin / P@88w0rd.

# kubectl  get svc ks-console -n kubesphere-system
NAME         TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
ks-console   NodePort   10.111.251.177   <none>        80:30880/TCP   34h
  • Cluster management home
    [Screenshot: cluster management home]

  • Cluster status
    [Screenshot: cluster status]

  • Node status
    [Screenshot: node status]

Issues

Login

The problem encountered:

request to http://ks-apiserver/oauth/token failed, reason: getaddrinfo EAI_AGAIN ks-apiserver

Investigation: the ks-console Pod that handles console logins fails when calling the ks-apiserver Pod. The initial suspicion is CoreDNS, which is responsible for in-cluster DNS resolution.

# kubectl  logs ks-console-56b9b686c8-k8nxq -n kubesphere-system -f

> kubesphere-console@3.0.0 serve /opt/kubesphere/console
> NODE_ENV=production node server/server.js

Dashboard app running at port 8000

  --> POST /login 200 5,004ms 120b 2024/03/19T12:59:04.570
  <-- POST /login 2024/03/19T13:07:00.295
FetchError: request to http://ks-apiserver/oauth/token failed, reason: getaddrinfo EAI_AGAIN ks-apiserver
    at ClientRequest.<anonymous> (/opt/kubesphere/console/server/server.js:19500:11)
    at ClientRequest.emit (events.js:314:20)
    at Socket.socketErrorListener (_http_client.js:427:9)
    at Socket.emit (events.js:314:20)
    at emitErrorNT (internal/streams/destroy.js:92:8)
    at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)
    at processTicksAndRejections (internal/process/task_queues.js:84:21) {
  type: 'system',
  errno: 'EAI_AGAIN',
  code: 'EAI_AGAIN'
}
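
To confirm the DNS suspicion, a lookup can be run from a throwaway pod, following the Kubernetes DNS-debugging guide (the dnsutils manifest comes from the Kubernetes docs):

kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
kubectl exec -it dnsutils -- nslookup ks-apiserver.kubesphere-system.svc.cluster.local
kubectl delete pod dnsutils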

Following the official community thread "Login fails after a successful KubeSphere v3.1 installation", comment out CoreDNS's upstream forward configuration:

# kubectl -n kube-system edit cm coredns -o yaml
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
       #forward . /etc/resolv.conf {
       #    max_concurrent 1000
       # }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2024-03-15T06:21:58Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "500334"
  uid: 3dadaea2-3800-4ae6-95d0-8c75713b3c1f

A thought: forward is what resolves domains outside the cluster. With it commented out, external name resolution will fail later on; see the end of this article.
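
A gentler fix would be to keep forward but point it at an explicit upstream instead of the node's /etc/resolv.conf (a sketch; the resolver IP is a placeholder for whatever upstream DNS your nodes can actually reach):

        forward . 223.5.5.5 {
            max_concurrent 1000
        }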

Restart the Pods:

# kubectl  rollout restart deploy coredns -n kube-system
deployment.apps/coredns restarted
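
Wait for the rollout to finish and confirm the new CoreDNS pods are Running (k8s-app=kube-dns is the label kubeadm gives CoreDNS):

kubectl -n kube-system rollout status deploy coredns
kubectl -n kube-system get pods -l k8s-app=kube-dns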

Login now works.

Pod prometheus-k8s-0 fails to initialize

# kubectl  describe pod/prometheus-k8s-0 -n kubesphere-monitoring-system
Name:             prometheus-k8s-0
Namespace:        kubesphere-monitoring-system
...
Events:
  Type     Reason       Age                From               Message
  ----     ------       ----               ----               -------
  Normal   Scheduled    27s                default-scheduler  Successfully assigned kubesphere-monitoring-system/prometheus-k8s-0 to hadoop-30.com
  Warning  FailedMount  11s (x6 over 26s)  kubelet            MountVolume.SetUp failed for volume "secret-kube-etcd-client-certs" : secret "kube-etcd-client-certs" not found

The secret kube-etcd-client-certs is indeed missing:

# kubectl get secrets -n kubesphere-monitoring-system
NAME                                         TYPE                 DATA   AGE
alertmanager-main                            Opaque               1      5h52m
alertmanager-main-generated                  Opaque               1      5h52m
alertmanager-main-tls-assets-0               Opaque               0      5h52m
notification-manager-webhook-server-cert     kubernetes.io/tls    2      5h51m
prometheus-k8s                               Opaque               1      5h52m
prometheus-k8s-tls-assets-0                  Opaque               0      5h52m
prometheus-k8s-web-config                    Opaque               1      5h52m
sh.helm.release.v1.notification-manager.v1   helm.sh/release.v1   1      5h51m
sh.helm.release.v1.notification-manager.v2   helm.sh/release.v1   1      5h51m

Following the KubeSphere community guidance, create it:

kubectl -n kubesphere-monitoring-system create  secret generic kube-etcd-client-certs \
 --from-file=etcd-client-ca.crt=/etc/kubernetes/pki/etcd/ca.crt \
 --from-file=etcd-client.crt=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
 --from-file=etcd-client.key=/etc/kubernetes/pki/etcd/healthcheck-client.key
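
Prometheus mounts this secret because etcd monitoring is enabled (etcd.monitoring: true in the ClusterConfiguration shown later). The certificate paths above are kubeadm defaults; adjust them if your etcd certs live elsewhere. Verify the secret landed with its three keys:

kubectl -n kubesphere-monitoring-system describe secret kube-etcd-client-certs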

Restart:

# kubectl rollout restart statefulset prometheus-k8s -n kubesphere-monitoring-system
Warning: spec.template.spec.containers[1].ports[0]: duplicate port definition with spec.template.spec.initContainers[0].ports[0]
statefulset.apps/prometheus-k8s restarted

It is running normally now, and KubeSphere's monitoring charts are showing up too.

# kubectl  get pod prometheus-k8s-0 -n kubesphere-monitoring-system
NAME               READY   STATUS    RESTARTS   AGE
prometheus-k8s-0   2/2     Running   0          15m

High server load from enabling too many components

There is only one worker node, and it has already gone NodeNotReady:

# kubectl  describe node hadoop-30.com
  Container Runtime Version:    containerd://1.7.14
  Kubelet Version:              v1.29.2
  Kube-Proxy Version:           v1.29.2
PodCIDR:                        10.244.1.0/24
PodCIDRs:                       10.244.1.0/24
Non-terminated Pods:            (38 in total)
  Namespace                     Name                                                   CPU Requests  CPU Limits   Memory Requests  Memory Limits  Age
  ---------                     ----                                                   ------------  ----------   ---------------  -------------  ---
  default                       nfs-client-provisioner-648798c484-ljrc6                0 (0%)        0 (0%)       0 (0%)           0 (0%)         162m
  default                       nginx-1-17-5d5dc64fc7-7b6hl                            0 (0%)        0 (0%)       0 (0%)           0 (0%)         40h
  default                       nginx-59dc9c9459-nf9cb                                 0 (0%)        0 (0%)       0 (0%)           0 (0%)         2d1h
  default                       nginx-web-65d5f4d459-ptt46                             0 (0%)        0 (0%)       0 (0%)           0 (0%)         28h
  default                       redis-7c888f4788-gflms                                 0 (0%)        0 (0%)       0 (0%)           0 (0%)         42h
  kube-flannel                  kube-flannel-ds-mp2hz                                  100m (2%)     0 (0%)       50Mi (0%)        0 (0%)         3d14h
  kube-system                   coredns-856fc85d8f-7cjl4                               100m (2%)     0 (0%)       70Mi (0%)        170Mi (2%)     23m
  kube-system                   coredns-856fc85d8f-f7wnd                               100m (2%)     0 (0%)       70Mi (0%)        170Mi (2%)     23m
  kube-system                   kube-proxy-j2dmm                                       0 (0%)        0 (0%)       0 (0%)           0 (0%)         3d15h
  kube-system                   metrics-server-95d9bdd76-gmbhs                         0 (0%)        0 (0%)       0 (0%)           0 (0%)         12m
  kube-system                   snapshot-controller-0                                  0 (0%)        0 (0%)       0 (0%)           0 (0%)         155m
  kubernetes-dashboard          kubernetes-dashboard-api-6c49db8b5d-kxxdg              100m (2%)     250m (6%)    200Mi (2%)       400Mi (5%)     16h
  kubernetes-dashboard          kubernetes-dashboard-api-6c49db8b5d-pm66j              100m (2%)     250m (6%)    200Mi (2%)       400Mi (5%)     16h
  kubernetes-dashboard          kubernetes-dashboard-api-6c49db8b5d-w7l2n              100m (2%)     250m (6%)    200Mi (2%)       400Mi (5%)     16h
  kubernetes-dashboard          kubernetes-dashboard-auth-77fc776d6f-5q54s             100m (2%)     250m (6%)    200Mi (2%)       400Mi (5%)     16h
  kubernetes-dashboard          kubernetes-dashboard-kong-cbdf8f7d9-bndnn              0 (0%)        0 (0%)       0 (0%)           0 (0%)         16h
  kubernetes-dashboard          kubernetes-dashboard-metrics-scraper-956d55d9-46dj2    100m (2%)     250m (6%)    200Mi (2%)       400Mi (5%)     16h
  kubernetes-dashboard          kubernetes-dashboard-web-5f57f6bc56-v5cxv              100m (2%)     250m (6%)    200Mi (2%)       400Mi (5%)     16h
  kubesphere-controls-system    default-http-backend-68686bdb6-bbxxt                   10m (0%)      10m (0%)     20Mi (0%)        20Mi (0%)      154m
  kubesphere-controls-system    kubectl-admin-7df4b6b5b6-xwmr5                         0 (0%)        0 (0%)       0 (0%)           0 (0%)         147m
  kubesphere-logging-system     fluent-bit-zxqbs                                       10m (0%)      500m (12%)   25Mi (0%)        200Mi (2%)     9m59s
  kubesphere-logging-system     fluentbit-operator-687c567566-sncr7                    10m (0%)      200m (5%)    20Mi (0%)        200Mi (2%)     10m
  kubesphere-logging-system     logsidecar-injector-deploy-56f59bcbcf-9qzs7            20m (0%)      200m (5%)    20Mi (0%)        200Mi (2%)     7m24s
  kubesphere-logging-system     logsidecar-injector-deploy-56f59bcbcf-gck6w            20m (0%)      200m (5%)    20Mi (0%)        200Mi (2%)     7m24s
  kubesphere-logging-system     opensearch-cluster-data-0                              1 (25%)       0 (0%)       1536Mi (19%)     0 (0%)         10m
  kubesphere-logging-system     opensearch-cluster-master-0                            1 (25%)       0 (0%)       512Mi (6%)       0 (0%)         10m
  kubesphere-monitoring-system  alertmanager-main-0                                    120m (3%)     300m (7%)    80Mi (1%)        250Mi (3%)     150m
  kubesphere-monitoring-system  kube-state-metrics-65c69d6b64-l7s77                    120m (3%)     3 (75%)      190Mi (2%)       8392Mi (108%)  151m
  kubesphere-monitoring-system  node-exporter-84k4s                                    112m (2%)     2 (50%)      200Mi (2%)       600Mi (7%)     151m
  kubesphere-monitoring-system  notification-manager-deployment-85757c9bfb-hn78m       5m (0%)       500m (12%)   20Mi (0%)        500Mi (6%)     149m
  kubesphere-monitoring-system  notification-manager-operator-859d858c57-dgvgq         10m (0%)      100m (2%)    30Mi (0%)        100Mi (1%)     149m
  kubesphere-monitoring-system  prometheus-operator-77d99c696f-qrgws                   110m (2%)     1200m (30%)  120Mi (1%)       300Mi (3%)     151m
  kubesphere-system             ks-apiserver-85cdc5f879-44gsb                          20m (0%)      1 (25%)      100Mi (1%)       1Gi (13%)      154m
  kubesphere-system             ks-console-56b9b686c8-k8nxq                            20m (0%)      1 (25%)      100Mi (1%)       1Gi (13%)      154m
  kubesphere-system             ks-controller-manager-5df5cd69f-n5drp                  30m (0%)      1 (25%)      50Mi (0%)        1000Mi (12%)   154m
  kubesphere-system             ks-installer-6bdfffb7d-bz5t7                           20m (0%)      1 (25%)      100Mi (1%)       1Gi (13%)      156m
  kubesphere-system             minio-6476d94fcb-4dw8z                                 250m (6%)     0 (0%)       256Mi (3%)       0 (0%)         11m
  kubesphere-system             openpitrix-import-job-jdfnv                            0 (0%)        0 (0%)       0 (0%)           0 (0%)         8m2s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                3787m (94%)   13710m (342%)
  memory             4789Mi (62%)  17774Mi (230%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
Events:
  Type    Reason        Age    From             Message
  ----    ------        ----   ----             -------
  Normal  NodeNotReady  4m22s  node-controller  Node hadoop-30.com status is now: NodeNotReady
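
With metrics-server enabled, kubectl top shows where the pressure comes from (assuming the node is still responsive enough to report metrics):

kubectl top node hadoop-30.com
kubectl top pod -A --sort-by=cpu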

The console is no longer reachable, so modify the ClusterConfiguration custom resource from a control-plane node instead:

[root@clouderamanager-15 net.d]# kubectl  get crd | grep clusterconfiguration
clusterconfigurations.installer.kubesphere.io       2024-03-19T01:44:21Z

[root@clouderamanager-15 net.d]# kubectl get clusterconfigurations.installer.kubesphere.io -A
NAMESPACE           NAME           AGE
kubesphere-system   ks-installer   3h56m

Change some components from true to false:

[root@clouderamanager-15 net.d]# kubectl edit clusterconfiguration ks-installer -n kubesphere-system
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
  creationTimestamp: "2024-03-19T01:44:59Z"
  generation: 12
  labels:
    version: v3.4.0
  name: ks-installer
  namespace: kubesphere-system
  resourceVersion: "504258"
  uid: 86e53ba6-7a78-4175-b1ae-34afc18ff7b8
spec:
  alerting:
    enabled: true
  auditing:
    enabled: false
  authentication:
    jwtSecret: ""
  common:
    core:
      console:
        enableMultiLogin: true
        port: 30880
        type: NodePort
    es:
      basicAuth:
        enabled: false
        password: ""
        username: ""
      elkPrefix: logstash
      enabled: false
      externalElasticsearchHost: ""
      externalElasticsearchPort: ""
      logMaxAge: 7
    gpu:
      kinds:
      - default: true
        resourceName: nvidia.com/gpu
        resourceType: GPU
    minio:
      volumeSize: 20Gi
    monitoring:
      GPUMonitoring:
        enabled: false
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
    openldap:
      enabled: false
      volumeSize: 2Gi
    opensearch:
      basicAuth:
        enabled: true
        password: admin
        username: admin
      dashboard:
        enabled: false
      enabled: true
      externalOpensearchHost: ""
      externalOpensearchPort: ""
      logMaxAge: 7
      opensearchPrefix: whizard
    redis:
      enableHA: false
  edgeruntime:
    enabled: false
    kubeedge:
      cloudCore:
        cloudHub:
          advertiseAddress:
          - ""
        service:
          cloudhubHttpsNodePort: "30002"
          cloudhubNodePort: "30000"
          cloudhubQuicNodePort: "30001"
          cloudstreamNodePort: "30003"
          tunnelNodePort: "30004"
      enabled: false
      iptables-manager:
        enabled: true
        mode: external
  etcd:
    endpointIps: localhost
    monitoring: true
    port: 2379
    tlsEnable: true
  events:
    enabled: false
    ruler:
      enabled: true
      replicas: 2
  gatekeeper:
    enabled: false
  local_registry: ""
  logging:
    enabled: false
    logsidecar:
      enabled: true
      replicas: 2
  metrics_server:
    enabled: true
  monitoring:
    gpu:
      nvidia_dcgm_exporter:
        enabled: false
    node_exporter:
      port: 9100
    storageClass: ""
  multicluster:
    clusterRole: none
  network:
    ippool:
      type: none
    networkpolicy:
      enabled: false
    topology:
      type: none
  openpitrix:
    store:
      enabled: false
  persistence:
    storageClass: ""
  servicemesh:
    enabled: false
"/tmp/kubectl-edit-3078338335.yaml" 166L, 6067C written
clusterconfiguration.installer.kubesphere.io/ks-installer edited

Everything is back to normal:

[root@clouderamanager-15 net.d]# kubectl  describe node hadoop-30.com
Name:               hadoop-30.com
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=hadoop-30.com
                    kubernetes.io/os=linux
Annotations:        flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"66:be:4f:9f:61:7e"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.0.0.30
                    kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 15 Mar 2024 22:19:01 +0800
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  hadoop-30.com
  AcquireTime:     <unset>
  RenewTime:       Tue, 19 Mar 2024 13:45:02 +0800
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Tue, 19 Mar 2024 13:42:10 +0800   Tue, 19 Mar 2024 13:42:10 +0800   FlannelIsUp                  Flannel is running on this node
  MemoryPressure       False   Tue, 19 Mar 2024 13:41:18 +0800   Tue, 19 Mar 2024 13:41:18 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 19 Mar 2024 13:41:18 +0800   Tue, 19 Mar 2024 13:41:18 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 19 Mar 2024 13:41:18 +0800   Tue, 19 Mar 2024 13:41:18 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 19 Mar 2024 13:41:18 +0800   Tue, 19 Mar 2024 13:41:18 +0800   KubeletReady                 kubelet is posting ready status
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                3637m (90%)   17810m (445%)
  memory             4843Mi (62%)  33868Mi (438%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
Events:
  Type     Reason                   Age                    From             Message
  ----     ------                   ----                   ----             -------
  Normal   Starting                 3m38s                  kube-proxy
  Normal   NodeNotReady             10m                    node-controller  Node hadoop-30.com status is now: NodeNotReady
  Normal   Starting                 3m45s                  kubelet          Starting kubelet.
  Warning  InvalidDiskCapacity      3m45s                  kubelet          invalid capacity 0 on image filesystem
  Normal   NodeAllocatableEnforced  3m45s                  kubelet          Updated Node Allocatable limit across pods
  Warning  Rebooted                 3m45s (x2 over 3m45s)  kubelet          Node hadoop-30.com has been rebooted, boot id: 04507eb1-eedc-43f9-b2c0-a9c373d30a65
  Normal   NodeHasSufficientMemory  3m45s (x3 over 3m45s)  kubelet          Node hadoop-30.com status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    3m45s (x3 over 3m45s)  kubelet          Node hadoop-30.com status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     3m45s (x3 over 3m45s)  kubelet          Node hadoop-30.com status is now: NodeHasSufficientPID
  Normal   NodeNotReady             3m45s (x2 over 3m45s)  kubelet          Node hadoop-30.com status is now: NodeNotReady
  Normal   NodeReady                3m45s                  kubelet          Node hadoop-30.com status is now: NodeReady

App Store

[Screenshot: App Store]

Enable the App Store in the cluster configuration

KubeSphere manages cluster configuration through the clusterconfiguration CRD, which covers both installing KubeSphere itself and adding components such as the App Store (openpitrix).

kubectl edit clusterconfiguration ks-installer -n kubesphere-system

Set openpitrix.store.enabled to true:

  openpitrix:
    store:
      enabled: true
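
Equivalently, a one-line merge patch makes the same change without an interactive editor (a sketch):

kubectl -n kubesphere-system patch clusterconfiguration ks-installer \
  --type merge -p '{"spec":{"openpitrix":{"store":{"enabled":true}}}}'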

Watch the installation progress:

kubectl  logs ks-installer-6bdfffb7d-9kb2j -n kubesphere-system -f

After it completes, an App Store entry appears in the top navigation.

Only 15 applications ship by default; if you want a richer selection, add application repositories.

Create a workspace and project

Adding an application repository first requires a workspace. The default one, system-workspace (a built-in workspace automatically created by KubeSphere that contains all the system components needed to run KubeSphere), is reserved for KubeSphere's own components.

Just follow the official documentation: Create Workspaces, Projects, Users and Roles.

Add a Helm repository

[Screenshot: adding a Helm repository]

From the Workbench, enter the newly created workspace; under App Management, find App Repositories and click Add on the right.

Here are three well-stocked Helm repositories:

https://charts.bitnami.com/bitnami
https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
https://mirror.azure.cn/kubernetes/charts
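
To sanity-check a repository before adding it in the console, its index can be fetched with the helm CLI (assuming helm is installed on a machine with internet access):

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm search repo bitnami | head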

While adding a repository you may hit the error Bad Request Get "https://charts.bitnami.com/bitnami/index.yaml": dial tcp: lookup charts.bitnami.com on 10.96.0.10:53: server misbehaving. The reason: the Pod asks CoreDNS to resolve the repository's domain, and earlier, while fixing the login issue, we commented out CoreDNS's reference to the node's /etc/resolv.conf.

Fix: run kubectl edit cm coredns -n kube-system and uncomment those lines. It turns out login is no longer affected.
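
Restored, the block reads as it originally did, and CoreDNS needs another restart to pick it up:

        forward . /etc/resolv.conf {
            max_concurrent 1000
        }

kubectl rollout restart deploy coredns -n kube-system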

Create an application

Enter the new workspace -> App Workloads -> Apps -> Create App (from an app template):

[Screenshots: creating an app from a template]
