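Using NFS as the Default StorageClass in K8S

While installing KubeSphere today, I found that the cluster had no StorageClass, so the PVCs it depends on could not be created. Time to fix that.

CBS performs well as a StorageClass, but it costs too much: every PVC is backed by its own CBS disk, sized to whatever the claim requests, which is poor value.

NFS is a better fit, for example Tencent Cloud CFS. The standard edition reads and writes at around 100MB/s, which is enough.

K8S version: v1.29.2

1. Create the NFS

Since K8S is deployed on Tencent Cloud, I chose Tencent Cloud CFS for NFS.

Create a file system in the same availability zone and VPC as the CVM.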

Once it is created, you can see that the capacity limit is 160TB and the throughput limit is 100MB/s. The IP address is 10.0.0.68, which the examples below use.

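Next, install the NFS provisioner so that K8S can read and write CFS, and make it the cluster's default storage class.

2. Install the NFS provisioner

The installation follows nfs-subdir-external-provisioner, provided by the Kubernetes SIGs.

Prerequisites

Clone the repository; its configuration files are used later.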

$ git clone https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner.git
$ cd nfs-subdir-external-provisioner

2.1 Configure RBAC

# Set the subject of the RBAC objects to the current namespace where the provisioner is being deployed
$ NS=$(kubectl config get-contexts|grep -e "^\*" |awk '{print $5}')
$ NAMESPACE=${NS:-default}
$ sed -i'' "s/namespace:.*/namespace: $NAMESPACE/g" ./deploy/rbac.yaml ./deploy/deployment.yaml
$ kubectl create -f deploy/rbac.yaml
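
The pipeline above takes the fifth column (NAMESPACE) of the row marked `*` in `kubectl config get-contexts`, and falls back to `default` when that column is empty. As a minimal alternative sketch, assuming your kubectl supports `config view --minify` with jsonpath output, the same lookup can avoid parsing table columns. The helper name `current_namespace` is made up for illustration.

```shell
# Hypothetical helper: print the namespace of the current kubectl context,
# falling back to "default" when the context does not set one.
current_namespace() {
  ns=$(kubectl config view --minify -o jsonpath='{..namespace}' 2>/dev/null)
  echo "${ns:-default}"
}

# Usage:
#   NAMESPACE=$(current_namespace)
```

The result can be used as `$NAMESPACE` in the `sed` command exactly as before.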

The contents of deploy/rbac.yaml are shown below; it defines the permissions the NFS provisioner depends on.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io

2.2 Configure the nfs provisioner

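Edit the nfs provisioner deployment (deploy/deployment.yaml). Three settings need to change:

  • NFS_SERVER: 10.0.0.68 (10.0.0.68 is my NFS address; the server field under volumes must match)
  • NFS_PATH: / (this NFS was created solely to back the K8S StorageClass, so I use the root directory; the path field under volumes must match)
  • image: the default registry is unreachable from mainland China; pull the image through a proxy, then push it to your own registry

kind: Deployment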
apiVersion: apps/v1
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: k8s-sigs.io/nfs-subdir-external-provisioner
            - name: NFS_SERVER
              value: 10.0.0.68
            - name: NFS_PATH
              value: /
      volumes:
        - name: nfs-client-root
          nfs:
            server: 10.0.0.68
            path: /
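
The three changes above can also be scripted rather than made by hand. The following is only a sketch: the function name `patch_deployment` and the mirror registry `registry.example.com/mirror` are assumptions for illustration, and the `sed` patterns assume the file keeps the layout shown above (each `value:` on the line right after its `name:`).

```shell
# Hypothetical helper: patch_deployment FILE SERVER EXPORT_PATH IMAGE
# Rewrites the NFS server, the export path and the image in deployment.yaml.
patch_deployment() {
  file=$1 server=$2 export_path=$3 image=$4
  sed \
    -e "/name: NFS_SERVER/{n;s|value:.*|value: ${server}|;}" \
    -e "/name: NFS_PATH/{n;s|value:.*|value: ${export_path}|;}" \
    -e "s|^\( *server:\).*|\1 ${server}|" \
    -e "s|^\( *path:\).*|\1 ${export_path}|" \
    -e "s|^\( *image:\).*|\1 ${image}|" \
    "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage (the image name is a placeholder for your own mirror):
#   patch_deployment deploy/deployment.yaml 10.0.0.68 / \
#     registry.example.com/mirror/nfs-subdir-external-provisioner:v4.0.2
```

Writing to a temporary file and moving it into place avoids relying on `sed -i`, whose syntax differs between GNU and BSD sed. The `^ *path:` pattern only matches the volume's `path:` key, so `mountPath:` is left alone.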

Create the deployment

kubectl apply -f deploy/deployment.yaml

Check that the deployment is healthy

# kubectl  get deployment nfs-client-provisioner
NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
nfs-client-provisioner   1/1     1            1           63m
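
If this check is part of a script, `kubectl rollout status` can block until the deployment is ready instead of reading the table by eye. A small sketch follows; the wrapper name `wait_for_provisioner` and the 120s default timeout are arbitrary choices, not values from the upstream project.

```shell
# Hypothetical helper: wait until the provisioner deployment has rolled out.
# Optional first argument: timeout (defaults to 120s).
wait_for_provisioner() {
  kubectl rollout status deployment/nfs-client-provisioner --timeout="${1:-120s}"
}

# Usage:
#   wait_for_provisioner        # wait up to 120s
#   wait_for_provisioner 5m     # wait up to 5 minutes
```

The command exits non-zero if the rollout does not finish within the timeout, which makes it usable as a gate before creating the StorageClass.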

2.3 Deploy the StorageClass

Edit deploy/class.yaml and add the is-default-class annotation to make nfs-client the default StorageClass.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner # or choose another name, must match deployment's env PROVISIONER_NAME
parameters:
  archiveOnDelete: "false"

kubectl apply -f deploy/class.yaml

Check the result: nfs-client (default) carries the default marker.

# kubectl  get sc
NAME                   PROVISIONER                                   RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-client (default)   k8s-sigs.io/nfs-subdir-external-provisioner   Delete          Immediate           false                  66m
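
A cluster should normally have exactly one default StorageClass, so it can help to list every class that carries the marker. The sketch below parses the same `kubectl get sc` table shown above; the function name `default_storageclasses` is hypothetical.

```shell
# Hypothetical helper: print the names of all StorageClasses marked "(default)".
default_storageclasses() {
  kubectl get sc --no-headers | awk '$2 == "(default)" { print $1 }'
}

# Usage:
#   default_storageclasses
#
# For a class that already exists, the annotation can also be set without
# editing class.yaml:
#   kubectl patch storageclass nfs-client -p \
#     '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```

If more than one name is printed, remove the annotation from the classes that should not be the default.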

2.4 Test

$ kubectl create -f deploy/test-claim.yaml -f deploy/test-pod.yaml

Check the status; everything looks fine.

# kubectl  get pvc,pod
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
persistentvolumeclaim/test-claim   Bound    pvc-245dcc5a-d58b-441a-bc21-0ea61b8b7c9f   1Mi        RWX            nfs-client     <unset>                 68m

NAME                                          READY   STATUS      RESTARTS   AGE
pod/nfs-client-provisioner-648798c484-ljrc6   1/1     Running     0          36m

Inspect the directories on the NFS

# mount 10.0.0.68:/ /mnt

# ll /mnt
总用量 0
drwxrwxrwx 2 root root 20 3月  19 10:56 default-test-claim-pvc-245dcc5a-d58b-441a-bc21-0ea61b8b7c9f
drwxrwxrwx 3 root root 26 3月  19 11:07 kubesphere-monitoring-system-prometheus-k8s-db-prometheus-k8s-0-pvc-8214ae61-6092-4334-9f1f-20210c704550

# ll /mnt/default-test-claim-pvc-245dcc5a-d58b-441a-bc21-0ea61b8b7c9f/
总用量 0
-rw-r--r-- 1 root root 0 3月  19 10:56 SUCCESS

With that, NFS is now the default StorageClass for K8S.
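
The point of a default class is that a PVC which does not name a StorageClass is still provisioned. One way to confirm this is to create a claim with no `storageClassName` at all. The sketch below only renders such a manifest; the function name `render_pvc` and the claim name `default-class-check` are made up for illustration.

```shell
# Hypothetical helper: render_pvc NAME [SIZE]
# Prints a PVC manifest that deliberately omits storageClassName,
# so the cluster's default StorageClass is applied.
render_pvc() {
  name=$1 size=${2:-1Mi}
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: ${size}
EOF
}

# Usage (against a real cluster):
#   render_pvc default-class-check | kubectl apply -f -
#   kubectl get pvc default-class-check
```

If the default is in effect, the STORAGECLASS column of the new claim should show nfs-client even though the manifest never mentions it.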

FAQ

0/2 nodes are available: pod has unbound immediate PersistentVolumeClaims.

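The test Pod failed to schedule because its PVC could not be bound. The root cause turned out to be that the nfs provisioner deployment itself had failed to start because of an image problem.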

kubectl  describe pod test-pod
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  34s   default-scheduler  0/2 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
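
When a PVC stays unbound, the scheduler message alone does not say why. A rough diagnostic sequence, assuming the provisioner keeps the name and the `app=nfs-client-provisioner` label from deploy/deployment.yaml; the function name `diagnose_pvc` is hypothetical:

```shell
# Hypothetical helper: diagnose_pvc PVC_NAME
# Shows the claim's events, then the provisioner pod state and recent logs.
diagnose_pvc() {
  pvc=$1
  kubectl describe pvc "$pvc"
  kubectl get pods -l app=nfs-client-provisioner
  kubectl logs deployment/nfs-client-provisioner --tail=50
}

# Usage:
#   diagnose_pvc test-claim
```

If the provisioner pod is not Running, `kubectl describe pod` on it is the next step.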

mount.nfs: mounting :/ifs/kubernetes failed, reason given by server: No such file or directory

The nfs provisioner deployment failed because NFS_PATH pointed to /ifs/kubernetes, but that directory does not exist on the NFS. Changing it to / fixed the problem.

# kubectl  describe pod nfs-client-provisioner-7f9b667c6b-cr7cj
...
Events:
  Type     Reason       Age                   From               Message
  ----     ------       ----                  ----               -------
  Normal   Scheduled    7m14s                 default-scheduler  Successfully assigned default/nfs-client-provisioner-7f9b667c6b-cr7cj to hadoop-30.com
  Warning  FailedMount  26s (x11 over 7m10s)  kubelet            MountVolume.SetUp failed for volume "nfs-client-root" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs 10.0.0.68:/ifs/kubernetes /var/lib/kubelet/pods/b7d26239-23c4-4698-9d55-749f5538bc85/volumes/kubernetes.io~nfs/nfs-client-root
Output: mount.nfs: mounting 10.0.0.68:/ifs/kubernetes failed, reason given by server: No such file or directory
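
The export path can be tested from any node before it goes into the deployment. The following is a sketch only: it assumes an NFS client is installed on the node and that the commands run as root, and the function name `check_nfs_export` is hypothetical.

```shell
# Hypothetical helper: check_nfs_export SERVER EXPORT_PATH
# Tries to mount the export on a temporary directory and reports the result.
check_nfs_export() {
  server=$1 export_path=$2
  mnt=$(mktemp -d) || return 1
  if mount -t nfs "${server}:${export_path}" "$mnt"; then
    echo "ok: ${server}:${export_path} is mountable"
    umount "$mnt"
    status=0
  else
    echo "failed: ${server}:${export_path} could not be mounted"
    status=1
  fi
  rmdir "$mnt"
  return "$status"
}

# Usage:
#   check_nfs_export 10.0.0.68 /
```

A path that fails here will fail in the same way when the kubelet mounts the nfs-client-root volume.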

failed to pull and unpack image "registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2"

The registry is unreachable from mainland China; retag the image or push it to your own registry.

# kubectl  describe pod nfs-client-provisioner-7f9b667c6b-cr7cj
...
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  67s                default-scheduler  Successfully assigned default/nfs-client-provisioner-774bb87d75-llc69 to hadoop-30.com
  Warning  Failed     35s                kubelet            Failed to pull image "registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2": rpc error: code = DeadlineExceeded desc = failed to pull and unpack image "registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2": failed to resolve reference "registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2": failed to do request: Head "https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/sig-storage/nfs-subdir-external-provisioner/manifests/v4.0.2": dial tcp 64.233.188.82:443: i/o timeout
  Warning  Failed     35s                kubelet            Error: ErrImagePull
  Normal   BackOff    35s                kubelet            Back-off pulling image "registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2"
  Warning  Failed     35s                kubelet            Error: ImagePullBackOff
  Normal   Pulling    22s (x2 over 67s)  kubelet            Pulling image "registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2"
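
One way to mirror the image is sketched below. It assumes a machine that has Docker and can reach registry.k8s.io (for example through a proxy); the destination `registry.example.com/mirror/...` is a placeholder for your own registry, and the function name `mirror_image` is hypothetical.

```shell
# Hypothetical helper: mirror_image SOURCE_IMAGE DEST_IMAGE
# Pulls the source image, retags it, and pushes it to the destination registry.
mirror_image() {
  src=$1 dst=$2
  docker pull "$src" &&
    docker tag "$src" "$dst" &&
    docker push "$dst"
}

# Usage (destination is a placeholder):
#   mirror_image \
#     registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2 \
#     registry.example.com/mirror/nfs-subdir-external-provisioner:v4.0.2
#
# Then point the deployment at the mirrored image:
#   kubectl set image deployment/nfs-client-provisioner \
#     nfs-client-provisioner=registry.example.com/mirror/nfs-subdir-external-provisioner:v4.0.2
```

Editing the image field in deploy/deployment.yaml and re-applying it has the same effect as `kubectl set image`.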
