kubernetes搭建rook-ceph

阿新 • • 發佈：2018-11-21

cpu databases ply scribe pac ont forward 回收 serve

簡介

Rook官網：https://rook.io
Rook是雲原生計算基金會(CNCF)的孵化級項目.
Rook是Kubernetes的開源雲本地存儲協調器，為各種存儲解決方案提供平臺，框架和支持，以便與雲原生環境本地集成。
至於CEPH，官網在這：https://ceph.com/
ceph官方提供的helm部署，至今我沒成功過，所以轉向使用rook提供的方案

有道筆記原文：http://note.youdao.com/noteshare?id=281719f1f0374f787effc90067e0d5ad&sub=0B59EA339D4A4769B55F008D72C1A4C0

環境

centos 7.5
kernel 4.18.7-1.el7.elrepo.x86_64

docker 18.06

kubernetes v1.12.2
    kubeadm部署：
        網絡: canal
        DNS: coredns
    集群成員：    
    192.168.1.1 kube-master
    192.168.1.2 kube-node1
    192.168.1.3 kube-node2
    192.168.1.4 kube-node3
    192.168.1.5 kube-node4

所有node節點準備一塊200G的磁盤：/dev/sdb

技術分享圖片

準備工作

所有節點開啟ip_forward

cat <<EOF >  /etc/sysctl.d/ceph.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

開始部署Operator

部署Rook Operator

#無另外說明，全部操作都在master操作

cd $HOME
git clone https://github.com/rook/rook.git

cd rook
cd cluster/examples/kubernetes/ceph
kubectl apply -f operator.yaml

技術分享圖片

查看Operator的狀態

#執行apply之後稍等一會。
#operator會在集群內的每個主機創建兩個pod:rook-discover,rook-ceph-agent

kubectl -n rook-ceph-system get pod -o wide

技術分享圖片

給節點打標簽

運行ceph-mon的節點打上：ceph-mon=enabled

kubectl label nodes {kube-node1,kube-node2,kube-node3} ceph-mon=enabled

運行ceph-osd的節點，也就是存儲節點，打上：ceph-osd=enabled

kubectl label nodes {kube-node1,kube-node2,kube-node3} ceph-osd=enabled

運行ceph-mgr的節點，打上：ceph-mgr=enabled

#mgr只能支持一個節點運行，這是ceph跑k8s裏的局限
kubectl label nodes kube-node1 ceph-mgr=enabled

配置cluster.yaml文件

官方配置文件詳解：https://rook.io/docs/rook/v0.8/ceph-cluster-crd.html
文件中有幾個地方要註意：
- dataDirHostPath: 這個路徑是會在宿主機上生成的，保存的是ceph的相關的配置文件，再重新生成集群的時候要確保這個目錄為空，否則mon會無法啟動
- useAllDevices: 使用所有的設備，建議為false，否則會把宿主機所有可用的磁盤都幹掉
- useAllNodes：使用所有的node節點，建議為false，肯定不會用k8s集群內的所有node來搭建ceph的
- databaseSizeMB和journalSizeMB：當磁盤大於100G的時候，就註釋這倆項就行了
本次實驗用到的 cluster.yaml 文件內容如下：

apiVersion: v1
kind: Namespace
metadata:
  name: rook-ceph
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: [ "get", "list", "watch", "create", "update", "delete" ]
---
# Allow the operator to create resources in this cluster‘s namespace
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster-mgmt
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-ceph-cluster-mgmt
subjects:
- kind: ServiceAccount
  name: rook-ceph-system
  namespace: rook-ceph-system
---
# Allow the pods in this namespace to work with configmaps
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-ceph-cluster
subjects:
- kind: ServiceAccount
  name: rook-ceph-cluster
  namespace: rook-ceph
---
apiVersion: ceph.rook.io/v1beta1
kind: Cluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
    # v12 is luminous, v13 is mimic, and v14 is nautilus.
    # RECOMMENDATION: In production, use a specific version tag instead of the general v13 flag, which pulls the latest release and could result in different
    # versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
    image: ceph/ceph:v13
    # Whether to allow unsupported versions of Ceph. Currently only luminous and mimic are supported.
    # After nautilus is released, Rook will be updated to support nautilus.
    # Do not set to true in production.
    allowUnsupported: false
  # The path on the host where configuration files will be persisted. If not specified, a kubernetes emptyDir will be created (not recommended).
  # Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
  # In Minikube, the ‘/data‘ directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
  dataDirHostPath: /var/lib/rook
  # The service account under which to run the daemon pods in this cluster if the default account is not sufficient (OSDs)
  serviceAccount: rook-ceph-cluster
  # set the amount of mons to be started
  # count可以定義ceph-mon運行的數量，這裏默認三個就行了
  mon:
    count: 3
    allowMultiplePerNode: true
  # enable the ceph dashboard for viewing cluster status
  # 開啟ceph資源面板
  dashboard:
    enabled: true
    # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
    # urlPrefix: /ceph-dashboard
  network:
    # toggle to use hostNetwork
    # 使用宿主機的網絡進行通訊
    # 使用宿主機的網絡貌似可以讓集群外的主機掛載ceph
    # 但是我沒試過，有興趣的兄弟可以試試改成true
    # 反正這裏只是集群內用，我就不改了
    hostNetwork: false
  # To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
  # The example under ‘all‘ would have all services scheduled on kubernetes nodes labeled with ‘role=storage-node‘ and
  # tolerate taints with a key of ‘storage-node‘.
  placement:
#    all:
#      nodeAffinity:
#        requiredDuringSchedulingIgnoredDuringExecution:
#          nodeSelectorTerms:
#          - matchExpressions:
#            - key: role
#              operator: In
#              values:
#              - storage-node
#      podAffinity:
#      podAntiAffinity:
#      tolerations:
#      - key: storage-node
#        operator: Exists
# The above placement information can also be specified for mon, osd, and mgr components
#    mon:
#    osd:
#    mgr:
# nodeAffinity：通過選擇標簽的方式，可以限制pod被調度到特定的節點上
# 建議限制一下，為了讓這幾個pod不亂跑
    mon:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mon
              operator: In
              values:
              - enabled
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-osd
              operator: In
              values:
              - enabled
    mgr:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mgr
              operator: In
              values:
              - enabled
  resources:
# The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
#    mgr:
#      limits:
#        cpu: "500m"
#        memory: "1024Mi"
#      requests:
#        cpu: "500m"
#        memory: "1024Mi"
# The above example requests/limits can also be added to the mon and osd components
#    mon:
#    osd:
  storage: # cluster level storage configuration and selection
    useAllNodes: false
    useAllDevices: false
    deviceFilter:
    location:
    config:
      # The default and recommended storeType is dynamically set to bluestore for devices and filestore for directories.
      # Set the storeType explicitly only if it is required not to use the default.
      # storeType: bluestore
      # databaseSizeMB: "1024" # this value can be removed for environments with normal sized disks (100 GB or larger)
      # journalSizeMB: "1024"  # this value can be removed for environments with normal sized disks (20 GB or larger)
# Cluster level list of directories to use for storage. These values will be set for all nodes that have no `directories` set.
#    directories:
#    - path: /rook/storage-dir
# Individual nodes and their config can be specified as well, but ‘useAllNodes‘ above must be set to false. Then, only the named
# nodes below will be used as storage resources.  Each node‘s ‘name‘ field should match their ‘kubernetes.io/hostname‘ label.
#建議磁盤配置方式如下：
#name: 選擇一個節點，節點名字為kubernetes.io/hostname的標簽，也就是kubectl get nodes看到的名字
#devices: 選擇磁盤設置為OSD
# - name: "sdb":將/dev/sdb設置為osd
    nodes:
    - name: "kube-node1"
      devices:
      - name: "sdb"
    - name: "kube-node2"
      devices:
      - name: "sdb"
    - name: "kube-node3"
      devices:
      - name: "sdb"

#      directories: # specific directories to use for storage can be specified for each node
#      - path: "/rook/storage-dir"
#      resources:
#        limits:
#          cpu: "500m"
#          memory: "1024Mi"
#        requests:
#          cpu: "500m"
#          memory: "1024Mi"
#    - name: "172.17.4.201"
#      devices: # specific devices to use for storage can be specified for each node
#      - name: "sdb"
#      - name: "sdc"
#      config: # configuration can be specified at the node level which overrides the cluster level config
#        storeType: filestore
#    - name: "172.17.4.301"
#      deviceFilter: "^sd."

開始部署ceph

部署ceph

kubectl apply -f cluster.yaml

# cluster會在rook-ceph這個namesapce創建資源
# 盯著這個namesapce的pod你就會發現，它在按照順序創建Pod

kubectl -n rook-ceph get pod -o wide  -w

# 看到所有的pod都Running就行了
# 註意看一下pod分布的宿主機，跟我們打標簽的主機是一致的

kubectl -n rook-ceph get pod -o wide

技術分享圖片

切換到其他主機看一下磁盤
- 切換到kube-node1
```
lsblk
```
- 切換到kube-node3
```
lsblk
```

配置ceph dashboard

看一眼dashboard在哪個service上

kubectl -n rook-ceph get service
#可以看到dashboard監聽了8443端口

技術分享圖片

創建個nodeport類型的service以便集群外部訪問

kubectl apply -f dashboard-external-https.yaml

# 查看一下nodeport在哪個端口
ss -tanl
kubectl -n rook-ceph get service

技術分享圖片

找出Dashboard的登陸賬號和密碼

MGR_POD=`kubectl get pod -n rook-ceph | grep mgr | awk ‘{print $1}‘`

kubectl -n rook-ceph logs $MGR_POD | grep password

技術分享圖片

打開瀏覽器輸入任意一個Node的IP+nodeport端口
這裏我的就是：https://192.168.1.2:30290

技術分享圖片

配置ceph為storageclass

官方給了一個樣本文件：storageclass.yaml
這個文件使用的是 RBD 塊存儲
pool創建詳解：https://rook.io/docs/rook/v0.8/ceph-pool-crd.html

apiVersion: ceph.rook.io/v1beta1
kind: Pool
metadata:
  #這個name就是創建成ceph pool之後的pool名字
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 1
  # size 池中數據的副本數,1就是不保存任何副本
  failureDomain: osd
  #  failureDomain：數據塊的故障域，
  #  值為host時，每個數據塊將放置在不同的主機上
  #  值為osd時，每個數據塊將放置在不同的osd上
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: ceph
   # StorageClass的名字，pvc調用時填的名字
provisioner: ceph.rook.io/block
parameters:
  pool: replicapool
  # Specify the namespace of the rook cluster from which to create volumes.
  # If not specified, it will use `rook` as the default namespace of the cluster.
  # This is also the namespace where the cluster will be
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs
# 設置回收策略默認為：Retain
reclaimPolicy: Retain

創建StorageClass

kubectl apply -f storageclass.yaml
kubectl get storageclasses.storage.k8s.io  -n rook-ceph
kubectl describe storageclasses.storage.k8s.io  -n rook-ceph

技術分享圖片

創建個nginx pod嘗試掛載

cat << EOF > nginx.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph

---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports: 
  - port: 80
    name: nginx-port
    targetPort: 80
    protocol: TCP

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /html
          name: http-file
      volumes:
      - name: http-file
        persistentVolumeClaim:
          claimName: nginx-pvc
EOF

kubectl apply -f nginx.yaml

查看pv,pvc是否創建了

kubectl get pv,pvc

# 看一下nginx這個pod也運行了
kubectl get pod

技術分享圖片

刪除這個pod,看pv是否還存在

kubectl delete -f nginx.yaml

kubectl get pv,pvc
# 可以看到，pod和pvc都已經被刪除了，但是pv還在！！！

技術分享圖片

添加新的OSD進入集群

這次我們要把node4添加進集群，先打標簽

kubectl label nodes kube-node4 ceph-osd=enabled

重新編輯cluster.yaml文件

# 原來的基礎上添加node4的信息

cd $HOME/rook/cluster/examples/kubernetes/ceph/
vi cluster.yam

技術分享圖片

apply一下cluster.yaml文件

kubectl apply -f cluster.yaml

# 盯著rook-ceph名稱空間,集群會自動添加node4進來

kubectl -n rook-ceph get pod -o wide -w
kubectl -n rook-ceph get pod -o wide

技術分享圖片

去node4節點看一下磁盤

lsblk

技術分享圖片

再打開dashboard看一眼

技術分享圖片

刪除一個節點

去掉node3的標簽

kubectl label nodes kube-node3 ceph-osd-

重新編輯cluster.yaml文件

# 刪除node3的信息

cd $HOME/rook/cluster/examples/kubernetes/ceph/
vi cluster.yam

技術分享圖片

apply一下cluster.yaml文件

kubectl apply -f cluster.yaml

# 盯著rook-ceph名稱空間

kubectl -n rook-ceph get pod -o wide -w
kubectl -n rook-ceph get pod -o wide

# 最後記得刪除宿主機的/var/lib/rook文件夾

技術分享圖片

常見問題

官方解答：https://rook.io/docs/rook/v0.8/common-issues.html
當機器重啟之後，osd無法正常的Running，無限重啟

#解決辦法：

# 標記節點為 drain 狀態
kubectl drain <node-name> --ignore-daemonsets --delete-local-data

# 然後再恢復
kubectl uncordon <node-name>

kubernetes搭建rook-ceph

cpu databases ply scribe pac ont forward 回收 serve 簡介 Rook官網：https://rook.io Rook是雲原生計算基金會(CNCF)的孵化級項目. Rook是Kubernetes的開源雲本地存儲協調器，為各種存儲解

kubernetes部署rook+ceph儲存系統

rook簡介 Rook官網：https://rook.io 容器的持久化儲存容器的持久化儲存是儲存容器儲存狀態的重要手段，儲存外掛會在容器裡掛載一個基於網路或者其他機制的遠端資料卷，使得在容器裡建立的檔案，實際上是儲存在遠端儲存伺服器上，或者以分散式的方式儲存在多個節點上，而與當前

使用Rook+Ceph在Kubernetes上作持久儲存

使用Rook+Ceph在Kubernetes上作持久儲存作者：Earl C. Ruby III 我想在新的Kubernetes叢集上安裝Prometheus和Grafana，但為了使這些軟體包能夠工作，他們需要一些地方來儲存持久資料。當我在Seagate擔任雲架構師時，我已經對Cep

Kubernetes中分散式儲存Rook-Ceph部署快速演練

最近在專案中有涉及到Kubernetes的分散式儲存部分的內容，也抽空多瞭解了一些。專案主要基於Rook-Ceph執行，考慮到Rook-Ceph部署也不那麼簡單，[官方文件](https://rook.io/docs/rook/v1.5/)的步驟起點也不算低，因此，在整合官方文件的某些步驟的基礎上，寫篇文章簡

kubernetes搭建dashboard報錯

warningconfigmaps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list configmaps in the namespace "default"closewarning

CentOS下搭建Teuthology Ceph自動化測試平臺（一）

Paddles及資料庫部署安裝相關軟體這李只列出一些必用的，每個人使用的環境不一樣，可能還會存在一些包沒有安裝的，搭建環境過程中，注意看輸出的日誌資訊，缺少什麼就安裝。 #yum install python-virtualenv postgresql po

kebu之rook-ceph

準備工作所有節點開啟ip_forward cat <<EOF > /etc/sysctl.d/ceph.conf net.ipv4.ip_forward = 1 net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridg

Kubernetes共享使用Ceph儲存_Kubernetes中文社群

目錄簡要概述環境測試結果驗證簡要概述 Kubernetes pod 結合Ceph rbd塊裝置的使用，讓Docker 資料儲存在Ceph,重啟Docker或k8s RC重新排程pod 不會引起資料來回遷移。工作原理無非就是拿到ceph叢集的key作為認證，遠端rbdmap對映掛載

初試 Kubernetes 叢集使用 Ceph RBD 塊儲存

目錄 Kubernetes PersistentVolumes 介紹環境、軟體準備單節點使用 Ceph RBD Kubernetes PV & PVC 方式使用 Ceph RBD 測試單節點以及多節點使用 Ceph RBD 1、Kube

centos7設定EPEL源、docker-ce-stable源、kubernetes源、ceph源

選擇一個好的映象源，可以加快Centos7系統的更新升級或者安裝軟體的速度。 1. 更新系統，配置優先順序更新系統，安裝epel源 # yum update -y # yum install -y epel-release 安裝並配置yum

基於Kubernetes搭建MySQL主從叢集

願你，忠於自己，活得像自己。清單：NameVersionCentOS7Kubernetes1.9.0Docker17.09.1-ceMySQL5.7前言令我始料不及的出差活動中，開始接觸Kubernetes並被要求搭建基於此的MySQL主從叢集，由於筆者在Linux、Kubernetes等方面都是小白，故此展

【kubernetes/k8s 部署】minikube 與 kubernetes 搭建

# --logtostderr=true: log to standard error instead of files KUBE_LOGTOSTDERR="--logtostderr=true" # --v=0: log level for V logs KUBE_LOG_LEVEL="--v=4" # -