
Automatically Issuing TLS Certificates with cert-manager on Alibaba Cloud Managed Kubernetes [Pitfall-Free Edition]

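Preface

Troubleshooting is painful, and it is also fun.

In operations, and in IT generally, troubleshooting ability is one of the main things that sets people apart.

This post records my troubleshooting journey.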

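Background

All of our company's services now run on Alibaba Cloud managed Kubernetes. The front end has to be reachable from outside, so it needs public domain names, and because applying for HTTPS certificates by hand is too much trouble, we wanted a free tool that generates TLS certificates automatically.

The guides found online, and Alibaba Cloud's own documentation, contain serious pitfalls, so I think it is worth writing a pitfall-free guide to issuing TLS certificates automatically with cert-manager.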

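Approach

cert-manager is a Kubernetes add-on that manages SSL certificates. Combined with nginx-ingress it enables HTTPS for a site, and with Let's Encrypt providing free SSL certificates, the result is the free cert-manager + nginx-ingress + Let's Encrypt bundle. See the project on GitHub for details: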

cert-manager

This post covers requesting single-domain and wildcard-domain certificates with cert-manager on Alibaba Cloud managed Kubernetes.

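Deployment

Note: most guides online deploy with Helm, and deploying with Helm is indeed very simple. Still, I strongly recommend not using someone else's Helm chart: when something goes wrong you will not know the deployment logic and will have to analyze the resource manifests anyway.

Prerequisites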

kubectl create namespace cert-manager  # create the cert-manager namespace
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true  # label the cert-manager namespace to disable resource validation

Install the CRDs

wget https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager-legacy.crds.yaml
kubectl apply --validate=false -f cert-manager-legacy.crds.yaml

Install cert-manager

# wget https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager.yaml 
# kubectl apply --validate=false -f cert-manager.yaml

At this point the pods stay stuck in ContainerCreating:
# kubectl get pods -n cert-manager 
NAME                                           READY   STATUS              RESTARTS   AGE
cert-manager-75856bb467-fr5zz                  0/1     ContainerCreating   0          16m
cert-manager-cainjector-597f5b4768-jqsvp       0/1     ContainerCreating   0          16m
cert-manager-webhook-5c9f7b5f75-gnphd          0/1     ContainerCreating   0          16m

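# Describing the pods shows the cause is an image pull problem. My workaround:
# pull the images on a Hong Kong host, retag them, push them to an image registry, then change the image addresses in cert-manager.yaml
# Images in the original manifest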
# cat cert-manager.yaml |grep image
          image: "quay.io/jetstack/cert-manager-cainjector:v0.15.1"
          image: "quay.io/jetstack/cert-manager-controller:v0.15.1"
          image: "quay.io/jetstack/cert-manager-webhook:v0.15.1"      

# On the Hong Kong host: pull --> tag --> push
# docker pull quay.io/jetstack/cert-manager-cainjector:v0.15.1
# docker tag quay.io/jetstack/cert-manager-cainjector:v0.15.1  registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-cainjector-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-cainjector-v0.15.1
# docker pull quay.io/jetstack/cert-manager-controller:v0.15.1
# docker tag quay.io/jetstack/cert-manager-controller:v0.15.1 registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-controller-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-controller-v0.15.1
# docker pull quay.io/jetstack/cert-manager-webhook:v0.15.1
# docker tag quay.io/jetstack/cert-manager-webhook:v0.15.1 registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-webhook-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-webhook-v0.15.1
# Images in the updated manifest [the temporary registry is public]
# cat cert-manager.yaml |grep image
          #image: "quay.io/jetstack/cert-manager-cainjector:v0.15.1"
          image: "registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-cainjector-v0.15.1"
          #image: "quay.io/jetstack/cert-manager-controller:v0.15.1"
          image: "registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-controller-v0.15.1"
          #image: "quay.io/jetstack/cert-manager-webhook:v0.15.1"
          image: "registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-webhook-v0.15.1"
# kubectl apply --validate=false -f cert-manager.yaml
# kubectl get pods -n cert-manager 
NAME                                           READY   STATUS    RESTARTS   AGE
cert-manager-75856bb467-fr5zz                  1/1     Running   0          1h
cert-manager-cainjector-597f5b4768-jqsvp       1/1     Running   0          1h
cert-manager-webhook-5c9f7b5f75-gnphd          1/1     Running   0          1h
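
The pull, tag and push steps above repeat the same pattern for each image, so they can be generated rather than typed. The following is only a sketch under stated assumptions: the helper names `mirror_ref` and `print_mirror_commands` are made up for illustration, the target repository is the one used in this walkthrough and should be replaced with your own, and the script prints the docker commands instead of running them so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: derive the mirror reference for each upstream image and print the
# docker commands instead of running them.

# Repository taken from the walkthrough above; replace with your own.
MIRROR_REPO="registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager"

# mirror_ref quay.io/jetstack/cert-manager-webhook:v0.15.1
# prints "$MIRROR_REPO:cert-manager-webhook-v0.15.1"
mirror_ref() {
  name_tag=${1##*/}      # strip registry and org: cert-manager-webhook:v0.15.1
  name=${name_tag%%:*}   # image name: cert-manager-webhook
  tag=${name_tag##*:}    # image tag: v0.15.1
  printf '%s:%s-%s\n' "$MIRROR_REPO" "$name" "$tag"
}

# Print pull/tag/push commands for every image passed as an argument.
print_mirror_commands() {
  for src in "$@"; do
    dst=$(mirror_ref "$src")
    printf 'docker pull %s\n' "$src"
    printf 'docker tag %s %s\n' "$src" "$dst"
    printf 'docker push %s\n' "$dst"
  done
}

print_mirror_commands \
  quay.io/jetstack/cert-manager-cainjector:v0.15.1 \
  quay.io/jetstack/cert-manager-controller:v0.15.1 \
  quay.io/jetstack/cert-manager-webhook:v0.15.1
```

Once the printed commands look right, the output can be piped to `sh` on a host that can reach quay.io. The image lines in cert-manager.yaml still have to be changed to the mirrored references, as shown above.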

Verify cert-manager

# cat test-cert-manager.yaml 
##########################################################################
#Author:                     zisefeizhu
#QQ:                         2********0
#Date:                       2020-08-11
#FileName:                   test-cert-manager.yaml
#URL:                        https://www.cnblogs.com/zisefeizhu/
#Description:                The test script
#Copyright (C):              2020 All rights reserved
###########################################################################
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1alpha2 
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  commonName: example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned

# kubectl apply -f test-cert-manager.yaml
# kubectl describe certificate -n cert-manager-test   # the Events section should end like this
  Normal  GeneratedKey  20s   cert-manager  Generated a new private key
  Normal  Requested     20s   cert-manager  Created new CertificateRequest resource "selfsigned-cert-2334779822"
  Normal  Issued        20s   cert-manager  Certificate issued successfully
  
# kubectl delete -f test-cert-manager.yaml  

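Issuing domain certificates via DNS validation

What is shown here is the workflow for Alibaba Cloud DNS. For other platforms you can write your own webhook or find an open-source one; this is the official example: https://github.com/jetstack/cert-manager-webhook-example

The package used here is: https://github.com/pragkent/alidns-webhook

Install the alidns webhook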

# wget https://raw.githubusercontent.com/pragkent/alidns-webhook/master/deploy/bundle.yaml
# kubectl apply -f bundle.yaml
# kubectl get pods -n cert-manager  # check the webhook
NAME                                           READY   STATUS    RESTARTS   AGE
cert-manager-webhook-alidns-6b87bc8597-tc9pk   1/1     Running   2          1h

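Configure the Issuer

cert-manager provides two kinds of issuers, Issuer and ClusterIssuer. An Issuer can only issue certificates in its own namespace, while a ClusterIssuer can issue certificates in any namespace.

In Alibaba Cloud RAM, create an account and grant it AliyunDNSFullAccess, the permission to manage Alibaba Cloud DNS. Note down the account's AccessKey (AK) and create the secrets with the commands below. The webhook uses these secrets during DNS validation: it writes a TXT record into the DNS zone and deletes it once validation completes.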
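
To make the TXT record concrete: in the ACME DNS-01 challenge the record is created under `_acme-challenge.` plus the domain being validated, and a wildcard name is validated under its base domain. The helper below is a small sketch of that naming rule only; the function name `acme_challenge_name` is made up for illustration, and the wildcard domain in the second call is a hypothetical example that does not appear elsewhere in this post.

```shell
#!/bin/sh
# Sketch: compute the TXT record name that a DNS-01 challenge uses for a domain.
acme_challenge_name() {
  domain=${1#\*.}   # drop a leading "*." so wildcards map to their base domain
  printf '_acme-challenge.%s\n' "$domain"
}

acme_challenge_name 'test01.advance.realibox.com'
acme_challenge_name '*.advance.realibox.com'   # hypothetical wildcard example
```

While a challenge is pending, a lookup such as `dig +short TXT "$(acme_challenge_name test01.advance.realibox.com)"` is one way to see whether the webhook has written the record.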

Create the alidns AccessKey ID and Secret

# kubectl -n cert-manager create secret generic alidns-access-key-id --from-literal=accessKeyId='xxxxxxx'
# kubectl -n cert-manager create secret generic alidns-access-key-secret --from-literal=accessKeySecret='xxxxxxx'

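I use a ClusterIssuer as the example here. Create the file letsencrypt-prod.yaml (note that the solver configured in it is HTTP-01 through the nginx ingress class):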

# cat letsencrypt-prod.yaml 
##########################################################################
#Author:                     zisefeizhu
#QQ:                         2********0
#Date:                       2020-08-10
#FileName:                   letsencrypt-prod.yaml
#URL:                        https://www.cnblogs.com/zisefeizhu/
#Description:                The test script
#Copyright (C):              2020 All rights reserved
###########################################################################
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  labels:
    name: letsencrypt-prod
  name: letsencrypt-prod # custom issuer name, referenced later
spec:
  acme:
    email: [email protected] # your email (redacted here); reminders are sent when a certificate is about to expire, though renewal can be automated
    solvers:
    - http01:
        ingress:
          class: nginx
    privateKeySecretRef:
      name: letsencrypt-prod # the Secret in which this issuer's private key will be stored
    server: https://acme-v02.api.letsencrypt.org/directory # the ACME server; we use Let's Encrypt

Let's look at the directory information served by the ACME server:

{
  "Dmrr3rQDHDQ": "https://community.letsencrypt.org/t/adding-random-entries-to-the-directory/33417",
  "keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
  "meta": {
    "caaIdentities": [
      "letsencrypt.org"
    ],
    "termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf",
    "website": "https://letsencrypt.org"
  },
  "newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
  "newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
  "newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
  "revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}

Apply the YAML

# kubectl apply -f letsencrypt-prod.yaml

Check the status

# kubectl get  ClusterIssuer
NAME               READY   AGE
letsencrypt-prod   True    1h

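At this point cert-manager is, in theory, fully deployed for automatic TLS certificate issuance on Kubernetes. Next comes verification.

Verification

I will present two ways to verify: 1. issuing a certificate manually; 2. issuing a certificate automatically.

Note: there is a big pitfall here, so watch out!

Issuing a certificate manually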

# cat test-manual-cert.yaml
##########################################################################
#Author:                     zisefeizhu
#QQ:                         2********0
#Date:                       2020-08-11
#FileName:                   test-manual-cert.yaml
#URL:                        https://www.cnblogs.com/zisefeizhu/
#Description:                The test script
#Copyright (C):              2020 All rights reserved
###########################################################################
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: test-monkeyrun-net-cert
spec:
  secretName: tls-test-monkeyrun-net # name of the Secret that stores the certificate
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  organization:
  - jetstack
  isCA: false
  keySize: 2048
  keyAlgorithm: rsa
  keyEncoding: pkcs1
  dnsNames:
  - test01.advance.realibox.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
    group: cert-manager.io

# kubectl apply -f test-manual-cert.yaml
certificate.cert-manager.io/test-monkeyrun-net-cert created

The big pitfall

Warning: here comes the pitfall!!!

# kubectl get certificate   # check whether the certificate has been issued
NAME                      READY   SECRET                   AGE
test-monkeyrun-net-cert   False   tls-test-monkeyrun-net   27s
# kubectl describe certificate test-monkeyrun-net-cert  # view the details
Status:
  Conditions:
    Last Transition Time:  2020-08-11T02:53:06Z
    Message:               Waiting for CertificateRequest "test-monkeyrun-net-cert-1270901994" to complete
    Reason:                InProgress
    Status:                False
    Type:                  Ready
Events:
  Type    Reason     Age   From          Message
  ----    ------     ----  ----          -------
  Normal  Requested  40s   cert-manager  Created new CertificateRequest resource "test-monkeyrun-net-cert-1270901994"

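Issuance has effectively failed here. The Certificate is stuck at: Waiting for CertificateRequest "test-monkeyrun-net-cert-1270901994" to complete

It keeps requesting, which means the request never gets through. This problem had been bothering me since last Friday. It is a big pitfall, and almost none of the documentation I could find mentions it.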
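
When several certificates are in play, it helps to pick out the stuck ones before describing them one by one. The filter below is a rough sketch: `not_ready_certs` is a made-up helper name, and it assumes the default column layout of `kubectl get certificate` shown above (NAME first, READY second, no `-A` flag).

```shell
#!/bin/sh
# Sketch: read `kubectl get certificate` output on stdin and print the NAME of
# every certificate whose READY column is not "True".
not_ready_certs() {
  awk 'NR > 1 && $2 != "True" { print $1 }'
}

# Sample input copied from the output above. Against a live cluster:
#   kubectl get certificate | not_ready_certs
not_ready_certs <<'EOF'
NAME                      READY   SECRET                   AGE
test-monkeyrun-net-cert   False   tls-test-monkeyrun-net   27s
EOF
```

For an ACME issuer, cert-manager works through a CertificateRequest, an Order and one or more Challenge resources, so `kubectl describe certificaterequest`, `kubectl describe order` and `kubectl describe challenge` are the next places to look for the actual reason.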

The fix

Check which nodes the certificate-related components landed on:

# kubectl get pods -n cert-manager -o wide
NAME                                           READY   STATUS    RESTARTS   AGE   IP              NODE                       NOMINATED NODE   READINESS GATES
cert-manager-75856bb467-fr5zz                  1/1     Running   0          18h   192.168.0.98    cn-shenzhen.172.16.0.123   <none>           <none>
cert-manager-cainjector-597f5b4768-jqsvp       1/1     Running   0          18h   192.168.0.101   cn-shenzhen.172.16.0.123   <none>           <none>
cert-manager-webhook-5c9f7b5f75-gnphd          1/1     Running   0          18h   192.168.0.99    cn-shenzhen.172.16.0.123   <none>           <none>
cert-manager-webhook-alidns-6b87bc8597-tc9pk   1/1     Running   2          17h   192.168.0.13    cn-shenzhen.172.16.0.122   <none>           <none>

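They are on different nodes, and that reminded me of something: https://www.cnblogs.com/zisefeizhu/p/13262239.html. Could this be the same problem as the one described there? If the certificate issuer's pod and the load balancer's backend land on different nodes, the issuer cannot reach itself through the ingress. That seemed quite possible.

Log in to Alibaba Cloud and look at the external traffic policy configured for this Kubernetes cluster.

Sure enough, it was set to Local. Based on what I already understood about how externalTrafficPolicy works, I was 90% sure this was the cause, so I changed it to Cluster.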
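
The node comparison can also be done from the command line. With externalTrafficPolicy set to Local, a node only forwards load balancer traffic to ingress controller pods running on that same node, so which nodes the pods sit on matters. The sketch below is read-only and rests on assumptions: `pod_nodes` is a made-up helper name, it relies on NODE being the 7th column of `kubectl get pods -o wide` as in the output above, and the Service name `nginx-ingress-lb` in `kube-system` in the trailing comment is an assumption about the cluster's ingress setup that should be checked against your own.

```shell
#!/bin/sh
# Sketch: read `kubectl get pods -o wide` output on stdin and print each
# distinct value of the NODE column (7th field) once.
pod_nodes() {
  awk 'NR > 1 { print $7 }' | sort -u
}

# Sample input copied from the output above.
pod_nodes <<'EOF'
NAME                                           READY   STATUS    RESTARTS   AGE   IP              NODE                       NOMINATED NODE   READINESS GATES
cert-manager-75856bb467-fr5zz                  1/1     Running   0          18h   192.168.0.98    cn-shenzhen.172.16.0.123   <none>           <none>
cert-manager-cainjector-597f5b4768-jqsvp       1/1     Running   0          18h   192.168.0.101   cn-shenzhen.172.16.0.123   <none>           <none>
cert-manager-webhook-5c9f7b5f75-gnphd          1/1     Running   0          18h   192.168.0.99    cn-shenzhen.172.16.0.123   <none>           <none>
cert-manager-webhook-alidns-6b87bc8597-tc9pk   1/1     Running   2          17h   192.168.0.13    cn-shenzhen.172.16.0.122   <none>           <none>
EOF

# Against a live cluster (Service name and namespace are assumptions):
#   kubectl -n cert-manager get pods -o wide | pod_nodes
#   kubectl -n kube-system get svc nginx-ingress-lb -o jsonpath='{.spec.externalTrafficPolicy}'
```

Both commands in the trailing comment only read state; they do not modify the Service.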

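Note: why log in to the Alibaba Cloud console and change this by clicking, rather than exporting the resource manifest and changing it with a command? Because Alibaba Cloud managed Kubernetes has a pitfall here: changing it by command causes the IP of the nginx-ingress load balancer, that is, the public IP, to change, and then all your domains stop working because the IP is different. The IP needs to stay fixed.

Testing again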

# kubectl delete -f test-manual-cert.yaml 
certificate.cert-manager.io "test-monkeyrun-net-cert" deleted
# kubectl apply -f test-manual-cert.yaml
certificate.cert-manager.io/test-monkeyrun-net-cert created
# kubectl get certificate
NAME                      READY   SECRET                   AGE
test-monkeyrun-net-cert   True    tls-test-monkeyrun-net   4s