
# Automatically issuing TLS certificates with cert-manager on Alibaba Cloud managed Kubernetes [pitfall-free edition]

## Preface

Troubleshooting is painful, but it is also fun. In operations, and in IT in general, troubleshooting ability is one of the things that most sets people apart. This post records my troubleshooting journey.

### Background

All of our company's services now run on Alibaba Cloud managed Kubernetes. The front end has to be reachable from outside, so it needs public domain names. Applying for HTTPS certificates by hand is too much trouble, so I wanted a free tool that generates [TLS](https://baike.baidu.com/item/TLS/2979545?fr=aladdin) certificates automatically.

The documentation available online, including [Alibaba Cloud's](https://developer.aliyun.com/article/718711), contains major pitfalls, so I think a pitfall-free guide to issuing TLS certificates automatically with cert-manager is worth writing.

### Approach

cert-manager is a Kubernetes add-on for managing SSL certificates. Combined with nginx-ingress it can configure HTTPS access for a site, and Let's Encrypt provides free SSL certificates, which gives the free bundle cert-manager + nginx-ingress + Let's Encrypt. For details, see GitHub: [cert-manager](https://github.com/PowerDos/k8s-cret-manager-aliyun-webhook-demo)

This post covers requesting single-domain and wildcard-domain certificates with cert-manager on Alibaba Cloud managed Kubernetes.

## Deployment

Note: most guides online deploy with Helm, and a Helm deployment is indeed very simple. Still, I strongly recommend not using someone else's Helm chart: when something goes wrong you will not know the deployment logic and will have to analyze the resource manifests anyway.

### Prerequisites

```
#Create the cert-manager namespace
kubectl create namespace cert-manager
#Label the cert-manager namespace to disable resource validation
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
```

### Installing the CRDs

```
wget https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager-legacy.crds.yaml
kubectl apply --validate=false -f cert-manager-legacy.crds.yaml
```

### Installing cert-manager

```
# wget https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager.yaml
# kubectl apply --validate=false -f cert-manager.yaml
#At this point the pods stay stuck in ContainerCreating
# kubectl get pods -n cert-manager
NAME                                       READY   STATUS              RESTARTS   AGE
cert-manager-75856bb467-fr5zz              0/1     ContainerCreating   0          16m
cert-manager-cainjector-597f5b4768-jqsvp   0/1     ContainerCreating   0          16m
cert-manager-webhook-5c9f7b5f75-gnphd      0/1     ContainerCreating   0          16m

#The pod details show that the images cannot be pulled. My workaround:
#pull the images on a Hong Kong host, tag them, push them to our own registry,
#then change the image addresses in cert-manager.yaml

#Images in the original manifest
# cat cert-manager.yaml |grep image
image: "quay.io/jetstack/cert-manager-cainjector:v0.15.1"
image: "quay.io/jetstack/cert-manager-controller:v0.15.1"
image: "quay.io/jetstack/cert-manager-webhook:v0.15.1"

#On the Hong Kong host: pull --> tag --> push
# docker pull quay.io/jetstack/cert-manager-cainjector:v0.15.1
# docker tag quay.io/jetstack/cert-manager-cainjector:v0.15.1 registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-cainjector-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-cainjector-v0.15.1
# docker pull quay.io/jetstack/cert-manager-controller:v0.15.1
# docker tag quay.io/jetstack/cert-manager-controller:v0.15.1 registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-controller-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-controller-v0.15.1
# docker pull quay.io/jetstack/cert-manager-webhook:v0.15.1
# docker tag quay.io/jetstack/cert-manager-webhook:v0.15.1 registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-webhook-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-webhook-v0.15.1

#Images in the modified manifest (the temporary registry is public)
# cat cert-manager.yaml |grep image
#image: "quay.io/jetstack/cert-manager-cainjector:v0.15.1"
image: "registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-cainjector-v0.15.1"
#image: "quay.io/jetstack/cert-manager-controller:v0.15.1"
image: "registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-controller-v0.15.1"
#image: "quay.io/jetstack/cert-manager-webhook:v0.15.1"
image: "registry.cn-shenzhen.aliyuncs.com/realibox_test/cert-manager:cert-manager-webhook-v0.15.1"

# kubectl apply --validate=false -f cert-manager.yaml
# kubectl get pods -n cert-manager
NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-75856bb467-fr5zz              1/1     Running   0          1h
cert-manager-cainjector-597f5b4768-jqsvp   1/1     Running   0          1h
cert-manager-webhook-5c9f7b5f75-gnphd      1/1     Running   0          1h
```

#### Verifying cert-manager

```
# cat test-cert-manager.yaml
##########################################################################
#Author: zisefeizhu
#Date: 2020-08-11
#FileName: test-cert-manager.yaml
#URL: https://www.cnblogs.com/zisefeizhu/
#Description: The test script
#Copyright (C): 2020 All rights reserved
###########################################################################
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  commonName: example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned

# kubectl apply -f test-cert-manager.yaml
#Events recorded for the certificate:
Normal  GeneratedKey  20s  cert-manager  Generated a new private key
Normal  Requested     20s  cert-manager  Created new CertificateRequest resource "selfsigned-cert-2334779822"
Normal  Issued        20s  cert-manager  Certificate issued successfully
# kubectl delete -f test-cert-manager.yaml
```

### Configuring domain certificates via DNS

This section shows the workflow for Alibaba Cloud DNS. For other platforms you can develop your own webhook or look for an existing open-source one. This is the official example: https://github.com/jetstack/cert-manager-webhook-example

The package used here is: https://github.com/pragkent/alidns-webhook

### Installing the alidns webhook

```
# wget https://raw.githubusercontent.com/pragkent/alidns-webhook/master/deploy/bundle.yaml
# kubectl apply -f bundle.yaml
#Check the webhook
# kubectl get pods -n cert-manager
NAME                                           READY   STATUS    RESTARTS   AGE
cert-manager-webhook-alidns-6b87bc8597-tc9pk   1/1     Running   2          1h
```

### Configuring the Issuer

`cert-manager` provides two kinds of issuers, `Issuer` and `ClusterIssuer`. An `Issuer` can only issue certificates in its own namespace, while a `ClusterIssuer` can issue certificates in any namespace.

Create a user in Alibaba Cloud RAM and grant it `AliyunDNSFullAccess` (permission to manage Alibaba Cloud DNS). Note down the user's AccessKey and create the secrets with the commands below. The webhook uses these secrets during DNS validation: it writes a TXT record into the DNS zone and deletes it once validation completes.

Create the alidns AccessKey ID and Secret:

```
# kubectl -n cert-manager create secret generic alidns-access-key-id --from-literal=accessKeyId='xxxxxxx'
# kubectl -n cert-manager create secret generic alidns-access-key-secret --from-literal=accessKeySecret='xxxxxxx'
```

I use a `ClusterIssuer` as the example here. Create the file `letsencrypt-prod.yaml`:

```
# cat letsencrypt-prod.yaml
##########################################################################
#Author: zisefeizhu
#Date: 2020-08-10
#FileName: letsencrypt-prod.yaml
#URL: https://www.cnblogs.com/zisefeizhu/
#Description: The test script
#Copyright (C): 2020 All rights reserved
###########################################################################
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  labels:
    name: letsencrypt-prod
  name: letsencrypt-prod # custom issuer name, referenced later
spec:
  acme:
    email: [email protected] # your email address; you get a reminder before a certificate expires, though renewal can be automated
    solvers:
    - http01:
        ingress:
          class: nginx
    privateKeySecretRef:
      name: letsencrypt-prod # the Secret in which this issuer's private key is stored
    server: https://acme-v02.api.letsencrypt.org/directory # ACME server; we use Let's Encrypt
```

Let's look at the directory information served by the ACME server:

```
{
  "Dmrr3rQDHDQ": "https://community.letsencrypt.org/t/adding-random-entries-to-the-directory/33417",
  "keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
  "meta": {
    "caaIdentities": [
      "letsencrypt.org"
    ],
    "termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf",
    "website": "https://letsencrypt.org"
  },
  "newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
  "newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
  "newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
  "revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}
```

Apply the `yaml`:

```
# kubectl apply -f letsencrypt-prod.yaml
```

Check the status:

```
# kubectl get ClusterIssuer
NAME               READY   AGE
letsencrypt-prod   True    1h
```

At this point the deployment for issuing TLS certificates automatically with cert-manager on Kubernetes is, in theory, complete. Next comes verification.

## Verification

I will cover two ways to verify: 1. issuing a certificate manually, and 2. issuing a certificate automatically.

Note: there is a big pitfall here, so watch out!
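One thing worth noting before the tests: the `letsencrypt-prod` issuer shown earlier validates through an HTTP-01 solver, so it does not yet use the alidns webhook or the two AccessKey secrets. Wildcard certificates can only be validated with DNS-01, so the wildcard case would need an issuer along the following lines. This is a minimal sketch, not a tested manifest: the issuer name `letsencrypt-prod-dns` and the email are hypothetical, and `groupName`, `solverName`, and the `config` field names are assumed from the pragkent/alidns-webhook README, so check them against the `bundle.yaml` you deployed.

```yaml
# Sketch only: a ClusterIssuer that validates via DNS-01 through the alidns webhook.
# Assumptions: letsencrypt-prod-dns and you@example.com are placeholders;
# groupName, solverName and the config keys are taken from the
# pragkent/alidns-webhook README and may differ in your version.
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-dns
spec:
  acme:
    email: you@example.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-dns
    solvers:
    - dns01:
        webhook:
          groupName: acme.yourcompany.com   # assumed default group name
          solverName: alidns
          config:
            region: ""
            accessKeySecretRef:             # secret created earlier
              name: alidns-access-key-id
              key: accessKeyId
            secretKeySecretRef:             # secret created earlier
              name: alidns-access-key-secret
              key: accessKeySecret
```

A `Certificate` would then reference this issuer through `issuerRef.name` and list a name such as `*.example.com` under `dnsNames`.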
### Issuing a certificate manually

```
# cat test-manual-cert.yaml
##########################################################################
#Author: zisefeizhu
#Date: 2020-08-11
#FileName: test-manual-cert.yaml
#URL: https://www.cnblogs.com/zisefeizhu/
#Description: The test script
#Copyright (C): 2020 All rights reserved
###########################################################################
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: test-monkeyrun-net-cert
spec:
  secretName: tls-test-monkeyrun-net # name of the Secret that stores the certificate
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  organization:
  - jetstack
  isCA: false
  keySize: 2048
  keyAlgorithm: rsa
  keyEncoding: pkcs1
  dnsNames:
  - test01.advance.realibox.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
    group: cert-manager.io

# kubectl apply -f test-manual-cert.yaml
certificate.cert-manager.io/test-monkeyrun-net-cert created
```

#### The big pitfall

Warning: here comes the pitfall!

```
#Check whether the certificate has been generated
# kubectl get certificate
NAME                      READY   SECRET                   AGE
test-monkeyrun-net-cert   False   tls-test-monkeyrun-net   27s
#Look at the details
# kubectl describe certificate test-monkeyrun-net-cert
Status:
  Conditions:
    Last Transition Time:  2020-08-11T02:53:06Z
    Message:               Waiting for CertificateRequest "test-monkeyrun-net-cert-1270901994" to complete
    Reason:                InProgress
    Status:                False
    Type:                  Ready
Events:
  Type    Reason     Age   From          Message
  ----    ------     ----  ----          -------
  Normal  Requested  40s   cert-manager  Created new CertificateRequest resource "test-monkeyrun-net-cert-1270901994"
```

Issuance has failed here. The reported reason is `Waiting for CertificateRequest "test-monkeyrun-net-cert-1270901994" to complete`: the request stays pending forever, which means it never goes through. This problem had been bothering me since the previous Friday. It is a big pitfall, and almost none of the documentation I could find mentions it.

#### Solution

Check which nodes the certificate-related components landed on:

```
# kubectl get pods -n cert-manager -o wide
NAME                                           READY   STATUS    RESTARTS   AGE   IP              NODE                       NOMINATED NODE   READINESS GATES
cert-manager-75856bb467-fr5zz                  1/1     Running   0          18h   192.168.0.98    cn-shenzhen.172.16.0.123
cert-manager-cainjector-597f5b4768-jqsvp       1/1     Running   0          18h   192.168.0.101   cn-shenzhen.172.16.0.123
cert-manager-webhook-5c9f7b5f75-gnphd          1/1     Running   0          18h   192.168.0.99    cn-shenzhen.172.16.0.123
cert-manager-webhook-alidns-6b87bc8597-tc9pk   1/1     Running   2          17h   192.168.0.13    cn-shenzhen.172.16.0.122
```

They are on different nodes. Then it struck me, and I remembered an earlier case: https://www.cnblogs.com/zisefeizhu/p/13262239.html

Could this be the same problem? The certificate issuer's pod and the pods behind the load balancer sit on different nodes, so the issuer cannot reach itself through the ingress. With an external traffic policy of `Local`, a node only accepts load balancer traffic when it runs a backend pod itself, which would explain why a self-check started from another node never completes. That seemed plausible.

I logged in to Alibaba Cloud and looked at the external traffic policy of this Kubernetes cluster:

![image.png](https://cdn.nlark.com/yuque/0/2020/png/1143489/1597115629641-0aa5cc01-1c4a-4757-b12e-2a006189a77d.png)

Sure enough. Based on what I already knew about how `externalTrafficPolicy` works, I was 90% sure this was the cause, so I changed it to `Cluster`.

Note: why log in to the Alibaba Cloud console and change it by clicking, instead of exporting the resource manifest and changing it from the command line? Because Alibaba Cloud managed Kubernetes has a pitfall here: changing it from the command line changes the IP of the nginx-ingress load balancer, that is, the public IP. All of your domains then stop working because the IP has changed. The IP needs to stay fixed.

Test again:

```
# kubectl delete -f test-manual-cert.yaml
certificate.cert-manager.io "test-monkeyrun-net-cert" deleted
# kubectl apply -f test-manual-cert.yaml
certificate.cert-manager.io/test-monkeyrun-net-cert created
# kubectl get certificate
NAME                      READY   SECRET                   AGE
test-monkeyrun-net-cert   True    tls-test-monkeyrun-net   4s
```
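For reference, the console setting changed above corresponds to a single field on the ingress controller's `Service`. The fragment below is only an illustration of where that field lives. It assumes the Service is named `nginx-ingress-lb` in the `kube-system` namespace, which may not match your cluster, and it leaves out every other field of the real object.

```yaml
# Sketch only: the field behind the console's "external traffic policy" option.
# Assumptions: Service name nginx-ingress-lb and namespace kube-system are
# guesses about the cluster; ports, selector and annotations are omitted.
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-lb
  namespace: kube-system
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster   # previously Local
```

Given the warning above about the load balancer IP changing, treat this as a description of the resulting state rather than something to apply directly. Also keep in mind that with `Cluster` the backend pods generally no longer see the original client source IP, which is the usual trade-off against `Local`.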