Quickly Running Apache Airflow 2.2.4 on Docker
阿新 • Published: 2022-03-19
Installing Apache Airflow with Docker
References
Prerequisites
- Docker Engine
- Docker Compose
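Before going further, it can help to confirm that both tools are actually on the PATH. The following is a minimal sketch; the helper name require_cmd is hypothetical and is not part of Docker or Airflow.

```shell
# Hypothetical helper: succeed if the given command is available on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report the status of each prerequisite without aborting.
for cmd in docker docker-compose; do
  if require_cmd "$cmd"; then
    echo "found: $cmd"
  else
    echo "missing: $cmd"
  fi
done
```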
Quick start for Apache Airflow 2.2.4
Run Apache Airflow quickly in Docker with the CeleryExecutor (an executor that lets you scale out the number of workers).
1. Download docker-compose.yaml
Commands:
# Create a working directory
mkdir -p /home/public/Soft/airflow
cd /home/public/Soft/airflow
# Download the compose file
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.4/docker-compose.yaml'
This file contains the definitions of several services:
- airflow-scheduler - The scheduler monitors all tasks and DAGs, then triggers the task instances once their dependencies are complete.
- airflow-webserver - The webserver is available at http://localhost:8080.
- airflow-worker - The worker that executes the tasks given by the scheduler.
- airflow-init - The initialization service.
- flower - The flower app for monitoring the environment. It is available at http://localhost:5555.
- postgres - The database.
- redis - The Redis broker that forwards messages from the scheduler to the worker.
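After downloading, a quick sanity check is to confirm that the file really defines the services listed above. The sketch below does this with grep only; the helper names has_service and check_services are hypothetical, and it assumes that each service name sits on its own line indented by exactly two spaces.

```shell
# Hypothetical helper: succeed if FILE ($1) has a service entry named NAME ($2).
# Assumption: service names are indented by exactly two spaces.
has_service() {
  grep -Eq "^  $2:[[:space:]]*$" "$1"
}

# Report which of the services described above are present in FILE ($1).
check_services() {
  for svc in airflow-scheduler airflow-webserver airflow-worker \
             airflow-init flower postgres redis; do
    if has_service "$1" "$svc"; then
      echo "ok: $svc"
    else
      echo "missing: $svc"
    fi
  done
}

# Usage, from the directory that holds the downloaded file:
#   check_services docker-compose.yaml
```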
2. Create folders next to docker-compose.yaml
In the same directory as docker-compose.yaml, create the dags, logs, and plugins folders:
cd /home/public/Soft/airflow
mkdir -p ./dags
mkdir -p ./logs
mkdir -p ./plugins
What the dags, logs, and plugins folders are for:
- ./dags - you can put your DAG files here.
- ./logs - contains logs from task execution and scheduler.
- ./plugins - you can put your custom plugins here.
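Since mkdir -p accepts several paths at once, the three folders can also be created in a single call. A minimal sketch, where make_airflow_dirs is a hypothetical helper name:

```shell
# Hypothetical helper: create the three folders under a base directory ($1).
# mkdir -p does nothing for folders that already exist, so it is safe to rerun.
make_airflow_dirs() {
  mkdir -p "$1/dags" "$1/logs" "$1/plugins"
}

# Usage, from the directory that holds docker-compose.yaml:
#   make_airflow_dirs .
```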
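3. Initialize the environment
Before starting Airflow for the first time, prepare the environment: set the user that the containers run as, then run the database migrations and create the first user account.
3.1 Set the correct user
Command: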
cd /home/public/Soft/airflow
echo -e "AIRFLOW_UID=$(id -u)" > .env
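Here, AIRFLOW_UID is one of the environment variables supported by this Docker Compose setup; for details, see the documentation (https://airflow.apache.org/docs/apache-airflow/2.2.4/start/docker.html#docker-compose-env-variables ).
The generated .env file may look like this: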
AIRFLOW_UID=50000
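The same file can be written with printf, which avoids relying on echo -e, and then checked before moving on. This is only a sketch; the helper names write_env_file and env_file_ok are hypothetical.

```shell
# Hypothetical helper: write AIRFLOW_UID for the current user into DIR/.env.
# Note: like the echo command above, this overwrites an existing .env file.
write_env_file() {
  printf 'AIRFLOW_UID=%s\n' "$(id -u)" > "$1/.env"
}

# Hypothetical helper: succeed if DIR/.env contains a numeric AIRFLOW_UID line.
env_file_ok() {
  grep -Eq '^AIRFLOW_UID=[0-9]+$' "$1/.env"
}

# Usage, from the directory that holds docker-compose.yaml:
#   write_env_file . && env_file_ok . && cat .env
```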
3.2 Initialize the database
cd /home/public/Soft/airflow
docker-compose up airflow-init
The console may print something like the following:
airflow-init_1 | Upgrades done
airflow-init_1 | Admin user airflow created
airflow-init_1 | 2.2.4
start_airflow-init_1 exited with code 0
After initialization, the default Airflow login username and password are: airflow / airflow
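If the initialization output is saved to a file, a short check can confirm that the init container exited with code 0, as in the sample output above. This is a sketch under assumptions: the helper name init_log_ok and the log file name init.log are hypothetical, and the match relies on the log line format shown above.

```shell
# Hypothetical helper: succeed if the saved log ($1) shows the airflow-init
# container exiting with code 0.
init_log_ok() {
  grep -Eq 'airflow-init.* exited with code 0([^0-9]|$)' "$1"
}

# Usage (init.log is an assumed file name):
#   docker-compose up airflow-init 2>&1 | tee init.log
#   init_log_ok init.log && echo "initialization finished cleanly"
```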
4. Run Airflow
cd /home/public/Soft/airflow
docker-compose up
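docker-compose up keeps running in the foreground, and the services take a little while to become ready. From a second terminal, a small retry loop can wait until the webserver responds. The helper wait_until below is hypothetical, and the /health path in the usage note is an assumption about the webserver rather than something stated in this article.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or until roughly TIMEOUT seconds ($1) have passed.
wait_until() {
  timeout_s=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Usage (the /health endpoint is an assumption):
#   wait_until 120 curl -fsS http://localhost:8080/health && echo "webserver is up"
```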
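5. Access the environment
There are three ways to access the environment: the command line, a web browser, and the REST API.
5.1 Command line
Download airflow.sh: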
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.4/airflow.sh'
chmod +x airflow.sh
The contents of the airflow.sh script are as follows:
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# Run airflow command in container
#
PROJECT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
set -euo pipefail
export COMPOSE_FILE="${PROJECT_DIR}/docker-compose.yaml"
if [ $# -gt 0 ]; then
    exec docker-compose run --rm airflow-cli "${@}"
else
    exec docker-compose run --rm airflow-cli
fi
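With airflow.sh you can run Airflow commands quickly, for example:
./airflow.sh info
5.2 Browser access
Open http://localhost:8080 in a browser.
Default username and password: airflow / airflow
5.3 Send requests to the REST API
Send a request with curl: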
ENDPOINT_URL="http://localhost:8080"
curl -X GET \
--user "airflow:airflow" \
"${ENDPOINT_URL}/api/v1/pools"
Cleaning up containers
To remove the containers, volumes, and images, run:
docker-compose down --volumes --rmi all
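Resetting the environment
Everything above is the quick-start configuration. If you need a customized setup, you can first clean up the existing environment:
- Stop the containers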
cd /home/public/Soft/airflow
docker-compose down --volumes --remove-orphans
- Delete the contents of the download directory, including docker-compose.yaml
cd /home/public/Soft/airflow && rm -rf ./*
- Download docker-compose.yaml again
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.4/docker-compose.yaml'
- Run the steps again from the beginning
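Deleting files with a wildcard is easy to get wrong, for example when run from the wrong directory, and a plain * does not match hidden files such as .env. One possible guard is sketched below: it only empties a directory that contains docker-compose.yaml. The helper name reset_project_dir is hypothetical.

```shell
# Hypothetical helper: empty a project directory ($1), hidden files included,
# but only if it looks like the quick-start directory.
reset_project_dir() {
  if [ -z "$1" ] || [ ! -f "$1/docker-compose.yaml" ]; then
    echo "refusing to clean: '$1'" >&2
    return 1
  fi
  # Remove every entry directly inside the directory, keeping the directory itself.
  find "$1" -mindepth 1 -maxdepth 1 -exec rm -rf -- {} +
}

# Usage, after the containers have been stopped:
#   reset_project_dir /home/public/Soft/airflow
```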