Running Apache Airflow 2.2.4 Quickly on Docker

Installing Apache Airflow with Docker

Table of Contents

References

Prerequisites

  1. Docker Engine
  2. Docker Compose

Quickly Running Apache Airflow 2.2.4

Quickly run Apache Airflow on Docker using the CeleryExecutor (an executor that distributes tasks across multiple workers).
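
Before downloading anything, it can help to confirm that the two prerequisites listed above (Docker Engine and Docker Compose) are installed and on the PATH; the exact version numbers you see will vary with your setup:

docker --version
docker-compose --version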

1. Download docker-compose.yaml

Commands:


# Create a directory
mkdir -p /home/public/Soft/airflow
cd /home/public/Soft/airflow
# Download
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.4/docker-compose.yaml'

This file contains definitions for several services:

  • airflow-scheduler - The scheduler monitors all tasks and DAGs, then triggers the task instances once their dependencies are complete.
  • airflow-webserver - The webserver is available at http://localhost:8080.
  • airflow-worker - The worker that executes the tasks given by the scheduler.
  • airflow-init - The initialization service.
  • flower - The flower app for monitoring the environment. It is available at http://localhost:5555.
  • postgres - The database.
  • redis - The Redis broker that forwards messages from the scheduler to the workers.
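
As a quick sanity check after the download, you can ask Docker Compose to list the services it found in the file; assuming the stock docker-compose.yaml, the output should include the services described above:

cd /home/public/Soft/airflow
docker-compose config --services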

2. Create folders in the same directory as docker-compose.yaml

In the same directory as docker-compose.yaml, create the dags, logs, and plugins folders:

cd /home/public/Soft/airflow
mkdir -p ./dags 
mkdir -p ./logs
mkdir -p ./plugins

Purpose of the dags, logs, and plugins folders:

  • ./dags - you can put your DAG files here (a small example follows this list).
  • ./logs - contains logs from task execution and scheduler.
  • ./plugins - you can put your custom plugins here.
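
As an illustration of what goes into ./dags, the sketch below drops a minimal DAG into the folder (the file name example_hello.py and the dag_id are made up for this example); once the stack is running, the scheduler should pick it up shortly after the file appears:

# Write a minimal example DAG into ./dags (hypothetical file name)
cat > ./dags/example_hello.py <<'EOF'
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A single-task DAG that just echoes "hello"; no schedule, triggered manually
with DAG(
    dag_id="example_hello",
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    BashOperator(task_id="hello", bash_command="echo hello")
EOF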

3. Initialize the environment

Initializing the environment means setting the correct host user and initializing the metadata database.

3.1 Set the correct user

Commands:

cd /home/public/Soft/airflow
echo -e "AIRFLOW_UID=$(id -u)" > .env

Here, AIRFLOW_UID is a Docker Compose environment variable; for details, see (https://airflow.apache.org/docs/apache-airflow/2.2.4/start/docker.html#docker-compose-env-variables ).

The generated .env file may look like this:

AIRFLOW_UID=50000
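
The same .env file can carry other variables that the stock docker-compose.yaml reads through ${...:-default} substitution. For example, AIRFLOW_IMAGE_NAME is commonly used to pin or swap the Airflow image; check your downloaded compose file to confirm it reads this variable before relying on it:

# Optional: pin the image used by docker-compose.yaml (only if your compose file reads AIRFLOW_IMAGE_NAME)
echo "AIRFLOW_IMAGE_NAME=apache/airflow:2.2.4" >> .env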

3.2 Initialize the database

cd /home/public/Soft/airflow
docker-compose up airflow-init

The console may print something like this:

airflow-init_1       | Upgrades done
airflow-init_1       | Admin user airflow created
airflow-init_1       | 2.2.4
start_airflow-init_1 exited with code 0

After initialization, the default Airflow login username and password are both airflow.

4. Run Airflow

cd /home/public/Soft/airflow
docker-compose up
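
If you prefer not to keep the log stream in your terminal, the stack can also be started in detached mode and checked afterwards; containers should report as running (or healthy, depending on your docker-compose version):

cd /home/public/Soft/airflow
docker-compose up -d
# Check the state of all containers in the stack
docker-compose ps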

5. Access the environment

There are three ways to access the environment: the command line, the browser, and the REST API.

5.1 Command line

Download airflow.sh:

curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.4/airflow.sh'
chmod +x airflow.sh

The contents of the airflow.sh script are as follows:

#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

#
# Run airflow command in container
#

PROJECT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

set -euo pipefail

export COMPOSE_FILE="${PROJECT_DIR}/docker-compose.yaml"
if [ $# -gt 0 ]; then
    exec docker-compose run --rm airflow-cli "${@}"
else
    exec docker-compose run --rm airflow-cli
fi

With airflow.sh you can run commands quickly, for example:

./airflow.sh info
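
The script simply forwards its arguments to docker-compose run --rm airflow-cli, so other airflow CLI subcommands work the same way; per the official quick start, bash and python open interactive shells inside the container:

./airflow.sh dags list   # list the DAGs the scheduler has parsed
./airflow.sh bash        # interactive shell inside an Airflow container
./airflow.sh python      # Python shell with the Airflow environment loaded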

5.2 Browser access

Open http://localhost:8080 in a browser.
The default login username and password are both airflow.
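
You can also check from the terminal that the webserver is responding before opening the browser; with the default configuration, the /health endpoint should return the metadatabase and scheduler status as JSON without requiring authentication:

curl -s http://localhost:8080/health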

5.3 Send requests to the REST API

Send a request with curl:

ENDPOINT_URL="http://localhost:8080"
curl -X GET  \
    --user "airflow:airflow" \
    "${ENDPOINT_URL}/api/v1/pools"

Clean up containers

To remove the containers, volumes, and images, run:

docker-compose down --volumes --rmi all

Reset the environment

The above is a quick-start configuration. If you need a customized configuration, first reset the environment:

  1. Stop the containers:
cd /home/public/Soft/airflow
docker-compose down --volumes --remove-orphans
  2. Delete the contents of the download directory, including docker-compose.yaml:
cd /home/public/Soft/airflow
rm -rf *
  3. Re-download docker-compose.yaml:
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.4/docker-compose.yaml'

  4. Re-run the instructions from the beginning.