Python: keeping the three most recent files and deleting the rest
阿新 • Published: 2020-01-09
Without further ado, here is the code.
""" 對於每天儲存檔案,檔案數量過多,佔用空間 採用儲存最新的三個檔案 """ from airflow import DAG from airflow.operators.python_operator import PythonOperator from airflow.models import Variable from sctetl.airflow.utils import dateutils from datetime import datetime,timedelta import logging import os import shutil """ base_dir = "/data" data_dir = "/gather" "gather下邊存在不同的資料夾" "/data/gather/test" "test路徑下有以下資料夾" "20180812、20180813、20180814、20180815、20180816" """ base_dir = Variable.get("base_dir") data_dir = Variable.get("data_dir") keep = 3 default_arg = { "owner":"airflow","depends_on_past":False,"start_date":dateutils.get_start_date_local(2018,8,27,18,5),"email":[''],"email_on_failure":False,"email_on_retry":False,"retries":1,"retry_delay":timedelta(minutes=5) } dag = DAG(dag_id="keep_three_day",default_args=default_arg,schedule_interval=dateutils.get_schedule_interval_local(18,5)) def keep_three_day(): path = os.path.join(base_dir,data_dir) date_cates = os.listdir(path) for cate in date_cates: p = os.path.join(base_dir,data_dir,cate) if os.path.isdir(p): dir_names = os.listdir(p) dir_names.sort() for i in dir_names[:-keep]: logging.info("刪除目錄 {path}".format(path=os.path.join(p,i))) shutil.rmtree(os.path.join(p,i)) with dag: keep_three_file = PythonOperator(task_id="keep_three_file",python_callable=keep_three_day(),dag=dag) keep_three_file
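That is all for this article on using Python to keep the three most recent files and delete the rest. I hope it serves as a useful reference, and thank you for your support.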