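Apache Superset Secondary Development
Basic Concepts
Superset is a data exploration platform open-sourced by Airbnb and designed to be visual, intuitive, and interactive (formerly named Panoramix and Caravel; it has since entered the Apache Incubator)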
Base Components
Flask
One of the best-known Python web frameworks, noted for being lightweight and highly extensible
Jinja2
Template engine
Werkzeug
WSGI utility library
Gunicorn
Gunicorn is an open-source Python WSGI HTTP server that uses the pre-fork model; it was ported from Ruby's Unicorn project
WSGI
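WSGI, the Python **W**eb **S**erver **G**ateway **I**nterface, is an interface between Python applications or frameworks and web servers. It is a specification (PEP 3333) rather than an implementation: any application that follows it can run on any WSGI-compliant server, and vice versa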
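To make the interface concrete, here is a minimal sketch of a WSGI application together with a helper that calls it roughly the way a server would. The names `simple_app` and `call_app` are illustrative assumptions, not part of Superset or any library.

```python
def simple_app(environ, start_response):
    """A minimal WSGI application.

    It receives the request environ and a start_response callable,
    and returns an iterable of bytes.
    """
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from " + path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path):
    """Invoke a WSGI app roughly the way a server would (hypothetical helper)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would write these to the socket; here we just record them
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Because `simple_app` only depends on the calling convention, any WSGI server, Gunicorn included, could serve it unchanged.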
Pre-Fork
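One process handles one request; because it is based on the select model, at most 1024 processes can be created at a time
Processes are created in advance: pre-fork forks child processes ahead of time and uses them to serve requests, one child process per request, with every process independent of the others
This speeds up response times to some extent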
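As a rough illustration of how the pre-fork model is tuned in practice, the sketch below is a Gunicorn configuration file. `bind`, `workers`, `worker_class`, and `timeout` are Gunicorn settings; the file name, port, and values are assumptions chosen for this example, not taken from the deployment described here.

```python
# gunicorn.conf.py -- hypothetical file name and values, for illustration only
import multiprocessing

# Address and port the master process listens on
bind = "0.0.0.0:8088"

# Number of worker processes the master forks up front.
# (2 x cores) + 1 is a common starting point, not a rule.
workers = multiprocessing.cpu_count() * 2 + 1

# "sync" workers handle one request at a time per process
worker_class = "sync"

# Seconds a worker may stay silent before the master restarts it
timeout = 60
```

Such a file is passed to Gunicorn with its `-c` option, with the WSGI application module given as the final argument.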
Django
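Django is an open-source web application framework written in Python. It follows the MVC software design pattern, which makes building complex, database-driven websites straightforward
Django emphasizes component reusability and "pluggability", agile development, and the DRY principle (Don't Repeat Yourself)
Core components
* An object-relational mapper that mediates between data models (defined as Python classes) and the relational database
* A regular-expression-based URL dispatcher
* A view system for handling requests
* A template system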
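The idea behind a regular-expression-based URL dispatcher can be sketched in a few lines of plain Python. This is not Django's implementation; `route`, `dispatch`, and the URL patterns are hypothetical names made up for the example.

```python
import re

# Route table: (compiled pattern, view function). Names are illustrative.
ROUTES = []


def route(pattern):
    """Register a view for URLs matching the given regular expression."""
    def decorator(view):
        ROUTES.append((re.compile(pattern), view))
        return view
    return decorator


@route(r"^/dashboard/(?P<dashboard_id>\d+)/$")
def dashboard(request, dashboard_id):
    return "dashboard %s" % dashboard_id


@route(r"^/slice/(?P<name>[\w-]+)/$")
def slice_view(request, name):
    return "slice %s" % name


def dispatch(path, request=None):
    """Return the response of the first matching view, or a 404 marker."""
    for pattern, view in ROUTES:
        match = pattern.match(path)
        if match:
            # Named groups become keyword arguments of the view
            return view(request, **match.groupdict())
    return "404 Not Found"
```

Patterns are tried in registration order, so more specific routes should be registered before more general ones.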
PyDruid
A Python connector for Druid
Exposes a simple API to create, execute, and analyze Druid queries
Pandas
Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive
SciPy
SciPy is a Python module built on NumPy that bundles a wide range of mathematical algorithms and convenience functions
Scikit-learn
Machine Learning in Python
D3.js
D3.js is a JavaScript library for manipulating data
Installation
Base Environment
OS
$ uname -a
Linux 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ cat /proc/version
Linux version 2.6.32-431.el6.x86_64 ([email protected]) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Nov 22 03:15:09 UTC 2013
# For Fedora and RHEL-derivatives
# [Doc]: Other System https://superset.apache.org/installation.html#os-dependencies
$ sudo yum upgrade python-setuptools -y
$ sudo yum install gcc libffi-devel python-devel python-pip python-wheel openssl-devel libsasl2-devel openldap-devel -y
Machines
# External network (http://192.168.1.10:9097/)
superset01 192.168.1.10 Superset
druid01 192.168.1.11 Druid
druid02 192.168.1.12 MySQL
# Cluster configuration
Cluster druid cluster
Coordinator Host 192.168.1.11
Coordinator Port 8081
Coordinator Endpoint druid/coordinator/v1/metadata
Broker Host 192.168.1.13
Broker Port 8082
Broker Endpoint druid/v2
Cache Timeout 86400 # 1day: result_backend
# Production (http://192.168.2.10:9097)
druid-prd01 192.168.2.10 Superset
druid-prd02 192.168.2.11 Druid
# Cluster configuration
Cluster druid cluster
Coordinator Host 192.168.2.11
Coordinator Port 8081
Coordinator Endpoint druid/coordinator/v1/metadata
Broker Host 192.168.2.13
Broker Port 8082
Broker Endpoint druid/v2
Cache Timeout 86400 # 1day: result_backend
Python Related
Python
$ python --version
Python 2.7.8
[Note]: Superset is tested using Python 2.7 and Python 3.4+. Python 3 is the recommended version; Python 2.6 won't be supported.
## Upgrade Python (stable: Python 2.7.12 | 3.4.5, latest: Python 3.5.2 [2016/12/15])
https://www.python.org/downloads/
# Download the matching Python version from the python.org FTP server
$ wget http://python.org/ftp/python/2.7.12/Python-2.7.12.tgz
# Build
$ tar -zxvf Python-2.7.12.tgz
$ cd /root/software/Python-2.7.12
$ ./configure --prefix=/usr/local/python27
$ make
$ make install
$ ls /usr/local/python27/ -al
drwxr-xr-x. 6 root root 4096 12月 15 14:22 .
drwxr-xr-x. 13 root root 4096 12月 15 14:20 ..
drwxr-xr-x. 2 root root 4096 12月 15 14:22 bin
drwxr-xr-x. 3 root root 4096 12月 15 14:21 include
drwxr-xr-x. 4 root root 4096 12月 15 14:22 lib
drwxr-xr-x. 3 root root 4096 12月 15 14:22 share
# Replace the original Python 2.6
$ which python
/usr/local/bin/python
# mv /usr/bin/python /usr/bin/python_old
$ mv /usr/local/bin/python /usr/local/bin/python_old
$ ln -s /usr/local/python27/bin/python /usr/local/bin/
$ python --version
Python 2.7.12
# Point yum back at the old Python 2.6 interpreter
$ vim /usr/bin/yum
# Change the first line to python2.6
#!/usr/bin/python2.6
$ yum --version | sed '2,$d'
3.2.29
Pip
$ pip --version
pip 9.0.1 from /usr/local/lib/python2.7/site-packages (python 2.7)
# upgrade setup tools and pip
$ pip install --upgrade setuptools pip
## Installing pip in an offline environment
# Download setuptools-32.0.0.tar.gz from https://pypi.python.org/pypi/setuptools#code-of-conduct
$ tar zxvf setuptools-32.0.0.tar.gz
$ cd setuptools-32.0.0
$ python setup.py install
# Download pip-9.0.1.tar.gz from https://pypi.python.org/pypi/pip
$ wget --no-check-certificate https://pypi.python.org/packages/11/b6/abcb525026a4be042b486df43905d6893fb04f05aac21c32c638e939e447/pip-9.0.1.tar.gz#md5=35f01da33009719497f01a4ba69d63c9
$ tar zxvf pip-9.0.1.tar.gz
$ cd pip-9.0.1
$ python setup.py install
Installed /usr/local/python27/lib/python2.7/site-packages/pip-9.0.1-py2.7.egg
Processing dependencies for pip==9.0.1
Finished processing dependencies for pip==9.0.1
$ pip --version
pip 9.0.1 from /root/software/pip-9.0.1 (python 2.7)
Virtualenv
$ pip install virtualenv
# virtualenv is shipped in Python 3 as pyvenv
$ virtualenv venv
$ source venv/bin/activate
## Installing virtualenv in an offline environment
# Download virtualenv-15.1.0.tar.gz from https://pypi.python.org/pypi/virtualenv#downloads
$ tar zxvf virtualenv-15.1.0.tar.gz
$ cd virtualenv-15.1.0
$ python setup.py install
$ virtualenv --version
15.1.0
Superset Related
Superset Initialization
$ pip install superset
## Installing superset in an offline environment
# Download superset-0.15.0.tar.gz from https://pypi.python.org/pypi/superset
$ tar zxvf superset-0.15.0.tar.gz
$ cd superset-0.15.0
$ python setup.py install
# Create an admin user
$ fabmanager create-admin --app superset
Username [admin]: # login name
User first name [admin]: # first name
User last name [user]: # last name
Email [[email protected]]: # email, must be unique
Password:
Repeat for confirmation:
Error: the two entered values do not match
Password: #superset
Repeat for confirmation: #superset
// ...
Recognized Database Authentications.
2016-12-14 17:53:40,945:INFO:flask_appbuilder.security.sqla.manager:Added user superset db upgrade
Admin User superset db upgrade created.
# Initialize the database
$ superset db upgrade
// ...
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
# Load some data to play with
$ superset load_examples
Loading examples into <SQLA engine=u'sqlite:////root/.superset/superset.db'>
Creating default CSS templates
Loading energy related dataset
Creating table [wb_health_population] reference
2016-12-14 17:58:09,568:INFO:root:Creating database reference
2016-12-14 17:58:09,575:INFO:root:sqlite:////root/.superset/superset.db
Loading [World Bank's Health Nutrition and Population Stats]
Creating table [wb_health_population] reference
2016-12-14 17:58:30,840:INFO:root:Creating database reference
2016-12-14 17:58:30,846:INFO:root:sqlite:////root/.superset/superset.db
# Create default roles and permissions
$ superset init
Creating slices
Creating a World's Health Bank dashboard
Loading [Birth names]
Done loading table!
--------------------------------------------------------------------------------
Creating table [birth_names] reference
2016-12-14 17:58:52,276:INFO:root:Creating database reference
2016-12-14 17:58:52,280:INFO:root:sqlite:////root/.superset/superset.db
Creating some slices
Creating a dashboard
Loading [Random time series data]
Done loading table!
--------------------------------------------------------------------------------
Creating table [random_time_series] reference
2016-12-14 17:58:53,953:INFO:root:Creating database reference
2016-12-14 17:58:53,957:INFO:root:sqlite:////root/.superset/superset.db
Creating a slice
Loading [Random long/lat data]
Done loading table!
--------------------------------------------------------------------------------
Creating table reference
2016-12-14 17:59:09,732:INFO:root:Creating database reference
2016-12-14 17:59:09,736:INFO:root:sqlite:////root/.superset/superset.db
Creating a slice
Loading [Multiformat time series]
Done loading table!
--------------------------------------------------------------------------------
Creating table [multiformat_time_series] reference
2016-12-14 17:59:10,421:INFO:root:Creating database reference
2016-12-14 17:59:10,426:INFO:root:sqlite:////root/.superset/superset.db
Creating some slices
Loading [Misc Charts] dashboard
Creating the dashboard
# Start the web server on port 8088
$ superset runserver -p 8088
# To start a development web server, use the -d switch
# superset runserver -d
# Refresh Druid Datasource (after config it)
$ superset refresh_druid
Virtualenv Workspace
# superset01 192.168.1.10
$ cd /root
$ virtualenv -p /usr/local/bin/python --system-site-packages --always-copy superset
$ source superset/bin/activate
# See the `Pitfalls` - `Installing superset requires downloading dependencies` section below
# pip install --download package -r requirements.txt
$ pip install -r /root/requirements.txt
$ superset runserver -a 0.0.0.0 -p 8088
# rsync is recommended; see the `Deployment` section
$ cd /root
$ tar zcvf virtualenv.tar.gz virtualenv/
$ scp virtualenv.tar.gz [email protected]192.168.1.13:/root/
# 192.168.1.13
$ cd /root/virtualenv/superset
$ source bin/activate
## [Extras]
# virtualenvwrapper is an extension to virtualenv that makes it easy to create, delete, copy, and switch between virtual environments
$ pip install virtualenvwrapper
$ mkdir ~/workspaces
$ vim ~/.bashrc
# Add
export WORKON_HOME=~/virtualenv
source /usr/local/bin/virtualenvwrapper.sh
$ mkvirtualenv --python=/usr/bin/python superset
Running virtualenv with interpreter /usr/bin/python
New python executable in /root/virtualenv/superset/bin/python
Installing setuptools, pip, wheel...done.
virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/predeactivate
virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/postdeactivate
virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/preactivate
virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/postactivate
virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/get_env_details
(superset) [[email protected] virtualenv]#
(superset) [[email protected] virtualenv]# deactivate
$ workon superset
(superset) [[email protected] virtualenv]# lsvirtualenv -b
superset
Deployment
Copying
# Using rsync instead of scp ensures that symlinks are copied as well
$ rsync -avuz -e ssh /home/superset/superset-0.15.4/ [email protected]:/home/yuzhouwan/superset-0.15.4
//...
sent 142935894 bytes received 180102 bytes 3920986.19 bytes/sec
total size is 359739823 speedup is 2.51
# Verify the file count in the superset directory on both the local and the target machine
$ find | wc -l
10113
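Where `find` is not available, or the check needs to run from a script, the same count can be reproduced in Python. This is a minimal sketch; `count_entries` is a hypothetical helper name, and it assumes the goal is to match what `find <root> | wc -l` reports.

```python
import os


def count_entries(root):
    """Count directories and files under root, including root itself,
    which is what `find <root> | wc -l` reports."""
    total = 1  # the root directory itself
    # os.walk does not follow symlinked directories by default, like find
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total
```

Running it against the superset directory on both machines and comparing the two numbers gives the same sanity check as the shell one-liner.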
# Repeat the steps above to rsync from the jump host to the production machine
$ rsync -avuz -e ssh /home/yuzhouwan/superset-0.15.4/ [email protected]192.168.2.10:/home/superset/superset-0.15.4
# The Python installation that the virtualenv depends on
$ rsync -avuz -e ssh /root/software [email protected]:/home/yuzhouwan
$ rsync -avuz -e ssh /home/yuzhouwan/software [email protected]:/root
$ cd /root/software
$ tar zxvf Python-2.7.12.tgz
$ cd Python-2.7.12
$ ./configure --prefix=/usr --enable-shared CFLAGS=-fPIC
$ make && make install
$ /sbin/ldconfig -v | grep / # necessary!
$ python -V
Python 2.7.12
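Shared Libraries
# Although the symlinks were rsynced over, the directory they point to on the target machine does not contain the corresponding Python shared libraries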
$ file /root/superset/lib/python2.7/lib-dynload
/root/superset/lib/python2.7/lib-dynload: broken symbolic link to `/usr/local/python27/lib/python2.7/lib-dynload`
# The global Python environment must match the one used when the virtualenv was created on the networked machine
$ ./configure --prefix=/usr/local/python27 --enable-shared CFLAGS=-fPIC
$ make && make install
$ /sbin/ldconfig -v | grep /
$ ls /usr/local/python27/lib/python2.7/lib-dynload -sail
User Permissions
# Create the user
$ adduser superset
$ cd /home/superset
# If the directory name carries a version number, create a symlink
$ chown -R superset:superset superset-0.15.4
$ ln -s superset-0.15.4 superset
$ chown -h superset:superset superset
$ su - superset
Metadata Storage
# Change the database
$ vim ./lib/python2.7/site-packages/superset/config.py
# SQLALCHEMY_DATABASE_URI = 'sqlite:///' + os.path.join(DATA_DIR, 'superset.db')
SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://user:password@mysql01:3306/superset1?charset=utf8'
$ mysql -hmysql01 -P3306 -uuser -ppassword
> use superset1;
> show tables;
+-------------------------+
| Tables_in_superset1 |
+-------------------------+
| ab_permission |
| ... |
| url |
+-------------------------+
28 rows in set (0.00 sec)
# mysqldump -hmysql01 -P3306 -uuser -ppassword superset1 > superset1.sql
$ mysqldump -hmysql01 -P3306 -uuser -ppassword --single-transaction superset1 > superset1.sql
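When the MySQL password contains characters such as `@` or `/`, a hand-written connection string breaks. The sketch below assembles the SQLAlchemy URI with the credentials URL-escaped. `build_mysql_uri` is a hypothetical helper, and the credentials are placeholders; the host `mysql01` and database `superset1` follow the commands above.

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2


def build_mysql_uri(user, password, host, port, database, charset="utf8"):
    """Assemble a SQLAlchemy URI for the pymysql driver.

    User and password are URL-escaped so characters such as '@' or '/'
    do not break the URI.
    """
    template = "mysql+pymysql://{user}:{password}@{host}:{port}/{db}?charset={charset}"
    return template.format(
        user=quote_plus(user),
        password=quote_plus(password),
        host=host,
        port=port,
        db=database,
        charset=charset,
    )


# Placeholder credentials; replace with real values
SQLALCHEMY_DATABASE_URI = build_mysql_uri("user", "p@ss/word", "mysql01", 3306, "superset1")
```

The resulting value can be assigned in `config.py` as shown above, or in a `superset_config.py` override module so that the change survives a reinstall of the package.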
Startup
$ cd /home/superset/superset-0.15.4
$ source bin/activate
$ mkdir logs
$ nohup superset runserver -a 0.0.0.0 -p 9097 -w 4 > logs/superset.log 2>&1 &
Running Locally
Dependencies
Windows Related
Microsoft Visual C++ 9.0 is required (Unable to find vcvarsall.bat)
Description
error: Microsoft Visual C++ 9.0 is required (Unable to find vcvarsall.bat). Get it from http://aka.ms/vcpython27
Solution
# download vcredist_x64.exe from http://www.microsoft.com/en-us/download/details.aspx?id=2092
$ pip install wheel setuptools
# Download and install VCForPython27.msi
‘openssl/opensslv.h’: No such file or directory
Solution
# download openssl-0.9.8h-1-setup.exe from http://gnuwin32.sourceforge.net/packages/openssl.htm
References
Cannot open include file: ‘stdint.h’: No such file or directory
Solution
# Microsoft Visual C++ 2015 Redistributable Update 3
# download vc_redist.x64.exe from https://www.microsoft.com/zh-CN/download/details.aspx?id=53840
$ vim D:\apps\Python27\Lib\distutils\msvc9compiler.py
def get_build_version():
    return 9.0
def find_vcvarsall(version):
    return r'C:\Users\yuzhouwan\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\vcvarsall.bat'
$ cd superset-0.15.4
$ python setup.py install
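# The VCForPython27.msi provided by Microsoft uses VC2008 by default, whereas stdint.h only ships with later Visual C++ releases (Visual Studio 2010 onward)
# VCForPython27.msi has not been maintained since 2014, so I decided to try Ubuntu or remote debugging instead ...
References
Ubuntu Related
Install VMware
Python Related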
Make sure that you use the correct version of ‘pip’
Description
Try to run this command from the system terminal. Make sure that you use the correct version of 'pip' installed for your Python interpreter located at 'D:\apps\Python27\python.exe'
Solution
# Install pip: download the installer from https://bootstrap.pypa.io/get-pip.py
$ python get-pip.py
$ pip --version
pip 8.1.1 from d:\apps\python27\lib\site-packages (python 2.7)
References
‘Connection to pypi.python.org timed out. (connect timeout=15)’
Description
$ pip install --upgrade pip
'Connection to pypi.python.org timed out. (connect timeout=15)'
Solution
# Set a proxy
$ export https_proxy="http://10.10.10.10:8080"
$ pip install --upgrade pip
$ pip --version
pip 9.0.1 from d:\apps\python27\lib\site-packages (python 2.7)
References
setup.py failed with error code 1
Description
Command "d:\apps\python27\python.exe -u -c "import setuptools, tokenize;__file__='c:\\users\\yuzhouwan\\appdata\\local\\temp\\pip-build-zzbhrq\\sasl\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record c:\users\yuzhouwan\appdata\local\temp\pip-erwavd-record\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in c:\users\yuzhouwan\appdata\local\temp\pip-build-zzbhrq\sasl\
Solution
$ pip install --upgrade setuptools pip
$ pip install superset
# Download superset-0.15.4.tar.gz from https://pypi.python.org/pypi/superset
$ tar zxvf superset-0.15.4.tar.gz
$ cd superset-0.15.4
$ python setup.py install
References
Setting Up the Development Environment
Dependencies
$ cd /root/software
$ tar zxvf Python-2.7.12.tgz
$ cd Python-2.7.12
$ ./configure --prefix=/usr/local/python27 --enable-shared CFLAGS=-fPIC
$ make && make install
$ /sbin/ldconfig -v | grep /
$ python -V
Python 2.7.12
$ mv /usr/local/bin/python /usr/local/bin/python_bak
$ ln -s /usr/local/python27/bin/python /usr/local/bin/python
Virtual Environment
$ cd /root
$ virtualenv -p /usr/local/bin/python --system-site-packages env
$ cd env
$ mkdir code
Code
# windows
$ cd E:\Github\super\env
$ git init
$ git remote add origin https://github.com/asdf2014/superset.git
$ git pull origin master
# SFTP
# Upload to /root/env/code
Installation
$ cd /root/env/code
$ source /root/env/bin/activate
$ cd /root/env/code/superset/static
$ mv assets assets_bak
$ ln -s ../assets assets
$ cd /root/env/code
$ python setup.py develop
Finished processing dependencies for superset==0.15.4
$ pip freeze | grep superset
superset==0.15.4
# Create an admin user
$ fabmanager create-admin --app superset
Username [admin]: # login name
User first name [admin]: # first name
User last name [user]: # last name
Email [[email protected]]: # email, must be unique
Password:
Repeat for confirmation:
Error: the two entered values do not match
Password: #superset
Repeat for confirmation: #superset
// ...
Recognized Database Authentications.
2016-12-14 17:53:40,945:INFO:flask_appbuilder.security.sqla.manager:Added user superset db upgrade
Admin User superset db upgrade created.
$ superset db upgrade
$ superset init
$ superset load_examples
Npm
# [Mac OS]
$ sudo yum group install "Development Tools" --setopt=group_package_types=mandatory,default,optional --skip-broken -y
$ sudo yum install curl git m4 ruby texinfo bzip2-devel curl-devel expat-devel ncurses-devel zlib-devel -y
# ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/linuxbrew/go/install)" # Do not run this as root!
$ wget https://raw.githubusercontent.com/Homebrew/linuxbrew/go/install --no-check-certificate
$ mv install install.rb
$ vim install.rb
# abort "Don't run this as root!" if Process.uid == 0
$ mkdir -p /root/.linuxbrew/bin
$ export PATH="/root/.linuxbrew/bin:$PATH"
$ ruby install.rb
$ vim ~/.bashrc
export PATH="$HOME/.linuxbrew/bin:$PATH"
export MANPATH="$HOME/.linuxbrew/share/man:$MANPATH"
export INFOPATH="$HOME/.linuxbrew/share/info:$INFOPATH"
# [CentOS]
$ yum install npm
$ cd /root/env/code/superset/assets # package.json
$ npm install
# If accessing https://github.com/jquery/jquery.git times out, pin the GitHub hosts
$ vim /etc/hosts
192.30.253.112 github.com
151.101.100.133 assets-cdn.github.com
192.30.253.117 api.github.com
192.30.253.121 codeload.github.com
Testing
$ cd /root/env/code
$ chmod 777 *sh
$ cd /root/env/code/superset/bin
$ chmod 777 superset
$ cd /root/env/code
$ bash run_tests.sh
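Remote Development in the IDE
Remote Debug
See the Remote Debug section of my other blog post, "Python"
References
Secondary Development
Others Category
Problem
Description
Aggregating at the HBase Region level produces a large number of grouped Regions, which makes the DistributionPieViz chart sluggish to render and visually cluttered
Solution
Adding row_limit excludes the data outside the top N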
$ cd /root/superset-0.15.4
$ vim ./lib/python2.7/site-packages/superset/viz.py
fieldsets = ({
    'label': None,
    'fields': (
        'metrics', 'groupby',
        'limit',
        'pie_label_type',
        ('donut', 'show_legend'),
        'labels_outside',
        'row_limit',
    )
},)
others_category aggregates the data outside the top N into a single category
$ cd /root/superset-0.15.4
$ vim ./lib/python2.7/site-packages/superset/viz.py
fieldsets = ({
    'label': None,
    'fields': (
        'metrics', 'groupby',
        'limit',
        'pie_label_type',
        ('donut', 'show_legend'),
        'labels_outside',
        'row_limit',
        'others_category',
    )
},)
$ vim ./lib/python2.7/site-packages/superset/forms.py
'others_category': (BetterBooleanField, {
    "label": _("Others category"),
    "default": True,
    "description": _("Aggregate data outside of topN into a single category")
}),
# models.py
# Known issues: the Others category does not stay last, because the data is sorted again afterwards;
# the "others_category": "y" property is not passed down
self.status = None
self.error_message = None
self.others_category = form_data.get("others_category")
top_n = 10
if 0 < top_n < len(df):
    df_head = df.head(top_n)
    df_tail = df.iloc[top_n:]
    # Sum each metric over the rows outside the top N
    other_metrics_sum = [df_tail[metric].sum() for metric in metrics]
    # Assumes df.columns is the group-by column followed by the metrics, in order
    df_other = pd.DataFrame([['Others'] + other_metrics_sum], columns=df.columns)
    df = pd.concat([df_head, df_other], ignore_index=True)
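The aggregation idea can also be shown without pandas. The following is a dependency-free sketch, not the code used in Superset; `top_n_with_others` and its parameters are hypothetical, and it assumes the rows are already sorted in descending order of the primary metric.

```python
def top_n_with_others(rows, metrics, label_key, top_n=10, others_label="Others"):
    """Keep the first top_n rows and fold the rest into one 'Others' row.

    rows is a list of dicts that must already be sorted in descending
    order of the primary metric.
    """
    if top_n <= 0 or len(rows) <= top_n:
        return list(rows)
    head, tail = rows[:top_n], rows[top_n:]
    others = {label_key: others_label}
    for metric in metrics:
        # Sum each metric over every row outside the top N
        others[metric] = sum(row[metric] for row in tail)
    # Appending last keeps 'Others' at the end instead of re-sorting it
    return head + [others]
```

Keeping the "Others" row out of any later sort is the design point here: if the combined rows are sorted again by the metric, "Others" can jump ahead of real categories, which is the issue noted in the comments above.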
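Abnormal Y-Axis Values
Description
The Y axis should start at 0, but instead starts at a negative value, -997m
Solution
To be optimized later
MySQL Time Zone Issues
Query
Description
$ vim lib/python2.7/site-packages/superset/config.py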
from dateutil import tz
# Druid query timezone
# tz.tzutc() : Using utc timezone
# tz.tzlocal() : Using local timezone
# other tz can be overridden by providing a local_config
DRUID_IS_ACTIVE = True
DRUID_TZ = tz.tzlocal() # +08:00
# DRUID_TZ = tz.gettz('Asia/Shanghai')
Solution
Display
Description
dttm.tz_convert(dttm.tzinfo._filename.split('zoneinfo/')[1]) - pytz.timezone(dttm.tzinfo._filename.split('zoneinfo/')[1]).localize(EPOCH)
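As I read it, the expression above measures the timestamp against an epoch that has been localized to the same time zone, which shifts the result by the zone's UTC offset so that a chart treating the value as UTC shows local wall-clock time. The sketch below reproduces that arithmetic with the Python 3 standard library only. The function names are hypothetical, and a fixed +08:00 offset stands in for Asia/Shanghai.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1)  # naive epoch, as in the expression above


def wall_clock_epoch_ms(dttm, tz):
    """Milliseconds between dttm (converted to tz) and the epoch
    interpreted as local midnight in tz."""
    local_dttm = dttm.astimezone(tz)
    local_epoch = EPOCH.replace(tzinfo=tz)
    return (local_dttm - local_epoch).total_seconds() * 1000


def utc_epoch_ms(dttm):
    """Conventional epoch milliseconds, for comparison."""
    utc_epoch = EPOCH.replace(tzinfo=timezone.utc)
    return (dttm - utc_epoch).total_seconds() * 1000


# Fixed +08:00 offset standing in for Asia/Shanghai (an assumption:
# a real zone database would also handle historical offset changes)
SHANGHAI = timezone(timedelta(hours=8))
```

For any aware timestamp, the two functions differ by exactly the UTC offset of the zone, eight hours in this case.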
Solution
References
Upgrading Superset
# Upgrade directly with pip install
$ pip freeze | grep superset
superset==0.13.2
# Requesting a nonexistent version makes pip print the available versions
$ pip install superset==-1
versions: 0.12.0, 0.13.0, 0.13.1, 0.13.2, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.15.3, 0.15.4
$ pip install superset==0.15.4
# The previous configuration was gone after the upgrade, so the config changes have to be reapplied
$ vim ./lib/python2.7/site-packages/superset/config.py
# SQLALCHEMY_DATABASE_URI = 'sqlite:///' + os.path.join(DATA_DIR, 'superset.db')
SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://root:[email protected]:3306/superset?charset=utf8'
$ vim /root/superset-0.15.4/bin/activate
# VIRTUAL_ENV="/root/superset"
VIRTUAL_ENV="/root/superset-0.15.4"
# then could just run "superset runserver -a 0.0.0.0 -p 9097"
Unknown column ‘datasources.filter_select_enabled’ in ‘field list’
Description
InternalError: (pymysql.err.InternalError) (1054, u"Unknown column 'datasources.filter_select_enabled' in 'field list'") [SQL: u'SELECT datasources.created_on AS datasources_created_on, datasources.changed_on AS datasources_changed_on, datasources.id AS datasources_id, datasources.datasource_name AS datasources_datasource_name, datasources.is_featured AS datasources_is_featured, datasources.is_hidden AS datasources_is_hidden, datasources.filter_select_enabled AS datasources_filter_select_enabled, datasources.description AS datasources_description, datasources.default_endpoint AS datasources_default_endpoint, datasources.user_id AS datasources_user_id, datasources.cluster_name AS datasources_cluster_name, datasources.offset AS datasources_offset, datasources.cache_timeout AS datasources_cache_timeout, datasources.params AS datasources_params, datasources.perm AS datasources_perm, datasources.changed_by_fk AS datasources_changed_by_fk, datasources.created_by_fk AS datasources_created_by_fk \nFROM datasources \nWHERE datasources.datasource_name = %(datasource_name_1)s \n LIMIT %(param_1)s'] [parameters: {u'param_1': 1, u'datasource_name_1': u'bi-dfp-oms-detail'}]
Solution
$ superset db upgrade
$ superset refresh_druid
Issues with Druid timezones
Description
The tzutc and tzlocal methods in tz worked for me…
However, they stopped working after I upgraded Superset from v0.13.2 to v0.15.4, even when I tried DRUID_TZ = tz.gettz(‘Asia/Shanghai’) :-(
Solution
$ cd /root/superset-0.15.4
$ ./bin/python -m pip freeze | grep superset
superset==0.13.2
$ ./bin/python -m pip uninstall superset
$ ./bin/python -m pip install superset==0.15.4
$ ./bin/python -m pip freeze | grep superset
superset==0.15.4
$ ./bin/python ./bin/easy_install lib/pycharm-debug.egg
# config remote python
$ ./bin/python ./bin/superset runserver -a 0.0.0.0 -p 9097
# nohup ./bin/python ./bin/superset runserver -a 0.0.0.0 -p 9097 > logs/superset.log 2>&1 &
$ ./bin/python ./bin/superset db upgrade
$ ./bin/python ./bin/superset refresh_druid
pydevd Cannot Remote Debug
Description
After upgrading from 0.13.2 to 0.15.4, two processes are started in debug mode (which prevents pydevd from remote debugging)
$ ps -ef | grep superset | grep -v grep
root 22567 1632 19 12:05 pts/0 00:00:03 ./bin/python ./bin/superset runserver -d -p 9097
root 22578 22567 24 12:05 pts/0 00:00:03 /root/superset-0.15.4/bin/python ./bin/superset runserver -d -p 9097
Solution
Starting directly with cli.py (did not work)
$ vim ./lib/python2.7/site-packages/superset/cli.py
# append
manager.run()
$ ./bin/python ./lib/python2.7/site-packages/superset/cli.py runserver -a 0.0.0.0 -p 9097
$ ps -ef | grep superset | grep -v grep
root 25238 1632 35 13:07 pts/0 00:00:03 ./bin/python ./lib/python2.7/site-packages/superset/cli.py runserver -d -p 9097
root 25247 25238 55 13:07 pts/0 00:00:03 /root/superset-0.15.4/bin/python ./lib/python2.7/site-packages/superset/cli.py runserver -d -p 9097
Attempting to resolve the WARNING:werkzeug: * Debugger is active! issue
$ vim lib/python2.7/site-packages/werkzeug/serving.py
class ThreadedWSGIServer(ThreadingMixIn, BaseWSGIServer):
    """A WSGI server that does threading."""
    multithread = True
$ vim lib/python2.7/site-packages/flask/app.py
options.setdefault('use_reloader', self.debug)
$ superset/__init__.py
References
Switching from SQLite3 to MySQL
Trying SQLite's built-in dump command
# superset01 192.168.1.10 Superset
$ cd /root/.superset
$ ll -sail
1285 43256 -rw-r--r-- 1 root root 44288000 Jan 22 14:06 superset.db
$ sqlite3 superset.db
sqlite> .databases
seq name file
--- --------------- ----------------------------------------------------------
0 main /root/.superset/superset.db
sqlite> .tables
ab_permission columns multiformat_time_series
ab_permission_view css_templates query
ab_permission_view_role dashboard_slices random_time_series
ab_register_user dashboard_user slice_user
ab_role dashboards slices
ab_user datasources sql_metrics
ab_user_role dbs table_columns
ab_view_menu energy_usage tables
access_request favstar url
alembic_version logs wb_health_population
birth_names long_lat
clusters metrics
# not suitable for MySQL
# sqlite> .output superset.sql
# sqlite> .dump
$ vim dump_for_mysql.py
# https://github.com/EricHigdon/sqlite3tomysql
$ sqlite3 superset.db .dump | python dump_for_mysql.py > superset.sql
$ ls -sail
1285 43256 -rw-r--r-- 1 root root 44288000 Jan 22 14:06 superset.db
18631 76968 -rw-r--r-- 1 root root 78812197 Jan 22 14:35 superset.sql
$ vim superset.sql
id INTEGER NOT NULL,
# Replace with an auto-increment (primary key) column
id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
$ scp superset.sql [email protected]192.168.1.12:/home/mysql
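The manual edits above can be automated. The sketch below is not the referenced sqlite3tomysql script; it is a minimal illustration, under the assumption that only two adjustments are needed: dropping SQLite-only statements and applying the same `id` column replacement shown above. `convert_line` and `convert_dump` are hypothetical names, and quoting or type differences between the two databases are left out.

```python
import re

# SQLite-only statements that have no MySQL counterpart
SKIP_PREFIXES = ("PRAGMA", "BEGIN TRANSACTION", "COMMIT")

# Matches the id column exactly as it appears in the dump
ID_COLUMN = re.compile(r"^(\s*)id INTEGER NOT NULL,\s*$")


def convert_line(line):
    """Return the MySQL-friendly form of one dump line, or None to drop it."""
    stripped = line.strip()
    # Note: this also drops any line mentioning sqlite_sequence
    if stripped.startswith(SKIP_PREFIXES) or "sqlite_sequence" in stripped:
        return None
    match = ID_COLUMN.match(line)
    if match:
        # Same replacement as the manual edit: make id an auto-increment key
        return match.group(1) + "id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,"
    return line.rstrip("\n")


def convert_dump(lines):
    """Convert an iterable of dump lines, dropping the unsupported ones."""
    converted = (convert_line(line) for line in lines)
    return [line for line in converted if line is not None]
```

Feeding `sys.stdin` to `convert_dump` and printing the result would let it sit in the same pipeline position as `dump_for_mysql.py` above. Tables that also declare `PRIMARY KEY (id)` as a separate constraint would need that line removed, since MySQL rejects a second primary key definition.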
Implementing My Own sqlite3tomysql.py
# druid02 192.168.1.12 MySQL
$ ps -ef | grep mysql | grep -v druid | grep -v grep
mysql 11435 8530 0 14:13 pts/4 00:00:00 /bin/sh /home/mysql/bin/mysqld_safe --defaults-file=/home/mysql/my.cnf
mysql 12192 11435 0 14:13 pts/4 00:00:00 /home/mysql/bin/mysqld --defaults-file=/home/mysql/my.cnf --basedir=/home/mysql --datadir=/home/mysql/data --plugin-dir=/home/mysql/lib/mysql/plugin --log-error=/home/mysql/data/druid02.err --open-files-limit=8192 --pid-file=/home/mysql/data/druid02.pid --socket=/home/mysql/data/mysql.sock --port=3306
mysql 12223 8530 0 14:13 pts/4 00:00:00 mysql -uroot -p -S /home/mysql/data/mysql.sock
$ su - mysql
$ mysql -uroot -p -S /home/mysql/data/mysql.sock
mysql> show databases;
mysql> create database superset;
mysql> show databases;
mysql> use superset;
# Run sqlite3tomysql.py
$ mysql -uroot -p superset2 -S /home/mysql/data/mysql.sock --default-character-set=utf8 < superset.sql.schema.sql
$ mysql -uroot -p superset2 -S /home/mysql/data/mysql.sock --default-character-set=utf8 < superset.sql.data.sql
# To work around foreign-key dependencies between tables, use `source superset.sql.schema.sql` at the mysql prompt and repeat the batch import several times
Metadata Storage