Hands-on: Stock Price Statistics with NumPy
阿新 • Published: 2020-09-09
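Stock Price Statistics with NumPy
Constructing the data
- Construct the following data. Columns 4 to 8 (columns D to H in an Excel sheet, or indices 3 to 7 when counted from zero as `usecols` does) hold the stock's opening price, high, low, closing price, and trading volume, respectively.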
with open("./data/stock_data.csv", "w") as fdata:
    fdata.write("""AAPL,28-01-2011, ,344.17,344.4,333.53,336.1,21144800
AAPL,31-01-2011, ,335.8,340.04,334.3,339.32,13473000
AAPL,01-02-2011, ,341.3,345.65,340.98,345.03,15236800
AAPL,02-02-2011, ,344.45,345.25,343.55,344.32,9242600
AAPL,03-02-2011, ,343.8,344.24,338.55,343.44,14064100
AAPL,04-02-2011, ,343.61,346.7,343.51,346.5,11494200
AAPL,07-02-2011, ,347.89,353.25,347.64,351.88,17322100
AAPL,08-02-2011, ,353.68,355.52,352.15,355.2,13608500
AAPL,09-02-2011, ,355.19,359,354.87,358.16,17240800
AAPL,10-02-2011, ,357.39,360,348,354.54,33162400
AAPL,11-02-2011, ,354.75,357.8,353.54,356.85,13127500
AAPL,14-02-2011, ,356.79,359.48,356.71,359.18,11086200
AAPL,15-02-2011, ,359.19,359.97,357.55,359.9,10149000
AAPL,16-02-2011, ,360.8,364.9,360.5,363.13,17184100
AAPL,17-02-2011, ,357.1,360.27,356.52,358.3,18949000
AAPL,18-02-2011, ,358.21,359.5,349.52,350.56,29144500
AAPL,22-02-2011, ,342.05,345.4,337.72,338.61,31162200
AAPL,23-02-2011, ,338.77,344.64,338.61,342.62,23994700
AAPL,24-02-2011, ,344.02,345.15,338.37,342.88,17853500
AAPL,25-02-2011, ,345.29,348.43,344.8,348.16,13572000
AAPL,28-02-2011, ,351.21,355.05,351.12,353.21,14395400
AAPL,01-03-2011, ,355.47,355.72,347.68,349.31,16290300
AAPL,02-03-2011, ,349.96,354.35,348.4,352.12,21521000
AAPL,03-03-2011, ,357.2,359.79,355.92,359.56,17885200
AAPL,04-03-2011, ,360.07,360.29,357.75,360,16188000
AAPL,07-03-2011, ,361.11,361.67,351.31,355.36,19504300
AAPL,08-03-2011, ,354.91,357.4,352.25,355.76,12718000
AAPL,09-03-2011, ,354.69,354.76,350.6,352.47,16192700
AAPL,10-03-2011, ,349.69,349.77,344.9,346.67,18138800
AAPL,11-03-2011, ,345.4,352.32,345,351.99,16824200
""")
Reading the data
- Use np.loadtxt to read the CSV file.
import numpy as np

# Columns 6 and 7 (zero-based) hold the closing price and the trading volume.
end_price, turnover = np.loadtxt(
    fname="./data/stock_data.csv",
    delimiter=',',
    usecols=(6, 7),
    unpack=True
)
print(end_price)
print(turnover)
[336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99]
[21144800. 13473000. 15236800.  9242600. 14064100. 11494200. 17322100.
 13608500. 17240800. 33162400. 13127500. 11086200. 10149000. 17184100.
 18949000. 29144500. 31162200. 23994700. 17853500. 13572000. 14395400.
 16290300. 21521000. 17885200. 16188000. 19504300. 12718000. 16192700.
 18138800. 16824200.]
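The call to numpy.loadtxt above passes four keyword arguments (only fname is required):
1. fname is the file name, given as a string (str);
2. delimiter is the field separator, given as a string (str);
3. usecols gives the zero-based indices of the columns to read, as a tuple (a single int also works); the number of elements equals the number of columns selected;
4. unpack controls whether the result is unpacked, as a boolean (bool); when True, the array is transposed so that each selected column can be assigned to its own variable.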
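To see what usecols and unpack do, here is a minimal sketch that parses two rows of the data from an in-memory buffer instead of the file. The names sample, packed, close, and volume are illustrative and do not appear in the original article.

```python
import io

import numpy as np

# Two rows copied from the data set above; io.StringIO stands in for the file.
sample = io.StringIO(
    "AAPL,28-01-2011, ,344.17,344.4,333.53,336.1,21144800\n"
    "AAPL,31-01-2011, ,335.8,340.04,334.3,339.32,13473000\n"
)

# Without unpack, each selected column becomes one column of a 2-D array.
packed = np.loadtxt(sample, delimiter=",", usecols=(6, 7))
print(packed.shape)  # (rows, selected columns)

# With unpack=True the result is transposed, so each selected column
# can be bound to its own variable.
sample.seek(0)  # rewind the buffer before reading it a second time
close, volume = np.loadtxt(sample, delimiter=",", usecols=(6, 7), unpack=True)
print(close)
print(volume)
```

Because only columns 6 and 7 are selected, the non-numeric ticker and date columns are never converted to floats.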
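Applications
Computing the volume-weighted average price
- Concept: the volume-weighted average price (VWAP) is a very important economic quantity that represents the "average" price of a financial asset.
- The larger the volume traded at a given price, the greater the weight of that price. VWAP is the weighted average of the prices, with trading volume as the weights.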
import numpy as np

end_price, turnover = np.loadtxt(
    fname="./data/stock_data.csv",
    delimiter=',',
    usecols=(6, 7),
    unpack=True
)
# Plain arithmetic mean of the closing prices.
print(np.average(end_price))
# VWAP: closing prices weighted by trading volume.
print(np.average(end_price, weights=turnover))
351.0376666666667
350.5895493532009
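The weighting can be checked by hand. The sketch below uses two made-up prices and volumes (not taken from the data set) to show that np.average with weights matches the written-out formula sum(price × volume) / sum(volume).

```python
import numpy as np

# Hypothetical prices and volumes, chosen so the arithmetic is easy to follow.
prices = np.array([10.0, 20.0])
volumes = np.array([1.0, 3.0])

# VWAP written out: sum(price * volume) / sum(volume) = (10 + 60) / 4.
vwap_manual = (prices * volumes).sum() / volumes.sum()

# The same quantity through np.average with weights.
vwap_numpy = np.average(prices, weights=volumes)

print(vwap_manual, vwap_numpy)  # both 17.5
print(prices.mean())            # the unweighted mean is 15.0
```

The higher price carries three times the volume, so the weighted average is pulled toward it.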
Computing the maximum and minimum
import numpy as np

# Columns 4 and 5 (zero-based) hold the daily high and low prices.
high_price, low_price = np.loadtxt(
    fname="./data/stock_data.csv",
    delimiter=',',
    usecols=(4, 5),
    unpack=True
)
print("max=", high_price.max())
print("min=", low_price.min())
max= 364.9
min= 333.53
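Computing the range (peak to peak)
- Compute the difference between the largest and smallest of the recent daily highs, and likewise the difference between the largest and smallest of the recent daily lows.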
np.ptp(a, axis=None, out=None, keepdims=<no value>)
import numpy as np

# Columns 4 and 5 (zero-based) hold the daily high and low prices.
high_price, low_price = np.loadtxt(
    fname="./data/stock_data.csv",
    delimiter=',',
    usecols=(4, 5),
    unpack=True
)
print("max - min of high price:", np.ptp(high_price))
print("max - min of low price:", np.ptp(low_price))
max - min of high price: 24.859999999999957
max - min of low price: 26.970000000000027
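As a small illustration of what np.ptp computes, the sketch below takes the first three daily highs and lows from the data set and compares np.ptp with max minus min. The variable names high, low, and both are illustrative.

```python
import numpy as np

# The first three daily highs and lows from the data set above.
high = np.array([344.4, 340.04, 345.65])
low = np.array([333.53, 334.3, 340.98])

# np.ptp is the maximum minus the minimum.
assert np.isclose(np.ptp(high), high.max() - high.min())

# With a 2-D array, axis selects the direction of the reduction:
# axis=1 gives one range per row (here, one for highs and one for lows).
both = np.vstack([high, low])
print(np.ptp(both, axis=1))
```

Stacking the two series and passing axis=1 is one way to get both ranges in a single call.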
Computing the median
- Compute the median of the closing prices.
import numpy as np

# Column 6 (zero-based) holds the closing price.
end_price = np.loadtxt(
    fname="./data/stock_data.csv",
    delimiter=',',
    usecols=6
)
print("median =", np.median(end_price))
median = 352.055
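The data set has 30 closing prices, an even count, so the median is the mean of the two middle values after sorting. A minimal sketch with made-up numbers shows both the odd and the even case; the names odd, even, and manual are illustrative.

```python
import numpy as np

odd = np.array([3.0, 1.0, 2.0])
even = np.array([4.0, 1.0, 3.0, 2.0])

print(np.median(odd))   # 2.0, the middle value of [1, 2, 3]
print(np.median(even))  # 2.5, the mean of the middle pair (2, 3)

# Equivalent manual computation for the even-length case.
s = np.sort(even)
mid = len(s) // 2
manual = (s[mid - 1] + s[mid]) / 2
print(manual)
```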
Computing the variance
- Compute the variance of the closing prices.
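import numpy as np

# Column 6 (zero-based) holds the closing price.
end_price = np.loadtxt(
    fname="./data/stock_data.csv",
    delimiter=',',
    usecols=6
)
# The function and the array method give the same result.
print("variance =", np.var(end_price))
print("variance =", end_price.var())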
variance = 50.126517888888884
variance = 50.126517888888884
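By default np.var computes the population variance, that is, the mean of the squared deviations from the mean (ddof=0). The sketch below checks this on made-up numbers and shows the ddof=1 variant, which divides by n - 1 instead of n. The array x is illustrative.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

# Population variance (default ddof=0): the mean of squared deviations.
manual = np.mean((x - x.mean()) ** 2)
print(manual, np.var(x))  # both 1.25

# Sample variance: divide by n - 1 instead of n, giving 5 / 3 here.
sample_var = np.var(x, ddof=1)
print(sample_var)
```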
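Computing stock returns, annual volatility, and monthly volatility
- In investing, volatility is a measure of how much prices vary. Historical volatility can be computed from historical price data, and the computation uses log returns.
- Here, annual volatility is the standard deviation of the log returns divided by their mean, multiplied by the square root of the number of trading days in a year, usually taken as 252.
- Monthly volatility is the standard deviation of the log returns divided by their mean, multiplied by the square root of the number of trading months in a year, usually taken as 12.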
import numpy as np

# Column 6 (zero-based) holds the closing price.
end_price = np.loadtxt(
    fname="./data/stock_data.csv",
    delimiter=',',
    usecols=6
)
# Log returns: differences of consecutive log closing prices.
log_returns = np.diff(np.log(end_price))
annual_volatility = log_returns.std() / log_returns.mean() * np.sqrt(252)
monthly_volatility = log_returns.std() / log_returns.mean() * np.sqrt(12)
print("Annual volatility", annual_volatility)
print("Monthly volatility", monthly_volatility)
Annual volatility 129.27478991115134
Monthly volatility 28.210071915112593
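The log-return step can be illustrated on a few made-up prices. The sketch below shows that np.diff(np.log(p)) equals log(p[t] / p[t-1]) and yields one value fewer than the number of prices. The helper scaled_volatility is a hypothetical wrapper around this article's formula (standard deviation divided by mean, times a square-root factor); many references annualize the standard deviation alone, so treat it as a restatement of the formula used here, not a general definition.

```python
import numpy as np

# Hypothetical closing prices for four consecutive days.
prices = np.array([100.0, 110.0, 99.0, 108.9])

# np.diff(np.log(p)) is log(p[t]) - log(p[t-1]) = log(p[t] / p[t-1]).
log_returns = np.diff(np.log(prices))
ratio_form = np.log(prices[1:] / prices[:-1])
print(np.allclose(log_returns, ratio_form))  # True
print(len(prices), len(log_returns))         # 4 3


def scaled_volatility(returns, periods):
    """This article's definition: std / mean, scaled by sqrt(periods)."""
    return returns.std() / returns.mean() * np.sqrt(periods)


annual = scaled_volatility(log_returns, 252)
monthly = scaled_volatility(log_returns, 12)
# The two figures differ only by the factor sqrt(252 / 12) = sqrt(21).
print(annual / monthly)
```

The same ratio of about 4.58 separates the annual and monthly figures printed above.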