Time Series: Several Ways to Visualize the Data
阿新 • Published: 2019-01-01
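1: Line plots over time
Besides the most common single line plot, you can also split the data into groups, for example one line plot per year: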
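from pandas import read_csv
from pandas import DataFrame
from pandas import Grouper
from matplotlib import pyplot
# Series.from_csv and TimeGrouper were removed from pandas; read_csv and Grouper replace them
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# 'YS' (year start) bins the data by calendar year
groups = series.groupby(Grouper(freq='YS'))
years = DataFrame()
for name, group in groups:
    # one column per year; this assumes every year has the same number of rows
    years[name.year] = group.values
years.plot(subplots=True, legend=False)
pyplot.show()
Each row of subplots covers the 365 days of one year.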
2: Histograms and density plots
series.plot(kind='kde') draws a density plot
series.hist() draws a histogram
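To make concrete what the histogram call summarizes, here is a minimal sketch that counts observations per bin with NumPy. The `values` array is a hypothetical sample, not the temperature data, and the choice of 10 equal-width bins mirrors the default of `hist()`.

```python
import numpy as np

# Hypothetical sample standing in for the temperature series.
values = np.array([2.1, 3.4, 3.9, 5.0, 5.2, 5.8, 6.1, 7.7, 9.3, 12.0])

# 10 equal-width bins between the minimum and maximum value.
counts, edges = np.histogram(values, bins=10)

# Each bar of the histogram is one (left edge, right edge, count) triple.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.2f} to {right:5.2f}: {count}")
```

A density plot shows the same distribution as a smoothed curve instead of discrete bars.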
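3: Box plots
series.plot(kind='box') draws a box plot of the whole series (a Series has no boxplot() method; that method belongs to DataFrame)
The following draws one box plot per month within a single year: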
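from pandas import read_csv
from pandas import DataFrame
from pandas import Grouper
from matplotlib import pyplot
from pandas import concat
# Series.from_csv and TimeGrouper were removed from pandas; read_csv and Grouper replace them
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
one_year = series.loc['1990']
# 'MS' (month start) bins the data by calendar month
groups = one_year.groupby(Grouper(freq='MS'))
# one column per month; shorter months are padded with NaN
months = concat([DataFrame(x[1].values) for x in groups], axis=1)
months = DataFrame(months)
months.columns = range(1,13)
months.boxplot()
pyplot.show()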
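For readers unsure what each box encodes, the sketch below computes the usual box plot statistics by hand. The `sample` values are hypothetical, and the 1.5 × IQR whisker rule is an assumption based on Matplotlib's default setting.

```python
import pandas as pd

# Hypothetical values with one obvious outlier.
sample = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 100])

q1 = sample.quantile(0.25)      # bottom edge of the box
median = sample.quantile(0.50)  # line drawn inside the box
q3 = sample.quantile(0.75)      # top edge of the box
iqr = q3 - q1                   # height of the box

# Points beyond these fences are drawn as individual outlier markers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = sample[(sample < lower_fence) | (sample > upper_fence)]

print(q1, median, q3)
print(outliers.tolist())
```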
4: Heat maps
Here the x-axis is the day of the year, the y-axis is the year, and the observed value is encoded as color.
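from pandas import read_csv
from pandas import DataFrame
from pandas import Grouper
from matplotlib import pyplot
# Series.from_csv and TimeGrouper were removed from pandas; read_csv and Grouper replace them
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# 'YS' (year start) bins the data by calendar year
groups = series.groupby(Grouper(freq='YS'))
years = DataFrame()
for name, group in groups:
    years[name.year] = group.values
# transpose so that each row is a year and each column is a day of the year
years = years.T
pyplot.matshow(years, interpolation=None, aspect='auto')
pyplot.show()
You can likewise draw a heat map of month against day of the month: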
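The shape of the grid behind the yearly heat map can be checked without the CSV file. The sketch below uses a synthetic daily series (an assumption, three non-leap years of made-up values) and groups on `series.index.year` rather than a Grouper, purely to keep the example short.

```python
import numpy as np
import pandas as pd

# Synthetic daily data for 1981-1983 (none of these is a leap year).
dates = pd.date_range('1981-01-01', '1983-12-31', freq='D')
series = pd.Series(np.arange(len(dates), dtype=float), index=dates)

# One column per year, then transpose: rows become years, columns become days.
by_year = pd.DataFrame({year: group.values
                        for year, group in series.groupby(series.index.year)})
grid = by_year.T

print(grid.shape)
```

A grid like this is what gets passed to `pyplot.matshow`, which colors each cell by its value.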
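from pandas import read_csv
from pandas import DataFrame
from pandas import Grouper
from matplotlib import pyplot
from pandas import concat
# Series.from_csv and TimeGrouper were removed from pandas; read_csv and Grouper replace them
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
one_year = series.loc['1990']
# 'MS' (month start) bins the data by calendar month
groups = one_year.groupby(Grouper(freq='MS'))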
months = concat([DataFrame(x[1].values) for x in groups], axis=1)
months = DataFrame(months)
months.columns = range(1,13)
pyplot.matshow(months, interpolation=None, aspect='auto')
pyplot.show()
5: Scatter plots of a series against its lagged values
Pandas has a built-in function for exactly this called the lag plot. It plots the observation at time t on the x-axis and the next observation (t+1) on the y-axis.
- If the points cluster along a diagonal line from the bottom-left to the top-right of the plot, it suggests a positive correlation.
- If the points cluster along a diagonal line from the top-left to the bottom-right, it suggests a negative correlation.
- Either relationship is good, as both can be modeled.
Points packed more tightly around the diagonal suggest a stronger relationship; the more the points spread away from the line, the weaker the relationship.
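from pandas import read_csv
from matplotlib import pyplot
# lag_plot lives in pandas.plotting (pandas.tools.plotting was removed)
from pandas.plotting import lag_plot
# Series.from_csv was removed from pandas; read_csv replaces it
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')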
lag_plot(series)
pyplot.show()
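The diagonal patterns described above correspond to the sign of the lag-1 correlation, which can be computed directly. This is a minimal sketch on two hypothetical series (`rising` and `alternating` are made-up data, not the temperature file).

```python
import pandas as pd

# A steadily rising series: each value sits just above the previous one.
rising = pd.Series([float(i) for i in range(10)])
# A series that flips sign at every step.
alternating = pd.Series([1.0, -1.0] * 5)

# Correlate each series with a copy of itself shifted by one step.
rising_r = rising.corr(rising.shift(1))
# Series.autocorr(lag=1) performs the same shift-and-correlate in one call.
alternating_r = alternating.autocorr(lag=1)

print(rising_r, alternating_r)
```

The rising series would fall on the bottom-left to top-right diagonal of a lag plot, the alternating one on the top-left to bottom-right diagonal.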
The plot above only compares each observation with its lag-1 neighbor; you can also compare an observation with each of its lag-1 through lag-7 values:
from pandas import read_csv
from pandas import DataFrame
from pandas import concat
from matplotlib import pyplot
# Series.from_csv was removed from pandas; read_csv replaces it
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
values = DataFrame(series.values)
lags = 7
# first column is the unshifted series, followed by one shifted copy per lag
columns = [values]
for i in range(1,(lags + 1)):
    columns.append(values.shift(i))
dataframe = concat(columns, axis=1)
columns = ['t+1']
for i in range(1,(lags + 1)):
    columns.append('t-' + str(i))
dataframe.columns = columns
pyplot.figure(1)
for i in range(1,(lags + 1)):
    # 240 + i selects panel i of a 2-row by 4-column grid
    ax = pyplot.subplot(240 + i)
    ax.set_title('t+1 vs t-' + str(i))
    pyplot.scatter(x=dataframe['t+1'].values, y=dataframe['t-'+str(i)].values)
pyplot.show()
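6: Autocorrelation plots (unlike section 5, this step quantifies the correlation)
The correlation at each lag is measured with a Pearson-type correlation coefficient between the series and a lagged copy of itself, so it ranges from -1 to 1.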
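from pandas import read_csv
from matplotlib import pyplot
# autocorrelation_plot lives in pandas.plotting (pandas.tools.plotting was removed)
from pandas.plotting import autocorrelation_plot
# Series.from_csv was removed from pandas; read_csv replaces it
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')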
autocorrelation_plot(series)
pyplot.show()
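To show what one point on an autocorrelation plot represents, here is a sketch of the standard sample autocorrelation estimate. The function name `sample_autocorrelation` and the `wave` data are hypothetical, and the assumption that this matches the curve drawn by `autocorrelation_plot` (including the ±1.96/sqrt(n) band as the usual large-sample 95% limit) should be checked against the pandas documentation.

```python
import numpy as np

def sample_autocorrelation(values, lag):
    """Autocovariance at the given lag divided by the variance of the series."""
    data = np.asarray(values, dtype=float)
    n = len(data)
    mean = data.mean()
    # c0 is the lag-0 autocovariance, i.e. the variance with a 1/n divisor.
    c0 = np.sum((data - mean) ** 2) / n
    # Pair each value with the value `lag` steps later, using the overall mean.
    ch = np.sum((data[: n - lag] - mean) * (data[lag:] - mean)) / n
    return ch / c0

# Made-up signal that repeats every 4 steps.
wave = [0.0, 1.0, 0.0, -1.0] * 25

r2 = sample_autocorrelation(wave, 2)  # half a period apart: strongly negative
r4 = sample_autocorrelation(wave, 4)  # a full period apart: strongly positive

# Assumed large-sample 95% band; values inside it are hard to tell from noise.
band95 = 1.96 / np.sqrt(len(wave))

print(r2, r4, band95)
```

A seasonal series such as daily temperatures would show the same kind of swing between positive and negative values as the lag grows, only with a period of about one year.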
https://machinelearningmastery.com/time-series-data-visualization-with-python/