Building Bullet Graphs and Waterfall Charts with Bokeh

By 阿新 • Published: 2018-12-29

Waterfall Chart

I decided to take Bryan’s comments as an opportunity to create a waterfall chart in Bokeh and see how hard (or easy) it is to do. He recommended the candlestick chart as a good place to start, and I used it as the basis for this solution. All of the code is in a notebook that is available here.

Let’s start with the Bokeh and pandas imports and enabling the notebook output:

from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource, LabelSet
from bokeh.models.formatters import NumeralTickFormatter
import pandas as pd

output_notebook()

For this solution, I’m going to create a pandas dataframe and use Bokeh’s ColumnDataSource to make the code a little simpler. This has the added benefit of making this code easy to convert to take an Excel input instead of the manually created dataframe.

Feel free to refer to this cheatsheet if you need some help understanding how to create the dataframe as shown below:

# Create the initial dataframe
index = ['sales','returns','credit fees','rebates','late charges','shipping']
data = {'amount': [350000,-30000,-7500,-25000,95000,-7000]}
df = pd.DataFrame(data=data,index=index)

# Determine the total net value by adding the start and all additional transactions
net = df['amount'].sum()

              amount
sales         350000
returns       -30000
credit fees    -7500
rebates       -25000
late charges   95000
shipping       -7000

The final waterfall code is going to require us to define several additional attributes for each segment including:

starting position
bar color
label position
label text

By adding these attributes to a single dataframe, we can use Bokeh’s built-in capabilities to simplify the final code.

For the next step, we’ll add the running total, segment start location and the position of the label:

df['running_total'] = df['amount'].cumsum()
df['y_start'] = df['running_total'] - df['amount']

# Where do we want to place the label?
df['label_pos'] = df['running_total']
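
As a quick check on the arithmetic, here is a minimal sketch in plain Python (no pandas) that reproduces what cumsum() and the subtraction compute for the sample amounts. The variable names are only for illustration and are not part of the notebook:

```python
from itertools import accumulate

# Same amounts as the dataframe, in the same order
amounts = [350000, -30000, -7500, -25000, 95000, -7000]

# Running total after each transaction (what cumsum() computes)
running_total = list(accumulate(amounts))

# Each bar starts where the previous one ended:
# the running total minus the bar's own amount
y_start = [total - amount for total, amount in zip(running_total, amounts)]

print(running_total)
print(y_start)
```

Each segment is then drawn from y_start to running_total, so a negative amount produces a bar that steps down from the previous total.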

Next, we add a row at the bottom of the dataframe that contains the net value:

df_net = pd.DataFrame.from_records([(net, net, 0, net)],
                                   columns=['amount', 'running_total', 'y_start', 'label_pos'],
                                   index=["net"])
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df = pd.concat([df, df_net])

For this particular waterfall, I would like the negative values to be a different color, with their labels positioned below the bar. Let’s add columns to the dataframe with these values:

df['color'] = 'grey'
df.loc[df.amount < 0, 'color'] = 'red'
df.loc[df.amount < 0, 'label_pos'] = df.label_pos - 10000
df["bar_label"] = df["amount"].map('{:,.0f}'.format)
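
The bar_label column relies on Python’s format specification mini-language: the comma adds a thousands separator and .0f drops the decimals. A small standalone sketch of that formatting, using a few of the sample amounts:

```python
# ',' inserts a thousands separator; '.0f' rounds to zero decimal places
fmt = '{:,.0f}'.format

sample_amounts = [350000, -30000, -7500]
bar_labels = [fmt(amount) for amount in sample_amounts]

print(bar_labels)
```

The same format string works on floats, so the labels stay clean if the amounts are later read in as floating point values.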

Here’s the final dataframe containing all the data we need. It did take some manipulation of the data to get to this state but it is fairly standard pandas code and is easy to debug if something goes awry.

              amount  running_total  y_start  label_pos  color  bar_label
sales         350000         350000        0     350000   grey    350,000
returns       -30000         320000   350000     310000    red    -30,000
credit fees    -7500         312500   320000     302500    red     -7,500
rebates       -25000         287500   312500     277500    red    -25,000
late charges   95000         382500   287500     382500   grey     95,000
shipping       -7000         375500   382500     365500    red     -7,000
net           375500         375500        0     375500   grey    375,500

Creating the actual plot is fairly standard Bokeh code since the dataframe has all the values we need:

TOOLS = "box_zoom,reset,save"
source = ColumnDataSource(df)
# Bokeh 3.0 removed plot_width in favor of width; use plot_width=800 on older releases
p = figure(tools=TOOLS, x_range=list(df.index), y_range=(0, net+40000),
           width=800, title="Sales Waterfall")

Because the ColumnDataSource is built from our dataframe, Bokeh takes care of creating all the segments and labels without any looping.

p.segment(x0='index', y0='y_start', x1="index", y1='running_total',
          source=source, color="color", line_width=55)

We will do some minor formatting to add labels and format the y-axis nicely:

p.grid.grid_line_alpha=0.3
p.yaxis[0].formatter = NumeralTickFormatter(format="($ 0 a)")
p.xaxis.axis_label = "Transactions"

The final step is to add all the labels onto the bars using a LabelSet:

labels = LabelSet(x='index', y='label_pos', text='bar_label',
                  text_font_size="8pt", level='glyph',
                  x_offset=-20, y_offset=0, source=source)
p.add_layout(labels)

Here’s the final chart:

Once again, I think the final solution is simpler than the matplotlib code and the resulting output looks pleasing. You also have the added bonus that the charts are interactive and could be enhanced even more by using the Bokeh server (see my Australian Wine Ratings article for an example). The code should also be straightforward to modify for your specific datasets.
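
To make that modification easier, the dataframe preparation steps can be collected into one function. This is only a sketch under a few assumptions: build_waterfall_frame and its label_offset parameter are hypothetical names that do not appear in the notebook, the input is assumed to be a dict of label to amount, and pd.concat stands in for DataFrame.append, which newer pandas releases no longer provide.

```python
import pandas as pd


def build_waterfall_frame(amounts, label_offset=10000):
    """Hypothetical helper: build the waterfall dataframe from {label: amount}."""
    df = pd.DataFrame({'amount': list(amounts.values())},
                      index=list(amounts.keys()))
    net = df['amount'].sum()

    # Running total, segment start and default label position
    df['running_total'] = df['amount'].cumsum()
    df['y_start'] = df['running_total'] - df['amount']
    df['label_pos'] = df['running_total']

    # Net row: a single bar from 0 up to the net value
    df_net = pd.DataFrame.from_records(
        [(net, net, 0, net)],
        columns=['amount', 'running_total', 'y_start', 'label_pos'],
        index=['net'])
    df = pd.concat([df, df_net])

    # Negative amounts are red and get their label nudged below the bar
    negative = df['amount'] < 0
    df['color'] = 'grey'
    df.loc[negative, 'color'] = 'red'
    df.loc[negative, 'label_pos'] = df.loc[negative, 'label_pos'] - label_offset
    df['bar_label'] = df['amount'].map('{:,.0f}'.format)
    return df


sales_df = build_waterfall_frame({
    'sales': 350000, 'returns': -30000, 'credit fees': -7500,
    'rebates': -25000, 'late charges': 95000, 'shipping': -7000,
})
print(sales_df)
```

The returned dataframe has the same columns as the one built step by step above, so it can be passed to ColumnDataSource and plotted with the same segment and LabelSet calls.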
