
Tweets, pipelines, Google Cloud, and poetry 

On a recent GCP project, a customer did say,
“Can you spin up a pipeline by the end of the day?”
One simple requirement, easily understood,
Analyse live tweets with BigQuery under the hood.

Now I know what y’all thinking, “just use GKE!”,
But spinning up Kubernetes wasn’t to be.
Anything but PaaS frankly wasn’t allowed,
Never to tell why, is something I’ve vowed.

It could never be done, they did nervously claim,
“Nonsense, challenge accepted!”, I did joyfully exclaim.
Surely, someone smart had built this before,
My first port of call was the interweb to explore.

Something I found on my computering quest,
Was a link to a repo with a code treasure chest.
It looked perfectly suited for the task at hand,
But I quickly discovered it used something banned.

With GKE at its core, ’twas code I couldn’t abide,
But, I forked it anyway as my hacking guide.
You see, it’s all just containers, nothing complex,
And so, I decided to run it in App Engine Flex!

Some libraries were old, but that was an easy fix,
A sprinkling of Docker with a little spicy Python mix.
The code for reading tweets, I could easily reuse,
It even published to PubSub, I had nothing to lose!

Python (2.x) for reading tweets & publishing to PubSub
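The original screenshot of the tweet reader isn't reproduced here, so what follows is only a rough sketch of the publishing half, written in Python 3 rather than the 2.x the post used. It assumes the `google-cloud-pubsub` client library; the project ID `my-project`, the topic name `tweets`, and the helper names `encode_tweet` and `publish_tweets` are all made up for illustration, and the tweets are assumed to arrive as plain dicts.

```python
import json
from typing import Any, Iterable, List, Protocol


class Publisher(Protocol):
    """Anything with a Pub/Sub-style publish(topic, data) method."""

    def publish(self, topic: str, data: bytes) -> Any: ...


def encode_tweet(tweet: dict) -> bytes:
    """Serialise one tweet to UTF-8 JSON bytes; Pub/Sub payloads must be bytes."""
    return json.dumps(tweet, ensure_ascii=False).encode("utf-8")


def publish_tweets(publisher: Publisher, topic_path: str, tweets: Iterable[dict]) -> List[Any]:
    """Publish each tweet as its own message and return whatever publish() gave back."""
    return [publisher.publish(topic_path, encode_tweet(t)) for t in tweets]


def main() -> None:
    # Requires `pip install google-cloud-pubsub` and GCP credentials.
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    # "my-project" and "tweets" are placeholder names, not the post's real ones.
    topic_path = publisher.topic_path("my-project", "tweets")
    sample = [{"id": 1, "text": "hello from Melbourne"}]
    for future in publish_tweets(publisher, topic_path, sample):
        print(future.result())  # blocks until the server returns a message ID
```

Calling `main()` would publish one sample message. Keeping the publisher as a parameter means the encoding logic can be checked locally with a stand-in object, no cloud project needed.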

The Docker file contained was so very concise,
A few little tweaks, and it was quickly looking nice.
To use with App Engine, another file I did create,
Nine lines of YAML, just one more than eight!

It doesn’t get any easier than this
Nine lines of YAML for App Engine

Next up, a Dataflow pipeline to read from PubSub,
Writing this was faster than enjoying a pint at the pub.
Now, pay attention to this important piece of code,
All it writes to BigQuery is the message payload!

Dataflow pipeline in Java — don’t judge me, please
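The post's pipeline was Java and its screenshot is missing, so this is not the author's code. It is a hedged Python sketch of the same idea using the Apache Beam SDK: read raw messages from PubSub and write each one, unparsed, into a single STRING column. The column name `payload`, the helper `to_row`, and the `topic` and `table` arguments are assumptions for illustration.

```python
from typing import Dict


def to_row(message: bytes) -> Dict[str, str]:
    """Wrap the raw Pub/Sub payload in a one-column row; no parsing at all."""
    return {"payload": message.decode("utf-8")}


def run(topic: str, table: str) -> None:
    # Requires `pip install "apache-beam[gcp]"`.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Pub/Sub is an unbounded source, so the pipeline has to run in streaming mode.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadTweets" >> beam.io.ReadFromPubSub(topic=topic)
            | "ToRow" >> beam.Map(to_row)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table,
                schema="payload:STRING",
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

Here `topic` would look like `projects/<project>/topics/<topic>` and `table` like `<project>:<dataset>.<table>`. Running on Dataflow rather than locally would also need the usual runner, project, and region options, which are left out of this sketch.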

No complex schemas to create, less code to write,
“I’ll easily be finished before the onset of night!”.
The last piece of the puzzle, the essential glue if you will,
Cloud Build to deploy, giving me more time to chill.

Trivial Cloud Build config
Glorious Cloud Build logs

With everything now deployed, sitting pretty on Google Cloud,
Data began streaming in and I felt so-oh very proud!
The solution wasn’t pretty, some would say ’twas a hack,
But I didn’t have much time, so cut me some slack!

App Engine’y stuff
Dataflow chugging on PubSub & writing to BigQuery

Using its native JSON functions to parse the tweet,
BigQuery’s power and speed just can’t be beat.
A few lines of SQL that even my Mum could write,
Seeing the tweets show up was a beholding sight!

Yayy! Tweets in BigQuery. High-five!
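The actual query isn't shown in the post, so here is a small guess at what "a few lines of SQL" over a raw payload column might look like, wrapped in Python. `JSON_EXTRACT_SCALAR` is a real BigQuery function; the table `my-project.twitter.tweets`, the column `payload`, and the choice of the `text` and `user.screen_name` fields are assumptions. `extract_locally` is a hypothetical stand-in that pulls the same two fields with the standard library, for sanity-checking a payload offline.

```python
import json

# Hypothetical table; `payload` is the single STRING column holding raw tweet JSON.
TWEET_QUERY = """
SELECT
  JSON_EXTRACT_SCALAR(payload, '$.user.screen_name') AS screen_name,
  JSON_EXTRACT_SCALAR(payload, '$.text') AS tweet_text
FROM `my-project.twitter.tweets`
LIMIT 10
"""


def extract_locally(payload: str) -> dict:
    """Local stand-in for the two JSON_EXTRACT_SCALAR calls above."""
    tweet = json.loads(payload)
    return {
        "screen_name": tweet.get("user", {}).get("screen_name"),
        "tweet_text": tweet.get("text"),
    }


def run_query() -> None:
    # Requires `pip install google-cloud-bigquery` and GCP credentials.
    from google.cloud import bigquery

    client = bigquery.Client()
    for row in client.query(TWEET_QUERY).result():
        print(row["screen_name"], row["tweet_text"])
```

Storing the whole tweet as one string and parsing at query time is what keeps the pipeline schema-free; the trade-off is that every query pays for the JSON parsing.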

Before the sun did set on that fine Melbourne night,
I’d spun up a pipeline, highlighting GCP’s might.
Just one simple command to deploy this demo app,
On my back, I did proudly give myself a well deserved clap!

Yes, I do own that book

Note: here is the repo.