The 2017 Segment Open Fellows
It’s not hyperbole to say that Segment would not exist if not for open source. We’re heavy users of Kafka, Redis, Terraform, Docker, Golang, and Node.js, to name just a few of the tools we rely on. And we literally got our start as an open source library launched on Hacker News.
Today, our engineering team actively contributes to hundreds of open source projects.
Following in the footsteps of Stripe and Google, we opened a request for “Open Source Fellows”.
The pitch was simple: we give you $24,000 and three months to develop open source software, no strings attached. At the end, you present your results.
Today, we’re excited to announce the results of the program and the progress our fellows have made.
Tobias Koppers’ work on Webpack
Ben Weinstein’s work on DeepMeerkat
Justin Keyes’ work on Neovim
Julia Evans’ upcoming work on a Ruby debugger and profiler
For more information on the progress each participant made, read on!
Tobias Koppers
Tobias is one of the core maintainers and the project lead for Webpack–a module bundler for web applications. Tens of thousands of companies (including us here at Segment!) use Webpack to bundle together their image assets, JavaScript files, and CSS into single ‘bundled files’.
Instead of writing various pieces of inlined JavaScript, Webpack gives you a “universal tool” for getting all of your raw source code into a single set of optimized, dependency-managed files.
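For a sense of what that looks like in practice, here’s a minimal configuration sketch (the entry point and output paths are illustrative):

// webpack.config.js – a minimal, illustrative setup
module.exports = {
  entry: "./src/index.js", // the root of your dependency graph
  output: {
    filename: "bundle.js", // the single optimized, dependency-managed file
    path: __dirname + "/dist",
  },
};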
Tobias started Webpack back in 2012 while working on his master’s thesis. As part of the thesis, he was trying to create a simple webapp and wanted a way to bundle his JavaScript together. But he couldn’t find much good prior art out there.
At that time, the frontend JavaScript ecosystem looked very different. You didn’t really see require statements or dependency management. Most libraries were referenced through global variables stored on the window object (think jQuery’s window.$ object).
In the best case, you’d have an asset pipeline like Rails that injected all of your core libraries into a single HTML template.
Worst case (let’s be real, average case), you’d manage them by hand like this:
<script src="./js/libs/jquery.min.js"></script>
<script src="./js/libs/modernizr.min.js"></script>
<!-- add my custom assets in order -->
<script src="./js/src/base.js"></script>
<script src="./js/src/shared.js"></script>
<script src="./js/src/homepage.js"></script>
And while the odd project might have been using a tool like require.js or browserify, those tools weren’t really mainstream at the time.
So Tobias started by looking at Medikoo’s modules-webmake to package together the dependencies in his webapp. He liked the idea of not having to worry about the dependencies themselves and instead focusing just on the core development.
Tobias had previously used GWT’s code splitting feature, so he opened a pull request to add it to modules-webmake. But it was such a drastic change that he ended up forking the project into a new repository dubbed modules-webpack, which eventually became webpack/webpack.
As part of his work over the past three months, Tobias has focused on three separate additions to Webpack: implementing properly ordered scope hoisting for modules, speeding up the core parser, and in-lining ‘pure’ modules.
The features themselves are really cool–so it’s worth digging a little bit into what’s going on under the hood.
Scope Hoisting
For a bundler like Webpack, it’s a top priority to respect the ECMAScript specification when it comes to re-writing and generating spec-compliant code. And previously, Webpack had a few issues here.
As a quick example, suppose that we have two files: an index.js containing most of our code, and a corresponding b.js file which contains additional code. It should be possible to structure these modules like this:
index.js
import { b } from './b'; // b is imported
export function a() { // a should first be hoisted as an exported function
  return "a";
}
console.log(b); // "a"
b.js
import { a } from './index';
export const b = a(); // should equal "a", which was hoisted though not yet evaluated
While this is a contrived example, it matches the hoisting rules for ES6 functions and modules (all defined functions should be ‘hoisted’ to the scope of the module).
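Conceptually, scope hoisting lets Webpack concatenate both modules into one shared scope rather than wrapping each in its own function. A rough sketch of the result (not Webpack’s literal output) might look like this:

// concatenated bundle – conceptual sketch only
// (from b.js, which is evaluated first)
const b = a(); // works: function declarations hoist within the shared scope

// (from index.js)
function a() {
  return "a";
}
console.log(b); // "a"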
You can see the fix for yourself here.
Parser Speedup
Second, Tobias turned his focus toward speeding up the core Webpack parser. Since Webpack might re-build its source JavaScript hundreds of times locally in a given development cycle, one of the core requirements is that it has to be fast.
In particular, he implemented what he terms a StackedSetMap. The general idea is that it combines the stack-based rules of variable and function scoping with the quick access of maps and sets. You can think of it as a “stack” of “maps”, with each map representing the variables in scope at a certain point.
Instead of repeatedly copying large amounts of memory, the StackedSetMap just tracks parent references (typically another stack frame) and then searches the map of locally defined variables.
// Assumed definitions for the two sentinel values referenced below
// (the original code uses them, but this excerpt doesn't show them):
const TOMBSTONE = Symbol("tombstone");        // caches "not found in any scope"
const UNDEFINED_MARKER = Symbol("undefined"); // caches an explicitly-undefined value

class StackedSetMap {
  constructor(parentStack) {
    // create a new list with the elements of the parent stack
    this.stack = parentStack === undefined ? [] : parentStack.slice();
    this.map = new Map(); // create a new map for this frame of the stack
    this.stack.push(this.map); // push the new map onto the stack to track
  }
  add(item) {
    this.map.set(item, true); // add the item to the local scope's set
  }
  set(item, value) {
    // set the value for the key in the local scope
    this.map.set(item, value === undefined ? UNDEFINED_MARKER : value);
  }
  // get traverses the stack, returning the first match it sees,
  // from the most local scope out to the most global one
  get(item) {
    const topValue = this.map.get(item);
    if (topValue !== undefined)
      return topValue === TOMBSTONE || topValue === UNDEFINED_MARKER ? undefined : topValue;
    if (this.stack.length > 1) {
      for (var i = this.stack.length - 2; i >= 0; i--) { // traverse "up" the stack
        const value = this.stack[i].get(item);
        if (value !== undefined) {
          this.map.set(item, value); // cache the value in the local frame
          return value === TOMBSTONE || value === UNDEFINED_MARKER ? undefined : value;
        }
      }
      // if we didn't find it anywhere, cache a TOMBSTONE so we
      // don't have to traverse a second time
      this.map.set(item, TOMBSTONE);
    }
    return undefined;
  }
}
In this way, the StackedSetMap keeps both variable access and writing fast:
creating a child scope: O(1)
reading a value the first time: O(n), where n is the number of parent scopes
reading a value the second time: O(1)
writing a value: O(1)
If the parser ever wants to know whether a given variable is in scope, it can then query the map like so.
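Here’s a small usage sketch (the scope names and values are illustrative, not from the Webpack source):

const moduleScope = new StackedSetMap(); // top-level scope
moduleScope.set("myGlobal", "module-level info");

const fnScope = new StackedSetMap(moduleScope.stack); // child scope: O(1)
fnScope.set("localVar", "function-level info");

fnScope.get("myGlobal");     // O(n) the first time, O(1) once cached locally
fnScope.get("localVar");     // O(1), found in the local map
moduleScope.get("localVar"); // undefined – parent scopes can't see children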
If you’d like to see the change in action, you can find it here.
Side-effect-free pure modules
Finally, Tobias added support for side-effect-free ‘pure’ modules. As a toy example, suppose we have the following folder structure.
my-module
├── a.js
├── b.js
└── index.js
0 directories, 3 files
Here we have an a.js, a b.js, and an index.js file which exports them both. Suppose our index file looks something like this:
import { a } from "./a.js"
import { b } from "./b.js"
export default { a, b }
In most cases, the compiler or bundler will have to re-package the entire module (index.js, a.js, and b.js) to meet the spec. After all, the ES module spec requires that the entire module be loaded, just in case something from a.js modifies something from b.js.
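To make that concrete, consider a hypothetical consumer that only ever touches a–all three files still end up in the bundle:

// app.js – hypothetical consumer code
import myModule from "my-module"; // pulls in index.js, a.js, and b.js
console.log(myModule.a);          // ...even though only `a` is used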
However, Webpack module authors can specifically mark their packages with "pure-module": true in their package.json.
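A minimal sketch of what that package.json might look like (the name and version are illustrative):

{
  "name": "my-module",
  "version": "1.0.0",
  "pure-module": true
}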
When this happens, Webpack knows that each individual file won’t have any sort of side-effects when imported directly. So it can introduce an optimization and effectively omit including the index.js file, and instead require my-module/a and my-module/b directly.
This creates smaller builds which skip the extra set of lookups. It’s a win for everyone! And here’s the PR.
You can find Tobias’ full write-up of all his improvements (and those of the full Webpack team) on the Webpack Medium blog.
Ben Weinstein
Ben differs from the majority of the applicants to the fellowship. His focus isn’t on developer tooling or infrastructure. And while he has some software engineering background, he self-identifies primarily as a field research biologist.
Ben is a postdoc at Oregon State University, where he studies ecology. He spends his time applying statistical methods to the field, and is one of the world’s experts on taxonomic, phylogenetic, and trait beta diversity in South American hummingbirds.
Ben tells us that the most common means for ecologists to collect data is by observing animals directly in their natural habitat.
Here’s the typical setup. There’s a stationary camera somewhere, filming a congregation spot for hummingbirds, butterflies, or even sharks. The camera films the location for fifteen days in a row (!), and then the ecologist comes and retrieves the footage.
In Ben’s case, the camera takes a still frame several times per second, gathering 46,000 images over the course of the day.
The ecologist then combs through the video, hoping to catch a fleeting glimpse of the animal to identify it. Once they catch the handful of ‘good frames’, they’ll loop those frames again and again to properly tag the specimen.
It’s a needle-in-a-haystack search to find 10–15 MB of “good frames” amongst nearly 25 GB of noise.
Obviously, this process is time-consuming, error-prone, and tedious. So Ben thought it was a ripe candidate for a little bit of automation.
The result is a project he calls “DeepMeerkat”. It’s designed to run on your laptop: you just feed it an animal video, and it will automatically skip the empty frames and highlight the useful, interesting ones.
Under the hood, DeepMeerkat uses TensorFlow to build a model that helps identify the animals in videos. Each video is loaded into memory and first analyzed using OpenCV. For each frame in the video, OpenCV filters background from foreground using a Gaussian image filter, and then draws ‘bounding boxes’ around any distinct shapes. If the foreground differs significantly, the frame is then passed into TensorFlow to train the model there.
Ben’s been testing his model against a variety of animals and landscapes.
And he’s uploaded a number of sample videos to his YouTube channel to help train his dataset.
Over the course of the fellowship, Ben built out most of the software, and just added the ability to substitute your own training models in TensorFlow. His idea is that anyone can train their own TensorFlow model for their particular dataset–though right now he’s trained his net against ImageNet.
Additionally, Ben has started work to do the full model training in TensorFlow, rather than passing frames to OpenCV first. While pushing frames into TensorFlow is currently insanely slow (2–3 fps), he’d like to be able to do a first “filter” with OpenCV and then actually identify the animal within TensorFlow.
Currently, everything runs locally on his laptop, but he’s been experimenting with feeding the data into Google’s Cloud Dataflow to run it in the cloud. It’s his dream that someday he can just upload a video and have it automatically tagged.
He’s hoping that over the next six months he can spread it more widely. Over the past year, he’s seen about 800 downloads of DeepMeerkat, and he’d like to start onboarding more contributors. You can find the source on his GitHub.
Justin Keyes
Justin Keyes has been working on Neovim, a modernized fork of the popular text editor vim. If you’ve SSH’d into a Unix-based server at some point in the past 10 years, you’ve probably used vim to edit your files.
Vim itself has been around since 1991, when Bram Moolenaar ported the stevie editor (which, in turn, was based upon Bill Joy’s vi for Unix) from the Atari ST to the Amiga.
And while vim has shipped on practically every Linux, Unix, and Mac distribution since 1991, that long history means there are a lot of parts of the codebase that were added to handle 20 years’ worth of quirks and inconsistencies.
Enter Neovim–an effort to create a fork of vim that is extensible, pluggable, and hackable.
Instead of starting from scratch and then re-building all of the battle-tested functionality that vim has acquired over its 20-year history, the Neovim team started with vim itself as a foundation–but with the explicit goal of encouraging contributions, hacking, and pluggable functionality.
While Vim still just has a single named contributor (Bram Moolenaar), Neovim takes commits from hundreds of different engineers and enthusiasts.
Over the three months of the fellowship, Justin has been hard at work closing issues, merging contributions, and generally focusing on cleaning up various parts of the codebase.
In addition, he’s been working on a significant new feature: multicursor support. The goal is to allow users to queue up multiple actions at once, and then apply them in one go.
But when he started to build out the new feature, Justin realized that creating multi-cursor support in Neovim wasn’t actually feasible with the current codebase.
Most projects that re-implement vim model sets of operations as “pipelines” that produce nice, composable “commands”, which are then “executed”. But Neovim wasn’t implemented like that.
Instead, in Neovim, many normal-mode operations are implemented by pushing keys onto the internal input queue, as if the user had typed them. It’s clever and preserves parity, but it’s hard to reason about–and even harder to leverage programmatically.
So Justin started work to fix the root problem by introducing two new primitives: atoms and contexts.
Atoms
Atoms are simple: they represent a combination of user actions, rather than individual keystrokes. And as the name implies, they can be grouped and applied atomically. As an example, instead of using 3j to jump down three lines and treating 3 and j as different single keystrokes, this update groups them as an ‘atom’ that can be repeated and re-used.
Previously, Neovim would track macros as a string buffer of characters. A given macro might look like the raw input stream defined below:
macro: ll3jddgUUiabc
With the multicursor work, Neovim now tracks the individual atoms:
macro: ll3jddgUUiabc
^^^ ^ ^ ^ atoms
Because individual commands are grouped into atoms, user scripts, plugins, and remote clients can query the atoms queue at any time via the Neovim API.
For example, here are the last three items in the atoms queue from a sample editing session:
:echo nvim_get_atoms()[-3:]
[{'seq': 1, 'type': 8, 'keys': 'k'}, {'seq': 2, 'type': 10, 'keys': '^D'}, {'seq': 1, 'type': 8, 'keys': '3j'}]
It's easy to see that this is useful for applications beyond multicursor, such as introspection of user editing patterns, or re-invoking the last "thing" or the Nth last "thing". Using the current implementation, the following code maps "space" to repeat the last atom (unlike dot-repeat, this repeats motions, navigation, etc.):
:nnoremap <space> :call feedkeys(filter(nvim_get_atoms(),'v:val.keys!~#":"')[-1].keys)<CR>
With that mapping, a motion like 3j or CTRL-D can be repeated by pressing “space”.
Context
The context contains the current Neovim state. It allows you to serialize a given Neovim session and its actions, and retrieve them via Neovim’s API. This should allow multiple frontends to read from the same Neovim state, enabling functionality like remote vim UIs.
In the current implementation, you can call nvim_get_context(), and it will return the full context–no complicated parsing required:
:echo nvim_get_context()
{ 'pos': {'lnum': 0, 'col': 0, 'coladd': 0},
'registers': {'a': {'lines': ['current_SID']},
'b': {'lines': ['vim_FullName']},
'c': {'lines': ['"apjwldi)kllllllllllldrf)j^C^C']}, ...,
'0': {'lines': ['VV_TYPE_FUNC']},
'1': {'lines': ['foo']}, ...,
}
}
Justin expects the work to land just after the upcoming 0.2.1 release. It also helps ensure that everything in Neovim is API-first.
If you’d like to check out Neovim’s source, you can find it on Github.
Julia Evans
Last, we have Julia Evans, who deferred her participation until early next year. She hasn’t yet started the program, but we’re excited to help fund her next project: a new profiling tool for Ruby.
If you haven’t yet read her blog, Julia publishes a wonderful set of posts and zines on Linux performance and tracing tools. She’s written invaluable posts on everything from strace to pprof to eBPF.
When Julia applied, she said she was excited to work on better profiling tools for Ruby. And naturally, she’d already outlined some initial work in a post. :D
The inspiration for her work comes from using interactive tools like gdb or perf, and a desire to provide that same kind of experience for Ruby.
Today, if you want to get a CPU profile for an arbitrary Ruby program, you… can’t. You can use tools like stackprof (which is great), but to use stackprof you need to instrument your program in advance. The initial goal of this project is to be able to get a CPU profile for any Ruby program. You can do this for C/C++ programs, Rust programs, Go programs, and Java programs… so why should Ruby have worse tooling?
As she talks about in her post, you can interactively connect to a Ruby process with gdb. And you can even get the currently executing spot within that program using your gdb session. This excerpt is taken from her blog post:
# Phew. That was kind of long. Luckily, we just care about
# `location.path` and `location.label.`
# Let's print those out!
(gdb) p *((struct RString*) (ruby_current_thread->cfp + 1)->iseq.location.label)
$7 = {basic = {...}, as = {heap = ...,
ary = "block in initialize\000\000\000\000"}}
(gdb) p *((struct RString*) (ruby_current_thread->cfp + 1)->iseq.location.path)
$8 = {basic = {flags = 546318437, klass = 94660819015280}, as = {heap = {len = 64,
ptr = 0x5617f3432440 "/home/bork/.rbenv/versions/2.1.6/lib/ruby/2.1.0/webrick/utils.rb",}
If you only know the process ID, you can still understand exactly what your program is doing at this very moment.
Yet this still isn’t great. gdb uses the ptrace system call, which stops the program in its tracks and then intensely queries it for its internals. It’s not really a ‘passive’ profiler at all, and it won’t work in cases where your program is actually under load.
In just a few days, Julia built a prototype in Rust which could interactively introspect the system calls, function calls, and even spy on the memory contents using the process_vm_readv call (a syscall which can directly read memory from a user-space program).
Here’s an example of a flamegraph generated using her prototype:
It still has a little ways to go: she plans to make it more portable, and to have it work quickly and reliably across any sort of Linux distribution.
She’ll be starting early next year to work full-time on the fellowship. And during that time, she’ll be on sabbatical from her work at Stripe.
If you’d like to check out the early code, you can find it on her GitHub here, and her full blog post outlining the project here.
Looking Ahead
We’ve been quite pleased with the breadth and depth of open source work that has come out of the fellowship. And we’ve been impressed with what a single focused individual or small team can accomplish in as little as three months.
A number of the fellows agreed that the funding and the three-month period allowed them to really focus on tackling ‘bigger’ projects that they wouldn’t have had time for otherwise.
We’re hopeful that we can continue the program again next year, and encourage another batch of fellows to help spread the open source love. If you’re interested in applying next year, leave your email as part of our Open Fellowships email list and we’ll send you a reminder once applications are open.