Spreading the Love in the LinkedIn Feed with Creator-Side Optimization
Standard A/B testing can’t measure network impact.
Our first choice might be to run two experiments: one randomized on the viewers and one randomized on the creators. For the creator-side experiment, we could boost half of the creators in the feeds of their networks while leaving the other half unboosted, then compare the two groups. We've frequently used this technique in other areas of LinkedIn, such as notifications. Unfortunately, that option doesn't work in this case due to technical limitations, so for now we have to rely on randomizing on the feed viewers.
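To make the distinction concrete, here's a minimal sketch of the two randomization units, using a simple hash-based bucketing scheme. Everything here (the function, the experiment names, the bucketing math) is illustrative, not LinkedIn's actual experimentation platform; the only difference between the two designs is whose ID gets hashed.

```python
import hashlib

def in_treatment(unit_id: str, experiment: str, treatment_pct: float = 0.5) -> bool:
    """Deterministically assign a unit (creator or viewer) to treatment."""
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10_000  # stable bucket in [0, 10000)
    return bucket < treatment_pct * 10_000

# Creator-side design: a post is boosted for *every* viewer
# if its creator falls in the treatment group.
boost_post = in_treatment("creator_123", "creator_boost_exp")

# Viewer-side design: the new ranking model is applied only
# for viewers who fall in the treatment group.
use_new_model = in_treatment("viewer_456", "feed_ranking_exp")
```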
A second option, and the one we used in this case, is "upstream/downstream metrics." These are metrics calculated on a single member but designed to measure some aspect of that member's impact on their network. As a basic example, if we run an experiment that causes members to post more, we can measure the downstream impact on their networks with a metric like "total feedback received." For a given poster, "total feedback received" counts the likes and comments they received on their posts. This helps us distinguish between a treatment that causes members to create boring posts that get very little feedback and one that helps members create interesting posts that get more feedback.
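As a rough illustration, here's what computing "total feedback received" might look like over a hypothetical feedback log. The schema and field names are invented for the example, not LinkedIn's actual data model.

```python
from collections import Counter

# Hypothetical log: one row per piece of feedback (like or comment)
# that a creator received on one of their posts.
feedback_events = [
    # (creator_id, post_id, feedback_type)
    ("alice", "post_1", "like"),
    ("alice", "post_1", "comment"),
    ("alice", "post_2", "like"),
    ("bob",   "post_3", "comment"),
]

def total_feedback_received(events):
    """Per creator, count likes + comments received across all their posts."""
    totals = Counter()
    for creator_id, _post_id, _feedback_type in events:
        totals[creator_id] += 1
    return totals

print(total_feedback_received(feedback_events))
# Counter({'alice': 3, 'bob': 1})
```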
In the case of this new feed model, to get a sense of the potential impact on creators, we used a suite of upstream metrics, including "first likes given." This metric quantifies how often I, as a feed viewer, like a post that didn't previously have any likes: if I see a post with no likes and click the like button, I've just created one "first like given." The suite contains several variations on this theme involving comments, the freshness of the post, and the changing value of feedback beyond the first piece (which rolls into a metric called "creator love given"), but every metric follows the same pattern of measuring value given to the creator.

The big caveat is that even if viewers in the treatment group give 10 more "first likes given" than viewers in the control group, that doesn't mean creators will receive 10 more first likes if we ramp the experiment to everyone. If someone in the treatment group hadn't given that first like, it's possible that someone in the control group would still have given it later, so all our treatment has done is deliver the creator their first like a bit sooner than they would've gotten it otherwise. The metric gives us a directional signal that we're changing things in the way that we want, but it doesn't scale linearly to the whole population. If we want to accurately measure the actual impact on creators, we'll need a different method.
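To recap the definition in code, here's a minimal sketch of "first likes given," assuming a hypothetical time-ordered like log that covers each post's full lifetime (so a post absent from the log truly has no likes yet). The schema is again invented for illustration.

```python
from collections import Counter

# Hypothetical log of like events, one row per like.
like_events = [
    # (timestamp, viewer_id, post_id)
    (1, "viewer_a", "post_1"),  # post_1's first like
    (2, "viewer_b", "post_1"),  # post_1 already liked: not a first like
    (3, "viewer_a", "post_2"),  # post_2's first like
]

def first_likes_given(events):
    """Per viewer, count likes that were the first like on a post."""
    already_liked = set()
    counts = Counter()
    for _ts, viewer_id, post_id in sorted(events):  # process in time order
        if post_id not in already_liked:
            counts[viewer_id] += 1
            already_liked.add(post_id)
    return counts

print(first_likes_given(like_events))
# Counter({'viewer_a': 2})
```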