Localization Technologies at Netflix

The localization program at Netflix is centered around linguistic excellence, a great team environment, and cutting-edge technology. The program is only 4 years old, which is unusual for a company of our size. We’ve built a team and toolset that reflect the scope and scale at which a localization team needs to operate in 2015, rather than one bogged down by years of legacy process and technology, as is often the case.

We haven’t been afraid to experiment with new localization models and tools, going against localization industry norms and achieving great things along the way. At Netflix we are given the freedom to trailblaze.

In this blog post we’re going to take a look at two major pieces of technology we’ve developed to assist us on our path to global domination…

Netflix Global String Repository

Having great content by itself is not enough to make Netflix successful; how the content is presented has a huge impact. Having an intuitive, easy-to-use, and localized user interface (UI) contributes significantly to Netflix’s success. Netflix is available on the web and on a vast number of devices and platforms including Apple iOS, Google Android, Sony PlayStation, Microsoft Xbox, and TVs from Sony, Panasonic, etc. Each of these platforms has its own standards for internationalization, and that poses a challenge to our localization team.

Here are some situations that require localization of UI strings:

  • New languages are introduced
  • New features are developed
  • Fixes are made to current text data

Traditionally, getting UI strings translated is a high-touch process where a localization PM partners with a dev team to understand where to get the source strings from, what languages to translate them into, and where to deliver the final localized files. This gets further complicated when multiple features are being developed in parallel using different branches in Git.

Once translations are completed and the final files delivered, an application typically goes through a build, test and deploy process. For device UIs, a build might need additional approval from a third party like Apple. This causes unnecessary delays, especially in cases where a fix to a string needs to be rolled out immediately.

What if we could make this whole process transparent to the various stakeholders, namely the development and localization teams? What if we could make builds unnecessary when fixes to text need to be delivered?

To answer those questions, we have developed a global repository for UI strings, called Global String Repository, that allows teams to store their localized string data and pull it out at runtime. We have also integrated Global String Repository with our current localization pipeline, making the whole process of localization seamless. All translations are available immediately for consumption by applications.

Global String Repository allows isolation through bundles and namespaces. A bundle is a container for string data across multiple languages. A namespace is a placeholder for bundles that are being worked upon. There is a default namespace that is used for publishing. A simple workflow would be:

  1. A developer makes a change to the English string data in a bundle in a namespace
  2. Translation workflows are automatically triggered
  3. Linguist completes the translation workflow
  4. Translations are made available to the bundle in the namespace
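
As a rough illustration of the workflow above, here is a minimal Python sketch of bundles, namespaces, and automatically triggered translation tasks. The class and method names are hypothetical and are not the actual Global String Repository API; English is assumed to be the source locale.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """A container for string data across multiple languages."""
    name: str
    # strings[locale][key] -> text
    strings: dict = field(default_factory=dict)

    def set_string(self, locale, key, text):
        self.strings.setdefault(locale, {})[key] = text


class Repository:
    """Hypothetical in-memory model; namespaces isolate work in progress."""

    def __init__(self, target_locales):
        self.target_locales = list(target_locales)
        # namespace -> bundle name -> Bundle; "default" is used for publishing.
        self.namespaces = {"default": {}}
        # Open translation tasks as (namespace, bundle, locale, key) tuples.
        self.pending = []

    def update_source(self, namespace, bundle_name, key, text):
        """Steps 1 and 2: change an English string and trigger translation."""
        bundles = self.namespaces.setdefault(namespace, {})
        bundle = bundles.setdefault(bundle_name, Bundle(bundle_name))
        bundle.set_string("en", key, text)
        for locale in self.target_locales:
            task = (namespace, bundle_name, locale, key)
            if task not in self.pending:
                self.pending.append(task)

    def complete_translation(self, namespace, bundle_name, locale, key, text):
        """Steps 3 and 4: a linguist finishes and the translation is stored."""
        self.pending.remove((namespace, bundle_name, locale, key))
        self.namespaces[namespace][bundle_name].set_string(locale, key, text)

    def lookup(self, namespace, bundle_name, locale, key):
        return self.namespaces[namespace][bundle_name].strings[locale][key]
```

In this sketch, a feature branch would write to its own namespace, so in-progress strings never leak into the default namespace that published applications read from.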

Applications have a choice when integrating with Global String Repository:

  • Runtime: Allows fast propagation of changes to UIs
  • Build time: Uses Global String Repository solely for localization but packages the data with the builds

Global String Repository allows build time integration by making all necessary localized data available through a simple REST API.
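
A build-time integration could look roughly like the following Python sketch, which writes one JSON file per locale so that the strings ship inside the build. The REST routes are not described in this post, so the `fetch_strings` callable is a hypothetical stand-in for whatever HTTP call a real build script would make, and the file naming scheme is likewise an assumption.

```python
import json
from pathlib import Path


def package_bundle(fetch_strings, bundle, locales, out_dir):
    """Write one JSON file per locale and return the paths written.

    fetch_strings(bundle, locale) -> dict is a hypothetical stand-in for a
    call to the repository's REST API.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for locale in locales:
        strings = fetch_strings(bundle, locale)
        target = out_path / f"{bundle}.{locale}.json"
        # ensure_ascii=False keeps translated text readable in the file.
        target.write_text(
            json.dumps(strings, ensure_ascii=False, sort_keys=True, indent=2),
            encoding="utf-8",
        )
        written.append(target)
    return written
```

Passing the fetcher in as an argument keeps the packaging step testable without a network connection.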

We expose the Global String Repository via the Netflix edge APIs, and it is subject to the same scaling and availability requirements as the other metadata APIs. It is a critical piece, especially for applications that are integrating at runtime. With over 60 million customers, a large portion of whom stream Netflix on devices, Global String Repository is in the critical path.

True to the Netflix way, Global String Repository consists of a back-end microservice and a UI. The microservice is built as a Java web application using Apache Cassandra and Elasticsearch. It is deployed in AWS across 3 regions. We collect telemetry for every API interaction.

The Global String Repository UI is developed using Node.js, Bootstrap and Backbone and is also deployed in the AWS cloud.

On the client side, Global String Repository exposes REST APIs to retrieve string data and also offers a Java client with in-built caching.
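
The post does not describe how the Java client caches data, so the following is only a generic sketch, written in Python, of one common approach: a time-to-live cache that falls back to stale data when a refresh fails. The class name, the `fetch_bundle` callable, and the default TTL are all assumptions made for illustration.

```python
import time


class CachingStringClient:
    """Hypothetical client-side cache in front of a string-fetching call."""

    def __init__(self, fetch_bundle, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch_bundle      # fetch_bundle(bundle, locale) -> dict
        self._ttl = ttl_seconds
        self._clock = clock             # injectable so tests can control time
        self._cache = {}                # (bundle, locale) -> (fetched_at, strings)

    def get_strings(self, bundle, locale):
        key = (bundle, locale)
        now = self._clock()
        cached = self._cache.get(key)
        # Serve from cache while the entry is still fresh.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        try:
            strings = self._fetch(bundle, locale)
        except Exception:
            # On a failed refresh, prefer stale strings over no strings.
            if cached is not None:
                return cached[1]
            raise
        self._cache[key] = (now, strings)
        return strings
```

A short TTL keeps the fast propagation that runtime integration is meant to provide, while the stale fallback keeps a UI rendering if the service is briefly unreachable.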

While we have Global String Repository up and running, there is still a long way to go. Some of the things we are currently working on are:

  • Enhancing support for quantity strings (plurals) and gender-based strings
  • Making the solution more resilient to failures
  • Improving scalability
  • Supporting multiple export formats (Android XML, Microsoft .resx, etc.)
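
To make the export-format item concrete, here is a minimal Python sketch that turns a flat key-to-text mapping into an Android string resource file. The function names are hypothetical, the sketch covers plain `<string>` entries only (no plurals), and it is not how Global String Repository implements exports.

```python
import xml.etree.ElementTree as ET


def _escape_android(text):
    # Android string resources expect backslash-escaped quotes and apostrophes.
    return text.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')


def to_android_xml(strings):
    """Render {key: text} as an Android <resources> document (as a str)."""
    resources = ET.Element("resources")
    for key in sorted(strings):
        node = ET.SubElement(resources, "string", name=key)
        # ElementTree handles XML escaping of characters such as & and <.
        node.text = _escape_android(strings[key])
    return ET.tostring(resources, encoding="unicode")
```

Other formats such as .resx could be added as sibling functions that take the same mapping, which keeps the stored string data independent of any one platform's file layout.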

The Global String Repository has no binding to Netflix’s business domain, so we plan on releasing it as open source software.

Hydra

Netflix, as a soon-to-be global service, supports many locales across a myriad of device/UI combinations; testing this manually just does not scale. Previously, members of the localization and UI teams would manually use actual devices, from game consoles to iOS and Android, to see all of these strings in context and to test both the content and any UI issues, such as truncations.

At Netflix, we think there is always a better way; with that attitude we rethought how we do in-context, on-device localization testing, and Hydra was born.

The motivation behind Hydra is to catalogue every possible unique screen and allow anyone to see a specific set of screens that they are interested in, across a wide range of filters including devices and locales. For example, as a German localization specialist you could, by selecting the appropriate filters, see the non-member flow in German across PS3, Website and Android. These screens can then be reviewed in a fraction of the time it would take to get to all of those different screens across those devices.
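
The German example above amounts to filtering a catalogue on several metadata fields at once. A minimal Python sketch of that idea follows; the `Screen` fields and the `filter_screens` helper are hypothetical and say nothing about how Hydra actually stores or queries screens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Screen:
    page_name: str
    feature_area: str
    platform: str
    locale: str
    image_url: str


def filter_screens(screens, **criteria):
    """Keep screens matching every criterion.

    Each criterion is field=value or field=[accepted values].
    """
    def matches(screen):
        for field_name, wanted in criteria.items():
            accepted = {wanted} if isinstance(wanted, str) else set(wanted)
            if getattr(screen, field_name) not in accepted:
                return False
        return True

    return [screen for screen in screens if matches(screen)]
```

For instance, `filter_screens(screens, locale="de-DE", feature_area="non-member", platform=["PS3", "Website", "Android"])` would return only the German non-member screens on those three platforms.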

How Screens Reach Hydra

Hydra itself does not take any of the screens; it serves to catalogue and display them. To get screens into Hydra, we leverage our existing UI automation. Through Jenkins CI jobs, data-driven tests are run in parallel across all supported locales to take screenshots and post the screens to Hydra with appropriate metadata, including page name, feature area, major UI platform, and one critical piece of metadata: the unique screen definition.

The purpose of the unique screen definition is to have a full catalogue of screens without any unnecessary overlap. This means fewer screens need to be reviewed and, in the longer term, a given screen can be compared against itself over time. The definition of a unique screen differs from UI to UI; for the browser, it is a combination of page name, browser, resolution, locale, and dev environment.
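
One way to picture the unique screen definition is as a composite key built from the metadata fields listed above. The Python sketch below uses the browser fields; the field names, the `Catalogue` class, and the in-memory storage are assumptions for illustration, not Hydra's implementation.

```python
from collections import defaultdict

# Fields that define a unique screen for the browser UI, per the text above.
BROWSER_KEY_FIELDS = ("page_name", "browser", "resolution", "locale", "environment")


def unique_screen_key(metadata, key_fields=BROWSER_KEY_FIELDS):
    """Build the composite key that identifies one unique screen."""
    return tuple(metadata[name] for name in key_fields)


class Catalogue:
    """Hypothetical store: one entry per unique screen, with its history."""

    def __init__(self):
        self._history = defaultdict(list)

    def post(self, metadata, screenshot_id):
        # Repeated posts of the same screen extend its history instead of
        # adding a new catalogue entry.
        self._history[unique_screen_key(metadata)].append(screenshot_id)

    def unique_screens(self):
        return len(self._history)

    def history(self, metadata):
        # .get avoids creating an empty entry for an unknown screen.
        return list(self._history.get(unique_screen_key(metadata), []))
```

Because each test run posts under the same key, the history list is what would allow a screen to be compared against earlier versions of itself.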

The Technology