LinkedIn Spotlights GDMix, Its New Open-Source Deep Ranking Framework
LinkedIn’s 706 million members and hundreds of billions of posts pose a clear challenge for a social network that aims to provide busy professionals with the most useful information possible. How does LinkedIn know which John Smith a given member is looking for? How does LinkedIn assess which courses and posts to position on a given member’s feed? The answer, of course: AI. In a blog post, LinkedIn software engineer Jun Shi and his colleagues highlighted GDMix, a new, open-source deep ranking personalization framework created by LinkedIn software engineers to efficiently tackle these kinds of questions.
“The interactions between a member and an item such as a member profile, a job posting, or a blog post usually result in a large number of features,” the authors wrote. “For example, the interactions between more than 700 million members and millions of items on LinkedIn result in a model with tens of billions to a trillion features. It is very difficult, if not impossible, to train models of this size efficiently. Training such models may require specialized processors, extraordinarily large system memory, and ultra fast network connections, among other challenges.”
Enter GDMix (for “generalized deep mixed model”). GDMix, the authors explained, “breaks down a large model into a global model (a.k.a. ‘fixed effect’) and a large number of small models (a.k.a. ‘random effects’), then solves them individually.” This “divide-and-conquer” approach, they said, allows for efficient training of large models with lightweight hardware, expands support for deep learning models and generally offers a substantial improvement over its predecessor, Photon-ML.
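To make the fixed effect/random effects split concrete, here is a minimal sketch in plain Python. It is not GDMix's API: the function names (`fit_logistic`, `train_mixed`, `predict_proba`) are hypothetical, and it assumes logistic regression for both parts, with each per-member model trained against the global model's score as a fixed offset. It illustrates only the general idea of solving one shared model first and many small models afterward.

```python
import math

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, offsets, l2=1.0, lr=0.1, steps=300):
    """Gradient descent for L2-regularized logistic regression.

    `offsets` holds a fixed per-row score that is added to the logit
    but is not trained (assumption: this is how the stages are chained).
    """
    n, dim = len(rows), len(rows[0])
    w = [0.0] * dim
    for _ in range(steps):
        grad = [l2 * wj / n for wj in w]
        for x, y, off in zip(rows, labels, offsets):
            err = sigmoid(off + dot(w, x)) - y
            for j in range(dim):
                grad[j] += err * x[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def train_mixed(global_feats, member_feats, member_ids, labels):
    """Train one global model, then one small model per member."""
    # Stage 1 ("fixed effect"): a single model fit on every row.
    w_global = fit_logistic(global_feats, labels, [0.0] * len(labels))
    global_scores = [dot(w_global, x) for x in global_feats]

    # Stage 2 ("random effects"): group rows by member and fit each
    # group independently, so every fit touches only a small slice.
    groups = {}
    for i, m in enumerate(member_ids):
        groups.setdefault(m, []).append(i)
    per_member = {}
    for m, idx in groups.items():
        per_member[m] = fit_logistic(
            [member_feats[i] for i in idx],
            [labels[i] for i in idx],
            [global_scores[i] for i in idx],
        )
    return w_global, per_member

def predict_proba(w_global, per_member, global_x, member_x, member_id):
    """Sum the global score and the member's own score, if it has one."""
    score = dot(w_global, global_x)
    if member_id in per_member:
        score += dot(per_member[member_id], member_x)
    return sigmoid(score)
```

Because each per-member fit in stage 2 depends only on that member's rows and the precomputed global scores, the fits are independent of one another and could in principle be distributed across modest machines, which is the property the authors emphasize.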
Along with scalability, flexibility and efficiency, GDMix’s deep learning compatibility is a particular highlight. The tool leverages DeText, a deep learning ranking framework for text understanding, to model the relationships between a source (such as a member profile) and a target (such as a job posting).
The authors claim that GDMix is suited for ranking tasks across a variety of activities, including job recommendations, app store recommendations, movie recommendations, ecommerce searches, content searches and ad ranking. The researchers evaluated GDMix on internal LinkedIn datasets, seeing a 10 to 40 percent decrease in linear model training time. Based on that success, GDMix is now part of an “overall redesign of the search experience for LinkedIn members” and “efforts are underway to integrate GDMix into LinkedIn’s AI production workflows.”
GDMix, which is still under active development, is open source and available on GitHub.