Case Study: Netflix

Netflix is an American digital content streaming and production company that was founded in 1997 by Reed Hastings and Marc Randolph. The company initially offered a mail-based DVD rental subscription service, but it introduced digital video streaming services in 2007.1 In 2016, the company separated its DVD service onto a new platform, known as DVD.com: A Netflix Company.2 The company then transitioned its main platform to subscription-based digital video streaming. The company is regarded as the world’s largest subscription streaming service,3 with most recent estimates suggesting that the company now has 167.1 million subscribers around the world, with over 100 million of these users residing outside the United States.4 The company currently ranks twenty-first for global internet engagement on Alexa rankings.5 According to Nielsen’s Total Audience Report released in February 2020,6 Netflix accounted for 31 percent of streaming time in U.S. homes capable of over-the-top streaming.7

Netflix’s recommendation system is an important contributor to its revenue generation model, driving approximately 80 percent of hours of content streamed on the platform.8 Unlike YouTube and Amazon, the platform does not deliver targeted advertisements to its users. Rather, the company relies on subscriptions to both its digital video streaming service and DVD-delivery service to generate revenue. As competition between video streaming platforms has heated up, the company has also invested heavily in producing original content. The platform’s financial success relies on attracting and maintaining user attention, and preventing users from leaving Netflix in favor of competitors. Its recommendation engine is vital to achieving this, and it is therefore key to the company’s business model.

Netflix’s recommendation system holds significant influence over how the platform operates and how users engage with the service. Despite this, the platform has offered limited transparency around how this system is designed and how it operates. This is troubling given that the company’s recommendation systems have raised concerns related to biased and discriminatory outcomes. In addition, Netflix offers users only a limited set of controls over how algorithmic decision-making shapes their platform experience.

A Technical Overview of Netflix’s Recommendation System

Netflix operates its own proprietary recommendation engine that delivers personalized title recommendations to users. When the company was a mail-based DVD rental company, this recommendation system was known as Cinematch, and it was designed to help users fill their order queues with titles that they wanted to receive in the mail over the coming days, weeks, and months. Under the DVD-rental business model, each title took a few days to be delivered and users could only order a limited number of titles at a time. Netflix’s recommendation system aimed to predict the number of stars, on a scale of one to five, that a user would give a title after they had watched it. Once a user returned a title, they provided their actual rating. This served as the primary source of feedback from the user that the company then used to retrain and optimize its recommendation algorithms.9

In 2006, the company launched the Netflix Prize challenge in order to improve predictions of user enjoyment based on individuals’ movie preferences.10 The company made a portion of its film database public for the competition, challenging participants to develop a collaborative filtering algorithm that could improve Netflix’s own results by 10 percent.11 The winning team, BellKor’s Pragmatic Chaos,12 was able to improve the results by 10.06 percent,13 and they were awarded a grand prize of $1 million.14 Netflix still uses the algorithm the team developed to help predict ratings.15
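
To make the 10 percent target concrete: the competition scored entries by root mean squared error (RMSE) between predicted and actual star ratings, so an improvement means a relative reduction in RMSE. The sketch below shows that arithmetic in Python. The function names and all ratings are invented for illustration and are not Netflix Prize data.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual star ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))

def percent_improvement(baseline_rmse, candidate_rmse):
    """Relative RMSE reduction of a candidate over a baseline, in percent."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse

# Made-up ratings on the one-to-five star scale (illustration only).
actual = [4, 3, 5, 2, 1]
baseline = [3, 3, 4, 3, 2]    # stand-in for an existing system's predictions
candidate = [4, 3, 4, 3, 1]   # stand-in for a challenger's predictions

improvement = percent_improvement(rmse(baseline, actual), rmse(candidate, actual))
print(f"RMSE improvement: {improvement:.2f}%")
```

On this toy data the candidate clears a 10 percent threshold easily; on the real dataset, reaching 10.06 percent took the winning team roughly three years.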

However, as Netflix shifted toward a video streaming-focused service, it began transforming its recommendation system to align with new parameters and goals. When the company expanded into more customer segments and geographies, it realized a growing need to deliver and recommend personalized and interesting titles to different categories of users.16 In 2014, the company invested $150 million (approximately 3 percent of its revenue at the time) in creating a team of 300 employees dedicated to improving the company’s recommendation engine.17 When Netflix introduced its services globally in 2016, it developed a single global recommender system that shares and relies on data from across each of its countries of operation. The company hoped that doing so would improve its recommendations for users in smaller markets without negatively impacting recommendations for users in larger markets.18

As a video streaming-focused company, Netflix had to alter its recommendation engine to provide users with multiple real-time suggestions on what to watch in the moment. Unlike subscribers to Netflix’s DVD rental service, streaming subscribers can sample as many titles as they want before settling on one. In addition, these users are able to view numerous titles in one sitting or in quick succession.19 Consumer research has suggested that the average Netflix user loses interest in the service after approximately 60 to 90 seconds of browsing and considering potential titles. Once a user loses interest, it is likely that they will switch over to another streaming service.20 As a result, user retention is dependent on the company’s recommendation engine being able to provide real-time, personalized recommendations.21 Unlike the recommendation algorithms used before 2006, the recommendation algorithms that the company now uses can be optimized in real time using explicit data as well as granular implicit data.22

Netflix collects explicit feedback data from users by enabling them to provide a thumbs up or thumbs down on titles.23 This is similar to the explicit data the company collected during its DVD rental period, albeit at a larger scale. The company also collects implicit data, which consists of behavioral user data that is collected in real-time as a user navigates and engages with content on the service. Implicit data can include what titles a user has watched, the time of day and day of the week that a user is watching, what devices a user is watching on, and how long they are watching certain titles.24 The company also collects data on where in each row of recommendations the selected title appeared, and what titles were recommended to a user but not selected.25 It also considers data from other users, such as the combined ratings for a title made by all users who are similar to a given user.26 Netflix has stated that the implicit data it collects does not include demographic data such as race, age, and gender,27 although presumably user patterns and behaviors could enable the company to infer this information. Because implicit data provides more granular insights into a user’s behaviors and preferences, the company considers it to be more useful than explicit data for constructing and retraining recommendation algorithms.28

Netflix’s recommendation engine is at play at all stages of a user’s journey on the platform. When a user creates a Netflix account or adds a new profile to their account, they are first prompted to select a few titles that they like. These titles are used to jump-start their recommendations. This process, however, is optional. If a user elects to forgo it, the initial recommendations the system delivers will be for a diverse and popular set of titles.29 Once a user begins watching titles on the platform, the system will use the explicit and implicit data it collects on the user and their viewing practices to adjust its recommendations. These data points will supersede any initial preference indications that a user previously provided, suggesting that revealed preferences carry greater weight in the recommendations process than a user’s initially stated preferences. As a user watches more content over time, the recommendation engine will begin to weight titles that the user consumed more recently more heavily than titles that were consumed in the past.30
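
Netflix has not published how it weights recent viewing against older viewing. One common way to express the idea is exponential decay, sketched below in Python. The half-life, the genre labels, and the function names are all assumptions made for illustration.

```python
from datetime import date

def recency_weight(watched_on, today, half_life_days=30.0):
    """Weight that halves every `half_life_days` (an assumed parameter)."""
    age_days = (today - watched_on).days
    return 0.5 ** (age_days / half_life_days)

def genre_affinity(history, today, half_life_days=30.0):
    """Sum recency weights per genre and normalize them to shares of 1."""
    totals = {}
    for genre, watched_on in history:
        weight = recency_weight(watched_on, today, half_life_days)
        totals[genre] = totals.get(genre, 0.0) + weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {genre: weight / grand_total for genre, weight in totals.items()}

# Hypothetical viewing history: two old documentary views, one recent comedy view.
history = [
    ("documentary", date(2020, 1, 1)),
    ("documentary", date(2020, 1, 5)),
    ("comedy", date(2020, 3, 28)),
]
affinity = genre_affinity(history, today=date(2020, 3, 31))
print(affinity)
```

Under these assumptions, a single comedy viewed three days ago outweighs two documentaries viewed nearly three months ago, which is the behavior the paragraph above describes.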

Today, Netflix’s recommendation engine is a machine learning-driven collection of algorithms that serve different purposes and collectively create the Netflix streaming experience.31 This recommendation system offers multiple categories of recommendations. Most of these categories of recommendations reside on the homepage, which is the page that a user first sees when they log into their profile on any device. The homepage is the main hub for a user’s personalized recommendations and it is where two of every three hours of content streamed on the platform are discovered.32

Below is a breakdown of the different categories of recommendation algorithms that comprise Netflix’s recommendation system:

  1. Personalized Video Ranker (PVR): The PVR algorithm operates on the Netflix homepage and presents users with the entire catalog of titles available on the platform for the region in which they live, as well as certain categories of titles filtered by specific genre-based themes in a personalized manner. The Netflix homepage is designed in a matrix-like layout. In this matrix, each entry is a recommended title, and each row of recommended videos is grouped together and labeled based on an overarching theme (e.g. Award-Winning Documentaries, Soapy TV Dramas, etc.).33 On average, each user will find approximately 40 rows on their homepage, with up to 75 videos per row. However, these figures may vary depending on the functionality of the user’s device. Because the homepage experience is personalized to each user, the titles in each row, the order of the titles in each row, and the rows themselves vary per person. However, because the PVR algorithm is applied widely across the platform, it is designed so that it can provide broader and more generalized content recommendations and rankings. This limits the extent of the personalization that the algorithm can subsequently generate. As a result, this algorithm works best when it is producing recommendations based on a combination of personalized signals and general popularity signals, which are used to generate the recommendations in the Popular row.34
  2. Top N Video Ranker: The Top N video-ranker algorithm is used to produce recommendations for titles that appear in the Top Picks row on the homepage. This algorithm is designed to identify a limited number of personalized recommendations from the entire Netflix catalog based on titles that are ranked highly. Whereas the PVR algorithm is assessed and optimized using metrics and algorithms that focus on the ranking that is produced for the entire catalog, the Top N video-ranker algorithm relies on metrics and algorithms that focus on the top percentiles of the catalog ranking. Like the PVR algorithm, the Top N video-ranker algorithm combines personalization and popularity metrics when making its recommendations. In addition, like the PVR algorithm, it is also able to identify and integrate user viewing trends it has collected over different time periods, ranging from a day to a year.35
  3. Continue Watching Video Ranker (CWR): The CWR is a video ranking algorithm that is used to order the titles that appear in the Continue Watching row on the homepage. Whereas the majority of the video rankers that Netflix deploys order unviewed titles based on implicit data on user preferences and interests, the CWR sorts and ranks titles that a user has recently watched based on its calculations of whether the user will continue watching the title, or rewatch it, and whether a user abandoned a title because they found it uninteresting. The CWR algorithm considers signals such as the time that has elapsed since a user viewed the title, when a user stopped watching the title (e.g. mid-title, at the beginning of the title, or at the end of the title), whether the user has viewed different titles since abandoning the title, what device the user was watching the title on, etc.36
  4. Video-Video Similarity (Sims): The Sims algorithm is an algorithm used to generate the Because You Watched (BYW) row, which features recommendations that are generated based on a user’s consumption of one particular title. The algorithm evaluates every single title in the Netflix catalogue in order to identify titles that are similar to a title a user has recently watched. It then ranks and presents these similar titles in the BYW row. Although the Sims algorithm generates groupings that are not personalized to a specific user (i.e. everyone will be recommended the same predetermined list of similar titles if they watch a certain title), which BYW rows eventually appear on a user’s homepage (e.g. BYW recommendations for Title A or Title B that a user has watched), as well as which titles within a predetermined BYW list appear, is personalized. This personalization is based on estimates of which titles a user would be interested in, and what they have already watched.37
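
The PVR and Top N descriptions above both involve combining a personalization signal with a popularity signal and then ranking titles. The Python sketch below illustrates that idea with a simple linear blend. The blend weight `alpha`, the scores, and the function names are assumptions; Netflix has not disclosed how its rankers actually combine these signals.

```python
def blended_score(personal, popularity, alpha=0.7):
    """Linear blend: alpha=1 is purely personalized, alpha=0 purely popular."""
    return alpha * personal + (1 - alpha) * popularity

def rank_catalog(personal_scores, popularity_scores, alpha=0.7):
    """Rank every title in the catalog, best first (a PVR-style full ranking)."""
    return sorted(
        personal_scores,
        key=lambda title: blended_score(
            personal_scores[title], popularity_scores.get(title, 0.0), alpha
        ),
        reverse=True,
    )

def top_n(ranking, n):
    """Keep only the head of the ranking (a Top N-style short list)."""
    return ranking[:n]

# Hypothetical scores in [0, 1] for three placeholder titles.
personal = {"Title A": 0.9, "Title B": 0.2, "Title C": 0.5}
popularity = {"Title A": 0.1, "Title B": 0.95, "Title C": 0.5}

full_ranking = rank_catalog(personal, popularity)
print(full_ranking)
print(top_n(full_ranking, 2))
print(rank_catalog(personal, popularity, alpha=0.0))  # popularity only
```

Setting `alpha` to zero reproduces a pure popularity ordering, loosely analogous to the Popular row, while higher values lean on the individual user's signals.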

Each of the platform’s video ranking algorithms relies on different mathematical and statistical signals and data as input. They also require different model training scenarios that are constructed based on the specific goal of each algorithm.38 Whether or not the recommendation system is able to suggest titles that a user is likely to watch strongly influences user retention on the platform. As a result, this is a key metric for indicating quality and effectiveness of the recommender system.39 However, there is little transparency around the results of these tests, and around the training data the company uses to structure its recommendation system. This makes it difficult to understand how and if the company’s recommendation system caters to the needs of different categories of users (e.g. users in different regions, users of different genders, etc.). It also makes it difficult to identify instances of bias.

In February 2020, Netflix added another layer of personalized recommendations to its users’ homepages, known as the Top 10 list. The list is updated daily, and the positioning of the row on a user’s homepage depends on how relevant Netflix believes the titles in the list are to that user. When a user clicks on the Movies or TV Shows tab, they can also see lists of the top 10 movies or top 10 TV shows in their country at that time. Titles in the Top 10 lists that also appear in other recommendation rows on a user’s homepage are marked with a badge that reads Top 10. This recommendation feature was initially piloted in the United Kingdom and Mexico before it was introduced worldwide.40

On a user’s homepage, there are three distinct types of personalization: the choice of which rows are displayed and in which order (e.g. Because You Watched, Continue Watching, Romantic Comedies, etc.), which titles appear in each row, and the order in which these titles appear within each row. The most highly recommended titles appear toward the top of the homepage and are ordered from left to right within each row. However, for users who have set their language preference to Arabic or Hebrew, this is reversed, and the most highly recommended titles are ordered from right to left, since this is the direction in which those languages are read.41 The company’s recommendation algorithm also calculates a percent match score that is displayed next to each title. This score is different for each user and is a prediction of how likely that user is to like a given title.42
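
The three layers of personalization described above (which rows, which titles, and in what order) can be sketched as a small layout function. The Python example below is only an illustration: the scores, row names, title labels, and the `build_homepage` function are hypothetical, and the row and title limits simply reuse the approximate figures cited earlier.

```python
RTL_LANGUAGES = {"ar", "he"}  # Arabic and Hebrew are read right to left

def build_homepage(row_scores, title_scores, language="en",
                   max_rows=40, max_titles=75):
    """Return (row name, titles) pairs, with titles in left-to-right screen order."""
    # Layer 1: choose and order the rows, most relevant at the top.
    ordered_rows = sorted(row_scores, key=row_scores.get, reverse=True)[:max_rows]
    page = []
    for row in ordered_rows:
        scores = title_scores[row]
        # Layers 2 and 3: choose the titles for this row and rank them, best first.
        titles = sorted(scores, key=scores.get, reverse=True)[:max_titles]
        if language in RTL_LANGUAGES:
            # Reverse so the best title sits at the right edge of the screen.
            titles = titles[::-1]
        page.append((row, titles))
    return page

# Hypothetical relevance scores for one user.
row_scores = {"Romantic Comedies": 0.4, "Continue Watching": 0.9}
title_scores = {
    "Continue Watching": {"T1": 0.8, "T2": 0.6},
    "Romantic Comedies": {"T3": 0.3, "T4": 0.7},
}

print(build_homepage(row_scores, title_scores, language="en"))
print(build_homepage(row_scores, title_scores, language="ar"))
```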

In addition to these four core recommendation algorithms, the platform also deploys other algorithms that work in conjunction with the recommendation system. These algorithms include:

  1. Evidence Selection: Evidence Selection algorithms assess all information that is available about a title and determine which characteristics to feature in the description of the title. This choice is based on the algorithm’s prediction of what a given user will find most helpful when they are considering a recommendation. Evidence is the information that a user can see on the top left of their homepage, such as the synopsis of a title, thumbnail images of a title, and other relevant information such as the cast members and awards. For example, evidence selection algorithms determine whether a title being recommended to a user should be labeled as an Oscar-winning film or as a film that is similar to another film that the user recently watched. Evidence selection algorithms are also responsible for determining which of the image thumbnails should be displayed to a user.43 In 2017, Netflix began personalizing the artwork or thumbnails that each user sees when they browse different titles. For example, for the TV show Stranger Things, there are nine different artwork options that a user could potentially see while browsing. Which thumbnail a user eventually sees is based on their browsing history and whether they have demonstrated an interest in comedy, horror, or suspense titles in the past.44
  2. Search: As previously noted, Netflix’s recommendation system accounts for 80 percent of the hours of content streamed on the platform. Its search functionality, on the other hand, accounts for the remaining 20 percent. Using the search functionality, users can search the entire Netflix catalog available in their country for a particular title, actor, genre, and so on. When a user inputs a query, the results are based on the top results related to the actions of other users who have entered the same or similar queries.45 The search functionality relies on its own set of algorithms. If an item is not in Netflix’s catalog, the search algorithms recommend alternative content based on the search query. This can be challenging given that many search queries are incomplete phrases or terms that consist of only a few letters. The search functionality relies on three different algorithms:46

    1. The first algorithm aims to identify titles that match a search query (e.g. delivering Fantasia when a user searches “fan”).
    2. The second algorithm predicts a user’s interest in a specific concept when they enter a partial search query (e.g. identifying and suggesting the concept “fantasy” when a user searches for “fan”).
    3. The third algorithm provides title recommendations for a specific concept (e.g. providing specific title recommendations for fantasy movies).
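
The three search steps above can be illustrated with a toy example built around the article's own "fan" query. The Python sketch below uses plain prefix matching as a stand-in; the actual algorithms reportedly learn from the behavior of other users who entered similar queries. The catalog, its concept tags, and the function names are invented for illustration.

```python
# Hypothetical mini-catalog: title -> concept tags (all tags are made up).
CATALOG = {
    "Fantasia": ["fantasy", "animation"],
    "Fargo": ["crime"],
    "Pan's Labyrinth": ["fantasy"],
}

def match_titles(query, catalog=CATALOG):
    """Step 1: titles whose names start with the (partial) query."""
    q = query.lower()
    return sorted(title for title in catalog if title.lower().startswith(q))

def suggest_concepts(query, catalog=CATALOG):
    """Step 2: concepts the partial query might be pointing at."""
    q = query.lower()
    concepts = {tag for tags in catalog.values() for tag in tags}
    return sorted(concept for concept in concepts if concept.startswith(q))

def titles_for_concept(concept, catalog=CATALOG):
    """Step 3: titles to recommend for a given concept."""
    return sorted(title for title, tags in catalog.items() if concept in tags)

print(match_titles("fan"))
print(suggest_concepts("fan"))
print(titles_for_concept("fantasy"))
```

Here the query "fan" matches the title Fantasia directly, suggests the concept "fantasy," and that concept in turn surfaces a title whose name does not contain the query at all.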

According to Todd Yellin, Netflix’s former vice president of innovation, now vice president of product, Netflix’s recommendation engine relies on three distinct categories of information:47

  1. Information on Netflix Members: As previously mentioned, Netflix collects granular implicit behavioral data on its users including their viewing history, when they are watching, and on what devices.
  2. Information on Content and Titles: Netflix employs numerous in-house and freelance individuals to watch each Netflix title and assign tags to them. These tags are granular in nature and can include information such as whether a title has an ensemble cast, is set in space, and stars a strong female lead. The company deploys the same tags globally and many of these tags are visible to users when they are navigating through various titles on the platform. However, a smaller subset of tags are used across different regions, languages, and cultures in order to enhance localization (e.g. the English language tag “gritty drama” may not translate well into other languages, and as a result a more localized tag will be used).48
  3. Information Produced by Machine Learning Algorithms: Netflix’s machine learning algorithms combine the information the platform has collected on its users and the information produced during the tagging process to identify important elements and patterns and assign them relevant weights. Based on this data, the algorithms generate thousands of “taste communities.” Taste communities are categories of users who are interested in similar titles.49 According to Netflix, examples of these communities include users who watched House of Cards and also It’s Always Sunny in Philadelphia, or Making a Murderer and the John Mulaney: The Comeback Kid comedy special.50 Each user will fit into numerous taste communities, and the taste communities a user is assigned to affect the recommendations they ultimately receive on the platform.
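
As a rough illustration of the taste community idea, the Python sketch below groups users by pairs of titles they have both watched, so that one user can belong to several communities at once. This co-watch grouping is a deliberately simple stand-in for the machine learning models described above, which are not public. The user IDs, the viewing histories, and the `taste_communities` function are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def taste_communities(watch_history, min_members=2):
    """Map each co-watched title pair to the set of users who watched both."""
    communities = defaultdict(set)
    for user, titles in watch_history.items():
        # Sort so that a pair has one canonical order regardless of viewing order.
        for pair in combinations(sorted(set(titles)), 2):
            communities[pair].add(user)
    return {
        pair: members
        for pair, members in communities.items()
        if len(members) >= min_members
    }

# Hypothetical viewing histories keyed by made-up user IDs.
watch_history = {
    "u1": ["House of Cards", "It's Always Sunny in Philadelphia", "Making a Murderer"],
    "u2": ["House of Cards", "It's Always Sunny in Philadelphia"],
    "u3": ["Making a Murderer", "House of Cards"],
}

communities = taste_communities(watch_history)
for pair, members in sorted(communities.items()):
    print(pair, sorted(members))
```

In this toy data, user u1 lands in two communities, mirroring the point that each user fits into numerous taste communities.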

As outlined, Netflix’s recommendation engine involves a complex interplay of various algorithms and signals. Although the company does not sell ads, the ability to retain user attention on the platform still significantly influences the company’s bottom line as it means users will consume more titles on the platform, and not be consuming content on other services. According to 2016 estimates, the combination of such automated personalization and recommendation tools saved the company over $1 billion per year, as they enabled the platform to reduce the monthly number of customers that stopped using the service (known as customer churn) by a few percentage points. This has become increasingly important as competing streaming platforms have emerged in the United States and around the world.51

Controversies Related to Netflix’s Recommendation System

Netflix’s recommendation systems, however, have surfaced some concerns regarding biased and discriminatory outcomes. For example, in 2017, the company introduced a new algorithm to personalize the artwork or thumbnail images a user sees for each title. The company introduced this algorithm after its own research found that these visual elements were the biggest influencing factor when a viewer was deciding which title to watch, comprising 82 percent of their focus. The artwork that a user sees changes based on the user’s tastes and viewing history. After the algorithm was introduced, however, numerous African-American users in the United States found that the thumbnail images they were seeing were racially and ethnically driven, and were often misrepresentative of the actual cast of the movie. For example, one user saw a thumbnail for the romantic comedy film Love Actually featuring black British actor Chiwetel Ejiofor, who plays a minor role in the film, and Keira Knightley. Other users saw a thumbnail featuring the main stars of the film: Hugh Grant, Emma Thompson, and Colin Firth.52 Users reported similar outcomes for titles such as the murder mystery series The Good Cop.53 These results caused outrage among many Netflix users, who were concerned that Netflix was tailoring its recommendations to them based on their race and ethnicity. Many also expressed concern that in altering the artwork to include black characters, the platform was falsely suggesting plot lines and actor prominence in titles in order to attain user engagement and attention.54 The company responded by stating “we don’t ask members for their race, gender or ethnicity so we cannot use this information to personalize their individual Netflix experience. The only information we use is a member’s viewing history.”55 Nonetheless, these incidents provide a further example of how algorithmic tools can make inferences about a user’s demographic characteristics using patterns identified in data points such as viewing history. This can result in biased and discriminatory outcomes. In this case, the outcomes did not have significant, harmful offline effects. However, they do demonstrate a need for the company to provide greater transparency and accountability around how these algorithms are trained and how they engage in decision-making. This could help researchers identify when Netflix’s recommendation system is making concerning inferences (especially in situations like the above, where the system seemingly made race-based inferences although the company does not collect racial data). Given that the company’s service is used widely around the world by users who may be part of marginalized communities, it is important that the company invests in ensuring these individuals are not being misrepresented on the platform.

In addition, although Netflix states it does not collect information such as race, gender, and ethnicity, information on users’ viewing patterns is itself a highly sensitive category of data. This has been demonstrated in debates over U.S. surveillance law. Section 215 of the USA PATRIOT Act, which amended the Foreign Intelligence Surveillance Act and has more recently received attention as the surveillance law under which the government has collected phone records, was previously known as the “library records provision.” Section 215 authorizes the government to collect business records, including financial records and library records. The provision thus can require librarians to provide information on customers’ reading and computer records. Numerous librarians have opposed the provision,56 explaining that information on viewing patterns is highly sensitive, and that requiring librarians to hand over this data amounts to a serious privacy violation.57

Given the extent to which recommendation algorithms are used on the Netflix platform, Netflix should provide greater transparency around these tools’ impact on users’ Netflix experiences. Netflix has made some positive strides in this regard.

Many of the company’s executives have publicly spoken and written about the company’s recommendation engine in varying levels of technicality.58 In addition, in its online help center, the platform published a page that provides a high-level overview of how the company’s recommendation system works. This page outlines the different factors the system considers when producing recommendations; how the initial jump-start recommendation process works when a user creates a new profile; how recommendations are defined by row, rank, and title; and how the company improves its recommendation systems. Although this page does not provide a granular and technical overview of the company’s recommendation systems, it does present this information in a user-friendly and digestible manner. This is a positive first step toward building user trust and agency, and toward providing transparency and accountability around how the company uses automated decision-making in its recommendations process.

User Controls Related to Netflix’s Recommendation System

Netflix presents an interesting case study in that the company discloses information on how its recommendation system works, thus offering users a sense of agency through a limited set of controls that aim to help promote awareness of its algorithms. For example, in the My Profile section, each user has access to pages that outline all of the titles a user has previously rated and all of the titles they have watched and that are stored in their watch history. Through these pages, users can change or remove their rating for a title and remove a title from their watch history. In addition, users can also access their recent device streaming activity, which outlines which devices the user profile has recently watched Netflix on, where, and on what dates and times. However, Netflix does not offer users the ability to view the various characteristics and signals its recommendation system uses to make video recommendations or view explanations of why a particular title was recommended to them. Users also do not have the option to control or opt out of having certain factors considered by Netflix’s recommendation system. In addition, Netflix does not offer users the ability to opt out of receiving title suggestions from the recommendation system altogether.

This is an interesting comparison with companies such as YouTube and Amazon, which offer less transparency and accountability around how their recommendation systems work, but offer users a greater set of controls over their personal platform experiences.

Citations
  1. "About Netflix," Netflix Media Center, source
  2. Ingrid Lunden, "Netflix Sharpens Focus on DVDs With DVD.com, But Don't Cry Qwikster. (It's Staying)," TechCrunch, March 30, 2012, source
  3. Dan Moskowitz, "Who Are Netflix's Main Competitors?," Investopedia, last modified October 28, 2019, source
  4. Mike Snider, "Netflix Adds 8.8 Million New Subscribers, 'The Witcher' Tracks As Most-Viewed New Series," USA Today, January 21, 2020, source
  5. "Netflix.com Competitive Analysis, Marketing Mix and Traffic," Alexa Internet, source
  6. Nielsen, Nielsen Total Audience Report, February 2020, source
  7. Over-the-top streaming services are streaming services that are available to users directly via the internet, rather than through outlets such as cable. Examples of other such streaming services are YouTube and Hulu.
  8. Carlos A. Gomez-Uribe and Neil Hunt, "The Netflix Recommender System: Algorithms, Business Value, and Innovation," ACM Transactions on Management Information Systems 6, no. 4 (December 2015): source
  9. Paul Sawers, "Remember Netflix's $1m Algorithm Contest? Well, Here's Why It Didn't Use The Winning Entry.," The Next Web, April 13, 2012, source; Gomez-Uribe and Hunt, "The Netflix".
  10. Sawers, "Remember Netflix's".
  11. Titiriga, "Social Transparency".
  12. Titiriga, "Social Transparency".
  13. Andreas Toscher, Michael Jahrer, and Robert M. Bell, The BigChaos Solution to the Netflix Grand Prize, September 5, 2009, source
  14. Sawers, "Remember Netflix's".
  15. Gomez-Uribe and Hunt, "The Netflix".
  16. Roettgers, "Netflix Spends".
  17. Janko Roettgers, "Netflix Spends $150 Million On Content Recommendations Every Year," Gigaom, October 9, 2014, source
  18. Gomez-Uribe and Hunt, "The Netflix".
  19. Sawers, "Remember Netflix's".
  20. Gomez-Uribe and Hunt, "The Netflix".
  21. Gomez-Uribe and Hunt, "The Netflix".
  22. Sawers, "Remember Netflix's".
  23. "Netflix Ratings & Recommendations," Netflix Help Center, source
  24. "How Netflix's Recommendations System Works," Netflix Help Center, source
  25. Gomez-Uribe and Hunt, "The Netflix".
  26. "Netflix Ratings," Netflix Help Center.
  27. "How Netflix's," Netflix Help Center.
  28. Libby Plummer, "This Is How Netflix's Top-Secret Recommendation System Works," WIRED, August 22, 2017, source
  29. "How Netflix's," Netflix Help Center.
  30. "How Netflix's," Netflix Help Center.
  31. Gomez-Uribe and Hunt, "The Netflix".
  32. Gomez-Uribe and Hunt, "The Netflix".
  33. Gomez-Uribe and Hunt, "The Netflix".
  34. Gomez-Uribe and Hunt, "The Netflix".
  35. Gomez-Uribe and Hunt, "The Netflix".
  36. Gomez-Uribe and Hunt, "The Netflix".
  37. Gomez-Uribe and Hunt, "The Netflix".
  38. Gomez-Uribe and Hunt, "The Netflix".
  39. Ekstrand et al., "All The Cool".
  40. Netflix, "Now – For The First Time – You Can See What's Popular On Netflix," Netflix Media Center, last modified February 24, 2020, source
  41. "How Netflix's," Netflix Help Center.
  42. "Netflix Ratings," Netflix Help Center.
  43. Gomez-Uribe and Hunt, "The Netflix".
  44. Gina Barton, "Why Your Netflix Thumbnails Don't Look Like Mine," Vox, November 21, 2018, source
  45. "How Netflix's," Netflix Help Center.
  46. Gomez-Uribe and Hunt, "The Netflix".
  47. Plummer, "This Is How Netflix's".
  48. Plummer, "This Is How Netflix's".
  49. Plummer, "This Is How Netflix's".
  50. Nicole Nguyen, "Netflix Wants To Change The Way You Chill," BuzzFeed News, December 13, 2018, source
  51. Gomez-Uribe and Hunt, "The Netflix".
  52. Nosheen Iqbal, "Film Fans See Red Over Netflix 'Targeted' Posters For Black Viewers," The Guardian, October 20, 2018, source
  53. Iqbal, "Film Fans".
  54. Iqbal, "Film Fans".
  55. Iqbal, "Film Fans".
  56. Andrea Peterson, "Librarians Won't Stay Quiet About Government Surveillance," The Washington Post, October 3, 2014, source
  57. April Glaser, "Long Before Snowden, Librarians Were Anti-Surveillance Heroes," Slate, June 3, 2015, source
  58. Gomez-Uribe and Hunt, "The Netflix"; Plummer, "This Is How Netflix's".
