Search Ranking
Search engines have emerged as an essential tool as the internet has expanded, supporting key aspects of free expression online through information access and information dissemination. Search engines enable users to more effectively sift through and access endless amounts of information, and they have empowered individuals, businesses, and publishers to disseminate information.
Although the process of conducting a search on a search engine seems straightforward, the ways algorithms curate and rank search results raise a number of concerns. These algorithms underpin the immense power of search engines, which are able to determine what information a user sees, the type of results they can access, and which publishers and pieces of information a user engages with first. In this way, search engines play a significant role in shaping each of their users' perspectives and mindsets.1 Additionally, not all search engines operate in the same manner. Therefore, which search engine a person uses will also influence their viewpoints and opinions.2 As Wired's Brian Barrett wrote, "The internet is a window on the world; a search engine warps and tints it."3
The topic of algorithmic curation and ranking of search results has become especially prominent in the news recently, as conservative politicians in the United States have claimed that search engines, such as Google, and internet platforms, such as Facebook and Twitter, have instituted a liberal bias within their search results and news feed curation practices.4 However, there is little evidence that such bias is actually present.5
Technology companies that operate search engines can utilize artificial intelligence and machine learning in a number of ways.6 These include powering speech-to-text searches and enabling visual searches. This section of the report, however, will focus on how these algorithmic tools are used to curate and rank search results. It will use three case studies—Google, Bing, and DuckDuckGo—to explore how different companies have structured and implemented the practices of curating and ranking their search results, and what challenges they have faced in the process.
Case Study: Google
Google is the world’s largest search engine.7 It was founded in 1998 by Larry Page and Sergey Brin, and today operates under parent company Alphabet Inc. Over the past two decades, Google’s influence in the technology sector, and in society broadly, has grown significantly. As of July 2019, Google had 92.19 percent of the global search engine market share.8 It also currently ranks first for global internet engagement on Alexa rankings.9 Google’s search engine product, known as Search, is currently available in over 150 languages and over 190 countries.10
Brin and Page's original vision of the Google search engine was primarily based on the tenets of citation analysis used by scholars, researchers, and scientists. In such research-oriented fields, the more a work is cited, the more it is generally considered legitimate and high-quality. Brin and Page saw value in this approach; by identifying web pages that other web pages frequently linked to, they would be able to identify similarly legitimate, or at least popular, online sources. From this sprang Google's original search engine ranking algorithm, known as PageRank.11 However, as outlined by Safiya Noble, this approach comes with a range of problems. For example, all citations in a research publication are given the same weight in the final bibliography, despite expected differences in how much the author relied on each work cited, and regardless of whether the author mentioned a work and its contents to validate it, reject it, and so on. Brin and Page anticipated some of these complications, and the ranking algorithm has evolved considerably since its original conception to account for further limitations.12
According to Google, its Search product aims to enhance a user’s search experience by providing them with personalized search results. These search results are customized based on personal user data collected by Google, such as browsing and purchase history, and are generated using a combination of manual and algorithmic tools. This personalized search feature also, however, serves as a major source of advertising revenue for the company.
Over the past few years, Google has come under significant criticism for its personalized Search feature. In particular, many researchers and internet activists have expressed concerns that by delivering search results that users are likely to click on and be interested in based on their prior searches and personal data, Google is creating a “filter bubble.”13
Recently, Google has stated that it is moving away from offering personalized search results.14 However, today users still receive personalized search results on the platform.
Underpinning Google's search engine is a process called crawling. Crawling deploys software, known as web crawlers, to identify publicly available web pages. Web crawlers determine which websites to browse, how often to browse them, and how many pages should be browsed from each website. Typically, web crawlers select web pages to crawl based on previous crawls and sitemaps.15 Once the web crawlers have identified a set of web pages, they visit them and utilize links on these web pages to identify other web pages. During this process, the web crawlers specifically look to identify new websites, changes that have occurred to existing websites, and dead links. These web crawlers then bring data about these web pages back to Google's servers.

When a crawler identifies a web page, Google's system renders the content of the page, as a browser does, and works to identify signals such as keywords and website freshness (the recency of a website's content). These signals are continuously monitored using the Google Search index. The Google Search index contains hundreds of billions of web pages, and it includes an entry for every word seen on every web page that has ever been indexed. When a new web page is indexed, it is added to the entries for all the words it contains.16

When a user enters a query into Search, the Search algorithm identifies keywords in the query and matches them against web pages in the index. It then ranks the resulting search results based on over 200 different signals.17 The search result ranking process is, for the most part, conducted using algorithms. According to Google, the company does not deploy human curation when ranking its search results.18
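The index structure described above, in which each word maps to the set of pages that contain it, is commonly known as an inverted index. The following is a minimal sketch of that idea in Python; it is purely illustrative, and the function names and simplistic whitespace tokenization are assumptions, not a reflection of Google's actual systems.

```python
from collections import defaultdict

# Inverted index: word -> set of page URLs containing that word.
index = defaultdict(set)

def index_page(url, text):
    """Add every word on a page to the index entries for those words."""
    for word in text.lower().split():
        index[word].add(url)

def match(query):
    """Return pages containing all keywords in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = index[words[0]].copy()
    for word in words[1:]:
        results &= index[word]
    return results

index_page("https://example.com/a", "fresh news about search ranking")
index_page("https://example.com/b", "search ranking signals explained")
print(match("search ranking"))  # both pages match the query keywords
```

A real index would also store positions, freshness, and other per-page signals alongside each entry; matching is only the first step before ranking.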
Some of the factors and signals that influence the Search ranking algorithm are:
- The meaning of a user's query: In order to provide a user with a relevant result for their query, Google first needs to identify the intent behind it. To do this, the company has developed language models, including natural language processing models, which are adept at understanding which particular keywords should be looked up in the index. Additionally, Google has developed a synonym system, which enables search results to include results related to synonyms of keywords in the original query—for example, if the query included "blanket," then the search results may also provide links to websites related to duvets. The introduction of this synonym system has enhanced Google search results in over 30 percent of searches across languages. Search algorithms also attempt to decipher what category of information a user is looking for, such as whether it is a specific or broad search, or a language-specific search. If a user searches for a trending keyword or topic, such as the results of the latest UFC fight card, Google's freshness algorithm will interpret this as a signal that recent information may be more useful than older results.19
- Relevance of web pages: After assessing the meaning of a user's query, algorithms begin to evaluate the content of different web pages in order to understand whether a page contains information that is relevant to the initial query. One of the clearest indicators that a web page may be relevant is whether the headings or body of the page contain the same keywords as the search query. In addition to keyword matching, Google asserts that it uses aggregated and anonymized interaction data in order to determine whether search results are relevant to the initial query. This data informs signals that enable Google's machine-learning systems to better determine relevance. These relevance signals help the Search algorithm determine whether a web page contains information that answers the initial search query, or whether it simply repeats the same question posed in the query. For example, if a user's query was for "books," the algorithm would determine whether a web page contains relevant content aside from the keyword "books," such as pictures, videos, lists, reviews, and so on. According to Google, although these search systems are constructed to seek out quantifiable signals in order to determine relevance, they are not structured to assess subjective notions such as the political ideology of a page's content.20
- Quality of content: Google's Search algorithms also aim to prioritize the most reliable and high-quality search results. In order to determine the reliability and quality of content, Google's systems identify signals that can help assess expertise, authoritativeness, and trustworthiness on a given topic. They also search for web pages that many users appear to value for similar search queries. For example, if other reliable and prominent websites link to a web page, that is considered a good indicator that the content of that web page is reliable. This is known as PageRank. PageRank is a mathematical formula that can be used to judge the "value of a page" on the web by assessing the quantity and quality of other pages that link to it (a toy sketch of this calculation appears after this list). PageRank was one of the fundamental components of the original Google Search algorithm, and it was inspired by the system scientists used to gauge the importance of scientific papers: evaluating how many other scientific papers referenced or cited them.21 However, after Google deployed a public PageRank score for each web page, bad actors began working to game this system, which led to a large volume of link spamming.22 As a result, the public PageRank scoring was retired, although PageRank remains a component of the Search algorithm. Additionally, as other factors became more important for the Search algorithm and for ranking, additional signals were incorporated into the algorithm.
Google also uses aggregated feedback from Google’s Search quality evaluation process to further refine its ability to assess information quality. Further, Google uses spam algorithms in order to determine the quality of a page and ensure that low-quality, harmful or manipulative web pages are not ranked highly in search results.23
- Usability of web pages: When ranking search results, Google Search assesses whether web pages are easy to use and ranks those that are deemed more user-friendly higher than those that are not. Some of the signals that inform whether a web page is usable include whether the website is formatted properly for multiple browsers, whether it has been formatted for various devices and sizes (such as desktops, tablets, and smartphones), and whether the web page can be loaded by users with slower internet connections.24 This can make the Google Search product more accessible to those using the service through different devices or in different regions. However, it may also render some otherwise high-quality sites inaccessible to these users if web page owners do not abide by Google’s user-friendly requirements, thus stifling information flows.
- Context and settings: As previously mentioned, Google extracts insights from users’ personal data in order to inform and tailor search results. These data points include location, purchase history (as determined by crawling users’ Gmail inboxes for purchase receipts25), past Search history, and Search settings. According to Google, a user’s Search settings on factors such as preferred language or SafeSearch—a tool that enables users to filter out explicit or sensitive content—also enable Google to understand which results are likely to be the most useful for them. Google has also stated that Search results may be personalized based on a user’s Google account activity. For example, if a user searches for “events near me,” the search engine may personalize recommendations based on events it thinks the user may be interested in. According to the company, these inferences are made so that search results can match a user’s interests. The company asserts that they are not, however, designed to infer a user’s race, religion, political affiliations, or other sensitive characteristics.26
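To make the PageRank idea referenced above concrete, here is a toy power-iteration sketch in Python. It follows the publicly documented PageRank formula, with conventional textbook choices for the damping factor and iteration count; it is an illustration of the underlying mathematics, not Google's production ranking code.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:             # each outlink receives an equal share of rank
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print(pagerank(graph))  # "c" ranks highest: it has the most inbound links
```

The key property is circular: a page's value depends on the value of the pages linking to it, which is why highly ranked "citations" count for more than obscure ones.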
The weight applied to each factor depends on the nature of the query. For example, the freshness algorithm may play a more prominent role for queries related to current events.27 In addition, Google has detailed specifications for how pages that may impact the "future happiness, health, financial stability, or safety of users" are weighted.28 These web pages are known as "Your Money or Your Life" pages (YMYL) and they include websites that let users make purchases or pay bills, offer financial, medical, or legal information, or produce news. Google has especially high page-quality rating standards for these pages, as low-quality content on such pages could have a significant negative impact on users.29 As a result, when Search algorithms detect that a query is related to a YMYL subject, they place more weight in the ranking system on factors such as authoritativeness, expertise, and trustworthiness.30
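A rough sketch of such query-dependent weighting might look like the following. The signal names and weight values here are invented for illustration; they are assumptions, not Google's actual parameters.

```python
# Hypothetical signal weights; YMYL queries shift weight toward quality.
BASE_WEIGHTS = {"relevance": 0.4, "quality": 0.2, "freshness": 0.2, "usability": 0.2}
YMYL_WEIGHTS = {"relevance": 0.3, "quality": 0.5, "freshness": 0.1, "usability": 0.1}

def score(page_signals, is_ymyl):
    """Combine per-page signal scores (0..1) using query-dependent weights."""
    weights = YMYL_WEIGHTS if is_ymyl else BASE_WEIGHTS
    return sum(weights[s] * page_signals[s] for s in weights)

page = {"relevance": 0.8, "quality": 0.9, "freshness": 0.3, "usability": 0.7}
print(score(page, is_ymyl=True))   # quality dominates for YMYL queries
print(score(page, is_ymyl=False))  # relevance dominates otherwise
```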
As the largest search engine in the world, Google is responsible for curating and delivering a large amount of information to users. It therefore assumes some responsibility for the impact of its algorithms in shaping the worldviews of users. The platform, however, has come under significant criticism for not providing enough transparency and accountability around its search curation and ranking practices, particularly around how it personalizes search results.
According to many search engines, a personalized search experience can help users filter through the vast amount of information available on the internet and access results that are relevant and useful. Personalized search experiences have also helped platforms like Google achieve significant growth and boost revenue. Personalized search results were a key advantage for Google when they were first released, and they became an integral feature of Google Search. In 2009, the company even made personalized search the default option for all users who were logged in to their Google accounts. Also in 2009, the company stated it would deploy an anonymous cookie in order to provide personalized search results to users who were logged out or did not have a Google account. This cookie operated separately from a user's Google Account and web history, which were only available to users who signed in.31 In a blog post, the company also stated that it used contextual signals, such as those that aim to reduce the ambiguity of queries, in order to rank search engine results, even for users who were logged out. In practice, this meant that Google would use information from a user's recent searches to clarify their current search. The platform said that this would not result in significantly different search results.32 However, its deployment did raise a number of privacy concerns regarding how Google was collecting and using user information across its different Search modes.
Over the past few years, the platform has asserted that it has moved away from personalizing search results. In September 2018, for example, CNBC reported meeting with Google's algorithm team and learning that because each search query requires a significant amount of context, the opportunities for personalization are quite limited.33 Additionally, the company has suggested that personalization does not significantly improve the quality of the Search experience, and as a result it has moved away from the practice.34 However, it is unclear whether these shifts are due to increased scrutiny of personalized search results or whether the company is purposefully changing the way its Search algorithm works.
Many critics believe Google continues to curate and personalize search results to a large extent.35 One of Search’s biggest critics is privacy-focused search engine DuckDuckGo. In June 2018, DuckDuckGo conducted a study in which 87 volunteers in the United States conducted searches on Google’s private browsing mode (known as “Incognito”) while logged out and on the regular Search platform. The volunteers searched for three politically charged topics: “gun control,” “immigration,” and “vaccinations.”36 The study revealed that most participants received unique results, some saw search results that others did not,37 and some received fewer domain results than others.38 Additionally, the ranking of these search results also varied.39 The researchers found more than double the variation in the search results when comparing Incognito searches to regular searches.40 There was also significant variation between news and video results.41 This suggested that even if users were conducting searches in Incognito mode (which aims to provide users with a private browsing option that does not save a user’s search history), and while logged out, they still received personalized search results. This is likely because websites can use IP addresses and browser fingerprinting in order to identify users even when they are searching in these modes.42 According to the researchers, had the results been truly non-personalized, then all users would have received the same search results.43
DuckDuckGo’s researchers worked to control factors that could have influenced these results, such as location, time, and being logged into Google, by having the volunteers conduct searches while logged out, and by having them conduct the searches at the same time and on the same day. In addition, they controlled for potential variances in location, which might have resulted in, for example, local news stories appearing in the search results, by reviewing all links by hand and comparing them to the city and state of the volunteer who viewed them.44
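To illustrate how such variation between users can be quantified, the sketch below compares two users' result lists using domain overlap and rank agreement. These particular metrics are assumptions chosen for illustration, not the study's published methodology, and the domain lists are invented examples.

```python
def jaccard(a, b):
    """Overlap between two result lists, ignoring order (1.0 = identical sets)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def rank_agreement(a, b):
    """Fraction of shared domains that appear at the same position."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    same_pos = sum(1 for d in shared if a.index(d) == b.index(d))
    return same_pos / len(shared)

user1 = ["cdc.gov", "who.int", "nytimes.com"]
user2 = ["who.int", "cdc.gov", "example-blog.net"]
print(jaccard(user1, user2))         # 0.5: half the domains are shared
print(rank_agreement(user1, user2))  # 0.0: shared domains are ordered differently
```

Under a truly non-personalized engine, both metrics would equal 1.0 for simultaneous identical queries, which is the baseline the researchers argue should hold.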
Google responded to the research by calling it flawed,45 stating that DuckDuckGo's attempts to control for time and location differences had been ineffective. They also claimed that the researchers assumed that any difference in the search results automatically suggested personalization.46 Further, they highlighted that search results related to news and current events were likely to continuously change depending on daily occurrences. They also stated that personalization was performed on a small portion of overall queries, primarily related to location47 and to utilizing previous searches in order to decipher context for a current search.48 Google has also suggested that any variation in search results is likely due to factors such as a user's location, the language the search is performed in, and the distribution of Search index updates throughout Google's data centers.49 Despite these rebuttals, however, there are still significant concerns around Google's efforts to personalize search results, and the lack of transparency and accountability around these practices.
In particular, internet activists, such as Eli Pariser, have asserted that personalized search results can create and maintain filter bubbles and promote certain biases or world views. In his book, The Filter Bubble: What the Internet is Hiding From You, Pariser argues that by providing users with content a system predicts they will like, internet platforms are filtering out and therefore preventing users from accessing information that may challenge their perspectives or broaden their horizons. This places these users in a “filter bubble” and amplifies their confirmation bias. It can also result in users becoming ill-informed, developing perceptions that are skewed towards one perspective, and even developing a distaste for ideas that are unfamiliar or contrary to their own. This is particularly concerning in the context of political discourse and rising political polarization around the globe.50 This has become an increasingly prominent topic since the 2016 U.S. presidential election, in which numerous internet platforms, including Google, were accused of creating filter bubbles or echo chambers through their algorithmic content curation practices.
However, some researchers push back against the notion that an online filter bubble has enhanced polarization. For example, economists from Brown University and Stanford University studied the relationship between polarization and the use of online media in American adults between 1996 and 2012. They found that polarization has largely been driven by those Americans who spend the least amount of time online, such as those over the age of 75. According to the study, those belonging to younger demographics who use the internet more frequently demonstrated little more polarization in 2012 than they had in 1996, when online platforms were far less prevalent and influential.51
Additionally, some conservative lawmakers in the United States allege that Google has demonstrated bias in how it curates and ranks its search engine results. According to this critique, this bias prioritizes and preferences liberal information sources, in particular ones that are critical of conservatives. In December 2018, Google CEO Sundar Pichai testified before the U.S. House Judiciary Committee on the topic of alleged bias against conservatives on the platform, denying that such bias exists.52 This conversation has been particularly prominent in the political sphere as internet platforms have ramped up their efforts to remove misinformation and disinformation on their services. In the process, a number of politically charged web pages and pieces of content have also been impacted, contributing to claims of conservative bias.53
However, as previously discussed, there is little evidence to support the notion that such political bias exists in Google's search results. Google has stated that Search is designed to decipher the usefulness and relevance of a web page, not to promote the political and ideological viewpoints of the individuals who built or audited the system.54 To debunk some of these claims, the company has also stated that it does not utilize human curation when ranking search results, and instead relies exclusively on algorithms. According to Danny Sullivan, Google's public liaison for Search, Google does not manually intervene on specific search results when addressing issues with ranking. This is because tweaking one search result or addressing one query, when the search engine receives trillions per day, does not have a strong impact on the overall Search experience.55 Additionally, the platform has asserted that its systems are not designed to make subjective determinations about the truthfulness of web pages. Rather, it uses a range of measurable signals—such as the PageRank signal, which is used to determine authoritativeness56—to assess how users and other web pages perceive the expertise, trustworthiness, and authority of a web page and its content.57 Google's ranking algorithms then promote these web pages, particularly during searches where the original query could surface misleading information.58
However, just because Google does not deploy human curation during the search ranking process does not mean that no bias is present. Algorithms are not neutral and bias-free. The signals that an algorithm uses are designed to prioritize certain information or qualities over others in order to curate and rank content. This is a key part of how search engines and news feeds work today, and is often seen as integral to their operations. In this sense, the term "bias" does not solely refer to inappropriate preferences based on protected categories like race or political affiliation. Although an algorithm may contain biases based on how it sorts and ranks content, it is difficult to know exactly what these biases are. This is particularly true with black-box machine learning systems. It is therefore difficult to draw reliable conclusions, such as whether algorithms are biased against a certain political party. Additionally, these algorithms incorporate the judgments, preferences, and priorities of the engineers who developed them, particularly around what information users are likely to find interesting and meaningful. Furthermore, as outlined by Dr. Safiya Noble, an associate professor at the University of California, Los Angeles, search results can also reinforce and perpetuate societal biases. In her book, Algorithms of Oppression, Noble outlines how in the early 2010s Google's search engine results related to women, in particular women of color, were overly sexualized and stereotyped. Noble also highlights a case in 2016 in which Google Images search results for "three black teenagers" delivered mugshots of African-American teenagers, whereas similar searches for "three white teenagers" delivered "wholesome and all-American" results. In this way, search engine algorithms can reinforce existing societal stereotypes, often disproportionately impacting already marginalized communities.59
Although Google has shared some information about the signals that contribute to its search algorithm, it has not provided a comprehensive overview of all the signals, how they interact with one another, and what impact they have on online expression as a whole. One reason many platforms fail to provide comprehensive transparency around these signals is that the signals make up a platform's algorithmic "secret sauce," which companies want to keep confidential in order to maintain a competitive edge. However, even if it is valid to maintain confidentiality for certain operational details, it is important to promote transparency and accountability to the greatest extent feasible, and companies' claims about trade secrets should not outweigh the public interest.
Additionally, many website owners have expressed frustration over the fact that Google does not always announce when it is updating or making changes to its search algorithm. As a result, after a change, some content creators find that their web pages are no longer ranking as well. The ranking of websites in search results is of vital importance to website publishers, as it influences the success of their websites. Typically, a user only clicks on the top few search results. The remaining search results on the page receive far lower clickthrough rates. As a result, being able to rank high in search results, and understand how various ranking signals impact one’s website and associated information flows, commerce outlets, and so on, is important.60
Part of the reason Google may not make frequent announcements is that it regularly makes changes to its Search algorithms. In 2017 alone, Google performed over 200,000 experiments and subsequently instituted 2,400 changes to Search.61 In July 2019, the company announced that over the past year it had made 3,200 changes to Search systems. These included updates for specific features or elements as well as broad core updates.62 According to Google, when making changes to its Search systems, the platform identifies areas of improvement, develops a solution, and then tests that solution. After determining that the solution is feasible and improves the Search experience, it then implements the change. These algorithmic changes apply to a broad range of similar searches.63 Thus far, Google has only provided public updates around some of the broad core changes it has instituted. The company has stated that it aims to provide site owners with prior notice of "significant, actionable changes to our Search algorithms,"64 but in many instances this has proven not to be enough.
Greater transparency would enable publishers of web pages to better understand how their content is curated and ranked, and help them ensure they can adequately distribute their content. Currently, a number of search engine optimization (SEO) organizations and communities have sprung up to speculate on algorithmic changes. However, without clear direction and information, the ability of these communities to understand how they can effectively exercise their free speech online is limited, and the ability of users to access content is therefore also limited. In addition, greater transparency would help Google disprove or debunk growing claims of political bias.
Although Google’s search curation and ranking process is conducted primarily using algorithms, humans still play a role in this process. According to Google, the platform does not remove or delist search results. Rather, it seeks to promote higher quality content in the rankings over lower quality search results. The company has asserted that in a few rare exceptions they intervene manually in order to remove or delist content from search results. These cases include when the platform receives legal requests to remove or delist search results, when a web page violates Google’s webmaster guidelines, and when a webmaster of a page requests that their web page be removed or delisted.65 Although Google states that they do not frequently engage in these practices, government and legal efforts to remove content have increased globally.66
Google provides transparency and accountability around its search result curation practices through its transparency report, which is published twice a year. In this report, Google reports on government requests to remove content across all its products, content delistings due to copyright, and requests to delist content under European privacy law (known as the "right to be forgotten"). Its data on government requests to remove content can be broken down by product, enabling a greater understanding of how such requests impact the Search product in particular. The report includes metrics such as removal requests by the numbers, items specified by the requests, and a percentage breakdown of which products were affected. It also includes reasons for requests.67 This is a best practice that other platforms should adopt, as it provides transparency around how external parties are influencing search results, and how Google is responding. Google has also asserted that, where possible, it aims to inform website owners about requests for removal through its Webmaster Console. This is a vital component of accountability as well.68
Additionally, some research indicates that Google has intervened to manually alter search results when the results sparked controversy. For example, in December 2016, British investigative journalist Carole Cadwalladr wrote about how the top search result for "did the Holocaust actually happen?" in Google Search was a white nationalist web page that denied the Holocaust had ever happened. Cadwalladr's finding sparked outrage, and eventually the results changed. Although Google has stated that it prefers "to take a scalable algorithmic approach to fix problems" rather than "fix the results of an individual query by hand," many suspect that the company took corrective action nonetheless. However, others have suggested that the online outrage sparked by Cadwalladr's article also contributed to this change, as it drove traffic to certain web pages on the topic, thus impacting the search results and their ranking.69
Another way that humans play a role in Google's search curation and ranking process is through the Search Quality Rating process. In order to ensure its search algorithms promote relevant and high-quality content, Google developed a rigorous testing process that involves live tests and review by thousands of trained external search quality raters around the world. This process is deployed every time Google considers implementing a change or update to its search algorithm. In order to roll out a change, Google must be able to determine that the change provides a net positive, meaning that a significant number of search results will be made more helpful without creating major losses in other areas. Making such changes to organic search results can take a large amount of time.70
Search quality raters are external individuals based around the globe who help evaluate whether a website provides users with the content they were looking for. They also help evaluate the quality of search results based on the expertise, authoritativeness, and trustworthiness of the content. Each search quality rater represents a specific language and geographic expertise or perspective, in order to ensure Google's Search product is useful around the world. Although these individuals' ratings do not directly impact the ranking of any web page, they enable Google to benchmark the quality of its results and identify areas for improvement.71 This informs Google's search algorithms, which aim to prioritize high-quality content and web pages.72
In order to ensure that search quality raters use a consistent approach, Google provides them with Search Quality Rater Guidelines, which outline Google's goals for its ranking systems and include examples of appropriate ratings. According to Google, in order to ensure consistency in the rating program globally, all search quality raters are required to pass a comprehensive exam and are continuously audited. Evaluators also assess each improvement to Search that is rolled out through side-by-side experiments, in which they see two different sets of search results, one with the change and one without, and must identify which experience is of greater relevance and quality. This feedback is used to improve Search and to inform launch decisions.73 These search quality rater guidelines are publicly available, therefore providing a degree of transparency around these curation and rating efforts. However, less is known about who these raters are and what perspectives, regions, and cultures they represent. Knowing this would enable researchers and users to get a better sense of how search results are being curated and rated.
Given that Google's search engine is a vital way for website publishers to disseminate their information and gain traction, many publishers have invested significant time and resources into ensuring that they rank well in Google's search results, a practice known as SEO. Google's publicly available webmaster guidelines outline how publishers can ensure that Google finds, indexes, and ranks their website. They also provide guidance on topics such as quality, as well as rules around prohibited or illicit activity such as spam, malware, and deceptive websites.74
When a website violates Google's webmaster guidelines, it can be penalized in one of three ways: Google can neutralize the impact of the spam, demote the website in search rankings, or remove the website from search results completely.75 According to Danny Sullivan, Google's algorithms can detect the majority of spam and automatically prevent the ranking system from promoting such content by demoting or removing it.76 The remainder of spam results are typically addressed manually by a spam removal team. They review the pages in question, typically based on user feedback, and flag them for penalty if they are found to violate the webmaster guidelines.77 Manual actions can be used to penalize an entire website, a subdomain, sections of a website, or specific pages. Manual action can also demote websites in search rankings or delist them.78 If a web page owner feels that they have been incorrectly or unfairly penalized, they can submit a reconsideration request.79 However, processing and responding to these requests often takes a significant amount of time, which can undermine the operations and success of a website for an extensive period. In order to provide greater transparency and accountability around this process, Google should resolve appeals in a more timely manner.
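Schematically, the three penalty tiers described above might be modeled as follows. This is a hypothetical sketch for illustration only; the severity labels and page fields are invented and do not describe Google's actual spam systems.

```python
from enum import Enum

class Action(Enum):
    NEUTRALIZE = "neutralize spam signals"
    DEMOTE = "demote in rankings"
    DELIST = "remove from results"

def apply_penalty(page, severity):
    """Apply one of three penalty tiers to a page record."""
    if severity == "low":
        page["spam_links_counted"] = False   # ignore manipulative links
        return Action.NEUTRALIZE
    if severity == "medium":
        page["rank_multiplier"] = 0.1        # push the page down the rankings
        return Action.DEMOTE
    page["indexed"] = False                  # drop the page from results entirely
    return Action.DELIST

page = {"url": "https://example.com", "indexed": True, "rank_multiplier": 1.0}
print(apply_penalty(page, "medium"), page)
```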
Finally, in order to provide greater transparency and accountability around its search curation and ranking practices, Google needs to provide its users with greater controls around how their data is used and how their search experience is tailored.
In 2009, Google made personalized search the default option for all users, including users who are not logged into a Google account.80 Users who were signed in, however, could access a tab which outlined how Google had customized their search results, and how they could turn the customization off.81 Today, a similar tab exists that explains to users how activity data, location data, and data from other Google products and services make Search work. The tab also lets users delete their search activity; choose whether their activity is saved on Google sites, apps, and services; and choose whether they would like personalized advertisements. Logged-in users therefore do have a range of controls over algorithmic curation that enable them, to a certain extent, to disable the personalization of search results based on their account activity. However, as the DuckDuckGo study indicated, search results are often still personalized, even if a user is not logged in or is using Incognito mode.82 All users, regardless of whether they are logged in to a Google account or browsing in Incognito mode, need to be able to access controls that enable them to opt out of algorithmic content curation and ranking during the search experience. They also need strong privacy controls over how their data is tracked, collected, and used. Additionally, similar controls need to be afforded to users who use Search but do not have a Google account, as they do not have access to the suite of settings that Search users with accounts do.
Google also offers website owners a series of controls over how their websites appear in Search results. It does this through the Webmaster Tools feature, which lets website owners provide granular instructions on how Google crawls and processes pages on their website. Website owners can also request a recrawl or opt out of crawling altogether.83 This enables website owners to control, to an extent, how their content is processed. Ultimately, however, the algorithm determines how well a website ranks and performs for each user.
Case Study: Bing
Bing is an internet search engine that is owned and operated by Microsoft. It launched in June 200984 and offers a variety of search services, including web, video, image, and map search.85 When Bing launched, Microsoft sought to position it as more than a simple search service. Rather, the company claimed to provide a product that enabled consumers to rapidly acquire more relevant and informed insights from the web, and to use these insights effectively. Its marketing described this ideal search engine as a "Decision Engine."86 Initially, Bing focused on four verticals: making a purchase decision, planning a trip, researching a health condition, and finding a local business.87 As with Google, by providing users with more relevant and informed search results, Microsoft also hoped to boost its growth and revenue.
Although Bing accounts for a small percentage of Microsoft’s overall revenue,88 the search engine has grown to be the second largest in the world in terms of market share,89 boasting 1.3 billion unique monthly global visitors90 and ranking 31st for global internet engagement on Alexa rankings.91 Although Google still dominates the search engine industry in terms of market share,92 Bing is often considered the most comparable alternative available to users.93
Bing has its own web crawler, known as Bingbot, which uses an algorithm to determine which websites to crawl, how often to crawl, and how many web pages to gather from each website. The algorithm chooses web pages to crawl by prioritizing relevant known URLs that are not indexed yet, and URLs that are already indexed but that need to be revalidated to check for changes or dead links. Bingbot also seeks to identify new web pages that have not been crawled or indexed yet.94
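A simplified crawl-frontier sketch of this prioritization (never-indexed URLs first, then the stalest indexed URLs) appears below. The priority scheme is an assumption for illustration, not Bingbot's actual logic.

```python
import heapq
import time

def crawl_priority(state, now):
    """Smaller tuple = crawl sooner."""
    if not state["indexed"]:
        return (0, 0)                       # new, never-crawled URLs jump the queue
    staleness = now - state["last_crawled"]
    return (1, -staleness)                  # then indexed URLs, stalest first

now = time.time()
urls = {
    "https://example.com/new":   {"indexed": False, "last_crawled": None},
    "https://example.com/stale": {"indexed": True,  "last_crawled": now - 86400},
    "https://example.com/fresh": {"indexed": True,  "last_crawled": now - 60},
}
frontier = [(crawl_priority(state, now), url) for url, state in urls.items()]
heapq.heapify(frontier)
while frontier:
    _, url = heapq.heappop(frontier)
    print("crawl:", url)  # order: new, then stale, then fresh
```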
Like Google, Bing uses a combination of algorithmic signals and human editors when ranking search engine results.95 However, Microsoft has not recently made any disclosures about which signals it uses to rank search engine results. The latest major disclosure it made was in 2014, via a blog post on the role of content quality in Bing search results. The blog post outlined that the relevance of a result is a significant consideration for the Bing ranking algorithm, and that relevance is a function of three things: topical relevance to a user's query (does the result sufficiently address the query?), content quality, and context (is the query related to a recent topic? where is the user located?).96 Content quality is based on three primary pillars: authority (can the content be trusted?), utility (is the content useful and detailed?), and presentation (is the content well-formatted, accessible, and easy to find?). Authority is determined based on a range of factors including signals from social networking platforms, cited sources, name recognition, and information about the author. In order to assess the utility of a website, Bing's models aim to predict whether the page's content provides adequate supporting information, whether it is detailed enough for the intended user, and whether it includes supporting content such as videos, graphs, etc. The models also consider the level of expertise required to produce the content on the web page, with a preference for content that is unique and does not reproduce existing materials.97
Microsoft has also stated that both these signals and human editors account for live search activity and real-time news events when ranking news results.98 In addition, the search engine uses metrics related to how many clicks a web page receives only when evaluating how a search ranking algorithm is working; click metrics are not a primary signal considered when ranking search engine results.99
According to Microsoft, the Bing search engine was constructed to identify which results most satisfied users. Based on these insights, Microsoft develops guidelines and training datasets. These training datasets are evaluated by search quality raters (known as “judges” for the Bing search engine) who operate in Bing’s Human Relevance System project.100 These judges work to identify which search results are the most satisfying according to factors such as relevance and accuracy. They also use click metrics to evaluate whether users are satisfied with the search results they received.101 In addition, Microsoft pushes out models to subsets of users in order to observe which results most satisfy real users. It follows a similar process when implementing updates to its ranking algorithm.102
The Bing search engine also uses machine learning to scale its ranking process. Manual ranking of search results by Bing judges103 against a set of guidelines is typically the most accurate approach; however, because it is conducted manually, it cannot be done at scale. The use of machine learning therefore aims to provide users with search results comparable to those that judges would deliver, but at scale. This can only be achieved by generalizing the ranking algorithm as much as possible.104 Today, approximately 90 percent of Bing search results are ranked based on machine learning.105
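One common way to generalize from judge-labeled examples is pairwise learning to rank, sketched below: a scoring function is trained so that, for each judged pair, the preferred result scores higher, and the learned function can then rank unseen results at scale. This is an illustration of the general technique, not a description of Bing's actual models, and the feature names are invented assumptions.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train(pairs, n_features, epochs=100, lr=0.1):
    """pairs: list of (better_doc_features, worse_doc_features) from judges."""
    w = [0.0] * n_features
    for _ in range(epochs):
        random.shuffle(pairs)
        for better, worse in pairs:
            # Perceptron-style update whenever the model ranks a pair wrongly.
            if dot(w, better) <= dot(w, worse):
                for i in range(n_features):
                    w[i] += lr * (better[i] - worse[i])
    return w

# Hypothetical features: [keyword_match, authority, freshness].
judged_pairs = [
    ([0.9, 0.8, 0.5], [0.4, 0.2, 0.9]),  # judge preferred the first document
    ([0.7, 0.9, 0.1], [0.6, 0.1, 0.8]),
]
w = train(judged_pairs, n_features=3)
score = lambda doc: dot(w, doc)
print(sorted(judged_pairs[0], key=score, reverse=True))  # preferred doc first
```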
In addition, like other search engines' ranking algorithms, Bing's search ranking algorithms are continuously updated in an attempt to provide users with a better experience.106 These updates aim to refine and improve search results and remove spam from search indexes in order to protect users from negative or manipulative search results. The search engine uses algorithms to demote and penalize websites that violate Microsoft's guidelines.107 To inform these updates, Microsoft observes how users react to search results, how the search engine assesses those results, and what the actual ranking algorithm returns. This process can never be perfect, and the search engine algorithm therefore has to be continuously updated.108
Bing is the second largest search engine in terms of market share and it is therefore responsible for curating a significant amount of content for a large user base. Despite this, it demonstrates a lack of transparency around its search result curation and ranking process.
First, Microsoft does not publicly disclose substantial information around the signals that define how search engine results are curated and ranked. This makes it difficult for users and website owners to know how the search experience is developed and personalized, and which qualities, results, and voices are prioritized over others. In addition, Microsoft does not regularly share explanations of why its search algorithm is being updated. Given that these changes impact how and whether publishers' content is viewed by users, and given that these changes alter the scope of content users engage with, this is concerning.109 Microsoft does offer website owners some information on how to work with the Bing search engine via its publicly available Webmaster Guidelines, which outline how providers can ensure their content is found and indexed within Bing.110 However, this resource does not offer live and up-to-date information on recent changes.111
Furthermore, although Microsoft deploys judges to help improve its search experience, it does not publicly share the guidelines these judges use.112 Some organizations, such as the online blog Search Engine Land, appear to have obtained copies of these guidelines and have written about them, but Microsoft itself has not publicly disclosed information around these efforts. Microsoft also does not share any information about who these judges are, and what perspectives, regions, and backgrounds they represent. This makes it difficult to understand how search results are benchmarked and curated, and which voices and perspectives are being considered when producing the search experience. In addition, according to a spokesperson, Microsoft deploys both algorithmic tools and human editors during the search curation and ranking process. Microsoft does not, however, provide any further information on the role of these human editors, how they differ from or are similar to judges, and what role they play. The presence of human editing in this process creates a very real opportunity to instill bias in the search ranking process. Although, as previously noted, the term "bias" in this context does not solely refer to inappropriate preferences based on protected categories like race or political affiliation, this still raises concerns. Algorithmic biases, which can originate from their creators and from biased training data, are also a significant concern. However, given that Microsoft shares limited information about its algorithmic curation and ranking process, it is difficult to assess what these biases are and how they impact the search experience.
Like Google, Microsoft lets users control their search experience on Bing to an extent. When a Bing user is logged in to their Microsoft account, they can view and clear their browsing, search, location, and other relevant activity history from the account. They can also manage and control some of the data that Microsoft collects. However, these controls are not available to users who are searching on Bing in a private browsing mode, or who are searching while not logged in.
In addition, Microsoft also demonstrates some concerning practices when it comes to accountability with the Bing search engine. According to Frédéric Dubut, the head of Bing's spam team, Microsoft aims to assess intent when deciding whether to penalize a website for violating its guidelines (such as by spamming).113 However, how the company assesses intent is unclear. As previously highlighted, a web page can be penalized in a number of ways, including neutralizing the impact of spam or negative intent, demoting a website in search rankings, or removing the website from search results.114 A publisher can submit a reconsideration request if they believe their website has been unfairly penalized. This is valuable, as it provides an appeals mechanism for users of Bing's search engine. However, this process has been described as lengthy, raising the risk that an error on the part of Microsoft can seriously damage the success of a website.115 In order to provide greater accountability around its search result ranking procedure, Microsoft should improve the process so that it generates resolutions in a timely manner.
Microsoft does, however, also demonstrate some positive practices. For example, it lets website owners request recrawls of their sites.116 Additionally, like Google, Microsoft receives various legal, copyright, and private party requests to remove and delist websites from Bing.117 The company issues an annual transparency report regarding content removal requests, which outlines the scope and scale of such requests. The report provides data on government requests for content removal, copyright removal requests, “Right to be forgotten” requests, and non-consensual pornography (“revenge porn”) removal requests. The report, however, does not break down these data points by Microsoft product, and as a result it is difficult to ascertain how often search results on Bing are impacted by these requests.118
Case Study: DuckDuckGo
DuckDuckGo is an internet search engine that launched in 2008, largely based on Free and Open Source Software (FOSS).119 Typically, search engines aim to distinguish themselves based on the comprehensiveness and accuracy of their search index, and the relevance of their search results.120 DuckDuckGo seeks to further distinguish itself by providing users with strong privacy protections that also enable them to evade the so-called filter bubble created by the personalization of search results. According to DuckDuckGo, the platform does not profile users and delivers all users the same search results, regardless of their past search history. The platform also asserts that it prioritizes providing users with the highest-quality search results, rather than the largest number of results. Today, DuckDuckGo is the sixth largest search engine in the world by market share.121 In January 2019, it set a new traffic record with over 1 billion monthly searches.122 It ranks 186th for global internet engagement on Alexa rankings.123 DuckDuckGo uses its own web crawler, known as DuckDuckBot, and approximately 400 other sources in order to generate and curate its search results. These sources include other search engines, such as Bing, Yahoo!, and Yandex, and websites, such as Wikipedia.124 As concerns around consumer privacy have grown—especially following the 2013 Snowden disclosures and more recent data-sharing controversies like the Cambridge Analytica scandal—DuckDuckGo has seen a significant increase in its user base and website traffic.125
According to DuckDuckGo, the platform enforces encrypted HTTPS connections whenever websites provide them. When a user connects to a website over HTTPS, the user's browser evaluates the website's security certificate and authenticates that it was issued by a legitimate authority. This helps secure sensitive information sent over an HTTPS connection from electronic eavesdropping, and it can prevent the information from being modified while in transit.126 DuckDuckGo also assigns each page a user visits a score that assesses to what extent that website is trying to mine the user's data. In order to maintain user anonymity online, DuckDuckGo asserts that it blocks tracking cookies, which can be used to identify a user and their devices. It also scans and scores the privacy policies of different websites that a user visits. On the DuckDuckGo search engine, a user has the ability to clear their tabs and data automatically, at the end of a session or after a preset period of inactivity.127 In addition, although the company still provides advertisements, these are "contextual" ads that are based only on the content of a website (such as a current search query) rather than on a user's behavioral profile, including their prior search history.128 Further, DuckDuckGo has stated that it does not store personal user information; it does, however, maintain a log of all search terms that have been used on its service.129
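The "enforce HTTPS where available" behavior could be sketched as a simple URL-upgrading rule, as below. The allowlist here is a hypothetical placeholder for whatever list of HTTPS-capable sites a search engine or browser extension might maintain; it is not DuckDuckGo's actual mechanism.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical allowlist of sites known to support HTTPS.
KNOWN_HTTPS = {"example.com", "en.wikipedia.org"}

def upgrade(url):
    """Rewrite http:// URLs to https:// when the host is known to support it."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in KNOWN_HTTPS:
        return urlunsplit(("https",) + tuple(parts)[1:])
    return url

print(upgrade("http://example.com/page"))     # upgraded to https
print(upgrade("http://tracker-heavy.test/"))  # left unchanged
```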
Despite the differences in DuckDuckGo's practices, some studies have indicated that DuckDuckGo is able to return results with the same level of quality as Google.130 However, this is generally true for searches on broad topics rather than niche ones.
Although DuckDuckGo seeks to provide users with a positive search experience that comes with strong privacy protections, the company does not provide a significant amount of transparency and accountability around its search curation and ranking practices. According to the DuckDuckGo research study conducted on Google's ranking practices, a neutral search engine that truly delivers non-personalized search results should deliver the same results to all users, regardless of what browsing mode they are in. However, just because all users on the DuckDuckGo platform see the same results does not mean that these results are not curated and ranked using automated tools. The company does not share which signals it uses to perform this ranking. On the DuckDuckGo website, the company states "ranking is a bit opaque and difficult to discern/communicate on an individual query basis because of all the various factors involved (and which change frequently). Nevertheless, the best way to get good rankings (in nearly all search engines) is to get links from high-quality sites."131 It is therefore difficult to understand which factors DuckDuckGo's search curation and ranking processes prioritize, and how these judgments impact users' search experiences on the platform. By providing greater transparency, the platform could enhance its value proposition and demonstrate to users why it is a better search engine choice.
One way the platform aims to deliver high-quality search results is by removing search results associated with content mill companies. Content mill companies are websites that publish numerous daily articles, often produced by freelance writers (for example, eHow). These forms of content are considered low-quality, but they are written so that they rank highly in Google's search index. DuckDuckGo, however, removes them.132 The search engine has also begun experimenting with algorithms to remove spam links and other forms of low-quality content. However, DuckDuckGo provides little transparency around the scope and scale of this process.
According to DuckDuckGo, the platform’s search engine supports user privacy, provides users greater protections around how their data is used, and aims to deliver neutral search results that do not exhibit bias and that prevent the creation of filter bubbles. However, the company does not provide adequate transparency and accountability around its ranking process, making it difficult for users and website owners to understand how expression is being controlled. The growth of the platform suggests there is a market for a service whose value proposition is built on protecting user privacy.133 This value proposition should be extended to include transparency and accountability.
Citations
- Carole Cadwalladr, "Google, Democracy and the Truth about Internet Search," The Guardian, December 4, 2016, source
- Brian Barrett, "I Used Only Bing for 3 Months. Here's What I Found—And What I Didn't," WIRED, October 17, 2018, source
- Barrett, "I Used Only Bing for 3 Months."
- Jillian D'Onfro, "Trump is Slamming Google's News Results But Here's How Microsoft's Bing Stacks Up," CNBC, August 29, 2018, source
- Studies conducted by Dr. Francesca Tripodi of James Madison University and by researchers at Google have debunked claims of bias against conservatives. Dr. Tripodi and Karan Bhatia, Vice President for Government Affairs & Public Policy at Google, highlighted this research when they testified before the Senate Committee on the Judiciary on July 16, 2019, at a hearing titled "Google and Censorship through Search Engines."
- Bing, "Bing Delivers Text-to-Speech and Greater Coverage of Intelligent Answers and Visual Search," Bing Blogs, entry posted March 20, 2019, source
- "Search Engine Market Share Worldwide," StatCounter, last modified July 2019, source
- "Search Engine," StatCounter.
- "Google.com Competitive Analysis, Marketing Mix and Traffic," Alexa Internet, source
- Google, How Google Fights Disinformation, February 2019, source
- Tim Soulo, "Google PageRank is NOT Dead: Why It Still Matters," Ahrefs Blog, entry posted September 13, 2018, source
- Safiya Umoja Noble, Algorithms of Oppression: How Search Engines Reinforce Racism (New York University Press, 2018).
- Eli Pariser, The Filter Bubble: What the Internet is Hiding From You (New York: The Penguin Press, 2011).
- source
- Google, "How Search Organizes Information," How Search works, source; A sitemap is a file that contains information about the web pages, videos, and other files on a website, and the relationships between these elements. It informs a search engine which files are the most important and provides important information about each page, including when it was last updated, how often it changes, and whether it is available in other languages. Google uses sitemaps to crawl a website more effectively.
- Google, "How Search Organizes Information," How Search works.
- Transparency & Accountability: Examining Google and its Data Collection, Use and Filtering Practices: Hearing Before the House Committee on the Judiciary (2018) (statement of Mr. Sundar Pichai). source
- Danny Sullivan, "How We Keep Search Relevant and Useful," The Keyword (blog), entry posted July 15, 2019, source
- Google, "How Search Algorithms Work," Google, source
- Google, "How Search Algorithms Work," Google.
- Tim Soulo, "Google PageRank is NOT Dead: Why It Still Matters," Ahrefs Blog, entry posted September 13, 2018, source
- Andreea Sauciuc, "Does Google PageRank Still Matter in 2018? A Retrospective View in the PageRank History," Cognitive SEO, source
- Google, "How Search Algorithms Work," Google.
- Google, "How Search Algorithms Work," Google.
- Todd Haselton and Megan Graham, "Google Uses Gmail to Track a History of Things You Buy — And It's Hard to Delete," CNBC, May 17, 2019, source
- Google, "How Search Algorithms Work," Google.
- Google, "How Search Algorithms Work," Google.
- Google, Search Quality Evaluator Guidelines, September 5, 2019, source
- Google, Search Quality Evaluator Guidelines.
- Google, How Google Fights Disinformation.
- Google, "Personalized Search for Everyone," Google Official Blog, entry posted December 4, 2009, source
- Natasha Lomas, "Google 'Incognito' Search Results Still Vary From Person to Person, DDG Study Finds," TechCrunch, December 4, 2018, source
- Lomas, "Google 'Incognito' Search Results Still Vary From Person to Person, DDG Study Finds."
- Barry Schwartz, "Google Admits It's Using Very Limited Personalization In Search Results," Search Engine Land (blog), entry posted September 17, 2018, source
- Lomas, "Google 'Incognito' Search Results Still Vary From Person to Person, DDG Study Finds."
- DuckDuckGo, Measuring the "Filter Bubble": How Google is Influencing What You Click, December 4, 2018, source
- DuckDuckGo, Measuring the "Filter Bubble": How Google is Influencing What You Click.
- Lomas, "Google 'Incognito' Search Results Still Vary From Person to Person, DDG Study Finds."
- DuckDuckGo, Measuring the "Filter Bubble": How Google is Influencing What You Click.
- DuckDuckGo, Measuring the "Filter Bubble": How Google is Influencing What You Click.
- Lomas, "Google 'Incognito' Search Results Still Vary From Person to Person, DDG Study Finds."
- DuckDuckGo, Measuring the "Filter Bubble": How Google is Influencing What You Click.
- DuckDuckGo, Measuring the "Filter Bubble": How Google is Influencing What You Click.
- DuckDuckGo, Measuring the "Filter Bubble": How Google is Influencing What You Click.
- Lomas, "Google 'Incognito' Search Results Still Vary From Person to Person, DDG Study Finds."
- Lomas, "Google 'Incognito' Search Results Still Vary From Person to Person, DDG Study Finds."
- Nick Statt, "Google Personalizes Search Results Even When You're Logged Out, New Study Claims," The Verge, December 4, 2018, source
- Statt, "Google Personalizes Search Results Even When You're Logged Out, New Study Claims."
- Google, How Google Fights Disinformation.
- Pariser, The Filter Bubble: What the Internet is Hiding From You.
- Will Oremus, "The Filter Bubble Revisited," Slate, April 5, 2017, source
- The Hill Staff, "Watch Live: Google CEO Testifies Before House Judiciary Committee," The Hill, December 11, 2018, source
- Casey Newton, "The Real Bias on Social Networks Isn't Against Conservatives," The Verge, April 11, 2019, source
- Google, How Google Fights Disinformation.
- Sullivan, "How We Keep," The Keyword (blog).
- Google, How Google Fights Disinformation.
- Google, How Google Fights Disinformation.
- Google, How Google Fights Disinformation.
- Noble, Algorithms of Oppression.
- Berin Szoka and Adam Marcus, eds., The Next Digital Decade: Essays on the Future of the Internet (TechFreedom, 2011).
- Google, How Google Fights Disinformation.
- Barry Schwartz, "SEOs Noticing Ranking Volatility in Google's Search Results," Search Engine Journal, last modified January 10, 2019, source
- Sullivan, "How We Keep," The Keyword (blog).
- Google, How Google Fights Disinformation.
- Google, How Google Fights Disinformation.
- Google, Government Requests to Remove Content, source
- Google, Government Requests to Remove Content.
- "The Santa Clara Principles On Transparency and Accountability in Content Moderation," source
- Deirdre K. Mulligan and Daniel S. Griffin, "Rescripting Search to Respect the Right to Truth," Georgetown Law Technology Review 2, no. 2 (2018): 557, source
- Sullivan, "How We Keep," The Keyword (blog).
- Google, How Google Fights Disinformation.
- Google, How Google Fights Disinformation.
- Google, How Google Fights Disinformation.
- Sullivan, "How We Keep," The Keyword (blog).
- Barry Schwartz, "Google and Bing Talk Web Spam and Penalties," Search Engine Land, last modified June 6, 2019, source
- Sullivan, "How We Keep," The Keyword (blog).
- Google, How Google Fights Disinformation.
- Schwartz, "Google and Bing," Search Engine Land.
- Schwartz, "Google and Bing," Search Engine Land.
- Google, "Personalized Search," Google Official Blog.
- Lomas, "Google 'Incognito' Search Results Still Vary From Person to Person, DDG Study Finds."
- DuckDuckGo, Measuring the "Filter Bubble": How Google is Influencing What You Click.
- Google, "How Search Organizes Information," How Search works.
- Microsoft, "Microsoft's New Search at Bing.com Helps People Make Better Decisions," entry posted May 28, 2009, source
- Microsoft, "Microsoft's New Search at Bing.com Helps People Make Better Decisions."
- Microsoft, "Microsoft's New Search at Bing.com Helps People Make Better Decisions."
- Microsoft, "Microsoft's New Search at Bing.com Helps People Make Better Decisions."
- Microsoft, Annual Report 2018, source
- "Search Engine," StatCounter.
- J. Clement, "Bing – Statistics & Facts," Statista, last modified October 24, 2018, source
- "Bing.com Competitive Analysis, Marketing Mix and Traffic," Alexa Internet, source
- Adam Dorfman, "How Bing Is Enhancing Search and Apparently Growing As A Result," Search Engine Land, last modified August 7, 2018, source
- Barrett, "I Used Only Bing for 3 Months. Here's What I Found—And What I Didn't."
- source
- Barrett, "I Used Only Bing for 3 Months. Here's What I Found—And What I Didn't."
- Jan Pedersen, "The Role of Content Quality in Bing Ranking," Bing Blogs, entry posted December 9, 2014, source
- Pedersen, "The Role," Bing Blogs.
- Barrett, "I Used Only Bing for 3 Months. Here's What I Found—And What I Didn't."
- Schwartz, "Google and Bing," Search Engine Land.
- Matt McGee, "Yes, Bing Has Human Search Quality Raters & Here's How They Judge Web Pages," Search Engine Land, last modified August 15, 2012, source
- Schwartz, "Google and Bing," Search Engine Land.
- Schwartz, "Google and Bing," Search Engine Land.
- D'Onfro, "Trump is Slamming Google's News Results But Here's How Microsoft's Bing Stacks Up," CNBC.
- Barry Schwartz, "Bing: Machine Learning Leads To More Human Like Ranking In Search," Search Engine Roundtable, last modified December 12, 2018, source
- Patrick Stox, "90%+ of Bing search results would be based on machine learning. LambdaMART is the core. @CoperniX #SMX," Twitter, January 30, 2019, source
- Schwartz, "Google and Bing," Search Engine Land.
- Schwartz, "Google and Bing," Search Engine Land.
- Schwartz, "Google and Bing," Search Engine Land.
- Schwartz, "Google and Bing," Search Engine Land.
- "Bing Webmaster Guidelines," Bing, source
- "Bing Webmaster Guidelines," Bing.
- McGee, "Yes, Bing," Search Engine Land.
- Schwartz, "Google and Bing," Search Engine Land.
- Schwartz, "Google and Bing," Search Engine Land.
- Schwartz, "Google and Bing," Search Engine Land.
- Alan Sembera, "How to Recrawl a Website With Bing," AZ Central, source
- Bing, "Bing To Use Location for RTBF," Bing Blogs, entry posted August 12, 2016, source
- Microsoft, Content Removal Requests Report, source
- DuckDuckGo, "Open Source Overview," DuckDuckGo Help Pages, source
- Manish Agarwal and David K. Round, "The Emergence of Global Search Engines: Trends in History and Competition," Competition Policy International 7, no. 1 (Spring 2011): source
- "Search Engine," StatCounter.
- Cory Hedgepeth, "DuckDuckGo Explodes With 1 Billion Monthly Searches (Um, Is This Really Happening?)," Direct Online Marketing, last modified February 6, 2019, source
- "Duckduckgo.com Competitive Analysis, Marketing Mix and Traffic," Alexa Internet, source
- Sam Hollingsworth, "DuckDuckGo vs. Google: An In-Depth Search Engine Comparison," Search Engine Journal, last modified April 12, 2019, source
- Lisa Lacy, "DuckDuckGo Is Shedding Its Black Sheep Status Thanks to Its Dedication to Privacy," AdWeek, December 3, 2018, source
- Kevin Bankston and Ross Schulman, Case Study #1: Using Transit Encryption by Default, February 2017, source
- David Nield, "It's Time to Switch to a Privacy Browser," WIRED, June 16, 2019, source
- Becky Chao and Eric Null, Paying for Our Privacy: What Online Business Models Should Be Off-Limits?, September 17, 2019, source
- DuckDuckGo, "Privacy," DuckDuckGo, source
- Rob Pegoraro, "What It's Like To Use A Search Engine That's More Private Than Google," Yahoo! Finance, November 4, 2018, source
- DuckDuckGo, "Results Rankings (SEO)," DuckDuckGo Help Pages, source
- Christopher Mims, "The Search Engine Backlash Against 'Content Mills,'" MIT Technology Review, July 26, 2010, source
- Dorfman, "Why DuckDuckGo," Search Engine Land.