Table of Contents
- Introduction
- The Growth of Today’s Digital Advertising Ecosystem
- The Role of Data in the Targeted Advertising Industry
- The Role of Automated Tools in Digital Advertising
- Concerns Regarding Digital Advertising Policies and Practices
- Case Study: Google
- Case Study: Facebook
- Case Study: LinkedIn
- Promoting Fairness, Accountability, and Transparency Around Ad Targeting and Delivery Practices
The Role of Data in the Targeted Advertising Industry
User data has been integral to the rise of digital advertising. The growth of the digital advertising industry would not have been possible without expanding digital data collection practices. As Ben Scott and Dipayan Ghosh outline in their report Digital Deceit: The Technologies Behind Precision Propaganda on the Internet, there are a variety of mechanisms that companies in the digital ad ecosystem can use to collect users’ data. These can be broken down into four broad categories:1
- Web Tracking: When a user visits a website, the website loads built-in web tracking technology. The most common of these technologies is the web cookie, which can be “first-party” or “third-party.” First-party cookies are developed and placed on a website by the website owner. This enables the website to track a user’s movements and activity between web pages on the website. Third-party cookies are developed and placed on a website by a third-party entity, in partnership with the website. This enables the third party to monitor and track a user’s activity on that website, as well as on every other website the user visits that has the third-party cookie code embedded on it. Cookies can thus enable the tracking party to accrue data that helps it infer the interests, preferences, behaviors, and routines of a user based on their behavior across a network of sites. This information can then be used to target these users with specific and relevant ads.
Although cookies are a highly pervasive form of web tracking, users can manage how cookies monitor and collect information about them by clearing cookies in their browser settings. Users can also deploy tools such as the “Do Not Track” feature available on some browsers, which sends a request asking each website a user visits to disable cross-site tracking of that user, including via cookies. However, honoring Do Not Track is left to each website’s discretion, and some advertisers can actually use the fact that a user has enabled Do Not Track as a signal in browser “fingerprinting,” discussed later in this section.2
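To make the mechanics concrete, the following sketch (all domain names invented) simulates how a single third-party cookie lets one ad-tech domain, embedded on many publisher sites, assemble a cross-site browsing history for one browser:

```python
import uuid

class Browser:
    """Toy browser: stores one cookie per domain that set it."""
    def __init__(self):
        self.cookies = {}

    def get_cookie(self, domain):
        return self.cookies.get(domain)

    def set_cookie(self, domain, value):
        self.cookies[domain] = value

class ThirdPartyTracker:
    """Stands in for an ad-tech domain whose code is embedded on many sites."""
    def __init__(self, domain):
        self.domain = domain
        self.visit_log = {}  # cookie ID -> list of sites where it was seen

    def on_page_load(self, browser, site):
        # Reuse the browser's existing cookie if present; otherwise mint
        # a new persistent identifier. Either way, log this site visit.
        cookie = browser.get_cookie(self.domain)
        if cookie is None:
            cookie = str(uuid.uuid4())
            browser.set_cookie(self.domain, cookie)
        self.visit_log.setdefault(cookie, []).append(site)

tracker = ThirdPartyTracker("ads.example")
browser = Browser()
# The same cookie is presented on every publisher site that embeds the tracker,
# so three "unrelated" visits collapse into one cross-site profile.
for site in ["news.example", "shop.example", "travel.example"]:
    tracker.on_page_load(browser, site)

(history,) = tracker.visit_log.values()
print(history)
```

Clearing cookies in this model is simply `browser.cookies.clear()`, after which the tracker mints a fresh identifier and the old profile is orphaned, which is why cookie-clearing remains a meaningful (if partial) defense.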
- Location Tracking: Granular location data is an integral component of the digital advertising ecosystem, as it provides a significant amount of information about the interests, preferences, behaviors, and routines of a user. For example, a user’s location information can provide insight into where a user lives, where they work, what stores or businesses they regularly visit, and where they spend their free time. This information can be used to determine which ads a user should be targeted with. Smartphone application makers routinely use GPS signals, cellular network triangulation, Wi-Fi SSIDs, and Bluetooth connectivity to collect such location information.
Users have limited avenues to prevent or manage collection of their location information. These include turning off location settings on their phone, managing the location information access of individual smartphone apps, and opting out of location-based ads on single-site ad platforms. However, it is important to note that many smartphone applications, such as rideshare applications, require location information in order to operate, and they often mandate constant access to this data as a condition of using the app.
- Cross-Device Tracking: Consumers, especially in the United States, often access the internet from multiple devices. However, advertisers generally want to avoid delivering duplicate advertisements to consumers across multiple devices. In order to control when and where the ads are delivered, the digital advertising industry has developed cross-device tracking technologies that monitor user activity across devices. Once cross-device inferences are made with high enough confidence, many companies will associate a unique identifier with a user. This identifier becomes a central anchor of user data that is collected across multiple applications, internet platforms, and devices.
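The matching step behind cross-device tracking can be sketched as follows. This toy example (all signal names, weights, and the threshold are invented for illustration) links two device records under one identifier when a deterministic signal such as a shared login matches, or when enough probabilistic signals such as a shared home IP address agree:

```python
from dataclasses import dataclass

@dataclass
class DeviceRecord:
    device_id: str
    home_ip: str
    evening_location: str      # e.g. a coarse geohash of where the device sits at night
    logged_in_account: str = ""

def match_score(a, b):
    """Toy confidence score: a shared login is deterministic and conclusive;
    otherwise, agreeing probabilistic signals add up."""
    if a.logged_in_account and a.logged_in_account == b.logged_in_account:
        return 1.0
    score = 0.0
    if a.home_ip == b.home_ip:
        score += 0.5
    if a.evening_location == b.evening_location:
        score += 0.3
    return score

def unify(devices, threshold=0.7):
    """Assign a shared person-level identifier to device pairs whose
    match confidence clears the threshold."""
    person_of = {d.device_id: d.device_id for d in devices}  # each device starts alone
    for i, a in enumerate(devices):
        for b in devices[i + 1:]:
            if match_score(a, b) >= threshold:
                person_of[b.device_id] = person_of[a.device_id]
    return person_of

phone = DeviceRecord("phone-1", "203.0.113.7", "geo:abc")
laptop = DeviceRecord("laptop-1", "203.0.113.7", "geo:abc")
tablet = DeviceRecord("tablet-9", "198.51.100.2", "geo:xyz")
print(unify([phone, laptop, tablet]))
```

The resulting person-level identifier is the “central anchor” described above: once the phone and laptop are linked, data collected from either device accrues to the same profile.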
- Browser Fingerprinting: A browser fingerprint is a compilation of data about the setup of a user’s browser or operating system. This information can include a user’s browser provider, browser version, operating system, preferred language, browser plugins (typically third-party software that adds functionality to a browser when installed), tracking settings, ad blocker settings, and time zone. Because users are able to customize their browser settings and preferences, their browser fingerprint can be used to identify them across the internet. A 2015 study conducted by the Electronic Frontier Foundation concluded that 84 percent of participating internet users had unique browser fingerprints. Another study found that websites were able to track a user even when the user switched browsers, because characteristics such as the operating system and IP address remain the same.3 This raises significant privacy concerns. To evade browser fingerprinting, some individuals use Tor, open-source software that enables anonymous communication. Tor manipulates tracking signals such as IP address and operating system, normalizing them across all Tor Browser users in order to prevent fingerprinting.
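A minimal sketch of the fingerprinting idea: the attribute values below stand in for the kinds of settings a script can read out of a browser, and hashing their combination yields a stable identifier that requires no cookie at all. (The attribute list and values are illustrative, not an actual fingerprinting script.)

```python
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    """Hash a browser's readable configuration into a short stable identifier."""
    # Canonical JSON (sorted keys) so the same setup always hashes identically.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

visitor = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0",
    "language": "en-US",
    "timezone": "America/New_York",
    "plugins": ["PDF Viewer"],
    "do_not_track": "1",  # note: even the DNT setting itself feeds the fingerprint
}

print(fingerprint(visitor))
```

Any change to the setup (a new plugin, a different time zone) changes the hash, but in practice most users’ combinations are both stable and distinctive, which is why clearing cookies does not defeat this technique. Tor’s countermeasure, in these terms, is to make every user submit the same `attributes` dict.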
Data brokers such as Acxiom, Experian, and Oracle also play an important role in data collection and user targeting. These companies combine user records from a range of sources, including retail purchases and census data, in order to provide advertising platforms such as Facebook with hundreds of unique data points that they can use to enhance their profile database.4 Internet platforms, advertisers, and third-party data brokers can also use data modeling techniques to make further inferences and predictions about consumer traits and behaviors. Data modeling enables these parties to use information on observed actions and self-reported preferences and interests to supplement and fill in profile information. For example, a data broker could use a user’s zip code and name to infer their ethnicity based on census or other data showing ethnic breakdowns by zip code. Data modeling can also be used to categorize users based on factors such as creditworthiness or interest in certain topics.5 In 2018, Facebook patented a system that combines a range of data points to predict “socioeconomic group classification.”6
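The data-modeling step described above amounts to a join between observed profile fields and aggregate reference data. The sketch below (all table values invented) shows the shape of such an enrichment: a zip code observed for a user is looked up against census-style aggregates, and the inferred attributes are filled into the gaps of the profile:

```python
# Hypothetical aggregate reference table, keyed by zip code, of the kind
# a broker might derive from census or survey data. Values are invented.
AGGREGATE_BY_ZIP = {
    "10001": {"median_income": "high", "dominant_age_band": "25-34"},
    "60629": {"median_income": "middle", "dominant_age_band": "35-44"},
}

def enrich(profile: dict) -> dict:
    """Return a copy of the profile with inferred fields filled in from
    aggregates, without overwriting anything directly observed."""
    enriched = dict(profile)
    aggregates = AGGREGATE_BY_ZIP.get(profile.get("zip", ""), {})
    for key, value in aggregates.items():
        enriched.setdefault(key, value)  # only infer what was not observed
    return enriched

user = {"name": "J. Doe", "zip": "10001"}
print(enrich(user))
```

The key point is that the inferred fields are statistical guesses attached to an individual: the user never disclosed an income bracket, yet the enriched profile now carries one and can be targeted on it.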
According to Scott and Ghosh, the pervasive use of such methods and the rampant data collection of internet platforms foster a vicious cycle when it comes to highly personal data. The more highly personal data companies are able to collect, the more relevant the advertisements they are able to deliver to users. The more relevant these advertisements are, the longer these companies can keep users on their platforms, thus maximizing the potential advertising space for each user and driving revenue for the platform.7 However, there is disagreement over what “relevant” really means in this context. Companies assert that relevant content is content that a user is likely to be interested in based on their online profile and history. But this means that an advertiser could tailor their advertisements to specific personality types and perceived emotional states. They could also use profiles to target and exploit an already vulnerable category of users.8 This has already been documented. For example, cigarette companies have targeted low-income communities with their advertisements, and similarly vulnerable communities have been the primary audience for products such as diet pills.9 Similarly, in 2013, media planning agency PHD released a study highlighting the dates, times, and occasions when U.S. women felt the least attractive. Based on this, it laid the groundwork for advertisers to target women during “prime vulnerability moments,” such as Monday mornings, when the research indicated women “feel least attractive.”10 This raises a number of ethical concerns regarding how such “relevant” advertising can be used to exploit marginalized groups, or to reach individuals when they are most vulnerable, for the purpose of driving further revenue.
Although internet platforms collect a significant amount of user data, the largest companies typically do not share or sell these vast datasets. They assert that this is due to concerns regarding user privacy, competitiveness, and the protection of intellectual property. However, the collection of user data is still central to these platforms’ business models, and although these companies do not sell this data, they are still able to monetize it and gain significant financial benefits. For example, platforms such as Google and Facebook have successfully profited from the vertical integration of behavior tracking and ad targeting.11 Similarly, by offering advertisers tools to target users based on the platforms’ analysis of their users’ data, the platforms are effectively selling their users’ attention.12
Today, there are few legal safeguards in the United States that limit what companies can do with their users’ data. Although the California Consumer Privacy Act went into effect in January 2020, the United States still lacks a comprehensive federal law to protect consumer privacy. Further, in 2017, President Trump signed a repeal of broadband privacy regulations that had required internet service providers to “obtain consumer consent before using precise geolocation, financial information, health information, children’s information and web-browsing history for advertising and marketing.”13 In the United States, internet platforms and civil society organizations have called for comprehensive privacy legislation. However, it is unlikely that any legislation will be enacted soon, as there is little consensus over what specific provisions and safeguards such legislation should cover.
Citations
- Ghosh and Scott, Digital Deceit.
- Geoffrey Fowler, "Think You're Anonymous Online? A Third of Popular Websites Are 'Fingerprinting' You," The Washington Post, October 31, 2019, source
- Dan Goodin, "Now Sites Can Fingerprint You Online Even When You Use Multiple Browsers," Ars Technica, February 13, 2017, source
- Anthony Nadler, Matthew Crain, and Joan Donovan, Weaponizing the Digital Influence Machine: The Political Perils of Online Ad Tech, October 17, 2018, source
- Nadler, Crain, and Donovan, Weaponizing the Digital.
- Brendan Sullivan et al., Socioeconomic Group Classification Based on User Features, US Patent 20180032883, filed July 27, 2016, and issued Feb. 1, 2018, source
- Ghosh and Scott, Digital Deceit.
- Matsakis, "Facebook's Targeted".
- Matsakis, "Facebook's Targeted".
- PHD Media, "New Beauty Study Reveals Days, Times And Occasions When U.S. Women Feel Least Attractive," news release, October 2, 2013, source
- Ghosh and Scott, Digital Deceit.
- Charles Duhigg, "How Companies Learn Your Secrets," The New York Times, February 16, 2012, source
- David Shepardson, "Trump Signs Repeal of U.S. Broadband Privacy Rules," Reuters, April 3, 2017, source