Defining Data Governance
The term “data” refers to information created, processed, and stored digitally by a computer in binary format, that is, ones and zeros. Network connections and devices allow this data to be transferred from one computer to another. A distinction also needs to be drawn between “data” (machine-readable ones and zeros, or “code”) and “information” (what that data means to humans).1 Such data and information can have different implications depending on their type (e.g., pertaining to finance, health, social media, or law enforcement).
Based on these definitions and distinctions, we generally define data governance as the rules for how governments interact with the private sector—as well as with other governments—when it comes to managing data to determine who has access to it and the ways in which those with access can use it. As previously articulated, this includes the design and enforcement of standards, policies, and laws.
We understand that the term “data governance” has many different meanings depending on the context and the perspective of various stakeholders. For the purposes of our roundtable, our goal was to have a structured discussion about data governance as it applies to the following three issues (in no particular order):
- National security/law enforcement: a government’s interest in ensuring access to data for purposes of domestic and international security; other governments’ converse concerns about misuse of that data; and desires to protect data against foreign collection;
- Economic growth/innovation: objectives to create and access large datasets for research and development of data-intensive technologies like machine learning and artificial intelligence, as well as for cross-border transactions and e-commerce; and
- Content moderation policies and practices: competing demands on what is and is not permissible content, and possible ways to manage that conflict while also ensuring the free flow of data.
Across these three distinct areas, there are different types of tools or “levers” that set the terms around how data is collected, used, transferred, and stored. These levers essentially set the bar for concepts such as “trust” in Prime Minister Abe’s concept of “Data Free Flow with Trust.” Since the concept of trust is quite vague (with significant debate regarding the degree to which regulation even enhances trust),2 a core objective of this project is to consider how the various levers at play can and should be configured to achieve certain safeguards. We discuss these levers in the next section.
Additionally, the issue of how governments support data flows across borders—or conversely, how governments restrict those flows—is a major focal point across each of the three aforementioned areas of data governance. The term “data localization,” for example, appears frequently in policy discussions to mean restrictions on the ability of firms to transfer data from domestic sources to foreign countries—in other words, the opposite of free data flow.3 In reality, the term can carry several different meanings, spanning a spectrum of severity.
On the most permissive side of the spectrum is “mirroring,” where a country requires that a copy of data be stored on a server within that country before the data can be transferred abroad. Partial data localization could mean that restrictions exist only on certain domain names or on data from specific sectors like health or finance. China’s system is stricter than that of many other countries in that the government requires firms to store certain kinds of data on servers inside the country, while allowing transfer in or out under certain conditions. There still appears to be a regulatory gray zone in which multinationals in China can send certain kinds of data outside the country, but it is not clear to what extent this will remain the case, given the significant weight given to national security in Beijing’s approach to data regulation.
In other cases, however, data localization may be implemented in an even stricter manner by requiring local storage and local processing while prohibiting outbound transfer altogether. This could mean foreign firms cannot access and use data to create value outside of that geographic area. Russia and India already take such an approach with some kinds of data (e.g., payment data in India’s case), and other countries are increasingly considering it. But at least for now, with a few exceptions, most governments have not implemented these stricter forms of data localization in any notable way.
The above pathways all fall under the “data localization” umbrella of policy options. But localized storage and processing requirements are by no means the only policy option available for limiting free flows of data; countries could also potentially implement some form of algorithmic filtering in order to allow or disallow certain kinds of data, possibly even from certain places, to flow into or out of their borders.4 This could focus on anything from sensitive personal health information to political online content, depending on factors such as the government’s policy priorities and its technical capabilities.
We discuss the key challenges for enabling cross-border data flows as part of Theme 1 later in this report. Before that, however, we turn in the next section to the “levers” of data governance and their relationships at super-national, national, and sub-national levels.
Citations
1. This is a distinction one of us previously established in: Robert Morgus and Justin Sherman, “The Idealized Internet vs. Internet Realities” (Washington, DC: New America, 2018), 27.
2. Daniel Castro and Eline Chivot, “The GDPR Was Supposed to Boost Consumer Trust. Has it Succeeded?” European Views, June 6, 2019.
3. World Trade Organization, “How Do We Prepare for the Technology-Induced Reshaping of Trade?” World Trade Report (2018).
4. We have both noted, in a range of places and contexts, how hypothetical limitations on the flow of AI-related data around the world (i.e., code for neural networks, training data sets, etc.) stand in stark contrast to the current state of AI research, which remains incredibly open. See: Justin Sherman, “U.S. Tech Needs Hard Lines on China,” Foreign Policy, May 3, 2019; Samm Sacks, “Smart Competition: Adapting U.S. Strategy Toward China at 40 Years,” Testimony before the House Foreign Affairs Committee, May 8, 2019; and Justin Sherman, “The Pitfalls of Trying to Curb Artificial Intelligence Exports,” World Politics Review, June 6, 2019.