Literature Review: Ongoing Efforts

The need for standardized, machine-readable schemas for regulatory, governance, and policy documents has driven substantial innovation across academia, government, and industry. These initiatives have addressed different facets of the broader challenge of structured governance documentation, each providing insights and precedents relevant to GovSCH (see Table 2).

One foundational effort is the Akoma Ntoso standard, developed by OASIS Open and the United Nations in 2018, which provides an XML-based schema for structuring legal, legislative, and judicial documents.1 Akoma Ntoso enables consistent markup and semantic clarity across documents such as bills, acts, and parliamentary records, supporting both human readability and machine parsing. Similarly, the United Kingdom’s Legal Schema (UKLS)2 project of 2023 uses structured markup through the Legal Schema Language to produce digital contracts that are both human-readable and machine-executable. Additionally, the Semantics of Business Vocabulary and Business Rules (SBVR) standard from the Object Management Group, cited here in its 2019 version 1.5, applies controlled natural language to transform business rules into machine-interpretable logic.3 These foundational efforts establish key practices for the structured encoding of formal policy and legal content, setting essential precedents for GovSCH.

Recent literature has explored ways of transforming policy intentions into actionable schemas in the AI compliance domain. AI Cards, developed in 2024, provides a structured framework for documenting AI systems in line with regulatory requirements such as the EU AI Act, integrating human- and machine-readable elements.4 Furthermore, the 2024 Compliance-as-Code 2.0 advances regulatory compliance automation by converting regulations into executable code through agentic AI systems.5 The 2023 Regulatory Knowledge Graphs leverage graph-based semantic representations to facilitate the automated interpretation and querying of regulatory documents.6 Complementing these efforts, the 2018 LexNLP presents a comprehensive natural language processing toolkit for extracting structured regulatory information and for facilitating schema generation and policy automation.7 These initiatives underscore the potential for structured schemas to significantly enhance policy and regulatory compliance workflows, providing practical insights that inform GovSCH’s structural decisions.

Infrastructure-as-code (IaC), documented in the research literature by 2013, grew out of early configuration‑management tools like CFEngine and evolved into declarative provisioning platforms like Terraform and AWS CloudFormation.8 IaC’s proven capabilities include rapid rollback, immutable infrastructure, and consistent environment replication. Security extensions like Compliance‑as‑Code leverage policy engines such as Open Policy Agent to scan infrastructure manifests against rules. Together, these successes set the precedent for policy-as-code.
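The idea of scanning an infrastructure manifest against rules can be illustrated with a minimal sketch. The example below is plain Python, not Open Policy Agent or its Rego language. The manifest structure, resource fields, and rule identifiers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical manifest: a plain dictionary standing in for a parsed
# infrastructure template. Resource and field names are illustrative only.
MANIFEST = {
    "resources": [
        {"name": "logs-bucket", "type": "storage_bucket",
         "encrypted": True, "public": False},
        {"name": "scratch-bucket", "type": "storage_bucket",
         "encrypted": False, "public": True},
    ]
}


@dataclass(frozen=True)
class Rule:
    rule_id: str                    # hypothetical identifier for the rule
    description: str                # human-readable statement of the policy
    check: Callable[[dict], bool]   # returns True when a resource complies


# Each rule applies only to storage buckets; other resource types pass.
RULES = [
    Rule(
        "R1",
        "Storage buckets must be encrypted",
        lambda r: r.get("type") != "storage_bucket" or r.get("encrypted") is True,
    ),
    Rule(
        "R2",
        "Storage buckets must not be public",
        lambda r: r.get("type") != "storage_bucket" or r.get("public") is False,
    ),
]


def scan(manifest: dict, rules: list) -> list:
    """Return (resource name, rule id) pairs for every failed check."""
    violations = []
    for resource in manifest.get("resources", []):
        for rule in rules:
            if not rule.check(resource):
                violations.append((resource["name"], rule.rule_id))
    return violations


if __name__ == "__main__":
    for name, rule_id in scan(MANIFEST, RULES):
        print(f"{name} violates {rule_id}")
```

Because the rules are data kept separate from the manifest, they can be versioned and reviewed like any other code, which is the property that policy-as-code borrows from IaC.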

NIST’s OSCAL provides layered JSON and XML models, including Catalog, Profile, and Component models as well as models for the System Security Plan (SSP), Security Assessment Plan (SAP), Security Assessment Report (SAR), and Plan of Action and Milestones (POA&M), to streamline control assessments.9 Studies report a 30-percent reduction in FedRAMP package preparation time when OSCAL is adopted; however, OSCAL still defers to external policy documents for provenance, which creates a traceability gap.10

Further supporting governance documentation standardization, several metadata-focused initiatives exist. The National Information Exchange Model (NIEM) and NIEMOpen initiatives offer a common vocabulary and interoperable schemas for structured data exchange across government agencies.11 Similarly, the not-for-profit organization MITRE developed governance metadata standards to structure policy documents and consent forms in health and research governance. Additionally, the “Legislative Recipe: Syntax for Machine-Readable Legislation” provides a formalized approach for translating natural language legislation into semantic logic models, enhancing legislative clarity and enforceability.12 These efforts offer robust examples of governance document structuring, demonstrating methodologies that GovSCH can integrate or adapt.

Cross-domain applications of metadata schemas further illustrate versatile approaches to schema design and interoperability. For example, researcher Elena Makisha explored translating construction regulations into machine-readable formats to automate compliance verification in building information modeling.13 Such implementations highlight schema interoperability and automation, principles central to GovSCH’s approach.

In the cybersecurity-specific domain, although not explicitly targeted at policy documents, the Open Cybersecurity Schema Framework (OCSF) has successfully developed a standardized schema for cybersecurity telemetry and event data, showing how community-driven schemas can achieve widespread adoption.14 OCSF is a practical example of schema openness, standardization, and industry-wide collaboration, which are attributes GovSCH aims to embody.

These initiatives highlight diverse ongoing efforts and provide context for understanding GovSCH’s unique position: synthesizing these varied methodologies into comprehensive schemas tailored to executive orders, structured frameworks, and international regulations.

Citations
  1. Monica Palmirani, Roger Sperberg, Grant Vergottini, and Fabio Vitali, eds., Akoma Ntoso Version 1.0 Part 1: XML Vocabulary (OASIS Open, 2018), source; Fabio Vitali, Monica Palmirani, Roger Sperberg, and Véronique Parisse, eds., Akoma Ntoso Version 1.0. Part 2: Specifications (OASIS Open, 2018), source.
  2. Peter Hunn, ed., et al., A Structured Data Format for Digital Contracts in the UK: Legal Schema (LawTech UK, 2023), source.
  3. Semantics of Business Vocabulary and Business Rules, Version 1.5 (Object Management Group, 2019), source.
  4. Delaram Golpayegani et al., “AI Cards: Towards an Applied Framework for Machine‑Readable AI and Risk Documentation Inspired by the EU AI Act,” arXiv, June 26, 2024, source.
  5. Aman Sardana, Swaminathan Sethuraman, and Priya Dharshini Kalyanasundaram, “Compliance‑as‑Code 2.0: Orchestrating Regulatory Operations with Agentic AI,” Artificial Intelligence General Science 5, no. 1 (2024): 546–563, source.
  6. Vladimir Ershov, “A Case Study for Compliance as Code with Graphs and Language Models: Public Release of the Regulatory Knowledge Graph,” arXiv, February 3, 2023, source.
  7. Michael J. Bommarito II, Daniel Martin Katz, and Eric M. Detterman, “LexNLP: Natural Language Processing and Information Extraction for Legal and Regulatory Texts,” arXiv, June 10, 2018, source.
  8. Waldemar Hummer, Florian Rosenberg, Fábio Oliveira, and Tamar Eilam, “Testing Idempotence for Infrastructure as Code,” Lecture Notes in Computer Science 8275 (2013), source.
  9. “OSCAL: Open Security Controls Assessment Language,” source.
  10. “The Challenges OSCAL NOW Addresses and Solves,” Pathways Consulting Group, April 9, 2024, source.
  11. NIEM Metamodel and Common Model Format (NIEMOpen, 2024), source.
  12. Megan Ma and Bryan Wilson, “The Legislative Recipe: Syntax for Machine‑Readable Legislation,” arXiv, August 19, 2021, source.
  13. Elena Makisha, “Features of Regulation Document Translation into a Machine‑Readable Format within the Verification of Building Information Models,” CivilEng 4, no. 2 (2023): 373–390, source.
  14. “Categories,” Open Cybersecurity Schema Framework, source.