Section Two: Building Open Source Software

Depending on the size, mission, and complexity of your government organization, you likely use custom-built software systems. Maybe your office runs a custom database and processing system for a benefit application, inventory and supply chain system, or public-facing website or mobile app. In this section, we advocate for and provide concrete guidance on writing your software in the open, as an open source system. By opening your source code to the public, providing clear documentation, and allowing others to reuse it for their needs, you can instill best practices, receive collaborative support on your systems, and share your knowledge with other jurisdictions.

Working in the Open

When we talk about “working in the open,” we specifically mean publicly publishing your work on software projects, including existing drafts, future progress, and other work products. This is different from publishing a single finalized version late in development, or once the software is determined to be complete. Working in the open means performing the actual work (the individual code changes, code reviews, discussions, project management, and more) in the open, for anyone to see. This type of process has many benefits, such as improving transparency, accountability, and collaboration. With the exception of a small number of cases (e.g. building software for a soon-to-be-released policy), software for the public benefit should always be open source.

Through our research we found many lists of good practices for working in the open, such as this list produced by GOV.UK. Here, we present a select overview of good practices to follow when working in the open, as well as key considerations.

Publicly Show Progress

When working in the open, all progress should be public. This includes progressive and iterative changes to the code, as well as discussions about changes, evolutions to documentation, and potential security flaws and bugs.

The source of truth for the software should be a publicly available code repository, hosted on a service such as GitHub. These kinds of services allow simple online access to the code repository, and generally support a wide range of other features, such as access control management, automated security scans, and more. For example, with access control features, you can ensure that only approved developers are able to submit code changes to the codebase, or to approve proposed changes from third parties. Changes to the code should be made as small, publicly posted modifications (or “commits”) to that codebase. Each change should be reviewed by an owner or maintainer of the codebase in a code review, and that review should also be public.

By publicly showing progress, the software development team can ensure full transparency, build a historical record of modifications and decisions, and allow others to learn from the progression.

Opening More than Code

As the name implies, OSS’s source code must be publicly available. But stopping there would fall short of following open source principles. Teams should publicly share their project work and documentation. Project plans and roadmaps should be made available so that contributors and the public can see the long-term plan for the software. Research used to design the service should be made available to increase transparency on the decisions for the service. Even more granular project documentation, such as user stories, testing, and project discussions can be made public to provide contributors the information they need to understand how they can best support the software development and maintenance. For good examples of this, see CA.gov’s, New Jersey’s Ask a Scientist repository, and Canada’s VAC Find Benefits and Services documentation.

Separate the General Solution

For your software to be most useful to others, it needs to be designed to solve a generalized problem. Designing your software in this way means structuring the architecture of the code so it can easily be modified for other uses. Keep the parts of the code that are specific to your context separate from the portions of the software that can address the general problem that other jurisdictions may face. For example, if you have software that sends text reminders to certain city staff, separate out the messages the software sends from its general ability to send messages. Developing your software in this way allows other people and organizations to take the code and reuse it for their purposes easily, without having to make significant modifications.
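The text-reminder example above can be sketched in code. This is a minimal illustration, not a prescribed design: the names `Reminder`, `send_reminders`, `CITY_REMINDERS`, and `print_transport`, along with the phone numbers and message text, are hypothetical. The point is only that the message content lives apart from the general sending logic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Reminder:
    """One context-specific message and the number that should receive it."""
    phone_number: str
    text: str


def send_reminders(reminders: list[Reminder], send: Callable[[str, str], None]) -> int:
    """General-purpose part: deliver every reminder through the given transport.

    Nothing here knows about a particular city, department, or message wording.
    Returns the number of reminders handed to the transport.
    """
    for reminder in reminders:
        send(reminder.phone_number, reminder.text)
    return len(reminders)


# Context-specific part (hypothetical data). In a real project this would
# likely live in a separate configuration file that another jurisdiction
# can replace without touching send_reminders.
CITY_REMINDERS = [
    Reminder("+15550100", "Timesheets are due Friday at 5 pm."),
    Reminder("+15550101", "Safety training starts Monday at 9 am."),
]


def print_transport(phone_number: str, text: str) -> None:
    """Stand-in transport that prints instead of calling an SMS provider."""
    print(f"to {phone_number}: {text}")
```

Calling `send_reminders(CITY_REMINDERS, print_transport)` prints both messages. Another organization could reuse `send_reminders` unchanged by supplying its own list of reminders and its own transport function.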

Managing Sensitive Information

Most software systems contain sensitive information that should not be accessible to the public. Even in closed-source systems, this information should be separated from the codebase as a best practice to ensure the security of the information. Developing in the open more heavily incentivizes this essential security practice.

User data, also frequently referred to as personally identifiable information (PII), should never be included in the source code of any application, especially open source applications. However, sensitive information is more than just user data. Sensitive information also includes passwords, credentials, keys, and certificates: effectively, any information that would give data access to unapproved parties. This type of information is generally referred to as “secrets,” and there are many best practices to follow to manage these secrets. Software should never have passwords written directly into the code. Instead, the best practice is to keep secrets separate from the source code in tools built for this purpose (such as HashiCorp Vault), with the running application given access to this information only when necessary. In this way, even when the source code is public, the public cannot access sensitive information. By keeping secrets and user data out of your source code, you easily and drastically reduce any risk of data exposure when using open source software.
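As one possible illustration of keeping secrets out of the code, the sketch below reads a password from an environment variable at run time. It assumes the deployment environment or a secrets manager sets that variable. The variable name `BENEFITS_DB_PASSWORD`, the user `app_user`, and the helper names are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not provided to the running application."""


def get_secret(name: str) -> str:
    """Return the secret stored in the environment variable `name`.

    The value is supplied by the deployment environment (or by a secrets
    manager that populates it), never by the source code.
    """
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"Required secret {name!r} is not set")
    return value


def build_database_url(host: str, database: str) -> str:
    """Assemble a connection string; only the password comes from a secret."""
    # BENEFITS_DB_PASSWORD and app_user are hypothetical names for illustration.
    password = get_secret("BENEFITS_DB_PASSWORD")
    return f"postgresql://app_user:{password}@{host}/{database}"
```

Because the repository contains only the variable name, publishing this code reveals nothing about the password itself. Failing loudly when the secret is missing also avoids silently falling back to a default credential.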

This simple best practice maintains a separation between the public source code and the private secrets. By separating these from each other, you can publicly open your source code without any worry of undue access to sensitive data or systems. When opening source code publicly, take care to ensure that this sensitive information is not included in the project's version history. If it is, there are two ways to resolve this:

  1. Change all of the secret information used. Reset these secrets, such as passwords or credentials, so that old versions no longer allow access to key data. Do not include these new secrets in the source code, but instead separate them out. This fully ensures that the risk has been mitigated, as long as the new secret information is not made public. This is the simplest and most surefire way to secure your system after secret information has been added to the codebase.
  2. Remove all mentions of the secret information from the codebase, including from the historical record of the codebase. Version control systems (which are highly recommended) store the full history of the codebase, and simply removing a password from the code will not prevent people from looking into the history to find it. This can be an arduous but critical task. If even one instance is forgotten, it can lead to vulnerabilities and breaches.
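Either approach starts with knowing where secrets appear in the history. The sketch below is a rough illustration of how one might search saved history output for likely secrets. It assumes the full patch history has been exported to text, for example with `git log -p --all`. The three patterns are illustrative only and far from exhaustive, and the function name is hypothetical; dedicated secret-scanning tools use much larger rule sets.

```python
import re

# Illustrative patterns only; real scanners use far larger rule sets.
SECRET_PATTERNS = {
    "password assignment": re.compile(r"(?i)\bpassword\s*[=:]\s*['\"][^'\"]+['\"]"),
    "private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secret_candidates(history_text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each line that looks like a secret.

    `history_text` is expected to be the exported patch history of a
    repository, so deleted lines are searched as well as current ones.
    """
    findings = []
    for line_number, line in enumerate(history_text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings
```

A match is only a candidate that a person should review. Any confirmed secret should still be reset as described in the first option, since it may already have been copied.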

Utilizing Open Source Communities

While working in the open is a critical part of open source software, creating a truly open source solution also includes building and managing a community. This community includes contributors, partners, maintainers, and anyone copying the code and repurposing it for their use. Building and managing a community takes work, but it provides an opportunity for expanded support and collaboration. Aside from transparency and distributed lessons, a healthy community may contribute to the source code, improving its quality or creating entirely new features. For example, Mozilla developed a broad community around their open source form-generating tool, which the U.S. Digital Service modified and repurposed to create the U.S. Forms System.

Creating a community around an open source government tool ensures that governments from anywhere can engage with the software and find support to help them adapt it to their needs. When you open source your own projects, you openly invite other people and organizations to use and contribute to your source code. To the extent that people outside of your organization contribute to the project, they do so as coordinated volunteers. You still ultimately own and maintain the system, and while there is not any expectation that public contributors will be compensated for contributions, it does take time and effort to create an environment where a community can thrive.

The most critical portion of building a community is understanding what you, as the owner and maintainer of the software, want from the community. For example, you may want the public to see your work for accountability, but may not want code contributions. Alternatively, you may be looking to offer your code to other jurisdictions with similar problems. Whatever the reason may be, making a decision up-front will greatly inform how you actually go about building and managing the community around your open source software, including how you inform others of your work.

As you start building and engaging with a community around your project, you will be publicly attached to the code, which is a good thing! It creates direct transparency and accountability, and allows the public to recognize you and your team as the maintainers and experts for this product. It also improves your ability to publicly represent the project by providing a clear connection between the public and the maintainers of the service. Structures should be put in place to manage that interaction between the public and the maintainers, and they should provide the public with a sense of community engagement. By building an effective community, you give the public a place to engage with services they depend on.

Building a Community

Building a community of contributors and enthusiasts incorporates practices that relate to building communities of any kind. People need to be able to find your community and understand what it is and how they can benefit from the code or application. People who join should be able to clearly understand the software or service you are building, and whether it might be useful to them. This clarity will allow the public to self-select whether to join the community, or move on. GitHub provides a great guide to building a community that we recommend. Below, we provide select considerations for building an open source community.

Make your work visible and discoverable. This step applies regardless of whether you’re looking to add contributors to a project, or just to share the project with other jurisdictions. Some marketing is needed to let your targeted audience know about your open source project. As with any communication and marketing, determine your target audience first and find ways to inform them. This can be via direct contact (e.g. emails, posts in LinkedIn groups or other software or development communities like GitHub and Stack Overflow) or indirectly (e.g. gaining placement in articles in news sources or on social media platforms). One of the most difficult things to navigate in the open source ecosystem is the discoverability of projects or finding a project that fits your needs. As the producer of an open source project, you’ll need to put effort into making your project easier to discover by tagging it on open source code hosting services, publishing blog posts about it, and promoting it via your social and personal networks.

If you want your community to contribute and support the codebase, provide them with opportunities to engage once they join. For example, let the community know how to stay up-to-date with the project, whether through an email list, a public discussion forum, or a Twitter account. Inform them about the current state of the project, and any upcoming releases. Be clear about how the project will be maintained and updated, and how they, as contributors, can take part in it. For a great example of building an inclusive code repository, see Microsoft’s GitHub repository for their application VSCode.

Set expectations upfront on how the project will be run. Clearly articulate the various processes and statuses for the project. The public should be able to see how often the project will be updated, what processes are used to manage contributions, what communication channels you will use, and even expected timelines for the project’s lifecycle. This type of communication and clarity will provide contributors and the public everything they need to know to be able to understand and effectively engage with the project.

There should never be an expectation that the community will complete work that you need for your product. Open source communities are not sources of free labor. Instead, the community can provide insight and updates to the codebase at their own pace, which will be proportional to how easily they can engage with the project.

Prioritizing Documentation

Clear documentation is key to building an effective open source project. Good documentation lays out the purpose and plans of the project in plain language, and offers guidelines for engagement, contributions, licenses, and other necessary elements of being a member of the community. It also outlines the source code itself, describing in prose the structure and organization of the software. With good documentation, your project will be easily understood by anyone who comes across it.

Documentation must be simple enough that entry-level contributors can understand the project and how to get involved. Detail the expected roadmap, so that contributors and users can see the plans for future modifications and improvements. Other forms of project documentation, such as research or short-term project progress, should be available for contributors and the public to see as well. For good references on projects that have prioritized documentation, see the Atom repository and the previously mentioned VSCode.

It is critical to have documentation that explains the source code itself. In most cases, this is documentation provided for contributors, maintainers, or the public to refer to when using the software, such as integration or general usage instructions. Documents that describe the design of the source code are also helpful for people looking to understand and reuse the code itself. High-level documentation should be included to describe the structure and architecture of the code. More fine-grained documentation should be in-line with the code itself.
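To illustrate the difference between high-level and in-line documentation, here is a small sketch. The benefit rules, income limits, and names are invented for the example and do not describe any real program.

```python
"""Eligibility checks for a hypothetical benefit application.

High-level documentation: this module holds the rules that decide whether an
applicant qualifies. Income limits are kept in one table so that another
jurisdiction can replace them without changing the checking logic.
"""

# Annual income limit by household size (invented figures for illustration).
INCOME_LIMIT_BY_HOUSEHOLD_SIZE = {1: 20000, 2: 27000, 3: 34000}


def is_eligible(household_size: int, annual_income: int) -> bool:
    """Return True when income is at or below the limit for the household size.

    Households larger than the biggest listed size use the largest limit.
    Raises ValueError if household_size is less than 1.
    """
    if household_size < 1:
        raise ValueError("household_size must be at least 1")
    # Clamp to the largest configured size so big households still get a limit.
    largest_size = max(INCOME_LIMIT_BY_HOUSEHOLD_SIZE)
    limit = INCOME_LIMIT_BY_HOUSEHOLD_SIZE[min(household_size, largest_size)]
    return annual_income <= limit
```

The module docstring gives a newcomer the overall design, the function docstring states the contract, and the in-line comment explains the one step that is not obvious from the code alone.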

Lastly, documentation on how the source code project is run should be available as well. For example, documentation should clearly explain how code reviews are performed on contributions, how third-party contributors can become maintainers, what coding practices are expected of contributors, and how members of the community are expected to conduct themselves. These forms of documentation provide clarity and stability, and surface critical decisions before they are needed.

Managing Contributions to Open Source Projects

One of the core benefits of building a community is receiving collaborative contributions. These contributions may be bug fixes, suggested features, potential use cases, redesign specifications to allow for more generic application of the software, or other types of modifications. Depending on how you build your community, contributions may even come from people who are not software engineers—for instance, translations of texts into different languages.

It’s a best practice for open source projects to include a contributor guide, clearly labeled so people can easily find it. Contributor guides let contributors know what is expected of contributions. They also outline the process to contribute, making it simple for contributors to engage with the open source project in an organized fashion. For example, contributor guides may require that all code adheres to a consistent coding style. GitHub provides useful advice for building contributor guides, and the Atom and OpenGovernment code repositories provide good examples to follow as well.

As mentioned previously, even if code is open source, that doesn’t mean anyone can modify the code directly. Instead, people can suggest modifications, which may be pulled into the codebase by the owners after a review. Code repository hosting tools (such as GitHub) have simple access controls that let you manage who is allowed to contribute to the codebase, and what processes they must follow to submit their contributions. You should clearly communicate the policies for how contributions are managed in your open source project, so that contributors understand what processes to follow and can submit contributions in a way that is easiest for you to manage. For example, you may require that contributors create their own copy of the codebase (generally called a “fork”), make their modifications on that copy, and then request that the copy be merged into the original codebase pending your review. The OpenGovernment repository has a good example of simple contributor guidelines.

Contribution policies should be outlined in a contributor guide included in the codebase and documentation. As the owner and author of the software, you will fill the role and responsibilities of the “maintainer” for the project. Maintainers are the main group of software developers who manage the codebase. Maintainers generally have full access to make changes to the codebase, and also manage access controls for other types of contributors. Maintainers are also responsible for reviewing and accepting contributions from the community. Code reviews should be thorough, and allow for back-and-forth conversation between the maintainers and the contributors. Provide feedback to contributors to let them know what needs to change before their proposed change is accepted into the main codebase.

Licensing Considerations

When releasing open source software, you should always release it under an explicit license. Software licenses lay out the terms under which the software can be used by other people or entities. A license is legally binding, and protects you as the author from unauthorized use of the software you create. It also protects members of the community by clearly outlining permitted uses and granting permission for use broadly so that they don’t need to seek individual deals with you. Note that open source licenses don’t eliminate copyright, but work around copyright to provide the software as open for all. Ownership and copyright are covered in further detail later on in this report.

Different licenses permit different usage. For example, you might choose a license that allows the software to be used, repurposed, or modified freely, as long as attribution is given to you. Opensource.com has a good breakdown of the different kinds of attributes of software licenses.

The best practice for open source government software is to release under the most permissive licenses available. Many government organizations use the MIT license, which effectively gives full permissions to any parties to modify, use, redistribute, or perform other activities with the source code. The MIT license is popular because of its broad permissibility, as well as its brevity. Other organizations use the Creative Commons Zero license, which puts the software in the public domain. Public domain commonly refers to creative materials not protected by intellectual property laws such as copyright, trademark, or patent laws. The Open Source Initiative maintains a list of licenses that they have reviewed and approved, which can be helpful when determining which license or licenses you will use for your software.

Starting Open Versus Becoming Open

As you move to adopt open source practices across your projects, you will find that there are differences between new projects that use open source from the start, and existing projects and systems that need redevelopment to become open. Starting new projects in an open source discipline ensures best practices up front, enabling collaboration and providing transparency at the outset. Making existing software and services open source allows others to learn from your work and reuse the components you have already built. In addition, allowing contributions to existing projects can bring in additional features or fixes, and infuse the best practices of documentation and security into the project.

Starting New Projects in the Open

The best time to make a project open source is right from the start. This brings all of the benefits of producing open source to your project immediately, such as ensuring secure designs and simplifying collaboration. It saves time by building for security and public availability as you go along, rather than having to lump it all together at the end.

Starting projects as open source encourages better documentation from the start, which enables the rapid onboarding of new contributors or team members as the project grows. It often results in better designed software, since OSS is typically more modular so that it can be easily understood by contributors. This decreases the complexity of the software and improves the ease with which the system can be built and ultimately maintained.

Opening Existing Projects

It may be the case that you have existing custom-built, proprietary software supporting your services. Reengineering existing software so it is open source still provides the benefits we have discussed. It also provides an opportunity to revisit the software and improve its security, design, and reusability. There are valuable lessons in projects that have already seen real-world use, and making these projects open source can save time and effort on future projects for you, your organization, and for others around the world.

When making existing projects (sometimes called legacy solutions) open source, there are steps you can take to ensure a smooth transition. We discuss some of these in other parts of the report, but we have consolidated them here as an outline for moving legacy solutions into the open. We also recommend reviewing two documents written by GOV.UK: a case study about moving one of their projects to the open, and a blog post on how to open up closed code.
