In this section, we’ll explain the role of open source licenses, describe the most common types of licenses and give guidance on how to choose the right license for a particular situation. We’ll also cover the basics of intellectual property and copyright in software, as these are key concepts to understanding licensing in general.
By the end of this section, you should be able to:
Intellectual property is a core concept that you need to understand in order to make informed choices about open source and other software license types. There are several categories of intellectual property, as listed below:
For our purposes in this course we’ll focus on copyright and patents, the areas most relevant to Open Source license compliance.
Copyright is one of the two critical elements (patents are the other) that inform open source license compliance. Here are some basic elements of copyright:
There are some important rights related to copyrights with software. How these rights are granted relates to licenses (which we will cover shortly). Specifically, the relevant rights, which vary by jurisdiction, are:
It’s important to note here that the interpretation of what constitutes a “derivative work” or a “distribution” is subject to debate in the Open Source community and within Open Source legal circles, so this is an area that will continue to evolve over time.
Patents are also an important area that can have a significant bearing on open source compliance (depending on license type, which we will also cover a bit later).
Some critical elements of patents include:
It’s critical to note that patent infringement may occur even if other parties independently create the same invention or software.
While we will cover more detailed aspects of licenses shortly, it’s important to have some basic understanding of what licenses do and what they provide.
Armed with the information from the preceding pages, you should have a basic understanding of what licenses are used for. Let’s take a look at the overall license landscape (including closed source licenses) below:
This diagram gives a general overview of both Open Source and Closed Source licenses. While we will dive into more detail on the open source license types shortly, it’s good to get a perspective of the different types of licenses generally available.
On the open source side, licenses generally fall into two main categories:
Permissive
These licenses only impose minimal requirements on what you must do when redistributing the software. Those requirements are typically limited to things like retaining or delivering attribution notices.
Copyleft/Reciprocal
Copyleft licenses are sometimes called protective or reciprocal licenses. They have requirements for how the software can be redistributed, as well as requirements that may impact how derivative works can be distributed, such as requiring release of all changes/enhancements you may make to the software.
An important resource to bookmark is the Open Source Initiative (https://opensource.org/), the organization responsible for tracking and vetting approved open source licenses. There is a lot more detail on its website about the definition and types of open source licenses.
As mentioned earlier, permissive licenses generally place the fewest restrictions on what you must do if you make changes and redistribute the software. For that reason, they are generally (but not always) on pre-approved lists within companies that specify what open source software license types can be consumed by engineers in the organization.
Let’s take the example of the BSD-3-Clause license. This license is an example of a permissive license that allows unlimited redistribution of changes for any purpose in source or object code form as long as its copyright notices and the license’s disclaimers of warranty are maintained.
However, the license contains a clause restricting use of the names of contributors for endorsement of a derived work without specific permission.
Other examples of permissive licenses include: MIT, Apache-2.0.
Some licenses require that if derivative works (or software in the same file, same program or other boundary) are distributed, the distribution is under the same terms as the original work.
This is referred to as a “copyleft” or “reciprocal” effect and can have important consequences if your derivative work is, for example, a highly proprietary piece of software used to provide a unique advantage for your company. In some cases, this could require you to release the source code of a proprietary work that is combined with open source software to anyone to whom you distribute the compiled or binary version of the work.
Here is an example of license reciprocity from the GPL version 2.0:
You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed […] under the terms of this License.
Other licenses that include reciprocity or Copyleft clauses include all versions of the GPL, LGPL, AGPL, MPL and CDDL. You can see more details for these at https://opensource.org/licenses.
License compatibility is the process of ensuring that license terms do not conflict, and this can get particularly challenging as many pieces of software, including ones that are internally developed, are likely to build on each other and be built from pieces of software with different license types.
Here are some examples:
In this case, the distributor cannot satisfy both conditions at once so the module may not be distributed, creating an example of license incompatibility.
Remember that the definition of “derivative work” is subject to different views in the Open Source community and its interpretation in law is likely to vary from jurisdiction to jurisdiction, so it’s important that you check with the appropriate resources to make that determination for your particular case.
Notices, such as text in comments in file headers, often provide authorship and licensing information. Open Source licenses may also require the placement of notices in or alongside source code or documentation to give credit to the author (an attribution) or to make it clear the software includes modifications.
For example:
Copyright © A. Person (2020)
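A notice like this is often paired with a machine-readable license tag in the file header. The sketch below shows one possible way to generate such a header. The `build_header` helper and the `#` comment prefix are assumptions made for illustration; the `SPDX-License-Identifier` tag follows the SPDX short-identifier convention.

```python
def build_header(holder: str, year: int, spdx_id: str, prefix: str = "#") -> str:
    """Return a file-header notice with an attribution line and a license tag."""
    lines = [
        f"Copyright © {holder} ({year})",
        f"SPDX-License-Identifier: {spdx_id}",
    ]
    # Prefix every line with the comment marker of the target language.
    return "\n".join(f"{prefix} {line}" for line in lines)


print(build_header("A. Person", 2020, "BSD-3-Clause"))
```

For languages with a different comment syntax, pass another `prefix`, such as `//`.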
There are some cases where a copyright owner may choose to offer the code under multiple licenses, a practice referred to as “multi-licensing.”
As an example, software could be “dual licensed,” with the copyright owner giving each recipient the choice of two licenses.
It’s important to note that this should not be confused with situations in which a licensor imposes more than one license, and you must comply with all of them.
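The difference between a choice of licenses and a set of licenses that all apply can be sketched in code. The example below is a minimal illustration, assuming a flat list of license identifiers joined by a single operator ("OR" for a choice, "AND" for cumulative licenses) and a hypothetical company policy of approved licenses. It is not a full license-expression parser.

```python
from typing import List, Set


def is_acceptable(operator: str, licenses: List[str], approved: Set[str]) -> bool:
    """Check a flat license choice ("OR") or combination ("AND") against a policy."""
    if operator == "OR":
        # Dual/multi-licensing: the recipient picks one license,
        # so a single approved option is enough.
        return any(lic in approved for lic in licenses)
    if operator == "AND":
        # Several licenses imposed together: all of them must be complied with.
        return all(lic in approved for lic in licenses)
    raise ValueError(f"unknown operator: {operator}")


approved = {"MIT", "Apache-2.0"}  # hypothetical company policy
print(is_acceptable("OR", ["GPL-2.0-only", "MIT"], approved))   # True
print(is_acceptable("AND", ["GPL-2.0-only", "MIT"], approved))  # False
```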
With all of the information presented so far, it might seem daunting to figure out how to choose the right license for open source code that you want to consume in your organization, eventually contribute back features or changes to, or to create a brand new open source project of your own.
Thankfully, there are some common questions to ask and processes to follow to help you make an informed choice. Here’s a general overview:
When contributing to an existing project, unless the project uses a contribution mechanism with a different inbound license, the common practice is to make your contributions under the terms of the license the project as a whole is governed by.
When contributing to or creating a new project, it is very important to clarify the things that someone using the code is required to do (must), what they are permitted to do (can), and what they are forbidden to do (cannot). The license selected is your way of specifying this information. By choosing a standard and commonly-used open source license, you help to make it easier for everyone else to understand what their rights and obligations are.
Properties to Consider
When choosing a license, it is important to be clear on your goals for releasing the code. Who (what types of people/organizations) do you want to adopt the code? Do you want to see any changes people make to your code when they redistribute it? Do you want other people to be able to sell your code for a profit?
You should also consider the following common properties:
The list on the page contains some common questions whose answers you should understand before making your code public, so that you can choose a license that reflects them. This can be a daunting task, but in the last couple of years websites have been created to help with it; they are listed in the next chapter.
Below are a few popular sites that discuss the types of licenses and properties to consider in selecting a license for your code, or other creative works. Their purpose is to help you choose a license, and explain more of the background behind some of the options.
Source Code
Open Source Licenses by Category from the Open Source Initiative lists the approved open source licenses.
Choose an Open Source License is sponsored by GitHub. It walks you through the properties you must consider, helping you decide what license makes sense.
License text sometimes gets treated as “too long, didn’t read” - denoted TL;DR. tl;dr Legal is trying to clarify the legal text into standard properties. The website creators work with volunteer lawyers to classify and color code the properties associated with specific licenses to help you navigate more easily and better understand the existing licenses.
This is a very useful tool for understanding the terms of some of the common licenses. For example, for Mozilla Public License version 1.0, you cannot hold the contributors liable, but if you use it, you must include copyright, license, state any changes, and disclose the source.
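The must / can / cannot grouping lends itself to a simple data structure. The sketch below is one hypothetical way to record such a summary and check work against it. The class and function names are invented for illustration, and the entries are limited to the properties named in the example above; it is a simplification, not a legal interpretation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LicenseSummary:
    """Plain-language summary of a license, grouped as must / can / cannot."""
    name: str
    must: frozenset = field(default_factory=frozenset)
    can: frozenset = field(default_factory=frozenset)
    cannot: frozenset = field(default_factory=frozenset)


# Only the properties mentioned in the text are recorded; "can" is left empty
# because the example does not list any permissions.
mpl = LicenseSummary(
    name="Mozilla Public License version 1.0",
    must=frozenset({"include copyright", "include license",
                    "state changes", "disclose source"}),
    cannot=frozenset({"hold contributors liable"}),
)


def outstanding(summary: LicenseSummary, done: set) -> frozenset:
    """Return the obligations that have not been satisfied yet."""
    return summary.must - done


def violations(summary: LicenseSummary, planned: set) -> set:
    """Return planned actions that the license forbids."""
    return planned & summary.cannot


print(sorted(outstanding(mpl, {"include copyright", "include license"})))
```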
Various Licenses and Comments About Them from the Free Software Foundation’s Licensing and Compliance Lab provides a description of many licenses and comments about them.
Other Creative Work
Creative Commons Licenses help you understand license options for images and documentation. An example of this is the CC-BY-SA 4.0 license. We encourage you to click the link to the creative commons site, and read the legal code file. It specifies the attribution, share-alike, and other properties associated with this license.
Additional Resources
In addition, you can take a look at this online resource: Open Source License Compliance Handbook from Jilayne Lovejoy and FINOS (The Fintech Open Source Foundation). The handbook provides “self-serve” information to help users and redistributors of open source software understand the specific requirements for complying with various licenses.
The SPDX License List is another useful resource for identifying licenses. It provides a curated catalog of commonly-seen licenses used in publicly-distributed software. Not all of the licenses are necessarily open source; the License List indicates which ones have been approved by the Open Source Initiative and which have been listed as free/libre by the Free Software Foundation.
The License List does not include interpretations of licenses. Rather, it can be useful when searching for the license text that corresponds to a particular license name or identifier. The License List contributors also maintain versions of the license texts with markup for certain sections of the license text that are considered replaceable while still being substantially the same license, which can be useful when automating detection of license notices in source code.
If you are looking for guidelines on how to structure the license information in your project, we recommend you consult the REUSE Software guidelines from the Free Software Foundation Europe. They provide detailed examples of how to add identifiers and the full text of licenses to projects, as well as scripts to check for compliance with the guidelines.
If you’re trying to use a license that is not on the SPDX License List, they also have good recommendations on how it should be documented so that tools can find the information.
In this section, we will provide information on the background of an effective compliance program, and how to build and staff such an activity, including the importance of engineering leadership and legal partnerships.
By the end of this section, you should be able to:
While the word “compliance” may seem overbearing or scary, here it can be broken down into two very concrete goals:
Know your obligations
You should have a process for identifying and tracking Open Source components that are present in your software.
Satisfy license obligations
Your process should be capable of handling Open Source license obligations that arise from your organization’s business practices.
While there are of course many details embodied in these two goals (more on that later), it’s important to keep in mind that all of an organization’s process decisions around compliance can be traced back to these two overarching goals.
Since obligations in the compliance domain are important, it’s key to understand what obligations must be satisfied.
Depending on the Open Source license(s) involved, your compliance obligations may consist of:
Attribution and Notices
You may need to provide or retain copyright and license text in the source code and/or product documentation or user interface, so that downstream users know the origin of the software and their rights under the licenses. You may also need to provide notices regarding modifications, or full copies of the license.
Source code availability
You may need to provide source code for the Open Source software, for modifications you make, for combined or linked software, and scripts that control the build process.
Reciprocity
You may need to maintain modified versions or derivative works under the same license that governs the Open Source component.
Other terms
The Open Source license may restrict use of the copyright holder name or trademark, may require modified versions to use a different name to avoid confusion, or may terminate upon any breach.
“Distribution” is something that triggers license obligations in many open source license cases. But what does distribution mean exactly? In general, “distribution” means dissemination of material to an outside entity. Sometimes this can be a challenging area, even in legal circles, but here are a few examples:
Distribution Events
Some licenses define the trigger event more broadly, to include permitting access to software running on a server (e.g., all versions of the Affero GPL, if the software is modified), that is, the case of “users interacting with it remotely through a computer network.”
Another major element of license obligations can happen when source code is modified, such as for fixing a bug you find or adding a new feature. Additionally, combining open source code with your own code, or even other open source components can have potential impacts.
Under some Open Source licenses, modifications may cause additional obligations upon distribution, such as:
Later in this module we’ll cover how to build the appropriate processes to track and manage the results of distribution and modification events.
It’s important to note that although open source compliance can occasionally be challenging, especially around modification or distribution, a properly built and functioning compliance program is straightforward and brings many benefits to your organization, such as:
Organizations that have been successful at Open Source compliance have created their policies, processes, training and tools to:
It’s very important to note that these policies, processes, training and tools need to provide oversight without becoming overbearing. You can theoretically build the best open source compliance program in the world, but if it’s too complicated and burdensome for the engineering teams to utilize, it most likely will not be utilized, or will be severely hampered by teams finding ways around the process.
Additionally, an Open Source compliance program should be tailored to the nature and requirements of your own organization. Every organization develops and builds software in a different way, and your organization may be able to comply with its license obligations without following the exact set of processes that are described here.
While we will go into more detail shortly, the main areas that you should be considering when building your compliance practices are:
An important consideration during open source review is how you plan to use the open source software component in question.
Common scenarios include:
We’ll cover these in more detail in the next several pages.
A developer may copy portions of an Open Source component into your software product.
Relevant terms include:
A developer may link or join an Open Source component with your software product.
Relevant terms include:
A developer may make changes to an Open Source component, including:
A developer may transform the code from one state to another.
Examples include:
In addition to human engineers performing some of these tasks, it’s important to note for compliance purposes that some development tools also can perform these functions behind the scenes.
For example, a tool may inject portions of its own code into output of the tool.
As mentioned earlier, it’s important to consider how a particular open source component will be distributed, specifically:
After Program/Product Management and engineers have reviewed proposed Open Source components for usefulness and quality, a review of the rights and obligations associated with the use of the selected components should be initiated.
A key element to an Open Source Compliance Program is the Open Source Review process. This process is where a company can analyze the Open Source software it uses and understand its rights and obligations.
The process includes the following steps:
Anyone working with Open Source in the company should be able to initiate an Open Source Review, including Program or Product Managers, Engineers, and Legal team members.
Note: The process often starts when new Open Source-based software is selected by engineering or outside vendors.
When analyzing your open source usage, you’ll need to gather information about the identity of the component in question, its origin, and how the component is intended to be used. This information may include:
Putting together a team to effectively run open source reviews requires participation from several stakeholders.
An Open Source Review team includes the company representatives that support, guide, coordinate and review the use of Open Source. These representatives may include:
The Open Source Review team should assess the information it has gathered before providing guidance for issues. This may include scanning the code to confirm the accuracy of the information.
Considerations include:
We will go into more detail in a later section on the different types of scanning tools and what criteria you should be considering for choosing a tool, but here is a general overview.
There are many different automated source code scanning tools, and all of the solutions address specific needs and - for that reason - none will solve all possible challenges. Because of that, most companies pick the solution most suited to their specific market area and product. In general, most companies try to use both an automated tool and manual review to spot check the results of scans.
One popular example of a freely available source code scanning tool is FOSSology, a project hosted by the Linux Foundation.
It’s important to note that the Open Source Review process crosses disciplines, including engineering, business, and legal teams. For maximum effectiveness, it should be interactive to ensure all those groups correctly understand the issues and can create clear, shared guidance.
The Open Source Review process should have executive oversight to resolve disagreements and approve the most important decisions.
It’s critical that the review process be treated as a cross-disciplinary activity in the organization, because simply characterizing it as “an engineering problem” or “a legal problem” not only diminishes its importance, but can have detrimental effects on both engineering productivity and legal risk.
Treating the process as a collaborative partnership does require more up-front work in getting all stakeholders on board, but pays dividends as your organization becomes more familiar with the end-to-end compliance management process.
Compliance management is a set of actions that manages Open Source components used in products. Companies may have similar processes in place for proprietary components.
Such actions often include:
Here is an example checklist that can be used as a basis for your own organization’s compliance management process.
Ongoing Compliance Tasks:
Support Requirements:
Here is a graphical overview of a typical enterprise compliance process for open source:
We’ll go through the important sections of this process in the next several pages.
The first step in the process is identification of the open source components in your code.
Here are the steps and outcomes expected during this phase:
After identification, code auditing takes place, with the following steps and outcomes:
Once the audit is complete, time needs to be allocated to resolving any issues that were spotted as part of the audit process. Steps and Outcomes include:
At this point, you’ll need to review the resolved issues to verify that the resolution matches your corporate open source policy.
Based on the results of the software audit and review in previous steps, software may or may not be approved for use. The approval should specify versions of approved Open Source components, the approved usage model for the component, and any other applicable obligations under the Open Source license.
Also, approvals should be made at appropriate authority levels (up to and including the executive review committee if necessary).
Once an Open Source component has been approved for usage in a product, it should be added to the software inventory for that product and the approval and its conditions should be registered in a tracking system.
The tracking system should make it clear that a new approval is needed for a new version of an Open Source component or if a new usage model is proposed.
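The rule that an approval is tied to a specific version and usage model can be sketched as a small registry. This is a minimal in-memory illustration; the class names, the component name and the usage-model labels are hypothetical, and a real tracking system would persist approvals along with their conditions.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Approval:
    component: str
    version: str
    usage_model: str  # hypothetical labels, e.g. "dynamic-link" or "modified"


class ApprovalRegistry:
    """Records approvals and reports when a new review is required."""

    def __init__(self) -> None:
        self._approved: Set[Approval] = set()

    def register(self, approval: Approval) -> None:
        self._approved.add(approval)

    def needs_new_approval(self, component: str, version: str, usage_model: str) -> bool:
        # An approval only covers the exact version and usage model it was
        # granted for; any other combination needs a fresh approval.
        return Approval(component, version, usage_model) not in self._approved


registry = ApprovalRegistry()
registry.register(Approval("libexample", "1.2.0", "dynamic-link"))
print(registry.needs_new_approval("libexample", "1.2.0", "dynamic-link"))  # False
print(registry.needs_new_approval("libexample", "1.3.0", "dynamic-link"))  # True
print(registry.needs_new_approval("libexample", "1.2.0", "modified"))      # True
```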
After registration, you’ll need to prepare appropriate notices for any Open Source used in a product release:
Prior to any software distribution, you’ll need to run a series of verification steps, including:
At this stage of the process, you’re ready to provide the accompanying source code to meet any license obligations specified by the license of the open source code used. You’ll need to make sure to:
In this final verification step, you’ll need to validate that you’ve complied with all appropriate license obligations by:
In this section, we will look in more detail at license compliance tooling, including providing context around what kinds of problems tooling will solve and what kinds of criteria you should be considering as you determine the best compliance tooling and scanning software for your organization.
By the end of this section, you should be able to:
As you’ve undoubtedly determined from the content in this module to this point, conceptually, compliance is fairly straightforward. The challenge comes in the form of the amount of software that’s available to use in the open source world and the variety of ways in which it can be combined with your own organization’s software.
Tracking and building an effective compliance process therefore requires tooling in various forms to help alleviate possible human error and speed up the process of achieving compliance. However, there are some important things to consider as you think about possible tooling:
As you can see, there are many different areas of compliance where tooling can be put to use. However, resist the urge to build all of this tooling as a monolithic stack. Determining what your biggest potential pain points are gives you the opportunity to build out tools in an open source/agile style (i.e., in an iterative fashion as your compliance needs grow).
As you think about tooling, it’s important to consider the different categories of software and situations you’ll be addressing. As noted above, you basically have three categories: Inbound, Your Own, and Outbound software to consider.
At a high level, you need to consider the cases noted above and what’s required as you think about your tooling needs.
For Inbound software, it’s critical that you document the actual situation (license type, obligations, etc.). For your own software, you need to exercise quality control in terms of understanding how and why you are linking to or calling open source packages. For the combined Outbound software, you need to understand what your deliverable looks like in the combined package of your software and the open source component and make sure you are abiding by all license obligations related to distribution.
There are several areas to consider as you analyze inbound software, not the least of which is that even inbound commercial software can itself contain open source (as part of its distribution). It’s also important to think about:
Figuring out the licenses for inbound software can be relatively easy, or potentially very challenging. It’s one of the reasons that tools can help a great deal. Here are some details on the easy cases and the more challenging ones.
Some licenses ask for copyright notice or author listing, resulting in an obligation to provide these, but as you can see below, sometimes parsing and untangling copyright notices can be problematic, so this is usually where software (and projects like SPDX, which we’ll cover a bit more later) come in.
Binaries are compiled applications, libraries, or other software that can be used without access to source code. Binaries can be part of an open source component distribution and can themselves contain open source.
The main issues here revolve around how to understand what is contained in a binary:
While it is hoped that your organization will practice good coding and engineering habits, there is always a temptation for “copy & paste” solutions to make their way into your code base. There are many reasons for this including:
Copying and pasting source code from the Internet into your code can be done, as reuse is generally better than reinventing the wheel each time. However, it’s important to respect the author’s interests by observing any licensing or copyright obligations.
When you begin to package your product for sale or distribution, you’ll need to focus on what the combined outbound software stack looks like in the context of open source compliance, as well as other ancillary tasks. We’ll cover these items in the following pages.
If your project or product will be distributing open source as part of your deliverable, you’ll need:
To be able to provide all of this, you’ll need tooling which gathers the following information:
Since your goal with compliance is to ensure that you are meeting all appropriate obligations for the open source code you use, you’ll need to consider both tooling and human review of licenses in your outbound software.
For example, some licenses are not compatible with each other, such as the GNU General Public License (GPL) and the Eclipse Public License (EPL), and works that combine GPL-licensed and EPL-licensed code can be problematic.
In addition, even with tooling, some license statements are ambiguous, for example “Licensed under BSD”. In cases like this, it’s important to involve your legal team and stakeholders in determining how to proceed.
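Tooling can at least separate statements that resolve to a specific license from statements that name only a license family. The sketch below illustrates the idea with a hypothetical lookup table and a hypothetical `classify_statement` helper; a real tool would rely on a maintained knowledge base rather than a hand-written dictionary.

```python
from typing import Optional, Tuple

# Hypothetical lookup table mapping normalized names to specific identifiers.
KNOWN = {
    "bsd-2-clause": "BSD-2-Clause",
    "bsd-3-clause": "BSD-3-Clause",
    "mit": "MIT",
    "apache-2.0": "Apache-2.0",
}
# Family names that do not say which variant or version is meant.
AMBIGUOUS_FAMILIES = {"bsd", "gpl", "lgpl"}


def classify_statement(statement: str) -> Tuple[str, Optional[str]]:
    """Classify a license statement as resolved, needs-review, or unknown."""
    text = statement.strip().lower()
    prefix = "licensed under "
    if text.startswith(prefix):
        text = text[len(prefix):]
    text = text.strip(" .")
    if text in KNOWN:
        return ("resolved", KNOWN[text])
    if text in AMBIGUOUS_FAMILIES:
        # The family is known but the variant is not: escalate to legal review.
        return ("needs-review", None)
    return ("unknown", None)


print(classify_statement("Licensed under BSD"))           # ('needs-review', None)
print(classify_statement("Licensed under BSD-3-Clause"))  # ('resolved', 'BSD-3-Clause')
```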
One of the most critical things that code scanning or compliance tooling can provide is a programmatic way of determining what is in the software or product that you are shipping. This is in the form of a Software Bill of Materials (SBOM).
An SBOM provides a detailed account of what is in a software package delivery, including identifying how much of that software package consists of open source components and which licenses are in use for those components.
The Software Package Data Exchange (SPDX) project specifies one implementation of how to express a Software Bill of Materials.
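To make the idea concrete, the sketch below models a very small SBOM in memory and derives the two facts mentioned above: how much of the package is open source and which licenses are in use. The `Component` structure and the component names are hypothetical and far simpler than the SPDX format itself.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    name: str
    version: str
    license_id: Optional[str]  # license identifier; None for in-house code (assumption)
    open_source: bool


def summarize(sbom: List[Component]) -> dict:
    """Count open source components and the licenses they use."""
    oss = [c for c in sbom if c.open_source]
    return {
        "total": len(sbom),
        "open_source": len(oss),
        "licenses": dict(Counter(c.license_id for c in oss)),
    }


sbom = [
    Component("libexample", "1.2.0", "MIT", True),
    Component("parserkit", "0.9.1", "Apache-2.0", True),
    Component("tinyjson", "3.0.0", "MIT", True),
    Component("billing-core", "5.4.2", None, False),
]
print(summarize(sbom))
```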
We will cover different types of tooling in our next section, but it’s important to understand what tools can provide in compliance, and also what areas need to be considered before choosing a solution.
As you can see from this section, there are several things that are needed in building effective compliance, and tooling is by no means a “silver bullet” that will make all compliance burdens go away. Tools are very good at analysis, reporting and helping drive management decisions, but they cannot operate in isolation - they need an effective process, married to a clear set of expectations and policies for consuming open source, as well as distributing your own software which builds upon open source packages.
Remember that there also isn’t necessarily a single tool that meets all needs, so you will likely be dealing with integration of different systems/tools, and you should have a clear understanding of what APIs and interfaces the tools provide in order to reduce manual integration effort.
There are many types of tools in the open source compliance space, including (but not limited to):
We’ll cover each of these areas in the following pages.
Since organizations have access to their own source code, as well as the open source packages used to build their products, source code scanning tools are some of the most widely used tools in compliance.
There are many commercial tools (and some open source ones) available that perform this function. In general, these tools rely on “hashing” fingerprints of existing open source code bases (or, potentially internal components if added to the scanning database) to make a determination of what software components are part of a distribution. One of their biggest advantages is in building the Software Bill of Materials (SBOM) we mentioned earlier.
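The fingerprint approach can be sketched in a few lines. The example below assumes SHA-256 over whole-file contents and a tiny hand-built dictionary standing in for the scanner's database; real tools may use different hash functions and much larger data sets, and the component names are hypothetical.

```python
import hashlib
from typing import Dict


def fingerprint(content: bytes) -> str:
    """Return a hex digest that identifies this exact file content."""
    return hashlib.sha256(content).hexdigest()


def identify(files: Dict[str, bytes], known: Dict[str, str]) -> Dict[str, str]:
    """Map each file path to a known component when its fingerprint matches."""
    matches = {}
    for path, content in files.items():
        component = known.get(fingerprint(content))
        if component is not None:
            matches[path] = component
    return matches


# Hypothetical database entry built from a known open source file.
upstream = b"int add(int a, int b) { return a + b; }\n"
known = {fingerprint(upstream): "libexample 1.2.0"}

files = {
    "vendor/add.c": upstream,                           # unmodified copy
    "src/main.c": b"int main(void) { return 0; }\n",  # in-house code
}
print(identify(files, known))  # {'vendor/add.c': 'libexample 1.2.0'}
```

Because the match is exact, changing even one byte of a file produces a different fingerprint and the file is no longer recognized.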
Some scanning tools can also identify “code snippets”, which is often helpful when determining if “copied and pasted” code from a particular open source package was used. However, snippet scanning comes at a price - it will often take longer to run a full snippet scan analysis on source code rather than just relying on hashed fingerprints.
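One simple way to picture snippet matching is to hash every run of a few consecutive lines instead of the whole file. The sketch below uses a sliding window of `k` whitespace-normalized lines; this particular scheme is an assumption chosen for illustration, and commercial scanners use their own, more sophisticated techniques.

```python
import hashlib
from typing import Set


def window_hashes(source: str, k: int = 3) -> Set[str]:
    """Hash every window of k consecutive non-blank, whitespace-stripped lines."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    return {
        hashlib.sha256("\n".join(lines[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(len(lines) - k + 1)
    }


def shares_snippet(candidate: str, reference: str, k: int = 3) -> bool:
    """Return True when both texts contain at least one identical k-line window."""
    return bool(window_hashes(candidate, k) & window_hashes(reference, k))


reference = "a = 1\nb = 2\nc = a + b\nprint(c)\n"
candidate = "x = 0\n    a = 1\n    b = 2\n    c = a + b\ny = 9\n"
print(shares_snippet(candidate, reference))           # True
print(shares_snippet("foo\nbar\nbaz\n", reference))  # False
```

Each file now yields one hash per window rather than one per file, which helps explain why a full snippet scan takes longer than fingerprint matching.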
The biggest differentiator for many of these tools is their data sources - how often their databases are updated, and how much open source code is represented in their data. Cost, complexity, integration ability for your build environment, and reporting features are also key features you’ll need to evaluate before making a tool selection.
There is also the possibility of “false positives” or cases where expert knowledge may need to be brought in (legal, engineering) to determine the relevance of results from source code scanning.
While we are covering license scanning tools as a separate item, in practice, most commercial source code scanning tools also include license scanning capabilities.
License scanning relies on searching source code for relevant keywords, and/or machine readable markers (such as SPDX blocks) to determine the relevant licenses attached to each file or package. These scans can also identify copyright, author statements and sometimes acknowledgements.
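A minimal sketch of this kind of scan is shown below. It looks for `SPDX-License-Identifier` tags, a few characteristic license phrases, and copyright lines. The keyword table and its labels are hypothetical and deliberately tiny, so keyword hits should be read as hints rather than conclusions.

```python
import re

SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")
COPYRIGHT = re.compile(r"Copyright\s+(?:\(c\)|©)?\s*(.+)", re.IGNORECASE)

# Hypothetical keyword table: characteristic phrase -> hint for a reviewer.
KEYWORDS = {
    "GNU General Public License": "GPL family (version unclear)",
    "Apache License, Version 2.0": "Apache-2.0",
    "Permission is hereby granted, free of charge": "MIT-style (needs confirmation)",
}


def scan_text(text: str) -> dict:
    """Collect license tags, keyword hints, and copyright statements."""
    result = {"spdx": [], "keyword_hits": [], "copyrights": []}
    for line in text.splitlines():
        tag = SPDX_TAG.search(line)
        if tag:
            result["spdx"].append(tag.group(1).strip())
        notice = COPYRIGHT.search(line)
        if notice:
            result["copyrights"].append(notice.group(1).strip())
    for phrase, hint in KEYWORDS.items():
        if phrase in text:
            result["keyword_hits"].append(hint)
    return result


sample = "# Copyright © A. Person (2020)\n# SPDX-License-Identifier: MIT\n"
print(scan_text(sample))
```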
While the database of open source licenses isn’t as large as the database of open source components required to build a component identification piece in the SBOM, it still does require access to a knowledge base of existing open source licenses. In general, license scanning has a harder time identifying non-OSS licenses, as there is a larger variety of these types of licenses.
As noted earlier, a primary use of license scanning is when checking inbound open source software to verify the license in use. This is often one of the first steps performed (after evaluation for fitness of technical purpose) before validating an open source component for use in your organization.
Even with the best pattern matching or use of machine-readable markers, there may be cases where legal or engineering stakeholders need to be consulted to clarify an ambiguous license identification.
The purpose of binary scanning is similar to source code scanning (identification of open source components and their versions), which can help with SBOM creation, as well as identification of potential vulnerabilities for specific software packages coming into your organization.
The challenge here, of course, is that without readable source code, binary scanning is a heuristic that relies on some characteristic elements of binaries, such as string variables, filenames and sometimes method and field names from languages with run-time code available (e.g. Java). Because hardware architectures and compilers can change over time, binary scanners have to be frequently adjusted to try and account for these changes.
There are also some cases where a reliable scan isn’t fully available for a particular binary. Even so, it’s still a good idea to scan binaries where you can, in an attempt to identify open source in packages for which you don’t have the source code.
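The string-based heuristic described for binary scanning can be sketched as follows. The signature table and component names are hypothetical, and a real binary scanner combines many more signals than embedded strings; treat this as an illustration of the idea only.

```python
import re

# Hypothetical signature table: characteristic string -> component name.
SIGNATURES = {
    b"examplezip version": "example-zip",
    b"ExampleSSL": "example-ssl",
}


def printable_strings(blob: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII, similar in spirit to the Unix strings tool."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, blob)


def guess_components(blob: bytes) -> set:
    """Return the components whose characteristic strings appear in the binary."""
    found = set()
    for text in printable_strings(blob):
        for marker, component in SIGNATURES.items():
            if marker in text:
                found.add(component)
    return found


blob = b"\x7fELF\x02\x01\x00\x00examplezip version 1.4.2\x00\x01\x02ExampleSSL 3.0\x00\xff"
print(sorted(guess_components(blob)))
```

Because the match depends on strings that a compiler or build option may strip or alter, results like these are hints to verify rather than definitive identifications.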
DevOps integration using custom-built software and custom processes can be used to augment other previously mentioned scanning mechanisms and gain additional information from the processes used to build the software.
Because the DevOps build system is able to determine dependencies during builds, it can combine that information with the output of other tools to help create a more robust SBOM. This is especially true of development organizations that may have intricate dependencies, or legacy packages in their software that are unlikely to be identified correctly by commercial or open source scanning technologies.
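One way to picture this combination is a small merge step that joins the dependencies reported by the build with the findings of a scanning tool. The data shapes, field names, and package names below are assumptions made for illustration; your build system and scanner will have their own output formats.

```python
def merge_sbom(build_deps, scan_results):
    """Combine build-system dependencies with scanner findings, keyed by (name, version)."""
    sbom = {}
    for name, version in build_deps:
        sbom[(name, version)] = {
            "name": name, "version": version, "license": None, "sources": ["build"],
        }
    for item in scan_results:
        key = (item["name"], item["version"])
        entry = sbom.setdefault(
            key,
            {"name": item["name"], "version": item["version"], "license": None, "sources": []},
        )
        entry["license"] = item.get("license")
        entry["sources"].append("scanner")
    return sorted(sbom.values(), key=lambda e: (e["name"], e["version"]))


# Hypothetical inputs: the build knows about a legacy package the scanner missed.
build_deps = [("libfoo", "2.1.0"), ("legacy-utils", "0.9")]
scan_results = [
    {"name": "libfoo", "version": "2.1.0", "license": "Apache-2.0"},
    {"name": "libbar", "version": "1.0.0", "license": "MIT"},
]

sbom = merge_sbom(build_deps, scan_results)
for entry in sbom:
    print(entry)
```

Entries that come only from the build and still have no license, such as the legacy package here, are natural candidates for manual review.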
The one downside is that these custom configurations/systems do require effort to build and maintain, but if your organization already has a build system tied to a DevOps infrastructure, integrating outside scanning tools into this environment may be possible and may help alleviate a fair amount of manual review/compliance work.
A Component Management System is a handy compliance tool that brings together your various Software Bills of Materials (SBOMs) and provides documentation and reporting. There are various commercial vendors and some open source projects (search github.com) that provide this kind of functionality.
Some organizations even choose to write this kind of database program for themselves, but the important thing is that it can help with a variety of things, including vulnerability management, approval of open source components, tracking of licenses, and identification of which open source components are used throughout all of the software in an organization.
Implementing this as a web portal can help multiple stakeholders, especially legal, security, and even engineering teams when they have to address questions of license compliance, security vulnerabilities, or tracking versions of open source components in use in the organization.
There are a host of industry initiatives that have sprung up around compliance tooling, including FOSSology for scanning, Microsoft’s ClearlyDefined and tl;dr Legal for license clarification and review, and many others.
In the next two pages, we’ll highlight two initiatives that are important for automation and supply chain efforts to help make open source more sustainable and easily consumed in organizations - SPDX and OpenChain.
Software Package Data Exchange (SPDX) is a project, a standard, and a set of license data that helps enable machine (and human) readable license information to be embedded in source code, but also exchanged between different compliance tools and systems.
SPDX is also a meritocratic community workgroup developing a set of collateral that can be used to more clearly convey complete license information in a standard/reusable fashion and to facilitate compliance. The advantages of this are:
The last point here is probably the most important in the context of our tooling discussion so far - for commercial, open source, or custom-built compliance tools, the ability to have a standard format to exchange license data is absolutely critical - without that, there would be a lot of manual effort and review to maintain effective open source compliance.
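To show why a standard exchange format reduces manual effort, here is a minimal sketch that reads package names and concluded licenses from an abbreviated SPDX tag-value fragment. The fragment is not a complete, valid SPDX document, the package names are made up, and the parser handles only three tags; it is an illustration, not an official SPDX tool.

```python
def parse_packages(tag_value_text: str) -> list:
    """Collect name, version, and concluded license for each package entry."""
    packages = []
    current = None
    for line in tag_value_text.splitlines():
        if ":" not in line or line.startswith("#"):
            continue
        tag, value = (part.strip() for part in line.split(":", 1))
        if tag == "PackageName":
            # Each PackageName tag starts a new package entry.
            current = {"name": value}
            packages.append(current)
        elif current is not None and tag in ("PackageVersion", "PackageLicenseConcluded"):
            current[tag] = value
    return packages


fragment = """SPDXVersion: SPDX-2.3
DataLicense: CC0-1.0
PackageName: libfoo
PackageVersion: 2.1.0
PackageLicenseConcluded: Apache-2.0
PackageName: libbar
PackageVersion: 1.0.0
PackageLicenseConcluded: MIT
"""

packages = parse_packages(fragment)
for package in packages:
    print(package)
```

Because every tool that emits this format uses the same tag names, the same few lines can consume output from a commercial scanner, an open source scanner, or a custom build integration.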
OpenChain ISO/IEC 5230, maintained by the OpenChain Project, is the International Standard for the key requirements of a quality open source compliance program. The project provides a specification and certification program regarding supply chain exchanges of source code, build scripts, license copies, attribution notices, modification notices, SPDX data and other materials that the open source licenses governing a software deliverable may require.
Additionally, the project provides a curriculum, as well as a free assessment tool that can help your organization determine which areas of your own open source compliance can be improved.
Perhaps most important of all, OpenChain is a growing community of people who are a great resource for organizations just starting on their journey in open source compliance.
In this section, we will explain the role that open source audits play when Mergers & Acquisitions (M&A) activities bring new code into your existing products. Effective audits can both support legal compliance and serve as a strategic means to identify areas for software reuse.
By the end of this section, you should be able to:
We’ve already established that software, and specifically open source software, plays a big role in not only technology companies, but many companies that previously weren’t in the software or technology business. So far in this module, we’ve covered the details of what it means to build an effective compliance process for Inbound Software, Your Own Software, and Outbound Software.
A special case for Inbound Software occurs when an organization is about to acquire the intellectual property (in software) of another company. This software due diligence process, in which the acquirer performs a comprehensive review of the target’s software and their compliance practices, is becoming a standard part of any merger or acquisition. During this process it’s common to encounter open source software, which presents a set of verification challenges different from proprietary software.
In the rest of this section, we’ll provide an overview of the open source audit process in merger and acquisition (M&A) transactions.
Why Conduct an Audit?
While every M&A transaction is different, the need to verify the impact of acquiring open source obligations is a constant. Open source audits are carried out to understand the depth of use and the reliance on open source software. Additionally, they offer great insights about any compliance issues and even about the target’s engineering practices.
Examples of ways open source software can impact the acquired assets include:
A common question is whether an open source audit is needed at all. The answer differs by company, purpose of acquisition, and size of the source code base. For instance, for small acquisitions, some companies prefer to just review the open source bill of materials (BOM) provided by the acquisition target (assuming it is available) and have a discussion with the target’s engineering lead about their open source practices. Even if the purpose of the acquisition is to acquire the talent, an audit can help uncover whether there are undisclosed liabilities due to historical license obligations from products which have already shipped.
Inputs & Outputs
The audit process has one primary input and one primary output (see figure above). The input to the process is the complete software stack that is the subject of the M&A transaction. This includes proprietary, open source and third-party software.
At the other end of the process, the primary output is a detailed open source software bill of materials that lists:
Assessing the Scope of an Audit
The size, scope, and cost of an audit vary by transaction, and generally increase with source code size and complexity. To determine the cost and time for an open source audit, auditors need to get some basic understanding of the size and characteristics of the code base, as well as the urgency of the project.
The first questions will be related to code metrics, such as the size of the source code base, the number of lines of source code, and the number of files that need to be audited. Auditors also ask if the codebase consists exclusively of source code, or if it includes binary files, configuration files, documentation, and possibly other file formats. Sometimes, it is also helpful for the auditor to know the file extensions subject to the audit. As we’ve already learned in this module, understanding these things will help the team pick the right tools to support the audit.
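If you need to answer these sizing questions yourself, a short script can gather rough figures. The sketch below is one possible approach under simple assumptions: it counts newline characters as lines and treats any file containing a NUL byte as binary, which is a crude heuristic. The sample tree it builds is purely illustrative.

```python
import tempfile
from collections import Counter
from pathlib import Path


def codebase_metrics(root: Path) -> dict:
    """Collect rough sizing figures for a source tree."""
    files = [p for p in root.rglob("*") if p.is_file()]
    text_lines = 0
    binary_files = 0
    for path in files:
        data = path.read_bytes()
        if b"\x00" in data:
            binary_files += 1  # crude heuristic: NUL bytes suggest a binary file
        else:
            text_lines += data.count(b"\n")
    extensions = Counter(p.suffix or "(none)" for p in files)
    return {
        "files": len(files),
        "text_lines": text_lines,
        "binary_files": binary_files,
        "extensions": dict(extensions),
    }


# Build a tiny throwaway tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "main.c").write_text("int main(void) {\n    return 0;\n}\n")
    (root / "README").write_text("hello\n")
    (root / "lib.bin").write_bytes(b"\x00\x01\x02")
    metrics = codebase_metrics(root)

print(metrics)
```

Pointing `codebase_metrics` at a real checkout gives the file count, approximate line count, and extension breakdown that an auditor typically asks for first.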
Because audit price discussions happen early in the process based on size and scope, the acquirer may not have access to all the information described above. At minimum, the auditor needs to understand the number of files to be scanned before proceeding, although additional information will help refine the estimates. When the auditor has enough information to understand the scope of the work, they will also need to understand the urgency, as this has a significant impact on the cost of an audit.
Overview
Depending on what the auditors find during the initial acquisition discussion phase, they may have to rely on several types of tools (scanning, license identification, component management) to perform the audits.
There are two audit methods:
Traditional Audit
This method is called traditional, because it’s the original method of source code scanning for open source compliance purposes. Traditional audits are those where a compliance auditor from a third-party auditing company gets access to the source remotely via a cloud system or physically while visiting on site and performs the source code scan.
Please note that the process may vary slightly from one service provider to another. A typical traditional audit process follows these steps:
This method is common across most audit service providers, and it allows the opportunity to collect multiple bids for the same audit job and the ability to choose the best bid given your requirements.
For this model to work, the target company must be willing to transfer the code to the auditors or allow them to visit their offices to complete the job on-site.
Do-It-Yourself (DIY) Audit
The Do-It-Yourself (DIY) audit provides the acquirer or the target company time-limited access to the compliance cloud tools, enabling them to run the scan themselves. They can then perform the audits internally with complete access to the knowledge base and all reporting facilities.
This is an approach that is particularly interesting for companies that have in-house employees with sufficient experience to interpret scan results and suggest remediation procedures. It can quickly become more cost-effective for companies that go through the M&A process several times per year. An independent certification can be performed by the auditing tools service provider to verify the findings, to further secure the integrity of the audit.
This approach has several advantages, such as the ability to start the audit as soon as it’s needed because it uses internal resources and is not dependent on the availability of third-party auditors. This approach potentially shortens the timeline and reduces an external source of cost.
Any compliance problem can be addressed immediately, because the audit is conducted by the people who have direct access to the code and can apply fixes directly. Finally, the audit can be verified by the provider of the audit tool to ensure correctness and completeness.
Final Audit Report Notes
Many auditing tools can also be tuned to highlight potential issues. After viewing the results carefully, you might find many of the results to be non-issues, but you should expect the potential for a lot of “noise” in the initial reports.
The noise may come from things like leftover code that is in the code tree but not used. Therefore, the initial report may be lengthy, and you should be prepared to invest time to filter the report to find the real issues.
Note that a Software Package Data Exchange (SPDX) conformant report is usually provided only on request, so if you would like your audit service provider to supply one, you will need to ask for it. If you’ve already invested in SPDX-compatible tooling, this will make importing and tracking your audit results much easier.
Pre and Post Acquisition Remediation
By this point, the acquiring company should have a clear idea how the target uses and manages open source software and how successful they’ve been at satisfying their open source license obligations. The acquirer and target should use this information to negotiate remediation for any open source compliance issues.
If any issues are uncovered in the audit, there are a few options for resolving them as a part of the pending transaction. The first option is to simply remove any offending code. If the open source software only augments proprietary code, it may be possible to eliminate it entirely. Another option is to design around the offending component, or rewrite any code using cleanroom techniques.
If the section of code is truly essential or if it has been previously distributed, the only remaining option is to bring the code into compliance. The cost of each option can be used when determining the valuation of the target. Whatever option is chosen, it’s crucial to identify the individuals who participated in incorporating the open source code, and to get them involved in the remediation effort. They might have additional documentation or knowledge that can be useful in resolving issues.
Think About Your Needs
As an acquirer, you must take action and make decisions before the audit is commissioned and you have additional obligations after you receive the results. Therefore, it’s important that you consider your needs, as well as which of the audit methods mentioned earlier would work best for your organization and particular situation.
It’s equally important that you determine what you most care about as an organization in terms of audit results. The report from the source code audit may provide a significant amount of information, depending on the complexity of the scanned code. Therefore, it’s important to identify which licenses and use-cases are regarded as critical ahead of getting the results.
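Deciding ahead of time which licenses matter most lets you sort a long report quickly. The sketch below assumes a hypothetical findings format and uses placeholder license names; which licenses count as critical is a policy decision for your own organization, and nothing here suggests a particular answer.

```python
# Hypothetical policy: placeholder names for licenses the acquirer treats as critical.
CRITICAL_LICENSES = {"License-A"}


def triage(findings, critical=CRITICAL_LICENSES):
    """Split findings into (needs_review, routine) according to the policy."""
    needs_review, routine = [], []
    for finding in findings:
        unknown = finding["license"] is None
        if unknown or finding["license"] in critical:
            needs_review.append(finding)
        else:
            routine.append(finding)
    return needs_review, routine


# Hypothetical audit findings; a missing license is treated as needing review.
findings = [
    {"component": "libfoo", "license": "License-B"},
    {"component": "libbar", "license": "License-A"},
    {"component": "vendored-blob", "license": None},
]

needs_review, routine = triage(findings)
print([f["component"] for f in needs_review])
print([f["component"] for f in routine])
```

Treating unidentified licenses as review items is a deliberately cautious choice in this sketch; you may prefer a separate bucket for them.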
Being clear about your needs both before and after the audit will not only make the audit process smoother, but also more cost-efficient in the long run.
Ask the Right Questions
The open source audit report offers a lot of information about the target’s source code and the licenses involved. However, many other data points will require further investigation in order to clarify or confirm compliance-related concerns. In this section, we offer a collection of questions as a starting point to frame what is important to you, and what questions you should address with the target company.
Identify Items to be Resolved
In some cases, an open source audit may reveal instances of licenses or compliance practices that are not acceptable to the acquirer. The acquirer can then request these instances to be mitigated as a condition for closing the transaction.
For instance, the target company may use a code component that uses License A, but the acquiring company has a strict policy against using any source code licensed under License A.
In such a situation, both parties will need to discuss and figure out a possible solution.
Create a Post-acquisition Compliance Improvement Plan
Creating a compliance improvement plan is especially important when the acquirer is a large company buying a smaller startup that will continue to operate as a subsidiary. In this scenario, the acquirer often helps the target establish a formal compliance policy and process, provides training on their own practices, and offers ongoing guidance and support.
There may also be an opportunity to assist the acquisition with improved tooling/processes and/or staffing.
Be Prepared
Passing an open source compliance audit is not hard if you’re prepared. However, it is very unlikely to happen if you begin preparing only once an acquirer shows interest. These activities are meant to go hand-in-hand with your daily business and development activities. The objective is to ensure the company tracks all open source components and respects open source license obligations resulting from use of open source components. These same measures can be of great help if your company becomes a target for a corporate transaction, as they minimize the risk of surprises.
As you’ll see in the next few chapters, these practices are consistent with what we’ve learned to this point in this module.
Know What’s In Your Code
Knowing what’s in your code is the golden rule of compliance. You need to maintain a complete software inventory for all software components including their origin and license information. This covers software components created by your organization, open source components, and components originating from third parties. The most important point is having a process for identifying and tracking open source components. You don’t always need a complex compliance program, but you should have five basic elements: policy, process, staff, training, and tools.
Policy and Process
This figure restates some of what we’ve already covered, but it’s important to revisit it here.
The open source compliance policy is a set of rules that govern the management of open source software (both use of and contribution to). Processes are detailed specifications as to how a company will implement these rules on a daily basis. Compliance policies and processes govern various aspects of using, contributing to, auditing, and distributing open source software.
This figure illustrates a sample compliance process, with the various steps each software component will go through as part of the due diligence as you build your product or software stack.
The output of the process is an open source BoM that you can publish, along with a written offer and various copyright, license and attribution notices fulfilling the legal obligations of the components in your BoM.
For a detailed discussion on the open source compliance process, please download the free e-book Open Source Compliance in the Enterprise, published by The Linux Foundation.
Staff
In large enterprises, the open source compliance team is a cross-disciplinary group consisting of various individuals tasked with the mission of ensuring open source compliance. The core team, often called the Open Source Review Board (OSRB), consists of representatives from engineering and product teams, one or more legal counsel, and a compliance officer.
The extended team consists of various individuals across multiple departments that contribute on an ongoing basis to the compliance efforts: Documentation, Supply Chain, Corporate Development, IT, and Localization. However, in smaller companies or startups, this can be as simple as an engineering manager supported by legal counsel.
Training
Education is an essential building block in a compliance program, to help ensure that employees possess a good understanding of policies governing the use of open source software. The goal of providing open source and compliance training is to raise awareness of open source policies and strategies, and to build a common understanding of the issues and facts of open source licensing. It should also cover the business and legal risks of incorporating open source software in products and/or software portfolios.
Both formal and informal training methods are available. Formal methods include instructor-led training courses where employees have to pass a knowledge exam to complete the course. Informal methods include webinars, brown bag seminars, and presentations to new hires as part of the new employee orientation session.
Tooling
As you’ve already learned in this module, there are many different types of tools that can be utilized in the compliance process. An important thing to remember is that the tools are no substitute for good process and knowledgeable staff making decisions based on policies and the data provided by the tools.
It’s also important to consider taking an “open source” approach to your tooling as well - continual evaluation of the tools in place, and the ability to pivot when necessary is critical to maintain a healthy compliance program.
Be in Compliance
If you have shipped products containing open source software—whether intentionally or not—you will need to comply with the various licenses governing those software components. Hence the importance of knowing what’s in your code, because a complete bill of materials makes compliance much easier.
Being in compliance is not a simple task, and it varies from product to product based upon the licenses and the structure of the code. At a high level, being in compliance means that you:
Keep Up with Latest Release for Security
One of the benefits of a comprehensive compliance program is that it’s easier to find products with insecure versions of open source components and replace them. Most source code scanning tools now provide functionality to flag security vulnerabilities disclosed in older software components.
One important consideration when upgrading an open source component is to always ensure that the component retains the same license as the previous version. Open source projects have occasionally changed licenses on major releases. Companies are encouraged to engage with open source project communities to help avoid situations where they are using a version with security vulnerabilities.
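A license check can be added to the upgrade step with very little code. The sketch below assumes you already record a license identifier for each version of a component; the version numbers and placeholder license names are hypothetical.

```python
# Hypothetical record of the license declared by each release of one component.
LICENSE_HISTORY = {
    "1.8.0": "License-B",
    "1.9.0": "License-B",
    "2.0.0": "License-A",
}


def check_upgrade(licenses_by_version: dict, current: str, candidate: str) -> str:
    """Report whether moving from current to candidate changes the license."""
    old = licenses_by_version[current]
    new = licenses_by_version[candidate]
    if old == new:
        return "ok"
    return f"review: license changed from {old} to {new}"


print(check_upgrade(LICENSE_HISTORY, "1.8.0", "1.9.0"))
print(check_upgrade(LICENSE_HISTORY, "1.9.0", "2.0.0"))
```

A "review" result does not mean the upgrade is blocked; it means the new license should go through the same approval you applied to the original one before the security fix is adopted.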
It is not reasonable or feasible to be active in all of the open source projects you use, so a certain level of prioritization is needed to identify the most critical components. The various levels of engagement range from joining mailing lists and participating in the technical discussions, to contributing bug fixes and small features, to making major contributions. At minimum, it is beneficial for corporate developers working on a specific open source project to subscribe to and monitor the mailing list for reports related to security vulnerabilities and available fixes.
Measure Your Compliance Efforts
The easiest and most effective first step for organizations of all sizes is to engage with the OpenChain Project (mentioned earlier) and to obtain “OpenChain Conformant” status. This is done by answering a series of questions either online or manually.
The questions used for OpenChain Conformance help to confirm that an organization has created processes or policies for open source software compliance. OpenChain is an industry standard, similar to ISO 9001. It is focused on the “big picture,” with precise processes and policy implementations up to each individual organization.
OpenChain Conformance shows that open source compliance processes or policies exist, and that further details can be shared when requested by a supplier or customer. OpenChain is designed to build trust between organizations across the global supply chain.
Conclusion
Open source due diligence is generally one item in a long list of tasks that need to be successfully completed in an M&A transaction. However, it is still an important aspect of the general due diligence exercise given the central role of software and potential IP risks.
Although open source due diligence may seem like a lengthy process, it can often be completed quickly, especially if both parties are prepared and are working with a responsive compliance service provider.
How can you be prepared?
If you are the target, you can maintain proper open source compliance practices by ensuring your development and business processes include:
If you are the acquirer, you should know what to look for and have the skills on-hand to address issues quickly:
Open source compliance is an ongoing process. Maintaining good open source compliance practices enables companies to be prepared for any scenario where software changes hands, from a possible acquisition or sale to a product or service release. For this reason, companies are highly encouraged to invest in building and improving upon their open source compliance programs.