Announcing a unified vulnerability schema for open source

In recent months, Google has launched several efforts to strengthen open source security on multiple fronts. One important focus is improving how we identify and respond to known security vulnerabilities without extensive manual work. A precise, common data format is essential for triaging and remediating security vulnerabilities, especially when communicating risks to affected dependencies: it makes automation easier and lets consumers of open source software know when they are affected and apply security fixes as soon as possible.

We released the Open Source Vulnerabilities (OSV) database in February with the aim of automating and improving vulnerability triage for developers and users of open source software. This initial effort was bootstrapped with a dataset of a few thousand vulnerabilities from the OSS-Fuzz project. Implementing OSV to communicate accurate vulnerability data for hundreds of critical open source projects demonstrated the format's usefulness and produced feedback that helped us improve the project; for example, we dropped the Cloud API key requirement, making the database accessible to more users. The community response also showed that there was widespread interest in extending the effort further.

Today, we are pleased to announce a new milestone in expanding OSV to several key open source ecosystems: Go, Rust, Python and DWF. This expansion brings together four important vulnerability databases, giving software developers a better way to track and address the vulnerabilities that affect them. Our efforts are also in line with the recent US Executive Order on Improving the Nation's Cybersecurity, which emphasized the need to remove barriers to sharing threat information in order to strengthen national infrastructure. This expanded shared vulnerability database marks an important step towards creating a more secure open source environment for all users.

 
A simple, unified schema for accurately describing vulnerabilities

As with open source development, open source vulnerability databases follow a distributed model in which many ecosystems and organizations create their own database. Since each uses its own format to describe vulnerabilities, a client tracking vulnerabilities across multiple databases must handle each one separately. Sharing vulnerabilities between databases is also difficult.

The Google Open Source Security Team, the Go Team, and the wider open source community have developed a simple vulnerability interchange schema, designed from the outset for open source ecosystems. After starting work on the schema a few months ago, we asked for public feedback and received hundreds of comments. We have incorporated input from readers to arrive at the current schema:

{
        "id": string,
        "modified": string,
        "published": string,
        "withdrawn": string,
        "aliases": [ string ],
        "related": [ string ],
        "package": {
                "ecosystem": string,
                "name": string,
                "purl": string
        },
        "summary": string,
        "details": string,
        "affects": [ {
                "ranges": [ {
                        "type": string,
                        "repo": string,
                        "introduced": string,
                        "fixed": string
                } ],
                "versions": [ string ]
        } ],
        "references": [ {
                "type": string,
                "url": string
        } ],
        "ecosystem_specific": { see spec },
        "database_specific": { see spec }
}
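To make the field list more concrete, the following minimal sketch parses one record and walks its affected ranges. Everything in the record is invented for illustration: the ID, the package name, the versions, the URL, and the "ECOSYSTEM" and "WEB" type values are placeholder assumptions, and the specification should be consulted for the values it actually allows.

```python
import json

# Hypothetical record: every value here is invented for illustration only.
RECORD_JSON = """
{
  "id": "EXAMPLE-2021-0001",
  "package": {"ecosystem": "PyPI", "name": "examplepkg"},
  "details": "Illustrative entry, not a real vulnerability.",
  "affects": [{
    "ranges": [{"type": "ECOSYSTEM", "introduced": "1.0.0", "fixed": "1.2.1"}],
    "versions": ["1.0.0", "1.1.0", "1.2.0"]
  }],
  "references": [{"type": "WEB", "url": "https://example.com/advisory"}]
}
"""


def describe(record):
    """Return one human-readable line per affected range in a record."""
    pkg = record["package"]
    lines = []
    for affected in record.get("affects", []):
        for rng in affected.get("ranges", []):
            # Either bound may be absent, so fall back to a marker string.
            introduced = rng.get("introduced", "unknown")
            fixed = rng.get("fixed", "none")
            lines.append(
                f'{record["id"]}: {pkg["ecosystem"]}/{pkg["name"]} '
                f"introduced={introduced} fixed={fixed}"
            )
    return lines


for line in describe(json.loads(RECORD_JSON)):
    print(line)
```

The same few lines of traversal apply to a record from any database that exports the schema, since the field names do not change between ecosystems.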

This new vulnerability schema aims to address some key problems with open source vulnerability management. We found that there was no existing standard format that:

  • Enforces version specification that exactly matches the naming and versioning schemes used in actual open source package ecosystems. For example, it is difficult to match a vulnerability such as a CVE to a package name and a set of versions in a package manager using existing mechanisms such as CPEs.
  • Can be used to describe vulnerabilities in any open source ecosystem, while not requiring ecosystem-dependent logic to process them.
  • Is easy to use by both automated systems and humans.

With this schema, we hope to define a format that all vulnerability databases can export. A unified format means that vulnerability databases, open source users and security researchers can easily share tools and consume vulnerabilities across all of open source. This means a more complete view of vulnerabilities in open source for everyone, as well as faster detection and remediation thanks to easier automation.
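As a rough illustration of what shared tooling could look like, the sketch below checks a list of dependencies from two ecosystems against records from two databases using a single code path. It relies only on the enumerated "versions" list, so no ecosystem-specific version comparison is needed. The records, package names, and ecosystem strings are hypothetical placeholders, and the function names are ours rather than part of any OSV tool.

```python
# Hypothetical records, as if exported by two different databases.
RECORDS = [
    {
        "id": "EXAMPLE-PY-0001",
        "package": {"ecosystem": "PyPI", "name": "examplepkg"},
        "affects": [{"versions": ["1.0.0", "1.1.0"]}],
    },
    {
        "id": "EXAMPLE-RS-0001",
        "package": {"ecosystem": "crates.io", "name": "examplecrate"},
        "affects": [{"versions": ["0.3.0"]}],
    },
]


def is_affected(record, ecosystem, name, version):
    """True if the record lists this exact package version as affected."""
    pkg = record["package"]
    if (pkg["ecosystem"], pkg["name"]) != (ecosystem, name):
        return False
    return any(
        version in affected.get("versions", [])
        for affected in record.get("affects", [])
    )


def audit(dependencies, records):
    """Map each affected (ecosystem, name, version) to matching record IDs."""
    findings = {}
    for dep in dependencies:
        ids = [r["id"] for r in records if is_affected(r, *dep)]
        if ids:
            findings[dep] = ids
    return findings


DEPENDENCIES = [
    ("PyPI", "examplepkg", "1.1.0"),
    ("PyPI", "examplepkg", "1.2.0"),
    ("crates.io", "examplecrate", "0.3.0"),
]
print(audit(DEPENDENCIES, RECORDS))
```

Exact string matching on the version list is a deliberate simplification: it keeps the consumer free of per-ecosystem logic, at the cost of depending on the database to enumerate every affected version.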

The current state


The vulnerability schema specification has undergone several iterations, and we invite further feedback as it nears completion. A number of public vulnerability databases are already exporting this format, with more in the pipeline:

  • Go vulnerability database for Go packages
  • Rust advisory database for Cargo packages
  • Python advisory database for PyPI packages
  • DWF database for vulnerabilities in the Linux kernel and other popular software
  • OSS-Fuzz database for vulnerabilities in C/C++ software found by OSS-Fuzz

The OSV service has also aggregated all of these vulnerability databases, which can be viewed in our web UI. They can also be queried with a single command via the same existing APIs:


curl -X POST -d \
      '{"commit": "a46c08c533cfdf10260e74e2c03fa84a13b6c456"}' \
      "https://api.osv.dev/v1/query"

curl -X POST -d \
      '{"version": "2.4.1", "package": {"name": "jinja2", "ecosystem": "PyPI"}}' \
      "https://api.osv.dev/v1/query"
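For readers who prefer to script these queries, here is a minimal Python sketch of the same two requests using only the standard library. The endpoint and request bodies come from the commands above; the helper names, the explicit JSON content type, and the timeout are our own choices, not something the API is known to require.

```python
import json
import urllib.request

# Endpoint taken from the curl commands above.
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def commit_query(commit_hash):
    """Request body for looking up vulnerabilities by commit hash."""
    return {"commit": commit_hash}


def package_query(name, ecosystem, version):
    """Request body for looking up vulnerabilities by package version."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def build_request(query):
    """Build (but do not send) the POST request for a query body."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},  # our choice of header
        method="POST",
    )


def run_query(query, timeout=10):
    """Send the query over the network and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return json.load(resp)
```

Calling run_query(package_query("jinja2", "PyPI", "2.4.1")) would send the second request shown above; building the request separately from sending it makes the body easy to inspect or test offline.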


Automation of vulnerability database maintenance


Producing high quality vulnerability data is also difficult. In addition to OSV's existing automation, we built several automation tools for vulnerability database maintenance and used them to bootstrap the community Python advisory database. This automation takes existing feeds, accurately matches them to packages, and generates records that contain accurate, validated versions with minimal human intervention. We plan to extend this tooling to other ecosystems for which there is no existing vulnerability database, or little support for ongoing database maintenance.
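The idea of "validated versions" can be sketched as follows: given a range from a feed and the list of versions a package has actually published, keep only the published versions that fall inside the range. This is an illustrative assumption about one way to do it, not the actual OSV tooling. The dotted-numeric version parser is a simplification (real ecosystems have their own ordering rules), and the IDs, names, and "ECOSYSTEM" type value are hypothetical.

```python
def parse_version(text):
    """Parse a dotted numeric version like '1.2.0' into a comparable tuple.

    Simplifying assumption: real ecosystems need their own ordering rules.
    """
    return tuple(int(part) for part in text.split("."))


def affected_versions(published, introduced, fixed=None):
    """Return published versions v with introduced <= v < fixed, in order."""
    low = parse_version(introduced)
    high = parse_version(fixed) if fixed is not None else None
    result = []
    for version in sorted(published, key=parse_version):
        parsed = parse_version(version)
        if parsed >= low and (high is None or parsed < high):
            result.append(version)
    return result


def build_record(vuln_id, ecosystem, name, published, introduced, fixed=None):
    """Assemble a record whose version list contains only real releases."""
    rng = {"type": "ECOSYSTEM", "introduced": introduced}  # type is a placeholder
    if fixed is not None:
        rng["fixed"] = fixed
    return {
        "id": vuln_id,
        "package": {"ecosystem": ecosystem, "name": name},
        "affects": [{
            "ranges": [rng],
            "versions": affected_versions(published, introduced, fixed),
        }],
    }


# Hypothetical inputs: a feed entry's range plus the package's release list.
PUBLISHED = ["0.9.0", "1.2.0", "1.0.0", "1.1.0", "1.2.1"]
record = build_record("EXAMPLE-2021-0002", "PyPI", "examplepkg",
                      PUBLISHED, introduced="1.0.0", fixed="1.2.1")
print(record["affects"][0]["versions"])
```

Filtering against the published release list is what keeps a typo or a never-released version in an upstream feed from ending up in the generated record.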


Get involved


Thanks to all the open source developers who have provided feedback and adopted this format. We continue to work with open source communities to develop it further and gain wider adoption across all ecosystems. If you are interested in adopting this format, we would appreciate any feedback on our public specification.
