diff --git a/docs/source/api.rst b/docs/source/api.rst index c779784a5..b4ef5dbcd 100644 --- a/docs/source/api.rst +++ b/docs/source/api.rst @@ -11,6 +11,17 @@ Browse the Open API documentation - https://public.vulnerablecode.io/api/schema/ for the OpenAPI schema +How to use OpenAPI documentation +-------------------------------------- + +The API documentation is available at https://public.vulnerablecode.io/api/docs/. +To use the endpoints you need to authenticate with an API key. Request your API key +from https://public.vulnerablecode.io/account/request_api_key/. Once you have +your API key, click on the ``Authorize`` button on the top right of the page and enter +your API key in the ``value`` field with ``Token`` prefix, so if your token is "1234567890abcdef" +then you have to enter this: ``Token 1234567890abcdef``. + + Enable the API key authentication ------------------------------------ @@ -34,10 +45,121 @@ Access the API using curl curl -X GET -H 'Authorization: Token ' https://public.vulnerablecode.io/api/ +.. _Package Vulnerabilities Query: + +Query for Package Vulnerabilities +------------------------------------ + +The package endpoint allows you to query vulnerabilities by package using a +purl or purl fields. + +Sample python script:: + + import requests + + # Query by purl + resp = requests.get( + "https://public.vulnerablecode.io/api/packages?purl=pkg:maven/log4j/log4j@1.2.27", + headers={"Authorization": "Token 123456789"}, + ).json() + + # Query by purl type, get all the vulnerable maven packages + resp = requests.get( + "https://public.vulnerablecode.io/api/packages?type=maven", + headers={"Authorization": "Token 123456789"}, + ).json() + +The response will be a list of packages, these are packages +that are affected by and/or that fix a vulnerability. + + +.. _Package Bulk Search: + +Package Bulk Search +--------------------- + + +The package bulk search endpoint allows you to search for purls in bulk. 
You can +pass a list of purls in the request body and the endpoint will return a list of +purls with vulnerabilities. + + +You can pass a list of ``purls`` in the request body. Each entry should be a +valid purl string. + +You can also pass options like ``purl_only`` and ``plain_purl`` in the request. +``purl_only`` will return only a list of vulnerable purls from the purls received in the request. +``plain_purl`` allows you to query the API using plain purls by removing qualifiers +and subpath from the purl. + +The request body should be a JSON object with the following structure:: + + { + "purls": [ + "pkg:pypi/flask@1.2.0", + "pkg:npm/express@1.0" + ], + "purl_only": false, + "plain_purl": false + } -API endpoints ---------------- +Sample Python script:: + import requests + + request_body = { + "purls": [ + "pkg:npm/grunt-radical@0.0.14" + ], + } + + resp = requests.post('https://public.vulnerablecode.io/api/packages/bulk_search', json=request_body, headers={'Authorization': "Token 123456789"}).json() + + +The response will be a list of packages that are affected by +a vulnerability and/or that fix one. + +.. _CPE Bulk Search: + +CPE Bulk Search +--------------------- + + +The CPE bulk search endpoint allows you to search for vulnerabilities by CPE in bulk. +You can pass a list of CPEs in the request body and the endpoint will +return a list of vulnerabilities. + + +You can pass a list of ``cpes`` in the request body. Each cpe should be a +non-empty string and a valid CPE. 
+ + +The request body should be a JSON object with the following structure:: + + { + "cpes": [ + "cpe:2.3:a:apache:struts:2.3.1:*:*:*:*:*:*:*", + "cpe:2.3:a:apache:struts:2.3.2:*:*:*:*:*:*:*" + ] + } + +Sample python script:: + + import requests + + request_body = { + "cpes": [ + "cpe:2.3:a:apache:struts:2.3.1:*:*:*:*:*:*:*" + ], + } + + resp = requests.post('https://public.vulnerablecode.io/api/cpes/bulk_search', json= request_body, headers={'Authorization': "Token 123456789"}).json() + +The response will be a list of vulnerabilities that have the following CPEs. + + +API endpoints reference +-------------------------- There are two primary endpoints: @@ -48,3 +170,83 @@ There are two primary endpoints: And two secondary endpoints, used to query vulnerability aliases (such as CVEs) and vulnerability by CPEs: cpes/ and aliases/ + +.. list-table:: Table for the main API endpoints + :widths: 30 40 30 + :header-rows: 1 + + * - Endpoint + - Query Parameters + - Expected Output + * - ``/api/packages`` + - + - ``purl`` (string) = package-url of the package + - ``type`` (string) = type of the package + - ``namespace`` (string) = namespace of the package + - ``name`` (string) = name of the package + - ``version`` (string) = version of the package + - ``qualifiers`` (string) = qualifiers of the package + - ``subpath`` (string) = subpath of the package + - ``page`` (integer) = page number of the response + - ``page_size`` (integer) = number of packages in each page + - Return a list of packages using a package-url (purl) or a combination of + type, namespace, name, version, qualifiers, subpath purl fields. See the + `purl specification `_ for more details. See example at :ref:`Package Vulnerabilities Query` section for more details. 
+ * - ``/api/packages/bulk_search`` + - Refer to package bulk search section :ref:`Package Bulk Search` + - Return a list of packages + * - ``/api/vulnerabilities/`` + - + - ``vulnerability_id`` (string) = VCID (VulnerableCode Identifier) of the vulnerability + - ``page`` (integer) = page number of the response + - ``page_size`` (integer) = number of vulnerabilities in each page + - Return a list of vulnerabilities + * - ``/api/cpes`` + - + - ``cpe`` (string) = value of the cpe + - ``page`` (integer) = page number of the response + - ``page_size`` (integer) = number of cpes in each page + - Return a list of vulnerabilities + * - ``/api/cpes/bulk_search`` + - Refer to CPE bulk search section :ref:`CPE Bulk Search` + - Return a list of cpes + * - ``/api/aliases`` + - + - ``alias`` (string) = value of the alias + - ``page`` (integer) = page number of the response + - ``page_size`` (integer) = number of aliases in each page + - Return a list of vulnerabilities + +.. list-table:: Table for other API endpoints + :widths: 30 40 30 + :header-rows: 1 + + * - Endpoint + - Query Parameters + - Expected Output + * - ``/api/packages/{id}`` + - + - ``id`` (integer) = internal primary id of the package + - Return a package with the given id + * - ``/api/packages/all`` + - No parameter required + - Return a list of all vulnerable packages + * - ``/api/vulnerabilities/{id}`` + - + - ``id`` (integer) = internal primary id of the vulnerability + - Return a vulnerability with the given id + * - ``/api/aliases/{id}`` + - + - ``id`` (integer) = internal primary id of the alias + - Return an alias with the given id + * - ``/api/cpes/{id}`` + - + - ``id`` = internal primary id of the cpe + - Return a cpe with the given id + +Miscellaneous +---------------- + +The API is paginated and the default page size is 100. You can change the page size +by passing the ``page_size`` parameter. You can also change the page number by passing +the ``page`` parameter. 
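As a rough illustration of the pagination parameters described above, the sketch below builds a paginated query URL using only the standard library. The helper name ``build_page_url`` is hypothetical and not part of VulnerableCode; the endpoint path and the ``type``, ``page`` and ``page_size`` parameters are taken from the endpoint tables above.

```python
from urllib.parse import urlencode

# Base URL taken from the examples in this document.
API_ROOT = "https://public.vulnerablecode.io/api"


def build_page_url(endpoint, page=1, page_size=100, **filters):
    """Hypothetical helper: return the URL for one page of results."""
    # Filters come first, then the pagination parameters; dicts keep insertion order.
    params = {**filters, "page": page, "page_size": page_size}
    return f"{API_ROOT}/{endpoint.strip('/')}?{urlencode(params)}"


# Second page of vulnerable maven packages, 50 packages per page.
print(build_page_url("packages", page=2, page_size=50, type="maven"))
```

The resulting URL can then be passed to ``requests.get`` together with the ``Authorization`` header, in the same way as the sample scripts earlier in this document.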
diff --git a/docs/source/contributing.rst b/docs/source/contributing.rst index 18f021b4d..fa6e7075b 100644 --- a/docs/source/contributing.rst +++ b/docs/source/contributing.rst @@ -89,3 +89,577 @@ Helpful Resources on how to write good commit messages - `Pro Git book `_ - `How to write a good bug report `_ + +.. _tutorial_add_a_new_importer: + +Add a new importer +------------------- + +This tutorial contains all the things one should know to quickly implement an importer. +Many internal details about importers can be found inside the +:file:`vulnerabilities/importer.py` file. +Make sure to go through :ref:`importer-overview` before you begin writing one. + +TL;DR +------- + +#. Create a new :file:`vulnerabilities/importers/{importer_name.py}` file. +#. Create a new importer subclass inheriting from the ``Importer`` superclass defined in + ``vulnerabilities.importer``. It is conventional to end an importer name with *Importer*. +#. Specify the importer license. +#. Implement the ``advisory_data`` method to process the data source you are + writing an importer for. +#. Add the newly created importer to the importers registry at + ``vulnerabilities/importers/__init__.py``. + +.. _tutorial_add_a_new_importer_prerequisites: + +Prerequisites +-------------- + +Before writing an importer, it is important to familiarize yourself with the following concepts. + +PackageURL +^^^^^^^^^^^^ + +VulnerableCode extensively uses Package URLs to identify a package. See the +`PackageURL specification `_ and its `Python implementation +`_ for more details. + +**Example usage:** + +.. code:: python + + from packageurl import PackageURL + purl = PackageURL(name="ffmpeg", type="deb", version="1.2.3") + + +AdvisoryData +^^^^^^^^^^^^^ + +``AdvisoryData`` is an intermediate data format: +it is expected that your importer will convert the raw scraped data into ``AdvisoryData`` objects. 
+All the fields in the ``AdvisoryData`` dataclass are optional; it is the importer's responsibility to +ensure that it contains meaningful information about a vulnerability. + +AffectedPackage +^^^^^^^^^^^^^^^^ + +The ``AffectedPackage`` data type is used to store a range of affected versions and a fixed version of a +given package. For all version-related data, the `univers `_ library +is used. + +Univers +^^^^^^^^ + +`univers `_ is a Python implementation of the `vers specification `_. +It can parse and compare all the package versions and all the ranges, +from debian, npm, pypi, ruby and more. +It processes all the version range specs and expressions. + +Importer +^^^^^^^^^ + +All the generic importers need to implement the ``Importer`` class. +For ``Git`` or ``Oval`` data source, ``GitImporter`` or ``OvalImporter`` could be implemented. + +.. note:: + + ``GitImporter`` and ``OvalImporter`` need a complete rewrite. + Interested in :ref:`contributing` ? + +Writing an importer +--------------------- + +Create Importer Source File +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +All importers are located in the :file:`vulnerabilities/importers` directory. +Create a new file to put your importer code in. +Generic importers are implemented by writing a subclass for the ``Importer`` superclass and +implementing the unimplemented methods. + +Specify the Importer License +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Importers scrape data off the internet. In order to make sure the data is usable, a license +must be provided. +Populate the ``spdx_license_expression`` with the appropriate value. +The SPDX license identifiers can be found at https://spdx.org/licenses/. + +.. note:: + An SPDX license identifier by itself is a valid license expression. 
In case you need more complex + expressions, see https://spdx.github.io/spdx-spec/v2.3/SPDX-license-expressions/ + +Implement the ``advisory_data`` Method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``advisory_data`` method scrapes the advisories from the data source this importer is +targeted at. +It is required to return an *Iterable of AdvisoryData objects*, and thus it is a good idea to yield +from this method after creating each AdvisoryData object. + +At this point, an example importer will look like this: + +:file:`vulnerabilities/importers/example.py` + +.. code-block:: python + + from typing import Iterable + + from packageurl import PackageURL + + from vulnerabilities.importer import AdvisoryData + from vulnerabilities.importer import Importer + + + class ExampleImporter(Importer): + + spdx_license_expression = "BSD-2-Clause" + + def advisory_data(self) -> Iterable[AdvisoryData]: + return [] + +This importer is only a valid skeleton and does not import anything at all. + +Let us implement another dummy importer that actually imports some data. + +Here we have a ``dummy_package`` which follows ``NginxVersionRange`` and ``SemverVersion`` for +version management from `univers `_. + +.. note:: + + It is possible that the versioning scheme you are targeting has not yet been + implemented in the `univers `_ library. + If this is the case, you will need to head over there and implement one. + +.. 
code-block:: python + + from datetime import datetime + from datetime import timezone + from typing import Iterable + + import requests + from packageurl import PackageURL + from univers.version_range import NginxVersionRange + from univers.versions import SemverVersion + + from vulnerabilities.importer import AdvisoryData + from vulnerabilities.importer import AffectedPackage + from vulnerabilities.importer import Importer + from vulnerabilities.importer import Reference + from vulnerabilities.importer import VulnerabilitySeverity + from vulnerabilities.severity_systems import SCORING_SYSTEMS + + + class ExampleImporter(Importer): + + spdx_license_expression = "BSD-2-Clause" + + def advisory_data(self) -> Iterable[AdvisoryData]: + raw_data = fetch_advisory_data() + for data in raw_data: + yield parse_advisory_data(data) + + + def fetch_advisory_data(): + return [ + { + "id": "CVE-2021-23017", + "summary": "1-byte memory overwrite in resolver", + "advisory_severity": "medium", + "vulnerable": "0.6.18-1.20.0", + "fixed": "1.20.1", + "reference": "http://mailman.nginx.org/pipermail/nginx-announce/2021/000300.html", + "published_on": "14-02-2021 UTC", + }, + { + "id": "CVE-2021-1234", + "summary": "Dummy advisory", + "advisory_severity": "high", + "vulnerable": "0.6.18-1.20.0", + "fixed": "1.20.1", + "reference": "http://example.com/cve-2021-1234", + "published_on": "06-10-2021 UTC", + }, + ] + + + def parse_advisory_data(raw_data) -> AdvisoryData: + purl = PackageURL(type="example", name="dummy_package") + affected_version_range = NginxVersionRange.from_native(raw_data["vulnerable"]) + fixed_version = SemverVersion(raw_data["fixed"]) + affected_package = AffectedPackage( + package=purl, affected_version_range=affected_version_range, fixed_version=fixed_version + ) + severity = VulnerabilitySeverity( + system=SCORING_SYSTEMS["generic_textual"], value=raw_data["advisory_severity"] + ) + references = [Reference(url=raw_data["reference"], severities=[severity])] + 
date_published = datetime.strptime(raw_data["published_on"], "%d-%m-%Y %Z").replace( + tzinfo=timezone.utc + ) + + return AdvisoryData( + aliases=[raw_data["id"]], + summary=raw_data["summary"], + affected_packages=[affected_package], + references=references, + date_published=date_published, + ) + + +.. note:: + + | Use ``make valid`` to format your new code using black and isort automatically. + | Use ``make check`` to check for formatting errors. + +Register the Importer +^^^^^^^^^^^^^^^^^^^^^^ + +Finally, register your importer in the importer registry at +:file:`vulnerabilities/importers/__init__.py`. + +.. code-block:: python + :emphasize-lines: 1, 4 + + from vulnerabilities.importers import example + from vulnerabilities.importers import nginx + + IMPORTERS_REGISTRY = [nginx.NginxImporter, example.ExampleImporter] + + IMPORTERS_REGISTRY = {x.qualified_name: x for x in IMPORTERS_REGISTRY} + +Congratulations! You have written your first importer. + +Run Your First Importer +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If everything went well, you will see your importer in the list of available importers. + +.. code-block:: console + :emphasize-lines: 5 + + $ ./manage.py import --list + + Vulnerability data can be imported from the following importers: + vulnerabilities.importers.nginx.NginxImporter + vulnerabilities.importers.example.ExampleImporter + +Now, run the importer. + +.. code-block:: console + + $ ./manage.py import vulnerabilities.importers.example.ExampleImporter + + Importing data using vulnerabilities.importers.example.ExampleImporter + Successfully imported data using vulnerabilities.importers.example.ExampleImporter + +See :ref:`command_line_interface` for command line usage instructions. + +Enable Debug Logging (Optional) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For more visibility, turn on debug logs in :file:`vulnerablecode/settings.py`. + +.. 
code-block:: python + + DEBUG = True + LOGGING = { + 'version': 1, + 'disable_existing_loggers': False, + 'handlers': { + 'console': { + 'class': 'logging.StreamHandler', + }, + }, + 'root': { + 'handlers': ['console'], + 'level': 'DEBUG', + }, + } + +Invoke the import command now and you will see (in a fresh database): + +.. code-block:: console + + $ ./manage.py import vulnerabilities.importers.example.ExampleImporter + + Importing data using vulnerabilities.importers.example.ExampleImporter + Starting import for vulnerabilities.importers.example.ExampleImporter + [*] New Advisory with aliases: ['CVE-2021-23017'], created_by: vulnerabilities.importers.example.ExampleImporter + [*] New Advisory with aliases: ['CVE-2021-1234'], created_by: vulnerabilities.importers.example.ExampleImporter + Finished import for vulnerabilities.importers.example.ExampleImporter. Imported 2 advisories. + Successfully imported data using vulnerabilities.importers.example.ExampleImporter + +.. _tutorial_add_a_new_improver: + +Add a new improver +--------------------- + +This tutorial contains all the things one should know to quickly +implement an improver. +Many internal details about improvers can be found inside the +:file:`vulnerabilities/improver.py` file. +Make sure to go through :ref:`improver-overview` before you begin writing one. + +TL;DR +------- + +#. Locate the importer whose data this improver will improve, in the + :file:`vulnerabilities/importers/{importer_name.py}` file. +#. Create a new improver subclass inheriting from the ``Improver`` superclass defined in + ``vulnerabilities.improver``. It is conventional to end an improver name with *Improver*. +#. Implement the ``interesting_advisories`` property to return a QuerySet of imported data + (``Advisory``) you are interested in. +#. Implement the ``get_inferences`` method to return an iterable of ``Inference`` objects for the + given ``AdvisoryData``. +#. 
Add the newly created improver to the improvers registry at + ``vulnerabilities/improvers/__init__.py``. + +Prerequisites +-------------- + +Before writing an improver, it is important to familiarize yourself with the following concepts. + +Importer +^^^^^^^^^^ + +Importers are responsible for scraping vulnerability data from various data sources and storing +them in a structured fashion, without creating a complete relational model between vulnerabilities +and their fixes. These data are stored in the ``Advisory`` model and can be converted to an equivalent +``AdvisoryData`` for various use cases. +See :ref:`importer-overview` for a brief overview of importers. + +Importer Prerequisites +^^^^^^^^^^^^^^^^^^^^^^^ + +Improvers consume data produced by importers, and thus it is important to familiarize yourself with +:ref:`Importer Prerequisites `. + +Inference +^^^^^^^^^^^ + +Inferences express the contract between the improvers and the improve runner framework. +An inference is intended to contain data points about a vulnerability without any uncertainties, +which means that one inference will target one vulnerability with the specific relevant affected and +fixed packages (in the form of `PackageURLs `_). +There is no notion of version ranges here: all package versions must be explicitly specified. + +Because this concrete relationship is rarely available anywhere upstream, we have to *infer* +these values, thus the name. +As inferring something is not always perfect, an Inference also comes with a confidence score. + +Improver +^^^^^^^^^ + +All the Improvers must inherit from the ``Improver`` superclass and implement the +``interesting_advisories`` property and the ``get_inferences`` method. + +Writing an improver +--------------------- + +Locate the Source File +^^^^^^^^^^^^^^^^^^^^^^^^ + +If the improver will be working on data imported by a specific importer, it will be located in +the same file at :file:`vulnerabilities/importers/{importer-name.py}`. 
Otherwise, if it is a +generic improver, create a new file :file:`vulnerabilities/improvers/{improver-name.py}`. + +Explore Package Managers (Optional) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If your Improver depends on the discrete versions of a package, the package managers' VersionAPI +located at :file:`vulnerabilities/package_managers.py` could come in handy. You will need to +instantiate the relevant ``VersionAPI`` in the improver's constructor and use it later in the +implemented methods. See an already implemented improver (``NginxBasicImprover``) for an example usage. + +Implement the ``interesting_advisories`` Property +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This property is intended to return a QuerySet of ``Advisory`` on which the ``Improver`` is +designed to work. + +For example, if the improver is designed to work on Advisories imported by ``ExampleImporter``, +the property can be implemented as follows: + +.. code-block:: python + + class ExampleBasicImprover(Improver): + + @property + def interesting_advisories(self) -> QuerySet: + return Advisory.objects.filter(created_by=ExampleImporter.qualified_name) + +Implement the ``get_inferences`` Method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The framework calls the ``get_inferences`` method for every ``AdvisoryData`` that is obtained from +the ``Advisory`` QuerySet returned by the ``interesting_advisories`` property. + +It is expected to return an iterable of ``Inference`` objects for the given ``AdvisoryData``. To +avoid storing a lot of Inferences in memory, it is preferable to yield from this method. + +A very simple Improver that processes all Advisories to create the minimal relationships that can +be obtained by existing data can be found at :file:`vulnerabilities/improvers/default.py`, which is +an example of a generic improver. For a more sophisticated and targeted example, you can look +at an already implemented improver (e.g., :file:`vulnerabilities/importers/nginx.py`). 
+ +Improvers are not limited to improving discrete versions and may also improve ``aliases``. +One such example, improving the importer written in the :ref:`importer tutorial +`, is shown below. + +.. code-block:: python + + from datetime import datetime + from datetime import timezone + from typing import Iterable + + import requests + from django.db.models.query import QuerySet + from packageurl import PackageURL + from univers.version_range import NginxVersionRange + from univers.versions import SemverVersion + + from vulnerabilities.importer import AdvisoryData + from vulnerabilities.improver import MAX_CONFIDENCE + from vulnerabilities.improver import Improver + from vulnerabilities.improver import Inference + from vulnerabilities.models import Advisory + from vulnerabilities.severity_systems import SCORING_SYSTEMS + + + class ExampleImporter(Importer): + ... + + + class ExampleAliasImprover(Improver): + @property + def interesting_advisories(self) -> QuerySet: + return Advisory.objects.filter(created_by=ExampleImporter.qualified_name) + + def get_inferences(self, advisory_data) -> Iterable[Inference]: + for alias in advisory_data.aliases: + new_aliases = fetch_additional_aliases(alias) + aliases = new_aliases + [alias] + yield Inference(aliases=aliases, confidence=MAX_CONFIDENCE) + + + def fetch_additional_aliases(alias): + alias_map = { + "CVE-2021-23017": ["PYSEC-1337", "CERTIN-1337"], + "CVE-2021-1234": ["ANONSEC-1337", "CERTDES-1337"], + } + return alias_map.get(alias, []) + + +.. note:: + + | Use ``make valid`` to format your new code using black and isort automatically. + | Use ``make check`` to check for formatting errors. + +Register the Improver +^^^^^^^^^^^^^^^^^^^^^^ + +Finally, register your improver in the improver registry at +:file:`vulnerabilities/improvers/__init__.py`. + +.. 
code-block:: python + :emphasize-lines: 7 + + from vulnerabilities import importers + from vulnerabilities.improvers import default + + IMPROVERS_REGISTRY = [ + default.DefaultImprover, + importers.nginx.NginxBasicImprover, + importers.example.ExampleAliasImprover, + ] + + IMPROVERS_REGISTRY = {x.qualified_name: x for x in IMPROVERS_REGISTRY} + +Congratulations! You have written your first improver. + +Run Your First Improver +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If everything went well, you will see your improver in the list of available improvers. + +.. code-block:: console + :emphasize-lines: 6 + + $ ./manage.py improve --list + + Vulnerability data can be processed by these available improvers: + vulnerabilities.improvers.default.DefaultImprover + vulnerabilities.importers.nginx.NginxBasicImprover + vulnerabilities.importers.example.ExampleAliasImprover + +Before running the improver, make sure you have imported the data. An improver cannot improve if +there is nothing imported. + +.. code-block:: console + + $ ./manage.py import vulnerabilities.importers.example.ExampleImporter + + Importing data using vulnerabilities.importers.example.ExampleImporter + Successfully imported data using vulnerabilities.importers.example.ExampleImporter + +Now, run the improver. + +.. code-block:: console + + $ ./manage.py improve vulnerabilities.importers.example.ExampleAliasImprover + + Improving data using vulnerabilities.importers.example.ExampleAliasImprover + Successfully improved data using vulnerabilities.importers.example.ExampleAliasImprover + +See :ref:`command_line_interface` for command line usage instructions. + +Enable Debug Logging (Optional) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For more visibility, turn on debug logs in :file:`vulnerablecode/settings.py`. + +.. 
code-block:: python + + DEBUG = True + LOGGING = { + 'version': 1, + 'disable_existing_loggers': False, + 'handlers': { + 'console': { + 'class': 'logging.StreamHandler', + }, + }, + 'root': { + 'handlers': ['console'], + 'level': 'DEBUG', + }, + } + +Invoke the improve command now and you will see (in a fresh database, after importing): + +.. code-block:: console + + $ ./manage.py improve vulnerabilities.importers.example.ExampleAliasImprover + + Improving data using vulnerabilities.importers.example.ExampleAliasImprover + Running improver: vulnerabilities.importers.example.ExampleAliasImprover + Improving advisory id: 1 + New alias for : PYSEC-1337 + New alias for : CVE-2021-23017 + New alias for : CERTIN-1337 + Improving advisory id: 2 + New alias for : CERTDES-1337 + New alias for : ANONSEC-1337 + New alias for : CVE-2021-1234 + Finished improving using vulnerabilities.importers.example.ExampleAliasImprover. + Successfully improved data using vulnerabilities.importers.example.ExampleAliasImprover + +.. note:: + + Even though CVE-2021-23017 and CVE-2021-1234 are not supplied by this improver, the output above shows them + because we left out running the ``DefaultImprover`` in the example. The ``DefaultImprover`` + inserts minimal data found via the importers in the database (here, the above two CVEs). Run + importer, DefaultImprover and then your improver in this sequence to avoid this anomaly. 
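The run order recommended in the note above can be sketched as a small script. This is only an illustration: the helper name ``ordered_commands`` is hypothetical, the qualified names are the ones printed by the ``--list`` commands earlier in this tutorial, and the commands are assumed to be run from the project root. The sketch only builds and prints the command lines; it does not execute them.

```python
import shlex

# Qualified names as printed by ``./manage.py import --list`` and
# ``./manage.py improve --list`` earlier in this tutorial.
RUN_ORDER = [
    ("import", "vulnerabilities.importers.example.ExampleImporter"),
    ("improve", "vulnerabilities.improvers.default.DefaultImprover"),
    ("improve", "vulnerabilities.importers.example.ExampleAliasImprover"),
]


def ordered_commands(run_order=RUN_ORDER):
    """Hypothetical helper: return the shell commands in the order to run them."""
    return [shlex.join(["./manage.py", subcommand, name]) for subcommand, name in run_order]


# Importer first, then DefaultImprover, then the custom improver.
for command in ordered_commands():
    print(command)
```

Each printed line can be run as-is in a shell, or passed through ``shlex.split`` to ``subprocess.run(..., check=True)`` so that a failing step stops the sequence.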
diff --git a/docs/source/images/pkg_details.png b/docs/source/images/pkg_details.png new file mode 100644 index 000000000..f1df78302 Binary files /dev/null and b/docs/source/images/pkg_details.png differ diff --git a/docs/source/images/pkg_search.png b/docs/source/images/pkg_search.png new file mode 100644 index 000000000..b152eaeaf Binary files /dev/null and b/docs/source/images/pkg_search.png differ diff --git a/docs/source/images/vuln_affected_packages.png b/docs/source/images/vuln_affected_packages.png new file mode 100644 index 000000000..d326b8af3 Binary files /dev/null and b/docs/source/images/vuln_affected_packages.png differ diff --git a/docs/source/images/vuln_details.png b/docs/source/images/vuln_details.png new file mode 100644 index 000000000..9de3459b5 Binary files /dev/null and b/docs/source/images/vuln_details.png differ diff --git a/docs/source/images/vuln_fixed_packages.png b/docs/source/images/vuln_fixed_packages.png new file mode 100644 index 000000000..428671790 Binary files /dev/null and b/docs/source/images/vuln_fixed_packages.png differ diff --git a/docs/source/images/vuln_search.png b/docs/source/images/vuln_search.png new file mode 100644 index 000000000..11cda2712 Binary files /dev/null and b/docs/source/images/vuln_search.png differ diff --git a/docs/source/index.rst b/docs/source/index.rst index 4f32eb472..7a93dc017 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -21,19 +21,13 @@ In this documentation you will find information on: :caption: Getting Started introduction + user-interface installation + api contributing faq misc -.. toctree:: - :maxdepth: 2 - :caption: Tutorial - - tutorial_add_new_importer - tutorial_add_new_improver - - .. toctree:: :maxdepth: 2 :caption: Reference Documentation @@ -43,7 +37,6 @@ In this documentation you will find information on: reference_framework_overview command-line-interface importers_link - api .. 
toctree:: :maxdepth: 1 diff --git a/docs/source/tutorial_add_new_importer.rst b/docs/source/tutorial_add_new_importer.rst deleted file mode 100644 index 454b60c81..000000000 --- a/docs/source/tutorial_add_new_importer.rst +++ /dev/null @@ -1,301 +0,0 @@ -.. _tutorial_add_a_new_importer: - -Add a new importer -==================== - -This tutorial contains all the things one should know to quickly implement an importer. -Many internal details about importers can be found inside the -:file:`vulnerabilites/importer.py` file. -Make sure to go through :ref:`importer-overview` before you begin writing one. - -TL;DR -------- - -#. Create a new :file:`vulnerabilities/importers/{importer_name.py}` file. -#. Create a new importer subclass inheriting from the ``Importer`` superclass defined in - ``vulnerabilites.importer``. It is conventional to end an importer name with *Importer*. -#. Specify the importer license. -#. Implement the ``advisory_data`` method to process the data source you are - writing an importer for. -#. Add the newly created importer to the importers registry at - ``vulnerabilites/importers/__init__.py`` - -.. _tutorial_add_a_new_importer_prerequisites: - -Prerequisites --------------- - -Before writing an importer, it is important to familiarize yourself with the following concepts. - -PackageURL -^^^^^^^^^^^^ - -VulnerableCode extensively uses Package URLs to identify a package. See the -`PackageURL specification `_ and its `Python implementation -`_ for more details. - -**Example usage:** - -.. code:: python - - from packageurl import PackageURL - purl = PackageURL(name="ffmpeg", type="deb", version="1.2.3") - - -AdvisoryData -^^^^^^^^^^^^^ - -``AdvisoryData`` is an intermediate data format: -it is expected that your importer will convert the raw scraped data into ``AdvisoryData`` objects. 
-All the fields in ``AdvisoryData`` dataclass are optional; it is the importer's resposibility to -ensure that it contains meaningful information about a vulnerability. - -AffectedPackage -^^^^^^^^^^^^^^^^ - -``AffectedPackage`` data type is used to store a range of affected versions and a fixed version of a -given package. For all version-related data, `univers `_ library -is used. - -Univers -^^^^^^^^ - -`univers `_ is a Python implementation of the `vers specification `_. -It can parse and compare all the package versions and all the ranges, -from debian, npm, pypi, ruby and more. -It processes all the version range specs and expressions. - -Importer -^^^^^^^^^ - -All the generic importers need to implement the ``Importer`` class. -For ``Git`` or ``Oval`` data source, ``GitImporter`` or ``OvalImporter`` could be implemented. - -.. note:: - - ``GitImporter`` and ``OvalImporter`` need a complete rewrite. - Interested in :ref:`contributing` ? - -Writing an importer ---------------------- - -Create Importer Source File -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -All importers are located in the :file:`vulnerabilites/importers` package. -Create a new file to put your importer code in. -Generic importers are implemented by writing a subclass for the ``Importer`` superclass and -implementing the unimplemented methods. - -Specify the Importer License -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Importers scrape data off the internet. In order to make sure the data is useable, a license -must be provided. -Populate the ``spdx_license_expression`` with the appropriate value. -The SPDX license identifiers can be found at https://spdx.org/licenses/. - -.. note:: - An SPDX license identifier by itself is a valid licence expression. 
In case you need more complex - expressions, see https://spdx.github.io/spdx-spec/v2.3/SPDX-license-expressions/ - -Implement the ``advisory_data`` Method -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The ``advisory_data`` method scrapes the advisories from the data source this importer is -targeted at. -It is required to return an *Iterable of AdvisoryData objects*, and thus it is a good idea to yield -from this method after creating each AdvisoryData object. - -At this point, an example importer will look like this: - -:file:`vulnerabilites/importers/example.py` - -.. code-block:: python - - from typing import Iterable - - from packageurl import PackageURL - - from vulnerabilities.importer import AdvisoryData - from vulnerabilities.importer import Importer - - - class ExampleImporter(Importer): - - spdx_license_expression = "BSD-2-Clause" - - def advisory_data(self) -> Iterable[AdvisoryData]: - return [] - -This importer is only a valid skeleton and does not import anything at all. - -Let us implement another dummy importer that actually imports some data. - -Here we have a ``dummy_package`` which follows ``NginxVersionRange`` and ``SemverVersion`` for -version management from `univers `_. - -.. note:: - - It is possible that the versioning scheme you are targetting has not yet been - implemented in the `univers `_ library. - If this is the case, you will need to head over there and implement one. - -.. 
code-block:: python - - from datetime import datetime - from datetime import timezone - from typing import Iterable - - import requests - from packageurl import PackageURL - from univers.version_range import NginxVersionRange - from univers.versions import SemverVersion - - from vulnerabilities.importer import AdvisoryData - from vulnerabilities.importer import AffectedPackage - from vulnerabilities.importer import Importer - from vulnerabilities.importer import Reference - from vulnerabilities.importer import VulnerabilitySeverity - from vulnerabilities.severity_systems import SCORING_SYSTEMS - - - class ExampleImporter(Importer): - - spdx_license_expression = "BSD-2-Clause" - - def advisory_data(self) -> Iterable[AdvisoryData]: - raw_data = fetch_advisory_data() - for data in raw_data: - yield parse_advisory_data(data) - - - def fetch_advisory_data(): - return [ - { - "id": "CVE-2021-23017", - "summary": "1-byte memory overwrite in resolver", - "advisory_severity": "medium", - "vulnerable": "0.6.18-1.20.0", - "fixed": "1.20.1", - "reference": "http://mailman.nginx.org/pipermail/nginx-announce/2021/000300.html", - "published_on": "14-02-2021 UTC", - }, - { - "id": "CVE-2021-1234", - "summary": "Dummy advisory", - "advisory_severity": "high", - "vulnerable": "0.6.18-1.20.0", - "fixed": "1.20.1", - "reference": "http://example.com/cve-2021-1234", - "published_on": "06-10-2021 UTC", - }, - ] - - - def parse_advisory_data(raw_data) -> AdvisoryData: - purl = PackageURL(type="example", name="dummy_package") - affected_version_range = NginxVersionRange.from_native(raw_data["vulnerable"]) - fixed_version = SemverVersion(raw_data["fixed"]) - affected_package = AffectedPackage( - package=purl, affected_version_range=affected_version_range, fixed_version=fixed_version - ) - severity = VulnerabilitySeverity( - system=SCORING_SYSTEMS["generic_textual"], value=raw_data["advisory_severity"] - ) - references = [Reference(url=raw_data["reference"], severities=[severity])] - 
date_published = datetime.strptime(raw_data["published_on"], "%d-%m-%Y %Z").replace( - tzinfo=timezone.utc - ) - - return AdvisoryData( - aliases=[raw_data["id"]], - summary=raw_data["summary"], - affected_packages=[affected_package], - references=references, - date_published=date_published, - ) - - -.. note:: - - | Use ``make valid`` to format your new code using black and isort automatically. - | Use ``make check`` to check for formatting errrors. - -Register the Importer -^^^^^^^^^^^^^^^^^^^^^^ - -Finally, register your importer in the importer registry at -:file:`vulnerabilites/importers/__init__.py` - -.. code-block:: python - :emphasize-lines: 1, 4 - - from vulnerabilities.importers import example - from vulnerabilities.importers import nginx - - IMPORTERS_REGISTRY = [nginx.NginxImporter, example.ExampleImporter] - - IMPORTERS_REGISTRY = {x.qualified_name: x for x in IMPORTERS_REGISTRY} - -Congratulations! You have written your first importer. - -Run Your First Importer -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -If everything went well, you will see your importer in the list of available importers. - -.. code-block:: console - :emphasize-lines: 5 - - $ ./manage.py import --list - - Vulnerability data can be imported from the following importers: - vulnerabilities.importers.nginx.NginxImporter - vulnerabilities.importers.example.ExampleImporter - -Now, run the importer. - -.. code-block:: console - - $ ./manage.py import vulnerabilities.importers.example.ExampleImporter - - Importing data using vulnerabilities.importers.example.ExampleImporter - Successfully imported data using vulnerabilities.importers.example.ExampleImporter - -See :ref:`command_line_interface` for command line usage instructions. - -Enable Debug Logging (Optional) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -For more visibility, turn on debug logs in :file:`vulnerablecode/settings.py`. - -.. 
code-block:: python - - DEBUG = True - LOGGING = { - 'version': 1, - 'disable_existing_loggers': False, - 'handlers': { - 'console': { - 'class': 'logging.StreamHandler', - }, - }, - 'root': { - 'handlers': ['console'], - 'level': 'DEBUG', - }, - } - -Invoke the import command now and you will see (in a fresh database): - -.. code-block:: console - - $ ./manage.py import vulnerabilities.importers.example.ExampleImporter - - Importing data using vulnerabilities.importers.example.ExampleImporter - Starting import for vulnerabilities.importers.example.ExampleImporter - [*] New Advisory with aliases: ['CVE-2021-23017'], created_by: vulnerabilities.importers.example.ExampleImporter - [*] New Advisory with aliases: ['CVE-2021-1234'], created_by: vulnerabilities.importers.example.ExampleImporter - Finished import for vulnerabilities.importers.example.ExampleImporter. Imported 2 advisories. - Successfully imported data using vulnerabilities.importers.example.ExampleImporter diff --git a/docs/source/tutorial_add_new_improver.rst b/docs/source/tutorial_add_new_improver.rst deleted file mode 100644 index 16fc7beab..000000000 --- a/docs/source/tutorial_add_new_improver.rst +++ /dev/null @@ -1,271 +0,0 @@ -.. _tutorial_add_a_new_improver: - -Add a new improver -==================== - -This tutorial contains all the things one should know to quickly -implement an improver. -Many internal details about improvers can be found inside the -:file:`vulnerabilites/improver.py` file. -Make sure to go through :ref:`improver-overview` before you begin writing one. - -TL;DR -------- - -#. Locate the importer that this improver will be improving data of at - :file:`vulnerabilities/importers/{importer_name.py}` file. -#. Create a new improver subclass inheriting from the ``Improver`` superclass defined in - ``vulnerabilites.improver``. It is conventional to end an improver name with *Improver*. -#. 
Implement the ``interesting_advisories`` property to return a QuerySet of imported data - (``Advisory``) you are interested in. -#. Implement the ``get_inferences`` method to return an iterable of ``Inference`` objects for the - given ``AdvisoryData``. -#. Add the newly created improver to the improvers registry at - ``vulnerabilites/improvers/__init__.py``. - -Prerequisites --------------- - -Before writing an improver, it is important to familiarize yourself with the following concepts. - -Importer -^^^^^^^^^^ - -Importers are responsible for scraping vulnerability data from various data sources without creating -a complete relational model between vulnerabilites and their fixes and storing them in a structured -fashion. These data are stored in the ``Advisory`` model and can be converted to an equivalent -``AdvisoryData`` for various use cases. -See :ref:`importer-overview` for a brief overview on importers. - -Importer Prerequisites -^^^^^^^^^^^^^^^^^^^^^^^ - -Improvers consume data produced by importers, and thus it is important to familiarize yourself with -:ref:`Importer Prerequisites `. - -Inference -^^^^^^^^^^^ - -Inferences express the contract between the improvers and the improve runner framework. -An inference is intended to contain data points about a vulnerability without any uncertainties, -which means that one inference will target one vulnerability with the specific relevant affected and -fixed packages (in the form of `PackageURLs `_). -There is no notion of version ranges here: all package versions must be explicitly specified. - -Because this concrete relationship is rarely available anywhere upstream, we have to *infer* -these values, thus the name. -As inferring something is not always perfect, an Inference also comes with a confidence score. - -Improver -^^^^^^^^^ - -All the Improvers must inherit from ``Improver`` superclass and implement the -``interesting_advisories`` property and the ``get_inferences`` method. 
- -Writing an improver ---------------------- - -Locate the Source File -^^^^^^^^^^^^^^^^^^^^^^^^ - -If the improver will be working on data imported by a specific importer, it will be located in -the same file at :file:`vulnerabilites/importers/{importer-name.py}`. Otherwise, if it is a -generic improver, create a new file :file:`vulnerabilites/improvers/{improver-name.py}`. - -Explore Package Managers (Optional) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -If your Improver depends on the discrete versions of a package, the package managers' VersionAPI -located at :file:`vulnerabilites/package_managers.py` could come in handy. You will need to -instantiate the relevant ``VersionAPI`` in the improver's constructor and use it later in the -implemented methods. See an already implemented improver (NginxBasicImprover) for an example usage. - -Implement the ``interesting_advisories`` Property -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This property is intended to return a QuerySet of ``Advisory`` on which the ``Improver`` is -designed to work. - -For example, if the improver is designed to work on Advisories imported by ``ExampleImporter``, -the property can be implemented as - -.. code-block:: python - - class ExampleBasicImprover(Improver): - - @property - def interesting_advisories(self) -> QuerySet: - return Advisory.objects.filter(created_by=ExampleImporter.qualified_name) - -Implement the ``get_inferences`` Method -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The framework calls ``get_inferences`` method for every ``AdvisoryData`` that is obtained from -the ``Advisory`` QuerySet returned by the ``interesting_advisories`` property. - -It is expected to return an iterable of ``Inference`` objects for the given ``AdvisoryData``. To -avoid storing a lot of Inferences in memory, it is preferable to yield from this method. 
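The advice to yield from ``get_inferences`` can be sketched in isolation. This is a minimal illustration under stated assumptions, not VulnerableCode code: ``AdvisoryData`` and ``Inference`` below are simplified stand-in dataclasses, and the value ``MAX_CONFIDENCE = 100`` is assumed. The real definitions live in ``vulnerabilities.importer`` and ``vulnerabilities.improver``.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Stand-ins for illustration only; the real classes live in
# vulnerabilities.importer and vulnerabilities.improver.
MAX_CONFIDENCE = 100  # assumed value, not taken from the source


@dataclass
class AdvisoryData:
    aliases: List[str] = field(default_factory=list)


@dataclass
class Inference:
    aliases: List[str]
    confidence: int


def get_inferences(advisory_data: AdvisoryData) -> Iterable[Inference]:
    # A generator: each Inference is built only when the caller asks for
    # the next item, so the full result is never held in memory at once.
    for alias in advisory_data.aliases:
        yield Inference(aliases=[alias], confidence=MAX_CONFIDENCE)


advisory = AdvisoryData(aliases=["CVE-2021-23017", "CVE-2021-1234"])
inferences = get_inferences(advisory)
first = next(inferences)      # only the first Inference exists so far
remaining = list(inferences)  # consume the rest
```

Returning a list would also satisfy the "iterable" contract. The generator form simply keeps memory use flat when an advisory produces many inferences.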
- -A very simple Improver that processes all Advisories to create the minimal relationships that can -be obtained by existing data can be found at :file:`vulnerabilites/improvers/default.py`, which is -an example of a generic improver. For a more sophisticated and targeted example, you can look -at an already implemented improver (e.g., :file:`vulnerabilites/importers/nginx.py`). - -Improvers are not limited to improving discrete versions and may also improve ``aliases``. -One such example, improving the importer written in the :ref:`importer tutorial -`, is shown below. - -.. code-block:: python - - from datetime import datetime - from datetime import timezone - from typing import Iterable - - import requests - from django.db.models.query import QuerySet - from packageurl import PackageURL - from univers.version_range import NginxVersionRange - from univers.versions import SemverVersion - - from vulnerabilities.importer import AdvisoryData - from vulnerabilities.improver import MAX_CONFIDENCE - from vulnerabilities.improver import Improver - from vulnerabilities.improver import Inference - from vulnerabilities.models import Advisory - from vulnerabilities.severity_systems import SCORING_SYSTEMS - - - class ExampleImporter(Importer): - ... - - - class ExampleAliasImprover(Improver): - @property - def interesting_advisories(self) -> QuerySet: - return Advisory.objects.filter(created_by=ExampleImporter.qualified_name) - - def get_inferences(self, advisory_data) -> Iterable[Inference]: - for alias in advisory_data.aliases: - new_aliases = fetch_additional_aliases(alias) - aliases = new_aliases + [alias] - yield Inference(aliases=aliases, confidence=MAX_CONFIDENCE) - - - def fetch_additional_aliases(alias): - alias_map = { - "CVE-2021-23017": ["PYSEC-1337", "CERTIN-1337"], - "CVE-2021-1234": ["ANONSEC-1337", "CERTDES-1337"], - } - return alias_map.get(alias) - - -.. note:: - - | Use ``make valid`` to format your new code using black and isort automatically. 
- | Use ``make check`` to check for formatting errrors. - -Register the Improver -^^^^^^^^^^^^^^^^^^^^^^ - -Finally, register your improver in the improver registry at -:file:`vulnerabilites/improvers/__init__.py`. - -.. code-block:: python - :emphasize-lines: 7 - - from vulnerabilities import importers - from vulnerabilities.improvers import default - - IMPROVERS_REGISTRY = [ - default.DefaultImprover, - importers.nginx.NginxBasicImprover, - importers.example.ExampleAliasImprover, - ] - - IMPROVERS_REGISTRY = {x.qualified_name: x for x in IMPROVERS_REGISTRY} - -Congratulations! You have written your first improver. - -Run Your First Improver -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -If everything went well, you will see your improver in the list of available improvers. - -.. code-block:: console - :emphasize-lines: 6 - - $ ./manage.py improve --list - - Vulnerability data can be processed by these available improvers: - vulnerabilities.improvers.default.DefaultImprover - vulnerabilities.importers.nginx.NginxBasicImprover - vulnerabilities.importers.example.ExampleAliasImprover - -Before running the improver, make sure you have imported the data. An improver cannot improve if -there is nothing imported. - -.. code-block:: console - - $ ./manage.py import vulnerabilities.importers.example.ExampleImporter - - Importing data using vulnerabilities.importers.example.ExampleImporter - Successfully imported data using vulnerabilities.importers.example.ExampleImporter - -Now, run the improver. - -.. code-block:: console - - $ ./manage.py improve vulnerabilities.importers.example.ExampleAliasImprover - - Improving data using vulnerabilities.importers.example.ExampleAliasImprover - Successfully improved data using vulnerabilities.importers.example.ExampleAliasImprover - -See :ref:`command_line_interface` for command line usage instructions. 
- -Enable Debug Logging (Optional) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -For more visibility, turn on debug logs in :file:`vulnerablecode/settings.py`. - -.. code-block:: python - - DEBUG = True - LOGGING = { - 'version': 1, - 'disable_existing_loggers': False, - 'handlers': { - 'console': { - 'class': 'logging.StreamHandler', - }, - }, - 'root': { - 'handlers': ['console'], - 'level': 'DEBUG', - }, - } - -Invoke the improve command now and you will see (in a fresh database, after importing): - -.. code-block:: console - - $ ./manage.py improve vulnerabilities.importers.example.ExampleAliasImprover - - Improving data using vulnerabilities.importers.example.ExampleAliasImprover - Running improver: vulnerabilities.importers.example.ExampleAliasImprover - Improving advisory id: 1 - New alias for : PYSEC-1337 - New alias for : CVE-2021-23017 - New alias for : CERTIN-1337 - Improving advisory id: 2 - New alias for : CERTDES-1337 - New alias for : ANONSEC-1337 - New alias for : CVE-2021-1234 - Finished improving using vulnerabilities.importers.example.ExampleAliasImprover. - Successfully improved data using vulnerabilities.importers.example.ExampleAliasImprover - -.. note:: - - Even though CVE-2021-23017 and CVE-2021-1234 are not supplied by this improver, the output above shows them - because we left out running the ``DefaultImprover`` in the example. The ``DefaultImprover`` - inserts minimal data found via the importers in the database (here, the above two CVEs). Run - importer, DefaultImprover and then your improver in this sequence to avoid this anomaly. diff --git a/docs/source/user-interface.rst b/docs/source/user-interface.rst new file mode 100644 index 000000000..251896c8a --- /dev/null +++ b/docs/source/user-interface.rst @@ -0,0 +1,73 @@ +.. _user-interface: + +User Interface +================ + +.. _pkg-search: + +Search by packages +------------------ + +The search by packages is a very powerful feature of +VulnerableCode. 
+It allows you to search for packages by package URL, by a purl prefix
+fragment such as ``pkg:pypi``, or by package name.
+
+The search by packages is available at the following URL:
+
+    https://public.vulnerablecode.io/packages/search
+
+How to search by packages:
+
+    1. Go to the URL: https://public.vulnerablecode.io/packages/search
+    2. Enter a package URL, a purl prefix fragment such as ``pkg:pypi``,
+       or a package name in the search box.
+    3. Click on the search button.
+
+The search results will be displayed in the table below the search box.
+
+    .. image:: images/pkg_search.png
+
+Click on the package URL to view the package details.
+
+    .. image:: images/pkg_details.png
+
+
+.. _vuln-search:
+
+Search by vulnerabilities
+---------------------------
+
+The search by vulnerabilities is a very powerful feature of
+VulnerableCode. It allows you to search for vulnerabilities by
+VCID. It also allows you to search for
+vulnerabilities by CVE, GHSA, CPE, etc., or by a
+fragment of these identifiers such as ``CVE-2021``.
+
+The search by vulnerabilities is available at the following URL:
+
+    https://public.vulnerablecode.io/vulnerabilities/search
+
+How to search by vulnerabilities:
+
+    1. Go to the URL: https://public.vulnerablecode.io/vulnerabilities/search
+    2. Enter a VCID, CVE, GHSA, CPE, etc. in the search box.
+    3. Click on the search button.
+
+The search results will be displayed in the table below the search box.
+
+    .. image:: images/vuln_search.png
+
+Click on the VCID to view the vulnerability details.
+
+    .. image:: images/vuln_details.png
+
+The Affected packages tab shows the list of packages affected by the
+vulnerability.
+
+    .. image:: images/vuln_affected_packages.png
+
+The Fixed by packages tab shows the list of packages that fix the
+vulnerability.
+
+    .. image:: images/vuln_fixed_packages.png
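The two search pages described above can also be reached through a prepared link. The sketch below builds such a link with the standard library only. The query parameter name ``search`` is an assumption that the documentation above does not confirm, and ``build_search_url`` is a hypothetical helper, so verify the parameter against the live site before relying on it.

```python
from urllib.parse import urlencode

BASE_URL = "https://public.vulnerablecode.io"

# Paths taken from the documentation above.
SEARCH_PATHS = {
    "packages": "/packages/search",
    "vulnerabilities": "/vulnerabilities/search",
}


def build_search_url(kind: str, query: str) -> str:
    """Return a link to the package or vulnerability search page.

    ASSUMPTION: the page reads the query from a ``search`` parameter.
    """
    path = SEARCH_PATHS[kind]
    # urlencode percent-encodes characters such as ":" and "/" in purls.
    return f"{BASE_URL}{path}?{urlencode({'search': query})}"


package_link = build_search_url("packages", "pkg:pypi")
vulnerability_link = build_search_url("vulnerabilities", "CVE-2021")
```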