diff --git a/docs/source/command-line-interface.rst b/docs/source/command-line-interface.rst index 4b24e28a3..f4a02858d 100644 --- a/docs/source/command-line-interface.rst +++ b/docs/source/command-line-interface.rst @@ -6,7 +6,7 @@ Command Line Interface The main entry point is Django's :guilabel:`manage.py` management commands. ``$ ./manage.py --help`` ------------------------ +------------------------ Lists all sub-commands available, including Django built-in commands. VulnerableCode's own commands are listed under the ``[vulnerabilities]`` section:: diff --git a/docs/source/conf.py b/docs/source/conf.py index 7eaba1b07..0fb591301 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -18,8 +18,8 @@ # -- Project information ----------------------------------------------------- project = "VulnerableCode" -copyright = "nexb Inc. and others" -author = "nexb Inc. and others" +copyright = "nexB Inc. and others" +author = "nexB Inc. and others" # -- General configuration --------------------------------------------------- diff --git a/docs/source/importers_link.rst b/docs/source/importers_link.rst new file mode 100644 index 000000000..ca5b18175 --- /dev/null +++ b/docs/source/importers_link.rst @@ -0,0 +1,6 @@ +.. _importers_link: + +Importers +========= + +.. include:: ../../SOURCES.rst diff --git a/docs/source/installation.rst b/docs/source/installation.rst index ce5036c1e..4f098bc3f 100644 --- a/docs/source/installation.rst +++ b/docs/source/installation.rst @@ -109,7 +109,7 @@ Local development installation Supported Platforms ^^^^^^^^^^^^^^^^^^^ -**VulnerableCode* has been tested and is supported on the following operating systems: +**VulnerableCode** has been tested and is supported on the following operating systems: #. **Debian-based** Linux distributions #. 
**macOS** 12.1 and up @@ -122,7 +122,7 @@ Pre-installation Checklist Before you install VulnerableCode, make sure you have the following prerequisites: - * **Python: 3.8+* found at https://www.python.org/downloads/ + * **Python: 3.8+** found at https://www.python.org/downloads/ * **Git**: most recent release available at https://git-scm.com/ * **PostgreSQL**: release 10 or later found at https://www.postgresql.org/ or https://postgresapp.com/ on macOS @@ -212,8 +212,6 @@ application. This setup is **not suitable for deployments** and **only supported for local development**. -An overview of the web application usage is available at :ref:`user_interface`. - Upgrading ^^^^^^^^^ diff --git a/docs/source/introduction.rst b/docs/source/introduction.rst index 4e9fe81b3..ceee77bca 100644 --- a/docs/source/introduction.rst +++ b/docs/source/introduction.rst @@ -7,8 +7,8 @@ VulnerableCode is a work-in-progress towards a free and open vulnerabilities database and the packages they impact and the tools to aggregate and correlate these vulnerabilities. -Why VulnerableCode ? ---------------------- +Why VulnerableCode? +------------------- The existing solutions are commercial proprietary vulnerability databases, which in itself does not make sense because the data is about FOSS (Free and Open @@ -27,12 +27,12 @@ security issues because: fundamental questions "Is package foo vulnerable" and "Is package foo vulnerable to vulnerability bar?" -How does it work ? -------------------- +How does it work? +----------------- VulnerableCode independently aggregates many software vulnerability data sources and supports data re-creation in a decentralized fashion. These data sources -(see complete list `here <./SOURCES.rst>`_) include security advisories +(see complete list :ref:`here `) include security advisories published by Linux and BSD distributions, application software package managers and package repositories, FOSS projects, GitHub and more. 
Thanks to this approach, the data is focused on specific ecosystems yet aggregated in a single @@ -59,14 +59,17 @@ exposure due to various reasons like but not limited to the complicated procedure to receive CVE ID or not able to classify a bug as a security compromise. -Recent presentations: -- `Open Source Summit 2020 `_ +Is VulnerableCode being actively developed? +------------------------------------------- -Should I use VulnerableCode ? -------------------------------- +Yes -- VulnerableCode is a work in progress! Please stay in touch on our `Gitter channel `_; and if you have any feedback, feel free to `enter an issue in our GitHub repo `_. -VulnerableCode is a work in progress project and will likely go through major changes. Please stay in touch on our `Gitter channel `_ + +Recent presentations +-------------------- + +- `Open Source Summit 2020 `_ .. Some of this documentation is borrowed from the metaflow documentation and is also under Apache-2.0 diff --git a/docs/source/reference_importer_overview.rst b/docs/source/reference_importer_overview.rst index 2c794a480..104c7c6ec 100644 --- a/docs/source/reference_importer_overview.rst +++ b/docs/source/reference_importer_overview.rst @@ -3,34 +3,33 @@ Importer Overview ================== -Importers are responsible to scrape vulnerability data from various data sources without creating -a complete relational model between vulnerabilites, their fixes and store them in a structured -fashion. +Importers are responsible for scraping vulnerability data such as vulnerabilities and their fixes +and for storing the scraped information in a structured fashion. The structured data created by the +importer then provides input to an improver (see :ref:`improver-overview`), which is responsible +for creating a relational model for vulnerabilities, affected packages and fixed packages. -All importer implementation related code is defined in :file:`vulnerabilites/importer.py`. 
+All importer implementation-related code is defined in :file:`vulnerabilites/importer.py`. -Whereas, the framework related code for actually invoking and processing the importers are -situated in :file:`vulnerabilites/import_runner.py`. +In addition, the framework-related code for actually invoking and processing the importers is +located in :file:`vulnerabilites/import_runner.py`. -The importers, after scraping, provide with ``AdvisoryData`` objects. These objects are then +The importers, after scraping, provide ``AdvisoryData`` objects. These objects are then processed and inserted into the ``Advisory`` model. While implementing an importer, it is important to make sure that the importer does not alter the -upstream data at all. Its only job is to convert the data from a data source into structured - yet -non relational - data. The importers must **not** be smart or performing trickeries -under the hood. -This ensures that we always have a *true* copy of an advisory without any speculations or -improvements. +upstream data at all. Its only job is to convert the data from a data source into structured -- yet +non-relational -- data. This ensures that we always have a *true* copy of an advisory without any +modifications. -As importers do not speculate and given that a lot of advisories publish version ranges of affected +Given that a lot of advisories publish version ranges of affected packages, it is necessary to store those ranges in a structured manner. *Vers* was designed to solve this problem. It has been implemented in the `univers `_ library whose development goes hand in hand with VulnerableCode. -The data imported by importers is not useful by itself, it must be processed into a relational -model. The version ranges are required to be dissolved into concrete ranges. These are achieved by -``Improvers``. For more, see: :ref:`improver-overview` +The data imported by importers is not useful by itself: it must be processed into a relational +model. 
The version ranges are required to be resolved into concrete ranges. These are achieved by +``Improvers`` (see :ref:`improver-overview` for details). -As of now, the following importers have been implemented in VulnerableCode +As of now, the following importers have been implemented in VulnerableCode: .. include:: ../../SOURCES.rst diff --git a/docs/source/reference_improver_overview.rst b/docs/source/reference_improver_overview.rst index 01fd0b81b..4de0ff1c0 100644 --- a/docs/source/reference_improver_overview.rst +++ b/docs/source/reference_improver_overview.rst @@ -6,29 +6,29 @@ Improver Overview Improvers improve upon already imported data. They are responsible for creating a relational model for vulnerabilites and packages. -An Improver is supposed to contain data points about a vulnerability and the relevant discrete +An Improver is intended to contain data points about a vulnerability and the relevant discrete affected and fixed packages (in the form of `PackageURLs `_). -There is no notion of version ranges here, all package versions must be explicitly specified. -As this concrete relationship might not always be absolutely correct, improvers supply with a +There is no notion of version ranges here; all package versions must be explicitly specified. +As this concrete relationship might not always be absolutely correct, improvers supply a confidence score and only the record with the highest confidence against a vulnerability and package relationship is stored in the database. There are two categories of improvers: - **Generic**: Improve upon some imported data irrespective of any importer. These improvers are - defined in :file:`vulnerabilites/improvers/` + defined in :file:`vulnerabilites/improvers/`. - **Importer Specific**: Improve upon data imported by a specific importer. These are defined in the corresponding importer file itself. Both types of improvers internally work in a similar fashion. 
They indicate which ``Advisory`` they are interested in and when supplied with those Advisories, they return Inferences. -An ``Inference`` is more explicit than an ``Advisory`` and is able to answer the questions like, "Is -package A vulnerable to Vulnerability B ?". Of course, there is some confidence attached with the -answer which could also be ``MAX_CONFIDENCE`` in certain cases. +An ``Inference`` is more explicit than an ``Advisory`` and is able to answer questions like "Is +package A vulnerable to Vulnerability B ?". Of course, there is some confidence attached to the +answer, which could also be ``MAX_CONFIDENCE`` in certain cases. -The possibilities with improvers is endless, they are not restricted to take one approach. Features -like *Time Travel* and *finding fix commits* could be Implemented as well. +The possibilities with improvers are endless; they are not restricted to take one approach. Features +like *Time Travel* and *finding fix commits* could be implemented as well. You can find more in-code documentation about improvers in :file:`vulnerabilites/improver.py` and -the framework responsible for invoking these improvers in :file:`vulnerabilites/improve_runner.py` +the framework responsible for invoking these improvers in :file:`vulnerabilites/improve_runner.py`. diff --git a/docs/source/tutorial_add_new_importer.rst b/docs/source/tutorial_add_new_importer.rst index d8034bc81..137317985 100644 --- a/docs/source/tutorial_add_new_importer.rst +++ b/docs/source/tutorial_add_new_importer.rst @@ -3,9 +3,8 @@ Add a new importer ==================== -This tutorial contains all the things one should know to quickly -implement an importer. -A lot of internal sausage about importers could be found inside the +This tutorial contains all the things one should know to quickly implement an importer. +Many internal details about importers can be found inside the :file:`vulnerabilites/importer.py` file. 
Make sure to go through :ref:`importer-overview` before you begin writing one. @@ -15,7 +14,7 @@ TL;DR #. Create a new :file:`vulnerabilities/importers/{importer_name.py}` file. #. Create a new importer subclass inheriting from the ``Importer`` superclass defined in ``vulnerabilites.importer``. It is conventional to end an importer name with *Importer*. -#. Specify the importer licence. +#. Specify the importer license. #. Implement the ``advisory_data`` method to process the data source you're writing an importer for. #. Add the newly created importer to the importers registry at ``vulnerabilites/importers/__init__.py`` @@ -45,24 +44,24 @@ VulnerableCode extensively uses Package URLs to identify a package. See the AdvisoryData ^^^^^^^^^^^^^ -``AdvisoryData`` is an intermediate data-format, -it is expected, that your importer converts the raw scraped data into ``AdvisoryData`` objects. -All the fields in ``AdvisoryData`` dataclass are optional, it is the importer's resposibility to -ensure that it must contain meaningful information about a vulnerability. +``AdvisoryData`` is an intermediate data format: +it is expected that your importer will convert the raw scraped data into ``AdvisoryData`` objects. +All the fields in ``AdvisoryData`` dataclass are optional; it is the importer's responsibility to +ensure that it contains meaningful information about a vulnerability. AffectedPackage ^^^^^^^^^^^^^^^^ ``AffectedPackage`` data type is used to store a range of affected versions and a fixed version of a -given package. For all version related data, `univers `_ library +given package. For all version-related data, `univers `_ library is used. Univers ^^^^^^^^ -`univers `_ is a python implementation of the `vers specification `_. -It can parse and compare all the package versions and all the ranges. -From debian, npm, pypi, ruby and more. +`univers `_ is a Python implementation of the `vers specification `_. 
+It can parse and compare all the package versions and all the ranges, +from debian, npm, pypi, ruby and more. It processes all the version range specs and expressions. Importer @@ -90,24 +89,24 @@ implementing the unimplemented methods. Specify the Importer License ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Importers scrape data off the internet, in order to make sure the data is useable, a license must be -provided. -Populate the ``spdx_license_expression`` with appropriate value. -The SPDX license identifies can be found at https://spdx.org/licenses/ +Importers scrape data off the internet. In order to make sure the data is useable, a license +must be provided. +Populate the ``spdx_license_expression`` with the appropriate value. +The SPDX license identifiers can be found at https://spdx.org/licenses/. .. note:: An SPDX license identifier by itself is a valid licence expression. In case you need more complex - expressions, see: https://spdx.github.io/spdx-spec/SPDX-license-expressions/ + expressions, see https://spdx.github.io/spdx-spec/SPDX-license-expressions/ Implement the ``advisory_data`` Method ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -The ``advisory_data`` method scrapes the advisories from the data source this importer is targeted -at. -It is required to return an *Iterable of AdvisoryData objects*, thus it is a good idea to yield from -this method after creating each AdvisoryData object +The ``advisory_data`` method scrapes the advisories from the data source this importer is +targeted at. +It is required to return an *Iterable of AdvisoryData objects*, and thus it is a good idea to yield +from this method after creating each AdvisoryData object. -At this point, an example importer will look like: +At this point, an example importer will look like this: :file:`vulnerabilites/importers/example.py` @@ -133,11 +132,11 @@ This importer is only a valid skeleton and does not import anything at all. Let us implement another dummy importer that actually imports some data. 
Here we have a ``dummy_package`` which follows ``NginxVersionRange`` and ``SemverVersion`` for -version management from `univers `_ +version management from `univers `_. .. note:: - It is possible that versioning scheme you are targetting has not yet been implemented in the `univers `_ library. If this is the case, you'll need to head over over there and implement one. + It is possible that the versioning scheme you are targeting has not yet been implemented in the `univers `_ library. If this is the case, you'll need to head over there and implement one. .. code-block:: python @@ -241,7 +240,7 @@ Congratulations! You've written your first importer. Run Your First Importer ^^^^^^^^^^^^^^^^^^^^^^^^^^ -If everything went fine, you'll see your importer in the list of available importers +If everything went well, you'll see your importer in the list of available importers. .. code-block:: console :emphasize-lines: 5 @@ -252,7 +251,7 @@ If everything went fine, you'll see your importer in the list of available impor vulnerabilities.importers.nginx.NginxImporter vulnerabilities.importers.example.ExampleImporter -Now, run the importer +Now, run the importer. .. code-block:: console @@ -285,7 +284,7 @@ For more visibility, turn on debug logs in :file:`vulnerablecode/settings.py`. }, } -Invoke the import command now and you'll see (in a fresh database) +Invoke the import command now and you'll see (in a fresh database): .. code-block:: console diff --git a/docs/source/tutorial_add_new_improver.rst b/docs/source/tutorial_add_new_improver.rst index 72f80bbc6..5d56885f7 100644 --- a/docs/source/tutorial_add_new_improver.rst +++ b/docs/source/tutorial_add_new_improver.rst @@ -5,7 +5,7 @@ Add a new improver This tutorial contains all the things one should know to quickly implement an improver. -A lot of internal sausage about improvers could be found inside the +Many internal details about improvers can be found inside the :file:`vulnerabilites/improver.py` file. 
Make sure to go through :ref:`improver-overview` before you begin writing one. @@ -19,9 +19,9 @@ TL;DR #. Implement the ``interesting_advisories`` property to return a QuerySet of imported data (``Advisory``) you are interested in. #. Implement the ``get_inferences`` method to return an iterable of ``Inference`` objects for the - given ``AdvisoryData`` + given ``AdvisoryData``. #. Add the newly created improver to the improvers registry at - ``vulnerabilites/improvers/__init__.py`` + ``vulnerabilites/improvers/__init__.py``. Prerequisites -------------- @@ -32,7 +32,7 @@ Importer ^^^^^^^^^^ Importers are responsible for scraping vulnerability data from various data sources without creating -a complete relational model between vulnerabilites, their fixes and store them in a structured +a complete relational model between vulnerabilites and their fixes and storing them in a structured fashion. These data are stored in the ``Advisory`` model and can be converted to an equivalent ``AdvisoryData`` for various use cases. See :ref:`importer-overview` for a brief overview on importers. @@ -40,21 +40,21 @@ See :ref:`importer-overview` for a brief overview on importers. Importer Prerequisites ^^^^^^^^^^^^^^^^^^^^^^^ -Improvers consume data produced by importers, thus it is important to familiarize yourself with -:ref:`Importer Prerequisites ` +Improvers consume data produced by importers, and thus it is important to familiarize yourself with +:ref:`Importer Prerequisites `. Inference ^^^^^^^^^^^ Inferences express the contract between the improvers and the improve runner framework. -An inference is supposed to contain data points about a vulnerability without any uncertainties, -which means, one inference will target one vulnerability with the specific relevant affected and -fixed packages (in the form of `PackageURLs `_) -There is no notion of version ranges here, all package versions must be explicitly specified. 
+An inference is intended to contain data points about a vulnerability without any uncertainties, +which means that one inference will target one vulnerability with the specific relevant affected and +fixed packages (in the form of `PackageURLs `_). +There is no notion of version ranges here: all package versions must be explicitly specified. -Because this concrete relationship is hardly available anywhere on the upstream, we have to *infer* +Because this concrete relationship is rarely available anywhere upstream, we have to *infer* these values, thus the name. -As infering something is not always perfect, an Inference also comes with a confidence score. +As inferring something is not always perfect, an Inference also comes with a confidence score. Improver ^^^^^^^^^ @@ -68,26 +68,25 @@ Writing an improver Locate the Source File ^^^^^^^^^^^^^^^^^^^^^^^^ -If the improver will be working on data imported by an specific importer, it will sit in the same -file at :file:`vulnerabilites/importers/{importer-name.py}`. -Otherwise, if it is a generic improver, create a new file -:file:`vulnerabilites/improvers/{improver-name.py}` +If the improver will be working on data imported by a specific importer, it will be located in +the same file at :file:`vulnerabilites/importers/{importer-name.py}`. Otherwise, if it is a +generic improver, create a new file :file:`vulnerabilites/improvers/{improver-name.py}`. Explore Package Managers (Optional) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If your Improver depends on the discrete versions of a package, the package managers' VersionAPI located at :file:`vulnerabilites/package_managers.py` could come in handy. You'll need to -instantiate the relevant ``VersionAPI`` in the improver's constructor and use them later in the +instantiate the relevant ``VersionAPI`` in the improver's constructor and use it later in the implemented methods. See an already implemented improver (NginxBasicImprover) for an example usage. 
Implement the ``interesting_advisories`` Property ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -This property is supposed to return a QuerySet of ``Advisory`` on which the ``Improver`` is -interested to work on. +This property is intended to return a QuerySet of ``Advisory`` on which the ``Improver`` is +designed to work. -For example, if the improver is interested to work on Advisories imported by ``ExampleImporter``, +For example, if the improver is designed to work on Advisories imported by ``ExampleImporter``, the property can be implemented as .. code-block:: python @@ -105,14 +104,14 @@ The framework calls ``get_inferences`` method for every ``AdvisoryData`` that is the ``Advisory`` QuerySet returned by the ``interesting_advisories`` property. It is expected to return an iterable of ``Inference`` objects for the given ``AdvisoryData``. To -avoid storing a lot of Inferences in memory, it is nicer to yield from this method. +avoid storing a lot of Inferences in memory, it is preferable to yield from this method. -A very simple Improver that processes all Advisories to create the minimal relationships that can be -obtained by existing data can be found at :file:`vulnerabilites/improvers/default.py` It is an -example of a generic improver, for more sophisticated and targetted one, you can look at an already -implemented improver (for eg, in :file:`vulnerabilites/importers/nginx.py`). +A very simple Improver that processes all Advisories to create the minimal relationships that can +be obtained by existing data can be found at :file:`vulnerabilites/improvers/default.py`, which is +an example of a generic improver. For a more sophisticated and targeted example, you can look +at an already implemented improver (e.g., :file:`vulnerabilites/importers/nginx.py`). -Improvers are not limited to improving discrete versions, they may also improve ``aliases``. +Improvers are not limited to improving discrete versions and may also improve ``aliases``. 
One such example, improving the importer written in the :ref:`importer tutorial `, is shown below. @@ -169,7 +168,7 @@ Register the Improver ^^^^^^^^^^^^^^^^^^^^^^ Finally, register your improver in the improver registry at -:file:`vulnerabilites/improvers/__init__.py` +:file:`vulnerabilites/improvers/__init__.py`. .. code-block:: python :emphasize-lines: 7 @@ -190,7 +189,7 @@ Congratulations! You've written your first improver. Run Your First Improver ^^^^^^^^^^^^^^^^^^^^^^^^^^ -If everything went fine, you'll see your improver in the list of available improvers +If everything went well, you'll see your improver in the list of available improvers. .. code-block:: console :emphasize-lines: 6 @@ -212,7 +211,7 @@ there is nothing imported. Importing data using vulnerabilities.importers.example.ExampleImporter Successfully imported data using vulnerabilities.importers.example.ExampleImporter -Now, run the improver +Now, run the improver. .. code-block:: console @@ -245,7 +244,7 @@ For more visibility, turn on debug logs in :file:`vulnerablecode/settings.py`. }, } -Invoke the improve command now and you'll see (in a fresh database, after importing) +Invoke the improve command now and you'll see (in a fresh database, after importing): .. code-block:: console @@ -266,7 +265,7 @@ Invoke the improve command now and you'll see (in a fresh database, after import .. note:: - Even though CVE-2021-23017 and CVE-2021-1234 are not supplied by this improver yet it shows them + Even though CVE-2021-23017 and CVE-2021-1234 are not supplied by this improver, the output above shows them because we left out running the ``DefaultImprover`` in the example. The ``DefaultImprover`` - inserts minimal data found via the importers in the database (Here, the above two CVEs). Run + inserts minimal data found via the importers in the database (here, the above two CVEs). Run importer, DefaultImprover and then your improver in this sequence to avoid this anomaly.