Merge pull request #15 from nexB/reorg-code
Reorg code
JonoYang authored Jan 3, 2023
2 parents b770843 + 9504d0c commit 0a1a65c
Showing 612 changed files with 16,522 additions and 3,006 deletions.
File renamed without changes.
6 changes: 3 additions & 3 deletions .github/workflows/docs-ci.yml
@@ -24,17 +24,17 @@ jobs:
run: chmod +x ./docs/scripts/doc8_style_check.sh

- name: Install Dependencies
working-directory: ./minecode
working-directory: .
run: ./configure --docs

- name: Check Sphinx Documentation build minimally
working-directory: ./docs
run: |
source ../minecode/venv/bin/activate
source ../venv/bin/activate
sphinx-build -E -W source build
- name: Check for documentation style errors
working-directory: ./docs
run: |
source ../minecode/venv/bin/activate
source ../venv/bin/activate
./scripts/doc8_style_check.sh
55 changes: 0 additions & 55 deletions .github/workflows/packagedb-tests.yml

This file was deleted.

@@ -1,4 +1,4 @@
name: Minecode Tests CI
name: PurlDB Tests CI

on: [push, pull_request]

@@ -43,13 +43,12 @@ jobs:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
working-directory: ./minecode
working-directory: .
run: |
./configure --dev
make dev
- name: Run tests
working-directory: ./minecode
working-directory: .
run: |
make envfile
source venv/bin/activate
python manage.py test
make test
7 changes: 6 additions & 1 deletion AUTHORS.rst
@@ -1,3 +1,8 @@
The following organizations or individuals have contributed to this repo:

-
- nexB Inc.
- Jono Yang
- Philippe Ombredanne
- Li Ha
- Steven Esser
- Armin Tänzer
8 changes: 6 additions & 2 deletions CHANGELOG.rst
@@ -1,8 +1,12 @@
Changelog
=========

next-version
------------

v0.0.0
*2023-01-03* -- Add clearcode, matchcode, and matchcode-toolkit to purldb. Reorganize code such that purldb is a single Django app.

v2.0.0
------

*xxxx-xx-xx* -- Initial release.
*2022-11-11* -- Initial release.
6 changes: 0 additions & 6 deletions minecode/MANIFEST.in → MANIFEST.in
@@ -1,6 +1,4 @@
graft etc
graft src
graft tests

include *.LICENSE
include NOTICE
@@ -12,10 +10,6 @@ include setup.*
include configure*
include requirements*
include .git*
include MANIFEST.in
include setup.cfg
include setup.py


global-exclude *.py[co] __pycache__ *.*~

39 changes: 30 additions & 9 deletions minecode/Makefile → Makefile
@@ -32,13 +32,13 @@ virtualenv:
@echo "-> Bootstrap the virtualenv with PYTHON_EXE=${PYTHON_EXE}"
@${PYTHON_EXE} ${VIRTUALENV_PYZ} --never-download --no-periodic-update ${VENV}

conf: virtualenv
conf:
@echo "-> Install dependencies"
@${ACTIVATE} pip install -e . -c requirements.txt
@./configure

dev: virtualenv
dev:
@echo "-> Configure and install development dependencies"
@${ACTIVATE} pip install -e .[dev] -c requirements.txt
@./configure --dev

envfile:
@echo "-> Create the .env file and generate a secret key"
@@ -70,8 +70,7 @@ check:

clean:
@echo "-> Clean the Python env"
rm -rf ${VENV} build/ dist/ packagedb.egg-info/ docs/_build/ pip-selfcheck.json
find . -type f -name '*.py[co]' -delete -o -type d -name __pycache__ -delete
@./configure --clean

migrate:
@echo "-> Apply database migrations"
@@ -91,9 +90,31 @@ postgres:
run:
${MANAGE} runserver 8001 --insecure

seed:
${MANAGE} seed

run_visit: seed
${MANAGE} run_visit

run_map:
${MANAGE} run_map

test:
@echo "-> Run the test suite"
${ACTIVATE} ${PYTHON_EXE} -m pytest -vvs
${ACTIVATE} DJANGO_SETTINGS_MODULE=purldb.settings ${PYTHON_EXE} -m pytest -vvs --ignore matchcode-toolkit
${ACTIVATE} ${PYTHON_EXE} -m pytest -vvs matchcode-toolkit

shell:
${MANAGE} shell

clearsync:
${MANAGE} clearsync --save-to-db --verbose -n 3

clearindex:
${MANAGE} run_clearindex

index_packages:
${MANAGE} index_packages

bump:
@echo "-> Bump the version"
@@ -110,6 +131,6 @@ docker-images:
docker-compose pull
@echo "-> Save the service images to a compressed tar archive in the dist/ directory"
@mkdir -p dist/
@docker save postgres packagedb_packagedb nginx | gzip > dist/packagedb-images-`git describe --tags`.tar.gz
@docker save minecode minecode_minecode nginx | gzip > dist/minecode-images-`git describe --tags`.tar.gz

.PHONY: virtualenv conf dev envfile install check valid isort clean migrate postgres sqlite run test bump docs docker-images
.PHONY: virtualenv conf dev envfile isort black doc8 valid check clean migrate postgres run test shell clearsync clearindex index_packages bump docs docker-images
126 changes: 115 additions & 11 deletions README.rst
@@ -1,22 +1,126 @@
The purldb
================================
This repo consiste of two main tools:
The purldb
==========
This repo consists of four main tools:

- MineCode that contains utilities to mine package repositories
- PackageDB that is the reference model (based on ScanCode toolkit)
that contains package data with purl (Package URLs) being a first
class citizen.
- MineCode that contains utilities to mine package repositories
- MatchCode that contains utilities to index package metadata and resources for
matching
- ClearCode that contains utilities to mine ClearlyDefined for package data

These are designed to be used first for reference such that one can query for
packages by purl and validate purl existence.

In the future, the collected packages will be used as reference for dependency
resolution, as a reference knowledge base for all package data, as a reference
for vulnerable range resolution and more.


Installation
------------
Requirements
############
* Debian-based Linux distribution
* Python 3.8 or later
* Postgres 13
* git
* scancode-toolkit runtime dependencies (https://scancode-toolkit.readthedocs.io/en/stable/getting-started/install.html#install-prerequisites)

Once the prerequisites have been installed, set up PurlDB with the following commands:
::

git clone https://github.com/nexb/purldb
cd purldb
make dev
make postgres
make envfile

Once PurlDB and the database have been set up, run the tests to ensure functionality:
::

make test


Usage
-----
Start the PurlDB server by running:
::

make run

To start visiting upstream package repositories for package metadata:
::

make run_visit

To populate the PackageDB using visited package metadata:
::

make run_map

If you have an empty PackageDB without Package and Package Resource information,
run ClearCode for a while so that it can populate the PackageDB
with Package and Package Resource information from ClearlyDefined.
::

make clearsync

After some ClearlyDefined harvests and definitions have been obtained, run
``clearindex`` to create Packages and Resources from the harvests and
definitions.
::

make clearindex

The Package and Package Resource information will be used to create the matching indices.

Once the PackageDB has been populated, run the following command to create the
matching indices from the collected Package data:
::

make index_packages


API Endpoints
-------------

* ``api/packages``

* Contains all of the Packages stored in the PackageDB

* ``api/resources``

* Contains all of the Resources stored in the PackageDB

* ``api/cditems``

* Contains the visited ClearlyDefined harvests or definitions

* ``api/approximate_directory_content_index``

* Contains the directory content fingerprints for Packages with Resources
* Used to check if a directory and the files under it are from a known Package using the SHA1 values of the files

* ``api/approximate_directory_structure_index``

* Contains the directory structure fingerprints for Packages with Resources
* Used to check if a directory and the files under it are from a known Package using the names of the files

* ``api/exact_file_index``

* Contains the SHA1 values of Package Resources
* Used to check the SHA1 values of files from a scan to see which Packages also have that file

These are designed to be used first for reference such that one can
query by purl and validate purl existence.
* ``api/exact_package_archive_index``

In the future, these will be used as reference for dependency
resolution, as a reference knowledge base for all packag data,
as a reference for vulnerable range resolution and more.
* Contains the SHA1 values of Package archives
* Used to check the SHA1 values of archives from a scan to determine if they are known Packages
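
The exact-match endpoints listed above are keyed on SHA1 values. As a minimal sketch, assuming the server started by ``make run`` is reachable at ``http://127.0.0.1:8001/``, the following computes a SHA1 digest and builds a lookup URL for ``api/exact_file_index``. The ``sha1`` query parameter name and both helper names are hypothetical and are not confirmed by this README; check the API itself for the actual filter names.

```python
import hashlib
from urllib.parse import urlencode, urljoin

# Assumption: `make run` starts the server on local port 8001.
BASE_URL = "http://127.0.0.1:8001/"


def sha1_hex(data: bytes) -> str:
    """Return the hexadecimal SHA1 digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


def exact_file_lookup_url(data: bytes, base_url: str = BASE_URL) -> str:
    """Build a lookup URL for the exact file index.

    The "sha1" query parameter is a hypothetical name used for
    illustration only; the real filter name may differ.
    """
    query = urlencode({"sha1": sha1_hex(data)})
    return urljoin(base_url, "api/exact_file_index") + "?" + query


if __name__ == "__main__":
    # Hash some example content and print the URL that would be requested.
    print(exact_file_lookup_url(b"example file content"))
```

The sketch only builds the URL and does not send a request, so it can be tried without a running PurlDB instance.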


License
^^^^^^^^^^
-------

Copyright (c) nexB Inc. and others. All rights reserved.

@@ -32,6 +136,6 @@ See https://www.apache.org/licenses/LICENSE-2.0 for the license text.

See https://creativecommons.org/licenses/by-sa/4.0/legalcode for the license text.

See https://github.com/nexB/purldb for support or download.
See https://github.com/nexB/purldb for support or download.

See https://aboutcode.org for more information about nexB OSS projects.
25 changes: 25 additions & 0 deletions apache-2.0.LICENSE
@@ -174,3 +174,28 @@
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
File renamed without changes.