Create a focused visitor for maven #179
Comments
After talking with @pombredanne, this is how it would work: there would be a function that receives a URL, and we check whether it is one of these types of Maven repo URLs:
We would put these jobs on a queue, like the visitor/mapper or on-demand package queue, where each entry on the queue would point to a particular package version (e.g. …).
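A minimal sketch of the URL check and queue entry described above. The list of accepted URL types is not reproduced here, so this assumes the standard Maven 2 repository layout (`<root>/<group path>/<artifact>/<version>/`) and handles only version-directory URLs. The names `MAVEN_ROOT`, `parse_version_url`, `to_purl`, and `enqueue` are hypothetical and are not taken from the project; the in-memory `deque` only stands in for the real queue.

```python
from collections import deque

# Assumed repository root; other Maven repos would use a different root.
MAVEN_ROOT = "https://repo1.maven.org/maven2/"


def parse_version_url(url, root=MAVEN_ROOT):
    """Split a version-directory URL into (group_id, artifact_id, version).

    Returns None when the URL is outside the root or too short to name a
    package version. The URL alone cannot prove that the last segment is a
    version rather than a deeper group directory; this sketch assumes the
    caller only passes version-level URLs.
    """
    if not url.startswith(root):
        return None
    segments = [s for s in url[len(root):].split("/") if s]
    if len(segments) < 3:
        return None
    *group_parts, artifact_id, version = segments
    return ".".join(group_parts), artifact_id, version


def to_purl(group_id, artifact_id, version):
    """Build a Maven package URL string (no percent-encoding in this sketch)."""
    return f"pkg:maven/{group_id}/{artifact_id}@{version}"


# Stand-in for the visitor/mapper or on-demand package queue.
queue = deque()


def enqueue(url):
    """Queue one job per package version; return False for rejected URLs."""
    parsed = parse_version_url(url)
    if parsed is None:
        return False
    queue.append({"purl": to_purl(*parsed), "uri": url})
    return True
```

Each queue entry carries both the purl and the original URL, so a worker can fetch the version directory later without re-deriving it.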
Some extra comments:
* This is to avoid using get_maven_root repeatedly
* Save versionless purl to importable_uris
* Get links and timestamps at the same time
* Create command that gets release_date for maven packages
* Add logging messages
* Only update release_date for packages from maven.org

Signed-off-by: Jono Yang <[email protected]>
In the context of aboutcode-org/scancode.io#900, it would be useful to have a focused visitor for Maven that can visit and index the entire Maven index (and repos other than Maven Central). This visitor would be able to visit and index subsets of a repo, instead of collecting the entire index from an arbitrary point.
As we visit, we can create packages with minimal package information (purl + sha1). This would let us identify Maven packages by sha1, so that we can then scan those packages and index their fingerprints.
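To illustrate the purl + sha1 idea, here is a rough sketch of a minimal package record and a lookup by sha1. `MinimalPackage` and `Sha1Index` are hypothetical names used only for this example, and an in-memory dict stands in for whatever database the real visitor would write to.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalPackage:
    """The minimal information recorded while visiting: purl and sha1."""
    purl: str
    sha1: str


def sha1_of(data: bytes) -> str:
    """Return the lowercase hex SHA-1 digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


class Sha1Index:
    """In-memory stand-in for a package table keyed by sha1."""

    def __init__(self):
        self._by_sha1 = {}

    def add(self, package: MinimalPackage) -> None:
        # Normalize case so lookups match regardless of how the sha1 was published.
        self._by_sha1[package.sha1.lower()] = package

    def identify(self, data: bytes):
        """Return the recorded package whose sha1 matches the data, or None."""
        return self._by_sha1.get(sha1_of(data))
```

A possible usage: record a package while visiting, then identify an unknown file by hashing its contents. The bytes here are placeholder data, not a real artifact.

```python
index = Sha1Index()
artifact = b"placeholder artifact bytes"
index.add(MinimalPackage("pkg:maven/org.example/demo@1.0", sha1_of(artifact)))
match = index.identify(artifact)  # the MinimalPackage recorded above
```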