Continuously watch a PURL of interest, aka. purlwatch #244
Comments
Carried over from #88 closed in favor of this one:
In the future, we could refine the watch to only look for versions after a given version or after a release date, to avoid collecting a bazillion old, historical, unused versions. This could take the form of:
There is a design point that needs some thinking: How do we select which PackageWatch to process?
This would be portable beyond PostgreSQL but requires more processing, as most watches would need to be looked at once on each run and the selection would not use a queryset expression. Here the last watch date is a DateTimeField and the watch_interval would be a number of days between watches.
Let's go with 3. for a start, as we can easily migrate to other options afterwards: this is the most expressive and the easiest to test.
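
A minimal sketch of what option 3 could look like in plain Python, assuming a last watch date plus an interval in days as described above. The `PackageWatch` dataclass here is a hypothetical stand-in for the real model, and the names `last_watch_date`, `is_due` and `select_due_watches` are illustrative assumptions, not the actual PurlDB code:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


@dataclass
class PackageWatch:
    # Hypothetical stand-in for the model; field names are assumptions.
    purl: str
    watch_interval: int  # number of days between watches
    last_watch_date: Optional[datetime] = None  # None means never watched


def is_due(watch: PackageWatch, now: datetime) -> bool:
    """Return True if the watch has never run or its interval has elapsed."""
    if watch.last_watch_date is None:
        return True
    return now - watch.last_watch_date >= timedelta(days=watch.watch_interval)


def select_due_watches(
    watches: Iterable[PackageWatch], now: datetime
) -> List[PackageWatch]:
    """Look at every watch once per run and keep the ones that are due."""
    return [watch for watch in watches if is_due(watch, now)]
```

Because the check happens in Python rather than in a database expression, every watch is loaded on each run, which is the extra processing cost mentioned above. In exchange, `is_due` can be tested with plain datetimes and no database.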
Another design point is the processing: The thing could start with either a cron-like task in RQ or a command line management command. In all cases, this would select packages eligible for watch per #244 (comment) and then could either:
Using 2. is probably best for now to avoid duplicated entries.
Watch for packages (model and implementation) #244
This is completed now. We extended PurlDB with a new “watch” API endpoint. Given a PURL, you can register "interest" in this PURL and continuously "watch" for updates to this Package URL (ignoring versions), polling for new versions on a schedule based on an interval defined per watch. To back this, we integrated a queue and scheduler (using RQ). When the watch runs on schedule and a new package version is available, we further collect metadata and trigger new indexing scans in the background.
To test this feature:
You may want to use another PURL, as this one may already be watched.
Given a PURL, I would like to have a new API endpoint to register "interest" in this PURL. Once this is "registered", this PURL should continuously be "watched" for updates to this Package URL (all versions), polling for new versions on a schedule, such as on a daily or weekly basis.
When a new version is discovered, we should run the steps to collect metadata and trigger new scancode pipeline runs on each update.
Optionally we could also work from a PURL and have a flag to expand to either the previous and next versions, or to all versions of a PURL, or all the future versions.
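
As a rough illustration of such an expansion flag, here is a minimal sketch. It assumes the known versions of a package are already available as a list sorted from oldest to newest. The function name `expand_versions` and the mode names are hypothetical, and "future" is interpreted here as "every known version newer than the given one", which is only one possible reading:

```python
from typing import List


def expand_versions(known: List[str], version: str, mode: str) -> List[str]:
    """Expand one version into related versions.

    `known` is assumed to be sorted from oldest to newest and to
    contain `version`; list.index raises ValueError otherwise.
    """
    position = known.index(version)
    if mode == "neighbors":
        # The version just before and the version just after, when present.
        return known[max(position - 1, 0):position] + known[position + 1:position + 2]
    if mode == "all":
        return list(known)
    if mode == "future":
        # Every known version newer than the given one.
        return known[position + 1:]
    raise ValueError(f"Unknown expansion mode: {mode}")
```

Sorting the versions correctly is ecosystem-specific and is deliberately left out of this sketch.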
The solution could include these elements: