Harvesters not terminating #227
Comments
I should note that the harvester configuration above is a direct lift from the harvester configuration of a 1.5.0 deployment of CKAN.

Is the ckan_run_harvester container running? Are the cron jobs in this container executing?
You can run the harvester cleanup manually by executing
See /contrib/docker/crontab for a list of cron jobs that are run in the ckan_run_harvester container.
It could be related to container permissions. The ckan_run_harvester container must be run as root.

The Dockerfile does have a line to copy the
It doesn't look like the cron jobs are installed. Running the command above, it complains about "SECRET_KEY", which likely means part or all of this is down to not running the ckan generate config command and grabbing the appropriate key values, or not executing the commented-out commands at the top of the .env file. I note that those commands will fail on Windows/WSL due to some low-level nonsense on that part. I'll work around it and rebuild the containers to see if that makes a difference. I imagine it'll let the command above run, but I don't suspect it'll change anything with the cron jobs themselves.

There are a couple of issues here. Line 20 in ckan-run-harvester-entrypoint.sh should be
While CKAN can read its config from environment variables, the command-line tools do not. So for all the cron job tasks to work, we need to update ckan.ini. Uncomment the following lines in your ckan.ini in the container
It is odd that the fetch and gather containers work while the run container does not...

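For anyone who wants to script the ckan.ini change rather than edit the file by hand, here is a minimal sketch. It assumes the options are commented out with a leading `#` or `;`. The helper name `uncomment_keys` and the key `example.secret` are placeholders for illustration only; substitute the option names that are actually commented out in your ckan.ini.

```python
import re

# Matches a commented-out "key = value" line, e.g. "# some.key = value".
COMMENTED_ASSIGNMENT = re.compile(r"^\s*[#;]+\s*([^=\s]+)\s*=(.*)$")


def uncomment_keys(ini_text: str, keys: set[str]) -> str:
    """Return ini_text with commented-out assignments to `keys` uncommented.

    Lines for other keys, and lines that are already active, are left as-is.
    """
    out = []
    for line in ini_text.splitlines():
        match = COMMENTED_ASSIGNMENT.match(line)
        if match and match.group(1) in keys:
            # Rebuild the line without the comment marker.
            out.append(f"{match.group(1)} ={match.group(2)}")
        else:
            out.append(line)
    return "\n".join(out)


# Placeholder content; the key names are hypothetical.
sample = "# example.secret = abc\n# other = 1\nplain = 2"
result = uncomment_keys(sample, {"example.secret"})
```

Only the listed keys are touched, so unrelated commented-out options stay disabled.
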
Note that there appears to be some odd behaviour when updating the frequency of a harvest job. While the change shows up in the GUI after hitting save, the time of the next harvest job run is not adjusted in the database until the next time the job runs. This means that when going from weekly to always frequency, for example, the schedule will not be updated until the next run, potentially a week later. To update sooner, you will need to manually run the harvest to ensure the database is updated to the new settings.

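The behaviour described above can be modelled with a small sketch. This is not the harvester's actual code: the `HarvestSource` class, the `INTERVALS` table, and the `is_due` helper are hypothetical and only mirror what was observed, namely that saving a new frequency changes the stored label but leaves the next run time untouched until the job runs again.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Illustrative intervals; "ALWAYS" is modelled as "due again immediately".
INTERVALS = {"WEEKLY": timedelta(weeks=1), "ALWAYS": timedelta(0)}


@dataclass
class HarvestSource:
    frequency: str
    next_run: datetime

    def save_frequency(self, frequency: str) -> None:
        # Mirrors the reported behaviour: only the label changes here.
        self.frequency = frequency

    def run(self, now: datetime) -> None:
        # The next run time is only recomputed when the job actually runs.
        self.next_run = now + INTERVALS[self.frequency]


def is_due(source: HarvestSource, now: datetime) -> bool:
    return now >= source.next_run


start = datetime(2024, 1, 1)
source = HarvestSource("WEEKLY", next_run=start + timedelta(weeks=1))

source.save_frequency("ALWAYS")
one_day_later = start + timedelta(days=1)
due_before_manual_run = is_due(source, one_day_later)  # still on the weekly schedule

source.run(one_day_later)  # manual run picks up the new frequency
due_after_manual_run = is_due(source, one_day_later)
```

In this model the source is not due a day after the change, even though its frequency already reads "ALWAYS"; the manual run is what moves `next_run` onto the new schedule.
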
CKAN version 1.6.0
Describe the bug
Harvest jobs on fresh installs of CKAN 1.6.0 do not appear to terminate by themselves as they did in previous versions.
The current job has been running for well over an hour, although it has inserted all datasets correctly.
However, the process appears to fail before the indexes are updated, as the home page shows a dataset count of zero and no E*Vs are listed as having any datasets attached to them.
The datasets page does show the datasets, E*Vs, responsible organizations, tags, resource types, licenses, and formats.
The map shows dataset extents, and the filters appear to be working properly.
Log outputs for the ckan and harvester containers are attached.
Steps to reproduce
Steps to reproduce the behavior:
Expected behavior
The harvester should have run to completion and produced a set of results detailing how many datasets were added, updated, deleted, etc.
Additional details
Configuration:
CKAN Container & Harvester Logs:
ckan.log
ckan_harvesters.log