fix crontab in ckan_run_harvester #229

Merged
merged 6 commits into from
Jul 24, 2024

Conversation

fostermh
Member

fix #227
update config so command line tools can load extensions. fix crontab loading in ckan_run_harvester

@fostermh fostermh requested a review from sjbruce July 11, 2024 20:29

Image has been pushed to cioos/ckan

Testing Quick Start

Pull image

sudo docker pull cioos/ckan:DEV_PR229
or
sudo CKAN_TAG=DEV_PR229 docker-compose pull ckan

Remove Home Volume and Restart

sudo docker-compose down
sudo docker volume rm docker_ckan_home
sudo CKAN_TAG=DEV_PR229 docker-compose up -d

for full documentation see TBD

@sjbruce

sjbruce commented Jul 12, 2024

For reasons unknown, this is what's happening now.

My guess is that we may need to downgrade the cryptography library to 38.0.4. From looking around and testing, that's the latest version that can support that import call.

Similar error to this one: apache/superset#22613

2024-07-12 11:07:46 Postgres is up - executing command
2024-07-12 11:07:47 [prerun] Initializing or upgrading db - start
2024-07-12 11:07:47 Traceback (most recent call last):
2024-07-12 11:07:47   File "/srv/app/prerun.py", line 102, in init_db
2024-07-12 11:07:47     subprocess.check_output(db_command, stderr=subprocess.STDOUT)
2024-07-12 11:07:47   File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
2024-07-12 11:07:47     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
2024-07-12 11:07:47   File "/usr/lib/python3.9/subprocess.py", line 528, in run
2024-07-12 11:07:47     raise CalledProcessError(retcode, process.args,
2024-07-12 11:07:47 subprocess.CalledProcessError: Command '['ckan', '-c', '/srv/app/ckan.ini', 'db', 'upgrade']' returned non-zero exit status 1.
2024-07-12 11:07:47 
2024-07-12 11:07:47 During handling of the above exception, another exception occurred:
2024-07-12 11:07:47 
2024-07-12 11:07:47 Traceback (most recent call last):
2024-07-12 11:07:47   File "/srv/app/prerun.py", line 215, in <module>
2024-07-12 11:07:47     init_db()
2024-07-12 11:07:47   File "/srv/app/prerun.py", line 105, in init_db
2024-07-12 11:07:47     if "OperationalError" in e.output:
2024-07-12 11:07:47 TypeError: a bytes-like object is required, not 'str'
2024-07-12 11:07:47 /srv/app/start_ckan.sh: Running init file /docker-entrypoint.d/ckan-entrypoint.sh
2024-07-12 11:07:47 db:5432 - accepting connections
2024-07-12 11:07:48 Traceback (most recent call last):
2024-07-12 11:07:48   File "/usr/bin/ckan", line 8, in <module>
2024-07-12 11:07:48     sys.exit(ckan())
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/click/core.py", line 829, in __call__
2024-07-12 11:07:48     return self.main(*args, **kwargs)
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/click/core.py", line 781, in main
2024-07-12 11:07:48     with self.make_context(prog_name, args, **extra) as ctx:
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/click/core.py", line 700, in make_context
2024-07-12 11:07:48     self.parse_args(ctx, args)
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/cli/cli.py", line 116, in parse_args
2024-07-12 11:07:48     result = super(ExtendableGroup, self).parse_args(ctx, args)
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/click/core.py", line 1212, in parse_args
2024-07-12 11:07:48     rest = Command.parse_args(self, ctx, args)
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/click/core.py", line 1048, in parse_args
2024-07-12 11:07:48     value, args = param.handle_parse_result(ctx, opts, args)
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/click/core.py", line 1630, in handle_parse_result
2024-07-12 11:07:48     value = invoke_param_callback(self.callback, ctx, self, value)
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/click/core.py", line 123, in invoke_param_callback
2024-07-12 11:07:48     return callback(ctx, param, value)
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/cli/cli.py", line 126, in _init_ckan_config
2024-07-12 11:07:48     _add_ctx_object(ctx, value)
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/cli/cli.py", line 135, in _add_ctx_object
2024-07-12 11:07:48     ctx.obj = CtxObject(path)
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/cli/cli.py", line 57, in __init__
2024-07-12 11:07:48     self.app = make_app(self.config)
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/config/middleware/__init__.py", line 56, in make_app
2024-07-12 11:07:48     load_environment(conf)
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/config/environment.py", line 123, in load_environment
2024-07-12 11:07:48     p.load_all()
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/plugins/core.py", line 165, in load_all
2024-07-12 11:07:48     load(*plugins)
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/plugins/core.py", line 179, in load
2024-07-12 11:07:48     service = _get_service(plugin)
2024-07-12 11:07:48   File "/srv/app/src/ckan/ckan/plugins/core.py", line 281, in _get_service
2024-07-12 11:07:48     return plugin.load()(name=plugin_name)
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/pkg_resources/__init__.py", line 2443, in load
2024-07-12 11:07:48     return self.resolve()
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/pkg_resources/__init__.py", line 2449, in resolve
2024-07-12 11:07:48     module = __import__(self.module_name, fromlist=['__name__'], level=0)
2024-07-12 11:07:48   File "/srv/app/src/ckanext-harvest/ckanext/harvest/harvesters/__init__.py", line 1, in <module>
2024-07-12 11:07:48     from ckanext.harvest.harvesters.ckanharvester import CKANHarvester
2024-07-12 11:07:48   File "/srv/app/src/ckanext-harvest/ckanext/harvest/harvesters/ckanharvester.py", line 7, in <module>
2024-07-12 11:07:48     from urllib3.contrib import pyopenssl
2024-07-12 11:07:48   File "/usr/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py", line 53, in <module>
2024-07-12 11:07:48     from cryptography.hazmat.backends.openssl.x509 import _Certificate
2024-07-12 11:07:48 ModuleNotFoundError: No module named 'cryptography.hazmat.backends.openssl.x509'
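
The first traceback above also shows a secondary problem: `e.output` from `subprocess.check_output` is bytes, so testing it against the str `"OperationalError"` raises the `TypeError`. A minimal sketch of one way to handle that, assuming the output can be decoded as UTF-8. The helper names `run_and_capture` and `is_operational_error` are hypothetical and are not taken from prerun.py.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and return (succeeded, combined output as str)."""
    try:
        out = subprocess.check_output(command, stderr=subprocess.STDOUT)
        return True, out.decode("utf-8", errors="replace")
    except subprocess.CalledProcessError as e:
        # e.output is bytes because check_output was not called with text=True,
        # so decode it before comparing against str literals.
        return False, e.output.decode("utf-8", errors="replace")


def is_operational_error(output_text):
    """Hypothetical check mirroring the failing `in e.output` test."""
    return "OperationalError" in output_text


if __name__ == "__main__":
    # Stand-in for the real db command: a child process that fails with a message.
    fake_command = [
        sys.executable,
        "-c",
        "import sys; print('OperationalError: boom'); sys.exit(1)",
    ]
    ok, text = run_and_capture(fake_command)
    print(ok, is_operational_error(text))
```

Comparing against the bytes literal `b"OperationalError"` would also avoid the `TypeError`. Decoding once keeps any later string handling uniform.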

@fostermh
Member Author

hmmm I thought we fixed this one. I will check my package versions

…ne and cause conflicts if they are not disabled)

Updated Dockerfile to lock pyopenssl and cryptography to the highest versions that still support x509 in the manner that the ckan harvester expects
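
A quick way to confirm a given image has compatible versions is to try the import that failed in the traceback. This is only a sketch. The module path comes from the log above, while the helper name `can_import` is hypothetical.

```python
import importlib


def can_import(module_name):
    """Return True if the module imports cleanly, False on any ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so the
        # failure from the traceback is covered here too.
        return False
    return True


if __name__ == "__main__":
    # Module path taken from the traceback in this thread.
    target = "cryptography.hazmat.backends.openssl.x509"
    print(target, "importable:", can_import(target))
```

Run inside the container, this reports whether the pinned cryptography build still exposes the module that urllib3's pyopenssl shim imports.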
@fostermh
Member Author

It seems like the command line tools need the .plugins list in the ckan.ini, however. Perhaps we need to write to the ckan.ini on container start or something.

@fostermh
Member Author

I believe that with the addition of the ckan_home volume to the ckan_run_harvester this issue is now fixed. Please confirm and merge if it is working.

@sjbruce

sjbruce commented Jul 19, 2024

The harvester now runs well without any external intervention, but the entrypoint file still doesn't seem to be able to set up the cron jobs on its own.

However, I have found that if I manually execute the command to set up the crontab from within the container via an interactive shell, then the cron jobs will populate and execute properly. The container loses the cron jobs if it is rebuilt, though...

A thought occurs - could we just mount the crontabs file as a volume like what we're doing with the entrypoint files? I'll test that out on my local to see how it performs.

 - commented out crontab setup line in entrypoint
 - crontab file in contrib/docker needs to be owned by root:root in order to execute
@sjbruce

sjbruce commented Jul 19, 2024

Mounting the crontab file as a volume in the ckan_run_harvester container appears to work. However, it needs to be owned by root:root in order to execute (thankfully the user id for root is universal).
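
For anyone testing the mounted file, a small sketch of checking the root:root ownership before starting the container. It assumes root is uid 0 and gid 0. The helper name `is_owned_by_root` and the example path are hypothetical.

```python
import os


def is_owned_by_root(path):
    """Return True if the file's owner and group are both root (uid 0, gid 0)."""
    st = os.stat(path)
    return st.st_uid == 0 and st.st_gid == 0


if __name__ == "__main__":
    # Hypothetical path; point this at the crontab file being mounted.
    crontab_path = "contrib/docker/crontab"
    if os.path.exists(crontab_path):
        print(crontab_path, "owned by root:root:", is_owned_by_root(crontab_path))
    else:
        print(crontab_path, "not found")
```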

… in /docker-entrypoint.d/ckan-run-harvester-entrypoint.sh is only run once on the first container start and never again until attached volumes are cleared
@fostermh
Member Author

I pushed a fix for the crontab that does not require mounting. The entrypoint file was not being run because scripts located under docker-entrypoint.d are only run on the first container start, and not again until attached volumes are cleared. This is great for the other containers but not this one.
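
To illustrate the idea of installing the crontab on every start rather than only the first, here is a rough sketch. This is not the fix that was pushed. It relies only on the standard behaviour that `crontab <file>` replaces the current user's crontab with the file's contents, so repeating it on each start is harmless. The function name `install_crontab` and the example path are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def install_crontab(crontab_file):
    """Install crontab_file as the current user's crontab.

    Returns False (and does nothing) if the file or the crontab
    binary is missing, True after a successful install.
    """
    crontab_file = Path(crontab_file)
    if not crontab_file.is_file():
        return False
    if shutil.which("crontab") is None:
        return False
    # `crontab <file>` replaces the existing crontab wholesale,
    # so running this on every container start is idempotent.
    subprocess.run(["crontab", str(crontab_file)], check=True)
    return True


if __name__ == "__main__":
    # Hypothetical location of the crontab file inside the container.
    installed = install_crontab("/srv/app/crontab")
    print("cron jobs installed" if installed else "crontab not installed")
```

Printing the result on each start also gives the kind of log feedback that makes it easy to see whether the cron jobs were added.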


@sjbruce sjbruce left a comment

This has been working well on my end and we have clear feedback in the log that the cron jobs are being added.

@fostermh fostermh merged commit bdce540 into cioos_dev Jul 24, 2024
2 checks passed
@fostermh fostermh deleted the fix_crontab_run_harvester branch July 24, 2024 18:25