sqlalchemy error #233

Closed
corviday opened this issue Feb 20, 2024 · 13 comments · Fixed by #247

@corviday
Contributor

Docker instances of the newest version (3.6.2) throw the following error when any API call is made:

2024-02-20 17:03:47 [13] [ERROR] Exception on /api/multimeta [GET]
Traceback (most recent call last):
  File "/root/.cache/pypoetry/virtualenvs/ce-9TtSrW0h-py3.8/lib/python3.8/site-packages/sqlalchemy/util/_collections.py", line 1020, in __call__
    return self.registry[key]
KeyError: 140433640835912

This does not seem to be an issue with our python code, as a version with identical code has been deployed as an unmerged branch. It is perhaps caused by an update to sqlalchemy.
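
A note on reading the traceback above: as far as I can tell, `return self.registry[key]` in SQLAlchemy sits inside a scoped-registry lookup that catches the `KeyError` and then creates a new object for that scope. If so, the `KeyError` is only the first half of a chained traceback, and whatever exception follows it in the log is more likely the real failure. The sketch below is a simplified stand-in written for illustration, not SQLAlchemy's code; `ScopedRegistrySketch`, `failing_factory`, and the `RuntimeError` are hypothetical, and the key is reused from the log only to make the output recognizable.

```python
import traceback


class ScopedRegistrySketch:
    """Simplified stand-in for a scoped registry (hypothetical, not SQLAlchemy's code)."""

    def __init__(self, createfunc, scopefunc):
        self.createfunc = createfunc  # builds a new object for a scope
        self.scopefunc = scopefunc    # returns the current scope key, e.g. a thread id
        self.registry = {}

    def __call__(self):
        key = self.scopefunc()
        try:
            return self.registry[key]
        except KeyError:
            # First use in this scope: the KeyError is expected and handled here.
            return self.registry.setdefault(key, self.createfunc())


def failing_factory():
    # Hypothetical failure while creating the per-scope object.
    raise RuntimeError("could not create session")


def chained_traceback():
    """Return the traceback text produced when the factory fails."""
    registry = ScopedRegistrySketch(failing_factory, lambda: 140433640835912)
    try:
        registry()
    except RuntimeError:
        return traceback.format_exc()
```

Printing `chained_traceback()` shows the `KeyError` first, then "During handling of the above exception, another exception occurred", then the factory's error. When reporting a log like the one above, it is worth including everything after the `KeyError` line.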

@corviday corviday self-assigned this Feb 20, 2024
@corviday corviday changed the title from "poetry error" to "sqlalchemy error" Feb 20, 2024
@corviday
Contributor Author

corviday commented Feb 20, 2024

  • The version of sqlalchemy used by the app has not changed.
  • The error does not occur when the same app is run on bare metal.
  • The file referenced in the error does exist inside the Docker container.
  • Comparable Docker and bare-metal versions use the same version of poetry.

@corviday
Contributor Author

I built a Docker image from 32ce883, and it did not have the error, so I think the error was introduced sometime after that commit. It is still unclear what sort of error would appear only in Docker containers and not on bare metal.

@jameshiebert
Contributor

@corviday have you tried running the test suite in a docker container? Does the /multimeta API call fail under that environment? Just trying to get a minimal test case so that we can narrow this down and work on it.

@corviday
Contributor Author

corviday commented Feb 23, 2024

That's a great idea, thanks. The test suite has some failures, but they seem to be unrelated (a needed directory isn't mounted in the container), and the other tests pass. I don't know what that means, but it's useful.

EDIT: I have restored the missing directory and all tests pass now. Hmmm.

@jameshiebert
Contributor

OK, that being the case, can you give a full "steps to reproduce" overview?

@corviday
Contributor Author

corviday commented Feb 23, 2024

Steps to reproduce:

  1. Build a Docker image of release version 3.6.2, or use the one on Docker Hub.
  2. Start the container.
  3. Access any API endpoint to get an error. /multimeta is an easy one because it has defaults for all the parameters.

Given that the tests pass, is the issue likely Flask-related? I don't think Flask is used the same way in tests as with the server running live. Hm, except that Flask is also used when running on bare metal, and that's fine.
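
Step 3 can be scripted so that everyone checks the same request. This is a minimal sketch using only the standard library; the `/api/multimeta` path is taken from the log line in the original report, while `probe`, its `opener` parameter, and the host and port in the usage comment are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def probe(base_url, path="/api/multimeta", opener=urllib.request.urlopen, timeout=30):
    """Return the HTTP status code for a GET of base_url + path."""
    url = base_url.rstrip("/") + path
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the status code is what we report.
        return err.code


# Example call (placeholder host and port, adjust to the running container):
# print(probe("http://localhost:8000"))
```

A 500 from the container alongside a 200 from the bare-metal instance, for the same URL, would confirm the reproduction. The `opener` argument exists only so the function can be exercised without a live server.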

@jameshiebert
Contributor

Can you be more specific about what parameters you're using for step 2 and exactly what URL you're using for step 3? I still cannot reproduce this.

@corviday
Contributor Author

That's good news, I think?

Here are the current docker-compose.yaml and be.env files, with their extension changed and passwords redacted.
be.env.txt
docker-compose.yaml.txt

Here is my test URL.

@corviday
Contributor Author

corviday commented Feb 27, 2024

Things that don't fix it:

  • downgrading flask packages to match the versions in the known-good container
  • switching database connection string to match the previous, known-good container
  • running as a production container instead of a development container (i.e., gunicorn instead of flask)

Rolling back GDAL and/or Python has so far proved to be quite a tangle, and I don't have an answer yet.

Working and buggy containers use the same version of sqlalchemy, so that's probably not the cause.

@corviday
Contributor Author

corviday commented Jul 25, 2024

I reopened an old branch that previously worked, but it now has the error too, so it does seem that an update to something outside our code was the cause.

@corviday
Contributor Author

corviday commented Aug 8, 2024

Released a new version (3.6.3), which works in github tests, but not in docker containers.

I now have an instance of 3.6.3 running on my workstation and running on docker-dev02. They have identical libraries according to pip freeze, so "library updates" seems to be ruled out as a source of the error.

My desktop is using python 3.8.10 and docker-dev02 is using 3.8.15. That seems unlikely to be the problem, but I don't have any better ideas, so maybe I will try changing that next.

Another possibility is something specific to gunicorn.
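
The "identical according to pip freeze" comparison can be made repeatable with a small script. The sketch below assumes each environment's `pip freeze` output has been saved to a text file; `parse_freeze` and `diff_freeze` are hypothetical helper names, and only plain `name==version` lines are compared, so editable, VCS, and URL installs are skipped.

```python
def parse_freeze(text):
    """Map package name -> version from `pip freeze` output (name==version lines only)."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and editable/VCS/URL entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def diff_freeze(text_a, text_b):
    """Return {package: (version_in_a, version_in_b)} for every package that differs."""
    a, b = parse_freeze(text_a), parse_freeze(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

An empty result means the pinned Python packages match. It says nothing about the interpreter build or the system libraries those packages link against, which `pip freeze` does not list.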

@corviday
Contributor Author

corviday commented Aug 8, 2024

Following up on "Another possibility is something specific to gunicorn":

Running it on my workstation as poetry run gunicorn --config docker/gunicorn.conf ce.wsgi:app results in no errors.

Current investigative summary:
This runs on my workstation and on GitHub but not on docker-dev02. The difference is not:

  • any python packages - identical versions are present on working and non-working instances
  • use or non-use of gunicorn
  • any difference in our code
  • a different version of poetry

Next things to check: Python versions, Postgres C library versions, and running in a Docker container.
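
For the Python and C library checks, one option is to run the same small report in each environment and compare the output. This is a sketch under assumptions: it guesses that the app talks to Postgres through psycopg2 and reaches GDAL through the `osgeo` bindings, neither of which is confirmed in this thread, and `native_versions` is a hypothetical helper name. Libraries that are not installed are reported as such rather than raising.

```python
import platform


def native_versions():
    """Collect interpreter and native-library versions that pip freeze does not show."""
    info = {"python": platform.python_version()}

    try:
        import psycopg2
        import psycopg2.extensions

        info["psycopg2"] = psycopg2.__version__
        # libpq version psycopg2 was compiled against vs. the one loaded at runtime.
        info["libpq (compiled)"] = psycopg2.__libpq_version__
        info["libpq (runtime)"] = psycopg2.extensions.libpq_version()
    except ImportError:
        info["psycopg2"] = "not installed"

    try:
        from osgeo import gdal

        # Version of the GDAL C library behind the Python bindings.
        info["gdal"] = gdal.VersionInfo("RELEASE_NAME")
    except ImportError:
        info["gdal"] = "not installed"

    return info
```

Running `print(native_versions())` on the workstation and inside the container gives two dictionaries to compare side by side. A mismatch in the libpq or GDAL entries would point at the base image rather than at the Python packages.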

@corviday
Contributor Author

I updated a bunch of things in the geospatial-python container to address a different error, and it appears to have fixed this issue as well.
