Dockerfile: use Ubuntu 20.10 as base #1296

Open: tmcl-it wants to merge 2 commits into main

tmcl-it commented Apr 12, 2021

This PR changes the main Dockerfile to use ubuntu:20.10 as base image instead of python:3.9.2-slim-buster (itself based on debian:buster-slim).

The Dockerfile is essentially the one from #1249 (comment) with some additional cleanups to slim it down.

This fixes a couple of issues:

  1. The SQLite version in Debian Buster doesn't support generated columns (Buster ships SQLite 3.27.2; generated columns require 3.31.0 or later)
  2. Installing SpatiaLite from the Debian sid repositories has the side effect of also installing updates to libc and libstdc++ from sid.

As a bonus, the Docker image becomes smaller:

$ docker image ls
REPOSITORY                   TAG           IMAGE ID       CREATED       SIZE
datasette                    0.56-ubuntu   f7aca255140a   5 hours ago   212MB
datasetteproject/datasette   0.56          efb3b282f390   13 days ago   258MB

Reproduction of the first issue

$ curl -O https://latest.datasette.io/fixtures.db
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  260k    0  260k    0     0   489k      0 --:--:-- --:--:-- --:--:--  489k

$ docker run -v `pwd`:/mnt datasetteproject/datasette:0.56 datasette /mnt/fixtures.db
Traceback (most recent call last):
  File "/usr/local/bin/datasette", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/datasette/cli.py", line 544, in serve
    asyncio.get_event_loop().run_until_complete(check_databases(ds))
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.9/site-packages/datasette/cli.py", line 584, in check_databases
    await database.execute_fn(check_connection)
  File "/usr/local/lib/python3.9/site-packages/datasette/database.py", line 155, in execute_fn
    return await asyncio.get_event_loop().run_in_executor(
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/datasette/database.py", line 153, in in_thread
    return fn(conn)
  File "/usr/local/lib/python3.9/site-packages/datasette/utils/__init__.py", line 892, in check_connection
    for r in conn.execute(
sqlite3.DatabaseError: malformed database schema (generated_columns) - near "AS": syntax error

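Here is the version reported by Python's sqlite3 module. Note that sqlite3.version is the version of the module itself, not of the SQLite library; the library version is available as sqlite3.sqlite_version: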

$ docker run -v `pwd`:/mnt -it datasetteproject/datasette:0.56 /bin/bash
root@d9220d3b95dd:/# python3
Python 3.9.2 (default, Mar 27 2021, 02:50:26) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> sqlite3.version
'2.6.0'
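
To check the SQLite library itself rather than the Python module, something like the following can be run inside the container. This is a minimal sketch: the helper name supports_generated_columns and the probe table gc_probe are made up for illustration, and the check simply tries to create a table with a generated column and treats a syntax error as "not supported".

```python
import sqlite3


def supports_generated_columns(conn):
    """Return True if the linked SQLite library accepts a generated column.

    Hypothetical helper: it creates a throwaway temp table and drops it again.
    """
    try:
        conn.execute(
            "CREATE TABLE temp.gc_probe "
            "(a INTEGER, b INTEGER GENERATED ALWAYS AS (a * 2))"
        )
    except sqlite3.OperationalError:
        # Older SQLite versions fail here with: near "AS": syntax error
        return False
    conn.execute("DROP TABLE temp.gc_probe")
    return True


conn = sqlite3.connect(":memory:")
# sqlite_version is the version of the SQLite library Python is linked against
print("SQLite library version:", sqlite3.sqlite_version)
print("Generated columns supported:", supports_generated_columns(conn))
```

On an image where this prints False, databases such as fixtures.db that contain generated columns can be expected to fail to load in the way shown in the traceback above.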

Reproduction of the second issue

$ docker build . -t datasette --build-arg VERSION=0.55
[...snip...]
The following packages will be upgraded:
  libc-bin libc6 libstdc++6
[...snip...]
Unpacking libc6:amd64 (2.31-11) over (2.28-10) ...
[...snip...]
Unpacking libstdc++6:amd64 (10.2.1-6) over (8.3.0-6) ...
[...snip...]

Both libc and libstdc++ are backwards compatible, so the image still works. However, the result is a combination of library and Python versions that exists only in the Datasette image, so it is likely untested. In addition, because Debian sid is a rolling release, the versions of libc, libstdc++, SpatiaLite, and their dependencies change frequently, so the library versions in the Datasette image depend on the day it was built.

codecov bot commented Apr 12, 2021

Codecov Report

Merging #1296 (527a056) into main (c73af5d) will decrease coverage by 0.11%.
The diff coverage is n/a.

❗ Current head 527a056 differs from pull request most recent head 8f00c31. Consider uploading reports for the commit 8f00c31 to get more accurate results

@@            Coverage Diff             @@
##             main    #1296      +/-   ##
==========================================
- Coverage   91.62%   91.51%   -0.12%     
==========================================
  Files          34       34              
  Lines        4371     4255     -116     
==========================================
- Hits         4005     3894     -111     
+ Misses        366      361       -5     
Impacted Files                 Coverage Δ
datasette/tracer.py            81.60% <0.00%> (-1.35%) ⬇️
datasette/views/base.py        95.01% <0.00%> (-0.42%) ⬇️
datasette/facets.py            89.04% <0.00%> (-0.41%) ⬇️
datasette/utils/__init__.py    94.13% <0.00%> (-0.21%) ⬇️
datasette/renderer.py          94.02% <0.00%> (-0.18%) ⬇️
datasette/views/database.py    97.19% <0.00%> (-0.10%) ⬇️
datasette/views/table.py       95.88% <0.00%> (-0.07%) ⬇️
datasette/views/index.py       96.36% <0.00%> (-0.07%) ⬇️
datasette/hookspecs.py         100.00% <0.00%> (ø)
datasette/utils/testing.py     95.38% <0.00%> (ø)
... and 5 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c73af5d...8f00c31.

camallen (Contributor) commented Apr 14, 2021

Removing /var/lib/apt and /var/lib/dpkg makes apt and dpkg unusable in
images based on this one. Running apt-get clean and removing
/var/lib/apt/lists achieves similar size savings.

This PR helps me, because removing the /var/lib/apt and /var/lib/dpkg directories breaks my ability to add packages when using datasetteproject/datasette:0.56 as a base image.

My short-term workaround was to use this in my Dockerfile:

FROM datasetteproject/datasette:0.56

# Recreate the apt/dpkg state that was removed from the base image
RUN mkdir -p /var/lib/apt /var/lib/dpkg/updates /var/lib/dpkg/info \
    && touch /var/lib/dpkg/status

RUN apt-get update # and install your packages etc
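
For anyone who wants to check whether an image needs this workaround, here is a rough sketch that reports which of the paths recreated above are missing. The helper name missing_dpkg_paths and the two path lists are illustrative only; they are taken from the workaround above and are not an authoritative list of what apt and dpkg require.

```python
from pathlib import Path

# Paths recreated by the workaround above, relative to the image root.
# Assumption: this mirrors the workaround, not dpkg's full requirements.
REQUIRED_DIRS = (
    "var/lib/apt",
    "var/lib/dpkg",
    "var/lib/dpkg/updates",
    "var/lib/dpkg/info",
)
REQUIRED_FILES = ("var/lib/dpkg/status",)


def missing_dpkg_paths(root="/"):
    """Return the listed directories and files that do not exist under root."""
    root = Path(root)
    missing = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    print(missing_dpkg_paths() or "all paths present")
```

Run inside a container with python3, an empty result suggests the mkdir and touch steps are not needed for that image.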

blairdrummond (Contributor) commented

I have also found that Ubuntu images have fewer vulnerabilities than the Buster-based images.

➜  ~ docker pull python:3-buster
➜  ~ trivy image python:3-buster | head                             
2021-04-28T17:14:29.313-0400    INFO    Detecting Debian vulnerabilities...
2021-04-28T17:14:29.393-0400    INFO    Trivy skips scanning programming language libraries because no supported file was detected
python:3-buster (debian 10.9)
=============================
Total: 1621 (UNKNOWN: 13, LOW: 1106, MEDIUM: 343, HIGH: 145, CRITICAL: 14)
+------------------------------+---------------------+----------+------------------------------+---------------+--------------------------------------------------------------+
|           LIBRARY            |  VULNERABILITY ID   | SEVERITY |      INSTALLED VERSION       | FIXED VERSION |                            TITLE                             |
+------------------------------+---------------------+----------+------------------------------+---------------+--------------------------------------------------------------+

simonw (Owner) commented May 28, 2021

As a bonus, the Docker image becomes smaller

That's a huge surprise to me! And most welcome.

tmcl-it added 2 commits July 20, 2021 10:51
Removing /var/lib/apt and /var/lib/dpkg makes apt and dpkg unusable in
images based on this one. Running `apt-get clean` and removing
/var/lib/apt/lists achieves similar size savings.

The previous Dockerfile uses python:3.9.2-slim-buster (based on Debian
stable), but installs Spatialite from the Debian unstable repositories.
This has the side effect of also installing updates to libc and
libstdc++6 from unstable, resulting in an untested combination (and,
potentially, package versions that depend on the day when the Docker
image was built).

Moreover, the SQLite version in Debian stable doesn't support generated
columns, so some DBs cannot be loaded by Datasette.

Switching to ubuntu:20.10 fixes these issues.
tmcl-it force-pushed the patch-dockerfile branch from 527a056 to 8f00c31 July 20, 2021 08:52