Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QA and Release 0.6.0 #704

Closed
60 tasks done
deeplow opened this issue Feb 12, 2024 · 14 comments
Closed
60 tasks done

QA and Release 0.6.0 #704

deeplow opened this issue Feb 12, 2024 · 14 comments
Milestone

Comments

@deeplow
Copy link
Contributor

deeplow commented Feb 12, 2024

Pre-release

Before making a release, all of these should be complete:

QA

To ensure that new releases do not introduce regressions, and support existing
and newer platforms, we have to do the following:

  • Make sure that the tip of the main branch passes the CI tests.
  • Make sure that the Apple account has a valid application password and has
    agreed to the latest Apple terms (see macOS release
    section).
  • Create a test build in Windows and make sure it works:
  • Create a test build in macOS (Intel CPU) and make sure it works:
    • Check if the suggested Python version is still supported.
    • Create a new development environment with Poetry.
    • Build the container image and ensure the development environment uses the new image.
    • Run the Dangerzone tests.
    • Create and run an app bundle.
    • Test some QA scenarios (see Scenarios below).
  • Create a test build in macOS (M1/2 CPU) and make sure it works:
    • Check if the suggested Python version is still supported.
    • Create a new development environment with Poetry.
    • Build the container image and ensure the development environment uses
      the new image.
    • Run the Dangerzone tests.
    • Create and run an app bundle.
    • Test some QA scenarios (see Scenarios below).
  • Create a test build in the most recent Ubuntu LTS platform (Ubuntu 22.04
    as of writing this) and make sure it works:
    • Create a new development environment with Poetry.
    • Build the container image and ensure the development environment uses
      the new image.
    • Run the Dangerzone tests.
    • Create a .deb package and install it system-wide.
    • Test some QA scenarios (see Scenarios below).
  • Create a test build in the most recent Fedora platform (Fedora 39 as of
    writing this) and make sure it works:
    • Create a new development environment with Poetry.
    • Build the container image and ensure the development environment uses
      the new image.
    • Run the Dangerzone tests.
    • Create an .rpm package and install it system-wide.
    • Test some QA scenarios (see Scenarios below).
  • Create a test build in the most recent Qubes Fedora template (Fedora 38 39 as
    of writing this) and make sure it works:
    • Create a new development environment with Poetry.
    • Run the Dangerzone tests.
    • Create a Qubes .rpm package and install it system-wide.
    • Ensure that the Dangerzone application appears in the "Applications"
      tab.
    • Test some QA scenarios (see Scenarios below) and make sure
      they spawn disposable qubes.

(see also the previous QA for 0.5.1)

Publishing the Release

@deeplow
Copy link
Contributor Author

deeplow commented Feb 13, 2024

Fedora 39

QA passed on Fedora 39. However, I noticed that a document failure (sample_bad) yields a different message now due to the page streaming. It says that the conversion was interrupted rather than some file format issue. Not sure if this was intentional. I'd have to check the code. But we may want try and catch this error on the server side and pass it along to the user.

@deeplow
Copy link
Contributor Author

deeplow commented Feb 14, 2024

While testing on ubuntu, I noticed that interrupted conversions (sample_bad...) show a traceback instead of simply an error message. We should show an error message:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/base.py", line 74, in convert
    self.doc_to_pixels(document, t)
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/base.py", line 106, in doc_to_pixels
    n_pages = read_int(p.stdout)
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/base.py", line 38, in read_int
    raise errors.InterruptedConversionException()
dangerzone.conversion.errors.InterruptedConversionException: Something interrupted the conversion and it could not be completed.

@apyrgio
Copy link
Contributor

apyrgio commented Feb 14, 2024

Make sure that the tip of the main branch passes the CI tests.

The latest released container image (v0.5.1) does not actually pass security scanning, due to CVE-2023-5841. This will be fixed once we release the new image, but for all intents and purposes, our CI tests now pass.

@deeplow
Copy link
Contributor Author

deeplow commented Feb 14, 2024

Qubes (Fedora 39 template)

Works well but we also need to add our repo before installing the locally built RPM. This is to be expected with the new PySide6 dependency from the dangerzone repos.

Notes:

  • failed scenario 5 policy failure lead to "Interrupted conversion error" instead of the message that says something went wrong in Qubes. In particular, it failed the following:

    (Only for Qubes) The only message that the user should see is: "The document format is not supported", without any untrusted strings.
    but it does pass scenario 9.1 showing that it could not start disposable qube

  • 50% progress is repeated twice. We should have one of them be like up to 45% to avoid this repetition

    noticed

@apyrgio
Copy link
Contributor

apyrgio commented Feb 14, 2024

Windows

I've tested building a Docker image on Windows, and we have a minor issue. The end result is a corrupted image. More specifically, the tar archive starts with the following string:

failed to get console mode for stdout: The handle is invalid.

This string is returned by docker save (see docker/for-win#13891). It seems to be a regression in Docker that was introduced recently, and will be fixed soon.

In the meantime, we are not super impacted by this error, because we build this image on a different environment.


EDIT: Turns out that this is not the case. Even if we build the image in a separate environment, the user needs to ingest it. Unfortunately, the Docker call that fetches the current image always includes the faulty string. this means that Dangerzone will always believe that the container image is not installed, and will install it every time.

Note that this is something that affects existing installations as well, that update to the latest Docker Desktop.

@deeplow
Copy link
Contributor Author

deeplow commented Feb 15, 2024

Note that this is something that affects existing installations as well, that update to the latest Docker Desktop.

This sucks. We can tackle this in code, but this adds one more reason to implement #693

@apyrgio
Copy link
Contributor

apyrgio commented Feb 20, 2024

The macOS QA has finished successfully, and with that, we're done with the first round of the QA. On slightly related note, I found out that we couldn't select any .svg and .bpm files via the file browser. Turns out that we had omitted those by mistake. I sent a PR for this: #722

@deeplow
Copy link
Contributor Author

deeplow commented Feb 23, 2024

Large tests resuts

The large test set is a large set of documents. I ran the smallest of the sets, containing a total of 4962.

Comparing v0.5.0 to v0.6.0 we have the following

commit_v0.5.0-15-g8d70393 commit_v0.6.0-rc2-1-gd1000f8
full test time 9740.880 33984.981 *
test failures 66 49 **

* This means that some tests are taking and absurd amount of time. And because we removed timeouts some documents just took forever.

** The number of failures are the exact same when we consider that we removed timeouts. In v0.5.0 17 tests would just timeout while now they pass but take a long time. See


Why the time difference?

Digging into the data we have the following:

results

The files are sorted by the time they took and these were the ones that took the longest. As you can see v0.6.0 (with PyMuPDF) has 17 files that take 1000+ seconds. So much time so that without these slow tests these would be 4737, which is about half of what v0.5.0. This is consistent with some results we did previously, it's just that when we did those we still had timeouts and so they'd be terminated and fail after 65 seconds, while they now pass but take forever

Conclusion

PyMuPDF doesn't seem to add more time to the conversions nor increase the failures. Quite the contrary.

@deeplow
Copy link
Contributor Author

deeplow commented Feb 26, 2024

✅ Update the Installing Dangerzone page [nothing changed]

@apyrgio I checked this one because I don't think we needed to change anything there. Pinging you to double-check when you have availability.

@deeplow deeplow closed this as completed Feb 26, 2024
@deeplow deeplow added this to the 0.6.0 milestone Feb 26, 2024
@GWeck
Copy link

GWeck commented Mar 2, 2024

Upgrading Dangerzone 0.5.0 to 0.6.0 on Fedora 38 may / will break the OCR component. This can be easily fixed by appending the line

    export TESSDATA_PREFIX=/usr/share/tesseract/tessdata

to the file .bash_profile in the disposable template used for the Dangerzone dispVM.

@apyrgio
Copy link
Contributor

apyrgio commented Mar 4, 2024

Great catch! We had tested on Qubes a dev build for Fedora 38 templates, and a production build for Fedora 39 templates. And yet we missed it 😬 . The reason we missed it is:

  1. The PyMuPDF version on Fedora 39 is 1.23.3, which accepts the Tesseract data path as a separate argument. Our code checks for the PyMuPDF version and does pass the correct path:

    def get_tessdata_dir() -> str:
    if running_on_qubes():
    return "/usr/share/tesseract/tessdata/"
    else:
    return "/usr/share/tessdata/"

    if int(fitz.version[2]) >= 20230621000001:
    page_pdf_bytes = pixmap.pdfocr_tobytes(
    compress=True,
    language=ocr_lang,
    tessdata=get_tessdata_dir(),
    )
    else:
    # XXX method signature changed in v1.22.5 to add tessdata arg
    # TODO remove after oldest distro has PyMuPDF >= v1.22.5
    page_pdf_bytes = pixmap.pdfocr_tobytes(
    compress=True,
    language=ocr_lang,
    )

  2. The dev script on Qubes has this line, so that's why the issue did not manifest on our local tests:

    # XXX workaround lack of tessdata path arg for PyMuPDF < v1.22.5
    # for context see https://github.com/freedomofpress/dangerzone/issues/682
    os.environ["TESSDATA_PREFIX"] = os.environ.get("TESSDATA_PREFIX", "/usr/share/tesseract/tessdata")

@deeplow
Copy link
Contributor Author

deeplow commented Mar 4, 2024

I have moved the above issue to its own issue #737

@GWeck
Copy link

GWeck commented Mar 4, 2024

I checked now with Qubes R4.2.0 and Fedora 39. There everything is o.k.

@GWeck
Copy link

GWeck commented Mar 6, 2024

Now I checked the new version with Fedora 38. It's o.k., too. Thank you for correcting this so quickly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants