base-runner: bump python to 3.10 #11420

Closed
wants to merge 2 commits into from

DavidKorczynski
Collaborator

Fixes: #11419

Signed-off-by: David Korczynski <[email protected]>
Signed-off-by: David Korczynski <[email protected]>
@maflcko
Contributor

maflcko commented Jan 5, 2024

I guess an alternative would be to bump the base image to Ubuntu 22.04, similar to #6305 . However, that may be more of a breaking change, so maybe it is less painful by bumping to 24.04 in one go, in the future.

edit: Though, I guess changing the Ubuntu version may be blocked on google/clusterfuzz#3290 ?

@apollo13
Contributor

apollo13 commented Feb 9, 2024

@DavidKorczynski Anything we can do to help move this forward?

@DavidKorczynski
Collaborator Author

> I guess an alternative would be to bump the base image to Ubuntu 22.04, similar to #6305 . However, that may be more of a breaking change, so maybe it is less painful by bumping to 24.04 in one go, in the future.

I think this would probably be preferred -- it would be nice to not fall too far behind the latest distro. But I'm not sure about the impact on all the projects.

@maflcko
Contributor

maflcko commented Apr 1, 2024

For reference, I did a quick try with Ubuntu 22.04, but it failed to compile honggfuzz. Not sure why, yet.

...
clang -c -O3 -funroll-loops -D_HF_LINUX_NO_BFD -std=c11 -I/usr/local/include -D_GNU_SOURCE -Wall -Wextra -Werror -Wno-format-truncation -Wno-override-init -I. -D_FILE_OFFSET_BITS=64 -Wno-initializer-overrides -Wno-unknown-warning-option -Wno-gnu-empty-initializer -Wno-format-pedantic -Wno-gnu-statement-expression -mllvm -inline-threshold=2000 -D_HF_ARCH_LINUX -fPIC -fno-stack-protector -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=0   -o libhfcommon/util.o libhfcommon/util.c
ar rcs libhfcommon/libhfcommon.a libhfcommon/files.o libhfcommon/log.o libhfcommon/ns.o libhfcommon/util.o
clang -o honggfuzz cmdline.o display.o fuzz.o honggfuzz.o input.o mangle.o report.o sanitizers.o socketfuzzer.o subproc.o linux/arch.o linux/bfd.o linux/perf.o linux/pt.o linux/trace.o linux/unwind.o libhfcommon/libhfcommon.a -pthread -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lm -L/usr/local/include -Wl,-Bstatic `pkg-config --libs --static libunwind-ptrace libunwind-generic` -lopcodes -lbfd -liberty -lz -Wl,-Bdynamic -lrt -ldl -lm -Wl,-Bstatic -lBlocksRuntime -Wl,-Bdynamic
/usr/bin/ld: /lib/x86_64-linux-gnu/libunwind-ptrace.a(_UPT_access_fpreg.o): relocation R_X86_64_32S against symbol `_UPT_reg_offset' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: failed to set dynamic section sizes: bad value
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [Makefile:295: honggfuzz] Error 1
The command '/bin/sh -c precompile_honggfuzz' returned a non-zero code: 2

@jonathanmetzman
Contributor

It will probably be a year before we update the Ubuntu image.

DaveLak added a commit to DaveLak/oss-fuzz that referenced this pull request Jun 4, 2024
The changes introduced here upgrade Python from 3.8 to 3.10.14 inside
the base-builder and base-runner images.

 ### base-builder changes:

Prior to these changes, base-builder compiled Python 3.8 from source
using sources downloaded from the official release servers at
https://www.python.org/ftp/python/. This updates the compiled version
to 3.10.14 (the latest 3.10 release) instead.

 ### base-runner changes:

Prior to these changes, base-runner installed Python 3.8 from the
default apt repository provided by the Ubuntu 20.04 image it's based
on. These apt repositories do not have a version of Python 3.10
available by default. This updates the base-runner to instead use a
multi-stage build to copy the same Python interpreter compiled by the
base-builder image into the runner image, which ensures both Python
versions remain in-sync while saving build time by re-using a pre-built
version.
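
A minimal sketch of how the "in-sync" property described above could be checked, assuming each image can report its interpreter via `python3 --version`. The helper names `parse_python_version` and `versions_in_sync` are hypothetical and not part of OSS-Fuzz; the version strings passed in are illustrative.

```python
import re
from typing import Optional, Tuple


def parse_python_version(output: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, micro) from `python3 --version` style output."""
    match = re.search(r"Python (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def versions_in_sync(builder_output: str, runner_output: str) -> bool:
    """Return True only if both outputs parse and name the same version."""
    builder = parse_python_version(builder_output)
    runner = parse_python_version(runner_output)
    return builder is not None and builder == runner


if __name__ == "__main__":
    # Illustrative strings; in practice these would be captured from each image.
    print(versions_in_sync("Python 3.10.14", "Python 3.10.14"))
    print(versions_in_sync("Python 3.10.14", "Python 3.8.10"))
```

Comparing the full three-part version, rather than only major and minor, would also catch a runner image that was built from a stale builder layer.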

 ## Motivation

- Code coverage does not work on Python projects that use Python 3.10+
  syntax, and will not work until this or similar changes are landed
  (see google#11419)
- Upgrading the base-image to use Ubuntu 22.04 (which provides more
  recent Python versions via apt) has been stated as being unlikely to
  happen any time soon (see google#3290)
- Many OSS-Fuzz integrated Python projects no longer support Python 3.8
  and have resorted to implementing ad-hoc workarounds to upgrade to
  newer Python versions, including installing Python from the Dead
  Snakes PPA.
  - This leads to fragmentation and hard-to-debug issues. Maintenance
    is easier when everyone is using the same version without issue.
- With [Python 3.8 reaching end of life soon (in 2024-10)][python-versions-EOL],
  it is likely that more Python projects will begin dropping support for
  3.8, further increasing the number of broken builds and ad-hoc
  workarounds.
- Previous attempts at upgrading Python have stalled.
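
The code coverage point in the list above can be illustrated with a short sketch. It assumes the breakage comes from an older interpreter being unable to parse newer syntax, and it uses the standard library's `ast.parse(..., feature_version=...)`, which only makes a best-effort attempt to emulate an older grammar. The helper `parses_with_grammar` and the sample snippet are hypothetical.

```python
import ast
from typing import Tuple

# A structural pattern matching statement, which is Python 3.10+ syntax.
MATCH_SNIPPET = """
def describe(value):
    match value:
        case 0:
            return "zero"
        case _:
            return "other"
"""


def parses_with_grammar(source: str, version: Tuple[int, int]) -> bool:
    """Best-effort check of whether `source` parses under a given grammar."""
    try:
        ast.parse(source, feature_version=version)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    # Plain assignments parse under the 3.8 grammar; `match` does not.
    print(parses_with_grammar("x = 1\n", (3, 8)))
    print(parses_with_grammar(MATCH_SNIPPET, (3, 8)))
```

Any tool that has to read the project's source with a 3.8 interpreter would hit the same `SyntaxError`, which is consistent with the coverage failure reported in google#11419.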

 ## Known & Expected Issues

Several project Dockerfiles and build scripts contain hard-coded
references to python3.8 file system paths, and many more have implemented
ad-hoc workarounds to upgrade to newer Python versions than 3.8
(typically 3.9). Additional changes are required to each of these
projects to ensure they successfully build after this upgrade to Python
3.10.
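
One way to locate the hard-coded references described above is a simple text scan. The sketch below is an assumption about how such a check might look, not tooling that exists in OSS-Fuzz; the function name `find_hardcoded_python` and the sample Dockerfile lines are hypothetical.

```python
import re
from typing import List, Tuple

# Matches versioned names such as "python3.8" or "python3.9".
HARDCODED_PYTHON = re.compile(r"python3\.(\d+)")


def find_hardcoded_python(text: str, target_minor: int = 10) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that pin a minor version other than the target."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in HARDCODED_PYTHON.finditer(line):
            if int(match.group(1)) != target_minor:
                hits.append((line_number, line.strip()))
                break  # report each line at most once
    return hits


# Hypothetical Dockerfile fragment used only for illustration.
SAMPLE = """\
RUN apt-get install -y python3.9 python3.9-dev
ENV PYTHONPATH=/usr/local/lib/python3.8/site-packages
RUN python3 -m pip install --upgrade pip
RUN python3.10 -m pip install atheris
"""

if __name__ == "__main__":
    for line_number, line in find_hardcoded_python(SAMPLE):
        print(f"{line_number}: {line}")
```

Lines that use the unversioned `python3` or already name 3.10 are left alone, so the output is limited to the references that would need manual review.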

 ### Fuzz Introspector Caveat

Fuzz Introspector currently uses Python 3.9. While an upgrade to 3.10 is
not expected to introduce any new issues, it was not tested on these
changes and may require additional work.

 ## Possible Areas of Improvement

Using the base-builder image in a multi-stage build to copy the pre-
compiled Python into base-runner is effective, but feels like a
workaround that may be introducing tech debt. A cleaner approach would
be to extract the Python compilation into a discrete base image similar
to how `base-clang` works, and use that as the multi-stage builder in
images that need it.

---

Fixes:
- google#11419

Supersedes:
- google#9532
- google#11420

[python-versions-EOL]: https://devguide.python.org/versions/
@DaveLak
Contributor

DaveLak commented Jun 4, 2024

@DavidKorczynski I took a slightly different approach at this in #12027.

Would you please take a look and let me know what you think whenever you get a chance?

Thanks!

oliverchang added a commit that referenced this pull request Nov 25, 2024
#12027)

> [!NOTE]
> I was looking for somewhere to get feedback from maintainers about
> this approach to the Python 3.10 upgrade before attempting it, but the
> discussion surrounding a Python upgrade has been rather fragmented
> across many issues, PRs, and comment chains.
>
> For that reason, I felt it would be easier to propose with a working
> example and dedicated PR.


#### Fixes:
- #11419
- #9638

#### Supersedes:
- #9532
- #11420


## Changes

The changes introduced here upgrade Python from 3.8 to 3.10.14 inside
the base-builder and base-runner images.

### Base Image Changes

| Image | Before Changes | After Changes |
|-------|----------------|---------------|
| **base-builder** | Compiled Python 3.8 from source using official release servers at https://www.python.org/ftp/python/. | Compiles Python 3.10.14 (the latest 3.10 release) from source using official release servers at https://www.python.org/ftp/python/. |
| **base-runner** | Installed Python 3.8 from the default apt repository provided by the Ubuntu 20.04 image. | Uses a multi-stage build to copy the Python 3.10.14 interpreter compiled by the base-builder image, ensuring version sync and saving build time by re-using a pre-built version. |


## Known Impact on Projects

### 3.9 Workarounds That Can Be Removed

| Project     | Fix Link        |
|-------------|-----------------|
| dask        | DaveLak@417bbf5 |
| docutils    | DaveLak@e4c21ff |
| dovecot     | DaveLak@7ab3ab6 |
| nbclassic   | DaveLak@5509b4e |
| pandas      | DaveLak@0642a7a |
| pybind11    | DaveLak@a5bbdb3 |
| pyodbc      | DaveLak@afa2b5e |
| qpid-proton | DaveLak@f5bf756 |

### Anticipated Build Failures

#### Preexisting Failures 

##### Fix is Prepared

| Project                 | Fix Link |
|-------------------------|----------|
| airflow                 | DaveLak@60a0368 |
| ipython                 | DaveLak@21ac68e |
| networkx                | DaveLak@fc2f8c5 |
| numpy                   | DaveLak@9383c87 |
| tensorflow-addons       | DaveLak@eed2bea |
| django (coverage build) | DaveLak@c724d61 |
| proto-plus-python       | DaveLak@37d973e |
| dnspython               | The upgraded pip version in the base-builder fixes the currently failing build. |

##### Fix Requires Upstream Changes

| Project | Issue |
|---------|-------|
| pyvex | Currently failing on python 3.9 because `archinfo` dependency requires >=3.10. Fails after the 3.10 upgrade because [the upstream build script needs `python3.9` replaced with `python3`](https://github.com/angr/pyvex/blob/f94c95636a3800c5bbd781ecf1e3fb0c0d9feec4/fuzzing/build.sh#L19-L23). |
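
The kind of replacement described for pyvex (`python3.9` becoming `python3`) can be sketched as a small rewrite. This is an illustration under assumptions: the function name `unpin_interpreter` is hypothetical, the sample lines are not taken from the pyvex build script, and the pattern deliberately skips path components and package names so that only interpreter invocations change.

```python
import re

# A versioned interpreter name that is not part of a path or a package name.
VERSIONED_INTERPRETER = re.compile(r"(?<![\w./-])python3\.\d+(?![\w./-])")


def unpin_interpreter(script: str) -> str:
    """Replace commands such as `python3.9` with the unversioned `python3`."""
    return VERSIONED_INTERPRETER.sub("python3", script)


# Hypothetical build script lines used only for illustration.
SAMPLE = """\
python3.9 -m pip install .
apt-get install -y python3.9-dev
export SITE=/usr/lib/python3.9/site-packages
"""

if __name__ == "__main__":
    print(unpin_interpreter(SAMPLE))
```

Only the first sample line is rewritten; the `python3.9-dev` package name and the `site-packages` path still need a manual decision, since an unversioned replacement there would point at something that may not exist.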

##### Requires More Investigation

| Project | Issue |
|---------|-------|
| matplotlib | Upgrading Python & Pyinstaller does resolve the build issues, but an error in the fuzz harness is exposed and must be resolved for check_build to pass. The exception: `TypeError: Parser.non_math() takes 2 positional arguments but 4 were given` in `File "fuzz_plt.py", line 43, in TestOneInput`. |
| scipy | Upgrading Python & Pyinstaller does resolve the build issues, but an error in the build step causes the build to fail. The error seems related to the linking: `/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25`. When `export LDFLAGS="-fuse-ld=lld"` is set, the error becomes: `ld.lld: error: undefined symbol: __asan_report_store4`. |
| pandas (Introspector only) | [This workaround in `build.sh` is the issue](https://github.com/google/oss-fuzz/blob/1515519a665756d8a50a6c46abac8b431e5462ef/projects/pandas/build.sh#L22-L32). |
| pycrypto | Failing with error: `SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats`. Seems like the issue described [here](https://stackoverflow.com/a/71019907). Pycrypto is deprecated and this is unlikely to be fixed upstream. |


## Possible Future Improvements

Using the base-builder image in a multi-stage build to copy the pre-
compiled Python into base-runner is effective, but feels like a
workaround that may be introducing tech debt. A cleaner approach would
be to extract the Python compilation into a discrete base image similar
to how `base-clang` works, and use that as the multi-stage builder in
images that need it.

### Fuzz Introspector Caveat

Fuzz Introspector currently uses Python 3.9. While an upgrade to 3.10 is
not expected to introduce any new issues, it was not tested on these
changes and may require additional work.

---

## Motivation

- Python [3.8 is reaching end of life in October
2024](https://devguide.python.org/versions/).
- The [Scientific Python Community already encourages dropping 3.8
support](https://scientific-python.org/specs/spec-0000/).
- This is evident when looking at which projects have resorted to
upgrading to newer Pythons using ad-hoc workarounds (see `numpy`,
`scipy`, `pandas`, etc.)
- It is likely that more Python projects will begin dropping support for
3.8, further increasing the number of broken builds and ad-hoc
workarounds.
- Code coverage does not work on Python projects that use Python 3.10+
syntax.
- Previous attempts at upgrading Python have stalled (see
google/clusterfuzz#3290 (comment)
& the issues linked under "Supersedes" above.)
- In recognition of the fact that OSS-Fuzz maintainers are stretched
thin, I thought I'd give it a shot.

---------

Co-authored-by: Oliver Chang <[email protected]>
Co-authored-by: Andrew Murray <[email protected]>
@radarhere
Contributor

#12027 has been merged instead.

Successfully merging this pull request may close these issues.

Python projects that have Python 3.10+ syntax have broken code coverage
7 participants