Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix bgzf test failure with biopython 1.80 #199

Merged
merged 1 commit into from
Dec 5, 2022

Conversation

emollier
Copy link
Contributor

Starting with version 1.80, biopython does more stringent checks on opening bgzf files. This is causing a series of test failures in pyfaidx 0.7.1:

tests/test_Fasta_bgzip.py::test_build_issue_126 XFAIL                    [ 11%]
tests/test_Fasta_bgzip.py::test_integer_slice FAILED                     [ 11%]
tests/test_Fasta_bgzip.py::test_integer_index FAILED                     [ 12%]
tests/test_Fasta_bgzip.py::test_fetch_whole_fasta FAILED                 [ 13%]
tests/test_Fasta_bgzip.py::test_line_len FAILED                          [ 14%]
tests/test_Fasta_bgzip.py::test_mutable_bgzf PASSED                      [ 14%]
tests/test_Fasta_bgzip.py::test_long_names FAILED                        [ 15%]
tests/test_Fasta_bgzip.py::test_fetch_whole_entry FAILED                 [ 16%]
tests/test_Fasta_bgzip.py::test_fetch_middle FAILED                      [ 17%]
tests/test_Fasta_bgzip.py::test_fetch_end FAILED                         [ 17%]
tests/test_Fasta_bgzip.py::test_fetch_border FAILED                      [ 18%]
tests/test_Fasta_bgzip.py::test_rev FAILED                               [ 19%]
tests/test_Fasta_bgzip.py::test_fetch_past_bounds FAILED                 [ 20%]
tests/test_Fasta_bgzip.py::test_fetch_negative FAILED                    [ 20%]
tests/test_Fasta_bgzip.py::test_fetch_reversed_coordinates FAILED        [ 21%]
tests/test_Fasta_bgzip.py::test_fetch_keyerror FAILED                    [ 22%]
tests/test_Fasta_bgzip.py::test_blank_string FAILED                      [ 22%]
tests/test_Fasta_bgzip.py::test_slice_from_beginning FAILED              [ 23%]
tests/test_Fasta_bgzip.py::test_slice_from_end FAILED                    [ 24%]
tests/test_Fasta_bgzip.py::test_issue_74_start FAILED                    [ 25%]
tests/test_Fasta_bgzip.py::test_issue_74_consistency FAILED              [ 25%]
tests/test_Fasta_bgzip.py::test_issue_74_end_faidx FAILED                [ 26%]
tests/test_Fasta_bgzip.py::test_issue_74_end_fasta FAILED                [ 27%]
tests/test_Fasta_bgzip.py::test_issue_79_fix FAILED                      [ 28%]
tests/test_Fasta_bgzip.py::test_issue_79_fix_negate FAILED               [ 28%]
tests/test_Fasta_bgzip.py::test_issue_79_fix_one_based_false FAILED      [ 29%]
tests/test_Fasta_bgzip.py::test_issue_79_fix_one_based_false_negate FAILED [ 30%]
tests/test_Fasta_bgzip.py::test_fetch_border_padded FAILED               [ 31%]
[…]
                try:
                    # mutable mode is not supported for bzgf anyways
>                   self.file = bgzf.BgzfReader(fileobj=self.file, mode="b")

/tmp/autopkgtest-lxc.i9z1xpfv/downtmp/build.bvQ/src/pyfaidx/__init__.py:412:
[…]
        if mode.lower() not in ("r", "tr", "rt", "rb", "br"):
>           raise ValueError(
                "Must use a read mode like 'r' (default), 'rt', or 'rb' for binary"
            )
E           ValueError: Must use a read mode like 'r' (default), 'rt', or 'rb' for binary

/usr/lib/python3/dist-packages/Bio/bgzf.py:597: ValueError
[…]
                try:
                    # mutable mode is not supported for bzgf anyways
                    self.file = bgzf.BgzfReader(fileobj=self.file, mode="b")
                except (ValueError, IOError):
>                   raise UnsupportedCompressionFormat(
                        "Compressed FASTA is only supported in BGZF format. Use "
                        "the samtools bgzip utility (instead of gzip) to "
                        "compress your FASTA."
E                       pyfaidx.UnsupportedCompressionFormat: Compressed FASTA is only supported in BGZF format. Use the samtools bgzip utility (instead of gzip) to compress your FASTA.

/tmp/autopkgtest-lxc.i9z1xpfv/downtmp/build.bvQ/src/pyfaidx/__init__.py:414: UnsupportedCompressionFormat

A full log is available on Debian CI infrastructure, as I first noticed this while attempting to upgrade biopython to version 1.80 in Debian experimental.

Looking at the main development branch, I suspect this breakage might still occur. This patch makes opening bgzf files explicitly in read-only mode, and thus does not trigger the error anymore, without causing regression in the test suite.

Note the comment in BgzfReader constructor mentions that one would:

typically use the top level bgzf.open(...) function
which will call this class internally. Direct use is discouraged.

So maybe this commit is more of a workaround than the proper fix.

Signed-off-by: Étienne Mollier [email protected]

Starting with version 1.80, biopython does more stringent checks on
opening bgzf files.  This is causing a series of test failures in
pyfaidx 0.7.1:

	tests/test_Fasta_bgzip.py::test_build_issue_126 XFAIL                    [ 11%]
	tests/test_Fasta_bgzip.py::test_integer_slice FAILED                     [ 11%]
	tests/test_Fasta_bgzip.py::test_integer_index FAILED                     [ 12%]
	tests/test_Fasta_bgzip.py::test_fetch_whole_fasta FAILED                 [ 13%]
	tests/test_Fasta_bgzip.py::test_line_len FAILED                          [ 14%]
	tests/test_Fasta_bgzip.py::test_mutable_bgzf PASSED                      [ 14%]
	tests/test_Fasta_bgzip.py::test_long_names FAILED                        [ 15%]
	tests/test_Fasta_bgzip.py::test_fetch_whole_entry FAILED                 [ 16%]
	tests/test_Fasta_bgzip.py::test_fetch_middle FAILED                      [ 17%]
	tests/test_Fasta_bgzip.py::test_fetch_end FAILED                         [ 17%]
	tests/test_Fasta_bgzip.py::test_fetch_border FAILED                      [ 18%]
	tests/test_Fasta_bgzip.py::test_rev FAILED                               [ 19%]
	tests/test_Fasta_bgzip.py::test_fetch_past_bounds FAILED                 [ 20%]
	tests/test_Fasta_bgzip.py::test_fetch_negative FAILED                    [ 20%]
	tests/test_Fasta_bgzip.py::test_fetch_reversed_coordinates FAILED        [ 21%]
	tests/test_Fasta_bgzip.py::test_fetch_keyerror FAILED                    [ 22%]
	tests/test_Fasta_bgzip.py::test_blank_string FAILED                      [ 22%]
	tests/test_Fasta_bgzip.py::test_slice_from_beginning FAILED              [ 23%]
	tests/test_Fasta_bgzip.py::test_slice_from_end FAILED                    [ 24%]
	tests/test_Fasta_bgzip.py::test_issue_74_start FAILED                    [ 25%]
	tests/test_Fasta_bgzip.py::test_issue_74_consistency FAILED              [ 25%]
	tests/test_Fasta_bgzip.py::test_issue_74_end_faidx FAILED                [ 26%]
	tests/test_Fasta_bgzip.py::test_issue_74_end_fasta FAILED                [ 27%]
	tests/test_Fasta_bgzip.py::test_issue_79_fix FAILED                      [ 28%]
	tests/test_Fasta_bgzip.py::test_issue_79_fix_negate FAILED               [ 28%]
	tests/test_Fasta_bgzip.py::test_issue_79_fix_one_based_false FAILED      [ 29%]
	tests/test_Fasta_bgzip.py::test_issue_79_fix_one_based_false_negate FAILED [ 30%]
	tests/test_Fasta_bgzip.py::test_fetch_border_padded FAILED               [ 31%]
	[…]
	                try:
	                    # mutable mode is not supported for bzgf anyways
	>                   self.file = bgzf.BgzfReader(fileobj=self.file, mode="b")

	/tmp/autopkgtest-lxc.i9z1xpfv/downtmp/build.bvQ/src/pyfaidx/__init__.py:412:
	[…]
	        if mode.lower() not in ("r", "tr", "rt", "rb", "br"):
	>           raise ValueError(
	                "Must use a read mode like 'r' (default), 'rt', or 'rb' for binary"
	            )
	E           ValueError: Must use a read mode like 'r' (default), 'rt', or 'rb' for binary

	/usr/lib/python3/dist-packages/Bio/bgzf.py:597: ValueError
	[…]
	                try:
	                    # mutable mode is not supported for bzgf anyways
	                    self.file = bgzf.BgzfReader(fileobj=self.file, mode="b")
	                except (ValueError, IOError):
	>                   raise UnsupportedCompressionFormat(
	                        "Compressed FASTA is only supported in BGZF format. Use "
	                        "the samtools bgzip utility (instead of gzip) to "
	                        "compress your FASTA."
	E                       pyfaidx.UnsupportedCompressionFormat: Compressed FASTA is only supported in BGZF format. Use the samtools bgzip utility (instead of gzip) to compress your FASTA.

	/tmp/autopkgtest-lxc.i9z1xpfv/downtmp/build.bvQ/src/pyfaidx/__init__.py:414: UnsupportedCompressionFormat

A [full log] is available on Debian CI infrastructure, as I first
noticed this while attempting to upgrade biopython to version 1.80 in
Debian experimental.

[full log]: https://ci.debian.net/data/autopkgtest/unstable/amd64/p/python-pyfaidx/28663555/log.gz

Looking at the main development branch, I suspect this breakage might
still occur.  This patch makes opening bgzf files explicitly in
read-only mode, and thus does not trigger the error anymore, without
causing regression in the test suite.

Note the comment in BgzfReader constructor mentions that one would:

> typically use the top level ``bgzf.open(...)`` function
> which will call this class internally. Direct use is discouraged.

So maybe this commit is more of a workaround than the proper fix.

Signed-off-by: Étienne Mollier <[email protected]>
@marcelm
Copy link
Contributor

marcelm commented Dec 1, 2022

I ran into this as well for #200. I think yours is the correct fix because bgzf.open would need a file name while the usage here is to re-open an existing file-like object (hence fileobj=).

@marcelm marcelm mentioned this pull request Dec 1, 2022
@emollier
Copy link
Contributor Author

emollier commented Dec 1, 2022

Cool, thanks for the confirmation! :)

@mdshw5
Copy link
Owner

mdshw5 commented Dec 5, 2022

Thanks for taking the time to provide this fix @emollier. It looks good and I will go ahead and merge.

@mdshw5 mdshw5 merged commit e91c3fe into mdshw5:master Dec 5, 2022
@emollier emollier deleted the biopython-1.80 branch December 18, 2022 17:18
@mdshw5
Copy link
Owner

mdshw5 commented Feb 15, 2023

I've released a new version (v0.7.2) that incorporates these changes. Thanks for helping out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants