Stream support for exporting pdbs not working with OTHERS record #141

gate-tec · 2024-03-04T00:52:51Z

Describe the bug

When trying to export pdb data with ATOM and OTHERS entries using .to_pdb_stream I always get a pandas.errors.IntCastingNaNError (cf. Steps/Code to Reproduce).
As I need to maintain the TER markers in the resulting pdb data, the content of the OTHERS frame is necessary.

Writing directly to a pdb file with .to_pdb does not trigger this issue. A possible fix would be a shared base function for both methods, or letting to_pdb take the desired output (i.e. file or stream) as a parameter, as mentioned in #108.

Steps/Code to Reproduce

Example:

from biopandas.pdb import PandasPdb

pdb_df = PandasPdb().fetch_pdb('1ou5')
out_string = pdb_df.to_pdb_stream(records=('ATOM', 'OTHERS'))

Expected Results

Stream containing the specified records in pdb format.

Actual Results

A pandas.errors.IntCastingNaNError stemming from Line 909 in pandas_pdb.py

df.residue_number = df.residue_number.astype(int)

which is executed on the entire concatenated DataFrame.
As the OTHERS frame doesn't contain residue number entries, these cells are always NaN after concatenating.
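
The failure mode can be reproduced with plain pandas, independent of biopandas. The following is a minimal sketch, assuming two toy frames that stand in for the ATOM and OTHERS tables; the column values and the `entry` content are made up for illustration and do not mirror the real frames exactly.

```python
import pandas as pd

# Toy stand-ins for the ATOM and OTHERS frames (illustrative data only).
atom = pd.DataFrame({"record_name": ["ATOM", "ATOM"], "residue_number": [1, 2]})
others = pd.DataFrame({"record_name": ["TER"], "entry": ["placeholder TER payload"]})

# Concatenating fills the missing residue_number of the TER row with NaN,
# which silently upcasts the column from int64 to float64.
combined = pd.concat([atom, others], ignore_index=True)

try:
    combined["residue_number"].astype(int)
    cast_failed = False
except ValueError as exc:
    # Recent pandas versions raise IntCastingNaNError, a ValueError subclass.
    cast_failed = True
    print(type(exc).__name__, exc)
```

Catching `ValueError` keeps the sketch working on pandas versions that predate the dedicated `IntCastingNaNError` class.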

Versions

biopandas 0.5.0dev
Linux-5.4.0-91-generic-x86_64-with-glibc2.31
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
Scikit-learn 1.3.0
NumPy 1.23.5
SciPy 1.11.1

a-r-j · 2024-08-01T15:47:40Z

Hi @gate-tec thanks for raising.

I think we should switch this to: pd.to_numeric(df.residue_number, errors='coerce') and subsequently strip the NaNs. What do you think?
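
One way the suggestion above could look in practice is sketched here. It is an illustration only, not the code that landed in the referenced commit: the toy frame, the choice of the nullable `Int64` dtype, and rendering missing values as empty strings are all assumptions.

```python
import pandas as pd

# Toy frame: the TER row has no residue number (illustrative data only).
df = pd.DataFrame(
    {"record_name": ["ATOM", "TER", "ATOM"], "residue_number": ["1", None, "2"]}
)

# Coerce anything non-numeric (including missing values) to NaN instead of raising.
numeric = pd.to_numeric(df["residue_number"], errors="coerce")

# The nullable integer dtype keeps integers as integers while tolerating gaps.
as_nullable = numeric.astype("Int64")

# When writing text output, missing residue numbers become empty fields.
rendered = ["" if pd.isna(v) else str(int(v)) for v in numeric]
print(rendered)
```

Keeping the rows and blanking the field, rather than dropping rows with NaN, preserves the OTHERS entries (such as TER markers) in the output.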

* include testing on newer python versions
* bump version string
* linting: remove print statements from tests
* fix: improve robustness of and add a test #141
* fix: add init to test data module
* fix: add init to remaining test data modules
* tests: add tests with github actions
* tests: add tests with github actions
* tests: rename build job
* update changelog

---------

Co-authored-by: Arian Jamasb <[email protected]>

a-r-j added a commit that referenced this issue Aug 1, 2024

fix: improve robustness of and add a test #141

5473c6b

a-r-j closed this as completed Aug 1, 2024
