Remove the BioPython PDBParser for topology and trajectory readers #777
Comments
I will go ahead and take care of this in the coming days. |
I do not know if anybody calls the PDB parser directly, but it may be best to have a deprecation period for it. Anyway, no other part of the code should call it, and the documentation should be updated. |
You should also deprecate the |
I think the point here is we're trying to make our parser do everything the other one does, so it doesn't need deprecating as no functionality is lost? @jdetle as step 0 for this, I'd try reading every .pdb file we have in the test suite with |
@richardjgowers does our testsuite even use the BioPython trajectory reader? As far as I can tell from the coordinates tests, only the primitive reader is used. I actually started to notice a bunch of PDB-related issues when I tried to use the new common-API reader tests class defined in |
This is getting a little confusing, I thought we were talking about the Parser (reads topology). Either way, I find the whole permissive/primitive thing confusing and don't really understand what one does that the other doesn't |
Oh sorry I meant the trajectory reader, I never looked at the topology parsers so I don't have a clear idea about them |
@kain88-de there's a similar split and it could do with removing too. I guess you could rename this issue to cover both |
We want to get rid of the |
Duly noted, I am working on what @richardjgowers said earlier. I think in these initial stages everything is going to take longer than I think it should, because I'm getting familiar with the codebase and Python itself. I just ran into an interesting error regarding object equality that I'm investigating. By my understanding:
the print statement should yield "checking equal: True" but instead yields False, furthermore the test doesn't pass:
Is this done intentionally for some reason or are we missing some |
I am not sure readers implement any comparison operator. If they do not, the Python documentation tells us that objects are compared by their identity so
This means an object will only be equal to itself. In your example the two objects represent the same thing (a PDB reader for the same file), but they are different instances: i.e. In that case, the distinction is important, because a reader stores a state, and changing the state of one reader does not change the state of an identical reader. |
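The identity-based comparison described above can be illustrated with a minimal sketch. The `Reader` class here is a hypothetical stand-in written for this example, not the MDAnalysis reader; it only assumes that the class defines no `__eq__` method, so Python falls back to comparing object identity.

```python
class Reader(object):
    """Hypothetical stand-in for a trajectory reader (not the MDAnalysis class)."""

    def __init__(self, filename):
        self.filename = filename
        self.frame = 0  # the state each reader instance keeps for itself


a = Reader("example.pdb")
b = Reader("example.pdb")

# No __eq__ is defined, so == falls back to identity (the same test as `is`).
equal_to_itself = (a == a)
equal_to_twin = (a == b)

# Advancing one reader leaves the other untouched: they are separate instances.
b.frame = 5

print("checking equal:", equal_to_twin)
print("frames:", a.frame, b.frame)
```

If value-based comparison were wanted, the class would have to define `__eq__` itself, for example by comparing filenames. Whether that is desirable for objects that carry per-instance state is a design decision the thread leaves open.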
We don't implement a comparison for the |
@richardjgowers maybe I misunderstood, but I went ahead and tried to start testing the parsers, however I ran into an issue that I don't know how to get around.
throws the exception "Module not callable". How do I go about testing the .py versions of the modules? Does this require changing how my packages are linked, or something far simpler? |
@jdetle none of the parsers are written in Cython; that's not the issue. The problem here is the import: if you have a look at the source tree, PDBParser is a module that contains a class of the same name, so you need from MDAnalysis.topology.PDBParser import PDBParser to make this work. |
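The same module-versus-class mix-up can be reproduced with the standard library alone. This is only an analogy: `datetime` is used because, like the parser discussed above, it is a module that contains a class of the same name. The alias names `datetime_module` and `datetime_class` are made up for this sketch.

```python
import datetime as datetime_module                # this name is bound to the module
from datetime import datetime as datetime_class   # this name is bound to the class

message = None
try:
    # Calling the module instead of the class inside it raises TypeError.
    datetime_module(2016, 4, 2)
except TypeError as err:
    message = str(err)

# Importing the class from the module gives something callable.
stamp = datetime_class(2016, 4, 2)

print(message)
print(stamp.year)
```

The fix in both cases is the same: import the class from the module rather than calling the module itself.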
@dotsdl ah okay! Thank you. |
@dotsdl Did you write the test_topology file? It seems like you have actually already done most of what @richardjgowers suggested I do with the _TestTopology and TestPDB classes in test_topology.py. I wrote a kind of shoddy script from inspecting failures with this
My gut tells me that I should be separating this out into a test for each filename, so that there would be three failures and the rest would pass. I can do that, but since we are removing the BioPython parser anyway it seems moot. The parsed atom arrays are not equal for PDB_multiframe, PDB_conect and PDB_full. Given that the PrimitiveParser is tested extensively in test_topology, I think it's safe to say that these are instances in which the strict parser fails, which further reinforces why we are getting rid of it. If this conclusion seems valid I'll go ahead and delete the BioPython parser where it comes up and make a pull request. |
@jdetle so what you've described is called a test generator. An example of using them is here: Where the for loop goes over the list called So a simpler example is...

    from numpy.testing import assert_equal

    class TestAddition(object):  # can't use TestCase!
        def _check_addition(self, a, b, ref):
            # a single check; the generator below runs it once per tuple
            assert_equal(a+b, ref)

        def test_addition(self):
            for x, y, z in ((1, 2, 3), (7, 10, 1), (3, 4, 7)):
                yield self._check_addition, x, y, z

So the yield statement is creating tests: 3 tests will get created, with the 2nd test failing but the 3rd still running |
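To see why the third check still runs after the second one fails, here is a minimal sketch that drives the same kind of generator by hand. The helper `run_generated` is a hypothetical, much simplified stand-in for what a yield-aware test runner does; it is not the runner's actual implementation.

```python
def check_addition(a, b, ref):
    assert a + b == ref


def addition_cases():
    # Each yielded tuple is (callable, arg1, arg2, ...), i.e. one test per tuple.
    for x, y, z in ((1, 2, 3), (7, 10, 1), (3, 4, 7)):
        yield check_addition, x, y, z


def run_generated(cases):
    """Hypothetical mini-runner: call every yielded check, recording failures."""
    results = []
    for func, *args in cases:
        try:
            func(*args)
        except AssertionError:
            results.append((tuple(args), "FAIL"))
        else:
            results.append((tuple(args), "ok"))
    return results


results = run_generated(addition_cases())
for args, status in results:
    print(args, status)
```

Because each yielded tuple is executed and reported separately, a failure in one case does not stop the loop, which is the behaviour described above.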
@richardjgowers Awesome, I will certainly use that for a problem in the future. Big question: Do we still need the BioPython trajectory writer? If this is not true, by my understanding, we could get rid of the permissive flag in its entirety, but the pull request would be somewhat more substantial. |
Yeah I think we're trying to get rid of permissive variants of everything. But it's sometimes a lot easier for 1 issue to have 3 PRs if you can split it up nicely. |
Sorry I think I'm having a jargon issue, my understanding is that BioPython.PDBReader is a strict reader, in that it is not 'permissive' of weird pdb formats. We are getting rid of |
Yeah I get them confused too. We're killing the bio versions whatever On Sat, 2 Apr 2016 23:07 John Detlefs, [email protected] wrote:
|
Yeah, okay, so now I am fairly certain that we could get rid of the 'permissive_pdb_reader' flag in 'MDAnalysis/core/__init__.py' |
Yes we can get rid of that |
- 'permissive'=False has no effect anymore
- Added deprecation warnings for Primitive Readers/Writers and Parsers.
- Changed doc strings to eliminate references to BioPython Reader/Writer.
- Updated CHANGELOG to reflect changes.
- Changed tests to eliminate known failures caused by BioPython
- Updated CHANGELOGs and fixed wrong version number in AtomGroup.py
- Fixed indentation issue in PDB.py
- Fixed doc references to version strings and the permissive flags
- Got rid of extraneous text in PrimitivePDBParser, fixed scope of warnings
- Used boolean property of collections as suggested by QC
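As an illustration of what "added deprecation warnings" can look like in Python, here is a minimal sketch using the standard `warnings` module. The two classes are hypothetical stand-ins defined only for this example, and the warning text is an assumption; the actual MDAnalysis code and wording may differ.

```python
import warnings


class PDBReader(object):
    """Hypothetical stand-in for the reader that remains after the cleanup."""

    def __init__(self, filename):
        self.filename = filename


class PrimitivePDBReader(PDBReader):
    """Hypothetical deprecated alias: same behaviour, but warns on use."""

    def __init__(self, filename):
        warnings.warn(
            "PrimitivePDBReader is deprecated; use PDBReader instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        super(PrimitivePDBReader, self).__init__(filename)


# Record warnings so the deprecation can be checked, e.g. from a test.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    reader = PrimitivePDBReader("example.pdb")

print(len(caught), caught[0].category.__name__)
```

Keeping the old name as a thin subclass lets existing user code continue to work for a deprecation period while pointing users at the replacement.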
remove Bio.PDBParser (Issue #777)
@jdetle well done, that was a sizable chunk of work. I look forward to your GSoC contribution! |
@orbeckst Thanks! |
In the discussion in #775 it became clear that the BioPython Parser should be removed completely.