Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

numberOfAtoms to n_atoms and others #376

Closed
hainm opened this issue Aug 4, 2015 · 16 comments · Fixed by #387
Closed

numberOfAtoms to n_atoms and others #376

hainm opened this issue Aug 4, 2015 · 16 comments · Fixed by #387
Assignees
Milestone

Comments

@hainm
Copy link
Contributor

hainm commented Aug 4, 2015

guys,

as @orbeckst suggested from this issue #372, I am opening this issue to raise discussiong whether mdanalysis should change its behavior, like changing numberOfAtoms method to n_atoms property?

My point is quite simple: if your method is not expensive to evaluate, change it to attribute, which is much shorter (numberOfAtoms vs n_atoms). And The method's name should be shortened (as shown). The point is, we (including mdanalaysis) should follow some common patterns in Python's world to make users have smooth transition from one to another package. There is nothing wrong if "our" package is similar to others, a package still have some features that others don't have (for example, mdanalysis is quite good at membrane analysis (vs pytraj)).

PS: I myself wrote my own package (pytraj), so I am not benefit from the changing (:D) but mdanalysis.

My suggestion is too look at attributes from mdtraj package. It has pretty good naming stuff (one of pytraj's developers always suggesting using mdtraj in 1st place, so I think there must be a reason).

@hainm
Copy link
Contributor Author

hainm commented Aug 4, 2015

Just for example.

this is the code from mdtraj

import mdtraj as md
traj = md.load(filename, topname)

and this is from pytraj

import pytraj as pt
traj = pt.iterload(filename, topname)

there is nothing wrong with having similar API. The main point is to make users feel comfortable with the interface. There is seamless conversion from mdtraj to pytraj. This helps when mdtraj or pytraj need helps from another package.

traj = pt.Trajectory(xyz=m_traj.xyz, top=topology_name)

@richardjgowers
Copy link
Member

With n_atoms, numberOfAtoms etc, I'm in favour of getting rid of all of them and instead using len. AtomGroup, ResidueGroup and SegmentGroup are just containers, Python containers use len. Easiest to remember imo.

ag.numberOfAtoms() == len(ag)
ag.numberOfResidues() == len(ag.residues)

@orbeckst
Copy link
Member

orbeckst commented Aug 4, 2015

On 4 Aug, 2015, at 01:09, Richard Gowers wrote:

With n_atoms, numberOfAtoms etc, I'm in favour of getting rid of all of them and instead using len. AtomGroup, ResidueGroup and SegmentGroup are just containers, Python containers use len. Easiest to remember imo.

ag.numberOfAtoms() == len(ag)
ag.numberOfResidues() == len(ag.residues)

There's, however, something to be said for easy introspection. If I do

AtomGroup.n<TAB>

and get

AtomGroup.numatoms

I know immediately what it does.

I would definitely want to have len(AtomGroup) work, too. (And here I am actually disagreeing with the Zen of Python that there should be always only one way to do things... especially with overloading generic mechanisms it is not always clear what the domain-specific meaning is.)

I favor deprecating numberOfAtoms(), numberOfResidues(), numberOfSegments() and replacing them with managed properties numatoms, numresidues, numsegments (because we already have trajectory.numframes)... or if people like other variants like num_atoms or natoms or n_atoms better then we should deprecate numframes and turn it into n_frames.

(Note that I don't think "fast to type" is the biggest argument here --- clarity is more important for complicated and central data structures such as AtomGroup --- if anything, this is the object that most users will have contact with and that is probably the most confusing. Part of this and related issues is to make it behave more the way that an unsuspecting user would reasonably expect it to behave.)

@orbeckst
Copy link
Member

orbeckst commented Aug 4, 2015

On 3 Aug, 2015, at 21:00, Hai Nguyen wrote:

Just for example.

this is the code from mdtraj

import mdtraj as md
traj = md.load(filename, topname)
and this is from pytraj

import pytraj as pt
traj = pt.iterload(filename, topname)
there is nothing wrong with having similar API. The main point is to make users feel comfortable with the interface.

Although this is a separate issue, we could add a compatibility module along the lines of

import MDAnalysis.masquerade as mdtraj
u = mdtraj.load(trajectory, topology)

masquerade.load() would need to do some careful argument juggling to always get the topology (something like args[-1]) and do some sanity checks but at least this way we would have a non-intrusive way of accommodating alternative interfaces.

However, such a feature would very much depend on user demand and, well, user contributions, i.e. pull requests by the like of @hainm :-). (At the moment, the core devs have a lot to do with the most pressing other issues.)

@orbeckst orbeckst added this to the 0.11 milestone Aug 4, 2015
@orbeckst
Copy link
Member

orbeckst commented Aug 4, 2015

Just saw that Timestep also uses numatoms (#250) so this should also be consistent with whatever else we decide.

@dotsdl
Copy link
Member

dotsdl commented Aug 6, 2015

I'm in favor of n_atoms, n_residues, n_segments, as well as trajectory.n_frames. Although I think numatoms and friends are easier to say (n_atoms feels a bit weird, but that might just be me), I think n_... is easier to read. I just spent a couple minutes staring at both sets of variants to make up my mind.

@hainm
Copy link
Contributor Author

hainm commented Aug 6, 2015

and they are (n_atoms, n_...) are shorters to write code (we're not always use ipython with <TAB>)

@dotsdl
Copy link
Member

dotsdl commented Aug 6, 2015

I'm not so concerned with shorter (the difference is a single character), but readability matters. Any counterpoints to using n_...?

@richardjgowers
Copy link
Member

cough len cough :)

@hainm
Copy link
Contributor Author

hainm commented Aug 6, 2015

@richardjgowers
you meant len(ag.atoms) vs ag.n_atoms?

ag.n_atoms looks nicer and (like @orbeckst said), we can use <TAB> with n_atoms.

@dotsdl
I meant numberOfAtoms vs n_atoms (from @richardjgowers example)

@dotsdl
Copy link
Member

dotsdl commented Aug 6, 2015

@hainm no, scroll up.

@dotsdl
Copy link
Member

dotsdl commented Aug 6, 2015

@richardjgowers I agree with @orbeckst that we should keep methods for these things for introspection, since they would essentially be minimal effort to maintain. Although I am likewise tempted to say burn them all and just use len(), I don't think it's too big a deal. As long as we are consistent in the end. :D

@dotsdl dotsdl self-assigned this Aug 6, 2015
@kain88-de
Copy link
Member

For beginners having methods like 'n_atoms' is nicer. But why not support both?

Personally I would prefer 'numatoms' because the underscore is hard to reach.

@orbeckst
Copy link
Member

orbeckst commented Aug 6, 2015

Ok, final verdict:

  • n_atoms
  • n_residues
  • n_segments
  • n_frames

(and no getting rid of them in favor of len()... you purists all know the code inside out by now but I also have to consider what it looks like to new users... or PIs who occasionally still have to do real work... ;-) )

@dotsdl
Copy link
Member

dotsdl commented Aug 6, 2015

It will be done. On it.

@orbeckst
Copy link
Member

orbeckst commented Aug 9, 2015

@dotsdl , please add entry to CHANGELOG for n_atoms and friends – API breakage changes must be documented in the CHANGELOG.

@orbeckst orbeckst reopened this Aug 9, 2015
dotsdl added a commit that referenced this issue Aug 11, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants