MemoryReader doesn't have a `filename` attribute #1027

kain88-de · 2016-10-14T08:53:52Z

Problem

When I need a real copy of an AtomGroup, I usually do

u2 = mda.Universe(ag.universe.filename, ag.universe.trajectory.filename)

This creates an independent copy of the original universe. This is useful for creating references from a single AtomGroup argument in analysis functions. It no longer works when the AtomGroup's trajectory is a MemoryReader.

My main issue is that I can't create independent copies of AtomGroups like I used to when I use the MemoryReader. There are two solutions for me: either I create a completely independent copy another way, or the MemoryReader keeps a reference to the original file used to create it (but I admit that might be confusing).

Desired Implementation

- add a filename attribute to the MemoryReader
  - set filename = None if constructed from an array or another anonymous source
  - set filename = <trajectory-filename> if it comes from a universe with an associated trajectory filename
- in Universe.transfer_to_memory, set the filename attribute
- test that the filename attribute of a MemoryReader instance built from an array is None
- test that the filename attribute of a MemoryReader instance built via transfer_to_memory is the same as the source filename
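
The checklist above can be sketched with stand-in classes. This is a toy illustration only, not MDAnalysis code: `ToyMemoryReader`, `ToyFileReader` and `ToyUniverse` are hypothetical names, and plain lists stand in for coordinate arrays.

```python
class ToyMemoryReader:
    """Hypothetical stand-in for an in-memory trajectory reader."""

    def __init__(self, coordinates, filename=None):
        self.coordinates = coordinates
        # None means "no file backs this data" (e.g. built from an array).
        self.filename = filename


class ToyFileReader:
    """Hypothetical stand-in for a file-based reader."""

    def __init__(self, filename, coordinates):
        self.filename = filename
        self.coordinates = coordinates


class ToyUniverse:
    """Hypothetical stand-in for a universe that owns one trajectory reader."""

    def __init__(self, trajectory):
        self.trajectory = trajectory

    def transfer_to_memory(self):
        old = self.trajectory
        # Copy the frames and remember which file they came from.
        self.trajectory = ToyMemoryReader(
            [list(frame) for frame in old.coordinates],
            filename=old.filename,
        )


# Reader built from an anonymous array: no filename.
from_array = ToyMemoryReader([[0.0, 0.0, 0.0]])

# Reader built via transfer_to_memory: the filename is carried over.
u = ToyUniverse(ToyFileReader("traj.dcd", [[1.0, 2.0, 3.0]]))
u.transfer_to_memory()
```

The two proposed tests then reduce to checking `from_array.filename is None` and `u.trajectory.filename == "traj.dcd"`.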

History

EDIT 2016-12-05: added desired implementation from MemoryReader doesn't have a filename attribute #1027 (comment) (@orbeckst)
@jbarnoud redefined the TODO, @richardjgowers edited the edit


richardjgowers · 2016-10-14T09:05:26Z

Really we should have a Universe.clone or Universe.copy method that creates
this.


kain88-de · 2016-10-14T09:16:59Z

But the filename might still be interesting for the MemoryReader to know from where the data originally comes.

kain88-de · 2016-10-14T09:27:47Z

I opened #1029 to discuss the copying of a universe. I would like some more feedback on whether the MemoryReader should hold some information about the original file used to load the data. @wouterboomsma and @mtiberti do you have an opinion on that?

richardjgowers · 2016-10-14T11:00:29Z

If I remember right, you can init a memoryreader with just a numpy array, so that's a nice corner case for someone to figure out too :)

wouterboomsma · 2016-10-17T20:24:41Z

@kain88-de, It would be a bit odd to have a filename in the MemoryReader. At least in our use cases, we frequently construct MemoryReaders directly from a numpy array, or as a merge from different trajectories. For instance, we often use something like this to merge universes:

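import numpy as np
import MDAnalysis as mda
from MDAnalysis.coordinates.memory import MemoryReader


def merge_universes(universes):
    """
    Merge list of universes into one

    Parameters
    ----------
    universes : list of Universe objects

    Returns
    -------
    Universe object
    """
    # Load every trajectory into memory first
    for universe in universes:
        universe.transfer_to_memory()

    # axis=1 is the frame axis when timeseries() returns (atoms, frames, coords)
    return mda.Universe(
        universes[0].filename,
        np.concatenate([e.trajectory.timeseries() for e in universes], axis=1),
        format=MemoryReader)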

As @richardjgowers suggests, a Universe.clone or Universe.copy could be a path forward, although this function would require an explicit isinstance test for MemoryReader, just as we already have in Universe.transfer_to_memory, which I can understand one might like to avoid. Another path would be to give readers a copy() or clone() method, but this would require adding support for copy-constructor-like functionality in Universe, rather than using the filenames directly:

u2 = mda.Universe(ag.universe.topology, ag.universe.trajectory.copy())

...which could of course also be hidden inside a Universe.copy(), bringing us back to @richardjgowers' suggestion :)

u2 = ag.universe.copy(copy_topology=False, copy_trajectory=True)
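
A rough sketch of the reader-level copy idea, using hypothetical toy classes (`ToyReader`, `ToyUniverse`) rather than the real MDAnalysis API. `copy.deepcopy` stands in for whatever a real reader would need to duplicate, and the `copy_topology` / `copy_trajectory` keywords simply mirror the call above.

```python
import copy


class ToyReader:
    """Hypothetical reader holding its frames as nested lists."""

    def __init__(self, frames, filename=None):
        self.frames = frames
        self.filename = filename

    def copy(self):
        # Deep copy so the clone shares no coordinate data with the original.
        return ToyReader(copy.deepcopy(self.frames), filename=self.filename)


class ToyUniverse:
    """Hypothetical universe built from objects instead of filenames."""

    def __init__(self, topology, trajectory):
        self.topology = topology
        self.trajectory = trajectory

    def copy(self, copy_topology=False, copy_trajectory=True):
        topology = copy.deepcopy(self.topology) if copy_topology else self.topology
        trajectory = self.trajectory.copy() if copy_trajectory else self.trajectory
        return ToyUniverse(topology, trajectory)


u = ToyUniverse(topology={"n_atoms": 1}, trajectory=ToyReader([[0.0, 0.0, 0.0]]))
u2 = u.copy(copy_topology=False, copy_trajectory=True)
u2.trajectory.frames[0][0] = 9.0  # modifies the clone only
```

In this sketch no isinstance test is needed in the universe, because each reader type is responsible for copying itself.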

richardjgowers · 2016-10-17T20:49:07Z

It might be handy if transfer_to_memory did store what the original filename was, and it doesn't cost anything to do it. But .filename isn't always going to be available.

kain88-de · 2016-10-17T21:01:24Z

What if we give a name when the MemoryReader is constructed from a universe with a known filename? If the MemoryReader is constructed from an array, we set the filename to None. Would something like that work?

richardjgowers · 2016-10-17T21:16:02Z

Yes, or something like 'numpy'


kain88-de · 2016-10-17T21:20:17Z

I would prefer a more general 'array', but my mind tells me that None is better, since we can catch it in library code and throw correct type errors for users.

orbeckst · 2016-10-20T17:51:12Z

Can we just decide to add filename = None to the MemoryReader, document it, and then let application code deal with it?

(For context: the main reason the NamedStream class exists is to add an actual filename so that code can decide on the format of the stream. This is not needed here because the Reader API is known.)

wouterboomsma · 2016-10-21T07:25:39Z

Sounds like a good compromise to have the filename attribute set when it makes sense.

orbeckst · 2016-10-26T00:51:14Z

I haven't heard anything else so far, so I conclude that the consensus is to

1. add a filename attribute to the MemoryReader
2. set filename = None if constructed from an array or another anonymous source
3. set filename = <trajectory-filename> if it comes from a universe with an associated trajectory filename

(Note on 3: The ChainReader will only give you the current trajectory filename without indication that there are other files in the chain.)

EDIT: properly paraphrased #1027 (comment)

utkbansal · 2017-01-28T18:30:16Z

I'd like to work on this.

utkbansal · 2017-01-29T04:46:06Z

@richardjgowers From what I understand, a MemoryReader is used when transfer_to_memory is called on the Universe object, or when in_memory=True is given while instantiating the Universe object.

What I don't understand is

u2 = mda.Universe(ag.universe.filename, ag.universe.trajectory.filename)
This create a independent copy of the original universe.

What I have seen in tests is something like this -
Universe(PDB_small, DCD, in_memory=True)

Is ag.universe.filename the PDB_small and ag.universe.trajectory.filename the DCD files? And the filename attribute in the MemoryReader will have the path to the DCD file?

jbarnoud · 2017-03-19T14:06:42Z

Edit: Moved TODO list into Issue main body - RG

@utkbansal Are you still interested in tackling this issue?

richardjgowers · 2017-03-19T14:41:56Z

I've moved the Issue goals into the main body of this issue so it's all in one (easy to find) place.

Is it safe to have MemoryReader.filename return two different types? (None or str).

jbarnoud · 2017-03-19T14:46:56Z

@richardjgowers It might break in a few places. But these places probably break already without us knowing. Anyway, it is better to have the API clearly defined so the rest of the code can adapt. Also, an attribute being None when not relevant looks rather common in python.

richardjgowers · 2017-03-19T14:49:37Z

Yeah it's definitely an improvement, just wondering if it's the best solution. As long as we document in the API that the attribute might be None then it's OK


utkbansal · 2017-03-19T14:57:25Z

@jbarnoud Sure, I'm happy to work on this.

kain88-de · 2017-03-19T15:09:03Z

Is this also the correct solution if we change the underlying numpy arrays? I commonly write code to clone a universe like this: u = mda.Universe(other.filename, other.trajectory.filename). I'm OK with this throwing an error due to a None value. But I'm concerned about the silent bug I get when this loads the original trajectory instead of making a copy of the memory reader. Maybe this is a sign that we need a real copy function for universes.
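
As a minimal sketch of how application code might guard the clone-by-filename idiom once filename can be None: `source_files` is a hypothetical helper, and `SimpleNamespace` objects stand in for a universe and its trajectory.

```python
from types import SimpleNamespace


def source_files(universe):
    """Return (topology_file, trajectory_file) or raise if either is unknown."""
    topology_file = universe.filename
    trajectory_file = universe.trajectory.filename
    if topology_file is None or trajectory_file is None:
        raise ValueError(
            "universe is not backed by files on disk; cannot re-load it by filename"
        )
    return topology_file, trajectory_file


# Stand-ins for a file-backed universe and one built from an anonymous array.
on_disk = SimpleNamespace(
    filename="top.pdb", trajectory=SimpleNamespace(filename="traj.dcd")
)
in_memory = SimpleNamespace(
    filename="top.pdb", trajectory=SimpleNamespace(filename=None)
)

files = source_files(on_disk)

try:
    source_files(in_memory)
    raised = False
except ValueError:
    raised = True
```

This only catches the None case. It does not help with the silent case described above, where a filename is still set but the in-memory coordinates have since been changed.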

orbeckst · 2017-03-20T20:11:05Z

Regarding #1027 (comment): should we consider URI-style names, e.g.

"file://trajectory.xtc"
"memory://trajectory.xtc"
None if not associated with any file-like object

(And if we ever get streams working: "https://pdb.org/.../1ake.mmtf" or similar.)

EDIT: This does not really address @kain88-de's concern, though. Rather, when changing the underlying array of MemoryReader, the filename attribute should be set to None to indicate that there exists no actual file-like object that contains the data.

jbarnoud · 2017-03-20T21:01:22Z

@orbeckst I am a bit worried adding a protocol prefix would make the use of the attribute overly complicated. Indeed, it means adding a parsing step. Also, we would have to have the protocol bit in the filename for every reader.
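
To make the extra parsing step concrete, here is a minimal sketch of what every consumer of a URI-style filename would need. `split_source` is a hypothetical helper, and treating a bare name as a plain file is an assumption made for illustration.

```python
def split_source(filename):
    """Split 'scheme://name' into (scheme, name); None stays (None, None)."""
    if filename is None:
        return None, None
    scheme, sep, name = filename.partition("://")
    if not sep:
        # No prefix present: assume a plain path on disk.
        return "file", filename
    return scheme, name


memory_case = split_source("memory://trajectory.xtc")
plain_case = split_source("trajectory.xtc")
anonymous_case = split_source(None)
```

Any code that currently passes reader.filename straight to a file-opening call would have to go through a step like this first.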

orbeckst · 2017-03-20T21:09:46Z

Yes, it would make it more complicated.

Also, we would have to have the protocol bit in the filename for every reader.

This is really not solving any immediate problem, so yes, ignore ;-). But I think it's worth thinking about eventually.

mnmelo · 2017-03-21T01:07:50Z

@orbeckst, rather than overloading the filename string with extra info we could add a kwarg or an attribute to the readers. Reader.source, for instance, defaulting to "file".
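
A sketch of that alternative, with hypothetical toy classes (no `source` attribute is confirmed to exist in MDAnalysis): the filename stays a plain path and the origin lives in a separate attribute.

```python
class ToyReader:
    """Hypothetical reader base: filename is a plain path, source is separate."""

    def __init__(self, filename, source="file"):
        self.filename = filename
        self.source = source


class ToyMemoryReader(ToyReader):
    """Hypothetical in-memory reader that records where its data lives."""

    def __init__(self, coordinates, filename=None):
        super().__init__(filename, source="memory")
        self.coordinates = coordinates


disk = ToyReader("trajectory.xtc")
mem = ToyMemoryReader([[0.0, 0.0, 0.0]], filename="trajectory.xtc")
```

Here `disk.filename` and `mem.filename` are identical strings that need no parsing, while `disk.source` and `mem.source` tell the two apart.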

Fixes #1027 Changes made in this Pull Request: Adds filename attribute to MemoryReader Adds tests

kain88-de added the question label Oct 14, 2016

orbeckst added Component-Readers Difficulty-easy and removed question labels Oct 26, 2016

orbeckst added the Format-MemoryReader label Nov 7, 2016

kain88-de mentioned this issue Mar 19, 2017

density_from_Universe does not work with the MemoryReader #1248

Closed

jbarnoud assigned utkbansal Mar 19, 2017

jbarnoud mentioned this issue Mar 19, 2017

Add a method to copy a Universe #1249

Closed

utkbansal mentioned this issue Mar 22, 2017

Adds filename attribute to MemoryReader & associated tests #1252

Merged

4 tasks

jbarnoud closed this as completed in #1252 Mar 22, 2017

jbarnoud pushed a commit that referenced this issue Mar 22, 2017

Adds filename attribute to MemoryReader & associated tests (#1252)

e48153f
