-
Notifications
You must be signed in to change notification settings - Fork 264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for opening file from memory #652
Conversation
btw unittests are failing because it can't find the |
This is great, thanks! To fix the issue with the tests, take a look at the |
btw, why is netCDF4/_netCDF4.c checked in? it's generated when setup.py is run. |
We'll also need a test. _netCD4.c has to be manually generated. It should be created with all the values in include/constants.pyx set to 0. It's only used if cython is not installed. |
Probably we should just require cython and get rid of _netCDF.c in the repo. |
Cython is super-easy to install these days - there are wheels for many versions for OSX, Windows, Linux. |
working on updates, btw mind if I PEP'ify some of the files? get a lot of warnings in PyCharm |
ok, made nc_open_mem lookup dynamic and added a test. |
btw, I'm not even sure how the .c file here helps because isn't it specific to a python version / platform / cython version? Not sure how that ever worked correctly. The one in this PR is definitely wrong so need to figure out how to create a "correct" one. |
btw let me know how I should fix this new unittest error. I'm guessing it's compiling with all the settings off. Further it won't work until there's a version of netcdf-c with the fix |
The standard way to do this is to generate C files for upload to pypi, but not include them in the source. |
@@ -1712,7 +1721,7 @@ references to the parent Dataset or Group. | |||
the parent Dataset or Group.""" | |||
|
|||
def __init__(self, filename, mode='r', clobber=True, format='NETCDF4', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make filename
optional if memory
is provided?
Even if netCDF-C needs a bogus filename
argument, I imagine we could generate something at random.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It gets reflected in the Dataset.filepath
attribute so I think we should keep it. My unittest actually validates this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I guess so. It just seems pretty pointless to me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya, weirdness :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some text should be added to the docstring noting that filename
is only used to set the value of filepath
when memory
is specified.
Looks like the I'd like to remove _netCDF.c from the repo, but that should be a separate pull request. Are the other failing tests due to the bug in Unidata/netcdf-c#394? |
updated, so I think the tests are failing now because it's running the mem test with the feature disabled? |
The test fails on my machine even though I have 4.4.1 (with nc_open_mem). I guess this is because of Unidata/netcdf-c#394? If so, a solution would be to disable the test in run_all.py if |
btw, it seems like we should be able to use a memoryview insead reading http://docs.cython.org/en/latest/src/userguide/memoryviews.html?highlight=memory%20views.html. Let me know what you guys think |
""" | ||
cdef int grpid, ierr, numgrps, numdims, numvars | ||
cdef char *path | ||
cdef char namstring[NC_MAX_NAME+1] | ||
|
||
memset(&self._buffer, 0, sizeof(self._buffer)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably ought to add _buffer
to _private_atts
so it doesn't get mistaken for a netcdf attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
Don't see any real benefit to using memoryviews, since we're not slicing or manipulating the buffer in any way. |
netCDF4/_netCDF4.pyx
Outdated
# so it will compile with versions <= 4.2. | ||
NC_DISKLESS = 0x0008 | ||
NC_INMEMORY = 0x8000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't look like NC_INMEMORY
is ever used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
test/tst_open_mem.py
Outdated
class TestOpenMem(unittest.TestCase): | ||
def test_mem_open(self): | ||
# Needs: https://github.com/Unidata/netcdf-c/pull/400 | ||
if netCDF4.__netcdf4libversion__ >= '4.4.1.2': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just add this check in run_all.py so the test is skipped altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually updated the test to test different scenarios of this
Merging now - thanks again for this contribution! |
thank you for the quick responses! pleasure! |
Is there a working example/documentation based on this discussion for pulling a dataset via a url into memory using the newest netCDF4 version? I've read material on adding a memory=bytes construct in the init function to use the in-memory read capability from nc_open_mem(). However, my initial attempts have produced file location errors - "No such file or directory". I also ran a portion of the test script provided at https://github.com/Unidata/netcdf4-python/blob/master/test/tst_open_mem.py. An error was raised for netCDF4.netcdf4libversion = '4.4.1.1' (< 4.4.1.2). Do I need to pull from Unidata/netcdf-c#400 and integrate netcdf library changes directly into the package? Clarification is appreciated! |
I am testing on a Jupyter notebook using Python 3.5 and netCDF4 version 1.3.0, but could switch if needed. |
I think you need to link against the updated netcdf-c library. This one should work: https://github.com/Unidata/netcdf-c/releases/tag/v4.5.0-rc1 |
Another option is to pass it a filename to an empty file, like from |
Great, substituting a completely unrelated local netCDF file as the "filename" but also passing a memory file in byte format proved to be successful. The returned file dimensions were consistent with the memory file rather than the unrelated local netCDF file with different dimensions/variables. So that checks out. I also attempted unsuccessfully to use the temp file option using FILE_NAME = tempfile.NamedTemporaryFile(suffix='.nc', delete=False).name, mainly because it is much cleaner in my view. However, using a temp file as prescribed returned an OSError: NetCDF: Unknown file format. Should I format the temp file in a specific way? Thank you. |
I think "filename" just needs to be non-empty string |
If I just input a text string the code returns an OSError: No such file or directory, e.g. using nc_file = netCDF4.Dataset('temp_file', mode = 'r', memory=bytefile) |
what version of netcdf-c ? |
@TWellman I needed it to be a netCDF file (apparently--I guess I forgot about it). This is what I've done for now until netCDF 4.5 is released: with NamedTemporaryFile(suffix='.nc') as temp:
# Create a temporary netCDF file to work around bug in netCDF C 4.4.1.1
# We shouldn't actually need any file on disk.
nc_temp = Dataset(temp.name, 'w').close()
nc = Dataset(temp.name, memory=mem_buffer) |
I should probably upgrade as suggested earlier. I am using netCDF4.netcdf4libversion = '4.4.1.1', netCDF4 version 1.3.0. |
Great, thank you! |
fixes: #406 and #295, requires: Unidata/netcdf-c#394 to be fixed