Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No compression for CLI test ouputs #924

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 30 additions & 2 deletions lib/improver/tests/utilities/test_save.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,18 +28,20 @@
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
# pylint: disable=E1136
"""Unit tests for saving functionality."""

import os
import unittest
import numpy as np
from unittest import mock
from subprocess import call
from tempfile import mkdtemp

import numpy as np
from netCDF4 import Dataset
import iris
from iris.coords import CellMethod
from iris.tests import IrisTest
from netCDF4 import Dataset

from improver.utilities.load import load_cube
from improver.utilities.save import (
Expand Down Expand Up @@ -182,6 +184,32 @@ def test_error_unknown_units(self):
with self.assertRaisesRegex(ValueError, "Cannot save 'unknown' cube"):
save_netcdf(no_units_cube, self.filepath)

def test_compressed_by_default(self):
"""Test saves file with compression by default. This is indicated by
the presence of the complevel attribute on the netcdf file."""

# Default expected to be compressed.
save_netcdf(self.cube, self.filepath)
cube = Dataset(self.filepath)
var = cube.variables['air_temperature']
compressed = var.filters().get('complevel', False)

self.assertTrue(compressed)

def test_uncompressed_using_env_var(self):
"""Test saves file without compression if the SAVE_COMPRESSED
environment variable is set to False. In this case the complevel
attribute should not be present on the netcdf file."""

# Environment variable should prevent compression.
with mock.patch.dict(os.environ, {'SAVE_COMPRESSED': 'False'}):
save_netcdf(self.cube, self.filepath)
cube = Dataset(self.filepath)
var = cube.variables['air_temperature']
compressed = var.filters().get('complevel', False)

self.assertFalse(compressed)


class Test__check_for_units(IrisTest):
"""Test function that checks for valid units"""
Expand Down
9 changes: 8 additions & 1 deletion lib/improver/utilities/save.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@
# POSSIBILITY OF SUCH DAMAGE.
"""Module for saving netcdf cubes with desired attribute types."""

import os
import ast
import cf_units
import warnings
import iris
Expand Down Expand Up @@ -142,6 +144,10 @@ def save_netcdf(cubelist, filename):
Raises:
warning if cubelist contains cubes of varying dimensions.
"""
# To avoid compression SAVE_COMPRESSED=False can be exported to the
# environment. Defaults True if not set.
save_compressed = ast.literal_eval(os.getenv("SAVE_COMPRESSED", "True"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer a CLI flag really. If we are going to go for an environment variable, it probably should be IMPROVER_SAVE_UNCOMPRESSED and just test it's non-empty. We should also document it in all the CLI help... so it probably doesn't save any effort vs a CLI flag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it also feels like reading environment within the save routine doesn't really belong there. Unfortunately, there doesn't seem any other easy way to implement that. CLI flag will force us for us add this to each and every CLI (because we have no way to handle common options across all CLIs).

A nice way to implement this could be with a context manager:

with improver.settings(netcdf_compression=False):
     ...

Unfortunately, we have no common point of entry into CLIs where we could add something like this.
CC @arh89, maybe you have some thoughts on this.

Copy link
Contributor

@arh89 arh89 Aug 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Agreed with @bayliffe about not doing this until after @neilCrosswaite's PR has gone in.)

Adding to all CLIs should be trivial, if we add to the COMPULSORY_ARGUMENTS in https://github.com/metoppv/improver/blob/master/lib/improver/argparser.py#L87, then all CLIs should pick them up. (NB: We'll still have to update all the BATS test outputs though, e.g help/nullargs)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if we wait until after Neil's work has gone in, we can trivially centralise the loading and saving, then we're golden 👍


if isinstance(cubelist, iris.cube.Cube):
cubelist = [cubelist]

Expand Down Expand Up @@ -174,5 +180,6 @@ def save_netcdf(cubelist, filename):

cubelist = _append_metadata_cube(cubelist, global_keys)
iris.fileformats.netcdf.save(cubelist, filename, local_keys=local_keys,
complevel=1, shuffle=True, zlib=True,
complevel=1, shuffle=True,
zlib=save_compressed,
chunksizes=chunksizes)
3 changes: 3 additions & 0 deletions tests/bin/bats
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,9 @@ export PATH="$BATS_LIBEXEC:$PATH"
# tests on different machines. Some CLI test results exhibit precision level
# sensitivity to the number of threads.
export OMP_NUM_THREADS=1
# Environment variable used to turn off compression in save wrapper.
# SAVE_COMPRESSED="False", results not compressed, tests run faster.
export SAVE_COMPRESSED="False"

options=()
arguments=()
Expand Down