Baseline images need updates due to the recent updates of earth relief data #451

seisman · 2020-05-24T18:56:53Z

Description of the problem

Upstream GMT has recently updated the global earth relief data (see GenericMappingTools/gmtserver-admin#40 for the recipe change), which led to some test failures in PyGMT.

We need to re-generate the baseline images using the latest earth relief data and GMT 6.0.0.

See #401 for a previous fix.

seisman · 2020-05-24T19:01:30Z

While Travis CI reports several test failures, the Azure Pipelines CI doesn't. That's simply because we already cached the earth relief data in Azure Pipelines, so it is still using the old data. We need to change the cache key 20200519 to a new date to force the new earth relief data to be re-cached.

seisman · 2020-06-09T17:19:05Z

Re-opening the issue, since the GMT data server was updated again recently.

The biggest change is that the GMT data server now provides earth relief grids in both pixel- and gridline-registrations.

In GMT 6.1.0 (not released yet), users can give a name like @earth_relief_30m_p or @earth_relief_30m_g to specify which registration they want to use. If no registration is given, e.g., @earth_relief_30m, then the pixel-registration grid is the default.

For GMT 6.0.0, it's a breaking change. The grids change from gridline-registration to pixel-registration.

Thus, we need to update the failing PyGMT tests.
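To make the difference between the two registrations concrete, here is a minimal sketch using only NumPy. The helper name `grid_coords` is made up for illustration and is not part of GMT or PyGMT; it assumes a regular axis whose extent is an exact multiple of the spacing.

```python
import numpy as np


def grid_coords(start, stop, spacing, registration):
    """Return 1-D coordinates for a regular grid axis covering [start, stop].

    "gridline" puts nodes on the cell edges, so the first and last nodes
    sit exactly on the region boundaries.
    "pixel" puts nodes at the cell centres, half a spacing inside.
    (Hypothetical helper for illustration only.)
    """
    n_cells = int(round((stop - start) / spacing))
    if registration == "gridline":
        return start + spacing * np.arange(n_cells + 1)
    if registration == "pixel":
        return start + spacing * (np.arange(n_cells) + 0.5)
    raise ValueError(f"unknown registration: {registration!r}")


# Longitude axis of a global 30 arc-minute (0.5 degree) grid.
lon_g = grid_coords(-180, 180, 0.5, "gridline")
lon_p = grid_coords(-180, 180, 0.5, "pixel")
print(len(lon_g), lon_g[0], lon_g[-1])
print(len(lon_p), lon_p[0], lon_p[-1])
```

The gridline axis has one more node than the pixel axis (721 versus 720 here) and its values are shifted by half a cell, which is why numbers and images derived from the grids change when the registration changes.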

weiji14 · 2020-06-09T21:50:58Z

We'll need to be very careful about this change (from gridline/node-registration to pixel-registration). It's been touched upon before in #375. Personally I'm in favour of pixel registration since that's what xarray assumes (I think?), but we might want to change the following lines:

pygmt/pygmt/clib/session.py

Lines 567 to 569 in 0267dd1

    registration_int = self._parse_constant(
        kwargs.get("registration", "GMT_GRID_NODE_REG"), valid=REGISTRATIONS
    )

Currently it defaults to gridline-registration GMT_GRID_NODE_REG. Ideally we would be able to specify whether pixel- GMT_GRID_PIXEL_REG or gridline/node- GMT_GRID_NODE_REG is used.

seisman · 2020-06-10T03:31:32Z

Yes, it looks like a big issue.

Currently it defaults to gridline-registration GMT_GRID_NODE_REG.

Yes, the default registration in GMT is also gridline registration, so we will keep it as it is.

Ideally we would be able to specify whether pixel- GMT_GRID_PIXEL_REG or gridline/node- GMT_GRID_NODE_REG is used.

I agree. Then we need to figure out how to determine the registration of xarray.DataArray.
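One possible approach, sketched below under stated assumptions, is to compare the coordinate values against the known region: if the first and last coordinates sit on the region boundaries the axis is gridline-registered, and if they sit half a spacing inside it is pixel-registered. `guess_registration` is a hypothetical helper, not an existing PyGMT or xarray function, and it takes a plain coordinate array (for a DataArray this could be something like `grid.x.values`). Note that without the region, the coordinates alone cannot settle the question.

```python
import numpy as np


def guess_registration(coords, region_min, region_max):
    """Guess the registration of one regular grid axis.

    coords: 1-D, evenly spaced coordinate values of the axis.
    region_min, region_max: the extent the grid is supposed to cover.
    (Hypothetical helper for illustration only.)
    """
    coords = np.asarray(coords, dtype=float)
    if coords.size < 2:
        raise ValueError("need at least two coordinates to infer the spacing")
    spacing = coords[1] - coords[0]
    # Gridline: outermost nodes lie exactly on the region boundaries.
    if np.isclose(coords[0], region_min) and np.isclose(coords[-1], region_max):
        return "gridline"
    # Pixel: outermost nodes lie half a cell inside the region boundaries.
    if np.isclose(coords[0] - spacing / 2, region_min) and np.isclose(
        coords[-1] + spacing / 2, region_max
    ):
        return "pixel"
    raise ValueError("coordinates do not match the region in either registration")


print(guess_registration(np.linspace(-180, 180, 721), -180, 180))
print(guess_registration(np.linspace(-179.75, 179.75, 720), -180, 180))
```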

weiji14 · 2020-06-10T03:56:18Z

Then we need to figure out how to determine the registration of xarray.DataArray.

Yep, xarray uses pixel registration (i.e. the data value is for the centre of the pixel), see http://xarray.pydata.org/en/stable/plotting.html#coordinates, and also pydata/xarray#1468.

Just need to finish off some of my other work first, and I'll try to get a Pull Request up to resolve this tonight (actually had a local branch working on fixing #375 before).

weiji14 · 2020-06-19T04:22:28Z

In GMT 6.1.0 (not released yet), users can give a name like @earth_relief_30m_p or @earth_relief_30m_g to specify which registration they want to use. If no registration is given, e.g., @earth_relief_30m, then the pixel-registration grid is the default.

For GMT 6.0.0, it's a breaking change. The grids change from gridline-registration to pixel-registration.

Do you think it's a good idea to change the pygmt code to use @earth_relief_30m_g for now (instead of @earth_relief_30m_p), if only to make the tests pass and keep the existing behaviour? I'm a bit hesitant to update so many PNG files again (as with GenericMappingTools/gmt#3470), and #476 doesn't seem to be a quick fix to make.

seisman · 2020-06-19T04:53:18Z

Gridline-registered grids like @earth_relief_30m_g are not available in GMT 6.0.0.

So, it's a big breaking change for GMT 6.0.0. Perhaps for GMT 6.0.0, @earth_relief_30m should be aliased to @earth_relief_30m_g instead of @earth_relief_30m_p to keep backward compatibility? @PaulWessel

In GMT 6.0.0, both `01d` and `60m` are valid resolutions of earth relief data. It was called `60m` at the beginning and was changed to `01d` when GMT 6.0.0 was officially released. `60m` is still valid for backward compatibility.

Run the following commands, and you will have the two data files in the current directory.

```
gmt which @earth_relief_60m -Gl
gmt which @earth_relief_01d -Gl
```

These two files have different file names but are identical:

```
$ md5sum earth_relief_01d.grd earth_relief_60m.grd
74a884c902015dda516d17605f317efe  earth_relief_01d.grd
74a884c902015dda516d17605f317efe  earth_relief_60m.grd
```

In the upcoming GMT 6.1.0, the resolution `60m` will be deprecated. That's why we get ~25 errors when testing PyGMT with the GMT master branch, simply because GMT 6.1.0 can't download `@earth_relief_60m`.

To make the transition to GMT 6.1.0 easier, here I change the default earth relief resolution of the `load_earth_relief()` function from `60m` to `01d`. As the two grids are identical, the change in this PR won't break anything.

Note that there are currently ~43 failures due to the recent updates of the GMT data server (#451), and we can't fix these failures easily due to the grid registration issue (#476). Thus, I don't try to fix any failures in this PR. The test log files of the master branch and this branch are the same. Tests that fail in the master branch still fail in the same way in this branch.

Co-authored-by: Wei Ji
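The same identity check can be done from Python with the standard library. This is a minimal sketch: the helper names `md5_of` and `same_content` are made up here, and the two file names are simply the ones from the `md5sum` call above, assumed to be in the current directory.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)


if __name__ == "__main__":
    # Assumed file names, as downloaded by the two `gmt which` calls.
    grid_01d = Path("earth_relief_01d.grd")
    grid_60m = Path("earth_relief_60m.grd")
    if grid_01d.exists() and grid_60m.exists():
        print(same_content(grid_01d, grid_60m))
```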

seisman · 2020-06-26T22:58:20Z

@weiji14 Paul just changed the earth relief grids for GMT<=6.0.0 to gridline registration. I believe after updating the CI caches, we should see zero or only a few failures.

weiji14 · 2020-06-26T23:03:55Z

Awesome! I'll test things locally first, thanks for the update.

Update some numbers due to the recent changes in GMT earth relief data. We don't update the failing baseline images, to minimize image updates. Partially addresses #451.

weiji14 · 2020-07-09T03:22:40Z

Should we start using submodules? I.e. split the pygmt/tests/baseline folder into a separate repository (see https://docs.github.com/en/github/using-git/splitting-a-subfolder-out-into-a-new-repository).

I've been reading up on git submodules/subtrees/git-lfs and there doesn't seem to be an easy way to do this; there will be a learning curve in any case. Matplotlib currently has a big PR at matplotlib/matplotlib#17557 to move their baseline images into a separate place, and I really do not want myself or anyone else to have to handle that in X years.

seisman · 2020-07-09T03:48:37Z

It's too complicated. When we add a test, we have to open two separate PRs in two repositories, one for the baseline images and one for the tests. How can the tests PR know it should get the new baseline images in the corresponding branch?

weiji14 · 2020-07-09T03:55:14Z

Yeah, and I don't think it will be friendly for new contributors either. Surely there must be a better way to store the images, or test them ☹️

seisman · 2020-07-09T04:03:05Z

The GMT repository also faces the same problem. After some discussion, I feel the easiest way is to use a shallow clone so that we don't have to care too much about the repository size.

weiji14 · 2020-09-06T23:57:02Z

Yeah, and I don't think it will be friendly for new contributors either. Surely there must be a better way to store the images, or test them ☹️

Now that we have #555, we should be able to resolve this issue by removing all the xfail statements, specifically just for the grd* modules (the non-grd* ones should be covered by #522).

Should we remove the baseline images from the repository when we switch from using @pytest.mark.mpl_image_compare to @check_figures_equal()?

seisman · 2020-09-07T00:21:00Z

Now that we have #555, we should be able to resolve this issue by removing all the xfail statements, specifically just for the grd* modules (the non-grd* ones should be covered by #522).

For other non-grd tests, perhaps we could generate the reference images by directly passing arguments to GMT modules. For example,

fig_test.basemap(region=[0, 10, 0, 10], projection='X10c/10c', frame=['xaf', 'yaf', 'WSen'])

should be identical to the reference image generated by:

lib.call_module("basemap", "-R0/10/0/10 -JX10c/10c -Bxaf -Byaf -BWSen")

In this way, we can avoid almost all baseline images.
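The reason the two calls above should give the same image is that the keyword arguments are ultimately turned into a GMT argument string. The sketch below illustrates that translation in plain Python; `ALIASES` and `to_gmt_args` are hypothetical names written for this example and are not PyGMT's actual implementation.

```python
# Hypothetical alias table: long parameter name -> GMT single-letter option.
ALIASES = {"region": "R", "projection": "J", "frame": "B"}


def to_gmt_args(**kwargs):
    """Build a GMT-style argument string from keyword arguments.

    A sequence given for "region" is joined with "/" into one option;
    a sequence given for any other parameter repeats the option once
    per item. (Illustrative only, not PyGMT's real conversion code.)
    """
    parts = []
    for name, value in kwargs.items():
        flag = ALIASES[name]
        if isinstance(value, (list, tuple)):
            if name == "region":
                parts.append(f"-{flag}" + "/".join(str(v) for v in value))
            else:
                parts.extend(f"-{flag}{item}" for item in value)
        else:
            parts.append(f"-{flag}{value}")
    return " ".join(parts)


args = to_gmt_args(
    region=[0, 10, 0, 10],
    projection="X10c/10c",
    frame=["xaf", "yaf", "WSen"],
)
print(args)  # -R0/10/0/10 -JX10c/10c -Bxaf -Byaf -BWSen
```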

Should we remove the baseline images from the repository when we switch from using @pytest.mark.mpl_image_compare to @check_figures_equal()?

We can, but perhaps we should leave them in the repository for a while until we make the final decision.

seisman · 2020-09-07T00:25:44Z

For other non-grd tests, perhaps we could generate the reference images by directly passing arguments to GMT modules.

GMT sometimes makes tiny changes that may affect all images. For example, if the "bug" in GenericMappingTools/gmt#677 is fixed, we may have many failures and have to update the images again. Using the method I mentioned above can avoid these problems.

weiji14 · 2020-09-07T00:38:17Z

For other non-grd tests, perhaps we could generate the reference images by directly passing arguments to GMT modules. For example,
fig_test.basemap(region=[0, 10, 0, 10], projection='X10c/10c', frame=['xaf', 'yaf', 'WSen'])
should be identical to the reference image generated by:
lib.call_module("basemap", "-R0/10/0/10 -JX10c/10c -Bxaf -Byaf -BWSen")
In this way, we can avoid almost all baseline images.

This is almost asking for a command like pygmt.call_module 😆

Also, just to note on the pytest-mpl PR at matplotlib/pytest-mpl#95 (comment), we might be able to keep using @pytest.mark.mpl_image_compare but just return 2 figures instead of 1. Something like this:

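import pygmt
import pytest


@pytest.mark.mpl_image_compare
def test_check_equal():
    # Figure produced through the PyGMT-style arguments under test
    fig_test = pygmt.Figure()
    fig_test.basemap(region=[0, 10, 0, 10], projection='X10c', frame=True)

    # Reference figure produced with the equivalent GMT-style strings
    fig_ref = pygmt.Figure()
    fig_ref.basemap(region="0/10/0/10", projection="X10c/10c", frame="af")

    # Returning two figures relies on the behaviour proposed in the pytest-mpl PR
    return fig_test, fig_ref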

Should we remove the baseline images from the repository when we switch from using @pytest.mark.mpl_image_compare to @check_figures_equal()?

We can, but perhaps we should leave them in the repository for a while until we make the final decision.

Ok. We can remove them one at a time as upstream GMT starts breaking things.

seisman · 2020-10-15T00:52:19Z

Recently, upstream GMT made a "tiny" change to the order of plotting frames, ticks, gridlines, and data (see GenericMappingTools/gmt#4274). The change is "tiny" and unnoticeable to most users. However, it breaks some PyGMT tests. Such "tiny" changes never end. That's another reason to avoid storing baseline images.

weiji14 · 2020-10-15T00:54:07Z

Yes, was just about to report on these new failures at https://github.com/GenericMappingTools/pygmt/runs/1256473593?check_suite_focus=true#step:10:423

__________________________________ test_logo ___________________________________
Error: Image files did not match.
  RMS Value: 14.157876271754736
  Expected:  
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpssca0rm3/baseline-test_logo.png
  Actual:    
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpssca0rm3/test_logo.png
  Difference:
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpssca0rm3/test_logo-failed-diff.png
  Tolerance: 
    2
______________________________ test_logo_on_a_map ______________________________
Error: Image files did not match.
  RMS Value: 4.2480788344353515
  Expected:  
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpto92h1jj/baseline-test_logo_on_a_map.png
  Actual:    
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpto92h1jj/test_logo_on_a_map.png
  Difference:
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpto92h1jj/test_logo_on_a_map-failed-diff.png
  Tolerance: 
    2
_______________________________ test_plot_colors _______________________________
Error: Image files did not match.
  RMS Value: 2.483719660608718
  Expected:  
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpe1mt8j5j/baseline-test_plot_colors.png
  Actual:    
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpe1mt8j5j/test_plot_colors.png
  Difference:
    /home/runner/work/pygmt/pygmt/tmp-test-dir-with-unique-name/results/tmpe1mt8j5j/test_plot_colors-failed-diff.png
  Tolerance: 
    2
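For readers unfamiliar with the "RMS Value" and "Tolerance" fields in the log above, here is a rough sketch of how such a comparison works, assuming a plain root-mean-square over pixel values. The function names are invented for this example, and the actual comparison done by pytest-mpl and matplotlib may differ in details.

```python
import numpy as np


def rms_difference(expected, actual):
    """Root-mean-square difference between two images given as arrays."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if expected.shape != actual.shape:
        raise ValueError("images must have the same shape")
    return float(np.sqrt(np.mean((expected - actual) ** 2)))


def images_match(expected, actual, tolerance=2.0):
    """True if the RMS difference does not exceed the tolerance."""
    return rms_difference(expected, actual) <= tolerance


# Tiny synthetic "images": identical except for one pixel.
baseline = np.zeros((4, 4))
result = baseline.copy()
result[0, 0] = 40.0

rms = rms_difference(baseline, result)
print(rms, images_match(baseline, result))
```

With a tolerance of 2, as in the log, even a small shift in where frames or ticks are drawn can push the RMS value over the limit.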

seisman · 2020-12-15T07:45:31Z

It seems we can close the issue now.

Please re-open the issue if I missed anything.

seisman added the help wanted and maintenance labels May 24, 2020

seisman self-assigned this May 24, 2020

seisman mentioned this issue May 24, 2020

Update baseline images for updates of earth relief data #452

Merged

5 tasks

seisman closed this as completed in #452 May 25, 2020

seisman reopened this Jun 9, 2020

seisman pinned this issue Jun 9, 2020

weiji14 mentioned this issue Jun 10, 2020

Properly allow for either pixel or gridline registered grids #476

Merged

5 tasks

seisman mentioned this issue Jun 19, 2020

What to do when @earth_relief_xxy is given? GenericMappingTools/gmtserver-admin#53

Closed

seisman mentioned this issue Jun 23, 2020

Change load_earth_relief()'s default resolution to 01d #488

Merged

5 tasks

This was referenced Jun 26, 2020

Re-caching GMT remote files #495

Closed

Fix several failures due to updates of earth relief data #498

Merged

weiji14 added this to the 0.1.x milestone Jun 29, 2020

This was referenced Jul 2, 2020

Release PyGMT 0.1.2 #501

Closed

Temporarily expect failures for some grdcontour and grdview tests #503

Merged

weiji14 modified the milestones: 0.1.x, 0.2.x Jul 5, 2020

weiji14 mentioned this issue Sep 7, 2020

Remove expected failures on grdview tests #589

Merged

5 tasks

seisman mentioned this issue Sep 12, 2020

Refactor xfail tests to avoid storing baseline images #603

Merged

8 tasks

seisman modified the milestones: 0.2.0, 0.2.1 Sep 17, 2020

seisman modified the milestones: 0.2.1, 0.3.0 Nov 6, 2020

seisman mentioned this issue Dec 15, 2020

Change text when GMTInvalidInput error is raised for basemap #729

Merged

seisman closed this as completed Dec 15, 2020

seisman unpinned this issue Dec 15, 2020

weiji14 mentioned this issue Feb 6, 2021

Wrap grd2cpt #803

Merged

seisman mentioned this issue Feb 24, 2021

Rethink the testing mechanism for images #963

Closed
