Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Figure.plot: parameter aspatial does not accept integers for filling polygons #2497

Closed
michaelgrund opened this issue Apr 12, 2023 · 9 comments · Fixed by #2592
Closed

Figure.plot: parameter aspatial does not accept integers for filling polygons #2497

michaelgrund opened this issue Apr 12, 2023 · 9 comments · Fixed by #2592
Labels
bug Something isn't working
Milestone

Comments

@michaelgrund
Copy link
Member

michaelgrund commented Apr 12, 2023

Description of the problem

When using polygons (e.g. read in via geopandas), the aspatial parameter of Figure.plot does not accept integers for filling the corresponding polygons (the polygons remain empty). First they need to be converted to float.

Minimal Complete Verifiable Example

import pygmt
import geopandas as gpd

world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))

world = world[world["continent"] == "Europe"]
world["pop_est"] = world.pop_est * 1e-6

fig = pygmt.Figure()

pygmt.makecpt(cmap = "lajolla", series = [0, 150, 50], continuous = True)

fig.basemap(region = [-12, 35, 35, 60], 
            projection = "M12c", 
            frame = True)

fig.plot(data = world, 
         pen = "1p,black", 
         close = True, 
         fill = "+z", 
         cmap = True, 
         aspatial = "Z=pop_est")

fig.colorbar(position = "JMR", frame = 'a10+lPopulation in millions')

###############################################################################
fig.shift_origin(yshift = "-11c")

# first round data, then convert to int
world["pop_est_round"] = round(world["pop_est"])
world["pop_est_int"] = world.pop_est_round.astype(int)

fig.basemap(region = [-12, 35, 35, 60], 
            projection = "M12c", 
            frame = True)

fig.plot(data = world, 
         pen = "1p,black", 
         close = True, 
         fill = "+z", 
         cmap = True, 
         aspatial = "Z=pop_est_int")

fig.colorbar(position = "JMR", frame = 'a10+lPopulation in millions')

fig.show()

grafik

Full error message

No response

System information

PyGMT information:
  version: v0.9.0
System information:
  python: 3.9.16 | packaged by conda-forge | (main, Feb  1 2023, 21:39:03)  [GCC 11.3.0]
  executable: /home/michael/anaconda3/envs/pygmt/bin/python
  machine: Linux-5.15.0-67-generic-x86_64-with-glibc2.31
Dependency information:
  numpy: 1.23.2
  pandas: 1.4.3
  xarray: 2022.6.0
  netCDF4: 1.6.3
  packaging: 21.3
  contextily: 1.3.0
  geopandas: 0.11.1
  ghostscript: 9.54.0
GMT library information:
  binary version: 6.4.0
  cores: 8
  grid layout: rows
  image layout: 
  library path: /home/michael/anaconda3/envs/pygmt/lib/libgmt.so
  padding: 2
  plugin dir: /home/michael/anaconda3/envs/pygmt/lib/gmt/plugins
  share dir: /home/michael/anaconda3/envs/pygmt/share/gmt
  version: 6.4.0
@michaelgrund michaelgrund added the bug Something isn't working label Apr 12, 2023
@GenericMappingTools GenericMappingTools deleted a comment from welcome bot Apr 12, 2023
@seisman
Copy link
Member

seisman commented Apr 17, 2023

For GeoPandas objects, PyGMT calls the tempfile_from_json function to save the GeoPandas object into a ".gmt" file (using the OGR_GMT driver). In your case, it's equivalent to:

world.to_file(filename="world.gmt", driver="OGR_GMT", mode="w")

and pass the file world.gmt to the Figure.plot module.

If you open the file world.gmt, you will see:

# @Npop_est|continent|name|iso_a3|gdp_md_est|pop_est_round|pop_est_int          
# @Tdouble|string|string|string|string|double|string 

For unknown reasons, pop_est_int is in string type. If you manually change pop_est_int's type to double or integer, you will see that passing the file world.gmt to Figure.plot will work as expected.

@seisman seisman added this to the 0.10.0 milestone Apr 17, 2023
@seisman
Copy link
Member

seisman commented Apr 18, 2023

For unknown reasons, pop_est_int is in string type. If you manually change pop_est_int's type to double or integer, you will see that passing the file world.gmt to Figure.plot will work as expected.

In other words, we need to understand why pop_est_int is saved in string type (maybe a geopandas bug?) and how to correct set the data type.

@seisman
Copy link
Member

seisman commented Apr 20, 2023

Here is a simplified version to reproduce the issue:

>>> import geopandas as gpd
>>> world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
>>> world.dtypes
pop_est        float64
continent       object
name            object
iso_a3          object
gdp_md_est       int64
geometry      geometry
dtype: object
>>> world.to_file(filename="world.gmt", driver="OGR_GMT", mode="w")

The first few lines of the world.gmt file has:

# @GMULTIPOLYGON
# @Npop_est|continent|name|iso_a3|gdp_md_est
# @Tdouble|string|string|string|string

As you can see, world["gdp_md_est"] is in int64 type, but is stored as "string" type in world.gmt.

@seisman
Copy link
Member

seisman commented Apr 20, 2023

One more update here.

geopandas calls fiona to save a GeoDataFrame object to a .gmt file. The following script shows how it works.

The variable schema_int is the schema guessed by geopandas. As you can see, I tried to change the gdp_md_est's type from int32 and int64.

import geopandas as gpd
import fiona
from collections import OrderedDict

world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
schema_int = {
    "geometry": ["MultiPolygon", "Polygon"],
    "properties": OrderedDict(
        [
            ("pop_est", "float"),
            ("continent", "str"),
            ("name", "str"),
            ("iso_a3", "str"),
            ("gdp_md_est", "int"),
        ]
    ),
}
with fiona.open("world_int.gmt", mode="w", driver="OGR_GMT", schema=schema_int) as colxn:
    colxn.writerecords(world.iterfeatures())


schema_int32 = {
    "geometry": ["MultiPolygon", "Polygon"],
    "properties": OrderedDict(
        [
            ("pop_est", "float"),
            ("continent", "str"),
            ("name", "str"),
            ("iso_a3", "str"),
            ("gdp_md_est", "int32"),
        ]
    ),
}
with fiona.open("world_int32.gmt", mode="w", driver="OGR_GMT", schema=schema_int32) as colxn:
    colxn.writerecords(world.iterfeatures())

schema_int64 = {
    "geometry": ["MultiPolygon", "Polygon"],
    "properties": OrderedDict(
        [
            ("pop_est", "float"),
            ("continent", "str"),
            ("name", "str"),
            ("iso_a3", "str"),
            ("gdp_md_est", "int64"),
        ]
    ),
}
with fiona.open("world_int64.gmt", mode="w", driver="OGR_GMT", schema=schema_int64) as colxn:
    colxn.writerecords(world.iterfeatures())

The gdp_md_est column has the correct integer type in world_int32.gmt, and has the wrong string type in both world_int.gmt and world_int64.gmt.

Looks like a fiona issue?

@michaelgrund
Copy link
Member Author

michaelgrund commented Jun 23, 2023

Looks like a fiona issue?

Opened an issue in the fiona repo @seisman. Will wait for an answer and hopefully fix before continuing with the corresponding gallery example.

@michaelgrund
Copy link
Member Author

Update on the integer issue. This behavior is expected. The OGR_GMT driver only supports 32 bit integers, floats, and datetimes. All other types are converted to strings: https://github.com/OSGeo/gdal/blob/6e9103bd5acb1ff8da305c4e77fa30335ffa3b70/ogr/ogrsf_frmts/gmt/ogrgmtlayer.cpp#L769. This means we have to decide how we want to handle other types as input @GenericMappingTools/pygmt-maintainers.

@seisman
Copy link
Member

seisman commented Jun 30, 2023

We don't have many options:

  1. Ask the upstream GDAL to support more data types for the OGR_GMT driver
  2. Ask the upstream GMT to use the Z column (i.e, pop_est_int ) even it's marked as string type
  3. Find a workaround to make sure that we always pass 32 bit integers or floats.

For option 3, I thought world["pop_est_int"] = world.pop_est_round.astype("int32") should work, but it doesn't.

@seisman
Copy link
Member

seisman commented Jul 4, 2023

This issue is almost the same as geopandas/geopandas#967 and the workaround here geopandas/geopandas#967 (comment) works.

See #2592 for the patch.

@weiji14
Copy link
Member

weiji14 commented Jul 4, 2023

For option 3, I thought world["pop_est_int"] = world.pop_est_round.astype("int32") should work, but it doesn't.

Need to patch upstream geopandas for this. The PR at geopandas/geopandas#2950 should make this work on PyGMT v0.9.0:

pip install git+https://github.com/weiji14/geopandas.git@infer_int32_schema
import pandas as pd
import pygmt
import geopandas as gpd

world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
europe = world.query(expr="continent == 'Europe'")
europe.loc[:, "pop_est_int"] = round(europe.pop_est * 1e-6).astype(pd.Int32Dtype())

fig = pygmt.Figure()
pygmt.makecpt(cmap="lajolla", series=[0, 150, 50], continuous=True)
fig.basemap(region=[-12, 35, 35, 60], projection="M12c", frame=True)
fig.plot(
    data=europe,
    pen="1p,black",
    close=True,
    fill="+z",
    cmap=True,
    aspatial="Z=pop_est_int",
)
fig.colorbar(position="JMR", frame="a10+lPopulation in millions")
fig.show()

produces
pop

However, I did .astype(pd.Int32Dtype()) instead of .astype("int32"). Also, we'll still need #2592 if other dtypes like int64 are passed in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants