Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pygmt.binstats: Add alias "tiling" for "T" #2409

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

yvonnefroehlich
Copy link
Member

@yvonnefroehlich yvonnefroehlich commented Mar 10, 2023

Description of proposed changes
This PR aims to add a alias for pygmt.binstats:

  • T -> tiling

The docstring is orientated on https://docs.generic-mapping-tools.org/dev/gmtbinstats.html#t.
Related to comment: https://github.com/GenericMappingTools/pygmt/pull/2376/files#r1131483527.

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If wrapping a new module, open a 'Wrap new GMT module' issue and submit reasonably-sized PRs.
  • If adding new functionality, add an example to docstrings or tutorials.
  • Use underscores (not hyphens) in names of Python files and directories.

Slash Commands

You can write slash commands (/command) in the first line of a comment to perform
specific operations. Supported slash commands are:

  • /format: automatically format and lint the code
  • /test-gmt-dev: run full tests on the latest GMT development version

@yvonnefroehlich yvonnefroehlich added the documentation Improvements or additions to documentation label Mar 10, 2023
@yvonnefroehlich yvonnefroehlich added this to the 0.9.0 milestone Mar 10, 2023
@yvonnefroehlich yvonnefroehlich self-assigned this Mar 10, 2023
@yvonnefroehlich yvonnefroehlich added enhancement Improving an existing feature and removed documentation Improvements or additions to documentation labels Mar 10, 2023
@yvonnefroehlich yvonnefroehlich changed the title WIP: pygmt.binstats: Add aliase "tiling" for "T" pygmt.binstats: Add aliase "tiling" for "T" Mar 11, 2023
@yvonnefroehlich yvonnefroehlich added the needs review This PR has higher priority and needs review. label Mar 11, 2023
@weiji14 weiji14 changed the title pygmt.binstats: Add aliase "tiling" for "T" pygmt.binstats: Add alias "tiling" for "T" Mar 11, 2023
pygmt/src/binstats.py Outdated Show resolved Hide resolved
Comment on lines +98 to +100
For **h**, we write a table with the centers of the hexagons and
the computed statistics to standard output (or to the file named
in ``outgrid``). Here, the ``spacing`` setting is expected to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using tiling="h" outputs the statistics to a table, which was why this alias was not added in the orginal implementation, see #1652 (comment).

The way we've handle functions that output to either a grid or table was discussed in #1536, which is to use a "Two methods in a single Python class" like pygmt.triangulate and pygmt.grdhisteq. Unfortunately, if we follow this style, that would require breaking changes to pygmt.binstats 🙂

Copy link
Member

@weiji14 weiji14 Mar 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so @maxrjones mentioned at #1652 (review) that the flag to output statistics to a table is actually more similar to #896 (which is for fig.* plotting functions)? But gmtbinstats is not a plotting function, so this is kinda awkward 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @weiji14 for this clarification and background information. As tiling is already used in the docstring for search_radius I thought that adding the doctstring for tiling was simply overlooked...

search_radius (float or str) – Set the search_radius that determines which data points are considered close to a node. Append the distance unit. Not compatible with tiling.

I am unsure how to continue with this PR. Should we leave it open or should we have a separate issue for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure how to continue with this PR. Should we leave it open or should we have a separate issue for this?

Let's leave the PR open for now, the discussion on whether to turn pygmt.binstats into two separate methods (e.g. pygmt.binstats.to_grid or pygmt.binstats.to_table) can happen here. Anything more general can also be discussed on the open thread at #1536.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand this means that it's also not possible to plot the hexagons outlines (polygons )that are used to calculate the stats but to get the center coordinates and stats values for the corresponding hexagon?

Copy link
Member Author

@yvonnefroehlich yvonnefroehlich Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelgrund what is your concrete use-case here? Please note, that hexagonal binning is only available for Cartesian data and that only the y-increment can be given and GMT calculates the x-increment given the geometry (https://docs.generic-mapping-tools.org/latest/gmtbinstats.html#t). Based on the upstream GMT documentation, I feel it is not possible to get the outline of the hexagons directly.

Are you looking for figures like this:

outlines outlines + color-coding
binstats_hexagons_outline binstats_hexagons_outline_cmap

In some cases, one can try to determine the (regular) hexagon based on the center of the hexagon (first two columns of the output file via outgrid) and the spacing between the centers in y-direction (value passed to spacing). The outline can then be plotted (be careful with width and height of the figure) using the centers of the hexagon and style="h" + str(determined_diameter_of_hexagon) + "c". Additionally, the stats output can be used to color-code the fill of the hexagons (please un-comment the corresponding parts in the code below). However, I feel there are other libraries / packages which are more suitable to generate such a visualization.

Code to reproduce the figures shown above:
Test data (Please place it in your work dictionary): test_data_in.txt

import numpy as np
import pygmt as gmt

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# >>> Adjust for your needs <<<
#
# Give y increment
# Passed to spacing
# In case of hexagon binning only the y increment can be given;
# the x increment is calculated based on the given geometry
# Please see the upsteam GMT documentation
# https://docs.generic-mapping-tools.org/dev/gmtbinstats.html#t
# Last access: 2023/06/01
bin_spacing_y = 1.3

# Give study region
x_min = -1
x_max = 7
y_min = -1
y_max = 5
study_region = [x_min, x_max, y_min, y_max]

# Give size of figure
fig_width = 7

# Give paths and file names
data_in = "test_data_in.txt"
stats_out = "stats_out.txt"
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<


#-----------------------------------------------------------------------------
# Be carefulle if you change something here
delta_x = np.abs(x_max - x_min)
delta_y = np.abs(y_max - y_min)

scale_x = fig_width / delta_x

fig_hight = delta_y * scale_x

# Based on the geometry of a regular hexagon
diag_hexagon_temp = ( (bin_spacing_y/2) / np.cos(2*np.pi/360 * 30) ) * 2
diag_hexagon = diag_hexagon_temp * scale_x
bin_spacing_x = diag_hexagon * np.sqrt(3)


#-----------------------------------------------------------------------------
# Bin tabular data based on hexagons
# Calculate stats within hexagons here counts
# Write output to file: center of hexagons and stats
gmt.binstats(
    data=data_in,
    region=study_region,
    spacing=str(bin_spacing_y) + "+e",
    # +e to slightly adjust the max x or y to fit exactly the
    # given increment if needed [Default is to slightly adjust the increment
    # to fit the given domain].
    T="h",  # tiling using hexagons - only for Cartesian data
    statistic="n",  # n counts, a average, m median, z sum
    outgrid=stats_out,
)


#-----------------------------------------------------------------------------
# Plot statistic output
fig = gmt.Figure()

fig.basemap(
    region=study_region,
    projection="X" + str(fig_width) + "c/" + str(fig_hight) + "c",
    frame="a1g1f0.5",
)

# Set up colormap for counts per hexagon
trans_stats = 20
gmt.makecpt(
    cmap="grayC",
    series=[1, 6, 1],
    transparency=trans_stats,
)

# Plot hexagons color-coded by counts
fig.plot(
    data=stats_out,
    style="h" + str(diag_hexagon) + "c",
    pen="0.3p,darkred",
    # cmap=True,    # un-comment for color-coding
    # transparency=trans_stats,  # un-comment for color-coding
)

# Plot centers of hexagons
fig.plot(
    data=stats_out,
    style="p0.15",
    fill="darkred",
)

# Add labels with counts per hexagon
fig.text(
    textfiles=stats_out,
    font="5p",
    offset="0c/-0.2c",
    fill="white@30"
)

# Plot single records
fig.plot(
    data=data_in,
    style="p0.1",
    fill="darkorange",
)

# fig.colorbar(frame="xa1+lcounts within hexagon")  # un-comment for color-coding

fig.show()
# fig.savefig("binstats_hexagons_outline.png")

@weiji14 weiji14 marked this pull request as draft March 11, 2023 21:36
@yvonnefroehlich yvonnefroehlich modified the milestones: 0.9.0, 0.10.0 Mar 12, 2023
@yvonnefroehlich yvonnefroehlich removed the needs review This PR has higher priority and needs review. label Mar 12, 2023
@yvonnefroehlich yvonnefroehlich modified the milestones: 0.10.0, 0.11.0 Jun 11, 2023
@yvonnefroehlich yvonnefroehlich modified the milestones: 0.11.0, 0.12.0 Dec 5, 2023
@seisman seisman removed this from the 0.12.0 milestone Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improving an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants