Skip to content

Commit

Permalink
Merge branch 'main' into feat/preserve-median
Browse files Browse the repository at this point in the history
  • Loading branch information
stefmolin authored Sep 23, 2024
2 parents d3eba8d + e97b548 commit 9e438c3
Show file tree
Hide file tree
Showing 26 changed files with 3,010 additions and 151 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ morphed_data/

# dev setup
.vscode/
.idea/

# Byte-compiled / optimized / DLL files
__pycache__/
Expand Down
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -25,14 +25,14 @@ repos:
exclude: (\.(svg|png|pdf)$)|(CODE_OF_CONDUCT.md)

- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.4.9
rev: v0.5.7
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix, --show-fixes]
- id: ruff-format

- repo: https://github.com/numpy/numpydoc
rev: v1.7.0
rev: v1.8.0
hooks:
- id: numpydoc-validation
exclude: (tests|docs)/.*
Expand All @@ -44,7 +44,7 @@ repos:
files: tests/.*

- repo: https://github.com/tox-dev/pyproject-fmt
rev: 2.1.3
rev: 2.2.1
hooks:
- id: pyproject-fmt
args: [--keep-full-version, --no-print-diff]
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ Set up the pre-commit hooks to make sure you can pass the CI checks:
$ pre-commit install
```

All commits will be squashed, so just make sure that the final commit passes all of the linting, documentation, and testing checks. These will run in GitHub Actions when you open a pull request, but you should also run them locally:
All commits will be squashed, so just make sure that the final commit passes all linting, documentation, and testing checks. These will run in GitHub Actions when you open a pull request, but you should also run them locally:

```shell
$ pre-commit run --all-files # linting and documentation format checks
Expand All @@ -29,7 +29,7 @@ $ cd docs && make html # build the documentation locally
Some things to remember:

- All code must be documented using docstrings in the [numpydoc style](https://numpydoc.readthedocs.io/en/latest/format.html) – the pre-commit hooks will check for this.
- Any changes to the API must be accompanied with either an additional test case or a new test. Run `pytest` to make sure your changes are covered.
- Any changes to the API must be accompanied by either an additional test case or a new test. Run `pytest` to make sure your changes are covered.
- Documentation for the project is built with Sphinx. Your changes must render correct in the output. Run `make html` from the `docs` directory and inspect the result.

## 3. Open a pull request
Expand All @@ -43,13 +43,13 @@ In your description, please do the following:

When you create your pull request, first-time contributors will need to wait for a maintainer to approve running the GitHub Actions workflows. Please be patient until this happens.

Once it does, the same checks described above (testing, documentation, linting) that you ran on your machine will run on Linux, MacOS, and Windows with multiple versions of Python. Please note that it is possible that differences in operating systems and/or Python versions results in a failure, despite it working on your machine.
Once it does, the same checks described above (testing, documentation, linting) that you ran on your machine will run on Linux, macOS, and Windows with multiple versions of Python. Please note that it is possible that differences in operating systems and/or Python versions results in a failure, despite it working on your machine.

If anything fails, please attempt to fix it as we're unlikely to review your code until everything passes. If stuck, please feel free to leave a note in the pull request enumerating what you have already tried and someone may be able to offer assistance.

## 4. Code review

After all of the checks in your pull request pass, a maintainer will review your code. In many cases, there will be some feedback to address, and this may require a few iterations to get to the best implementation. Remember to be patient and polite during this process.
After all checks in your pull request pass, a maintainer will review your code. In many cases, there will be some feedback to address, and this may require a few iterations to get to the best implementation. Remember to be patient and polite during this process.

## 5. Congratulations!

Expand Down
18 changes: 14 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@
<hr/>
</div>

Data Morph transforms an input dataset of 2D points into select shapes, while preserving the summary statistics to a given number of decimal points through simulated annealing.
Data Morph transforms an input dataset of 2D points into select shapes, while preserving the summary statistics to a given number of decimal points through simulated annealing. It is intended to be used as a teaching tool to illustrate the importance of data visualization (see the [Data Morph in the Classroom](https://github.com/stefmolin/data-morph/#data-morph-in-the-classroom) section for ideas).

<div align="center">
<img alt="Morphing the panda dataset into the star shape." src="https://raw.githubusercontent.com/stefmolin/data-morph/main/docs/_static/panda-to-star-eased.gif">
Expand Down Expand Up @@ -152,15 +152,25 @@ Note that the `result` variable in the above code block is a `pandas.DataFrame`

In this example, we morphed the built-in panda `Dataset` into the star `Shape`. Be sure to try out the other built-in options:

* The `DataLoader.AVAILABLE_DATASETS` attribute contains a list of available datasets, which are also visualized in the `DataLoader` documentation.
* The `DataLoader.AVAILABLE_DATASETS` attribute contains a list of available datasets, which are also visualized in the `DataLoader` documentation [here](https://stefaniemolin.com/data-morph/stable/api/data_morph.data.loader.html#data_morph.data.loader.DataLoader).

* The `ShapeFactory.AVAILABLE_SHAPES` attribute contains a list of available shapes, which are also visualized in the `ShapeFactory` documentation.
* The `ShapeFactory.AVAILABLE_SHAPES` attribute contains a list of available shapes, which are also visualized in the `ShapeFactory` documentation [here](https://stefaniemolin.com/data-morph/stable/api/data_morph.shapes.factory.html#data_morph.shapes.factory.ShapeFactory).

## Data Morph in the Classroom

Data Morph is intended to be used as a teaching tool to illustrate the importance of data visualization. Here are some potential classroom activities for instructors:

- **Statistics Focus**: Have students pick one of the [built-in datasets](https://stefaniemolin.com/data-morph/stable/api/data_morph.data.loader.html#data_morph.data.loader.DataLoader), and morph it into all available [target shapes](https://stefaniemolin.com/data-morph/stable/api/data_morph.shapes.factory.html#data_morph.shapes.factory.ShapeFactory). Ask students to comment on which transformations worked best and why.
- **Creativity Focus**: Have students [create a new dataset](https://stefaniemolin.com/data-morph/stable/custom_datasets.html) (*e.g.*, your school logo or something that the student designs), and morph that into multiple [target shapes](https://stefaniemolin.com/data-morph/stable/api/data_morph.shapes.factory.html#data_morph.shapes.factory.ShapeFactory). Ask students to comment on which transformations worked best and why.
- **Math and Coding Focus**: Have students [create a custom shape](https://stefaniemolin.com/data-morph/dev/shape_creation.html) by inheriting from `LineCollection` or `PointCollection`, and try morphing a couple of the [built-in datasets](https://stefaniemolin.com/data-morph/stable/api/data_morph.data.loader.html#data_morph.data.loader.DataLoader) into that shape. Ask students to explain how they chose to calculate the shape, and comment on which transformations worked best and why.

If you end up using Data Morph in your classroom, I would love to hear about it. Please [send me a message](https://stefaniemolin.com/contact/) detailing how you used it and how it went.

## Acknowledgements

This code has been altered by Stefanie Molin ([@stefmolin](https://github.com/stefmolin)) to work for other input datasets by parameterizing the target shapes with information from the input shape. The original code works for a specific dataset called the "Datasaurus" and was created for the paper *Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing* by Justin Matejka and George Fitzmaurice (ACM CHI 2017).

The paper and video can be found on the Autodesk Research website [here](https://www.research.autodesk.com/publications/same-stats-different-graphs-generating-datasets-with-varied-appearance-and-identical-statistics-through-simulated-annealing/). The version of the code placed on GitHub at [jmatejka/same-stats-different-graphs](https://github.com/jmatejka/same-stats-different-graphs), served as the starting point for the Data Morph code base, which is on GitHub at [stefmolin/data-morph](https://github.com/stefmolin/data-morph).
The paper and video can be found on the Autodesk Research website [here](https://www.research.autodesk.com/publications/same-stats-different-graphs-generating-datasets-with-varied-appearance-and-identical-statistics-through-simulated-annealing/). The version of the code placed on GitHub at [jmatejka/same-stats-different-graphs](https://github.com/jmatejka/same-stats-different-graphs), served as the starting point for the Data Morph codebase, which is on GitHub at [stefmolin/data-morph](https://github.com/stefmolin/data-morph).

Read more about the creation of Data Morph [here](https://stefaniemolin.com/articles/data-science/introducing-data-morph/) and [here](https://stefaniemolin.com/data-morph-talk/#/).

Expand Down
Loading

0 comments on commit 9e438c3

Please sign in to comment.