feat: Correct and document crimea.json
#648
Notes on Mortality Rate Data and Existing Visualizations
1. Mortality Rate Data Derivation
The mortality rate columns from Nightingale's 1859 publication can be calculated from the raw death counts, making their inclusion in
Verification
Here's a Python implementation that reproduces the original mortality rates:

import pandas as pd

def transform_mortality_data(deaths_df):
    # Work on a copy so the caller's DataFrame is not modified.
    deaths_df = deaths_df.copy()
    # Annualized mortality per 1,000 troops: monthly deaths * 1000 * 12 / army size.
    deaths_df['Diseases'] = (deaths_df['disease'] * 1000 * 12) / deaths_df['army_size']
    deaths_df['Wounds'] = (deaths_df['wounds'] * 1000 * 12) / deaths_df['army_size']
    deaths_df['Other'] = (deaths_df['other'] * 1000 * 12) / deaths_df['army_size']
    # Split the date into the month name and year used by the original table.
    dates = pd.to_datetime(deaths_df['date'])
    deaths_df['Month'] = dates.dt.strftime('%B')
    deaths_df['Year'] = dates.dt.year.astype('int64')
    return deaths_df[['Month', 'Year', 'Diseases', 'Wounds', 'Other']].round(1)

Sample output matches the original data:
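As a separate sanity check of the formula itself, the annualization can be worked through by hand. This is only a sketch: the helper name `annualized_rate_per_1000` and the figures passed to it are hypothetical, chosen for easy arithmetic, and are not values from Nightingale's tables.

```python
def annualized_rate_per_1000(monthly_deaths: int, army_size: int) -> float:
    """Scale one month's deaths to an annual rate per 1,000 troops.

    Same arithmetic as the pandas version: deaths * 1000 * 12 / army_size,
    rounded to one decimal place.
    """
    return round(monthly_deaths * 1000 * 12 / army_size, 1)


# Hypothetical inputs (NOT historical data): 50 deaths in a month
# in an army of 10,000 gives 50 * 12,000 / 10,000 = 60.0.
disease_rate = annualized_rate_per_1000(50, 10_000)
wounds_rate = annualized_rate_per_1000(5, 10_000)
print(disease_rate, wounds_rate)
```

Because the rates are a pure function of the death counts and army size, a check like this is all that is needed to confirm they can be derived rather than stored.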
2. Related Visualization Work
There's an existing Vega implementation by @avatorl that uses the transformed mortality rate data. See a 2022 blog post and repository for details. This may be a good candidate for a Vega example @domoritz mentioned in #594.
crimea.json
@dangotbanned Please note the diff shows very slight modifications to data file sizes in
@dangotbanned also please be aware that when I attempted to break up the line length in the TOML multi-line strings below (for field description and source title) to keep to 80-100 characters per line, the resulting markdown table generated by

[[resources.schema.fields]]
name = "wounds"
description = """Deaths from "Wounds and Injuries" which comprised: Luxatio (dislocation), Sub-Luxatio (partial dislocation), Vulnus Sclopitorum (gunshot wounds), Vulnus Incisum (incised wounds), Contusio (bruising), Fractura (fractures), Ambustio (burns) and Concussio-Cerebri (brain concussion)"""

[[resources.sources]]
title = """
Nightingale, Florence. A contribution to the sanitary history of the British army during the late war with Russia. London : John W. Parker and Son, 1859. Table II. Table showing the Estimated Average Monthly Strength of the Army; and the Deaths and Annual Rate of Mortality per 1,000 in each month, from April 1854, to March 1856 (inclusive), in the Hospitals of the Army in the East
"""
Interesting 🤔 My first thought would be maybe os.stat_result.st_size differs across platforms?

import sys
sys.platform in {"darwin", "posix"} # @dsmedia
sys.platform == "win32" # @dangotbanned
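One plausible way the same data could report different byte counts on different platforms is line-ending translation (LF versus CRLF). The thread does not confirm this as the cause here, so the following is only a sketch of that hypothesis; the helper `size_on_disk` and the sample text are made up for illustration.

```python
import os
import tempfile

# Hypothetical two-line payload; not the contents of any real dataset file.
TEXT = '{"a": 1}\n{"b": 2}\n'


def size_on_disk(text: str, newline: str) -> int:
    """Write text with the given newline convention and return st_size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.json")
        # newline="\n" writes LF unchanged; newline="\r\n" turns each LF into CRLF.
        with open(path, "w", encoding="utf-8", newline=newline) as fh:
            fh.write(text)
        return os.stat(path).st_size


lf_size = size_on_disk(TEXT, "\n")
crlf_size = size_on_disk(TEXT, "\r\n")
# The CRLF file is larger by one byte per line break.
print(lf_size, crlf_size, crlf_size - lf_size)
```

If the reported sizes in the diff differ by roughly the number of lines in each file, line endings would be worth ruling out before looking at `os.stat` itself.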
@dsmedia I'll take a look at this today. For this one, it could be the leading

[[resources.sources]]
title = """
Nightingale, Florence. A contribution to the sanitary history of the British army during the late war with Russia. London : John W. Parker and Son, 1859. Table II. Table showing the Estimated Average Monthly Strength of the Army; and the Deaths and Annual Rate of Mortality per 1,000 in each month, from April 1854, to March 1856 (inclusive), in the Hospitals of the Army in the East
"""

Updated
@dsmedia this will fix it if you re-run
I'm holding off on doing that locally, since I don't wanna revert all the
Co-authored-by: Dan Redding <[email protected]>
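For reference on the multi-line TOML strings discussed above: in a TOML basic multi-line string, a newline immediately after the opening `"""` is trimmed, the newline before the closing `"""` is kept, and a line-ending backslash joins the next line without embedding a newline. A minimal sketch, assuming Python 3.11+ `tomllib` (or the `tomli` backport); the table name `sources` and the shortened title are illustrative only and are not the project's actual metadata or the fix applied in this PR.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport exposing the same loads() API

# Illustrative TOML only: a shortened title wrapped with a line-ending backslash.
TOML_TEXT = r'''
[[sources]]
title = """
Nightingale, Florence. \
    A contribution to the sanitary history
"""
'''

title = tomllib.loads(TOML_TEXT)["sources"][0]["title"]
# repr() makes any leading or trailing newline visible.
print(repr(title))
```

Here the wrapped source lines parse to a single line of text, with no leading newline but with one trailing newline from the line break before the closing delimiter. A stray newline like that is the kind of thing that can break a generated markdown table cell, so it may be worth stripping or avoiding when wrapping long descriptions.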
Thanks for working on this @dsmedia!
Changes from vega/vega-datasets#648 Currently pinned on `main` until `v3.0.0` introduces `datapackage.json` https://github.com/vega/vega-datasets/tree/main
Resolves #594
Tasks
crimea.json with version from stdlib
crimea.json to _data/datapackage_additions.toml
Notes
crimea.json
dataset