Update WorldBank WDI scripts #963
Node: dcid:WorldBank/VC_IHR_PSRC_P5 | ||
name: "Intentional homicides (per 100,000 people)" | ||
description: "Intentional homicides are estimates of unlawful homicides purposely inflicted as a result of domestic disputes, interpersonal violence, violent conflicts over land resources, intergang violence over turf or control, and predatory violence and killing by armed groups. Intentional homicide does not include all intentional killing; the difference is usually in the organization of the killing. Individuals or small groups usually commit homicide, whereas killing in armed conflict is usually committed by fairly cohesive groups of up to several hundred members and is thus usually excluded. UN Office on Drugs and Crime's International Homicide Statistics database." |
Any ideas why these show up?
observationPeriod: "P1Y" | ||
observationAbout: C:WorldBank->ISO3166Alpha3 | ||
value: C:WorldBank->Value1 | ||
unit: C:WorldBank->unit |
It could be that we don't want to check in the full CSV?
scripts/world_bank/wdi/worldbank.py
Outdated
@@ -124,8 +124,9 @@ def read_worldbank(iso3166alpha3, fetchFromSource): | |||
if df is None: | |||
df = pd.DataFrame(columns=cols) | |||
else: | |||
df = df.append(pd.DataFrame([cols], columns=df.columns), | |||
ignore_index=True) | |||
# df = df.append(pd.DataFrame([cols], columns=df.columns), |
Add a comment?
Removed the df.append() calls because the API is deprecated.
Should we delete the old line instead of commenting it out?
Yes, deleted now. It was commented out for testing.
Working on further script changes, as more are needed for the pandas update.
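For reference, `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, and `pd.concat` is the usual replacement. The snippet below is only a minimal sketch of a one-for-one swap for the removed call; the helper name `append_row` and the sample rows are made up for illustration and are not taken from `worldbank.py`.

```python
import pandas as pd


def append_row(df, row):
    """Return a new frame with `row` added as the last row.

    Hypothetical helper: stands in for the removed
    df.append(pd.DataFrame([row], columns=df.columns), ignore_index=True).
    """
    new_row = pd.DataFrame([row], columns=df.columns)
    # ignore_index=True renumbers the rows 0..n-1, as df.append did.
    return pd.concat([df, new_row], ignore_index=True)


# Illustrative data only; the column names are placeholders.
df = pd.DataFrame([["USA", 2019, 5.0]],
                  columns=["ISO3166Alpha3", "Year", "Value"])
df = append_row(df, ["CAN", 2019, 1.8])
```

Calling `pd.concat` once per row copies the frame each time, so for many rows it is usually cheaper to collect the rows in a list and build the frame once.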
Thanks! Added some comments. The main point of concern is that some new SVs are being introduced but they already seem to exist. So I am a bit unsure what's going on?
As a result, what confuses me is that none of the DCIDs in this file (`scripts/world_bank/wdi/output/WorldBank_StatisticalVariables.mcf`) can be found, while I do find the corresponding SVs here (`third_party/datacommons/schema/stat_vars/manual_wdi_stat_vars.mcf`). So I am not sure what's going on between these DCID mappings?
@@ -53,7 +53,6 @@ CM.MKT.LCAP.CD,,,Market capitalization of listed domestic companies (current US$ | |||
BX.TRF.PWKR.DT.GD.ZS,,,"Personal remittances, received (% of GDP)","Personal remittances comprise personal transfers and compensation of employees. Personal transfers consist of all current transfers in cash or in kind made or received by resident households to or from nonresident households. Personal transfers thus include all current transfers between resident and nonresident individuals. Compensation of employees refers to the income of border, seasonal, and other short-term workers who are employed in an economy where they are not resident and of residents employed by nonresident entities. Data are the sum of two items defined in the sixth edition of the IMF's Balance of Payments Manual: personal transfers and compensation of employees.","World Bank staff estimates based on IMF balance of payments data, and World Bank and OECD GDP estimates.",Remittance,measuredValue,amount,transferType,InwardRemittance,,,,,Amount_EconomicActivity_GrossDomesticProduction_Nominal,100,,WorldBankEstimate, | |||
BX.TRF.PWKR.CD.DT,,,"Personal remittances, received (current US$)","Personal remittances comprise personal transfers and compensation of employees. Personal transfers consist of all current transfers in cash or in kind made or received by resident households to or from nonresident households. Personal transfers thus include all current transfers between resident and nonresident individuals. Compensation of employees refers to the income of border, seasonal, and other short-term workers who are employed in an economy where they are not resident and of residents employed by nonresident entities. Data are the sum of two items defined in the sixth edition of the IMF's Balance of Payments Manual: personal transfers and compensation of employees. Data are in current U.S. dollars.",World Bank staff estimates based on IMF balance of payments data.,Remittance,measuredValue,amount,transferType,InwardRemittance,,,,,,,,WorldBankEstimate,USDollar | |||
BM.TRF.PWKR.CD.DT,,,"Personal remittances, paid (current US$)","Personal remittances comprise personal transfers and compensation of employees. Personal transfers consist of all current transfers in cash or in kind made or received by resident households to or from nonresident households. Personal transfers thus include all current transfers between resident and nonresident individuals. Compensation of employees refers to the income of border, seasonal, and other short-term workers who are employed in an economy where they are not resident and of residents employed by nonresident entities. Data are the sum of two items defined in the sixth edition of the IMF's Balance of Payments Manual: personal transfers and compensation of employees. Data are in current U.S. dollars.","World Bank staff estimates based on IMF balance of payments data, and World Bank and OECD GDP estimates.",Remittance,measuredValue,amount,transferType,OutwardRemittance,,,,,,,,WorldBankEstimate,USDollar | |||
VC.IHR.PSRC.P5,,,"Intentional homicides (per 100,000 people)","Intentional homicides are estimates of unlawful homicides purposely inflicted as a result of domestic disputes, interpersonal violence, violent conflicts over land resources, intergang violence over turf or control, and predatory violence and killing by armed groups. Intentional homicide does not include all intentional killing; the difference is usually in the organization of the killing. Individuals or small groups usually commit homicide, whereas killing in armed conflict is usually committed by fairly cohesive groups of up to several hundred members and is thus usually excluded.",UN Office on Drugs and Crime's International Homicide Statistics database.,CriminalActivities,measuredValue,count,crimeType,MurderAndNonNegligentManslaughter,,,,,Count_Person,,100000,, |
Do we know why this was deleted?
@@ -634,3 +634,163 @@ statType: dcs:measuredValue | |||
measuredProperty: dcs:amount | |||
transferType: dcs:OutwardRemittance | |||
|
|||
|
|||
Node: dcid:WorldBank/SH_DYN_MORT |
What's the reason for these additions across the file? See a couple of comments below where it seems that the SV DCIDs are getting renamed but the SV and its contents are the same. Any idea what happened here? We should keep the SV DCIDs as checked in under google3/third_party/datacommons/schema/stat_vars/manual_wdi_stat_vars.mcf
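One way to check for this kind of rename is to compare the generated MCF against the checked-in one and flag nodes whose defining properties match but whose DCIDs differ. The following is a rough sketch under several assumptions: nodes are separated by blank lines, each line is `key: value`, and `name`/`description` are ignored when comparing. The function names, the DCIDs, and the sample properties are all hypothetical and do not reflect the real contents of either file.

```python
import re

# Properties that do not define the StatVar (assumption for this sketch).
IGNORED = {"name", "description"}


def parse_mcf(text):
    """Parse MCF text into {dcid: {property: value}}."""
    nodes = {}
    for block in re.split(r"\n\s*\n", text.strip()):
        props = {}
        for line in block.splitlines():
            line = line.strip()
            if not line or line.startswith("#") or ":" not in line:
                continue
            key, value = line.split(":", 1)
            props[key.strip()] = value.strip()
        node = props.pop("Node", "")
        if node:
            # "dcid:Foo" -> "Foo"
            nodes[node.split(":", 1)[-1]] = props
    return nodes


def signature(props):
    """Hashable view of the properties that define a StatVar."""
    return frozenset((k, v) for k, v in props.items() if k not in IGNORED)


def find_renamed(existing, generated):
    """Map each generated DCID to an existing DCID with the same definition."""
    by_sig = {signature(p): dcid for dcid, p in existing.items()}
    renamed = {}
    for dcid, props in generated.items():
        match = by_sig.get(signature(props))
        if match is not None and match != dcid:
            renamed[dcid] = match
    return renamed


# Hypothetical sample nodes, for illustration only.
existing_mcf = """
Node: dcid:Example_Existing_StatVar
typeOf: dcs:StatisticalVariable
populationType: dcs:Person
measuredProperty: dcs:count
"""
generated_mcf = """
Node: dcid:WorldBank/EXAMPLE_CODE
name: "Some indicator"
typeOf: dcs:StatisticalVariable
populationType: dcs:Person
measuredProperty: dcs:count
"""
renamed = find_renamed(parse_mcf(existing_mcf), parse_mcf(generated_mcf))
```

Here `renamed` maps the generated DCID to the existing one it duplicates, which gives a list of SVs to check before new nodes are introduced.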
@@ -634,3 +634,163 @@ statType: dcs:measuredValue | |||
measuredProperty: dcs:amount | |||
transferType: dcs:OutwardRemittance | |||
|
|||
|
|||
Node: dcid:WorldBank/SH_DYN_MORT |
Isn't this the same as the existing SV: dcid:MortalityRate_Person_Upto4Years_AsFractionOf_Count_BirthEvent_LiveBirth
If so, then why the change?
age: dcs:YearsUpto4 | ||
|
||
|
||
Node: dcid:WorldBank/SH_PRV_SMOK |
Same comment as above for this one. The existing SV is this: dcid:Count_Person_15OrMoreYears_Smoking_AsFractionOf_Count_Person_15OrMoreYears
You can find this with code search in the file third_party/datacommons/schema/stat_vars/manual_wdi_stat_vars.mcf
'measurementMethod', 'measurementDenominator', 'scalingFactor', | ||
'sourceScalingFactor', 'unit' | ||
'measurementMethod', | ||
#'measurementDenominator', |
If we are commenting these out, wouldn't it be better to just remove/delete the lines?
worldbank_dataframe = worldbank_dataframe.append(country_df) | ||
# Add new table to main dataframe. | ||
wb_dfs.append(country_df) | ||
# COmbine tables to get a single dataframe |
nit: "combine" (small "o")
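The pattern in this hunk (collect each country's table in a list, then combine once) could look roughly like the sketch below. It is not the PR's actual code: `combine_country_frames` and the sample frames are hypothetical, and only `pd.concat` is a real pandas call.

```python
import pandas as pd


def combine_country_frames(frames):
    """Combine per-country tables into a single DataFrame.

    Hypothetical helper: a single pd.concat over the collected list
    avoids copying the growing frame on every loop iteration.
    """
    if not frames:
        # pd.concat raises ValueError on an empty list.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Illustrative per-country tables; column names are placeholders.
wb_dfs = [
    pd.DataFrame({"ISO3166Alpha3": ["USA", "USA"], "Value": [1.0, 2.0]}),
    pd.DataFrame({"ISO3166Alpha3": ["CAN"], "Value": [3.0]}),
]
worldbank_dataframe = combine_country_frames(wb_dfs)
```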
Update WorldBank WDI scripts with the following: