This repository has been archived by the owner on Nov 10, 2020. It is now read-only.

How do we express state revenue and production rankings? #1365

Closed
shawnbot opened this issue Apr 7, 2016 · 9 comments

Comments

@shawnbot
Contributor

shawnbot commented Apr 7, 2016

While building out state prototypes (#1355) and our new data "infrastructure" (#1353) I noticed one issue with state rankings: they might be a little complicated to describe if they're derived from state and offshore region data. For instance, in 2013 California was the "top" state producer of Fuel Oil at a whopping $68, but the national total in that year was actually negative $73k, with Western Gulf of Mexico contributing the overwhelming majority.

So there are a couple of questions here:

  1. Do we include offshore regions in our rankings? (This is the important one.)
  2. How do we express contributions to a negative revenue total?

For 2, I was thinking that we would derive a percentage by dividing the absolute value of the regional total by the absolute value of the national total. So abs(68) / abs(-73725) * 100 yields about 0.09%. This accounts for both edge cases: states with negative revenue in a positive national total, and vice versa.
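A minimal sketch of the absolute-value approach described above, using the Fuel Oil figures from this comment. The function name `abs_share` is hypothetical, and raising an error on a zero national total is an assumption; the thread does not say how that case should be handled.

```python
def abs_share(region_total: float, national_total: float) -> float:
    """Regional share of the national total, computed on absolute values."""
    if national_total == 0:
        # Assumption: a zero total has no meaningful share.
        raise ValueError("national total is zero; share is undefined")
    return abs(region_total) / abs(national_total) * 100


# California Fuel Oil, 2013: $68 against a national total of -$73,725.
share = abs_share(68, -73725)
print(f"{share:.2f}%")  # prints "0.09%"
```

Note that this discards the sign entirely, so a state that pulls the total down looks the same as one that pushes it up.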

@RyanSibley

Link to a notes doc after reviewing the California page. I don't actually get to your question about negative numbers until #6:

https://docs.google.com/document/d/1Yju7CYTc90SFg6KaE0vJIgJ-84qWJIZsXRVnellXRW0/edit

@meiqimichelle
Contributor

This is still underway as part of the state profile updates.

@shawnbot
Contributor Author

shawnbot commented May 9, 2016

My thoughts on this have evolved. From what I've read around the internets, it sounds like negative contributions to a positive total (or vice versa) are not typically expressed as a numeric percentage, because such a percentage doesn't really make sense.

I can dig into my history a bit to find some references, but I think the thing to do here is to update our state rankings and percentages to be null (undefined) if the numerator and denominator in the percentage fraction for a given state have different signs (±).
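The sign rule proposed above could look something like the following. This is only a sketch: `signed_share` is a hypothetical name, and treating a zero denominator as null and a zero numerator as 0% are assumptions that the thread does not address.

```python
from typing import Optional


def signed_share(state_total: float, national_total: float) -> Optional[float]:
    """Percentage of the national total, or None when the signs differ."""
    if national_total == 0:
        return None  # assumption: no percentage of a zero total
    if state_total == 0:
        return 0.0  # assumption: zero is 0% of any non-zero total
    if (state_total < 0) != (national_total < 0):
        return None  # different signs: leave the percentage undefined
    return state_total / national_total * 100


print(signed_share(68, -73725))  # None: positive state, negative total
print(signed_share(50, 200))     # 25.0
```

When both values are negative the function still returns a number, and that number can exceed 100.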

@shawnbot
Contributor Author

While investigating #1397, I queried the db for state revenue rankings with a % > 100 and found some other offenders (note: these values are truncated for legibility, so the ones with % = 100 are actually between 100 and 101):

| year | state | product | $ revenue | $ total | % | # |
|------|-------|---------|-----------|---------|---|---|
| 2004 | ID | Phosphate Raw Ore | 3451909 | 3451909 | 100 | 1 |
| 2004 | WY | Helium | -21423 | -21365 | 100 | 2 |
| 2004 | WY | Soda Ash-Granular | -583138 | -449974 | 129 | 3 |
| 2005 | CA | Soda Ash-Granular | -726947 | -719067 | 101 | 2 |
| 2006 | MN | Copper | 28720 | 18072 | 158 | 1 |
| 2006 | WY | Inlet Scrubber | 637064 | 553912 | 115 | 1 |
| 2006 | WY | Sodium | 73976 | 63918 | 115 | 1 |
| 2007 | NM | Muriate Of Potash-Standard | 1452920 | 1451918 | 100 | 1 |
| 2008 | MN | Hardrock | 24299 | 17618 | 137 | 1 |
| 2009 | CO | Sodium | -142029 | 64530 | 220 | 5 |
| 2009 | WY | Inlet Scrubber | -1087305 | -849504 | 127 | 2 |
| 2009 | WY | Sodium | 156577 | 64530 | 242 | 1 |
| 2010 | VA | Limestone | 117397 | 117197 | 100 | 1 |
| 2011 | CA | Geothermal - Direct Utilization, Millions of BTUs | 48086 | 46295 | 103 | 1 |
| 2011 | ND | Nitrogen | 1098 | 861 | 127 | 1 |
| 2012 | WY | Unprocessed (Wet) Gas | -132500520 | 99040241 | 133 | 23 |
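For anyone checking the figures: the % column above appears to match the absolute-value ratio truncated to a whole number. That is an inference from the numbers, not something the query itself is shown to do. A small Python sketch over a few of the rows (the helper name `truncated_pct` is hypothetical):

```python
def truncated_pct(revenue: int, total: int) -> int:
    """Absolute-value percentage, truncated with integer arithmetic."""
    return abs(revenue) * 100 // abs(total)


# (year, state, product, revenue, total, listed %) copied from the table.
rows = [
    (2004, "WY", "Soda Ash-Granular", -583138, -449974, 129),
    (2006, "MN", "Copper", 28720, 18072, 158),
    (2009, "CO", "Sodium", -142029, 64530, 220),
    (2012, "WY", "Unprocessed (Wet) Gas", -132500520, 99040241, 133),
]

for year, state, product, revenue, total, listed in rows:
    computed = truncated_pct(revenue, total)
    print(year, state, product, computed, computed == listed)
```

Integer arithmetic is used here to avoid floating-point rounding near whole-number boundaries.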

Some state values have a different sign than the annual total (for instance, in the last row). In others, both numbers are negative, but the state number is larger in magnitude than the annual total.

This makes me think that we just shouldn't show percentages when either the state revenue or the total is negative. But what about when the state revenue is greater than the nationwide total (and, presumably, there were other states with negative revenues to bring down the nationwide total)?

Thoughts, @RyanSibley, @coreycaitlin, @meiqimichelle, @mentastc?

Note: these numbers are from the new revenue data that Nathan sent us, so they're provisional; but there will certainly be instances of each of these in the final data.

@shawnbot
Contributor Author

Here are the edge cases we need to decide how to represent, if the percentages are even worth listing:

  1. Positive state revenue that is greater than the total
  2. Positive state revenue when the total is negative
  3. Negative state revenue when the total is positive
  4. Negative state revenue that is less than the negative total
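The four cases listed above can be detected mechanically. The following is a sketch only: the `EdgeCase` enum and `classify` function are hypothetical names, and returning `None` for ordinary rows (including a state that exactly equals the total) is an assumption.

```python
from enum import Enum
from typing import Optional


class EdgeCase(Enum):
    POSITIVE_EXCEEDS_TOTAL = 1         # case 1
    POSITIVE_WITH_NEGATIVE_TOTAL = 2   # case 2
    NEGATIVE_WITH_POSITIVE_TOTAL = 3   # case 3
    NEGATIVE_BELOW_NEGATIVE_TOTAL = 4  # case 4


def classify(state_revenue: float, total: float) -> Optional[EdgeCase]:
    """Return the matching edge case, or None for an ordinary row."""
    if state_revenue > 0 and total > 0 and state_revenue > total:
        return EdgeCase.POSITIVE_EXCEEDS_TOTAL
    if state_revenue > 0 and total < 0:
        return EdgeCase.POSITIVE_WITH_NEGATIVE_TOTAL
    if state_revenue < 0 and total > 0:
        return EdgeCase.NEGATIVE_WITH_POSITIVE_TOTAL
    if state_revenue < 0 and total < 0 and state_revenue < total:
        return EdgeCase.NEGATIVE_BELOW_NEGATIVE_TOTAL
    return None


# Figures taken from earlier comments in this thread.
print(classify(28720, 18072))      # MN Copper, 2006
print(classify(68, -73725))        # CA Fuel Oil, 2013
print(classify(-142029, 64530))    # CO Sodium, 2009
print(classify(-583138, -449974))  # WY Soda Ash-Granular, 2004
```

Tagging rows this way would let the display decide per case whether to show a percentage, a note, or nothing.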

@shawnbot
Contributor Author

Nathan notes that we should not say "percent of US total revenue" for a given commodity, because we're only working with federal revenue, which accounts for about 25% of all revenue generated by the extractives industry (including private).

@meiqimichelle
Contributor

Is this issue mostly resolved now that we have Nathan's categories, or is this something different? And/or, it seems like we're moving away from showing percentages along with many of our revenue/production values, right?

@coreycaitlin
Contributor

Nathan's commodities groups solve most of the weird-data issues here. Also, I think we're moving away from showing year-over-year percentages, and we may not wind up showing percentages at all. If we do keep some form of percentages (e.g. "In 2013, federal revenue from coal extraction in Utah accounted for 8 percent of all federal revenue from coal extraction on federal lands"...which is a mouthful), I think there will at least be fewer edge cases.

@coreycaitlin
Contributor

Closing this out in favor of #1465, since we'll mostly only be using percentages or rankings in intro sections.
