Add fields and reorganize current_law_policy.json #1074

Closed
MattHJensen opened this issue Nov 20, 2016 · 6 comments

@MattHJensen

MattHJensen commented Nov 20, 2016

I am thinking about opening a few PRs to augment and reorganize taxcalc/current_law_policy.json and am interested in feedback from others before I undertake the project.

While I strongly believe these changes will be helpful for Tax-Calculator users, I'll admit that my original motivation for thinking about this project is to facilitate coordination between the Tax-Calculator and TaxBrain teams. In particular, I hope to include enough information in current_law_policy.json to enable a set of rules that fully specifies what content should appear on the TaxBrain input page.

This approach could be useful for B-Tax and CCC as well.

Currently the primary disconnects between the TB input page and current_law_policy.json are:

  1. Sections and subsections
  • TB is organized by sections and subsections (e.g., Refundable Credits: EITC), whereas current_law_policy.json parameters are only informally grouped into sections by reference in the title name.
  2. Order of the parameters.
  • TB is ordered roughly by where the params appear in the tax forms, while TC's current_law_policy.json is roughly ordered by where the parameters appear in taxcalc/functions.py.
  3. "Principal" parameters
  • The TB input page does not include every parameter from current_law_policy.json; rather, it only includes parameters of principal importance.
  4. 1-2 relationships between TB fields and TC params
  • Two sets of fields on the TaxBrain input page have a 1-2 relationship with parameters in Tax-Calculator.
    • The personal income tax rate and brackets fields on TaxBrain govern both II_rts and II_brks as well as PT_rts and PT_brks.
    • The long-term capital gains and qualified dividends fields on TaxBrain govern both CG_rts and CG_brks as well as AMT_CG_rts and AMT_CG_brks.

To resolve these four disconnects I propose to:

  1. Add a section field to each parameter in current_law_policy that would contain a string: "section: subsection".
  • For example:
 "_SS_Earnings_c": {
    "long_name": "Maximum Taxable Earnings for Social Security",
    "description": "Only individual earnings below this maximum amount are subjected to Social Security (OASDI) payroll tax.",
    "section": "Payroll Taxes: Social Security FICA",
    "irs_ref": "W-2, Box 4, instructions",
    "notes": "This parameter is indexed by the rate of growth in average wages, not by the inflation rate.",
    "start_year": 2013,
    "col_var": "",
    "row_var": "FLPDYR",
    "row_label": ["2013",
                  "2014",
                  "2015",
                  "2016"],
    "cpi_inflated": true,
    "col_label": "",
    "value": [113700,
              117000,
              118500,
              118500]
},
  • The TaxBrain rule would be to add parameters to sections and subsections based on the value of the "section" field.
  2. Reorganize the parameters in current_law_policy.json according to this schema:
  • Sections are organized by order of appearance on the 1040.
  • Subsections and parameters within subsections are ordered randomly, except that the parameters that modify policies that are "turned off" under current law, such as the itemized deduction surtax, appear last in a section or subsection.
  • The TaxBrain rule would be to display parameters in the order that they are included in current_law_policy.json.
  3. (a) Add a "compatible_data" field to each parameter after the "notes" field.
  • The acceptable values that I anticipate in the short to medium term are:
    • "taxdata_puf", which refers to the PUF based datafile produced by TaxData.
    • "taxdata_cps", which refers to the CPS based datafile produced by TaxData.
    • "taxcalc_filings", which refers to the Luca-based taxpayer inputted data that @zrisher is developing.
    • any comma-separated combination of those three, such as "taxdata_puf, taxdata_cps, taxcalc_filings".
  • We need this field because not every parameter will be relevant to every dataset: all of the relevant variables in a dataset might be zero. The calculator will still run just fine with the dataset, even with parameter modifications, but those modifications won't affect results, and I expect this to be quite confusing to users.
  • The TaxBrain rule would be to display all of the parameters that are compatible with the chosen dataset (currently taxdata_puf is the only option).
  3. (b) (implied by 3a and requires no TC work) Stop filtering parameters on TB by our perception of their importance to TaxBrain users.
  • As the parameters in current_law_policy.json proliferate, we may want to add a "primary_importance" indicator field or, alternatively, an "advanced_user" indicator field, which we could take advantage of in some clever way in the TaxBrain user interface to highlight some parameters and not others. For now, though, I think we are fine displaying everything.
  • Some of the new parameters we are adding are a bit tricky to use based on their long_name alone. The TaxBrain team might want to consider moving the "description" field out of the 'information i' hover over and into plain sight on the TaxBrain input page, but that can be discussed further in the TaxBrain repo.
  4. Eliminate 1-2 mappings between TaxBrain fields and TC params by including all parameters on TaxBrain.
  • To improve usability, add more detail about common pairings of parameters to the long_name and description fields in current_law_policy.json.
  • Be willing to break the organization schema described in number 2 above by, for instance, moving the AMT_CG_rt and AMT_CG_brk params right next to the CG_rt and CG_brk params.
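
The TaxBrain rules in items 1, 2, and 3(a) could be sketched roughly as follows. This is only an illustration under stated assumptions, not TaxBrain code: the function names (`load_params`, `is_compatible`, `input_page_layout`) are hypothetical, the two `_EXAMPLE_*` parameters are placeholders, and the "compatible_data" values shown are made up for the example.

```python
import json
from collections import OrderedDict

# Hypothetical, trimmed excerpt of current_law_policy.json after the proposed
# changes. The _EXAMPLE_* entries and all compatible_data values are invented.
EXAMPLE_JSON = """
{
  "_SS_Earnings_c": {
    "long_name": "Maximum Taxable Earnings for Social Security",
    "section": "Payroll Taxes: Social Security FICA",
    "compatible_data": "taxdata_puf, taxdata_cps"
  },
  "_EXAMPLE_eitc_param": {
    "long_name": "Placeholder EITC parameter",
    "section": "Refundable Credits: EITC",
    "compatible_data": "taxdata_puf"
  },
  "_EXAMPLE_fica_rate": {
    "long_name": "Placeholder payroll tax rate",
    "section": "Payroll Taxes: Social Security FICA",
    "compatible_data": "taxdata_puf, taxdata_cps"
  }
}
"""


def load_params(text):
    # Keep parameters in file order so the display order can follow the file.
    return json.loads(text, object_pairs_hook=OrderedDict)


def is_compatible(param, dataset):
    # "compatible_data" is a comma-separated string of dataset names.
    allowed = [s.strip() for s in param.get("compatible_data", "").split(",")]
    return dataset in allowed


def input_page_layout(params, dataset):
    # Build {section: {subsection: [param names]}} in file order,
    # skipping parameters that are not compatible with the chosen dataset.
    layout = OrderedDict()
    for name, param in params.items():
        if not is_compatible(param, dataset):
            continue
        section, _, subsection = param["section"].partition(":")
        subsections = layout.setdefault(section.strip(), OrderedDict())
        subsections.setdefault(subsection.strip(), []).append(name)
    return layout


params = load_params(EXAMPLE_JSON)
layout_puf = input_page_layout(params, "taxdata_puf")
layout_cps = input_page_layout(params, "taxdata_cps")
```

With the sample above, the placeholder EITC parameter would appear for "taxdata_puf" but drop out of the layout for "taxdata_cps".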

cc @Amy-Xu, @talumbau, @martinholmer, @feenberg, @codykallen, @GoFroggyRun, @PeterDSteinberg, @jdebacker, @zrisher

@feenberg

On Sun, 20 Nov 2016, Matt Jensen wrote:

I am thinking about opening a few PRs to augment and reorganize
taxcalc/current_law_policy.json and am interested in feedback from others before I
undertake the project.

While I strongly believe these changes will be helpful for Tax-Calculator users, I'll
admit that my original motivation for thinking about this project is to facilitate
coordination between the Tax-Calculator and TaxBrain teams. In particular, I hope to
include enough information in current_law_policy.json to enable a set of rules that
fully specifies what content should appear on the TaxBrain input page.

These seem like really good ideas. Perhaps the TB web page could have
expandable sections, where you could select a section to expose the
parameters available to set, and perhaps subsections that could be
expanded recursively.

dan

@zrisher

zrisher commented Dec 1, 2016

@MattHJensen said:

To resolve these four disconnects I propose to:

  1. Add a section field to each parameter in current_law_policy that would contain a string: "section: subsection". For example: "section": "Payroll Taxes: Social Security FICA"

In this case I recommend separating the "section" and "subsection" data into two values so that we don't have to do that via string parsing later. Generally, root data sources like current_law_policy.json should be normalized.
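
To illustrate the difference, here is a small sketch. The field name "subsection" is an assumption made for the example; the thread does not settle on names.

```python
# Combined form from the original proposal: every consumer has to parse it.
combined = {"section": "Payroll Taxes: Social Security FICA"}
section, subsection = [part.strip() for part in combined["section"].split(":", 1)]

# Normalized form: two separate values, no parsing needed.
# ("subsection" is a hypothetical field name.)
normalized = {
    "section": "Payroll Taxes",
    "subsection": "Social Security FICA",
}
```

Both forms carry the same information, but the normalized one does not depend on every reader agreeing on the separator or on stripping whitespace the same way.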

However, you may actually want to organize this file (or a separate file providing organization to the items defined in this file) as a directed graph with different length branches, depending on the answer to the question I pose below regarding Tax Policy structure.

  2. Reorganize the parameters in current_law_policy.json according to this schema:
  • Sections are organized by order of appearance on the 1040.

I suggest separating our thinking about Tax Policy from the structure of Taxpayer Data, especially since a long term objective is to work with more diverse tax policy structures. Tax Policy doesn't evolve around Taxpayer Data's structure/availability, because tax collection agencies can always just request more data. And I think this will become more of a trend whenever they start collecting that data in digital form.

If you think about the concept of "Tax Policy" in isolation, does a more natural grouping of topics and parameters emerge? This is a complex question, and unfortunately not one I'm well equipped to answer, but I think it will be helpful both for organizing this data and for guiding our approach to more generalized policy modeling. @feenberg @MattHJensen @PeterDSteinberg @martinholmer

  • Subsections and parameters within subsections are ordered randomly

"Random" is never a great ordering strategy. 😄 I suggest ordering items alphabetically once the order implied by their relationships has been exhausted.

except that the parameters that modify policies that are "turned off" under current law, such as the itemized deduction surtax, appear last in a section or subsection.

I think you could indicate this visually instead of via ordering, i.e. "grey out" the area surrounding parameters that are currently unused (due to their value or the value of higher-level parameters). Save the ordering for indicating structure.

3(a). Add a "compatible_data" field to each parameter after the "notes" field.

It will be a bit of extra maintenance work. When changing the way policy is coded, a contributor must remember to update this field if usable data changes, and the relationships may not be easy to trace. These fields could also be affected by changes to any of the listed data sets. It's also computationally expensive to enforce via automated testing. However, the value prop you described is definitely there.
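
A much cheaper partial check is still possible. The sketch below only verifies that each "compatible_data" value is present and drawn from the three names listed in the proposal; it does not test whether a parameter actually affects results for a dataset, which is the expensive part. The function name `invalid_compatible_data` and the `_EXAMPLE_*` entries are hypothetical.

```python
# Dataset names taken from the proposal above.
ALLOWED_DATASETS = {"taxdata_puf", "taxdata_cps", "taxcalc_filings"}


def invalid_compatible_data(params):
    """Return {param_name: [problem values]} for missing or unknown entries."""
    problems = {}
    for name, param in params.items():
        raw = param.get("compatible_data", "")
        values = [v.strip() for v in raw.split(",") if v.strip()]
        bad = [v for v in values if v not in ALLOWED_DATASETS]
        if not values:
            problems[name] = ["<missing>"]
        elif bad:
            problems[name] = bad
    return problems


# Invented sample: one valid entry, one typo, one missing field.
sample = {
    "_SS_Earnings_c": {"compatible_data": "taxdata_puf, taxdata_cps"},
    "_EXAMPLE_typo": {"compatible_data": "taxdata_pfu"},
    "_EXAMPLE_missing": {},
}
problems = invalid_compatible_data(sample)
```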

The rest of @MattHJensen's comments I completely agree with. This sounds very valuable.

@feenberg said:

Perhaps the TB web page could have
expandable sections, where you could select a section to expose the
parameters available to set, and perhaps subsections that could be
expanded recursively.

100% agree, and nice to see this is tracked now in ospc-org/ospc.org#417.

@MattHJensen

@zrisher, thank you very much for your comments. There is one that I don't fully understand, though:

However, you may actually want to organize this file (or a separate file providing organization to the items defined in this file) as a directed graph with different length branches,

What would it mean to "organize this file (or a separate file providing organization to the items defined in this file) as a directed graph with different length branches"?

@zrisher

zrisher commented Dec 22, 2016

@MattHJensen Per our conversation, I'm working on presenting a possible architecture for a more modular tax calculation model that will elucidate my suggestion you referenced.

@MattHJensen

Thanks @zrisher

@MattHJensen

1, 2, and 4 are resolved by #1109. Opening new issues for 3a and 3b.
