This repository has been archived by the owner on Dec 23, 2017. It is now read-only.

Complete mapping data catalog and FTP data to advanced data pages #2318

Closed
4 tasks done
PaulClark2 opened this issue Sep 29, 2017 · 10 comments

@PaulClark2

PaulClark2 commented Sep 29, 2017

So that we can move data sets from classic to the new fec.gov, use the strategy set in #2265 (distributing the data sets by existing advanced data category) to create an outline showing the placement and order of the data sets.

Completion criteria

  • Data sets broken down in outline form (@PaulClark2)
  • Create InVision prototype of Advanced Data section with outlined data in place (@jenniferthibault)
  • Meet with Christian to understand his concerns about the FTP files
  • Understand the need for file formats and metadata
@PaulClark2 PaulClark2 added this to the Sprint 4.1 milestone Sep 29, 2017
@jenniferthibault jenniferthibault changed the title from "Complete mapping data catalog to FTP data advanced data pages" to "Complete mapping data catalog and FTP data to advanced data pages" Sep 29, 2017
@PaulClark2
Author

PaulClark2 commented Oct 3, 2017

Raising (Advanced data page: https://www.fec.gov/data/advanced/?tab=raising)

Data catalog

  • Bundled contributions

FTP

  • Contributions by individuals (indiv)

Spending (Advanced data page: https://www.fec.gov/data/advanced/?tab=spending)

Data catalog

  • Candidate disbursements (??)
  • Communication costs
  • Electioneering communications
  • Independent expenditures

FTP

  • Any transaction from one committee to another
  • Contributions to candidates
  • Operating expenditures

Candidates (Advanced data page: https://www.fec.gov/data/advanced/?tab=candidates)

Data catalog

  • Candidate disbursements (??)
  • Candidate summary
  • New statements of organization

FTP

  • All candidates (weball)
  • Candidate committee linkage file
  • Candidate master
  • Contributions to candidates
  • House/Senate current campaigns (webl)
  • Presidential matching funds

Committees (Advanced data page: https://www.fec.gov/data/advanced/?tab=committees)

Data catalog

  • Bundled contributions
  • Committee report summary
  • Committee summary
  • Leadership PACs
  • Lobbyist/registrant
  • New committee registrations
  • Unverified filers

FTP

  • Any transaction from one committee to another
  • Committee master
  • Contributions to candidates
  • PACs (webk)

Filings and reports (Advanced data page: https://www.fec.gov/data/advanced/?tab=filings)

Data catalog

  • Unverified filers

FTP

  • Efilings (.fec files)
  • Paper filings (.fec files)

@PaulClark2 PaulClark2 self-assigned this Oct 3, 2017
@jenniferthibault
Contributor

jenniferthibault commented Oct 4, 2017

Prototype

👀 🖼 InVision prototype with clickable side-menu and accordions

Questions and observations from working with the first draft:

By our principle that data should only live in one place, we should pick the category that fits best instead of duplicating the download link in two places. The following data sets are listed in multiple categories, so we need to decide on a single best fit for them:

  • Contributions to candidates (listed under Spending, Candidates, and Committees. If this is a subset of the Itemized records files, it makes sense to be grouped under Spending as well)
  • Candidate disbursements (listed under Spending and Candidates. To me this makes the most sense under Spending with the other disbursements)
  • Bundled contributions (listed under Raising and Committees)
  • Any transaction from one committee to another (listed under Spending and Committees. To me this makes the most sense under Spending)
  • Unverified filers (listed under Committees and Filings and reports. I tried grouping this in the accordion with most recent Form 1's, but not sure if that works)
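
One quick way to spot the duplicates listed above is to transcribe the outline and look for names that appear under more than one category. This is a minimal sketch, not anything from the fec.gov codebase: `OUTLINE` and `find_duplicates` are hypothetical names, the data catalog and FTP lists are merged per category, and the parenthetical file codes and "(??)" markers are dropped.

```python
from collections import defaultdict

# Outline transcribed from the Oct 3 comment in this thread.
# Data catalog and FTP entries are merged under each category (an assumption
# made for illustration; the thread lists them separately).
OUTLINE = {
    "Raising": [
        "Bundled contributions",
        "Contributions by individuals",
    ],
    "Spending": [
        "Candidate disbursements",
        "Communication costs",
        "Electioneering communications",
        "Independent expenditures",
        "Any transaction from one committee to another",
        "Contributions to candidates",
        "Operating expenditures",
    ],
    "Candidates": [
        "Candidate disbursements",
        "Candidate summary",
        "New statements of organization",
        "All candidates",
        "Candidate committee linkage file",
        "Candidate master",
        "Contributions to candidates",
        "House/Senate current campaigns",
        "Presidential matching funds",
    ],
    "Committees": [
        "Bundled contributions",
        "Committee report summary",
        "Committee summary",
        "Leadership PACs",
        "Lobbyist/registrant",
        "New committee registrations",
        "Unverified filers",
        "Any transaction from one committee to another",
        "Committee master",
        "Contributions to candidates",
        "PACs",
    ],
    "Filings and reports": [
        "Unverified filers",
        "Efilings",
        "Paper filings",
    ],
}


def find_duplicates(outline):
    """Return {data set name: [categories]} for names listed more than once."""
    placements = defaultdict(list)
    for category, data_sets in outline.items():
        for name in data_sets:
            placements[name].append(category)
    return {name: cats for name, cats in placements.items() if len(cats) > 1}


if __name__ == "__main__":
    for name, categories in find_duplicates(OUTLINE).items():
        print(f"{name}: {', '.join(categories)}")
```

Run against the outline as transcribed, this flags the same five data sets as the list above.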

More general data placement and source questions:

  • For the FTP files, what is the difference between the "All candidates" file and the "Candidate master" file?

  • Where are the weball and webl files coming from? I couldn’t find their source to get the year spans or the intro text

  • "Committee report summary": Should we move this under the Filings and reports section instead of Committees since it's a summary of reports?

  • Where should "Senate unofficial electronic filings" go? (I tentatively put them with searchable House/Senate report data in Filings and reports)

  • Since we're distributing the bulk data into the content categories, should we take the same approach with the external sources?

Observations:
Finding a way to present the historical summaries feels like a bigger task than this issue. Are those data sets based in Waltham, making it a NOW issue? Or are they based elsewhere, which could allow us to pick up that task later?

As you open the accordions, there are also some where I had a hard time finding the exact year range available, or knowing what text to pull for descriptions. Those are noted in pink.


I know this is a lot, glad to talk through!

@PaulClark2
Author

Thanks, Jen.

Let's drop Candidate disbursements.

We need to include links to the metadata and to the file formats, too. I'm hoping we can leverage a Wagtail template that allows for tables.

Data in one place

  • Contributions to candidates: Spending
  • Candidate disbursements: drop (see above)
  • Bundled contributions: Raising
  • Any transaction from one committee to another: Spending
  • Unverified filers: Filings and reports

General questions and source data

  • "All candidates" includes financial information. "Candidate master" only contains administrative (name, state, office, district, etc.) information
  • The webl, weball and webk files are here, http://classic.fec.gov/finance/disclosure/ftpsum.shtml, on the classic site. We have files from 1996 to 2018.
  • "Committee report summary": Let's move this to Filings and reports as you suggest.
  • "Senate unofficial electronic filings": Yes, Filings and reports makes sense to me, too.

External sources: Let's not map these to content categories. They aren't FEC data.

@jenniferthibault
Contributor

Thanks Paul, I can make the changes to get data into one place in the mockups. Expect that to start next week, unless I have more evening time than expected this week.

Metadata and file formats are new requirements for this design (and later implementation scoping). To think about: are these critical features, or do you think we can narrow them down at all?

@jenniferthibault
Contributor

🎨 Updated InVision prototype

There's a lot in this prototype, so I'm going to try to break it down into a change log with links to specific pages.

Changes

Advanced data

  • Since some of the bulk data (FTP files) are built in a way that spans categories (raising & spending, for example), grouping them in one place by section was problematic. We moved them all together into their own section, Download bulk data. This page keeps the accordions so that people can keep the familiar format.
  • Adds a link to download each header file to the bulk data accordion template (One example)
  • Adds a link to get to a metadata page for each data set into the bulk data accordion template (One example)
    • ❗️ This means that we need to create one page per data set to hold the metadata info (which is also the format on the Classic site). Since the metadata info is quite long, I think it's ok; I just want to track the new pages required.
  • Adds a link to get to the metadata page for each data set into the data catalog accordion template (One example)

Presidential matching fund info & data

Administrative fine data downloads

  • Embeds the download for Administrative fine cases onto the Admin fine page and adds a link to a page for its metadata using an existing Wagtail template

Next steps & open questions

  • @PaulClark2 it would be helpful for you to take a look and double-check that I've gotten things in the right places. I'm particularly wondering if I accounted for everything needed about the "file formats" which I realized we didn't discuss in depth.
  • @jenniferthibault talks with front end devs to see what we know about enabling links in content inside Wagtail tableblocks

@PaulClark2
Author

@jenniferthibault thanks for making the changes we asked for. The file formats look good. We'll probably need to do something a little different for the electronically and paper filed reports file formats. They have multiple file formats reflecting filing requirement changes over the years.

@jenniferthibault
Contributor

jenniferthibault commented Oct 23, 2017

Thanks Paul, can you outline what the file formats are for what year ranges?

@PaulClark2
Author

.FEC files of electronically filed reports have one of 15 formats. All the file format files are in a single compressed (.zip) file (http://classic.fec.gov/elecfil/eFilingFormats.zip). There is a table, http://classic.fec.gov/finance/disclosure/ftpefile.shtml, on classic that explains when specific versions of the format are valid.

.FEC files of paper filed reports have one of 10 formats. All the file format files are in a single compressed (.zip) file (http://classic.fec.gov/elecfil/PaperFormats.zip). There is a table, http://classic.fec.gov/finance/disclosure/ftppaper.shtml, on classic that explains when specific versions of the format are valid.
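
For anyone who wants to check what is inside one of these format archives before designing the page around it, here is a minimal sketch using Python's standard `zipfile` module. It assumes the .zip has already been downloaded locally; the `list_format_files` helper name and the local file path are hypothetical, and nothing here is confirmed about how the archives are actually laid out internally.

```python
import zipfile


def list_format_files(archive):
    """Return the sorted names of the files stored in a format .zip.

    `archive` may be a path to a local file or a binary file-like object.
    Directory entries are skipped so only actual files are listed.
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            info.filename for info in zf.infolist() if not info.is_dir()
        )


if __name__ == "__main__":
    # Hypothetical local copy of the archive linked in the comment above.
    for name in list_format_files("eFilingFormats.zip"):
        print(name)
```

Comparing the number of entries against the counts quoted above (15 electronic, 10 paper) would be one way to sanity-check the metadata page content, assuming each format version is stored as its own file.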

@PaulClark2 PaulClark2 modified the milestones: Sprint 4.1, Sprint 4.2 Oct 24, 2017
@jenniferthibault
Contributor

jenniferthibault commented Oct 26, 2017

Update: @PaulClark2 and I just paired through the electronic and paper filing scenarios, which are a little different from the rest. Since files are offered as daily collections (since 2001 and 2005) and it's important that we don't break the file download links for users, we are going to link to the S3 bucket instead of offering the downloads directly from the panel.

Example:
Where the link would go to the S3 bucket
[Screenshot: bulk-accordion-11]

I've updated the InVision prototype accordingly, and we're moving to implementation issues. Will follow up with links to those issues and close this when they're in place.

@jenniferthibault
Contributor

jenniferthibault commented Oct 27, 2017

I've created a virtual flotilla of implementation issues to carry this task forward. 🛥⛵️🛥⛵️ 🛥

There are many issues because I tried to keep the tasks as small as possible for the nearest-term items, which will allow them to be split among more folks.

To prepare the designs, I started with a task to restyle the template as a whole:

Then focused on the bulk data tab as one section to complete

Followed by the same set of tasks for the data catalog data in the Raising, Spending, Candidates, Committees, and Filings and reports tabs.

And isolated the same type of work for administrative fines and presidential public fund submissions. The difference here is that this work happens mostly in the CMS through Wagtail. It could be taken on by a content manager with a little help from someone comfortable with HTML tables.

Since the historical statistics link (which currently goes to the transition site) would be losing its home on the new site with the change, we're providing a tentative home for it in a new section on the Adv data tab. These collective changes meant we needed to update the items shown and order of the data landing page cards, and menu content.

--- This would be the point where v1 is complete! ---

We identified the need and desire to rewrite the data descriptions with more plain language so that they could be accessible to a wider audience. This work could be happening while the above work is in progress, but is not expected to be finished in time for the v1 release.

And finally, a new research and design task for understanding and transferring the historical statistics data onto the new site from transition. Since this information doesn't rely on Waltham, I've placed this task in the backlog to help keep it separate from earlier priorities.

Since the implementation and follow up issues are in place, I'll close this issue.
