Switch JSON download from public whip to twfy-votes #1743

Closed
ajparsons opened this issue Nov 21, 2023 · 8 comments

@ajparsons
Contributor

Public Whip creates popolo JSON feeds of all votes associated with a policy, which TheyWorkForYou imports to create an association between a division and a policy, and to update any division descriptions based on the content in Public Whip.

TheyWorkForYou Votes recreates these popolo feeds (although part of this ticket is verifying they work as intended).

Currently, the process is done in several phases.

  • Parlparse creates XMLs for Public Whip to parse.
  • Parlparse downloads popolo JSON from Public Whip.
  • TheyWorkForYou ingests some of these JSON files from Parlparse.

The simplest approach would be to adjust parlparse to pull from a new source. This might need a new JSON feed from twfy-votes giving a list of available policies (twfy has a list of IDs, but parlparse doesn’t).

However, the XML export from PW to TWFY does not do this double loop. Do we want to similarly switch to TheyWorkForYou directly querying the JSON and cut it out from ParlParse?

(The current approach does mean the json is then publicly available).

@dracos
Member

dracos commented Nov 21, 2023

List of policies is presumably linked with #1747

If the data is going to be available elsewhere anyway, I don't see an issue with changing it. It looks like the fact it requests policy XML for the MP info and then vote info from the JSON is purely historical, I see no reason to maintain that if it can be simplified. I assume ideally you'd want to request one policy JSON URL that returned the MP information and the vote information together that could then be imported, does that make sense with the twfy-votes repo?

@ajparsons
Contributor Author

Yeah, that'd be fine.

So we'd get towards something like:

policy_details: {meta information about the policy}
votes: [information about how each person voted for each vote]
alignment: [overall information about alignment for each person with the policy]

The existing popolo-esque view can be adapted to add that.
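As a rough sketch of that combined shape (only the three top-level keys come from the outline above; the inner structure is left open, and `PolicyPayload` and `missing_sections` are hypothetical names, not part of twfy-votes), an importer could check that all three sections are present before ingesting:

```python
from typing import Any, TypedDict


class PolicyPayload(TypedDict):
    """Hypothetical shape of the combined policy JSON.

    Only the three top-level keys are taken from the thread; the
    contents of each section are deliberately left unspecified.
    """

    policy_details: dict[str, Any]   # meta information about the policy
    votes: list[dict[str, Any]]      # how each person voted in each division
    alignment: list[dict[str, Any]]  # overall alignment per person


REQUIRED_KEYS = ("policy_details", "votes", "alignment")


def missing_sections(payload: dict[str, Any]) -> list[str]:
    """Return the top-level sections absent from a decoded payload."""
    return [key for key in REQUIRED_KEYS if key not in payload]
```

An empty list from `missing_sections` would mean the payload has all three sections and can be passed on to the import step.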

@ajparsons
Contributor Author

This endpoint now contains the extra JSON information.

Exact details are in the Swagger API documentation linked from the home page.

@dracos
Member

dracos commented Nov 24, 2023

Using 810 as a comparison, the outputs are different at present, I'm afraid. For example, 2005-06-14 division 12 in PW JSON has aye/no/both/absent of 199/292/0/153, whereas votes has 233/312/0/105. The 'text' of the motion is also different, PW has "National Lottery Bill (Reasoned amendment on second reading)", votes only has "National Lottery Bill". Or 2008-04-28/155 has 223/311/0/11 vs 265/335/0/50 and "Finance Bill - Clause 21 - Amusement Machine licence duty" vs "Orders of the Day - Clause 21 - Amusement Machine licence duty".

For the alignments, checked the first dozen or so...
10003 PW has voted 3, absent 2; votes has voted 2, absent 2.
10002 is in PW but not in votes; I assumed this was because voted 0 times, but 10007 is present in both voting 0 times.
10011 - absent for 5 in PW, absent for 4 in votes.
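A comparison like this can be scripted. The following is a minimal sketch, assuming both feeds have already been decoded and reduced to the same four count keys; `diff_counts` is a hypothetical helper, and the figures are the 2005-06-14 division 12 numbers quoted above:

```python
COUNT_KEYS = ("aye", "no", "both", "absent")


def diff_counts(pw: dict[str, int], votes: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {key: (pw_value, votes_value)} for every count that differs."""
    return {
        key: (pw[key], votes[key])
        for key in COUNT_KEYS
        if pw[key] != votes[key]
    }


# Figures for 2005-06-14 division 12 as quoted in this thread.
pw_counts = dict(zip(COUNT_KEYS, (199, 292, 0, 153)))
votes_counts = dict(zip(COUNT_KEYS, (233, 312, 0, 105)))

print(diff_counts(pw_counts, votes_counts))
# {'aye': (199, 233), 'no': (292, 312), 'absent': (153, 105)}
```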

@ajparsons
Contributor Author

ajparsons commented Nov 24, 2023

Division totals are just going to be a number problem I need to look at.

Text: I think I need to parse the wikitables to get the full labels stored in Public Whip (related to mysociety/twfy-votes#5).

Alignment I would expect to be slightly off on 810 because it has a Lords vote that's ignored in the other system (those misalignments become Lords later). 1027 is an existing pure Commons policy and should line up.

Gerry Adams I knew about - It calculates everything for one MP at once, Gerry Adams has never voted, so gets nothing scored, whereas others will appear because they have voted elsewhere. Is this a problem in terms of the ingest? Would be fairly easy to create null entries in these cases.

@dracos
Member

dracos commented Nov 24, 2023

I don't think Gerry Adams matters, no, looks like the code checks *_distance exists, so wouldn't matter if it wasn't set, and the front end overrides any vote display for SF MPs. So should be fine.

Makes sense on the Lords vote not being there, yep. Titles, yep, the script is https://github.com/publicwhip/publicwhip/blob/ac756343534ebfa36edf8f4e1740e3c5407acb85/build/generate_popolo_json.php, calling get_wiki_current_value from pw_dyn_wiki_motion table and extracting title and maybe yes/no from it that way.

@ajparsons
Contributor Author

Big division problem fixed. Absences are still off because I hardcoded 650 and didn't go back to fix it, but I understand why that's happening.
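For illustration, the votes figures quoted earlier in the thread (233/312/0/105 and 265/335/0/50) each sum to exactly 650, which is consistent with absences being derived from a fixed total. A minimal sketch of the alternative, where the number of eligible members is passed in per division rather than hardcoded, might look like the following. `absent_count` is a hypothetical helper, and how twfy-votes actually computes this is not shown in the thread; treating "both" as present is also an assumption.

```python
def absent_count(eligible: int, aye: int, no: int, both: int = 0) -> int:
    """Absences = eligible members minus everyone recorded as voting.

    `eligible` should be the number of members entitled to vote on the
    day of the division, not a fixed constant.
    """
    return eligible - (aye + no + both)


# With a fixed total of 650 this reproduces the votes figure quoted
# earlier for 2005-06-14 division 12.
assert absent_count(650, 233, 312, 0) == 105
```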

@dracos
Member

dracos commented Dec 15, 2023

Sorry, more problems, the IDs don't match, eg 6679 - PW has an ID of pw-2010-07-06-14-commons, but votes has pw-2010-07-06-commons.
votes is lacking the policy_vote, not sure if that's easy to work out? I guess look at the counts and then work it out from majority/minority + "strong", as long as that doesn't have any edge issues.
And for my own documenting, PW has "aye" in the counts, votes has "yes" (can cope with both easily enough there).
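A minimal sketch of handling two of these differences on the import side, assuming the counts arrive as a plain dictionary and that the Public Whip ID pattern is `pw-<date>-<division number>-<house>`, as the 6679 example suggests. `normalise_counts` and `division_id` are hypothetical helpers, not existing TheyWorkForYou or twfy-votes functions.

```python
def normalise_counts(counts: dict[str, int]) -> dict[str, int]:
    """Accept either "aye" (Public Whip) or "yes" (votes) and return "aye"."""
    out = dict(counts)  # copy so the caller's dictionary is not modified
    if "yes" in out and "aye" not in out:
        out["aye"] = out.pop("yes")
    return out


def division_id(date: str, number: int, house: str) -> str:
    """Build an ID in the Public Whip style, including the division number."""
    return f"pw-{date}-{number}-{house}"


print(normalise_counts({"yes": 233, "no": 312}))
# {'no': 312, 'aye': 233}
print(division_id("2010-07-06", 14, "commons"))
# pw-2010-07-06-14-commons
```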
