We had a quick look at the Senedd as part of the initial project, but its written questions are not available through the same API as the debate transcripts. Instead, we'd need a new scraper.
There is a search page that can be restricted to a date range and to the Written Question type. However, it actually loads the result boxes through JavaScript. The same arguments can be passed to the seeMore endpoint, which returns JSON containing HTML boxes with the information. Using this endpoint is needed to page through multiple pages of results automatically.
https://record.senedd.wales/Search/seeMore?type=7
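A rough sketch of how paging through the seeMore endpoint might look. Only the base URL and `type=7` come from the link above. The `page` parameter, the `html` key in the JSON response, and the "empty page means stop" rule are all assumptions that would need checking against the real responses.

```python
import json
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

SEE_MORE_URL = "https://record.senedd.wales/Search/seeMore"


def build_url(page: int, extra_params: Optional[Dict[str, str]] = None) -> str:
    # type=7 is taken from the URL in the issue.
    # "page" is a hypothetical paging parameter; date-range arguments
    # would go in extra_params once their real names are known.
    params = {"type": "7", **(extra_params or {}), "page": str(page)}
    return f"{SEE_MORE_URL}?{urlencode(params)}"


def fetch_json(url: str) -> dict:
    # Plain standard-library fetch; swap for requests or similar if preferred.
    with urlopen(url) as response:
        return json.load(response)


def iter_result_html(
    fetch: Callable[[str], dict] = fetch_json,
    extra_params: Optional[Dict[str, str]] = None,
    max_pages: int = 50,
) -> Iterator[str]:
    """Yield the HTML fragment from each page until an empty one is returned."""
    for page in range(1, max_pages + 1):
        payload = fetch(build_url(page, extra_params))
        html = payload.get("html", "")  # assumed key name
        if not html.strip():
            break  # assumed end-of-results signal
        yield html
```

Passing `fetch` in as an argument keeps the paging logic testable without hitting the live site.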
The search page does not allow limiting results to answered questions, and does not indicate whether a question has been answered (the ‘for answer on xx/xx/xxxx’ text can’t be trusted). Either every question needs to be rechecked until it is answered, or unanswered questions need to be stashed and checked again later. This is similar to what’s needed for the London Assembly (see PR).
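The "stash and check again later" option could look roughly like this. The `Question` shape and the `fetch_question` callable are hypothetical stand-ins for whatever the scraper ends up returning; this only illustrates the recheck loop.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class Question:
    ref: str                      # hypothetical question reference
    text: str
    answer: Optional[str] = None  # None or empty until an answer is published


def recheck(
    pending_refs: Iterable[str],
    fetch_question: Callable[[str], Question],
) -> Tuple[List[Question], List[str]]:
    """Re-fetch stashed questions; split into answered and still pending."""
    answered: List[Question] = []
    still_pending: List[str] = []
    for ref in pending_refs:
        question = fetch_question(ref)
        if question.answer:
            answered.append(question)
        else:
            still_pending.append(ref)  # keep for the next run
    return answered, still_pending
```

The `still_pending` list would be saved between runs (for example as a JSON file) and fed back in next time.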
One complication is having to handle both language versions. I have not explored whether there are cases where a question has been answered but not yet translated into both languages (my guess is there won’t be: as it’s not live, translation is probably part of the publication process). In practice, fetching the Welsh version after successfully retrieving a complete question and answer in English is probably good enough. They’re the same page with different text, so the same scraper would hopefully work when aimed at the other page.
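A minimal sketch of the "English first, then Welsh" approach. `fetch_page` is a hypothetical function wrapping the shared scraper, and the `"en"`/`"cy"` language codes and dictionary keys are assumptions for illustration.

```python
from typing import Callable, Dict, Optional

Page = Dict[str, Optional[str]]  # assumed shape: {"question": ..., "answer": ...}


def fetch_both_languages(
    ref: str,
    fetch_page: Callable[[str, str], Page],
) -> Optional[Dict[str, object]]:
    """Fetch the Welsh page only once the English one is complete."""
    english = fetch_page(ref, "en")
    if not english.get("answer"):
        return None  # not answered yet, so leave it stashed for later
    welsh = fetch_page(ref, "cy")  # same scraper, aimed at the other page
    return {"ref": ref, "en": english, "cy": welsh}
```

If the guess about translation turns out to be wrong, the same answered check could be applied to the Welsh page before accepting the result.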