docs: explanation about candidate data
andylolz committed Jun 2, 2024
1 parent 7e4ed05 commit 2572583
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions output/about/index.md
@@ -14,6 +14,20 @@ Notes are excluded if they meet any of the following criteria:

We also attempt to filter out notes for deleted tweets and non-English tweets.

---

### Filter by author group

With thanks to [@leobenedictus](https://x.com/leobenedictus) for the suggestion, community notes can be filtered by ~~current UK MPs~~ **UK General Election 2024 candidates**.

To do this, we need a list of the Twitter (X) handles of election candidates. This data is pulled daily from [Democracy Club candidates](https://candidates.democracyclub.org.uk/). It’s incomplete, but you can help improve it by finding and adding candidates, or their Twitter (X) handles, to Democracy Club’s data.

{% assign total_candidate_handles = site.data.ge2024-candidates | size %}

At present, Democracy Club candidates has Twitter (X) handles for **{% include commify.html number=total_candidate_handles %} General Election 2024 candidates**.
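
The author-group filter described above can be pictured roughly as follows. This is a minimal sketch, not the site's actual implementation: the `tweet_author` and `note_id` field names and both function names are hypothetical, and it assumes handles should be compared case-insensitively with any leading `@` ignored.

```python
def normalise_handle(handle: str) -> str:
    """Lower-case a handle and drop any leading '@' so comparisons are consistent."""
    return handle.strip().lstrip("@").lower()


def filter_notes_by_candidates(notes, candidate_handles):
    """Keep only notes whose tweet author is in the candidate handle list.

    `notes` is assumed to be a list of dicts with a hypothetical
    `tweet_author` key; notes with no known author are dropped.
    """
    wanted = {normalise_handle(h) for h in candidate_handles}
    return [
        note
        for note in notes
        if note.get("tweet_author")
        and normalise_handle(note["tweet_author"]) in wanted
    ]


if __name__ == "__main__":
    example_notes = [
        {"note_id": 1, "tweet_author": "@Alice"},
        {"note_id": 2, "tweet_author": "bob"},
        {"note_id": 3, "tweet_author": None},  # tweet not fetched yet
    ]
    print(filter_notes_by_candidates(example_notes, ["alice"]))
```

Because the candidate list is incomplete, a filter like this can only miss notes, never wrongly include them: a candidate whose handle is absent from the data simply does not match.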

---

### Special Twitter (X) language codes

When Twitter (X) can’t determine the language of a tweet, it uses one of several reserved language codes. For the purpose of language filtering, we’ve grouped these all together, but this is the breakdown:
@@ -31,6 +45,8 @@ When Twitter (X) can’t determine the language of a tweet, it uses one of sever
| `zxx` | Tweet contains media or Twitter card only |
{: .table .table-striped .w-inherit }
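
Grouping the reserved codes for filtering could look something like the sketch below. It is illustrative only: the function name and the `"special"` bucket label are hypothetical, and the set of reserved codes is passed in as a parameter because only `zxx` is visible in this excerpt. The remaining codes would come from the full table.

```python
from collections import Counter

SPECIAL_GROUP = "special"  # hypothetical label for the combined bucket


def language_group(lang_code: str, special_codes: frozenset) -> str:
    """Map a tweet's language code to the value used for filtering.

    Reserved codes collapse into one bucket; real languages pass through.
    """
    code = lang_code.strip().lower()
    return SPECIAL_GROUP if code in special_codes else code


if __name__ == "__main__":
    # Only `zxx` is taken from the table above; extend with the other rows.
    reserved = frozenset({"zxx"})
    codes = ["en", "zxx", "fr", "ZXX"]
    print(Counter(language_group(c, reserved) for c in codes))
```

Keeping the per-code breakdown in a table while filtering on the combined bucket means the filter stays simple without losing the detail.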

---

### Tweet indexing status

After fetching new proposed community notes, the text of the tweets that the notes reference is not immediately searchable. In order to make it searchable, we need to fetch these tweets – a process that can take several hours. You can see the current status below.
@@ -54,6 +70,8 @@ After fetching new proposed community notes, the text of the tweets that the not
}
</script>

---

### Why is the language unknown for some tweets?

Until we’ve fetched a tweet, we don’t know its language. So ‘unknown language’ may mean we haven’t yet fetched that tweet. Once we’ve fetched it (in the next hour or so), we should know the tweet’s author, language and text.