Query #269
Hi, I want to search for tweets that contain both an exact word and an emoji. For example, I am searching for tweets where the word "don" appears together with the emoji "😃" in the same tweet (e.g. "He don pass exam 😃"). I used the query "("di" 😃)", as explained by @chainsawriot in #224 (comment). However, the query returns tweets that contain either "don't" or "don". For example, the tweet "He don't pass exam 😃" was also returned.
Expectation: how can I use word boundaries so that only tweets with "don" and "😃" are returned (not tweets with "don't" or "dont")? Also, I tried this as a query:
Replies: 2 comments 2 replies
@shmuhammad2004 So quoting doesn't produce what you would want (BTW, it makes no sense to quote a single word). My suggestion is to do it this way instead.
require(academictwitteR)
#> Loading required package: academictwitteR
x <- get_all_tweets("don 😃 -don't",
                    start_tweets = "2020-01-01T00:00:00Z",
                    end_tweets = "2021-11-01T00:00:00Z",
                    n = 10,
                    country = "ng",
                    verbose = FALSE)
x$text
#> [1] "Nkwobi, Bread and Water😀😃😃😀😃\n\nMan U don suffer #OleOutNow https://t.co/lyOA0xgoS5"
#> [2] "Oga, na Bus wey don comot for Park you dey stop.😃😃 https://t.co/cdDhG4sQpu"
#> [3] "@CyberBug11 @TheBriDen You don get mouth ba? 😃"
#> [4] "Even Neighborhood Watch self don dey from SARS. 😃😃\n#EndSARS"
#> [5] "Everybody for Lafia don snap for LAFIA CITY MALL.… \n\nNa only me remain.😃😃"
#> [6] "@StillYoursADD I don forget her handle 😃😃"
#> [7] "Shay your eyes don clear now? Smile 😃 https://t.co/iI5SsQF2hq"
#> [8] "@SamsonEguntola @max_sticks @Onise_iyanu @Mayami0105 @Femaledriver2 @Auntyfeyi @BayoAdedosu @Trinity_Don_JFK @savndaniel @fhinksleem96 😃🤣😂🤣 I'm also 12 years Sir 😂"
#> [9] "I yab a slim cousin of mine that he looks like Fido Dido, he was just looking at me, I later asked the older cousins of under 25, that do dey kno Fido Dido...? They said no... One of the best TV advert we had.... \nChai... I don old o😃 https://t.co/3xVaYK7CgO"
#> [10] "@TimeyinFreedom1 Wo! I don chop my own.. make e go round abeg. I jump am pass 😃"
Created on 2021-12-27 by the reprex package (v2.0.1)
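Note that the negation is applied on Twitter's side, so the results can still contain matches you may not want (result [8] above appears to match only through the handle @Trinity_Don_JFK). If that matters, one option is to post-filter the returned text locally. Below is a minimal sketch in base R. `keep_exact_word()` is a hypothetical helper, not part of academictwitteR, and it assumes `word` is a plain word without regex metacharacters.

```r
# Hypothetical helper (not part of academictwitteR): keep only the texts in
# which `word` occurs as a standalone token and `emoji` occurs anywhere.
# Assumption: `word` contains no regex metacharacters.
keep_exact_word <- function(texts, word, emoji) {
  # Characters that would make `word` part of a longer token: letters,
  # digits, underscore, and the straight or curly apostrophe.
  joiners <- "[[:alnum:]_'\u2019]"
  pattern <- paste0("(?<!", joiners, ")", word, "(?!", joiners, ")")
  has_word <- grepl(pattern, texts, perl = TRUE, ignore.case = TRUE)
  has_emoji <- grepl(emoji, texts, fixed = TRUE)
  texts[has_word & has_emoji]
}

smiley <- "\U0001F603"  # the emoji used in the question
examples <- c(
  paste("He don pass exam", smiley),
  paste("He don't pass exam", smiley),
  paste("He dont pass exam", smiley),
  paste("@Trinity_Don_JFK", smiley, "hello"),
  "I don forget"
)
kept <- keep_exact_word(examples, "don", smiley)
```

With these made-up example strings, only the first one should be kept. On real data the call would look like `keep_exact_word(x$text, "don", smiley)`.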
@chainsawriot Your answer gives me a clue on how to handle another problem: searching for tweets with a term that contains diacritics. The Twitter documentation says:
So, using the same trick you applied, I tried to search for tweets that contain "tó" and negated the query with "-to", but it returns nothing.
However, searching with only "tó" returns tweets.
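One possible workaround, since searching for "tó" alone does return tweets, is to fetch with that query and then keep only the exact accented token locally. This is only a sketch under assumptions: `keep_exact_token()` is a hypothetical helper (not part of academictwitteR), the example strings are made up, and it assumes the tweet text uses the precomposed form of "ó" (a decomposed "o" plus combining accent would need Unicode normalization first).

```r
# Hypothetical helper (not part of academictwitteR): keep only the texts in
# which `token` occurs as a standalone token, matched case- and
# accent-sensitively. Assumption: `token` has no regex metacharacters.
keep_exact_token <- function(texts, token) {
  # Reject matches that touch another letter or combining mark.
  not_letter_before <- "(?<![\\p{L}\\p{M}])"
  not_letter_after <- "(?![\\p{L}\\p{M}])"
  pattern <- paste0(not_letter_before, token, not_letter_after)
  texts[grepl(pattern, texts, perl = TRUE)]
}

accented <- "t\u00f3"  # "tó" written with an escape for portability
examples <- c(
  paste("word", accented, "here"),
  "go to school",
  paste0(accented, accented)
)
kept_accented <- keep_exact_token(examples, accented)
```

Here the unaccented "to" and the longer token "tótó" are dropped, so only the first string remains.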
@shmuhammad2004
The problem is that Twitter's tokenizer produces a search index containing both "don" and "don't" for tweets with "don't". That makes sense, because some languages also use the apostrophe as a delimiter, e.g. L’université.