Error in make_query. Status code: 400 #180

shmuhammadd · 2021-07-03T10:08:05Z

I run the code below to extract tweets with hashtag #BlackLivesMatter. But, it returns an error Error in make_query(url = endpoint_url, params = params, bearer_token = bearer_token, : something went wrong. Status code: 400.

I understand error 400 means bad request but the query is a verbatim copy from academictwitteR.

get_all_tweets(
    query = "#BlackLivesMatter",
    start_tweets = "2020-01-01T00:00:00Z",
    end_tweets = "2020-01-05T00:00:00Z",
    file = "blmtweets",
    data_path = "data/",
    n = 100,
    bearer_token = get_bearer()
  )

Expected behavior

Return the expected tweets as queried.

Session Info:


R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics 
[3] grDevices utils    
[5] datasets  methods  
[7] base     

other attached packages:
 [1] quanteda.textstats_0.94.1
 [2] quanteda.tidy_0.2        
 [3] quanteda_3.0.0           
 [4] forcats_0.5.1            
 [5] stringr_1.4.0            
 [6] dplyr_1.0.7              
 [7] purrr_0.3.4              
 [8] readr_1.4.0              
 [9] tidyr_1.1.3              
[10] tibble_3.1.2             
[11] ggplot2_3.3.4.9000       
[12] tidyverse_1.3.1          
[13] academictwitteR_0.2.0    
[14] goodshirt_0.2.2          

loaded via a namespace (and not attached):
 [1] rmsfact_0.0.3             
 [2] Rcpp_1.0.6                
 [3] stringdist_0.9.6.3        
 [4] lubridate_1.7.10          
 [5] lattice_0.20-44           
 [6] LexisNexisTools_0.3.4.9000
 [7] assertthat_0.2.1          
 [8] utf8_1.2.1                
 [9] cellranger_1.1.0          
[10] R6_2.5.0                  
[11] plyr_1.8.6                
[12] backports_1.2.1           
[13] reprex_2.0.0              
[14] nsyllable_1.0             
[15] httr_1.4.2                
[16] pillar_1.6.1              
[17] rlang_0.4.11              
[18] readxl_1.3.1              
[19] curl_4.3.2                
[20] rstudioapi_0.13           
[21] data.table_1.14.0         
[22] praise_1.0.0              
[23] Matrix_1.3-4              
[24] munsell_0.5.0             
[25] broom_0.7.8               
[26] modelr_0.1.8              
[27] compiler_4.0.3            
[28] xfun_0.24                 
[29] pkgconfig_2.0.3           
[30] tidyselect_1.1.1          
[31] emo_0.0.0.9000            
[32] fansi_0.5.0               
[33] withr_2.4.2               
[34] crayon_1.4.1              
[35] dbplyr_2.1.1              
[36] grid_4.0.3                
[37] jsonlite_1.7.2            
[38] gtable_0.3.0              
[39] lifecycle_1.0.0           
[40] DBI_1.1.1                 
[41] magrittr_2.0.1            
[42] scales_1.1.1              
[43] RcppParallel_5.1.4        
[44] cli_2.5.0                 
[45] stringi_1.6.2             
[46] pbapply_1.4-3             
[47] reshape2_1.4.4            
[48] fs_1.5.0                  
[49] cowsay_0.8.0              
[50] xml2_1.3.2                
[51] ellipsis_0.3.2            
[52] stopwords_2.2             
[53] fortunes_1.5-4            
[54] generics_0.1.0            
[55] vctrs_0.3.8               
[56] fastmatch_1.1-0           
[57] tools_4.0.3               
[58] glue_1.4.2                
[59] hms_1.1.0                 
[60] parallel_4.0.3            
[61] colorspace_2.0-2          
[62] rvest_1.0.0               
[63] haven_2.4.1               
[64] knitr_1.33                
[65] usethis_2.0.1.9000

Thanks @cjbarrie for the amazing work.

Please, kindly advised.

Best,
Shamsuddeen

The text was updated successfully, but these errors were encountered:

hsakareem · 2021-07-03T14:09:37Z

I am getting the same error. In my case, it was working perfectly till yesterday. I'm getting this error from this evening. Might be a problem at Twitter's end.

DrorWalt · 2021-07-03T16:04:25Z

Same issue here. Python works with no problem though and same bearer.

shmuhammadd · 2021-07-03T16:41:14Z

I am getting the same error. In my case, it was working perfectly till yesterday. I'm getting this error from this evening. Might be a problem at Twitter's end.

Works fine for me also yesterday.

justinchuntingho · 2021-07-03T18:55:57Z

Twitter just changed its API a few days ago, if a user requests context_annotations with the tweet.fields parameter (by default on), the fetch will be limited to 100 tweets per page (by default 500, hence the error). A quick workaround would be to add page_n = 100. We are working on a fix and will update soon.

shmuhammadd · 2021-07-03T19:01:23Z

Twitter just changed its API a few days ago, if a user requests context_annotations with the tweet.fields parameter (by default on), the fetch will be limited to 100 tweets per page (by default 500). A quick workaround would be to add page_n = 100. We are working on a fix and will update soon.

Thanks for the response @justinchuntingho.

jmwright432 · 2021-07-04T17:59:37Z

I'm having an issue on this front. I ran this code about 4-5 days ago with no issue--I was getting tweets scraped of upwards of 250,000 which was fantastic. Now I am getting this 400 error message and using page_n =100 obviously limits me to 100 tweets per page and maxing out my tweets at 100. Is there a workaround for this or is this package now limiting to that few of tweets?

justinchuntingho · 2021-07-04T18:08:15Z

Starting from #181, you should now be able to specify context_annotations = FALSE (also the default), in this case you will be able to fetch 500 tweets per page. We will try to push the patch to CRAN soon but at the mean time you could install the development version to use this.

jmwright432 · 2021-07-04T18:17:43Z

This is the message I get with the following code:

tweets4 <- get_all_tweets(query=build_query("sanctuary cities OR sanctuary city",is_retweet=FALSE,lang="en"),start_tweets="2018-01-01T00:00:00Z",end_tweets="2018-01-03T00:00:00Z", bearer_token=bearer_token, data_path = "data6/", bind_tweets = TRUE, context_annotations=FALSE, page_n=500)
query: sanctuary cities OR sanctuary city -is:retweet lang:en
Total pages queried: 1 (tweets captured this page: 496).
Total tweets captured now reach 100 : finishing collection.

chainsawriot · 2021-07-04T19:08:37Z

@jmwright432 How about

city",is_retweet=FALSE,lang="en"),start_tweets="2018-01-01T00:00:00Z",end_tweets="2018-01-03T00:00:00Z", bearer_token=bearer_token, data_path = "data6/", bind_tweets = TRUE, context_annotations=FALSE, page_n=500, n = Inf)

You needa tune the n.

natesheehan · 2021-07-04T20:56:42Z

Is there a workaround for this or is this package now limiting to that few of tweets?

From what I understand about the new twitter update and this package, you should still be able to mine > 100 tweets , it will just be much slower if you want context_annotations until an update @justinchuntingho ?

chainsawriot · 2021-07-04T21:34:07Z

I am not @justinchuntingho (I'm the quiet Beatle). But I can answer your question @natesheehan.

The update is there, now. You can install the Github version.

First thing first, you can get more than 100 tweets. You can get 1000 tweets in 5s, for example. The only change is that you won't get the context annotations, the things (e.g. topics, name entities) that Twitter extracted for you from tweets.

require(academictwitteR)
#> Loading required package: academictwitteR

start_time <- Sys.time()
x <- get_all_tweets(
  query = "#ichbinhanna",
  start_tweets = "2021-01-01T00:00:00Z",
  end_tweets = "2021-07-01T00:00:00Z",
  n = 1000
)
#> Warning: Recommended to specify a data path in order to mitigate data loss when
#> ingesting large amounts of data.
#> Warning: Tweets will not be stored as JSONs or as a .rds file and will only be
#> available in local memory if assigned to an object.
#> query:  #ichbinhanna 
#> Total pages queried: 1 (tweets captured this page: 500).
#> Total pages queried: 2 (tweets captured this page: 500).
#> Total tweets captured now reach 1000 : finishing collection.
end_time <- Sys.time()
end_time - start_time
#> Time difference of 4.990046 secs
nrow(x)
#> [1] 1000

^{Created on 2021-07-04 by the reprex package (v2.0.0)}

If you need those context annotations, you need to specify it explicitly in your call to get_all_tweets. It will be slower also.

require(academictwitteR)
#> Loading required package: academictwitteR

start_time <- Sys.time()
x <- get_all_tweets(
  query = "#ichbinhanna",
  start_tweets = "2021-01-01T00:00:00Z",
  end_tweets = "2021-07-01T00:00:00Z",
  n = 1000,
  context_annotations = TRUE
)
#> Warning: Recommended to specify a data path in order to mitigate data loss when
#> ingesting large amounts of data.
#> Warning: Tweets will not be stored as JSONs or as a .rds file and will only be
#> available in local memory if assigned to an object.
#> page_n is limited to 100 due to the restriction imposed by Twitter API
#> query:  #ichbinhanna 
#> Total pages queried: 1 (tweets captured this page: 100).
#> Total pages queried: 2 (tweets captured this page: 100).
#> Total pages queried: 3 (tweets captured this page: 100).
#> Total pages queried: 4 (tweets captured this page: 100).
#> Total pages queried: 5 (tweets captured this page: 100).
#> Total pages queried: 6 (tweets captured this page: 100).
#> Total pages queried: 7 (tweets captured this page: 100).
#> Total pages queried: 8 (tweets captured this page: 100).
#> Total pages queried: 9 (tweets captured this page: 100).
#> Total pages queried: 10 (tweets captured this page: 100).
#> Total tweets captured now reach 1000 : finishing collection.
end_time <- Sys.time()
end_time - start_time
#> Time difference of 11.94927 secs
nrow(x)
#> [1] 1000

^{Created on 2021-07-04 by the reprex package (v2.0.0)}

jmwright432 · 2021-07-04T21:41:22Z

Thanks @chainsawriot adding the n=Inf worked. I’m getting closer to 250k tweets now which was what I was getting a few days ago. Clearly the syntax has changed in the code. Much appreciated!

natesheehan · 2021-07-04T21:56:54Z

@chainsawriot hey quiet Beatle - great answer and thanks for this tip!

You needa tune the n.

Got that n tuned finely now! Many thanks @justinchuntingho for the speedy fix!

shmuhammadd · 2021-07-04T22:49:26Z

Many thanks guys for fixing this. @justinchuntingho @chainsawriot you guys are amazing.

helennguyen1312 · 2021-07-05T02:41:54Z

Hi @chainsawriot, I still have a problem with the status code: 400. Below is my code. Can you please tell me what I did wrong? I tried to add page_n=500 but it did not work. page_n = 100 worked, but I noticed that it took longer than a few days ago when the update had not happened yet.
tweets <-
get_all_tweets("paris accord",
"2018-07-01T00:00:00Z",
"2021-07-04T00:00:00Z",
BEARER_TOKEN,
lang = "en") (this one did not work)
I am a newbie so I am sorry if my question is not good.

shmuhammadd · 2021-07-05T05:25:55Z

Hi @chainsawriot, I still have a problem with the status code: 400. Below is my code. Can you please tell me what I did wrong? I tried to add page_n=500 but it did not work. page_n = 100 worked, but I noticed that it took longer than a few days ago when the update had not happened yet.
tweets <-
get_all_tweets("paris accord",
"2018-07-01T00:00:00Z",
"2021-07-04T00:00:00Z",
BEARER_TOKEN,
lang = "en") (this one did not work)
I am a newbie so I am sorry if my question is not good.

Hi @helennguyen1312 ,

You need to update the package. It is not yet push to CRAN, but you can install the dev version as shown below.

devtools::install_github("cjbarrie/academictwitteR", build_vignettes = TRUE)

This is is what works for me.

Best,
Shamsuddeen

helennguyen1312 · 2021-07-05T05:31:41Z

@shmuhammad2004 Thank you so much! I got it now.
And many thanks to @justinchuntingho @chainsawriot for fixing the issue.

AndreaaMarche · 2021-07-05T16:34:38Z

Hi @chainsawriot sorry to bother you. I'm still having issues in getting tweets.
Firstly I create bearer_token and query objects. The query is the following:
query <- build_query( query = "blabla", is_retweet = FALSE, has_hashtags = TRUE, remove_promoted = TRUE)

Then, I try to get tweets with the following command:
try <- get_all_tweets( query = query, bearer_token, file = NULL, data_path = NULL, bind_tweets = TRUE, start_tweets = "2021-06-11T00:00:00Z", end_tweets = "2021-07-04T23:59:59Z", verbose= FALSE)

Such a command does not work. The error is the following
Errore in make_query(url = endpoint_url, params = params, bearer_token = bearer_token, : something went wrong. Status code: 400

I tried to introduce context_annotations = TRUE, I tried to introduce n = . In both cases, the command does not work (it does not recognize context_annotations as valid argument).

The command works only with page_n = 100. Yet, I need to scrape many more tweets. How can I solve this? Any tip?

Thank you all in advance for your great work and support.

chainsawriot · 2021-07-05T16:45:16Z

@AndreaaMarche Have you installed the latest Github version?

devtools::install_github("cjbarrie/academictwitteR", build_vignettes = TRUE)

Can't reproduce your error.

require(academictwitteR)
#> Loading required package: academictwitteR
query <- build_query( query = "blabla", is_retweet = FALSE, has_hashtags = TRUE, 
                      remove_promoted = TRUE)

try <- get_all_tweets( query = query, file = NULL, data_path = NULL, 
                       bind_tweets = TRUE, start_tweets = "2021-06-11T00:00:00Z", 
                       end_tweets = "2021-07-04T23:59:59Z", verbose= FALSE, n = 2000)
nrow(try)
#> [1] 2000

^{Created on 2021-07-05 by the reprex package (v2.0.0)}

AndreaaMarche · 2021-07-05T16:55:56Z

@chainsawriot It works if I do not specify bearer_token in the command. I do not understand why, but this was the issue in my case, and I had to use set_bearer: maybe it can be helpful for other users.

If possible, I would like to know the maximum n = I can specify. Thank you very much for your help!

chainsawriot · 2021-07-05T17:02:32Z

@AndreaaMarche Study count_all_tweets

kobihackenburg · 2021-07-06T16:46:08Z

Hi @chainsawriot sorry to bother you. I'm still having issues in getting tweets.
Firstly I create bearer_token and query objects. The query is the following:
query <- build_query( query = "blabla", is_retweet = FALSE, has_hashtags = TRUE, remove_promoted = TRUE)

Then, I try to get tweets with the following command:
try <- get_all_tweets( query = query, bearer_token, file = NULL, data_path = NULL, bind_tweets = TRUE, start_tweets = "2021-06-11T00:00:00Z", end_tweets = "2021-07-04T23:59:59Z", verbose= FALSE)

Such a command does not work. The error is the following
Errore in make_query(url = endpoint_url, params = params, bearer_token = bearer_token, : something went wrong. Status code: 400

I tried to introduce context_annotations = TRUE, I tried to introduce n = . In both cases, the command does not work (it does not recognize context_annotations as valid argument).

The command works only with page_n = 100. Yet, I need to scrape many more tweets. How can I solve this? Any tip?

Thank you all in advance for your great work and support.

Hi @chainsawriot! I'm having the same exact issue @AndreaaMarche had, but her solution is not working for me, as I never specified bearer token in the command to begin with. My query is as follows:

hillary_tweets <- get_all_tweets(users = c("HillaryClinton"), start_tweets = "2015-04-12T00:00:00Z", end_tweets = "2016-06-06T00:00:00Z", bind_tweets = TRUE, page_n = 500, n = Inf)

This gives me the 400 error:

Error in make_query(url = endpoint_url, params = params, bearer_token = bearer_token, : something went wrong. Status code: 400

I've installed the latest dev version of the package, but like @AndreaaMarche I can't introduce context_annotations = FALSE or n = without getting errors. I can only get it to work with page_n = 100, which quickly exceeds the rate limit. Any suggestions?

Thanks so much!

chainsawriot · 2021-07-06T16:52:53Z

@kobihackenburg can't reproduce

require(academictwitteR)
#> Loading required package: academictwitteR
hillary_tweets <- get_all_tweets(users = c("HillaryClinton"), start_tweets = "2015-04-12T00:00:00Z", end_tweets = "2016-06-06T00:00:00Z", bind_tweets = TRUE, page_n = 500, n = Inf)
#> Warning: Recommended to specify a data path in order to mitigate data loss when
#> ingesting large amounts of data.
#> Warning: Tweets will not be stored as JSONs or as a .rds file and will only be
#> available in local memory if assigned to an object.
#> query:   (from:HillaryClinton) 
#> Total pages queried: 1 (tweets captured this page: 496).
#> Total pages queried: 2 (tweets captured this page: 500).
#> Total pages queried: 3 (tweets captured this page: 499).
#> Total pages queried: 4 (tweets captured this page: 496).
#> Total pages queried: 5 (tweets captured this page: 486).
#> Total pages queried: 6 (tweets captured this page: 494).
#> Total pages queried: 7 (tweets captured this page: 494).
#> Total pages queried: 8 (tweets captured this page: 500).
#> Total pages queried: 9 (tweets captured this page: 491).
#> Total pages queried: 10 (tweets captured this page: 498).
#> Total pages queried: 11 (tweets captured this page: 497).
#> Total pages queried: 12 (tweets captured this page: 430).
#> This is the last page for  (from:HillaryClinton) : finishing collection.

^{Created on 2021-07-06 by the reprex package (v2.0.0)}

I am using 0.2.1 a.k.a. the current Github version.

justinchuntingho · 2021-07-06T18:29:24Z

Hi @chainsawriot sorry to bother you. I'm still having issues in getting tweets.
Firstly I create bearer_token and query objects. The query is the following:
query <- build_query( query = "blabla", is_retweet = FALSE, has_hashtags = TRUE, remove_promoted = TRUE)

Then, I try to get tweets with the following command:
try <- get_all_tweets( query = query, bearer_token, file = NULL, data_path = NULL, bind_tweets = TRUE, start_tweets = "2021-06-11T00:00:00Z", end_tweets = "2021-07-04T23:59:59Z", verbose= FALSE)

Such a command does not work. The error is the following
Errore in make_query(url = endpoint_url, params = params, bearer_token = bearer_token, : something went wrong. Status code: 400

I tried to introduce context_annotations = TRUE, I tried to introduce n = . In both cases, the command does not work (it does not recognize context_annotations as valid argument).

The command works only with page_n = 100. Yet, I need to scrape many more tweets. How can I solve this? Any tip?

Thank you all in advance for your great work and support.

If you are supplying the arguments in the order they were defined, you need to name them, eg you need to state explicitly bearer_token = bearer_token (recommended), or put your arguments in the order they were defined get_all_tweets(query, start_tweets, end_tweets, bearer_token, ...).

cjbarrie · 2021-07-07T11:38:10Z

Patch v0.2.1 now on CRAN: ref. commit 49d0c7e

justinchuntingho added the bug Something isn't working label Jul 4, 2021

justinchuntingho closed this as completed Jul 8, 2021

cjbarrie mentioned this issue Aug 6, 2021

get_all_tweets error code 503 #135

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error in make_query. Status code: 400 #180

Error in make_query. Status code: 400 #180

shmuhammadd commented Jul 3, 2021 •

edited

Loading

hsakareem commented Jul 3, 2021

DrorWalt commented Jul 3, 2021

shmuhammadd commented Jul 3, 2021

justinchuntingho commented Jul 3, 2021 •

edited

Loading

shmuhammadd commented Jul 3, 2021

jmwright432 commented Jul 4, 2021

justinchuntingho commented Jul 4, 2021

jmwright432 commented Jul 4, 2021

chainsawriot commented Jul 4, 2021 •

edited

Loading

natesheehan commented Jul 4, 2021 •

edited

Loading

chainsawriot commented Jul 4, 2021 •

edited

Loading

jmwright432 commented Jul 4, 2021

natesheehan commented Jul 4, 2021 •

edited

Loading

shmuhammadd commented Jul 4, 2021

helennguyen1312 commented Jul 5, 2021

shmuhammadd commented Jul 5, 2021 •

edited

Loading

helennguyen1312 commented Jul 5, 2021

AndreaaMarche commented Jul 5, 2021 •

edited

Loading

chainsawriot commented Jul 5, 2021

AndreaaMarche commented Jul 5, 2021

chainsawriot commented Jul 5, 2021

kobihackenburg commented Jul 6, 2021

chainsawriot commented Jul 6, 2021

justinchuntingho commented Jul 6, 2021

cjbarrie commented Jul 7, 2021

Error in make_query. Status code: 400 #180

Error in make_query. Status code: 400 #180

Comments

shmuhammadd commented Jul 3, 2021 • edited Loading

hsakareem commented Jul 3, 2021

DrorWalt commented Jul 3, 2021

shmuhammadd commented Jul 3, 2021

justinchuntingho commented Jul 3, 2021 • edited Loading

shmuhammadd commented Jul 3, 2021

jmwright432 commented Jul 4, 2021

justinchuntingho commented Jul 4, 2021

jmwright432 commented Jul 4, 2021

chainsawriot commented Jul 4, 2021 • edited Loading

natesheehan commented Jul 4, 2021 • edited Loading

chainsawriot commented Jul 4, 2021 • edited Loading

jmwright432 commented Jul 4, 2021

natesheehan commented Jul 4, 2021 • edited Loading

shmuhammadd commented Jul 4, 2021

helennguyen1312 commented Jul 5, 2021

shmuhammadd commented Jul 5, 2021 • edited Loading

helennguyen1312 commented Jul 5, 2021

AndreaaMarche commented Jul 5, 2021 • edited Loading

chainsawriot commented Jul 5, 2021

AndreaaMarche commented Jul 5, 2021

chainsawriot commented Jul 5, 2021

kobihackenburg commented Jul 6, 2021

chainsawriot commented Jul 6, 2021

justinchuntingho commented Jul 6, 2021

cjbarrie commented Jul 7, 2021

shmuhammadd commented Jul 3, 2021 •

edited

Loading

justinchuntingho commented Jul 3, 2021 •

edited

Loading

chainsawriot commented Jul 4, 2021 •

edited

Loading

natesheehan commented Jul 4, 2021 •

edited

Loading

chainsawriot commented Jul 4, 2021 •

edited

Loading

natesheehan commented Jul 4, 2021 •

edited

Loading

shmuhammadd commented Jul 5, 2021 •

edited

Loading

AndreaaMarche commented Jul 5, 2021 •

edited

Loading