This repository has been archived by the owner on Nov 10, 2024. It is now read-only.
While using stream_tweets, I sometimes (seemingly at random) get the following lexical error:
Error: lexical error: invalid character inside string.
          nload\/android\" rel=\"nofollo w\"\u003eTwitter for Android\
                     (right here) ------^
This is my code:
#stream tweets on the American Continents for 20 seconds
stream_tweets(c(-169.1, -57.2, -31.9, 74.7), timeout=20, parse=FALSE)
#parse the stream with the error-file
parse_stream("error_stream-20211014145815.json")
#parse the stream with the fix-file (line break removed manually)
parse_stream("fix_stream-20211014145815.json")
And this is my terminal-output:
> stream_tweets(c(-169.1, -57.2, -31.9, 74.7), timeout=20, parse=FALSE)
Streaming tweets for 20 seconds...
Finished streaming tweets!
streaming data saved as stream-20211014145815.json
> #parse the stream with the error-file
> parse_stream("error_stream-20211014145815.json")
Error: lexical error: invalid character inside string.
          nload\/android\" rel=\"nofollo w\"\u003eTwitter for Android\
                     (right here) ------^
> #parse the stream with the fix-file (line break removed manually)
> parse_stream("fix_stream-20211014145815.json")
# A tibble: 262 x 90
   user_id             status_id created_at          screen_name text   source display_text_wi~ reply_to_status~
   <chr>               <chr>     <dttm>              <chr>       <chr>  <chr>             <dbl> <chr>
 1 1413163529847386117 14486342~ 2021-10-14 12:58:08 EuropeSpac~ "\U0~  Twitt~               NA 144863422059169~
 2 57790703            14486342~ 2021-10-14 12:58:08 denise_ste~ "@_G~  Twitt~               22 144863178777373~
 3 1713933823          14486342~ 2021-10-14 12:58:08 somoschile~ "Bue~  Twitt~               NA NA
 4 2272114554          14486342~ 2021-10-14 12:58:08 ahsilla82   "@el~  Twitt~                0 144863033797891~
 5 387903087           14486342~ 2021-10-14 12:58:08 gabizadoro~ "ont~  Twitt~               NA 144406775907629~
 6 1109885932268937216 14486342~ 2021-10-14 12:58:08 agathallet~ "htt~  Twitt~               NA NA
 7 1353883596394803202 14486342~ 2021-10-14 12:58:08 JoeShow683~ "@Mi~  Twitt~               92 144863414041337~
 8 821113180499820544  14486342~ 2021-10-14 12:58:09 LIGGICPHOTO "@ha~  Twitt~               49 144852012635818~
 9 742807059322839040  14486342~ 2021-10-14 12:58:09 riverarias~ "Vel~  Twitt~               NA NA
10 293658484           14486342~ 2021-10-14 12:58:08 shinychevy  "@Ta~  Twitt~               27 144863173869189~
# ... with 252 more rows, and 82 more variables: reply_to_user_id <chr>, reply_to_screen_name <chr>,
#   is_quote <lgl>, is_retweet <lgl>, favorite_count <int>, retweet_count <int>, quote_count <int>,
#   reply_count <int>, hashtags <list>, symbols <list>, urls_url <list>, urls_t.co <list>,
#   urls_expanded_url <list>, media_url <list>, media_t.co <list>, media_expanded_url <list>,
#   media_type <list>, ext_media_url <list>, ext_media_t.co <list>, ext_media_expanded_url <list>,
#   ext_media_type <chr>, mentions_user_id <list>, mentions_screen_name <list>, lang <chr>,
#   quoted_status_id <chr>, quoted_text <chr>, quoted_created_at <dttm>, quoted_source <chr>, ...
If I use stream_tweets with the argument "parse = FALSE", the streaming itself works without any issues. The error only appears when I call parse_stream in a second step. Presumably I would get the error directly if I had not used "parse = FALSE".
-> this is the original (error-)file: error_stream-20211014145815.zip
I looked into the json-file and found that the error seems to be caused by a line break in the middle of a tweet (having looked at other error-files, this line break can occur many times per file). It seems to appear at random points, i.e., I could not observe a pattern. Normally, each tweet is written on a single line in the json-file.
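To check a saved stream file for this symptom without reading it by eye, one option is to test each line on its own with jsonlite::validate. This is only a minimal sketch: the helper name find_broken_lines is hypothetical (it is not part of rtweet), and it assumes the file is meant to hold one complete JSON object per line, as described above.

```r
library(jsonlite)

# Hypothetical helper (not part of rtweet): return the numbers of all
# non-empty lines that are not valid JSON on their own.
# Assumption: the stream file normally holds one JSON object per line.
find_broken_lines <- function(path) {
  lines <- readLines(path, warn = FALSE, encoding = "UTF-8")
  is_blank <- !nzchar(trimws(lines))
  is_valid <- vapply(
    lines,
    # validate() may attach error details as attributes; keep only the flag
    function(x) isTRUE(as.logical(jsonlite::validate(x))),
    logical(1),
    USE.NAMES = FALSE
  )
  which(!is_valid & !is_blank)
}

# Possible usage with the file from this report:
# find_broken_lines("error_stream-20211014145815.json")
```

A tweet split by a stray line break should show up as two consecutive line numbers: the first half and the second half of the record.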
-> this is the fixed file: fix_stream-20211014145815.zip
So it seems to me that stream_tweets sometimes inserts a line break at a random position while writing a streamed tweet to the json-file!
Also: A friend has the same issue, so I don't think it is due to my OS or PC.
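The manual fix (removing the line break by hand) could in principle be automated. The sketch below glues consecutive lines together until the combined text is valid JSON. It is not how rtweet handles this: repair_stream_file and its max_join parameter are hypothetical, and it assumes that the stray line break was inserted without dropping or replacing any other character.

```r
library(jsonlite)

# Hypothetical helper (not part of rtweet): rejoin records that were split
# across lines and write the repaired records to a new file.
# Assumption: a broken record becomes valid again once its fragments are
# concatenated with nothing in between.
repair_stream_file <- function(path_in, path_out, max_join = 5L) {
  lines <- readLines(path_in, warn = FALSE, encoding = "UTF-8")
  lines <- lines[nzchar(trimws(lines))]

  repaired <- character(0)
  buffer <- ""
  joined <- 0L
  for (line in lines) {
    buffer <- paste0(buffer, line)
    joined <- joined + 1L
    if (isTRUE(as.logical(jsonlite::validate(buffer)))) {
      # buffer now holds one complete record
      repaired <- c(repaired, buffer)
      buffer <- ""
      joined <- 0L
    } else if (joined >= max_join) {
      # give up on this record so one bad line cannot swallow the whole file
      buffer <- ""
      joined <- 0L
    }
  }

  writeLines(repaired, path_out, useBytes = TRUE)
  invisible(length(repaired))
}

# Possible usage with the file from this report:
# repair_stream_file("error_stream-20211014145815.json", "repaired_stream.json")
# parse_stream("repaired_stream.json")
```

Records that cannot be repaired within max_join lines are dropped rather than kept, so it is worth comparing the number of records written with the number of tweets expected.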
Reproduce the problem
Unfortunately, I am not able to reproduce the error reliably. To reproduce this example, I had to run my 20-second stream_tweets call 4 times. I found a similar issue, #356, which is said to be fixed. However, my issue sounds very similar to the one reported there.
Issue #356 seems to be the same bug. That bug is fixed in the development version of the package, while you are using the CRAN version, so you aren't using the fix that closed that issue.
If you want to use the package with a fix for this, you'll need to use the current package as it is here. Unless you do that and still find the same problem, I'll close this issue as a duplicate.
If you install this version, be aware that it might change before it reaches CRAN and that it differs considerably from the version you are currently using.