Add yt-dlp based archiving for TwitterArchiver #138
Merged
This PR adds a yt-dlp based Twitter archiving function to `TwitterArchiver` as a fallback to the two existing archiving strategies. It uses the `_extract_status` function of yt-dlp's `TwitterIE` extractor to extract tweet metadata, and processes it in a similar way to the existing archiving implementation.

In local testing, the existing snscrape (which seems to be unmaintained) and twitter-hack solutions did not work reliably for tweets, but the yt-dlp based solution did. Happy to hear whether this can be replicated!
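
For reviewers, the fallback behaviour can be pictured roughly as below. This is only a minimal sketch and not the code in this PR: `archive_with_fallback`, the `Strategy` type, and the `archived_with` key are hypothetical names used for illustration, and the actual methods on `TwitterArchiver` may look different.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical strategy signature: takes a tweet URL and returns a metadata
# dict on success, or None (or raises) on failure.
Strategy = Callable[[str], Optional[dict]]


def archive_with_fallback(
    url: str, strategies: Sequence[Tuple[str, Strategy]]
) -> Optional[dict]:
    """Try each strategy in order and return the first successful result."""
    for name, strategy in strategies:
        try:
            result = strategy(url)
        except Exception as exc:
            # A failing strategy should not abort the whole archiving attempt.
            print(f"{name} failed for {url}: {exc}")
            continue
        if result:
            # Record which strategy produced the result (illustrative only).
            return {**result, "archived_with": name}
    return None
```

In this picture the yt-dlp based function would simply be the last entry in `strategies`, so it only runs when the earlier ones return nothing or raise.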
Also happy to add a configuration option to `TwitterArchiver` for specifying the preference of tweet archiving methods (`snscrape` / `twitter-hack` / `yt-dlp`).
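
One possible shape for such an option, purely as a sketch: a comma-separated string listing the methods in preferred order. The function name `parse_method_preference`, the string format, and the default order are all assumptions made here for illustration; nothing like this exists in the PR yet.

```python
from typing import List

# The three methods named in this PR; the order here is an assumed default.
KNOWN_METHODS = ("snscrape", "twitter-hack", "yt-dlp")


def parse_method_preference(value: str) -> List[str]:
    """Parse e.g. "yt-dlp, snscrape" into an ordered, validated list."""
    methods: List[str] = []
    for raw in value.split(","):
        name = raw.strip().lower()
        if not name:
            continue  # ignore empty entries such as trailing commas
        if name not in KNOWN_METHODS:
            raise ValueError(f"unknown archiving method: {name!r}")
        if name not in methods:
            methods.append(name)  # keep first occurrence, drop duplicates
    # Fall back to the default order when nothing was specified.
    return methods or list(KNOWN_METHODS)
```

Methods left out of the string would then simply not be tried, which also gives a way to disable a strategy that is known to be broken.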