is there a way to make gallery-dl twitter.com/username download what it would normally download plus whatever would be downloaded via twitter.com/username/media ? #2252

Closed
left1000 opened this issue Jan 31, 2022 · 25 comments

Comments

@left1000

Is there a way to make gallery-dl twitter.com/username download what it would normally download plus whatever would be downloaded via twitter.com/username/media ?

I thought it was already working this way. For some reason I assumed /media content was a subset of the main twitter url, but I guess that's false, because artists reply to themselves with their own art. That falls under tweets and replies, which no one wants because most replies don't have 'content'.

tl;dr: I just realized that the dozens of twitter rips I've done over the past year are likely missing dozens of content posts, because I apparently want both twitter.com/username and twitter.com/username/media but have only ever typed the one.

@nisehime

/media should contain all media files posted by the user, including the ones in replies.

@left1000
Author

Yes, that's what I was saying in my post. twitter.com/username/media DOES include all media files posted by the user including the ones in replies.

However, I want twitter.com/username to include all media files as well. I'm not sure why it doesn't.

        "filename": "{filename}.{extension}",
        "content": false,
        "retweets": false,
        "videos": true

if I set

"content: true"

will that make the extractor do what I want? is that my issue?

@Twi-Hard

Twi-Hard commented Jan 31, 2022

The twitter API only lets you go back 3200 tweets. Maybe there's not as many media tweets in the main feed because there's other tweets besides media tweets (which leaves less room for media tweets).

Edit: What I said has nothing to do with replies but it does affect the amount of media you'll get in the main feed.

@nisehime

nisehime commented Jan 31, 2022

"content: true"

What's that? I haven't seen this option...

You probably want to use urls like that then: twitter.com/username/with_replies to get tweets and tweets with replies in gallery-dl.

@left1000
Author

left1000 commented Jan 31, 2022

I do not want tweets with replies....
the behavior I want is for:

gallery-dl twitter.com/username

to function the same as

gallery-dl twitter.com/username/media twitter.com/username

Having to type the same thing twice (as in the above command) seems redundant.
I'm not explaining myself very clearly, because none of the replies in this thread relate to what I was asking :(

Also, the 3200 tweet limit isn't at play here: in my test case the author posted some media about a week ago and then replied to his own tweet with more media (it was a comic with more panels in the replies he made to himself). The reason I like to use twitter.com/username is that I specifically do not want anything from twitter.com/username/with_replies, because content from there is usually total trash, UNLESS the content is also under twitter.com/username/media, in which case it might be worthwhile, as with a comic panel reply.

In the past I had assumed that twitter.com/username/media was a subset of twitter.com/username, and that twitter.com/username was a subset of twitter.com/username/with_replies. I now realize that this is false.

Still using the command

gallery-dl twitter.com/username/media twitter.com/username

will achieve the result I want. The only issue is that this command seems so... obvious that surely there must be an extractor option to make this the default behavior for

gallery-dl twitter.com/username

And if there is no twitter extractor option that makes things work as I'm describing here, there should be, because as far as I can tell this is the main set of things a person can want from a twitter rip.

@nisehime

nisehime commented Feb 1, 2022

Ok, when you do twitter.com/username you get media from the user's own tweets and from other users' tweets they've retweeted.
With twitter.com/username/media you get all media from the user's own tweets only.

That's enough for most people I guess.

I think twitter.com/username should include threaded tweets, but it needs to be tested.

I've noticed something though: twitter.com/username/with_replies depends on your login status. For example, if you download twitter.com/usernameA/with_replies and there's a tweet where user A replies with just text to a tweet of user B that contains an image, the media of user B's tweet won't be downloaded if you're not logged in, but it will be if you are. That basically matches how it is displayed in the browser. So maybe this also applies to threaded tweets.

@mikf I think you can implement this behavior for this - i.e. download media from URL-user's replies only, when logged in.

@left1000
Author

left1000 commented Feb 1, 2022

Yes, you did a good job of explaining the issue, it's threaded media replies that I want to get, with zero other replies (because most replies are terrible).

Sorry I can't explain myself better, I barely understand how twitter works.

@nisehime

nisehime commented Feb 1, 2022

You don't get all other replies, only those which are directly replied to by the username (and if that's a long "conversation" type it will also download media from OP tweet). I think you could have avoided that if the author did what I asked a year ago with a filter (maybe you still can, I'm just not that good with it).

Anyway, as I said, getting twitter.com/username/with_replies without logging in to twitter should be equal to twitter.com/username/ + twitter.com/username/media at this point.

@left1000
Author

left1000 commented Feb 1, 2022

Unfortunately I want to be logged into twitter because a number of twitters I follow are private.
Although that's just me, so I suppose I'll just try to remember to use twitter.com/username/ + twitter.com/username/media

and the general use case is perhaps solved for others who don't have a twitter account already/anyways.

@nisehime

nisehime commented Feb 1, 2022

Wait for the author, I think he will add what you want soon.

@rautamiekka
Contributor

I'll just try to remember to use twitter.com/username/ + twitter.com/username/media

Use a script which you pass the links to. The main problem will be placing the script in a place where it's immediately available without having to use full path or change to the folder.

@Hrxn
Contributor

Hrxn commented Feb 2, 2022

I don't think it's a big deal, to be honest.
Simply use a text file with the URLs, and always use both twitter.com/user/media and twitter.com/user
You can collect all your links there, and then simply use the --input-file option. You can obviously use even more than one input file.
This is especially useful because you can set any options per URL in such a file.
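
The input-file approach can be sketched as follows. This is only a minimal illustration, not part of gallery-dl: the helper name `write_url_file`, the file name `twitter-urls.txt`, and the usernames are all hypothetical. It writes both the timeline URL and the /media URL for each username into one file, which can then be passed with the `--input-file` option mentioned above.

```shell
# Hypothetical helper: write both URLs for each username to an input file.
# $1 is the output file, the remaining arguments are usernames.
write_url_file() {
    file=$1
    shift
    : > "$file"    # create the file, or empty it if it already exists
    for user in "$@"; do
        printf '%s\n' \
            "https://twitter.com/$user" \
            "https://twitter.com/$user/media" >> "$file"
    done
}

# Example usage (not executed here):
#   write_url_file twitter-urls.txt alice bob
#   gallery-dl --input-file twitter-urls.txt
```

Keeping the list in a file also means both URLs are always fetched together, so there is nothing to remember at the command line.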

@rautamiekka
Contributor

This is especially useful because you can set any options per URL in such a file.

Like many ppl would say: mind blown.

@mikf
Owner

mikf commented Feb 2, 2022

I could add an include option like there is for instagram, deviantart, and other sites, but like the others said:

  • write your own little script
    twitter-dl() { gallery-dl "$1" "$1/media"; } in bash
  • use an --input-file
  • only use twitter.com/USER/media as URL since you don't care about retweets and such.
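
A slightly expanded variant of the one-line function above, as a sketch only. Unlike the original, it takes usernames rather than full URLs and loops over several of them; the names `twitter_urls` and `twitter_dl` are made up for this example.

```shell
# Print the two URLs to fetch for one account: main timeline and /media.
twitter_urls() {
    user=$1
    printf '%s\n' "https://twitter.com/$user" "https://twitter.com/$user/media"
}

# Hypothetical wrapper: download both URLs for every username given.
twitter_dl() {
    for user in "$@"; do
        # Left unquoted on purpose: twitter_urls prints one URL per line,
        # and word splitting turns them into two separate arguments.
        gallery-dl $(twitter_urls "$user")
    done
}

# Example usage (not executed here):
#   twitter_dl alice bob
```

Placed in a shell startup file such as `~/.bashrc`, the function is available from any directory, which addresses the concern raised earlier about where to keep a script.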

@left1000
Author

left1000 commented Feb 2, 2022

I do wonder if twitter.com/USER/media is just what I should be using instead. What does twitter.com/USER include that isn't included in twitter.com/USER/media ? I'm actually not sure I know what it includes whatsoever (anything?)

@nisehime

nisehime commented Feb 2, 2022

What does twitter.com/USER include

Retweets.

@left1000
Author

left1000 commented Feb 2, 2022

I do want retweets a twitter user makes that are retweets of their own tweets, because they bypass the 3200 tweet scan limit (if they retweet something old of theirs)... But I do not want retweets that they make that are other people's art from other twitters.... but I could've sworn that this second type of retweet wasn't included in twitter.com/USER ? because as far as I can recall I've never seen it appear there?

does this have anything to do with the

    "content": false,

extractor setting? In fact what does the above extractor setting do? if anything? for twitter?

@nisehime

nisehime commented Feb 3, 2022

I could've sworn that this second type of retweet wasn't included in twitter.com/USER

It does include them. Since v1.18.0 retweets are disabled in gallery-dl by default, including self-retweets, so if you didn't enable them in your config you weren't downloading any of them.

If you want a user's older tweets you can use twitter search.

"content": false,

I don't know where you got it, but it doesn't do anything.

@mikf

I could add an include option like there is for instagram, deviantart, and other sites

I have already said that you can simply add a second option for extractor.twitter.replies which will download tweets where author = user from url, except for retweets.
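
For readers who do want retweets from twitter.com/USER, the default described above can be overridden in the configuration. A minimal sketch, assuming the `extractor.twitter.retweets` option accepts `true`; the helper name `write_retweets_config` and the file name are hypothetical. Note that this enables all retweets, not only self-retweets.

```shell
# Hypothetical helper: write a small gallery-dl config file ($1) that
# turns retweets back on for the twitter extractor.
write_retweets_config() {
    cat > "$1" <<'EOF'
{
    "extractor": {
        "twitter": {
            "retweets": true
        }
    }
}
EOF
}

# Example usage (not executed here), loading the file as an
# additional configuration file:
#   write_retweets_config twitter-retweets.conf
#   gallery-dl --config twitter-retweets.conf twitter.com/USER
```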

@left1000
Author

left1000 commented Feb 5, 2022

Okay, then I guess I'll just use twitter.com/USER/media, although if and when an option to enable only self-retweets exists I'd probably come back, turn that on, and start doing both twitter.com/USER and twitter.com/USER/media again.

If I have an option that does nothing, it's likely I copy-pasted it from the settings for another extractor. I didn't really type any of my options; they're all defaults plus copy-pastes of defaults from another extractor that I thought would work.

@Hrxn
Contributor

Hrxn commented Feb 5, 2022

If I have an option that does nothing, it's likely I copy-pasted it from the settings for another extractor. I didn't really type any of my options; they're all defaults plus copy-pastes of defaults from another extractor that I thought would work.

Don't do this, by the way. It's not a sane assumption to make. Even though some options actually are "universal" (like "filename", "directory", "archive", "skip" etc.), that is simply false for other options, because they are extractor-specific.

@left1000
Author

left1000 commented Feb 5, 2022

I've never done it on purpose, I think the problem mostly stems from my gallery.conf file having been created in early 2020 so it's probably not as nice as a brand new one would be?

@github-account1111

when you do twitter.com/username you get media from the user's own tweets and from other users' tweets they've retweeted.
With twitter.com/username/media you get all media from the user's own tweets only.

Wait that's how it works?
Then what's the point of https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#extractortwitterretweets?
I've always assumed twitter.com/username was a catch-all.

OT but this makes me wonder: given I generally want all of the author's own content and don't mind other authors' content if necessary (like in this case, where some of the old inaccessible stuff can be referenced in the author's own retweet), how should one determine the URL to pass for a given extractor, other than trial & error?

@left1000
Author

left1000 commented Dec 9, 2022

I always run

gallery-dl twitter.com/username twitter.com/username/media

because I'm never really sure what's correct

@nisehime

nisehime commented Dec 9, 2022

Then what's the point of https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#extractortwitterretweets?

If retweets are disabled, then retweets won't be downloaded from twitter.com/username. It also defines the behaviour of extractor.twitter.timeline.strategy.

If you want all of a user's content + retweets, then twitter.com/username twitter.com/username/media is the easiest way.

@github-account1111

Does #2226 (comment) mean this can be closed?
