add --dump-regexps option to print the regular expressions used to match supported sites #52

Closed
chocolateboy wants to merge 2 commits


chocolateboy
Contributor

I use youtube-dl as an optional downloader in a project where I don't have the luxury of invoking it on every URL to see if a site is supported. Instead, I keep a static list of the regexes used to match supported sites and periodically update it by hand (well, by script and hand). Obviously, this is fragile.

This patch allows youtube-dl to dump these regexes so I don't have to scrape them. Maybe someone else will find it useful.
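As a rough illustration of the idea (not the actual patch), dumping the patterns might look something like the sketch below. The `InfoExtractor` stand-in, the two example extractors, their URL patterns, and the `dump_regexps` helper are all hypothetical; only the `_VALID_URL` attribute name comes from this discussion.

```python
import re


class InfoExtractor:
    """Hypothetical stand-in for an extractor base class (an assumption)."""
    _VALID_URL = None

    @classmethod
    def suitable(cls, url):
        # Assumed behaviour: a URL is supported if it matches _VALID_URL.
        return re.match(cls._VALID_URL, url) is not None


class ExampleVideoIE(InfoExtractor):
    # Made-up pattern for a made-up site.
    _VALID_URL = r'(?:https?://)?(?:www\.)?example\.com/watch/(\d+)'


class ExampleClipIE(InfoExtractor):
    # Made-up pattern for a second made-up site.
    _VALID_URL = r'(?:https?://)?clips\.example\.org/v/([\w-]+)'


def dump_regexps(extractors):
    """Return the _VALID_URL pattern of every extractor that defines one."""
    return [ie._VALID_URL for ie in extractors
            if getattr(ie, '_VALID_URL', None)]


if __name__ == '__main__':
    # Print one pattern per line so another tool can consume the list.
    for pattern in dump_regexps([ExampleVideoIE, ExampleClipIE]):
        print(pattern)
```

A consumer could then compile each printed line with `re.compile` and test URLs locally instead of invoking the downloader once per URL.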

@chocolateboy
Contributor Author

P.S. Yes, I agree to release it (and any other contributions) to the public domain. And, if you apply it, perhaps this option and --dump-user-agent should be placed under a "Developer Options" heading so they don't clutter up the main options.

@rg3
Collaborator

rg3 commented Jan 7, 2011

I don't strongly oppose such an option, but at the moment I'd prefer not to implement this feature. Right now it depends on _VALID_URL, which is an internal field used for convenience; people include it when they write an InfoExtractor because they copy what they see in other InfoExtractors. However, I have often thought about splitting a complex regexp into several simpler ones and modifying the "suitable()" method to check each one in turn. Moreover, "suitable()" may not depend on regular expressions at all and could, for example, just check that the URL starts with a given string.

I know the code you have implemented is currently small and works, yet I still think it exposes an internal implementation decision of InfoExtractors that shouldn't be exposed to users. Sorry.
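To make the concern above concrete, here is a minimal sketch of two extractors whose "suitable()" does not rely on a single _VALID_URL. Both classes, their attribute names (`_URL_PATTERNS`, `_PREFIX`), the URLs, and the `find_extractor` helper are hypothetical and not taken from youtube-dl.

```python
import re


class SplitRegexpIE:
    """Hypothetical extractor that checks several simpler patterns in turn."""
    _URL_PATTERNS = (
        r'(?:https?://)?(?:www\.)?example\.com/watch/\d+',
        r'(?:https?://)?(?:www\.)?example\.com/embed/\d+',
    )

    @classmethod
    def suitable(cls, url):
        # Accept the URL if any one of the patterns matches.
        return any(re.match(p, url) for p in cls._URL_PATTERNS)


class PrefixIE:
    """Hypothetical extractor that uses no regular expression at all."""
    _PREFIX = 'http://media.example.org/'

    @classmethod
    def suitable(cls, url):
        # A plain string-prefix check instead of a regexp.
        return url.startswith(cls._PREFIX)


def find_extractor(url, extractors):
    """Return the first extractor whose suitable() accepts the URL, else None."""
    for ie in extractors:
        if ie.suitable(url):
            return ie
    return None
```

Neither class defines _VALID_URL, so a dump built on that field would list nothing for them, while code that only calls "suitable()" keeps working.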

@chocolateboy
Contributor Author

That's what I thought you'd say. Oh well, worth a shot :-)

@rg3
Collaborator

rg3 commented Jan 7, 2011

:)

This pull request was closed.
2 participants