[tf1] fix wat id extraction (closes ytdl-org#21365) #21372
youtube_dl/extractor/tf1.py
Outdated
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        wat_id = self._html_search_regex(
            r'(["\'])(?:https?:)?//www\.wat\.tv/embedframe/.*?(?P<id>\d{8})\1',
            r'"streamId":"(?P<id>\d{8})"',
- Do not remove the old pattern.
- Relax regex.
What should I do with the old pattern? Comment it out?
No. As already said you must keep the old pattern along with the new.
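One way to read the two requests above (keep the old pattern, relax the new one) is to try both patterns in order and take the first match. The sketch below uses only the standard `re` module. `find_wat_id` and `WAT_ID_PATTERNS` are hypothetical names for illustration, and the relaxed `streamId` pattern is an assumption about what "relax" means here (optional quotes, optional whitespace, any number of digits), not the reviewer's exact wording.

```python
import re

# Hypothetical tuple of patterns, tried in order.
WAT_ID_PATTERNS = (
    # Old pattern from the diff: id at the end of a wat.tv embed URL.
    r'(["\'])(?:https?:)?//www\.wat\.tv/embedframe/.*?(?P<id>\d{8})\1',
    # Relaxed form of the new pattern (an assumption): optional quotes,
    # optional whitespace around the colon, any number of digits.
    r'(["\']?)streamId\1\s*:\s*(["\']?)(?P<id>\d+)\2',
)


def find_wat_id(webpage):
    """Return the id from the first pattern that matches, or None."""
    for pattern in WAT_ID_PATTERNS:
        mobj = re.search(pattern, webpage)
        if mobj:
            return mobj.group('id')
    return None
```

Because both patterns expose the same named group `id`, the caller does not need to know which one matched.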
youtube_dl/extractor/tf1.py
Outdated
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        slug = self._search_regex(
            r'(?<=/)(?P<slug>[^/]+)(?=\.html$)',
            url, 'slug', group='slug', default='')
It's already extracted as video_id.
Can we have this merged?
            r'(["\'])(?:https?:)?//www\.wat\.tv/embedframe/.*?(?P<id>\d{8})\1',
            webpage, 'wat id', group='id')
        vids_data_string = self._html_search_regex(
            r'<script>\s*window\.__APOLLO_STATE__\s*=\s*(?P<vids_data_string>\{.*?\})\s*;?\s*</script>',
- Remove script tags, it's unique enough without them.
- Do not use named group when there is only one group.
- Curly braces don't need escaping.
- Do not capture empty dict.
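The four points above could translate into a pattern along these lines. This is a sketch under stated assumptions, not the pattern that was eventually committed: `SAMPLE_PAGE` is invented markup, `extract_state_string` is a hypothetical helper, and terminating the match on `;` is an assumption about how the assignment ends on the real page.

```python
import re

# Invented sample markup; the real page may differ.
SAMPLE_PAGE = (
    '<script>window.__APOLLO_STATE__ = '
    '{"Video:1": {"slug": "some-slug", "streamId": "13641379"}};</script>'
)

# No <script> tags, a single unnamed group, unescaped braces, and ".+?"
# rather than ".*?" so that an empty "{}" is not captured.
APOLLO_STATE_RE = r'__APOLLO_STATE__\s*=\s*({.+?})\s*;'


def extract_state_string(webpage):
    """Return the raw object literal assigned to __APOLLO_STATE__, or None."""
    mobj = re.search(APOLLO_STATE_RE, webpage)
    return mobj.group(1) if mobj else None
```

The lazy `.+?` stops at the first `}` that is followed by `;`, so a string value containing `};` would cut the capture short. That trade-off is worth keeping in mind when choosing the terminator.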
        if vids_data_string is not None:
            vids_data = self._parse_json(
                vids_data_string, video_id,
                transform_source=js_to_json)
Must not be fatal.
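"Not fatal" here presumably means that a parse failure should fall through to the regex fallback instead of aborting the extraction. In youtube-dl this is usually done with the `fatal=False` argument of `_parse_json`. The standalone sketch below shows the same idea with the standard `json` module; `parse_json_or_none` is a hypothetical helper, not part of the extractor API.

```python
import json


def parse_json_or_none(json_string):
    """Parse JSON, returning None instead of raising on missing or bad input."""
    if not json_string:
        return None
    try:
        return json.loads(json_string)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
```

A caller can then test the result for truthiness and move on to the next extraction strategy when it is `None`.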
            vids_data = self._parse_json(
                vids_data_string, video_id,
                transform_source=js_to_json)
            video_data = [v for v in vids_data.values()
video_data is totally useless. Write directly to the id variable when found.
                vids_data_string, video_id,
                transform_source=js_to_json)
            video_data = [v for v in vids_data.values()
                          if 'slug' in v and v['slug'] == video_id]
v may not be a dict. Use v.get('slug').
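Taken together with the earlier remark about dropping `video_data`, the lookup could be written as a single loop that checks the type of each value and returns the id as soon as the slug matches. This is a minimal sketch: `find_stream_id` is a hypothetical helper, and the `streamId` key is assumed from the pattern in the first hunk.

```python
def find_stream_id(state, slug):
    """Return the streamId of the first dict entry whose slug matches, else None."""
    for value in state.values():
        # Values in the state may be strings, lists or None, so check the
        # type before calling .get() on them.
        if isinstance(value, dict) and value.get('slug') == slug:
            return value.get('streamId')
    return None
```

With `state = {'ROOT_QUERY': 'x', 'Video:1': {'slug': 'a', 'streamId': '111'}}`, calling `find_stream_id(state, 'a')` yields `'111'`, and the non-dict entry is skipped without raising.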
What's missing to get this merged?