[ARD:mediathek] Fix title and description extraction #18371

Merged (2 commits) on Dec 6, 2018

@goggle (Contributor) commented Dec 2, 2018

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

The [ARD:mediathek] extractor has recently been failing for certain videos because youtube-dl cannot extract the title. This PR adds a new regex for title extraction and a new regex for description extraction, which is used, e.g., in this video: https://www.ardmediathek.de/tv/Dokumentarfilm/Das-Salz-der-Erde-Sebasti%C3%A3o-Salgado-im/SWR-Fernsehen/Video?bcastId=1105036&documentId=57927466

This should close #18349, although I cannot explain the strange behavior I described there: #18349 (comment)

            'description', webpage, 'meta description', default=None)
        if description is None:
            description = self._html_search_regex(
                r'<p\s+class="teasertext">(.*?)</p>', webpage, 'teaser text')
Collaborator

Must not be fatal. Do not capture empty string.
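
To illustrate the two points in this review comment, here is a minimal standalone sketch using only Python's `re` module. The helper name `extract_teaser` is hypothetical and is not part of youtube-dl. It assumes "not fatal" means returning `None` instead of raising when the teaser paragraph is missing, and it uses `.+?` instead of `.*?` so that an empty paragraph is not captured. Within youtube-dl itself, the same effect would presumably be achieved by passing `default=None` or `fatal=False` to `_html_search_regex`, but that is an assumption about how the fix would look, not a quote from the merged commit.

```python
import re

# `.+?` requires at least one character, so an empty teaser
# paragraph does not produce an empty-string capture.
TEASER_RE = re.compile(r'<p\s+class="teasertext">(.+?)</p>')


def extract_teaser(webpage):
    """Return the teaser text, or None if it is missing or blank.

    Hypothetical helper: a missing description is treated as
    optional metadata rather than an extraction error.
    """
    match = TEASER_RE.search(webpage)
    if match is None:
        return None  # non-fatal: the caller can carry on without it
    text = match.group(1).strip()
    return text or None  # whitespace-only content counts as missing
```

With this shape, a page without a teaser paragraph leaves the description unset rather than aborting the whole extraction.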

@dstftw dstftw merged commit 8c58797 into ytdl-org:master Dec 6, 2018
Successfully merging this pull request may close these issues.

[ARD:mediathek] RegexNotFoundError