download squarespace escaped video embeds #21859

Closed
wants to merge 1 commit into from

galgeek
Contributor

@galgeek galgeek commented Jul 22, 2019

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use the Preview tab to see what your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

Captures videos that Squarespace sites embed as escaped HTML, such as the YouTube embeds noted in issue #21294.

This can replace #21802, which @dstftw has already given an initial review.

Review thread on youtube_dl/extractor/generic.py (outdated, resolved)
@galgeek galgeek force-pushed the dl-escaped-embeds branch from 4b0e69c to 15ac8c8 Compare July 30, 2019 20:36
@galgeek galgeek changed the title from "capture 2019 squarespace and other escaped videos" to "download squarespace escaped video embeds" Jul 30, 2019
@galgeek
Contributor Author

galgeek commented Jul 30, 2019

Thanks, @dstftw!

I've updated the PR: it now finds all Squarespace video embeds and unescapes them before searching for the video URL.

I've checked it against these 3 URLs:

http://www.ootboxford.com/
https://www.harvardballetcompany.org/past-productions
http://www.immediategratification.org/

Review thread on youtube_dl/extractor/generic.py (outdated, resolved)
@galgeek
Contributor Author

galgeek commented Aug 3, 2019

Thanks, @dstftw!
I've updated, and the new code should be pretty close to what you're looking for.

@galgeek galgeek force-pushed the dl-escaped-embeds branch from afad27d to e1137c8 Compare August 21, 2019 21:54
@galgeek
Contributor Author

galgeek commented Aug 21, 2019

I've verified that #21294 is still present in youtube-dl version 2019.08.13, and this PR still fixes the issue.

@@ -2395,6 +2395,12 @@ def _real_extract(self, url):
# Unescaping the whole page allows to handle those cases in a generic way
webpage = compat_urllib_parse_unquote(webpage)

# unescape squarespace video embeds
sqs_videos = re.findall(r'<div class="[^"]*?sqs-video-wrapper[^>]*>', webpage)
Collaborator

@dstftw dstftw Aug 27, 2019


  1. re.sub.
  2. Regex must match any quote kinds, must allow other attributes before class.
  3. Add a test.
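
A minimal sketch of how the first two review points could fit together, offered as an illustration rather than as the code that was eventually merged. It assumes Python 3 and uses the standard library's `html.unescape` as a stand-in for whatever unescaping helper the extractor uses; the pattern name `SQS_VIDEO_TAG_RE` and the function `unescape_sqs_embeds` are hypothetical.

```python
import html
import re

# Hypothetical pattern (an assumption, not taken from the PR): matches a <div>
# opening tag whose attributes include class=...sqs-video-wrapper..., with any
# quote style (or none) and any other attributes before or after class.
SQS_VIDEO_TAG_RE = re.compile(
    r'<div\b[^>]*?\bclass=[^>]*?\bsqs-video-wrapper\b[^>]*>')


def unescape_sqs_embeds(webpage):
    """Unescape HTML entities inside each matching opening tag only."""
    # re.sub rewrites the page in one pass; the callback receives each matched
    # tag and returns it with entities such as &lt; and &quot; decoded.
    return SQS_VIDEO_TAG_RE.sub(
        lambda m: html.unescape(m.group(0)), webpage)


if __name__ == '__main__':
    sample = (
        "<div id=\"x\" class='sqs-video-wrapper' "
        'data-html="&lt;iframe src=&quot;https://www.youtube.com/embed/abc'
        '&quot;&gt;&lt;/iframe&gt;"></div>'
    )
    print(unescape_sqs_embeds(sample))
```

Because only the matched tag is passed to the callback, escaped text elsewhere on the page is left untouched, and the decoded `<iframe>` becomes visible to any later search for embed URLs.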

@galgeek
Contributor Author

galgeek commented Aug 29, 2019

@dstftw thanks!

I've updated the PR and verified that the latest changes resolve #21294.

@galgeek galgeek force-pushed the dl-escaped-embeds branch from 203e04f to ffb9bed Compare August 30, 2019 00:20
@dstftw dstftw closed this in 7cb51b5 Aug 31, 2019
@galgeek
Contributor Author

galgeek commented Aug 31, 2019

@dstftw thank you for your help with this!

It's great to see the fix in master, and I learned a lot, too.

meunierd referenced this pull request in meunierd/youtube-dl Feb 13, 2020
pareronia referenced this pull request in pareronia/youtube-dl Jun 22, 2020