[pornflip] extractor update Closes:#18485 #21169

Closed
wants to merge 2 commits into from

Conversation

nindogo

@nindogo nindogo commented May 21, 2019

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)
    I modified the original extractor to get it to work, so part of the code is mine and part is from the original developer.

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

pornflip.py is broken because the website changed: it now serves only DASH video.
This is an update of the original extractor to get pornflip working again.
Since all the videos on the site are now multipart DASH files, none of the existing tests pass. However, the code works, as the following test run shows:

root@DESKTOP:pornflip# python3.7 -m youtube_dl --verbose https://www.pornflip.com/k27gGfg7cqt/green-hair
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.pornflip.com/k27gGfg7cqt/green-hair']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.05.20
[debug] Git HEAD: 6bee1627a
[debug] Python version 3.7.3 (CPython) - Linux-4.4.0-17134-Microsoft-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.6, ffprobe 3.4.6
[debug] Proxy map: {}
[PornFlip] k27gGfg7cqt: Downloading webpage
[PornFlip] k27gGfg7cqt: Downloading MPD manifest
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://vs15.userscontent.net/dash/1795994/manifest.mpd?seclink=aq7E9TuwOEAym5jqeG4yZw&sectime=1558406072'
[dashsegments] Total fragments: 250
[download] Destination: Green hair-k27gGfg7cqt.f1795994-f3-v1-x3.mp4
[download] 100% of 201.53MiB in 13:14
[debug] Invoking downloader on 'https://vs15.userscontent.net/dash/1795994/manifest.mpd?seclink=aq7E9TuwOEAym5jqeG4yZw&sectime=1558406072'
[dashsegments] Total fragments: 250
[download] Destination: Green hair-k27gGfg7cqt.f1795994-f1-a1-x3.m4a
[download] 100% of 15.39MiB in 05:06
[ffmpeg] Merging formats into "Green hair-k27gGfg7cqt.mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:Green hair-k27gGfg7cqt.f1795994-f3-v1-x3.mp4' -i 'file:Green hair-k27gGfg7cqt.f1795994-f1-a1-x3.m4a' -c copy -map 0:v:0 -map 1:a:0 'file:Green hair-k27gGfg7cqt.temp.mp4'
Deleting original file Green hair-k27gGfg7cqt.f1795994-f3-v1-x3.mp4 (pass -k to keep)
Deleting original file Green hair-k27gGfg7cqt.f1795994-f1-a1-x3.m4a (pass -k to keep)

Change the way the website is parsed and how the data is presented to
YouTubeDl.
@nindogo nindogo changed the title [pornflip] extractor update fixes #18485 [pornflip] extractor update fixes [#18485](https://github.com/ytdl-org/youtube-dl/issues/18485 "pornflip.com unsupported") May 21, 2019
@nindogo nindogo changed the title [pornflip] extractor update fixes [#18485](https://github.com/ytdl-org/youtube-dl/issues/18485 "pornflip.com unsupported") [pornflip] extractor update fixes fixes:#18485 May 21, 2019
@nindogo nindogo changed the title [pornflip] extractor update fixes fixes:#18485 [pornflip] extractor update fixes #18485 May 21, 2019
@nindogo nindogo changed the title [pornflip] extractor update fixes #18485 [pornflip] extractor update Closes:#18485 May 21, 2019
@nindogo nindogo changed the title [pornflip] extractor update Closes:#18485 [pornflip] extractor update Closes: #18485 May 21, 2019
@nindogo nindogo changed the title [pornflip] extractor update Closes: #18485 [pornflip] extractor update Closes:#18485 May 22, 2019
@dstftw dstftw closed this in f856816 May 23, 2019

nindogo commented May 24, 2019

Thanks for closing this PR by modifying utils.py (strip_or_none()) and common.py to get pornflip to work again. I have tested the three URL formats of pornflip.com, and the latest youtube-dl master works in all three cases (though with different media names, depending on the header and URL format).

Could you also consider looking at my other submission in this PR?


nindogo commented May 25, 2019

Hi @dstftw ,

tl;dr
In the wild, the MPEG-DASH mpd file appears in a src, data-src or data-video-src attribute of the video element.
On pornflip.com, several attributes of the video element may contain an mpd file. When more than one of them is present, they all appear in the same video element, in the following order:

  1. src always appears
  2. data-src always appears
  3. data-src360 sometimes appears
  4. data-src720 sometimes appears

On pornflip, the data-src mpd always lists every available video quality, while the src mpd lists them all only when there is no 720p video. We should therefore always select the mpd in the data-src attribute, but youtube-dl picks the mpd in the src attribute and sticks with it without also checking the one in data-src. This means youtube-dl, as it is, cannot get HD videos from pornflip.
end tl;dr
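
As a rough illustration of the selection I am suggesting, here is a minimal sketch. It is not youtube-dl code: `PREFERRED_ATTRS` and `pick_manifest_url` are hypothetical names, the URLs are placeholders, and the preference order is only my assumption based on the behaviour described above.

```python
# Hypothetical helper, not part of youtube-dl.
# Assumption: data-src lists every quality, so it is tried first;
# src is kept only as a last resort.
PREFERRED_ATTRS = ('data-src', 'data-src720', 'data-src360', 'src')


def pick_manifest_url(video_attrs):
    """Return the first non-empty manifest URL, following PREFERRED_ATTRS."""
    for name in PREFERRED_ATTRS:
        # Attribute values on the site can carry stray whitespace.
        url = (video_attrs.get(name) or '').strip()
        if url:
            return url
    return None


# Placeholder attribute values standing in for a parsed <video> element.
attrs = {
    'src': ' https://cdn.example.invalid/low/manifest.mpd',
    'data-src': 'https://vs.example.invalid/all/manifest.mpd',
}
print(pick_manifest_url(attrs))  # the data-src URL wins over src
```

With only `src` present, the same function falls back to it, so pages without a data-src attribute would keep working.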

I have looked at this change one more time, and there is a small bug I would like to draw your attention to. It occurs when the video in question is also available in HD.

Take this video

Its video element appears as follows (modified a bit so it is easier to read):

<video class="mediaPlayer" id="mediaPlayer"  
        src=" https://cdn-eu-v1.userscontent.net/dash2/123/3369/1051c532ba16ac5e2c51fe5f4f1d571a.mp4/manifest.mpd?seclink=2FMe0PXTcEcRaLvNfsTw2w&amp;sectime=1559225523" 
    data-src="https://vs19.userscontent.net/dash/1233369/manifest.mpd?seclink=uUwQ7XxPXq9U5DmTJth3oA&amp;sectime=1559225523" 
 data-src360="https://cdn-eu-v1.userscontent.net/dash2/123/3369/1051c532ba16ac5e2c51fe5f4f1d571a.mp4/manifest.mpd?seclink=2FMe0PXTcEcRaLvNfsTw2w&amp;sectime=1559225523" 
 data-src720="https://cdn-eu-v1.userscontent.net/dash2/123/3369/a9acf8ac3fc5b3f2ed8bddfa6f50hd4b.mp4/manifest.mpd?seclink=YY0BmMTMFTNx2noziN90sA&amp;sectime=1559225523" 
data-vast-url="https://syndication.exosrv.com/splash.php?idzone=3109786" data-vast-impr-start="/ads/banner?please_no_web_redirect&amp;banners[]=22" data-vast-impr-shown="/ads/banner?please_no_web_redirect&amp;banners[]=-220"
 data-matrix="https://img1.pornflip.com/thumbs/123/1233369/" data-embed="https://www.pornflip.com/embed/BETFpNCrhOR" data-link="https://www.pornflip.com/v/BETFpNCrhOR"
 data-id="611799"
 data-proto="https" data-related="https://www.pornflip.com/api/get_related_videos/611799"
 data-qualities="360|480|720" data-qualities-convert="" data-version="1"></video>

As can be seen, there are several Media Presentation Description (MPD) URLs, held in these attributes:

  1. src
  2. data-src
  3. data-src360
  4. data-src720

Closer inspection shows that src and data-src360 hold the same URL, while data-src and data-src720 each differ from it and from each other.
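
For anyone who wants to repeat this comparison, here is a small sketch using only the Python standard library. The `VideoAttrCollector` class and the shortened `SAMPLE` markup are made up for illustration (the URLs are placeholders, not the real ones), and this is not how youtube-dl parses the page.

```python
from html.parser import HTMLParser


class VideoAttrCollector(HTMLParser):
    """Collect the attributes of the first <video> element (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        if tag == 'video' and not self.attrs:
            # HTMLParser already unescapes &amp; in attribute values;
            # strip() removes stray whitespace such as the leading space in src.
            self.attrs = {k: (v or '').strip() for k, v in attrs}


# Shortened, placeholder version of the element shown above.
SAMPLE = '''<video id="mediaPlayer"
    src=" https://cdn.example.invalid/a.mp4/manifest.mpd?seclink=x&amp;sectime=1"
    data-src="https://vs.example.invalid/dash/1/manifest.mpd?seclink=y&amp;sectime=1"
    data-src360="https://cdn.example.invalid/a.mp4/manifest.mpd?seclink=x&amp;sectime=1"
    data-src720="https://cdn.example.invalid/b.mp4/manifest.mpd?seclink=z&amp;sectime=1"></video>'''

collector = VideoAttrCollector()
collector.feed(SAMPLE)
a = collector.attrs

same_as_360 = a['src'] == a['data-src360']
distinct_urls = {a['src'], a['data-src'], a['data-src720']}
print(same_as_360, len(distinct_urls))
```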

In fact, for this video the src mpd only offers resolutions up to 272p:

<?xml version="1.0"?>
<MPD
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="urn:mpeg:dash:schema:mpd:2011"
    xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
    type="static"
    mediaPresentationDuration="PT2270.333S"
    minBufferTime="PT4S"
    profiles="urn:mpeg:dash:profile:isoff-main:2011">
  <Period>
    <AdaptationSet
        id="1"
        segmentAlignment="true"
        maxWidth="480"
        maxHeight="272"
        maxFrameRate="24">
        <SegmentTemplate
            timescale="1000"
            media="https://cdn-eu-v1.userscontent.net/dash2/123/3369/1051c532ba16ac5e2c51fe5f4f1d571a.mp4/fragment-$Number$-$RepresentationID$.m4s"
            initialization="https://cdn-eu-v1.userscontent.net/dash2/123/3369/1051c532ba16ac5e2c51fe5f4f1d571a.mp4/init-$RepresentationID$.mp4"
            startNumber="1">
            <SegmentTimeline>
                <S d="3500"/>
                <S d="4000" r="565"/>
                <S d="2833"/>
            </SegmentTimeline>
        </SegmentTemplate>
      <Representation
          id="v1-x3"
          mimeType="video/mp4"
          codecs="avc1.4d4015"
          width="480"
          height="272"
          frameRate="24"
          sar="1:1"
          startWithSAP="1"
          bandwidth="352652">
      </Representation>
    </AdaptationSet>
    <AdaptationSet
        id="2"
        segmentAlignment="true">
      <AudioChannelConfiguration
          schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011"
          value="1"/>
        <SegmentTemplate
            timescale="1000"
            media="https://cdn-eu-v1.userscontent.net/dash2/123/3369/1051c532ba16ac5e2c51fe5f4f1d571a.mp4/fragment-$Number$-$RepresentationID$.m4s"
            initialization="https://cdn-eu-v1.userscontent.net/dash2/123/3369/1051c532ba16ac5e2c51fe5f4f1d571a.mp4/init-$RepresentationID$.mp4"
            startNumber="1">
            <SegmentTimeline>
                <S d="3500"/>
                <S d="4000" r="565"/>
                <S d="2692"/>
            </SegmentTimeline>
        </SegmentTemplate>
      <Representation
          id="a1-x3"
          mimeType="audio/mp4"
          codecs="mp4a.40.2"
          audioSamplingRate="44100"
          startWithSAP="1"
          bandwidth="127938">
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>

data-src has all the formats described:

<?xml version="1.0"?>
<MPD
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="urn:mpeg:dash:schema:mpd:2011"
    xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
    type="static"
    mediaPresentationDuration="PT2270.333S"
    minBufferTime="PT4S"
    profiles="urn:mpeg:dash:profile:isoff-main:2011">
  <Period>
    <AdaptationSet
        id="1"
        segmentAlignment="true"
        maxWidth="1280"
        maxHeight="720"
        maxFrameRate="24">
        <SegmentTemplate
            timescale="1000"
            media="https://vs19.userscontent.net/dash/1233369/fragment-$Number$-$RepresentationID$.m4s"
            initialization="https://vs19.userscontent.net/dash/1233369/init-$RepresentationID$.mp4"
            duration="4000"
            startNumber="1">
        </SegmentTemplate>
      <Representation
          id="f1-v1-x3"
          mimeType="video/mp4"
          codecs="avc1.4d4015"
          width="480"
          height="272"
          frameRate="24"
          sar="1:1"
          startWithSAP="1"
          bandwidth="352652">
      </Representation>
      <Representation
          id="f2-v1-x3"
          mimeType="video/mp4"
          codecs="avc1.4d401e"
          width="640"
          height="360"
          frameRate="24"
          sar="1:1"
          startWithSAP="1"
          bandwidth="977688">
      </Representation>
      <Representation
          id="f3-v1-x3"
          mimeType="video/mp4"
          codecs="avc1.4d401f"
          width="1280"
          height="720"
          frameRate="24"
          sar="1:1"
          startWithSAP="1"
          bandwidth="1697896">
      </Representation>
    </AdaptationSet>
    <AdaptationSet
        id="2"
        segmentAlignment="true">
      <AudioChannelConfiguration
          schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011"
          value="1"/>
        <SegmentTemplate
            timescale="1000"
            media="https://vs19.userscontent.net/dash/1233369/fragment-$Number$-$RepresentationID$.m4s"
            initialization="https://vs19.userscontent.net/dash/1233369/init-$RepresentationID$.mp4"
            duration="4000"
            startNumber="1">
        </SegmentTemplate>
      <Representation
          id="f1-a1-x3"
          mimeType="audio/mp4"
          codecs="mp4a.40.2"
          audioSamplingRate="44100"
          startWithSAP="1"
          bandwidth="127938">
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>

data-src720 has only the 720p video described:

<?xml version="1.0"?>
<MPD
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="urn:mpeg:dash:schema:mpd:2011"
    xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
    type="static"
    mediaPresentationDuration="PT2270.333S"
    minBufferTime="PT4S"
    profiles="urn:mpeg:dash:profile:isoff-main:2011">
  <Period>
    <AdaptationSet
        id="1"
        segmentAlignment="true"
        maxWidth="1280"
        maxHeight="720"
        maxFrameRate="24">
        <SegmentTemplate
            timescale="1000"
            media="https://cdn-eu-v1.userscontent.net/dash2/123/3369/a9acf8ac3fc5b3f2ed8bddfa6f50hd4b.mp4/fragment-$Number$-$RepresentationID$.m4s"
            initialization="https://cdn-eu-v1.userscontent.net/dash2/123/3369/a9acf8ac3fc5b3f2ed8bddfa6f50hd4b.mp4/init-$RepresentationID$.mp4"
            startNumber="1">
            <SegmentTimeline>
                <S d="3500"/>
                <S d="4000" r="565"/>
                <S d="2833"/>
            </SegmentTimeline>
        </SegmentTemplate>
      <Representation
          id="v1-x3"
          mimeType="video/mp4"
          codecs="avc1.4d401f"
          width="1280"
          height="720"
          frameRate="24"
          sar="1:1"
          startWithSAP="1"
          bandwidth="1697896">
      </Representation>
    </AdaptationSet>
    <AdaptationSet
        id="2"
        segmentAlignment="true">
      <AudioChannelConfiguration
          schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011"
          value="1"/>
        <SegmentTemplate
            timescale="1000"
            media="https://cdn-eu-v1.userscontent.net/dash2/123/3369/a9acf8ac3fc5b3f2ed8bddfa6f50hd4b.mp4/fragment-$Number$-$RepresentationID$.m4s"
            initialization="https://cdn-eu-v1.userscontent.net/dash2/123/3369/a9acf8ac3fc5b3f2ed8bddfa6f50hd4b.mp4/init-$RepresentationID$.mp4"
            startNumber="1">
            <SegmentTimeline>
                <S d="3500"/>
                <S d="4000" r="565"/>
                <S d="2692"/>
            </SegmentTimeline>
        </SegmentTemplate>
      <Representation
          id="a1-x3"
          mimeType="audio/mp4"
          codecs="mp4a.40.2"
          audioSamplingRate="44100"
          startWithSAP="1"
          bandwidth="127938">
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>

This means that, to get the best video on the site, youtube-dl has to use the data-src .mpd file instead of the src one.
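
The difference between the manifests can also be checked programmatically. The sketch below is only an illustration: `video_heights` is a hypothetical helper, and the two inline manifests are heavily trimmed stand-ins for the src and data-src mpd files above (only the Representation heights are kept).

```python
import xml.etree.ElementTree as ET

# Namespace used by the MPD files shown above.
NS = {'mpd': 'urn:mpeg:dash:schema:mpd:2011'}


def video_heights(mpd_xml):
    """Return the sorted heights of all video Representations in an MPD string."""
    root = ET.fromstring(mpd_xml)
    heights = []
    for rep in root.iterfind('.//mpd:Representation', NS):
        if rep.get('mimeType', '').startswith('video/') and rep.get('height'):
            heights.append(int(rep.get('height')))
    return sorted(heights)


# Trimmed stand-in for the src manifest (one video quality).
SRC_MPD = '''<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"><Period>
<AdaptationSet id="1"><Representation id="v1-x3" mimeType="video/mp4" width="480" height="272"/></AdaptationSet>
<AdaptationSet id="2"><Representation id="a1-x3" mimeType="audio/mp4"/></AdaptationSet>
</Period></MPD>'''

# Trimmed stand-in for the data-src manifest (three video qualities).
DATA_SRC_MPD = '''<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"><Period>
<AdaptationSet id="1">
<Representation id="f1-v1-x3" mimeType="video/mp4" width="480" height="272"/>
<Representation id="f2-v1-x3" mimeType="video/mp4" width="640" height="360"/>
<Representation id="f3-v1-x3" mimeType="video/mp4" width="1280" height="720"/>
</AdaptationSet>
<AdaptationSet id="2"><Representation id="f1-a1-x3" mimeType="audio/mp4"/></AdaptationSet>
</Period></MPD>'''

print(video_heights(SRC_MPD))       # [272]
print(video_heights(DATA_SRC_MPD))  # [272, 360, 720]
```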

meunierd referenced this pull request in meunierd/youtube-dl Dec 27, 2019
meunierd referenced this pull request in meunierd/youtube-dl Feb 13, 2020