[tver] Add support for TVer #26662

Closed
wants to merge 14 commits into from

@tsukumijima tsukumijima commented Sep 21, 2020

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

Explanation of your pull request in arbitrary form goes here. Please make sure the description explains the purpose and effect of your pull request and is worded well enough to be understood. Provide as much context and examples as possible.


TVer is Japan's largest free distribution service for TV programs, operated mainly by five Japanese TV stations.
It distributes various programs such as Japanese anime, drama, news, and variety shows, and has more than 10 million active users.
This pull request adds a TVer extractor so that TVer videos can be downloaded with youtube-dl.

Note

  • TVer is a service for Japan, so I set _GEO_COUNTRIES = ['JP'] to fake a Japanese IP address in the X-Forwarded-For header, but videos may still not be downloadable from outside Japan.
  • The test run from test/test_download.py doesn't seem to handle encrypted HLS videos correctly. If pycryptodome is not installed, the error "ERROR: hlsnative has detected features it does not support, extraction will be delegated to ffmpeg" occurs.
    If pycryptodome is installed, the error "ValueError: Data must be padded to 16 byte boundary in CBC mode" occurs instead. I'm not sure why.
    To avoid this, I've added a 'skip' to _TESTS to skip the test (although most TVer videos become unavailable after a week or so, so after a while the test itself stops being meaningful...)
  • I am using Google Translate, so the text may be unnatural.
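
A rough sketch of the idea behind the first note: a request is sent with an X-Forwarded-For header carrying an address that looks Japanese. The address block and the helper name fake_forwarded_headers are assumptions made for illustration only. This is not youtube-dl's own implementation, which handles the header internally once _GEO_COUNTRIES is set.

```python
import ipaddress
import random

# Assumption: an example IPv4 block treated as "Japanese" for this sketch.
JP_BLOCK = ipaddress.ip_network('133.0.0.0/8')


def fake_forwarded_headers(block=JP_BLOCK, rng=random):
    """Return headers with a random X-Forwarded-For address taken from `block`."""
    # Pick a random offset inside the block and turn it into an address.
    offset = rng.randrange(block.num_addresses)
    return {'X-Forwarded-For': str(block[offset])}


headers = fake_forwarded_headers()
print(headers)
```

A server that checks the real connection address rather than this header will still reject the request, which is consistent with the caveat that downloads may not work from outside Japan.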

@tsukumijima tsukumijima marked this pull request as draft September 21, 2020 21:30
@tsukumijima tsukumijima marked this pull request as ready for review September 21, 2020 21:30
Comment on lines 29 to 62
{
    'url': 'https://tver.jp/corner/f0056997',  # 'corner'
    'md5': 'aac4e681dcdb775fc44497da4f7bdd05',  # MD5 hash of a short video downloaded by running youtube-dl with the --test option
    'info_dict': {
        'id': 'f0056997',  # TVer ID
        'display_id': 'ref:kanokari_10',  # Brightcove ID
        'ext': 'mp4',
        'title': '彼女、お借りします 第10話「友達の彼女」-トモカノ-',
        'description': 'バイトの初任給を何に使おうか考える和也だったが、ふと栗林のことが脳裏をよぎる。最近栗林の様子がおかしいと、木部から話を聞いていたのだ。ボーッとしていたり、女性不信のつぶやきをしているという。和也は意を決して、栗林を呼び出すことに。翌日、栗林が和也を待っていると──「駿君、だよね?」。待ち合わせ場所にやって来たのは、千鶴だった……!',
        'thumbnail': 'https://cf-images.ap-northeast-1.prod.boltdns.net/v1/jit/5102072605001/900216cc-2e97-4c19-93bb-1a531de358d6/main/1920x1080/12m18s37ms/match/image.jpg',
        'duration': 1476.075,
        'timestamp': 1599554409,
        'upload_date': '20200908',
        'uploader_id': '5102072605001',
    },
    'skip': 'Running from test_download.py doesn\'t seem to be able to handle encrypted HLS videos',
},
{
    'url': 'https://tver.jp/episode/76799350',  # 'episode'
    'md5': 'ad893db02b8a3e949216c463af7ce51e',  # MD5 hash of a short video downloaded by running youtube-dl with the --test option
    'info_dict': {
        'id': '76799350',  # TVer ID
        'display_id': '2366_2365_4533',  # Brightcove ID
        'ext': 'mp4',
        'title': '港時間 #49 神奈川県/リビエラシーボニアマリーナ 9月18日(金)放送分',
        'description': '【毎週金曜 よる12時15分から放送】\n\n日本のヨット文化 を育んできた三浦半島の西海岸、小網代湾にあるリビエラシーボニアマリーナ。昨年から始まったSailGPの日本チームを率いるヨット界のレジェンドに会いました。',
        'thumbnail': 'https://cf-images.ap-northeast-1.prod.boltdns.net/v1/jit/4394098883001/904361ca-40d3-4028-8478-8916b9a0ff49/main/1920x1080/58s80ms/match/image.jpg',
        'duration': 116.16,
        'timestamp': 1600052421,
        'upload_date': '20200914',
        'uploader_id': '4394098883001',
    },
    'skip': 'Running from test_download.py doesn\'t seem to be able to handle encrypted HLS videos',
},
Collaborator

Remove duplicate tests.

Author

It has been deleted.

from .brightcove import BrightcoveNewIE


class TVerIE(BrightcoveNewIE):
Collaborator

Do not inherit brightcove, you must delegate instead.

Author

TVer uses Brightcove, but some fields, such as the description, have to be extracted from TVer's own site.
I didn't know how to override the fields returned by BrightcoveNewIE in that case, so I ended up with this dirty implementation.
I have since found that I can delegate properly by setting '_type' to 'url_transparent' in the return value and supplying the fields extracted by TVerIE, such as 'description', so I rewrote the extractor significantly.
I didn't do enough preliminary research... I'm sorry.
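
A minimal sketch of the kind of return value described in this reply, assuming the Brightcove player URL has the shape matched by the players.brightcove.net pattern quoted in this review. The helper name build_transparent_result, the 'default_default' player/embed segment, and the 'BrightcoveNew' extractor key are assumptions for illustration. This is not the code from this pull request or the merged commit.

```python
# Assumption: a player URL of this shape is accepted by the Brightcove extractor.
BRIGHTCOVE_URL_TEMPLATE = (
    'https://players.brightcove.net/%s/default_default/index.html?videoId=%s')


def build_transparent_result(account_id, brightcove_video_id, description):
    """Build a result that hands the video URL to another extractor
    while supplying extra metadata found on the TVer site."""
    return {
        # Ask the framework to resolve 'url' with another extractor.
        '_type': 'url_transparent',
        'url': BRIGHTCOVE_URL_TEMPLATE % (account_id, brightcove_video_id),
        # Assumed key of the extractor that should handle the URL.
        'ie_key': 'BrightcoveNew',
        # Metadata taken from TVer rather than from Brightcove.
        'display_id': brightcove_video_id,
        'description': description,
    }


result = build_transparent_result(
    '5102072605001', 'ref:kanokari_10', 'description taken from the TVer page')
print(result['url'])
```

In an extractor, a dict like this would be returned from _real_extract instead of calling the parent class's _real_extract, so no subclassing of BrightcoveNewIE and no swapping of _VALID_URL is needed.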

'ext': 'mp4',
'title': '半沢直樹(新シリーズ) 第1話 子会社VS銀行!飛ばされた半沢の新たな下剋上が始まる',
'description': '大和田(香川照之)の不正を糾弾し、子会社へ出向を命じられた半沢直樹(堺雅人)は、東京セントラル証券営業企画部長に。ある日1500億円超の買収案件が舞い込むが…。',
'thumbnail': 'https://cf-images.ap-northeast-1.prod.boltdns.net/v1/jit/4031511847001/37b5f176-3989-48d9-81d1-4688e80c5531/main/1920x1080/34m10s16ms/match/image.jpg',
Collaborator

No exact URLs.

Author

Fixed.

'display_id': 'ref:hanzawa_naoki---s2----323-001', # Brightcove ID
'ext': 'mp4',
'title': '半沢直樹(新シリーズ) 第1話 子会社VS銀行!飛ばされた半沢の新たな下剋上が始まる',
'description': '大和田(香川照之)の不正を糾弾し、子会社へ出向を命じられた半沢直樹(堺雅人)は、東京セントラル証券営業企画部長に。ある日1500億円超の買収案件が舞い込むが…。',
Collaborator

md5

Author

Fixed.


def _real_extract(self, url):

    # extract video id
Collaborator

Remove all useless comments.

Author

Obviously excessive comments have been removed.
However, I think the code will be hard to understand with too few comments, so I kept the ones I felt were necessary.

Comment on lines 96 to 100
if self._downloader.params.get('verbose', False):
    self.to_screen('Video Information: %s' % video_info)
    self.to_screen('Brightcove Account ID: %s' % brightcove_account_id)
    self.to_screen('Brightcove Video ID: %s' % brightcove_video_id)
    self.to_screen('Brightcove URL: %s' % brightcove_url)
Collaborator

Remove all debug garbage.

Author

It has been deleted.

self.to_screen('Brightcove URL: %s' % brightcove_url)

# back up _VALID_URL
_VALID_URL = self._VALID_URL
Collaborator

No way.

Author

Fixed.

Comment on lines 105 to 107
# temporarily replace _VALID_URL
# prevent _VALID_URL from being the TVer URL pattern while the parent class's _real_extract() method runs
self._VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*(?P<content_type>video|playlist)Id=(?P<video_id>\d+|ref:[^&]+)'
Collaborator

No way.

Author

Fixed.

Comment on lines 109 to 110
# get video information
info_dict = super(TVerIE, self)._real_extract(brightcove_url)
Collaborator

No way.

Author

Fixed.

Comment on lines 117 to 127
# undo _VALID_URL
self._VALID_URL = _VALID_URL

# TVer ID
info_dict['id'] = video_id
# Brightcove ID
info_dict['display_id'] = brightcove_video_id
# select large thumbnail
info_dict['thumbnail'] = info_dict.get('thumbnail').replace('160x90', '1920x1080')
# description
info_dict['description'] = description
Collaborator

No way. See other extractors for how to delegate properly.

Author

@tsukumijima tsukumijima Sep 29, 2020

I think I fixed all the parts you pointed out.
Thank you for your review.

@tsukumijima tsukumijima requested a review from dstftw September 29, 2020 21:35
@tsukumijima
Author

I'm sorry to bother you while you are busy, but would you please review it again?
Or should I reopen the PR?

@remitamine remitamine closed this in 64554c1 Dec 2, 2020
@Paun

Paun commented Dec 3, 2020

Oh finally!!

I am very glad to see this will be supported soon.

Nice work.

@tsukumijima
Author

The merged changes are a bit different from the code I wrote, but in any case, I'm happy to see TVer support added to youtube-dl.
I haven't confirmed that it works yet, but as far as I can tell from the code, it should.

@tsukumijima
Author

I've read the code and it's certainly more verbose and nice than my code.
I also admire the ability to write this code in just a few hours after receiving your request.

ThirumalaiK pushed a commit to ThirumalaiK/youtube-dl that referenced this pull request Jan 28, 2021