mirror of
https://github.com/ytdl-org/youtube-dl.git
synced 2025-07-22 21:31:40 -05:00
Compare commits
77 Commits
2017.03.06
...
2017.03.24
Author | SHA1 | Date | |
---|---|---|---|
a3ccd6bd11 | |||
7963b6cba8 | |||
bea7af6947 | |||
a5d783f525 | |||
d0572557c2 | |||
52d5ecabd5 | |||
b0f7f21cb9 | |||
579c99a284 | |||
ca5ed022e9 | |||
391d076d7c | |||
c183e14f89 | |||
093dad9e25 | |||
e8686e51d7 | |||
8e5a7c5e67 | |||
e1e35d1ac6 | |||
21fbf0f955 | |||
97952bdb78 | |||
8a8cc339b6 | |||
957f453429 | |||
0e9a73e612 | |||
0ecdd3adbd | |||
9487ce03e9 | |||
45e6ad21b4 | |||
68220649fa | |||
46b18f2349 | |||
772b5ff57f | |||
f68ef1e2ab | |||
febfe1e262 | |||
5f0daab1ca | |||
2a721cdff2 | |||
e7a51a4c02 | |||
3e5856d860 | |||
ea883a687c | |||
7f3590c43b | |||
7d539ee10a | |||
6ad476079d | |||
0efbc6b56d | |||
21bfcd3d6e | |||
b51dc9db0e | |||
a309684285 | |||
ba448445b8 | |||
5db83d79bf | |||
2a751e137f | |||
398887b4c0 | |||
66bf351f80 | |||
9d08963022 | |||
e313d209c2 | |||
ff9d509d20 | |||
c1795ca6c8 | |||
8c99623259 | |||
57b0ddb35f | |||
a28f8d7396 | |||
7049799470 | |||
4605c94d1a | |||
a8e687a4da | |||
f9e5c92c94 | |||
c2ee861c6d | |||
bd34c32bd7 | |||
f802c48660 | |||
76bee08fe7 | |||
2913821723 | |||
0e7f9a9b48 | |||
0cf2352e85 | |||
0f6b87d067 | |||
d7344d33b1 | |||
b08cc749d6 | |||
b68a812ea8 | |||
2e76bdc850 | |||
fe646a2f10 | |||
9df53ea36e | |||
d7d7f84c95 | |||
dccd0ab35d | |||
80146dcc6c | |||
e30ccf7047 | |||
54a3a8827b | |||
92cb5763f4 | |||
da92da4b88 |
6
.github/ISSUE_TEMPLATE.md
vendored
6
.github/ISSUE_TEMPLATE.md
vendored
@ -6,8 +6,8 @@
|
||||
|
||||
---
|
||||
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.03.06*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.03.06**
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.03.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.03.24**
|
||||
|
||||
### Before submitting an *issue* make sure you have:
|
||||
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
||||
@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
|
||||
[debug] User config: []
|
||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||
[debug] youtube-dl version 2017.03.06
|
||||
[debug] youtube-dl version 2017.03.24
|
||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||
[debug] Proxy map: {}
|
||||
|
3
AUTHORS
3
AUTHORS
@ -207,3 +207,6 @@ Marek Rusinowski
|
||||
Tobias Gruetzmacher
|
||||
Olivier Bilodeau
|
||||
Lars Vierbergen
|
||||
Juanjo Benages
|
||||
Xiao Di Guan
|
||||
Thomas Winant
|
||||
|
84
ChangeLog
84
ChangeLog
@ -1,3 +1,87 @@
|
||||
version 2017.03.24
|
||||
|
||||
Extractors
|
||||
- [9c9media] Remove mp4 URL extraction request
|
||||
+ [bellmedia] Add support for etalk.ca and space.ca (#12447)
|
||||
* [channel9] Fix extraction (#11323)
|
||||
* [cloudy] Fix extraction (#12525)
|
||||
+ [hbo] Add support for free episode URLs and new formats extraction (#12519)
|
||||
* [condenast] Fix extraction and style (#12526)
|
||||
* [viu] Relax URL regular expression (#12529)
|
||||
|
||||
|
||||
version 2017.03.22
|
||||
|
||||
Extractors
|
||||
- [pluralsight] Omit module title from video title (#12506)
|
||||
* [pornhub] Decode obfuscated video URL (#12470, #12515)
|
||||
* [senateisvp] Allow https URL scheme for embeds (#12512)
|
||||
|
||||
|
||||
version 2017.03.20
|
||||
|
||||
Core
|
||||
+ [YoutubeDL] Allow multiple input URLs to be used with stdout (-) as
|
||||
output template
|
||||
+ [adobepass] Detect and output error on authz token extraction (#12472)
|
||||
|
||||
Extractors
|
||||
+ [bostonglobe] Add extractor for bostonglobe.com (#12099)
|
||||
+ [toongoggles] Add support for toongoggles.com (#12171)
|
||||
+ [medialaan] Add support for Medialaan sites (#9974, #11912)
|
||||
+ [discoverynetworks] Add support for more domains and bypass geo restiction
|
||||
* [openload] Fix extraction (#10408)
|
||||
|
||||
|
||||
version 2017.03.16
|
||||
|
||||
Core
|
||||
+ [postprocessor/ffmpeg] Add support for flac
|
||||
+ [extractor/common] Extract SMIL formats from jwplayer
|
||||
|
||||
Extractors
|
||||
+ [generic] Add forgotten return for jwplayer formats
|
||||
* [redbulltv] Improve extraction
|
||||
|
||||
|
||||
version 2017.03.15
|
||||
|
||||
Core
|
||||
* Fix missing subtitles if --add-metadata is used (#12423)
|
||||
|
||||
Extractors
|
||||
* [facebook] Make title optional (#12443)
|
||||
+ [mitele] Add support for ooyala videos (#12430)
|
||||
* [openload] Fix extraction (#12435, #12446)
|
||||
* [streamable] Update API URL (#12433)
|
||||
+ [crunchyroll] Extract season name (#12428)
|
||||
* [discoverygo] Bypass geo restriction
|
||||
+ [discoverygo:playlist] Add support for playlists (#12424)
|
||||
|
||||
|
||||
version 2017.03.10
|
||||
|
||||
Extractors
|
||||
* [generic] Make title optional for jwplayer embeds (#12410)
|
||||
* [wdr:maus] Fix extraction (#12373)
|
||||
* [prosiebensat1] Improve title extraction (#12318, #12327)
|
||||
* [dplayit] Separate and rewrite extractor and bypass geo restriction (#12393)
|
||||
* [miomio] Fix extraction (#12291, #12388, #12402)
|
||||
* [telequebec] Fix description extraction (#12399)
|
||||
* [openload] Fix extraction (#12357)
|
||||
* [brightcove:legacy] Relax videoPlayer validation check (#12381)
|
||||
|
||||
|
||||
version 2017.03.07
|
||||
|
||||
Core
|
||||
* Metadata are now added after conversion (#5594)
|
||||
|
||||
Extractors
|
||||
* [soundcloud] Update client id (#12376)
|
||||
* [openload] Fix extraction (#10408, #12357)
|
||||
|
||||
|
||||
version 2017.03.06
|
||||
|
||||
Core
|
||||
|
@ -375,8 +375,9 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
|
||||
(requires ffmpeg or avconv and ffprobe or
|
||||
avprobe)
|
||||
--audio-format FORMAT Specify audio format: "best", "aac",
|
||||
"vorbis", "mp3", "m4a", "opus", or "wav";
|
||||
"best" by default; No effect without -x
|
||||
"flac", "mp3", "m4a", "opus", "vorbis", or
|
||||
"wav"; "best" by default; No effect without
|
||||
-x
|
||||
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
|
||||
a value between 0 (better) and 9 (worse)
|
||||
for VBR or a specific bitrate like 128K
|
||||
|
@ -108,6 +108,7 @@
|
||||
- **blinkx**
|
||||
- **Bloomberg**
|
||||
- **BokeCC**
|
||||
- **BostonGlobe**
|
||||
- **Bpb**: Bundeszentrale für politische Bildung
|
||||
- **BR**: Bayerischer Rundfunk Mediathek
|
||||
- **BravoTV**
|
||||
@ -208,10 +209,13 @@
|
||||
- **Digiteka**
|
||||
- **Discovery**
|
||||
- **DiscoveryGo**
|
||||
- **DiscoveryGoPlaylist**
|
||||
- **DiscoveryNetworksDe**
|
||||
- **Disney**
|
||||
- **Dotsub**
|
||||
- **DouyuTV**: 斗鱼
|
||||
- **DPlay**
|
||||
- **DPlayIt**
|
||||
- **dramafever**
|
||||
- **dramafever:series**
|
||||
- **DRBonanza**
|
||||
@ -308,8 +312,8 @@
|
||||
- **GPUTechConf**
|
||||
- **Groupon**
|
||||
- **Hark**
|
||||
- **HBO**
|
||||
- **HBOEpisode**
|
||||
- **hbo**
|
||||
- **hbo:episode**
|
||||
- **HearThisAt**
|
||||
- **Heise**
|
||||
- **HellPorno**
|
||||
@ -423,6 +427,7 @@
|
||||
- **MatchTV**
|
||||
- **MDR**: MDR.DE and KiKA
|
||||
- **media.ccc.de**
|
||||
- **Medialaan**
|
||||
- **Meipai**: 美拍
|
||||
- **MelonVOD**
|
||||
- **META**
|
||||
@ -775,12 +780,12 @@
|
||||
- **ThisAV**
|
||||
- **ThisOldHouse**
|
||||
- **tinypic**: tinypic.com videos
|
||||
- **tlc.de**
|
||||
- **TMZ**
|
||||
- **TMZArticle**
|
||||
- **TNAFlix**
|
||||
- **TNAFlixNetworkEmbed**
|
||||
- **toggle**
|
||||
- **ToonGoggles**
|
||||
- **Tosh**: Tosh.0
|
||||
- **tou.tv**
|
||||
- **Toypics**: Toypics user profile
|
||||
|
@ -8,7 +8,7 @@ import sys
|
||||
import unittest
|
||||
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
||||
|
||||
from test.helper import FakeYDL
|
||||
from test.helper import FakeYDL, expect_dict
|
||||
from youtube_dl.extractor.common import InfoExtractor
|
||||
from youtube_dl.extractor import YoutubeIE, get_info_extractor
|
||||
from youtube_dl.utils import encode_data_uri, strip_jsonp, ExtractorError, RegexNotFoundError
|
||||
@ -84,6 +84,97 @@ class TestInfoExtractor(unittest.TestCase):
|
||||
self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
|
||||
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
|
||||
|
||||
def test_extract_jwplayer_data_realworld(self):
|
||||
# from http://www.suffolk.edu/sjc/
|
||||
expect_dict(
|
||||
self,
|
||||
self.ie._extract_jwplayer_data(r'''
|
||||
<script type='text/javascript'>
|
||||
jwplayer('my-video').setup({
|
||||
file: 'rtmp://192.138.214.154/live/sjclive',
|
||||
fallback: 'true',
|
||||
width: '95%',
|
||||
aspectratio: '16:9',
|
||||
primary: 'flash',
|
||||
mediaid:'XEgvuql4'
|
||||
});
|
||||
</script>
|
||||
''', None, require_title=False),
|
||||
{
|
||||
'id': 'XEgvuql4',
|
||||
'formats': [{
|
||||
'url': 'rtmp://192.138.214.154/live/sjclive',
|
||||
'ext': 'flv'
|
||||
}]
|
||||
})
|
||||
|
||||
# from https://www.pornoxo.com/videos/7564/striptease-from-sexy-secretary/
|
||||
expect_dict(
|
||||
self,
|
||||
self.ie._extract_jwplayer_data(r'''
|
||||
<script type="text/javascript">
|
||||
jwplayer("mediaplayer").setup({
|
||||
'videoid': "7564",
|
||||
'width': "100%",
|
||||
'aspectratio': "16:9",
|
||||
'stretching': "exactfit",
|
||||
'autostart': 'false',
|
||||
'flashplayer': "https://t04.vipstreamservice.com/jwplayer/v5.10/player.swf",
|
||||
'file': "https://cdn.pornoxo.com/key=MF+oEbaxqTKb50P-w9G3nA,end=1489689259,ip=104.199.146.27/ip=104.199.146.27/speed=6573765/buffer=3.0/2009-12/4b2157147afe5efa93ce1978e0265289c193874e02597.flv",
|
||||
'image': "https://t03.vipstreamservice.com/thumbs/pxo-full/2009-12/14/a4b2157147afe5efa93ce1978e0265289c193874e02597.flv-full-13.jpg",
|
||||
'filefallback': "https://cdn.pornoxo.com/key=9ZPsTR5EvPLQrBaak2MUGA,end=1489689259,ip=104.199.146.27/ip=104.199.146.27/speed=6573765/buffer=3.0/2009-12/m_4b2157147afe5efa93ce1978e0265289c193874e02597.mp4",
|
||||
'logo.hide': true,
|
||||
'skin': "https://t04.vipstreamservice.com/jwplayer/skin/modieus-blk.zip",
|
||||
'plugins': "https://t04.vipstreamservice.com/jwplayer/dock/dockableskinnableplugin.swf",
|
||||
'dockableskinnableplugin.piclink': "/index.php?key=ajax-videothumbsn&vid=7564&data=2009-12--14--4b2157147afe5efa93ce1978e0265289c193874e02597.flv--17370",
|
||||
'controlbar': 'bottom',
|
||||
'modes': [
|
||||
{type: 'flash', src: 'https://t04.vipstreamservice.com/jwplayer/v5.10/player.swf'}
|
||||
],
|
||||
'provider': 'http'
|
||||
});
|
||||
//noinspection JSAnnotator
|
||||
invideo.setup({
|
||||
adsUrl: "/banner-iframe/?zoneId=32",
|
||||
adsUrl2: "",
|
||||
autostart: false
|
||||
});
|
||||
</script>
|
||||
''', 'dummy', require_title=False),
|
||||
{
|
||||
'thumbnail': 'https://t03.vipstreamservice.com/thumbs/pxo-full/2009-12/14/a4b2157147afe5efa93ce1978e0265289c193874e02597.flv-full-13.jpg',
|
||||
'formats': [{
|
||||
'url': 'https://cdn.pornoxo.com/key=MF+oEbaxqTKb50P-w9G3nA,end=1489689259,ip=104.199.146.27/ip=104.199.146.27/speed=6573765/buffer=3.0/2009-12/4b2157147afe5efa93ce1978e0265289c193874e02597.flv',
|
||||
'ext': 'flv'
|
||||
}]
|
||||
})
|
||||
|
||||
# from http://www.indiedb.com/games/king-machine/videos
|
||||
expect_dict(
|
||||
self,
|
||||
self.ie._extract_jwplayer_data(r'''
|
||||
<script>
|
||||
jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/\/www.indiedb.com\/","displaytitle":false,"autostart":false,"repeat":false,"title":"king machine trailer 1","sharing":{"link":"http:\/\/www.indiedb.com\/games\/king-machine\/videos\/king-machine-trailer-1","code":"<iframe width=\"560\" height=\"315\" src=\"http:\/\/www.indiedb.com\/media\/iframe\/1522983\" frameborder=\"0\" allowfullscreen><\/iframe><br><a href=\"http:\/\/www.indiedb.com\/games\/king-machine\/videos\/king-machine-trailer-1\">king machine trailer 1 - Indie DB<\/a>"},"related":{"file":"http:\/\/rss.indiedb.com\/media\/recommended\/1522983\/feed\/rss.xml","dimensions":"160x120","onclick":"link"},"sources":[{"file":"http:\/\/cdn.dbolical.com\/cache\/videos\/games\/1\/50\/49678\/encode_mp4\/king-machine-trailer.mp4","label":"360p SD","default":"true"},{"file":"http:\/\/cdn.dbolical.com\/cache\/videos\/games\/1\/50\/49678\/encode720p_mp4\/king-machine-trailer.mp4","label":"720p HD"}],"image":"http:\/\/media.indiedb.com\/cache\/images\/games\/1\/50\/49678\/thumb_620x2000\/king-machine-trailer.mp4.jpg","advertising":{"client":"vast","tag":"http:\/\/ads.intergi.com\/adrawdata\/3.0\/5205\/4251742\/0\/1013\/ADTECH;cors=yes;width=560;height=315;referring_url=http:\/\/www.indiedb.com\/games\/king-machine\/videos\/king-machine-trailer-1;content_url=http:\/\/www.indiedb.com\/games\/king-machine\/videos\/king-machine-trailer-1;media_id=1522983;title=king+machine+trailer+1;device=__DEVICE__;model=__MODEL__;os=Windows+OS;osversion=__OSVERSION__;ua=__UA__;ip=109.171.17.81;uniqueid=1522983;tags=__TAGS__;number=58cac25928151;time=1489683033"},"width":620,"height":349}).once("play", function(event) {
|
||||
videoAnalytics("play");
|
||||
}).once("complete", function(event) {
|
||||
videoAnalytics("completed");
|
||||
});
|
||||
</script>
|
||||
''', 'dummy'),
|
||||
{
|
||||
'title': 'king machine trailer 1',
|
||||
'thumbnail': 'http://media.indiedb.com/cache/images/games/1/50/49678/thumb_620x2000/king-machine-trailer.mp4.jpg',
|
||||
'formats': [{
|
||||
'url': 'http://cdn.dbolical.com/cache/videos/games/1/50/49678/encode_mp4/king-machine-trailer.mp4',
|
||||
'height': 360,
|
||||
'ext': 'mp4'
|
||||
}, {
|
||||
'url': 'http://cdn.dbolical.com/cache/videos/games/1/50/49678/encode720p_mp4/king-machine-trailer.mp4',
|
||||
'height': 720,
|
||||
'ext': 'mp4'
|
||||
}]
|
||||
})
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
unittest.main()
|
||||
|
@ -1872,6 +1872,7 @@ class YoutubeDL(object):
|
||||
"""Download a given list of URLs."""
|
||||
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
|
||||
if (len(url_list) > 1 and
|
||||
outtmpl != '-' and
|
||||
'%' not in outtmpl and
|
||||
self.params.get('max_downloads') != 1):
|
||||
raise SameFileError(outtmpl)
|
||||
|
@ -196,7 +196,7 @@ def _real_main(argv=None):
|
||||
if opts.playlistend not in (-1, None) and opts.playlistend < opts.playliststart:
|
||||
raise ValueError('Playlist end must be greater than playlist start')
|
||||
if opts.extractaudio:
|
||||
if opts.audioformat not in ['best', 'aac', 'mp3', 'm4a', 'opus', 'vorbis', 'wav']:
|
||||
if opts.audioformat not in ['best', 'aac', 'flac', 'mp3', 'm4a', 'opus', 'vorbis', 'wav']:
|
||||
parser.error('invalid audio format specified')
|
||||
if opts.audioquality:
|
||||
opts.audioquality = opts.audioquality.strip('k').strip('K')
|
||||
@ -242,14 +242,11 @@ def _real_main(argv=None):
|
||||
|
||||
# PostProcessors
|
||||
postprocessors = []
|
||||
# Add the metadata pp first, the other pps will copy it
|
||||
if opts.metafromtitle:
|
||||
postprocessors.append({
|
||||
'key': 'MetadataFromTitle',
|
||||
'titleformat': opts.metafromtitle
|
||||
})
|
||||
if opts.addmetadata:
|
||||
postprocessors.append({'key': 'FFmpegMetadata'})
|
||||
if opts.extractaudio:
|
||||
postprocessors.append({
|
||||
'key': 'FFmpegExtractAudio',
|
||||
@ -262,6 +259,16 @@ def _real_main(argv=None):
|
||||
'key': 'FFmpegVideoConvertor',
|
||||
'preferedformat': opts.recodevideo,
|
||||
})
|
||||
# FFmpegMetadataPP should be run after FFmpegVideoConvertorPP and
|
||||
# FFmpegExtractAudioPP as containers before conversion may not support
|
||||
# metadata (3gp, webm, etc.)
|
||||
# And this post-processor should be placed before other metadata
|
||||
# manipulating post-processors (FFmpegEmbedSubtitle) to prevent loss of
|
||||
# extra metadata. By default ffmpeg preserves metadata applicable for both
|
||||
# source and target containers. From this point the container won't change,
|
||||
# so metadata can be added here.
|
||||
if opts.addmetadata:
|
||||
postprocessors.append({'key': 'FFmpegMetadata'})
|
||||
if opts.convertsubtitles:
|
||||
postprocessors.append({
|
||||
'key': 'FFmpegSubtitlesConvertor',
|
||||
|
@ -1458,6 +1458,8 @@ class AdobePassIE(InfoExtractor):
|
||||
self._downloader.cache.store(self._MVPD_CACHE, requestor_id, {})
|
||||
count += 1
|
||||
continue
|
||||
if '<error' in authorize:
|
||||
raise ExtractorError(xml_text(authorize, 'details'), expected=True)
|
||||
authz_token = unescapeHTML(xml_text(authorize, 'authzToken'))
|
||||
requestor_info[guid] = authz_token
|
||||
self._downloader.cache.store(self._MVPD_CACHE, requestor_id, requestor_info)
|
||||
|
@ -21,10 +21,11 @@ class BellMediaIE(InfoExtractor):
|
||||
animalplanet|
|
||||
bravo|
|
||||
mtv|
|
||||
space
|
||||
space|
|
||||
etalk
|
||||
)\.ca|
|
||||
much\.com
|
||||
)/.*?(?:\bvid=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
|
||||
)/.*?(?:\bvid(?:eoid)?=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
|
||||
_TESTS = [{
|
||||
'url': 'http://www.ctv.ca/video/player?vid=706966',
|
||||
'md5': 'ff2ebbeae0aa2dcc32a830c3fd69b7b0',
|
||||
@ -58,6 +59,9 @@ class BellMediaIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://www.ctv.ca/DCs-Legends-of-Tomorrow/Video/S2E11-Turncoat-vid1051430',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.etalk.ca/video?videoid=663455',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_DOMAINS = {
|
||||
'thecomedynetwork': 'comedy',
|
||||
@ -65,6 +69,7 @@ class BellMediaIE(InfoExtractor):
|
||||
'sciencechannel': 'discsci',
|
||||
'investigationdiscovery': 'invdisc',
|
||||
'animalplanet': 'aniplan',
|
||||
'etalk': 'ctv',
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
72
youtube_dl/extractor/bostonglobe.py
Normal file
72
youtube_dl/extractor/bostonglobe.py
Normal file
@ -0,0 +1,72 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
from ..utils import (
|
||||
extract_attributes,
|
||||
)
|
||||
|
||||
|
||||
class BostonGlobeIE(InfoExtractor):
|
||||
_VALID_URL = r'(?i)https?://(?:www\.)?bostonglobe\.com/.*/(?P<id>[^/]+)/\w+(?:\.html)?'
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://www.bostonglobe.com/metro/2017/02/11/tree-finally-succumbs-disease-leaving-hole-neighborhood/h1b4lviqzMTIn9sVy8F3gP/story.html',
|
||||
'md5': '0a62181079c85c2d2b618c9a738aedaf',
|
||||
'info_dict': {
|
||||
'title': 'A tree finally succumbs to disease, leaving a hole in a neighborhood',
|
||||
'id': '5320421710001',
|
||||
'ext': 'mp4',
|
||||
'description': 'It arrived as a sapling when the Back Bay was in its infancy, a spindly American elm tamped down into a square of dirt cut into the brick sidewalk of 1880s Marlborough Street, no higher than the first bay window of the new brownstone behind it.',
|
||||
'timestamp': 1486877593,
|
||||
'upload_date': '20170212',
|
||||
'uploader_id': '245991542',
|
||||
},
|
||||
},
|
||||
{
|
||||
# Embedded youtube video; we hand it off to the Generic extractor.
|
||||
'url': 'https://www.bostonglobe.com/lifestyle/names/2017/02/17/does-ben-affleck-play-matt-damon-favorite-version-batman/ruqkc9VxKBYmh5txn1XhSI/story.html',
|
||||
'md5': '582b40327089d5c0c949b3c54b13c24b',
|
||||
'info_dict': {
|
||||
'title': "Who Is Matt Damon's Favorite Batman?",
|
||||
'id': 'ZW1QCnlA6Qc',
|
||||
'ext': 'mp4',
|
||||
'upload_date': '20170217',
|
||||
'description': 'md5:3b3dccb9375867e0b4d527ed87d307cb',
|
||||
'uploader': 'The Late Late Show with James Corden',
|
||||
'uploader_id': 'TheLateLateShow',
|
||||
},
|
||||
'expected_warnings': ['404'],
|
||||
},
|
||||
]
|
||||
|
||||
def _real_extract(self, url):
|
||||
page_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, page_id)
|
||||
|
||||
page_title = self._og_search_title(webpage, default=None)
|
||||
|
||||
# <video data-brightcove-video-id="5320421710001" data-account="245991542" data-player="SJWAiyYWg" data-embed="default" class="video-js" controls itemscope itemtype="http://schema.org/VideoObject">
|
||||
entries = []
|
||||
for video in re.findall(r'(?i)(<video[^>]+>)', webpage):
|
||||
attrs = extract_attributes(video)
|
||||
|
||||
video_id = attrs.get('data-brightcove-video-id')
|
||||
account_id = attrs.get('data-account')
|
||||
player_id = attrs.get('data-player')
|
||||
embed = attrs.get('data-embed')
|
||||
|
||||
if video_id and account_id and player_id and embed:
|
||||
entries.append(
|
||||
'http://players.brightcove.net/%s/%s_%s/index.html?videoId=%s'
|
||||
% (account_id, player_id, embed, video_id))
|
||||
|
||||
if len(entries) == 0:
|
||||
return self.url_result(url, 'Generic')
|
||||
elif len(entries) == 1:
|
||||
return self.url_result(entries[0], 'BrightcoveNew')
|
||||
else:
|
||||
return self.playlist_from_matches(entries, page_id, page_title, ie='BrightcoveNew')
|
@ -193,7 +193,13 @@ class BrightcoveLegacyIE(InfoExtractor):
|
||||
if videoPlayer is not None:
|
||||
if isinstance(videoPlayer, list):
|
||||
videoPlayer = videoPlayer[0]
|
||||
if not (videoPlayer.isdigit() or videoPlayer.startswith('ref:')):
|
||||
videoPlayer = videoPlayer.strip()
|
||||
# UUID is also possible for videoPlayer (e.g.
|
||||
# http://www.popcornflix.com/hoodies-vs-hooligans/7f2d2b87-bbf2-4623-acfb-ea942b4f01dd
|
||||
# or http://www8.hp.com/cn/zh/home.html)
|
||||
if not (re.match(
|
||||
r'^(?:\d+|[\da-fA-F]{8}-?[\da-fA-F]{4}-?[\da-fA-F]{4}-?[\da-fA-F]{4}-?[\da-fA-F]{12})$',
|
||||
videoPlayer) or videoPlayer.startswith('ref:')):
|
||||
return None
|
||||
params['@videoPlayer'] = videoPlayer
|
||||
linkBase = find_param('linkBaseURL')
|
||||
|
@ -4,62 +4,62 @@ import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
ExtractorError,
|
||||
parse_filesize,
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
qualities,
|
||||
unescapeHTML,
|
||||
)
|
||||
|
||||
|
||||
class Channel9IE(InfoExtractor):
|
||||
'''
|
||||
Common extractor for channel9.msdn.com.
|
||||
|
||||
The type of provided URL (video or playlist) is determined according to
|
||||
meta Search.PageType from web page HTML rather than URL itself, as it is
|
||||
not always possible to do.
|
||||
'''
|
||||
IE_DESC = 'Channel 9'
|
||||
IE_NAME = 'channel9'
|
||||
_VALID_URL = r'https?://(?:www\.)?channel9\.msdn\.com/(?P<contentpath>.+?)(?P<rss>/RSS)?/?(?:[?#&]|$)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:channel9\.msdn\.com|s\.ch9\.ms)/(?P<contentpath>.+?)(?P<rss>/RSS)?/?(?:[?#&]|$)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://channel9.msdn.com/Events/TechEd/Australia/2013/KOS002',
|
||||
'md5': 'bbd75296ba47916b754e73c3a4bbdf10',
|
||||
'md5': '32083d4eaf1946db6d454313f44510ca',
|
||||
'info_dict': {
|
||||
'id': 'Events/TechEd/Australia/2013/KOS002',
|
||||
'ext': 'mp4',
|
||||
'id': '6c413323-383a-49dc-88f9-a22800cab024',
|
||||
'ext': 'wmv',
|
||||
'title': 'Developer Kick-Off Session: Stuff We Love',
|
||||
'description': 'md5:c08d72240b7c87fcecafe2692f80e35f',
|
||||
'description': 'md5:b80bf9355a503c193aff7ec6cd5a7731',
|
||||
'duration': 4576,
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
'thumbnail': r're:https?://.*\.jpg',
|
||||
'timestamp': 1377717420,
|
||||
'upload_date': '20130828',
|
||||
'session_code': 'KOS002',
|
||||
'session_day': 'Day 1',
|
||||
'session_room': 'Arena 1A',
|
||||
'session_speakers': ['Ed Blankenship', 'Andrew Coates', 'Brady Gaster', 'Patrick Klug',
|
||||
'Mads Kristensen'],
|
||||
'session_speakers': ['Andrew Coates', 'Brady Gaster', 'Mads Kristensen', 'Ed Blankenship', 'Patrick Klug'],
|
||||
},
|
||||
}, {
|
||||
'url': 'http://channel9.msdn.com/posts/Self-service-BI-with-Power-BI-nuclear-testing',
|
||||
'md5': 'b43ee4529d111bc37ba7ee4f34813e68',
|
||||
'md5': 'dcf983ee6acd2088e7188c3cf79b46bc',
|
||||
'info_dict': {
|
||||
'id': 'posts/Self-service-BI-with-Power-BI-nuclear-testing',
|
||||
'ext': 'mp4',
|
||||
'id': 'fe8e435f-bb93-4e01-8e97-a28c01887024',
|
||||
'ext': 'wmv',
|
||||
'title': 'Self-service BI with Power BI - nuclear testing',
|
||||
'description': 'md5:d1e6ecaafa7fb52a2cacdf9599829f5b',
|
||||
'description': 'md5:2d17fec927fc91e9e17783b3ecc88f54',
|
||||
'duration': 1540,
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
'thumbnail': r're:https?://.*\.jpg',
|
||||
'timestamp': 1386381991,
|
||||
'upload_date': '20131207',
|
||||
'authors': ['Mike Wilmot'],
|
||||
},
|
||||
}, {
|
||||
# low quality mp4 is best
|
||||
'url': 'https://channel9.msdn.com/Events/CPP/CppCon-2015/Ranges-for-the-Standard-Library',
|
||||
'info_dict': {
|
||||
'id': 'Events/CPP/CppCon-2015/Ranges-for-the-Standard-Library',
|
||||
'id': '33ad69d2-6a4e-4172-83a1-a523013dec76',
|
||||
'ext': 'mp4',
|
||||
'title': 'Ranges for the Standard Library',
|
||||
'description': 'md5:2e6b4917677af3728c5f6d63784c4c5d',
|
||||
'description': 'md5:9895e0a9fd80822d2f01c454b8f4a372',
|
||||
'duration': 5646,
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
'thumbnail': r're:https?://.*\.jpg',
|
||||
'upload_date': '20150930',
|
||||
'timestamp': 1443640735,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
@ -70,7 +70,7 @@ class Channel9IE(InfoExtractor):
|
||||
'id': 'Niners/Splendid22/Queue/76acff796e8f411184b008028e0d492b',
|
||||
'title': 'Channel 9',
|
||||
},
|
||||
'playlist_count': 2,
|
||||
'playlist_mincount': 100,
|
||||
}, {
|
||||
'url': 'https://channel9.msdn.com/Events/DEVintersection/DEVintersection-2016/RSS',
|
||||
'only_matching': True,
|
||||
@ -81,189 +81,6 @@ class Channel9IE(InfoExtractor):
|
||||
|
||||
_RSS_URL = 'http://channel9.msdn.com/%s/RSS'
|
||||
|
||||
def _formats_from_html(self, html):
|
||||
FORMAT_REGEX = r'''
|
||||
(?x)
|
||||
<a\s+href="(?P<url>[^"]+)">(?P<quality>[^<]+)</a>\s*
|
||||
<span\s+class="usage">\((?P<note>[^\)]+)\)</span>\s*
|
||||
(?:<div\s+class="popup\s+rounded">\s*
|
||||
<h3>File\s+size</h3>\s*(?P<filesize>.*?)\s*
|
||||
</div>)? # File size part may be missing
|
||||
'''
|
||||
quality = qualities((
|
||||
'MP3', 'MP4',
|
||||
'Low Quality WMV', 'Low Quality MP4',
|
||||
'Mid Quality WMV', 'Mid Quality MP4',
|
||||
'High Quality WMV', 'High Quality MP4'))
|
||||
formats = [{
|
||||
'url': x.group('url'),
|
||||
'format_id': x.group('quality'),
|
||||
'format_note': x.group('note'),
|
||||
'format': '%s (%s)' % (x.group('quality'), x.group('note')),
|
||||
'filesize_approx': parse_filesize(x.group('filesize')),
|
||||
'quality': quality(x.group('quality')),
|
||||
'vcodec': 'none' if x.group('note') == 'Audio only' else None,
|
||||
} for x in list(re.finditer(FORMAT_REGEX, html))]
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
return formats
|
||||
|
||||
def _extract_title(self, html):
|
||||
title = self._html_search_meta('title', html, 'title')
|
||||
if title is None:
|
||||
title = self._og_search_title(html)
|
||||
TITLE_SUFFIX = ' (Channel 9)'
|
||||
if title is not None and title.endswith(TITLE_SUFFIX):
|
||||
title = title[:-len(TITLE_SUFFIX)]
|
||||
return title
|
||||
|
||||
def _extract_description(self, html):
|
||||
DESCRIPTION_REGEX = r'''(?sx)
|
||||
<div\s+class="entry-content">\s*
|
||||
<div\s+id="entry-body">\s*
|
||||
(?P<description>.+?)\s*
|
||||
</div>\s*
|
||||
</div>
|
||||
'''
|
||||
m = re.search(DESCRIPTION_REGEX, html)
|
||||
if m is not None:
|
||||
return m.group('description')
|
||||
return self._html_search_meta('description', html, 'description')
|
||||
|
||||
def _extract_duration(self, html):
|
||||
m = re.search(r'"length": *"(?P<hours>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})"', html)
|
||||
return ((int(m.group('hours')) * 60 * 60) + (int(m.group('minutes')) * 60) + int(m.group('seconds'))) if m else None
|
||||
|
||||
def _extract_slides(self, html):
|
||||
m = re.search(r'<a href="(?P<slidesurl>[^"]+)" class="slides">Slides</a>', html)
|
||||
return m.group('slidesurl') if m is not None else None
|
||||
|
||||
def _extract_zip(self, html):
|
||||
m = re.search(r'<a href="(?P<zipurl>[^"]+)" class="zip">Zip</a>', html)
|
||||
return m.group('zipurl') if m is not None else None
|
||||
|
||||
def _extract_avg_rating(self, html):
|
||||
m = re.search(r'<p class="avg-rating">Avg Rating: <span>(?P<avgrating>[^<]+)</span></p>', html)
|
||||
return float(m.group('avgrating')) if m is not None else 0
|
||||
|
||||
def _extract_rating_count(self, html):
|
||||
m = re.search(r'<div class="rating-count">\((?P<ratingcount>[^<]+)\)</div>', html)
|
||||
return int(self._fix_count(m.group('ratingcount'))) if m is not None else 0
|
||||
|
||||
def _extract_view_count(self, html):
|
||||
m = re.search(r'<li class="views">\s*<span class="count">(?P<viewcount>[^<]+)</span> Views\s*</li>', html)
|
||||
return int(self._fix_count(m.group('viewcount'))) if m is not None else 0
|
||||
|
||||
def _extract_comment_count(self, html):
|
||||
m = re.search(r'<li class="comments">\s*<a href="#comments">\s*<span class="count">(?P<commentcount>[^<]+)</span> Comments\s*</a>\s*</li>', html)
|
||||
return int(self._fix_count(m.group('commentcount'))) if m is not None else 0
|
||||
|
||||
def _fix_count(self, count):
|
||||
return int(str(count).replace(',', '')) if count is not None else None
|
||||
|
||||
def _extract_authors(self, html):
|
||||
m = re.search(r'(?s)<li class="author">(.*?)</li>', html)
|
||||
if m is None:
|
||||
return None
|
||||
return re.findall(r'<a href="/Niners/[^"]+">([^<]+)</a>', m.group(1))
|
||||
|
||||
def _extract_session_code(self, html):
|
||||
m = re.search(r'<li class="code">\s*(?P<code>.+?)\s*</li>', html)
|
||||
return m.group('code') if m is not None else None
|
||||
|
||||
def _extract_session_day(self, html):
|
||||
m = re.search(r'<li class="day">\s*<a href="/Events/[^"]+">(?P<day>[^<]+)</a>\s*</li>', html)
|
||||
return m.group('day').strip() if m is not None else None
|
||||
|
||||
def _extract_session_room(self, html):
|
||||
m = re.search(r'<li class="room">\s*(?P<room>.+?)\s*</li>', html)
|
||||
return m.group('room') if m is not None else None
|
||||
|
||||
def _extract_session_speakers(self, html):
|
||||
return re.findall(r'<a href="/Events/Speakers/[^"]+">([^<]+)</a>', html)
|
||||
|
||||
def _extract_content(self, html, content_path):
|
||||
# Look for downloadable content
|
||||
formats = self._formats_from_html(html)
|
||||
slides = self._extract_slides(html)
|
||||
zip_ = self._extract_zip(html)
|
||||
|
||||
# Nothing to download
|
||||
if len(formats) == 0 and slides is None and zip_ is None:
|
||||
self._downloader.report_warning('None of recording, slides or zip are available for %s' % content_path)
|
||||
return
|
||||
|
||||
# Extract meta
|
||||
title = self._extract_title(html)
|
||||
description = self._extract_description(html)
|
||||
thumbnail = self._og_search_thumbnail(html)
|
||||
duration = self._extract_duration(html)
|
||||
avg_rating = self._extract_avg_rating(html)
|
||||
rating_count = self._extract_rating_count(html)
|
||||
view_count = self._extract_view_count(html)
|
||||
comment_count = self._extract_comment_count(html)
|
||||
|
||||
common = {
|
||||
'_type': 'video',
|
||||
'id': content_path,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': duration,
|
||||
'avg_rating': avg_rating,
|
||||
'rating_count': rating_count,
|
||||
'view_count': view_count,
|
||||
'comment_count': comment_count,
|
||||
}
|
||||
|
||||
result = []
|
||||
|
||||
if slides is not None:
|
||||
d = common.copy()
|
||||
d.update({'title': title + '-Slides', 'url': slides})
|
||||
result.append(d)
|
||||
|
||||
if zip_ is not None:
|
||||
d = common.copy()
|
||||
d.update({'title': title + '-Zip', 'url': zip_})
|
||||
result.append(d)
|
||||
|
||||
if len(formats) > 0:
|
||||
d = common.copy()
|
||||
d.update({'title': title, 'formats': formats})
|
||||
result.append(d)
|
||||
|
||||
return result
|
||||
|
||||
def _extract_entry_item(self, html, content_path):
|
||||
contents = self._extract_content(html, content_path)
|
||||
if contents is None:
|
||||
return contents
|
||||
|
||||
if len(contents) > 1:
|
||||
raise ExtractorError('Got more than one entry')
|
||||
result = contents[0]
|
||||
result['authors'] = self._extract_authors(html)
|
||||
|
||||
return result
|
||||
|
||||
def _extract_session(self, html, content_path):
|
||||
contents = self._extract_content(html, content_path)
|
||||
if contents is None:
|
||||
return contents
|
||||
|
||||
session_meta = {
|
||||
'session_code': self._extract_session_code(html),
|
||||
'session_day': self._extract_session_day(html),
|
||||
'session_room': self._extract_session_room(html),
|
||||
'session_speakers': self._extract_session_speakers(html),
|
||||
}
|
||||
|
||||
for content in contents:
|
||||
content.update(session_meta)
|
||||
|
||||
return self.playlist_result(contents)
|
||||
|
||||
def _extract_list(self, video_id, rss_url=None):
|
||||
if not rss_url:
|
||||
rss_url = self._RSS_URL % video_id
|
||||
@ -274,9 +91,7 @@ class Channel9IE(InfoExtractor):
|
||||
return self.playlist_result(entries, video_id, title_text)
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
content_path = mobj.group('contentpath')
|
||||
rss = mobj.group('rss')
|
||||
content_path, rss = re.match(self._VALID_URL, url).groups()
|
||||
|
||||
if rss:
|
||||
return self._extract_list(content_path, url)
|
||||
@ -284,17 +99,158 @@ class Channel9IE(InfoExtractor):
|
||||
webpage = self._download_webpage(
|
||||
url, content_path, 'Downloading web page')
|
||||
|
||||
page_type = self._search_regex(
|
||||
r'<meta[^>]+name=(["\'])WT\.entryid\1[^>]+content=(["\'])(?P<pagetype>[^:]+).+?\2',
|
||||
webpage, 'page type', default=None, group='pagetype')
|
||||
if page_type:
|
||||
if page_type == 'Entry': # Any 'item'-like page, may contain downloadable content
|
||||
return self._extract_entry_item(webpage, content_path)
|
||||
elif page_type == 'Session': # Event session page, may contain downloadable content
|
||||
return self._extract_session(webpage, content_path)
|
||||
elif page_type == 'Event':
|
||||
return self._extract_list(content_path)
|
||||
episode_data = self._search_regex(
|
||||
r"data-episode='([^']+)'", webpage, 'episode data', default=None)
|
||||
if episode_data:
|
||||
episode_data = self._parse_json(unescapeHTML(
|
||||
episode_data), content_path)
|
||||
content_id = episode_data['contentId']
|
||||
is_session = '/Sessions(' in episode_data['api']
|
||||
content_url = 'https://channel9.msdn.com/odata' + episode_data['api']
|
||||
if is_session:
|
||||
content_url += '?$expand=Speakers'
|
||||
else:
|
||||
raise ExtractorError('Unexpected WT.entryid %s' % page_type, expected=True)
|
||||
else: # Assuming list
|
||||
content_url += '?$expand=Authors'
|
||||
content_data = self._download_json(content_url, content_id)
|
||||
title = content_data['Title']
|
||||
|
||||
QUALITIES = (
|
||||
'mp3',
|
||||
'wmv', 'mp4',
|
||||
'wmv-low', 'mp4-low',
|
||||
'wmv-mid', 'mp4-mid',
|
||||
'wmv-high', 'mp4-high',
|
||||
)
|
||||
|
||||
quality_key = qualities(QUALITIES)
|
||||
|
||||
def quality(quality_id, format_url):
|
||||
return (len(QUALITIES) if '_Source.' in format_url
|
||||
else quality_key(quality_id))
|
||||
|
||||
formats = []
|
||||
urls = set()
|
||||
|
||||
SITE_QUALITIES = {
|
||||
'MP3': 'mp3',
|
||||
'MP4': 'mp4',
|
||||
'Low Quality WMV': 'wmv-low',
|
||||
'Low Quality MP4': 'mp4-low',
|
||||
'Mid Quality WMV': 'wmv-mid',
|
||||
'Mid Quality MP4': 'mp4-mid',
|
||||
'High Quality WMV': 'wmv-high',
|
||||
'High Quality MP4': 'mp4-high',
|
||||
}
|
||||
|
||||
formats_select = self._search_regex(
|
||||
r'(?s)<select[^>]+name=["\']format[^>]+>(.+?)</select', webpage,
|
||||
'formats select', default=None)
|
||||
if formats_select:
|
||||
for mobj in re.finditer(
|
||||
r'<option\b[^>]+\bvalue=(["\'])(?P<url>(?:(?!\1).)+)\1[^>]*>\s*(?P<format>[^<]+?)\s*<',
|
||||
formats_select):
|
||||
format_url = mobj.group('url')
|
||||
if format_url in urls:
|
||||
continue
|
||||
urls.add(format_url)
|
||||
format_id = mobj.group('format')
|
||||
quality_id = SITE_QUALITIES.get(format_id, format_id)
|
||||
formats.append({
|
||||
'url': format_url,
|
||||
'format_id': quality_id,
|
||||
'quality': quality(quality_id, format_url),
|
||||
'vcodec': 'none' if quality_id == 'mp3' else None,
|
||||
})
|
||||
|
||||
API_QUALITIES = {
|
||||
'VideoMP4Low': 'mp4-low',
|
||||
'VideoWMV': 'wmv-mid',
|
||||
'VideoMP4Medium': 'mp4-mid',
|
||||
'VideoMP4High': 'mp4-high',
|
||||
'VideoWMVHQ': 'wmv-hq',
|
||||
}
|
||||
|
||||
for format_id, q in API_QUALITIES.items():
|
||||
q_url = content_data.get(format_id)
|
||||
if not q_url or q_url in urls:
|
||||
continue
|
||||
urls.add(q_url)
|
||||
formats.append({
|
||||
'url': q_url,
|
||||
'format_id': q,
|
||||
'quality': quality(q, q_url),
|
||||
})
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
slides = content_data.get('Slides')
|
||||
zip_file = content_data.get('ZipFile')
|
||||
|
||||
if not formats and not slides and not zip_file:
|
||||
raise ExtractorError(
|
||||
'None of recording, slides or zip are available for %s' % content_path)
|
||||
|
||||
subtitles = {}
|
||||
for caption in content_data.get('Captions', []):
|
||||
caption_url = caption.get('Url')
|
||||
if not caption_url:
|
||||
continue
|
||||
subtitles.setdefault(caption.get('Language', 'en'), []).append({
|
||||
'url': caption_url,
|
||||
'ext': 'vtt',
|
||||
})
|
||||
|
||||
common = {
|
||||
'id': content_id,
|
||||
'title': title,
|
||||
'description': clean_html(content_data.get('Description') or content_data.get('Body')),
|
||||
'thumbnail': content_data.get('Thumbnail') or content_data.get('VideoPlayerPreviewImage'),
|
||||
'duration': int_or_none(content_data.get('MediaLengthInSeconds')),
|
||||
'timestamp': parse_iso8601(content_data.get('PublishedDate')),
|
||||
'avg_rating': int_or_none(content_data.get('Rating')),
|
||||
'rating_count': int_or_none(content_data.get('RatingCount')),
|
||||
'view_count': int_or_none(content_data.get('Views')),
|
||||
'comment_count': int_or_none(content_data.get('CommentCount')),
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
if is_session:
|
||||
speakers = []
|
||||
for s in content_data.get('Speakers', []):
|
||||
speaker_name = s.get('FullName')
|
||||
if not speaker_name:
|
||||
continue
|
||||
speakers.append(speaker_name)
|
||||
|
||||
common.update({
|
||||
'session_code': content_data.get('Code'),
|
||||
'session_room': content_data.get('Room'),
|
||||
'session_speakers': speakers,
|
||||
})
|
||||
else:
|
||||
authors = []
|
||||
for a in content_data.get('Authors', []):
|
||||
author_name = a.get('DisplayName')
|
||||
if not author_name:
|
||||
continue
|
||||
authors.append(author_name)
|
||||
common['authors'] = authors
|
||||
|
||||
contents = []
|
||||
|
||||
if slides:
|
||||
d = common.copy()
|
||||
d.update({'title': title + '-Slides', 'url': slides})
|
||||
contents.append(d)
|
||||
|
||||
if zip_file:
|
||||
d = common.copy()
|
||||
d.update({'title': title + '-Zip', 'url': zip_file})
|
||||
contents.append(d)
|
||||
|
||||
if formats:
|
||||
d = common.copy()
|
||||
d.update({'title': title, 'formats': formats})
|
||||
contents.append(d)
|
||||
return self.playlist_result(contents)
|
||||
else:
|
||||
return self._extract_list(content_path)
|
||||
|
@ -1,97 +1,56 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_HTTPError,
|
||||
)
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
HEADRequest,
|
||||
remove_end,
|
||||
str_to_int,
|
||||
unified_strdate,
|
||||
)
|
||||
|
||||
|
||||
class CloudyIE(InfoExtractor):
|
||||
_IE_DESC = 'cloudy.ec'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://(?:www\.)?cloudy\.ec/
|
||||
(?:v/|embed\.php\?id=)
|
||||
(?P<id>[A-Za-z0-9]+)
|
||||
'''
|
||||
_EMBED_URL = 'http://www.cloudy.ec/embed.php?id=%s'
|
||||
_API_URL = 'http://www.cloudy.ec/api/player.api.php'
|
||||
_MAX_TRIES = 2
|
||||
_TEST = {
|
||||
_VALID_URL = r'https?://(?:www\.)?cloudy\.ec/(?:v/|embed\.php\?.*?\bid=)(?P<id>[A-Za-z0-9]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.cloudy.ec/v/af511e2527aac',
|
||||
'md5': '5cb253ace826a42f35b4740539bedf07',
|
||||
'md5': '29832b05028ead1b58be86bf319397ca',
|
||||
'info_dict': {
|
||||
'id': 'af511e2527aac',
|
||||
'ext': 'flv',
|
||||
'ext': 'mp4',
|
||||
'title': 'Funny Cats and Animals Compilation june 2013',
|
||||
'upload_date': '20130913',
|
||||
'view_count': int,
|
||||
}
|
||||
}
|
||||
|
||||
def _extract_video(self, video_id, file_key, error_url=None, try_num=0):
|
||||
|
||||
if try_num > self._MAX_TRIES - 1:
|
||||
raise ExtractorError('Unable to extract video URL', expected=True)
|
||||
|
||||
form = {
|
||||
'file': video_id,
|
||||
'key': file_key,
|
||||
}
|
||||
|
||||
if error_url:
|
||||
form.update({
|
||||
'numOfErrors': try_num,
|
||||
'errorCode': '404',
|
||||
'errorUrl': error_url,
|
||||
})
|
||||
|
||||
player_data = self._download_webpage(
|
||||
self._API_URL, video_id, 'Downloading player data', query=form)
|
||||
data = compat_parse_qs(player_data)
|
||||
|
||||
try_num += 1
|
||||
|
||||
if 'error' in data:
|
||||
raise ExtractorError(
|
||||
'%s error: %s' % (self.IE_NAME, ' '.join(data['error_msg'])),
|
||||
expected=True)
|
||||
|
||||
title = data.get('title', [None])[0]
|
||||
if title:
|
||||
title = remove_end(title, '&asdasdas').strip()
|
||||
|
||||
video_url = data.get('url', [None])[0]
|
||||
|
||||
if video_url:
|
||||
try:
|
||||
self._request_webpage(HEADRequest(video_url), video_id, 'Checking video URL')
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code in [404, 410]:
|
||||
self.report_warning('Invalid video URL, requesting another', video_id)
|
||||
return self._extract_video(video_id, file_key, video_url, try_num)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': video_url,
|
||||
'title': title,
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.cloudy.ec/embed.php?autoplay=1&id=af511e2527aac',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
video_id = self._match_id(url)
|
||||
|
||||
url = self._EMBED_URL % video_id
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
webpage = self._download_webpage(
|
||||
'http://www.cloudy.ec/embed.php?id=%s' % video_id, video_id)
|
||||
|
||||
file_key = self._search_regex(
|
||||
[r'key\s*:\s*"([^"]+)"', r'filekey\s*=\s*"([^"]+)"'],
|
||||
webpage, 'file_key')
|
||||
info = self._parse_html5_media_entries(url, webpage, video_id)[0]
|
||||
|
||||
return self._extract_video(video_id, file_key)
|
||||
webpage = self._download_webpage(
|
||||
'https://www.cloudy.ec/v/%s' % video_id, video_id, fatal=False)
|
||||
|
||||
if webpage:
|
||||
info.update({
|
||||
'title': self._search_regex(
|
||||
r'<h\d[^>]*>([^<]+)<', webpage, 'title'),
|
||||
'upload_date': unified_strdate(self._search_regex(
|
||||
r'>Published at (\d{4}-\d{1,2}-\d{1,2})', webpage,
|
||||
'upload date', fatal=False)),
|
||||
'view_count': str_to_int(self._search_regex(
|
||||
r'([\d,.]+) views<', webpage, 'view count', fatal=False)),
|
||||
})
|
||||
|
||||
if not info.get('title'):
|
||||
info['title'] = video_id
|
||||
|
||||
info['id'] = video_id
|
||||
|
||||
return info
|
||||
|
@ -36,34 +36,35 @@ from ..utils import (
|
||||
clean_html,
|
||||
compiled_regex_type,
|
||||
determine_ext,
|
||||
determine_protocol,
|
||||
error_to_compat_str,
|
||||
ExtractorError,
|
||||
extract_attributes,
|
||||
fix_xml_ampersands,
|
||||
float_or_none,
|
||||
GeoRestrictedError,
|
||||
GeoUtils,
|
||||
int_or_none,
|
||||
js_to_json,
|
||||
mimetype2ext,
|
||||
orderedSet,
|
||||
parse_codecs,
|
||||
parse_duration,
|
||||
parse_iso8601,
|
||||
parse_m3u8_attributes,
|
||||
RegexNotFoundError,
|
||||
sanitize_filename,
|
||||
sanitized_Request,
|
||||
sanitize_filename,
|
||||
unescapeHTML,
|
||||
unified_strdate,
|
||||
unified_timestamp,
|
||||
update_Request,
|
||||
update_url_query,
|
||||
urljoin,
|
||||
url_basename,
|
||||
xpath_element,
|
||||
xpath_text,
|
||||
xpath_with_ns,
|
||||
determine_protocol,
|
||||
parse_duration,
|
||||
mimetype2ext,
|
||||
update_Request,
|
||||
update_url_query,
|
||||
parse_m3u8_attributes,
|
||||
extract_attributes,
|
||||
parse_codecs,
|
||||
urljoin,
|
||||
)
|
||||
|
||||
|
||||
@ -714,6 +715,13 @@ class InfoExtractor(object):
|
||||
video_info['title'] = video_title
|
||||
return video_info
|
||||
|
||||
def playlist_from_matches(self, matches, video_id, video_title, getter=None, ie=None):
|
||||
urlrs = orderedSet(
|
||||
self.url_result(self._proto_relative_url(getter(m) if getter else m), ie)
|
||||
for m in matches)
|
||||
return self.playlist_result(
|
||||
urlrs, playlist_id=video_id, playlist_title=video_title)
|
||||
|
||||
@staticmethod
|
||||
def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None):
|
||||
"""Returns a playlist"""
|
||||
@ -2247,6 +2255,9 @@ class InfoExtractor(object):
|
||||
elif ext == 'mpd':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
source_url, video_id, mpd_id=mpd_id, fatal=False))
|
||||
elif ext == 'smil':
|
||||
formats.extend(self._extract_smil_formats(
|
||||
source_url, video_id, fatal=False))
|
||||
# https://github.com/jwplayer/jwplayer/blob/master/src/js/providers/default.js#L67
|
||||
elif source_type.startswith('audio') or ext in (
|
||||
'oga', 'aac', 'mp3', 'mpeg', 'vorbis'):
|
||||
|
@ -9,13 +9,14 @@ from ..compat import (
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
orderedSet,
|
||||
remove_end,
|
||||
extract_attributes,
|
||||
mimetype2ext,
|
||||
determine_ext,
|
||||
extract_attributes,
|
||||
int_or_none,
|
||||
js_to_json,
|
||||
mimetype2ext,
|
||||
orderedSet,
|
||||
parse_iso8601,
|
||||
remove_end,
|
||||
)
|
||||
|
||||
|
||||
@ -66,6 +67,16 @@ class CondeNastIE(InfoExtractor):
|
||||
'upload_date': '20130314',
|
||||
'timestamp': 1363219200,
|
||||
}
|
||||
}, {
|
||||
'url': 'http://video.gq.com/watch/the-closer-with-keith-olbermann-the-only-true-surprise-trump-s-an-idiot?c=series',
|
||||
'info_dict': {
|
||||
'id': '58d1865bfd2e6126e2000015',
|
||||
'ext': 'mp4',
|
||||
'title': 'The Only True Surprise? Trump’s an Idiot',
|
||||
'uploader': 'gq',
|
||||
'upload_date': '20170321',
|
||||
'timestamp': 1490126427,
|
||||
},
|
||||
}, {
|
||||
# JS embed
|
||||
'url': 'http://player.cnevids.com/embedjs/55f9cf8b61646d1acf00000c/5511d76261646d5566020000.js',
|
||||
@ -114,26 +125,33 @@ class CondeNastIE(InfoExtractor):
|
||||
})
|
||||
video_id = query['videoId']
|
||||
video_info = None
|
||||
info_page = self._download_webpage(
|
||||
info_page = self._download_json(
|
||||
'http://player.cnevids.com/player/video.js',
|
||||
video_id, 'Downloading video info', query=query, fatal=False)
|
||||
video_id, 'Downloading video info', fatal=False, query=query)
|
||||
if info_page:
|
||||
video_info = self._parse_json(self._search_regex(
|
||||
r'loadCallback\(({.+})\)', info_page, 'video info'), video_id)['video']
|
||||
else:
|
||||
video_info = info_page.get('video')
|
||||
if not video_info:
|
||||
info_page = self._download_webpage(
|
||||
'http://player.cnevids.com/player/loader.js',
|
||||
video_id, 'Downloading loader info', query=query)
|
||||
video_info = self._parse_json(self._search_regex(
|
||||
r'var\s+video\s*=\s*({.+?});', info_page, 'video info'), video_id)
|
||||
video_info = self._parse_json(
|
||||
self._search_regex(
|
||||
r'(?s)var\s+config\s*=\s*({.+?});', info_page, 'config'),
|
||||
video_id, transform_source=js_to_json)['video']
|
||||
|
||||
title = video_info['title']
|
||||
|
||||
formats = []
|
||||
for fdata in video_info.get('sources', [{}])[0]:
|
||||
for fdata in video_info['sources']:
|
||||
src = fdata.get('src')
|
||||
if not src:
|
||||
continue
|
||||
ext = mimetype2ext(fdata.get('type')) or determine_ext(src)
|
||||
if ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
src, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
continue
|
||||
quality = fdata.get('quality')
|
||||
formats.append({
|
||||
'format_id': ext + ('-%s' % quality if quality else ''),
|
||||
@ -169,7 +187,6 @@ class CondeNastIE(InfoExtractor):
|
||||
path=remove_end(parsed_url.path, '.js').replace('/embedjs/', '/embed/')))
|
||||
url_type = 'embed'
|
||||
|
||||
self.to_screen('Extracting from %s with the Condé Nast extractor' % self._SITES[site])
|
||||
webpage = self._download_webpage(url, item_id)
|
||||
|
||||
if url_type == 'series':
|
||||
|
@ -177,6 +177,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
'uploader': 'Kadokawa Pictures Inc.',
|
||||
'upload_date': '20170118',
|
||||
'series': "KONOSUBA -God's blessing on this wonderful world!",
|
||||
'season': "KONOSUBA -God's blessing on this wonderful world! 2",
|
||||
'season_number': 2,
|
||||
'episode': 'Give Me Deliverance from this Judicial Injustice!',
|
||||
'episode_number': 1,
|
||||
@ -222,6 +223,23 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
# just test metadata extraction
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
# A video with a vastly different season name compared to the series name
|
||||
'url': 'http://www.crunchyroll.com/nyarko-san-another-crawling-chaos/episode-1-test-590532',
|
||||
'info_dict': {
|
||||
'id': '590532',
|
||||
'ext': 'mp4',
|
||||
'title': 'Haiyoru! Nyaruani (ONA) Episode 1 – Test',
|
||||
'description': 'Mahiro and Nyaruko talk about official certification.',
|
||||
'uploader': 'TV TOKYO',
|
||||
'upload_date': '20120305',
|
||||
'series': 'Nyarko-san: Another Crawling Chaos',
|
||||
'season': 'Haiyoru! Nyaruani (ONA)',
|
||||
},
|
||||
'params': {
|
||||
# Just test metadata extraction
|
||||
'skip_download': True,
|
||||
},
|
||||
}]
|
||||
|
||||
_FORMAT_IDS = {
|
||||
@ -491,7 +509,8 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
|
||||
# webpage provide more accurate data than series_title from XML
|
||||
series = self._html_search_regex(
|
||||
r'id=["\']showmedia_about_episode_num[^>]+>\s*<a[^>]+>([^<]+)',
|
||||
webpage, 'series', default=xpath_text(metadata, 'series_title'))
|
||||
webpage, 'series', fatal=False)
|
||||
season = xpath_text(metadata, 'series_title')
|
||||
|
||||
episode = xpath_text(metadata, 'episode_title')
|
||||
episode_number = int_or_none(xpath_text(metadata, 'episode_number'))
|
||||
@ -508,6 +527,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
|
||||
'uploader': video_uploader,
|
||||
'upload_date': video_upload_date,
|
||||
'series': series,
|
||||
'season': season,
|
||||
'season_number': season_number,
|
||||
'episode': episode,
|
||||
'episode_number': episode_number,
|
||||
|
@ -1,17 +1,21 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
extract_attributes,
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
parse_age_limit,
|
||||
ExtractorError,
|
||||
remove_end,
|
||||
unescapeHTML,
|
||||
)
|
||||
|
||||
|
||||
class DiscoveryGoIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)https?://(?:www\.)?(?:
|
||||
class DiscoveryGoBaseIE(InfoExtractor):
|
||||
_VALID_URL_TEMPLATE = r'''(?x)https?://(?:www\.)?(?:
|
||||
discovery|
|
||||
investigationdiscovery|
|
||||
discoverylife|
|
||||
@ -21,18 +25,23 @@ class DiscoveryGoIE(InfoExtractor):
|
||||
sciencechannel|
|
||||
tlc|
|
||||
velocitychannel
|
||||
)go\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'''
|
||||
)go\.com/%s(?P<id>[^/?#&]+)'''
|
||||
|
||||
|
||||
class DiscoveryGoIE(DiscoveryGoBaseIE):
|
||||
_VALID_URL = DiscoveryGoBaseIE._VALID_URL_TEMPLATE % r'(?:[^/]+/)+'
|
||||
_GEO_COUNTRIES = ['US']
|
||||
_TEST = {
|
||||
'url': 'https://www.discoverygo.com/love-at-first-kiss/kiss-first-ask-questions-later/',
|
||||
'url': 'https://www.discoverygo.com/bering-sea-gold/reaper-madness/',
|
||||
'info_dict': {
|
||||
'id': '57a33c536b66d1cd0345eeb1',
|
||||
'id': '58c167d86b66d12f2addeb01',
|
||||
'ext': 'mp4',
|
||||
'title': 'Kiss First, Ask Questions Later!',
|
||||
'description': 'md5:fe923ba34050eae468bffae10831cb22',
|
||||
'duration': 2579,
|
||||
'series': 'Love at First Kiss',
|
||||
'season_number': 1,
|
||||
'episode_number': 1,
|
||||
'title': 'Reaper Madness',
|
||||
'description': 'md5:09f2c625c99afb8946ed4fb7865f6e78',
|
||||
'duration': 2519,
|
||||
'series': 'Bering Sea Gold',
|
||||
'season_number': 8,
|
||||
'episode_number': 6,
|
||||
'age_limit': 14,
|
||||
},
|
||||
}
|
||||
@ -113,3 +122,46 @@ class DiscoveryGoIE(InfoExtractor):
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
|
||||
|
||||
class DiscoveryGoPlaylistIE(DiscoveryGoBaseIE):
|
||||
_VALID_URL = DiscoveryGoBaseIE._VALID_URL_TEMPLATE % ''
|
||||
_TEST = {
|
||||
'url': 'https://www.discoverygo.com/bering-sea-gold/',
|
||||
'info_dict': {
|
||||
'id': 'bering-sea-gold',
|
||||
'title': 'Bering Sea Gold',
|
||||
'description': 'md5:cc5c6489835949043c0cc3ad66c2fa0e',
|
||||
},
|
||||
'playlist_mincount': 6,
|
||||
}
|
||||
|
||||
@classmethod
|
||||
def suitable(cls, url):
|
||||
return False if DiscoveryGoIE.suitable(url) else super(
|
||||
DiscoveryGoPlaylistIE, cls).suitable(url)
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
entries = []
|
||||
for mobj in re.finditer(r'data-json=(["\'])(?P<json>{.+?})\1', webpage):
|
||||
data = self._parse_json(
|
||||
mobj.group('json'), display_id,
|
||||
transform_source=unescapeHTML, fatal=False)
|
||||
if not isinstance(data, dict) or data.get('type') != 'episode':
|
||||
continue
|
||||
episode_url = data.get('socialUrl')
|
||||
if not episode_url:
|
||||
continue
|
||||
entries.append(self.url_result(
|
||||
episode_url, ie=DiscoveryGoIE.ie_key(),
|
||||
video_id=data.get('id')))
|
||||
|
||||
return self.playlist_result(
|
||||
entries, display_id,
|
||||
remove_end(self._og_search_title(
|
||||
webpage, fatal=False), ' | Discovery GO'),
|
||||
self._og_search_description(webpage))
|
||||
|
@ -9,13 +9,13 @@ from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import smuggle_url
|
||||
|
||||
|
||||
class TlcDeIE(InfoExtractor):
|
||||
IE_NAME = 'tlc.de'
|
||||
_VALID_URL = r'https?://(?:www\.)?tlc\.de/(?:[^/]+/)*videos/(?P<title>[^/?#]+)?(?:.*#(?P<id>\d+))?'
|
||||
class DiscoveryNetworksDeIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:discovery|tlc|animalplanet|dmax)\.de/(?:.*#(?P<id>\d+)|(?:[^/]+/)*videos/(?P<title>[^/?#]+))'
|
||||
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
'url': 'http://www.tlc.de/sendungen/breaking-amish/videos/#3235167922001',
|
||||
'info_dict': {
|
||||
'id': '3235167922001',
|
||||
@ -29,7 +29,13 @@ class TlcDeIE(InfoExtractor):
|
||||
'upload_date': '20140404',
|
||||
'uploader_id': '1659832546',
|
||||
},
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.dmax.de/programme/storage-hunters-uk/videos/storage-hunters-uk-episode-6/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.discovery.de/#5332316765001',
|
||||
'only_matching': True,
|
||||
}]
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1659832546/default_default/index.html?videoId=%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -39,5 +45,8 @@ class TlcDeIE(InfoExtractor):
|
||||
title = mobj.group('title')
|
||||
webpage = self._download_webpage(url, title)
|
||||
brightcove_legacy_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
|
||||
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(brightcove_legacy_url).query)['@videoPlayer'][0]
|
||||
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
|
||||
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(
|
||||
brightcove_legacy_url).query)['@videoPlayer'][0]
|
||||
return self.url_result(smuggle_url(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, {'geo_countries': ['DE']}),
|
||||
'BrightcoveNew', brightcove_id)
|
@ -6,37 +6,24 @@ import re
|
||||
import time
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urlparse
|
||||
from ..compat import (
|
||||
compat_urlparse,
|
||||
compat_HTTPError,
|
||||
)
|
||||
from ..utils import (
|
||||
USER_AGENTS,
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
unified_strdate,
|
||||
remove_end,
|
||||
update_url_query,
|
||||
)
|
||||
|
||||
|
||||
class DPlayIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?P<domain>it\.dplay\.com|www\.dplay\.(?:dk|se|no))/[^/]+/(?P<id>[^/?#]+)'
|
||||
_VALID_URL = r'https?://(?P<domain>www\.dplay\.(?:dk|se|no))/[^/]+/(?P<id>[^/?#]+)'
|
||||
|
||||
_TESTS = [{
|
||||
# geo restricted, via direct unsigned hls URL
|
||||
'url': 'http://it.dplay.com/take-me-out/stagione-1-episodio-25/',
|
||||
'info_dict': {
|
||||
'id': '1255600',
|
||||
'display_id': 'stagione-1-episodio-25',
|
||||
'ext': 'mp4',
|
||||
'title': 'Episodio 25',
|
||||
'description': 'md5:cae5f40ad988811b197d2d27a53227eb',
|
||||
'duration': 2761,
|
||||
'timestamp': 1454701800,
|
||||
'upload_date': '20160205',
|
||||
'creator': 'RTIT',
|
||||
'series': 'Take me out',
|
||||
'season_number': 1,
|
||||
'episode_number': 25,
|
||||
'age_limit': 0,
|
||||
},
|
||||
'expected_warnings': ['Unable to download f4m manifest'],
|
||||
}, {
|
||||
# non geo restricted, via secure api, unsigned download hls URL
|
||||
'url': 'http://www.dplay.se/nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet/',
|
||||
'info_dict': {
|
||||
@ -168,3 +155,90 @@ class DPlayIE(InfoExtractor):
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
|
||||
|
||||
class DPlayItIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://it\.dplay\.com/[^/]+/[^/]+/(?P<id>[^/?#]+)'
|
||||
_GEO_COUNTRIES = ['IT']
|
||||
_TEST = {
|
||||
'url': 'http://it.dplay.com/nove/biografie-imbarazzanti/luigi-di-maio-la-psicosi-di-stanislawskij/',
|
||||
'md5': '2b808ffb00fc47b884a172ca5d13053c',
|
||||
'info_dict': {
|
||||
'id': '6918',
|
||||
'display_id': 'luigi-di-maio-la-psicosi-di-stanislawskij',
|
||||
'ext': 'mp4',
|
||||
'title': 'Biografie imbarazzanti: Luigi Di Maio: la psicosi di Stanislawskij',
|
||||
'description': 'md5:3c7a4303aef85868f867a26f5cc14813',
|
||||
'thumbnail': r're:^https?://.*\.jpe?g',
|
||||
'upload_date': '20160524',
|
||||
'series': 'Biografie imbarazzanti',
|
||||
'season_number': 1,
|
||||
'episode': 'Luigi Di Maio: la psicosi di Stanislawskij',
|
||||
'episode_number': 1,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
info_url = self._search_regex(
|
||||
r'url\s*:\s*["\']((?:https?:)?//[^/]+/playback/videoPlaybackInfo/\d+)',
|
||||
webpage, 'video id')
|
||||
|
||||
title = remove_end(self._og_search_title(webpage), ' | Dplay')
|
||||
|
||||
try:
|
||||
info = self._download_json(
|
||||
info_url, display_id, headers={
|
||||
'Authorization': 'Bearer %s' % self._get_cookies(url).get(
|
||||
'dplayit_token').value,
|
||||
'Referer': url,
|
||||
})
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (400, 403):
|
||||
info = self._parse_json(e.cause.read().decode('utf-8'), display_id)
|
||||
error = info['errors'][0]
|
||||
if error.get('code') == 'access.denied.geoblocked':
|
||||
self.raise_geo_restricted(
|
||||
msg=error.get('detail'), countries=self._GEO_COUNTRIES)
|
||||
raise ExtractorError(info['errors'][0]['detail'], expected=True)
|
||||
raise
|
||||
|
||||
hls_url = info['data']['attributes']['streaming']['hls']['url']
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
hls_url, display_id, ext='mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls')
|
||||
|
||||
series = self._html_search_regex(
|
||||
r'(?s)<h1[^>]+class=["\'].*?\bshow_title\b.*?["\'][^>]*>(.+?)</h1>',
|
||||
webpage, 'series', fatal=False)
|
||||
episode = self._search_regex(
|
||||
r'<p[^>]+class=["\'].*?\bdesc_ep\b.*?["\'][^>]*>\s*<br/>\s*<b>([^<]+)',
|
||||
webpage, 'episode', fatal=False)
|
||||
|
||||
mobj = re.search(
|
||||
r'(?s)<span[^>]+class=["\']dates["\'][^>]*>.+?\bS\.(?P<season_number>\d+)\s+E\.(?P<episode_number>\d+)\s*-\s*(?P<upload_date>\d{2}/\d{2}/\d{4})',
|
||||
webpage)
|
||||
if mobj:
|
||||
season_number = int(mobj.group('season_number'))
|
||||
episode_number = int(mobj.group('episode_number'))
|
||||
upload_date = unified_strdate(mobj.group('upload_date'))
|
||||
else:
|
||||
season_number = episode_number = upload_date = None
|
||||
|
||||
return {
|
||||
'id': info_url.rpartition('/')[-1],
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': self._og_search_description(webpage),
|
||||
'thumbnail': self._og_search_thumbnail(webpage),
|
||||
'series': series,
|
||||
'season_number': season_number,
|
||||
'episode': episode,
|
||||
'episode_number': episode_number,
|
||||
'upload_date': upload_date,
|
||||
'formats': formats,
|
||||
}
|
||||
|
@ -117,6 +117,7 @@ from .bleacherreport import (
|
||||
from .blinkx import BlinkxIE
|
||||
from .bloomberg import BloombergIE
|
||||
from .bokecc import BokeCCIE
|
||||
from .bostonglobe import BostonGlobeIE
|
||||
from .bpb import BpbIE
|
||||
from .br import BRIE
|
||||
from .bravotv import BravoTVIE
|
||||
@ -246,7 +247,10 @@ from .dfb import DFBIE
|
||||
from .dhm import DHMIE
|
||||
from .dotsub import DotsubIE
|
||||
from .douyutv import DouyuTVIE
|
||||
from .dplay import DPlayIE
|
||||
from .dplay import (
|
||||
DPlayIE,
|
||||
DPlayItIE,
|
||||
)
|
||||
from .dramafever import (
|
||||
DramaFeverIE,
|
||||
DramaFeverSeriesIE,
|
||||
@ -262,7 +266,11 @@ from .dvtv import DVTVIE
|
||||
from .dumpert import DumpertIE
|
||||
from .defense import DefenseGouvFrIE
|
||||
from .discovery import DiscoveryIE
|
||||
from .discoverygo import DiscoveryGoIE
|
||||
from .discoverygo import (
|
||||
DiscoveryGoIE,
|
||||
DiscoveryGoPlaylistIE,
|
||||
)
|
||||
from .discoverynetworks import DiscoveryNetworksDeIE
|
||||
from .disney import DisneyIE
|
||||
from .dispeak import DigitallySpeakingIE
|
||||
from .dropbox import DropboxIE
|
||||
@ -967,7 +975,6 @@ from .thisav import ThisAVIE
|
||||
from .thisoldhouse import ThisOldHouseIE
|
||||
from .threeqsdn import ThreeQSDNIE
|
||||
from .tinypic import TinyPicIE
|
||||
from .tlc import TlcDeIE
|
||||
from .tmz import (
|
||||
TMZIE,
|
||||
TMZArticleIE,
|
||||
@ -980,6 +987,7 @@ from .tnaflix import (
|
||||
)
|
||||
from .toggle import ToggleIE
|
||||
from .tonline import TOnlineIE
|
||||
from .toongoggles import ToonGogglesIE
|
||||
from .toutv import TouTvIE
|
||||
from .toypics import ToypicsUserIE, ToypicsIE
|
||||
from .traileraddict import TrailerAddictIE
|
||||
@ -1168,6 +1176,7 @@ from .voxmedia import VoxMediaIE
|
||||
from .vporn import VpornIE
|
||||
from .vrt import VRTIE
|
||||
from .vrak import VrakIE
|
||||
from .medialaan import MedialaanIE
|
||||
from .vube import VubeIE
|
||||
from .vuclip import VuClipIE
|
||||
from .vvvvid import VVVVIDIE
|
||||
|
@ -196,6 +196,10 @@ class FacebookIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'https://www.facebookcorewwwi.onion/video.php?v=274175099429670',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# no title
|
||||
'url': 'https://www.facebook.com/onlycleverentertainment/videos/1947995502095005/',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@staticmethod
|
||||
@ -353,15 +357,15 @@ class FacebookIE(InfoExtractor):
|
||||
self._sort_formats(formats)
|
||||
|
||||
video_title = self._html_search_regex(
|
||||
r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>', webpage, 'title',
|
||||
default=None)
|
||||
r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>', webpage,
|
||||
'title', default=None)
|
||||
if not video_title:
|
||||
video_title = self._html_search_regex(
|
||||
r'(?s)<span class="fbPhotosPhotoCaption".*?id="fbPhotoPageCaption"><span class="hasCaption">(.*?)</span>',
|
||||
webpage, 'alternative title', default=None)
|
||||
if not video_title:
|
||||
video_title = self._html_search_meta(
|
||||
'description', webpage, 'title')
|
||||
'description', webpage, 'title', default=None)
|
||||
if video_title:
|
||||
video_title = limit_length(video_title, 80)
|
||||
else:
|
||||
|
@ -449,6 +449,23 @@ class GenericIE(InfoExtractor):
|
||||
},
|
||||
}],
|
||||
},
|
||||
{
|
||||
# Brightcove with UUID in videoPlayer
|
||||
'url': 'http://www8.hp.com/cn/zh/home.html',
|
||||
'info_dict': {
|
||||
'id': '5255815316001',
|
||||
'ext': 'mp4',
|
||||
'title': 'Sprocket Video - China',
|
||||
'description': 'Sprocket Video - China',
|
||||
'uploader': 'HP-Video Gallery',
|
||||
'timestamp': 1482263210,
|
||||
'upload_date': '20161220',
|
||||
'uploader_id': '1107601872001',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # m3u8 download
|
||||
},
|
||||
},
|
||||
# ooyala video
|
||||
{
|
||||
'url': 'http://www.rollingstone.com/music/videos/norwegian-dj-cashmere-cat-goes-spartan-on-with-me-premiere-20131219',
|
||||
@ -1525,6 +1542,17 @@ class GenericIE(InfoExtractor):
|
||||
'url': 'http://www.golfchannel.com/topics/shows/golftalkcentral.htm',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
# Senate ISVP iframe https
|
||||
'url': 'https://www.hsgac.senate.gov/hearings/canadas-fast-track-refugee-plan-unanswered-questions-and-implications-for-us-national-security',
|
||||
'md5': 'fb8c70b0b515e5037981a2492099aab8',
|
||||
'info_dict': {
|
||||
'id': 'govtaff020316',
|
||||
'ext': 'mp4',
|
||||
'title': 'Integrated Senate Video Player',
|
||||
},
|
||||
'add_ie': [SenateISVPIE.ie_key()],
|
||||
},
|
||||
# {
|
||||
# # TODO: find another test
|
||||
# # http://schema.org/VideoObject
|
||||
@ -1824,14 +1852,6 @@ class GenericIE(InfoExtractor):
|
||||
video_description = self._og_search_description(webpage, default=None)
|
||||
video_thumbnail = self._og_search_thumbnail(webpage, default=None)
|
||||
|
||||
# Helper method
|
||||
def _playlist_from_matches(matches, getter=None, ie=None):
|
||||
urlrs = orderedSet(
|
||||
self.url_result(self._proto_relative_url(getter(m) if getter else m), ie)
|
||||
for m in matches)
|
||||
return self.playlist_result(
|
||||
urlrs, playlist_id=video_id, playlist_title=video_title)
|
||||
|
||||
# Look for Brightcove Legacy Studio embeds
|
||||
bc_urls = BrightcoveLegacyIE._extract_brightcove_urls(webpage)
|
||||
if bc_urls:
|
||||
@ -1852,28 +1872,28 @@ class GenericIE(InfoExtractor):
|
||||
# Look for Brightcove New Studio embeds
|
||||
bc_urls = BrightcoveNewIE._extract_urls(webpage)
|
||||
if bc_urls:
|
||||
return _playlist_from_matches(bc_urls, ie='BrightcoveNew')
|
||||
return self.playlist_from_matches(bc_urls, video_id, video_title, ie='BrightcoveNew')
|
||||
|
||||
# Look for ThePlatform embeds
|
||||
tp_urls = ThePlatformIE._extract_urls(webpage)
|
||||
if tp_urls:
|
||||
return _playlist_from_matches(tp_urls, ie='ThePlatform')
|
||||
return self.playlist_from_matches(tp_urls, video_id, video_title, ie='ThePlatform')
|
||||
|
||||
# Look for Vessel embeds
|
||||
vessel_urls = VesselIE._extract_urls(webpage)
|
||||
if vessel_urls:
|
||||
return _playlist_from_matches(vessel_urls, ie=VesselIE.ie_key())
|
||||
return self.playlist_from_matches(vessel_urls, video_id, video_title, ie=VesselIE.ie_key())
|
||||
|
||||
# Look for embedded rtl.nl player
|
||||
matches = re.findall(
|
||||
r'<iframe[^>]+?src="((?:https?:)?//(?:www\.)?rtl\.nl/system/videoplayer/[^"]+(?:video_)?embed[^"]+)"',
|
||||
webpage)
|
||||
if matches:
|
||||
return _playlist_from_matches(matches, ie='RtlNl')
|
||||
return self.playlist_from_matches(matches, video_id, video_title, ie='RtlNl')
|
||||
|
||||
vimeo_urls = VimeoIE._extract_urls(url, webpage)
|
||||
if vimeo_urls:
|
||||
return _playlist_from_matches(vimeo_urls, ie=VimeoIE.ie_key())
|
||||
return self.playlist_from_matches(vimeo_urls, video_id, video_title, ie=VimeoIE.ie_key())
|
||||
|
||||
vid_me_embed_url = self._search_regex(
|
||||
r'src=[\'"](https?://vid\.me/[^\'"]+)[\'"]',
|
||||
@ -1895,25 +1915,25 @@ class GenericIE(InfoExtractor):
|
||||
(?:embed|v|p)/.+?)
|
||||
\1''', webpage)
|
||||
if matches:
|
||||
return _playlist_from_matches(
|
||||
matches, lambda m: unescapeHTML(m[1]))
|
||||
return self.playlist_from_matches(
|
||||
matches, video_id, video_title, lambda m: unescapeHTML(m[1]))
|
||||
|
||||
# Look for lazyYT YouTube embed
|
||||
matches = re.findall(
|
||||
r'class="lazyYT" data-youtube-id="([^"]+)"', webpage)
|
||||
if matches:
|
||||
return _playlist_from_matches(matches, lambda m: unescapeHTML(m))
|
||||
return self.playlist_from_matches(matches, video_id, video_title, lambda m: unescapeHTML(m))
|
||||
|
||||
# Look for Wordpress "YouTube Video Importer" plugin
|
||||
matches = re.findall(r'''(?x)<div[^>]+
|
||||
class=(?P<q1>[\'"])[^\'"]*\byvii_single_video_player\b[^\'"]*(?P=q1)[^>]+
|
||||
data-video_id=(?P<q2>[\'"])([^\'"]+)(?P=q2)''', webpage)
|
||||
if matches:
|
||||
return _playlist_from_matches(matches, lambda m: m[-1])
|
||||
return self.playlist_from_matches(matches, video_id, video_title, lambda m: m[-1])
|
||||
|
||||
matches = DailymotionIE._extract_urls(webpage)
|
||||
if matches:
|
||||
return _playlist_from_matches(matches)
|
||||
return self.playlist_from_matches(matches, video_id, video_title)
|
||||
|
||||
# Look for embedded Dailymotion playlist player (#3822)
|
||||
m = re.search(
|
||||
@ -1922,8 +1942,8 @@ class GenericIE(InfoExtractor):
|
||||
playlists = re.findall(
|
||||
r'list\[\]=/playlist/([^/]+)/', unescapeHTML(m.group('url')))
|
||||
if playlists:
|
||||
return _playlist_from_matches(
|
||||
playlists, lambda p: '//dailymotion.com/playlist/%s' % p)
|
||||
return self.playlist_from_matches(
|
||||
playlists, video_id, video_title, lambda p: '//dailymotion.com/playlist/%s' % p)
|
||||
|
||||
# Look for embedded Wistia player
|
||||
match = re.search(
|
||||
@ -2030,8 +2050,9 @@ class GenericIE(InfoExtractor):
|
||||
if mobj is not None:
|
||||
embeds = self._parse_json(mobj.group(1), video_id, fatal=False)
|
||||
if embeds:
|
||||
return _playlist_from_matches(
|
||||
embeds, getter=lambda v: OoyalaIE._url_for_embed_code(smuggle_url(v['provider_video_id'], {'domain': url})), ie='Ooyala')
|
||||
return self.playlist_from_matches(
|
||||
embeds, video_id, video_title,
|
||||
getter=lambda v: OoyalaIE._url_for_embed_code(smuggle_url(v['provider_video_id'], {'domain': url})), ie='Ooyala')
|
||||
|
||||
# Look for Aparat videos
|
||||
mobj = re.search(r'<iframe .*?src="(http://www\.aparat\.com/video/[^"]+)"', webpage)
|
||||
@ -2093,13 +2114,13 @@ class GenericIE(InfoExtractor):
|
||||
# Look for funnyordie embed
|
||||
matches = re.findall(r'<iframe[^>]+?src="(https?://(?:www\.)?funnyordie\.com/embed/[^"]+)"', webpage)
|
||||
if matches:
|
||||
return _playlist_from_matches(
|
||||
matches, getter=unescapeHTML, ie='FunnyOrDie')
|
||||
return self.playlist_from_matches(
|
||||
matches, video_id, video_title, getter=unescapeHTML, ie='FunnyOrDie')
|
||||
|
||||
# Look for BBC iPlayer embed
|
||||
matches = re.findall(r'setPlaylist\("(https?://www\.bbc\.co\.uk/iplayer/[^/]+/[\da-z]{8})"\)', webpage)
|
||||
if matches:
|
||||
return _playlist_from_matches(matches, ie='BBCCoUk')
|
||||
return self.playlist_from_matches(matches, video_id, video_title, ie='BBCCoUk')
|
||||
|
||||
# Look for embedded RUTV player
|
||||
rutv_url = RUTVIE._extract_url(webpage)
|
||||
@ -2114,32 +2135,32 @@ class GenericIE(InfoExtractor):
|
||||
# Look for embedded SportBox player
|
||||
sportbox_urls = SportBoxEmbedIE._extract_urls(webpage)
|
||||
if sportbox_urls:
|
||||
return _playlist_from_matches(sportbox_urls, ie='SportBoxEmbed')
|
||||
return self.playlist_from_matches(sportbox_urls, video_id, video_title, ie='SportBoxEmbed')
|
||||
|
||||
# Look for embedded XHamster player
|
||||
xhamster_urls = XHamsterEmbedIE._extract_urls(webpage)
|
||||
if xhamster_urls:
|
||||
return _playlist_from_matches(xhamster_urls, ie='XHamsterEmbed')
|
||||
return self.playlist_from_matches(xhamster_urls, video_id, video_title, ie='XHamsterEmbed')
|
||||
|
||||
# Look for embedded TNAFlixNetwork player
|
||||
tnaflix_urls = TNAFlixNetworkEmbedIE._extract_urls(webpage)
|
||||
if tnaflix_urls:
|
||||
return _playlist_from_matches(tnaflix_urls, ie=TNAFlixNetworkEmbedIE.ie_key())
|
||||
return self.playlist_from_matches(tnaflix_urls, video_id, video_title, ie=TNAFlixNetworkEmbedIE.ie_key())
|
||||
|
||||
# Look for embedded PornHub player
|
||||
pornhub_urls = PornHubIE._extract_urls(webpage)
|
||||
if pornhub_urls:
|
||||
return _playlist_from_matches(pornhub_urls, ie=PornHubIE.ie_key())
|
||||
return self.playlist_from_matches(pornhub_urls, video_id, video_title, ie=PornHubIE.ie_key())
|
||||
|
||||
# Look for embedded DrTuber player
|
||||
drtuber_urls = DrTuberIE._extract_urls(webpage)
|
||||
if drtuber_urls:
|
||||
return _playlist_from_matches(drtuber_urls, ie=DrTuberIE.ie_key())
|
||||
return self.playlist_from_matches(drtuber_urls, video_id, video_title, ie=DrTuberIE.ie_key())
|
||||
|
||||
# Look for embedded RedTube player
|
||||
redtube_urls = RedTubeIE._extract_urls(webpage)
|
||||
if redtube_urls:
|
||||
return _playlist_from_matches(redtube_urls, ie=RedTubeIE.ie_key())
|
||||
return self.playlist_from_matches(redtube_urls, video_id, video_title, ie=RedTubeIE.ie_key())
|
||||
|
||||
# Look for embedded Tvigle player
|
||||
mobj = re.search(
|
||||
@ -2185,12 +2206,12 @@ class GenericIE(InfoExtractor):
|
||||
# Look for embedded soundcloud player
|
||||
soundcloud_urls = SoundcloudIE._extract_urls(webpage)
|
||||
if soundcloud_urls:
|
||||
return _playlist_from_matches(soundcloud_urls, getter=unescapeHTML, ie=SoundcloudIE.ie_key())
|
||||
return self.playlist_from_matches(soundcloud_urls, video_id, video_title, getter=unescapeHTML, ie=SoundcloudIE.ie_key())
|
||||
|
||||
# Look for tunein player
|
||||
tunein_urls = TuneInBaseIE._extract_urls(webpage)
|
||||
if tunein_urls:
|
||||
return _playlist_from_matches(tunein_urls)
|
||||
return self.playlist_from_matches(tunein_urls, video_id, video_title)
|
||||
|
||||
# Look for embedded mtvservices player
|
||||
mtvservices_url = MTVServicesEmbeddedIE._extract_url(webpage)
|
||||
@ -2473,35 +2494,35 @@ class GenericIE(InfoExtractor):
|
||||
# Look for DBTV embeds
|
||||
dbtv_urls = DBTVIE._extract_urls(webpage)
|
||||
if dbtv_urls:
|
||||
return _playlist_from_matches(dbtv_urls, ie=DBTVIE.ie_key())
|
||||
return self.playlist_from_matches(dbtv_urls, video_id, video_title, ie=DBTVIE.ie_key())
|
||||
|
||||
# Look for Videa embeds
|
||||
videa_urls = VideaIE._extract_urls(webpage)
|
||||
if videa_urls:
|
||||
return _playlist_from_matches(videa_urls, ie=VideaIE.ie_key())
|
||||
return self.playlist_from_matches(videa_urls, video_id, video_title, ie=VideaIE.ie_key())
|
||||
|
||||
# Look for 20 minuten embeds
|
||||
twentymin_urls = TwentyMinutenIE._extract_urls(webpage)
|
||||
if twentymin_urls:
|
||||
return _playlist_from_matches(
|
||||
twentymin_urls, ie=TwentyMinutenIE.ie_key())
|
||||
return self.playlist_from_matches(
|
||||
twentymin_urls, video_id, video_title, ie=TwentyMinutenIE.ie_key())
|
||||
|
||||
# Look for Openload embeds
|
||||
openload_urls = OpenloadIE._extract_urls(webpage)
|
||||
if openload_urls:
|
||||
return _playlist_from_matches(
|
||||
openload_urls, ie=OpenloadIE.ie_key())
|
||||
return self.playlist_from_matches(
|
||||
openload_urls, video_id, video_title, ie=OpenloadIE.ie_key())
|
||||
|
||||
# Look for VideoPress embeds
|
||||
videopress_urls = VideoPressIE._extract_urls(webpage)
|
||||
if videopress_urls:
|
||||
return _playlist_from_matches(
|
||||
videopress_urls, ie=VideoPressIE.ie_key())
|
||||
return self.playlist_from_matches(
|
||||
videopress_urls, video_id, video_title, ie=VideoPressIE.ie_key())
|
||||
|
||||
# Look for Rutube embeds
|
||||
rutube_urls = RutubeIE._extract_urls(webpage)
|
||||
if rutube_urls:
|
||||
return _playlist_from_matches(
|
||||
return self.playlist_from_matches(
|
||||
rutube_urls, ie=RutubeIE.ie_key())
|
||||
|
||||
# Looking for http://schema.org/VideoObject
|
||||
@ -2533,7 +2554,11 @@ class GenericIE(InfoExtractor):
|
||||
try:
|
||||
jwplayer_data = self._parse_json(
|
||||
jwplayer_data_str, video_id, transform_source=js_to_json)
|
||||
return self._parse_jwplayer_data(jwplayer_data, video_id)
|
||||
info = self._parse_jwplayer_data(
|
||||
jwplayer_data, video_id, require_title=False)
|
||||
if not info.get('title'):
|
||||
info['title'] = video_title
|
||||
return info
|
||||
except ExtractorError:
|
||||
pass
|
||||
|
||||
|
@ -4,6 +4,7 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
xpath_text,
|
||||
xpath_element,
|
||||
@ -14,14 +15,26 @@ from ..utils import (
|
||||
|
||||
class HBOBaseIE(InfoExtractor):
|
||||
_FORMATS_INFO = {
|
||||
'pro7': {
|
||||
'width': 1280,
|
||||
'height': 720,
|
||||
},
|
||||
'1920': {
|
||||
'width': 1280,
|
||||
'height': 720,
|
||||
},
|
||||
'pro6': {
|
||||
'width': 768,
|
||||
'height': 432,
|
||||
},
|
||||
'640': {
|
||||
'width': 768,
|
||||
'height': 432,
|
||||
},
|
||||
'pro5': {
|
||||
'width': 640,
|
||||
'height': 360,
|
||||
},
|
||||
'highwifi': {
|
||||
'width': 640,
|
||||
'height': 360,
|
||||
@ -78,6 +91,17 @@ class HBOBaseIE(InfoExtractor):
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
video_url.replace('.tar', '/base_index_w8.m3u8'),
|
||||
video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
|
||||
elif source.tag == 'hls':
|
||||
# #EXT-X-BYTERANGE is not supported by native hls downloader
|
||||
# and ffmpeg (#10955)
|
||||
# formats.extend(self._extract_m3u8_formats(
|
||||
# video_url.replace('.tar', '/base_index.m3u8'),
|
||||
# video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
|
||||
continue
|
||||
elif source.tag == 'dash':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
video_url.replace('.tar', '/manifest.mpd'),
|
||||
video_id, mpd_id='dash', fatal=False))
|
||||
else:
|
||||
format_info = self._FORMATS_INFO.get(source.tag, {})
|
||||
formats.append({
|
||||
@ -112,10 +136,11 @@ class HBOBaseIE(InfoExtractor):
|
||||
|
||||
|
||||
class HBOIE(HBOBaseIE):
|
||||
IE_NAME = 'hbo'
|
||||
_VALID_URL = r'https?://(?:www\.)?hbo\.com/video/video\.html\?.*vid=(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.hbo.com/video/video.html?autoplay=true&g=u&vid=1437839',
|
||||
'md5': '1c33253f0c7782142c993c0ba62a8753',
|
||||
'md5': '2c6a6bc1222c7e91cb3334dad1746e5a',
|
||||
'info_dict': {
|
||||
'id': '1437839',
|
||||
'ext': 'mp4',
|
||||
@ -131,11 +156,12 @@ class HBOIE(HBOBaseIE):
|
||||
|
||||
|
||||
class HBOEpisodeIE(HBOBaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?hbo\.com/(?!video)([^/]+/)+video/(?P<id>[0-9a-z-]+)\.html'
|
||||
IE_NAME = 'hbo:episode'
|
||||
_VALID_URL = r'https?://(?:www\.)?hbo\.com/(?P<path>(?!video)(?:(?:[^/]+/)+video|watch-free-episodes)/(?P<id>[0-9a-z-]+))(?:\.html)?'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.hbo.com/girls/episodes/5/52-i-love-you-baby/video/ep-52-inside-the-episode.html?autoplay=true',
|
||||
'md5': '689132b253cc0ab7434237fc3a293210',
|
||||
'md5': '61ead79b9c0dfa8d3d4b07ef4ac556fb',
|
||||
'info_dict': {
|
||||
'id': '1439518',
|
||||
'display_id': 'ep-52-inside-the-episode',
|
||||
@ -147,16 +173,19 @@ class HBOEpisodeIE(HBOBaseIE):
|
||||
}, {
|
||||
'url': 'http://www.hbo.com/game-of-thrones/about/video/season-5-invitation-to-the-set.html?autoplay=true',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.hbo.com/watch-free-episodes/last-week-tonight-with-john-oliver',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
path, display_id = re.match(self._VALID_URL, url).groups()
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
content = self._download_json(
|
||||
'http://www.hbo.com/api/content/' + path, display_id)['content']
|
||||
|
||||
video_id = self._search_regex(
|
||||
r'(?P<q1>[\'"])videoId(?P=q1)\s*:\s*(?P<q2>[\'"])(?P<video_id>\d+)(?P=q2)',
|
||||
webpage, 'video ID', group='video_id')
|
||||
video_id = compat_str((content.get('parsed', {}).get(
|
||||
'common:FullBleedVideo', {}) or content['selectedEpisode'])['videoId'])
|
||||
|
||||
info_dict = self._extract_from_id(video_id)
|
||||
info_dict['display_id'] = display_id
|
||||
|
259
youtube_dl/extractor/medialaan.py
Normal file
259
youtube_dl/extractor/medialaan.py
Normal file
@ -0,0 +1,259 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
try_get,
|
||||
unified_timestamp,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class MedialaanIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:www\.)?
|
||||
(?:
|
||||
(?P<site_id>vtm|q2|vtmkzoom)\.be/
|
||||
(?:
|
||||
video(?:/[^/]+/id/|/?\?.*?\baid=)|
|
||||
(?:[^/]+/)*
|
||||
)
|
||||
)
|
||||
(?P<id>[^/?#&]+)
|
||||
'''
|
||||
_NETRC_MACHINE = 'medialaan'
|
||||
_APIKEY = '3_HZ0FtkMW_gOyKlqQzW5_0FHRC7Nd5XpXJZcDdXY4pk5eES2ZWmejRW5egwVm4ug-'
|
||||
_SITE_TO_APP_ID = {
|
||||
'vtm': 'vtm_watch',
|
||||
'q2': 'q2',
|
||||
'vtmkzoom': 'vtmkzoom',
|
||||
}
|
||||
_TESTS = [{
|
||||
# vod
|
||||
'url': 'http://vtm.be/video/volledige-afleveringen/id/vtm_20170219_VM0678361_vtmwatch',
|
||||
'info_dict': {
|
||||
'id': 'vtm_20170219_VM0678361_vtmwatch',
|
||||
'ext': 'mp4',
|
||||
'title': 'Allemaal Chris afl. 6',
|
||||
'description': 'md5:4be86427521e7b07e0adb0c9c554ddb2',
|
||||
'timestamp': 1487533280,
|
||||
'upload_date': '20170219',
|
||||
'duration': 2562,
|
||||
'series': 'Allemaal Chris',
|
||||
'season': 'Allemaal Chris',
|
||||
'season_number': 1,
|
||||
'season_id': '256936078124527',
|
||||
'episode': 'Allemaal Chris afl. 6',
|
||||
'episode_number': 6,
|
||||
'episode_id': '256936078591527',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': 'Requires account credentials',
|
||||
}, {
|
||||
# clip
|
||||
'url': 'http://vtm.be/video?aid=168332',
|
||||
'info_dict': {
|
||||
'id': '168332',
|
||||
'ext': 'mp4',
|
||||
'title': '"Veronique liegt!"',
|
||||
'description': 'md5:1385e2b743923afe54ba4adc38476155',
|
||||
'timestamp': 1489002029,
|
||||
'upload_date': '20170308',
|
||||
'duration': 96,
|
||||
},
|
||||
}, {
|
||||
# vod
|
||||
'url': 'http://vtm.be/video/volledige-afleveringen/id/257107153551000',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# vod
|
||||
'url': 'http://vtm.be/video?aid=163157',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# vod
|
||||
'url': 'http://www.q2.be/video/volledige-afleveringen/id/2be_20170301_VM0684442_q2',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# clip
|
||||
'url': 'http://vtmkzoom.be/k3-dansstudio/een-nieuw-seizoen-van-k3-dansstudio',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_initialize(self):
|
||||
self._logged_in = False
|
||||
|
||||
def _login(self):
|
||||
username, password = self._get_login_info()
|
||||
if username is None:
|
||||
self.raise_login_required()
|
||||
|
||||
auth_data = {
|
||||
'APIKey': self._APIKEY,
|
||||
'sdk': 'js_6.1',
|
||||
'format': 'json',
|
||||
'loginID': username,
|
||||
'password': password,
|
||||
}
|
||||
|
||||
auth_info = self._download_json(
|
||||
'https://accounts.eu1.gigya.com/accounts.login', None,
|
||||
note='Logging in', errnote='Unable to log in',
|
||||
data=urlencode_postdata(auth_data))
|
||||
|
||||
error_message = auth_info.get('errorDetails') or auth_info.get('errorMessage')
|
||||
if error_message:
|
||||
raise ExtractorError(
|
||||
'Unable to login: %s' % error_message, expected=True)
|
||||
|
||||
self._uid = auth_info['UID']
|
||||
self._uid_signature = auth_info['UIDSignature']
|
||||
self._signature_timestamp = auth_info['signatureTimestamp']
|
||||
|
||||
self._logged_in = True
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id, site_id = mobj.group('id', 'site_id')
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
config = self._parse_json(
|
||||
self._search_regex(
|
||||
r'videoJSConfig\s*=\s*JSON\.parse\(\'({.+?})\'\);',
|
||||
webpage, 'config', default='{}'), video_id,
|
||||
transform_source=lambda s: s.replace(
|
||||
'\\\\', '\\').replace(r'\"', '"').replace(r"\'", "'"))
|
||||
|
||||
vod_id = config.get('vodId') or self._search_regex(
|
||||
(r'\\"vodId\\"\s*:\s*\\"(.+?)\\"',
|
||||
r'<[^>]+id=["\']vod-(\d+)'),
|
||||
webpage, 'video_id', default=None)
|
||||
|
||||
# clip, no authentication required
|
||||
if not vod_id:
|
||||
player = self._parse_json(
|
||||
self._search_regex(
|
||||
r'vmmaplayer\(({.+?})\);', webpage, 'vmma player',
|
||||
default=''),
|
||||
video_id, transform_source=lambda s: '[%s]' % s, fatal=False)
|
||||
if player:
|
||||
video = player[-1]
|
||||
info = {
|
||||
'id': video_id,
|
||||
'url': video['videoUrl'],
|
||||
'title': video['title'],
|
||||
'thumbnail': video.get('imageUrl'),
|
||||
'timestamp': int_or_none(video.get('createdDate')),
|
||||
'duration': int_or_none(video.get('duration')),
|
||||
}
|
||||
else:
|
||||
info = self._parse_html5_media_entries(
|
||||
url, webpage, video_id, m3u8_id='hls')[0]
|
||||
info.update({
|
||||
'id': video_id,
|
||||
'title': self._html_search_meta('description', webpage),
|
||||
'duration': parse_duration(self._html_search_meta('duration', webpage)),
|
||||
})
|
||||
# vod, authentication required
|
||||
else:
|
||||
if not self._logged_in:
|
||||
self._login()
|
||||
|
||||
settings = self._parse_json(
|
||||
self._search_regex(
|
||||
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
|
||||
webpage, 'drupal settings', default='{}'),
|
||||
video_id)
|
||||
|
||||
def get(container, item):
|
||||
return try_get(
|
||||
settings, lambda x: x[container][item],
|
||||
compat_str) or self._search_regex(
|
||||
r'"%s"\s*:\s*"([^"]+)' % item, webpage, item,
|
||||
default=None)
|
||||
|
||||
app_id = get('vod', 'app_id') or self._SITE_TO_APP_ID.get(site_id, 'vtm_watch')
|
||||
sso = get('vod', 'gigyaDatabase') or 'vtm-sso'
|
||||
|
||||
data = self._download_json(
|
||||
'http://vod.medialaan.io/api/1.0/item/%s/video' % vod_id,
|
||||
video_id, query={
|
||||
'app_id': app_id,
|
||||
'user_network': sso,
|
||||
'UID': self._uid,
|
||||
'UIDSignature': self._uid_signature,
|
||||
'signatureTimestamp': self._signature_timestamp,
|
||||
})
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
data['response']['uri'], video_id, entry_protocol='m3u8_native',
|
||||
ext='mp4', m3u8_id='hls')
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
info = {
|
||||
'id': vod_id,
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
api_key = get('vod', 'apiKey')
|
||||
channel = get('medialaanGigya', 'channel')
|
||||
|
||||
if api_key:
|
||||
videos = self._download_json(
|
||||
'http://vod.medialaan.io/vod/v2/videos', video_id, fatal=False,
|
||||
query={
|
||||
'channels': channel,
|
||||
'ids': vod_id,
|
||||
'limit': 1,
|
||||
'apikey': api_key,
|
||||
})
|
||||
if videos:
|
||||
video = try_get(
|
||||
videos, lambda x: x['response']['videos'][0], dict)
|
||||
if video:
|
||||
def get(container, item, expected_type=None):
|
||||
return try_get(
|
||||
video, lambda x: x[container][item], expected_type)
|
||||
|
||||
def get_string(container, item):
|
||||
return get(container, item, compat_str)
|
||||
|
||||
info.update({
|
||||
'series': get_string('program', 'title'),
|
||||
'season': get_string('season', 'title'),
|
||||
'season_number': int_or_none(get('season', 'number')),
|
||||
'season_id': get_string('season', 'id'),
|
||||
'episode': get_string('episode', 'title'),
|
||||
'episode_number': int_or_none(get('episode', 'number')),
|
||||
'episode_id': get_string('episode', 'id'),
|
||||
'duration': int_or_none(
|
||||
video.get('duration')) or int_or_none(
|
||||
video.get('durationMillis'), scale=1000),
|
||||
'title': get_string('episode', 'title'),
|
||||
'description': get_string('episode', 'text'),
|
||||
'timestamp': unified_timestamp(get_string(
|
||||
'publication', 'begin')),
|
||||
})
|
||||
|
||||
if not info.get('title'):
|
||||
info['title'] = try_get(
|
||||
config, lambda x: x['videoConfig']['title'],
|
||||
compat_str) or self._html_search_regex(
|
||||
r'\\"title\\"\s*:\s*\\"(.+?)\\"', webpage, 'title',
|
||||
default=None) or self._og_search_title(webpage)
|
||||
|
||||
if not info.get('description'):
|
||||
info['description'] = self._html_search_regex(
|
||||
r'<div[^>]+class="field-item\s+even">\s*<p>(.+?)</p>',
|
||||
webpage, 'description', default=None)
|
||||
|
||||
return info
|
@ -51,6 +51,7 @@ class MioMioIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'マツコの知らない世界【劇的進化SP!ビニール傘&冷凍食品2016】 1_2 - 16 05 31',
|
||||
},
|
||||
'skip': 'Unable to load videos',
|
||||
}]
|
||||
|
||||
def _extract_mioplayer(self, webpage, video_id, title, http_headers):
|
||||
@ -94,9 +95,18 @@ class MioMioIE(InfoExtractor):
|
||||
|
||||
return entries
|
||||
|
||||
def _download_chinese_webpage(self, *args, **kwargs):
|
||||
# Requests with English locales return garbage
|
||||
headers = {
|
||||
'Accept-Language': 'zh-TW,en-US;q=0.7,en;q=0.3',
|
||||
}
|
||||
kwargs.setdefault('headers', {}).update(headers)
|
||||
return self._download_webpage(*args, **kwargs)
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
webpage = self._download_chinese_webpage(
|
||||
url, video_id)
|
||||
|
||||
title = self._html_search_meta(
|
||||
'description', webpage, 'title', fatal=True)
|
||||
@ -106,7 +116,7 @@ class MioMioIE(InfoExtractor):
|
||||
|
||||
if '_h5' in mioplayer_path:
|
||||
player_url = compat_urlparse.urljoin(url, mioplayer_path)
|
||||
player_webpage = self._download_webpage(
|
||||
player_webpage = self._download_chinese_webpage(
|
||||
player_url, video_id,
|
||||
note='Downloading player webpage', headers={'Referer': url})
|
||||
entries = self._parse_html5_media_entries(player_url, player_webpage, video_id)
|
||||
|
@ -4,6 +4,7 @@ from __future__ import unicode_literals
|
||||
import uuid
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .ooyala import OoyalaIE
|
||||
from ..compat import (
|
||||
compat_str,
|
||||
compat_urllib_parse_urlencode,
|
||||
@ -24,6 +25,9 @@ class MiTeleBaseIE(InfoExtractor):
|
||||
r'(?s)(<ms-video-player.+?</ms-video-player>)',
|
||||
webpage, 'ms video player'))
|
||||
video_id = player_data['data-media-id']
|
||||
if player_data.get('data-cms-id') == 'ooyala':
|
||||
return self.url_result(
|
||||
'ooyala:%s' % video_id, ie=OoyalaIE.ie_key(), video_id=video_id)
|
||||
config_url = compat_urlparse.urljoin(url, player_data['data-config'])
|
||||
config = self._download_json(
|
||||
config_url, video_id, 'Downloading config JSON')
|
||||
|
@ -34,12 +34,6 @@ class NineCNineMediaStackIE(NineCNineMediaBaseIE):
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
stack_base_url + 'f4m', stack_id,
|
||||
f4m_id='hds', fatal=False))
|
||||
mp4_url = self._download_webpage(stack_base_url + 'pd', stack_id, fatal=False)
|
||||
if mp4_url:
|
||||
formats.append({
|
||||
'url': mp4_url,
|
||||
'format_id': 'mp4',
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
|
@ -75,22 +75,51 @@ class OpenloadIE(InfoExtractor):
|
||||
'<span[^>]+id="[^"]+"[^>]*>([0-9A-Za-z]+)</span>',
|
||||
webpage, 'openload ID')
|
||||
|
||||
first_char = int(ol_id[0])
|
||||
urlcode = []
|
||||
num = 1
|
||||
video_url_chars = []
|
||||
|
||||
while num < len(ol_id):
|
||||
i = ord(ol_id[num])
|
||||
key = 0
|
||||
if i <= 90:
|
||||
key = i - 65
|
||||
elif i >= 97:
|
||||
key = 25 + i - 97
|
||||
urlcode.append((key, compat_chr(int(ol_id[num + 2:num + 5]) // int(ol_id[num + 1]) - first_char)))
|
||||
num += 5
|
||||
first_char = ord(ol_id[0])
|
||||
key = first_char - 55
|
||||
maxKey = max(2, key)
|
||||
key = min(maxKey, len(ol_id) - 38)
|
||||
t = ol_id[key:key + 36]
|
||||
|
||||
video_url = 'https://openload.co/stream/' + ''.join(
|
||||
[value for _, value in sorted(urlcode, key=lambda x: x[0])])
|
||||
hashMap = {}
|
||||
v = ol_id.replace(t, '')
|
||||
h = 0
|
||||
|
||||
while h < len(t):
|
||||
f = t[h:h + 3]
|
||||
i = int(f, 8)
|
||||
hashMap[h / 3] = i
|
||||
h += 3
|
||||
|
||||
h = 0
|
||||
H = 0
|
||||
while h < len(v):
|
||||
B = ''
|
||||
C = ''
|
||||
if len(v) >= h + 2:
|
||||
B = v[h:h + 2]
|
||||
if len(v) >= h + 3:
|
||||
C = v[h:h + 3]
|
||||
i = int(B, 16)
|
||||
h += 2
|
||||
if H % 3 == 0:
|
||||
i = int(C, 8)
|
||||
h += 1
|
||||
elif H % 2 == 0 and H != 0 and ord(v[H - 1]) < 60:
|
||||
i = int(C, 10)
|
||||
h += 1
|
||||
index = H % 7
|
||||
|
||||
A = hashMap[index]
|
||||
i ^= 213
|
||||
i ^= A
|
||||
video_url_chars.append(compat_chr(i))
|
||||
H += 1
|
||||
|
||||
video_url = 'https://openload.co/stream/%s?mime=true'
|
||||
video_url = video_url % (''.join(video_url_chars))
|
||||
|
||||
title = self._og_search_title(webpage, default=None) or self._search_regex(
|
||||
r'<span[^>]+class=["\']title["\'][^>]*>([^<]+)', webpage,
|
||||
|
@ -40,7 +40,7 @@ class PluralsightIE(PluralsightBaseIE):
|
||||
'info_dict': {
|
||||
'id': 'hosting-sql-server-windows-azure-iaas-m7-mgmt-04',
|
||||
'ext': 'mp4',
|
||||
'title': 'Management of SQL Server - Demo Monitoring',
|
||||
'title': 'Demo Monitoring',
|
||||
'duration': 338,
|
||||
},
|
||||
'skip': 'Requires pluralsight account credentials',
|
||||
@ -187,7 +187,7 @@ class PluralsightIE(PluralsightBaseIE):
|
||||
if not clip:
|
||||
raise ExtractorError('Unable to resolve clip')
|
||||
|
||||
title = '%s - %s' % (module['title'], clip['title'])
|
||||
title = clip['title']
|
||||
|
||||
QUALITIES = {
|
||||
'low': {'width': 640, 'height': 480},
|
||||
|
@ -1,7 +1,9 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import functools
|
||||
import itertools
|
||||
import operator
|
||||
# import os
|
||||
import re
|
||||
|
||||
@ -18,6 +20,7 @@ from ..utils import (
|
||||
js_to_json,
|
||||
orderedSet,
|
||||
# sanitized_Request,
|
||||
remove_quotes,
|
||||
str_to_int,
|
||||
)
|
||||
# from ..aes import (
|
||||
@ -129,9 +132,32 @@ class PornHubIE(InfoExtractor):
|
||||
|
||||
tv_webpage = dl_webpage('tv')
|
||||
|
||||
video_url = self._search_regex(
|
||||
r'<video[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//.+?)\1', tv_webpage,
|
||||
'video url', group='url')
|
||||
assignments = self._search_regex(
|
||||
r'(var.+?mediastring.+?)</script>', tv_webpage,
|
||||
'encoded url').split(';')
|
||||
|
||||
js_vars = {}
|
||||
|
||||
def parse_js_value(inp):
|
||||
inp = re.sub(r'/\*(?:(?!\*/).)*?\*/', '', inp)
|
||||
if '+' in inp:
|
||||
inps = inp.split('+')
|
||||
return functools.reduce(
|
||||
operator.concat, map(parse_js_value, inps))
|
||||
inp = inp.strip()
|
||||
if inp in js_vars:
|
||||
return js_vars[inp]
|
||||
return remove_quotes(inp)
|
||||
|
||||
for assn in assignments:
|
||||
assn = assn.strip()
|
||||
if not assn:
|
||||
continue
|
||||
assn = re.sub(r'var\s+', '', assn)
|
||||
vname, value = assn.split('=', 1)
|
||||
js_vars[vname] = parse_js_value(value)
|
||||
|
||||
video_url = js_vars['mediastring']
|
||||
|
||||
title = self._search_regex(
|
||||
r'<h1>([^>]+)</h1>', tv_webpage, 'title', default=None)
|
||||
|
@ -300,6 +300,21 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
|
||||
'skip_download': True,
|
||||
},
|
||||
},
|
||||
{
|
||||
# title in <h2 class="subtitle">
|
||||
'url': 'http://www.prosieben.de/stars/oscar-award/videos/jetzt-erst-enthuellt-das-geheimnis-von-emma-stones-oscar-robe-clip',
|
||||
'info_dict': {
|
||||
'id': '4895826',
|
||||
'ext': 'mp4',
|
||||
'title': 'Jetzt erst enthüllt: Das Geheimnis von Emma Stones Oscar-Robe',
|
||||
'description': 'md5:e5ace2bc43fadf7b63adc6187e9450b9',
|
||||
'upload_date': '20170302',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': 'geo restricted to Germany',
|
||||
},
|
||||
{
|
||||
# geo restricted to Germany
|
||||
'url': 'http://www.kabeleinsdoku.de/tv/mayday-alarm-im-cockpit/video/102-notlandung-im-hudson-river-ganze-folge',
|
||||
@ -338,6 +353,7 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
|
||||
r'<header class="module_header">\s*<h2>([^<]+)</h2>\s*</header>',
|
||||
r'<h2 class="video-title" itemprop="name">\s*(.+?)</h2>',
|
||||
r'<div[^>]+id="veeseoTitle"[^>]*>(.+?)</div>',
|
||||
r'<h2[^>]+class="subtitle"[^>]*>([^<]+)</h2>',
|
||||
]
|
||||
_DESCRIPTION_REGEXES = [
|
||||
r'<p itemprop="description">\s*(.+?)</p>',
|
||||
@ -369,7 +385,9 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
|
||||
def _extract_clip(self, url, webpage):
|
||||
clip_id = self._html_search_regex(
|
||||
self._CLIPID_REGEXES, webpage, 'clip id')
|
||||
title = self._html_search_regex(self._TITLE_REGEXES, webpage, 'title')
|
||||
title = self._html_search_regex(
|
||||
self._TITLE_REGEXES, webpage, 'title',
|
||||
default=None) or self._og_search_title(webpage)
|
||||
info = self._extract_video_info(url, clip_id)
|
||||
description = self._html_search_regex(
|
||||
self._DESCRIPTION_REGEXES, webpage, 'description', default=None)
|
||||
|
@ -2,11 +2,13 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_HTTPError
|
||||
from ..utils import (
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
try_get,
|
||||
unified_timestamp,
|
||||
# unified_timestamp,
|
||||
ExtractorError,
|
||||
)
|
||||
|
||||
|
||||
@ -15,15 +17,15 @@ class RedBullTVIE(InfoExtractor):
|
||||
_TESTS = [{
|
||||
# film
|
||||
'url': 'https://www.redbull.tv/video/AP-1Q756YYX51W11/abc-of-wrc',
|
||||
'md5': '78e860f631d7a846e712fab8c5fe2c38',
|
||||
'md5': 'fb0445b98aa4394e504b413d98031d1f',
|
||||
'info_dict': {
|
||||
'id': 'AP-1Q756YYX51W11',
|
||||
'ext': 'mp4',
|
||||
'title': 'ABC of...WRC',
|
||||
'description': 'md5:5c7ed8f4015c8492ecf64b6ab31e7d31',
|
||||
'duration': 1582.04,
|
||||
'timestamp': 1488405786,
|
||||
'upload_date': '20170301',
|
||||
# 'timestamp': 1488405786,
|
||||
# 'upload_date': '20170301',
|
||||
},
|
||||
}, {
|
||||
# episode
|
||||
@ -34,8 +36,8 @@ class RedBullTVIE(InfoExtractor):
|
||||
'title': 'Grime - Hashtags S2 E4',
|
||||
'description': 'md5:334b741c8c1ce65be057eab6773c1cf5',
|
||||
'duration': 904.6,
|
||||
'timestamp': 1487290093,
|
||||
'upload_date': '20170217',
|
||||
# 'timestamp': 1487290093,
|
||||
# 'upload_date': '20170217',
|
||||
'series': 'Hashtags',
|
||||
'season_number': 2,
|
||||
'episode_number': 4,
|
||||
@ -48,29 +50,40 @@ class RedBullTVIE(InfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
access_token = self._download_json(
|
||||
'https://api-v2.redbull.tv/start', video_id,
|
||||
session = self._download_json(
|
||||
'https://api-v2.redbull.tv/session', video_id,
|
||||
note='Downloading access token', query={
|
||||
'build': '4.0.9',
|
||||
'category': 'smartphone',
|
||||
'os_version': 23,
|
||||
'os_family': 'android',
|
||||
})['auth']['access_token']
|
||||
'build': '4.370.0',
|
||||
'category': 'personal_computer',
|
||||
'os_version': '1.0',
|
||||
'os_family': 'http',
|
||||
})
|
||||
if session.get('code') == 'error':
|
||||
raise ExtractorError('%s said: %s' % (
|
||||
self.IE_NAME, session['message']))
|
||||
auth = '%s %s' % (session.get('token_type', 'Bearer'), session['access_token'])
|
||||
|
||||
info = self._download_json(
|
||||
'https://api-v2.redbull.tv/views/%s' % video_id,
|
||||
video_id, note='Downloading video information',
|
||||
headers={'Authorization': 'Bearer ' + access_token}
|
||||
)['blocks'][0]['top'][0]
|
||||
try:
|
||||
info = self._download_json(
|
||||
'https://api-v2.redbull.tv/content/%s' % video_id,
|
||||
video_id, note='Downloading video information',
|
||||
headers={'Authorization': auth}
|
||||
)
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
|
||||
error_message = self._parse_json(
|
||||
e.cause.read().decode(), video_id)['message']
|
||||
raise ExtractorError('%s said: %s' % (
|
||||
self.IE_NAME, error_message), expected=True)
|
||||
raise
|
||||
|
||||
video = info['video_product']
|
||||
|
||||
title = info['title'].strip()
|
||||
m3u8_url = video['url']
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls')
|
||||
video['url'], video_id, 'mp4', 'm3u8_native')
|
||||
self._sort_formats(formats)
|
||||
|
||||
subtitles = {}
|
||||
for _, captions in (try_get(
|
||||
@ -82,9 +95,12 @@ class RedBullTVIE(InfoExtractor):
|
||||
caption_url = caption.get('url')
|
||||
if not caption_url:
|
||||
continue
|
||||
ext = caption.get('format')
|
||||
if ext == 'xml':
|
||||
ext = 'ttml'
|
||||
subtitles.setdefault(caption.get('lang') or 'en', []).append({
|
||||
'url': caption_url,
|
||||
'ext': caption.get('format'),
|
||||
'ext': ext,
|
||||
})
|
||||
|
||||
subheading = info.get('subheading')
|
||||
@ -97,7 +113,7 @@ class RedBullTVIE(InfoExtractor):
|
||||
'description': info.get('long_description') or info.get(
|
||||
'short_description'),
|
||||
'duration': float_or_none(video.get('duration'), scale=1000),
|
||||
'timestamp': unified_timestamp(info.get('published')),
|
||||
# 'timestamp': unified_timestamp(info.get('published')),
|
||||
'series': info.get('show_title'),
|
||||
'season_number': int_or_none(info.get('season_number')),
|
||||
'episode_number': int_or_none(info.get('episode_number')),
|
||||
|
@ -89,7 +89,7 @@ class SenateISVPIE(InfoExtractor):
|
||||
@staticmethod
|
||||
def _search_iframe_url(webpage):
|
||||
mobj = re.search(
|
||||
r"<iframe[^>]+src=['\"](?P<url>http://www\.senate\.gov/isvp/?\?[^'\"]+)['\"]",
|
||||
r"<iframe[^>]+src=['\"](?P<url>https?://www\.senate\.gov/isvp/?\?[^'\"]+)['\"]",
|
||||
webpage)
|
||||
if mobj:
|
||||
return mobj.group('url')
|
||||
|
@ -121,7 +121,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
},
|
||||
]
|
||||
|
||||
_CLIENT_ID = 'fDoItMDbsbZz8dY16ZzARCZmzgHBPotA'
|
||||
_CLIENT_ID = '2t9loNQH90kzJcsFCODdigxfp325aq4z'
|
||||
_IPHONE_CLIENT_ID = '376f225bf427445fc4bfb6b99b72e0bf'
|
||||
|
||||
@staticmethod
|
||||
|
@ -65,7 +65,7 @@ class StreamableIE(InfoExtractor):
|
||||
# to return video info like the title properly sometimes, and doesn't
|
||||
# include info like the video duration
|
||||
video = self._download_json(
|
||||
'https://streamable.com/ajax/videos/%s' % video_id, video_id)
|
||||
'https://ajax.streamable.com/videos/%s' % video_id, video_id)
|
||||
|
||||
# Format IDs:
|
||||
# 0 The video is being uploaded
|
||||
|
@ -44,6 +44,10 @@ class TelecincoIE(MiTeleBaseIE):
|
||||
}, {
|
||||
'url': 'http://www.telecinco.es/espanasinirmaslejos/Espana-gran-destino-turistico_2_1240605043.html',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# ooyala video
|
||||
'url': 'http://www.cuatro.com/chesterinlove/a-carta/chester-chester_in_love-chester_edu_2_2331030022.html',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
@ -2,15 +2,17 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
smuggle_url,
|
||||
try_get,
|
||||
)
|
||||
|
||||
|
||||
class TeleQuebecIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://zonevideo\.telequebec\.tv/media/(?P<id>\d+)'
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
'url': 'http://zonevideo.telequebec.tv/media/20984/le-couronnement-de-new-york/couronnement-de-new-york',
|
||||
'md5': 'fe95a0957e5707b1b01f5013e725c90f',
|
||||
'info_dict': {
|
||||
@ -18,10 +20,14 @@ class TeleQuebecIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Le couronnement de New York',
|
||||
'description': 'md5:f5b3d27a689ec6c1486132b2d687d432',
|
||||
'upload_date': '20160220',
|
||||
'timestamp': 1455965438,
|
||||
'upload_date': '20170201',
|
||||
'timestamp': 1485972222,
|
||||
}
|
||||
}
|
||||
}, {
|
||||
# no description
|
||||
'url': 'http://zonevideo.telequebec.tv/media/30261',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
media_id = self._match_id(url)
|
||||
@ -31,9 +37,13 @@ class TeleQuebecIE(InfoExtractor):
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': media_id,
|
||||
'url': smuggle_url('limelight:media:' + media_data['streamInfo']['sourceId'], {'geo_countries': ['CA']}),
|
||||
'url': smuggle_url(
|
||||
'limelight:media:' + media_data['streamInfo']['sourceId'],
|
||||
{'geo_countries': ['CA']}),
|
||||
'title': media_data['title'],
|
||||
'description': media_data.get('descriptions', [{'text': None}])[0].get('text'),
|
||||
'duration': int_or_none(media_data.get('durationInMilliseconds'), 1000),
|
||||
'description': try_get(
|
||||
media_data, lambda x: x['descriptions'][0]['text'], compat_str),
|
||||
'duration': int_or_none(
|
||||
media_data.get('durationInMilliseconds'), 1000),
|
||||
'ie_key': 'LimelightMedia',
|
||||
}
|
||||
|
81
youtube_dl/extractor/toongoggles.py
Normal file
81
youtube_dl/extractor/toongoggles.py
Normal file
@ -0,0 +1,81 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
)
|
||||
|
||||
|
||||
class ToonGogglesIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?toongoggles\.com/shows/(?P<show_id>\d+)(?:/[^/]+/episodes/(?P<episode_id>\d+))?'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.toongoggles.com/shows/217143/bernard-season-2/episodes/217147/football',
|
||||
'md5': '18289fc2b951eff6b953a9d8f01e6831',
|
||||
'info_dict': {
|
||||
'id': '217147',
|
||||
'ext': 'mp4',
|
||||
'title': 'Football',
|
||||
'uploader_id': '1',
|
||||
'description': 'Bernard decides to play football in order to be better than Lloyd and tries to beat him no matter how, he even cheats.',
|
||||
'upload_date': '20160718',
|
||||
'timestamp': 1468879330,
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.toongoggles.com/shows/227759/om-nom-stories-around-the-world',
|
||||
'info_dict': {
|
||||
'id': '227759',
|
||||
'title': 'Om Nom Stories Around The World',
|
||||
},
|
||||
'playlist_mincount': 11,
|
||||
}]
|
||||
|
||||
def _call_api(self, action, page_id, query):
|
||||
query.update({
|
||||
'for_ng': 1,
|
||||
'for_web': 1,
|
||||
'show_meta': 1,
|
||||
'version': 7.0,
|
||||
})
|
||||
return self._download_json('http://api.toongoggles.com/' + action, page_id, query=query)
|
||||
|
||||
def _parse_episode_data(self, episode_data):
|
||||
title = episode_data['episode_name']
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': episode_data['episode_id'],
|
||||
'title': title,
|
||||
'url': 'kaltura:513551:' + episode_data['entry_id'],
|
||||
'thumbnail': episode_data.get('thumbnail_url'),
|
||||
'description': episode_data.get('description'),
|
||||
'duration': parse_duration(episode_data.get('hms')),
|
||||
'series': episode_data.get('show_name'),
|
||||
'season_number': int_or_none(episode_data.get('season_num')),
|
||||
'episode_id': episode_data.get('episode_id'),
|
||||
'episode': title,
|
||||
'episode_number': int_or_none(episode_data.get('episode_num')),
|
||||
'categories': episode_data.get('categories'),
|
||||
'ie_key': 'Kaltura',
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
show_id, episode_id = re.match(self._VALID_URL, url).groups()
|
||||
if episode_id:
|
||||
episode_data = self._call_api('search', episode_id, {
|
||||
'filter': 'episode',
|
||||
'id': episode_id,
|
||||
})['objects'][0]
|
||||
return self._parse_episode_data(episode_data)
|
||||
else:
|
||||
show_data = self._call_api('getepisodesbyshow', show_id, {
|
||||
'max': 1000000000,
|
||||
'showid': show_id,
|
||||
})
|
||||
entries = []
|
||||
for episode_data in show_data.get('objects', []):
|
||||
entries.append(self._parse_episode_data(episode_data))
|
||||
return self.playlist_result(entries, show_id, show_data.get('show_name'))
|
@ -104,7 +104,7 @@ class TwitchBaseIE(InfoExtractor):
|
||||
login_page, handle, 'Logging in as %s' % username, {
|
||||
'username': username,
|
||||
'password': password,
|
||||
})
|
||||
})
|
||||
|
||||
if re.search(r'(?i)<form[^>]+id="two-factor-submit"', redirect_page) is not None:
|
||||
# TODO: Add mechanism to request an SMS or phone call
|
||||
|
@ -44,7 +44,7 @@ class ViuBaseIE(InfoExtractor):
|
||||
|
||||
|
||||
class ViuIE(ViuBaseIE):
|
||||
_VALID_URL = r'(?:viu:|https?://www\.viu\.com/[a-z]{2}/media/)(?P<id>\d+)'
|
||||
_VALID_URL = r'(?:viu:|https?://[^/]+\.viu\.com/[a-z]{2}/media/)(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.viu.com/en/media/1116705532?containerId=playlist-22168059',
|
||||
'info_dict': {
|
||||
@ -69,6 +69,9 @@ class ViuIE(ViuBaseIE):
|
||||
'skip_download': 'm3u8 download',
|
||||
},
|
||||
'skip': 'Geo-restricted to Indonesia',
|
||||
}, {
|
||||
'url': 'https://india.viu.com/en/media/1126286865',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
@ -19,9 +19,10 @@ class WDRBaseIE(InfoExtractor):
|
||||
def _extract_wdr_video(self, webpage, display_id):
|
||||
# for wdr.de the data-extension is in a tag with the class "mediaLink"
|
||||
# for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
|
||||
# for wdrmaus its in a link to the page in a multiline "videoLink"-tag
|
||||
# for wdrmaus, in a tag with the class "videoButton" (previously a link
|
||||
# to the page in a multiline "videoLink"-tag)
|
||||
json_metadata = self._html_search_regex(
|
||||
r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn)\b[^"]*"[^>]+|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"',
|
||||
r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn|videoButton)\b[^"]*"[^>]+|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"',
|
||||
webpage, 'media link', default=None, flags=re.MULTILINE)
|
||||
|
||||
if not json_metadata:
|
||||
@ -32,7 +33,7 @@ class WDRBaseIE(InfoExtractor):
|
||||
jsonp_url = media_link_obj['mediaObj']['url']
|
||||
|
||||
metadata = self._download_json(
|
||||
jsonp_url, 'metadata', transform_source=strip_jsonp)
|
||||
jsonp_url, display_id, transform_source=strip_jsonp)
|
||||
|
||||
metadata_tracker_data = metadata['trackerData']
|
||||
metadata_media_resource = metadata['mediaResource']
|
||||
@ -161,23 +162,23 @@ class WDRIE(WDRBaseIE):
|
||||
{
|
||||
'url': 'http://www.wdrmaus.de/aktuelle-sendung/index.php5',
|
||||
'info_dict': {
|
||||
'id': 'mdb-1096487',
|
||||
'ext': 'flv',
|
||||
'id': 'mdb-1323501',
|
||||
'ext': 'mp4',
|
||||
'upload_date': 're:^[0-9]{8}$',
|
||||
'title': 're:^Die Sendung mit der Maus vom [0-9.]{10}$',
|
||||
'description': '- Die Sendung mit der Maus -',
|
||||
'description': 'Die Seite mit der Maus -',
|
||||
},
|
||||
'skip': 'The id changes from week to week because of the new episode'
|
||||
},
|
||||
{
|
||||
'url': 'http://www.wdrmaus.de/sachgeschichten/sachgeschichten/achterbahn.php5',
|
||||
'url': 'http://www.wdrmaus.de/filme/sachgeschichten/achterbahn.php5',
|
||||
'md5': '803138901f6368ee497b4d195bb164f2',
|
||||
'info_dict': {
|
||||
'id': 'mdb-186083',
|
||||
'ext': 'mp4',
|
||||
'upload_date': '20130919',
|
||||
'title': 'Sachgeschichte - Achterbahn ',
|
||||
'description': '- Die Sendung mit der Maus -',
|
||||
'description': 'Die Seite mit der Maus -',
|
||||
},
|
||||
},
|
||||
{
|
||||
@ -186,7 +187,7 @@ class WDRIE(WDRBaseIE):
|
||||
'info_dict': {
|
||||
'id': 'mdb-869971',
|
||||
'ext': 'flv',
|
||||
'title': 'Funkhaus Europa Livestream',
|
||||
'title': 'COSMO Livestream',
|
||||
'description': 'md5:2309992a6716c347891c045be50992e4',
|
||||
'upload_date': '20160101',
|
||||
},
|
||||
|
@ -773,7 +773,7 @@ def parseOpts(overrideArguments=None):
|
||||
help='Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)')
|
||||
postproc.add_option(
|
||||
'--audio-format', metavar='FORMAT', dest='audioformat', default='best',
|
||||
help='Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "%default" by default; No effect without -x')
|
||||
help='Specify audio format: "best", "aac", "flac", "mp3", "m4a", "opus", "vorbis", or "wav"; "%default" by default; No effect without -x')
|
||||
postproc.add_option(
|
||||
'--audio-quality', metavar='QUALITY',
|
||||
dest='audioquality', default='5',
|
||||
|
@ -26,15 +26,25 @@ from ..utils import (
|
||||
|
||||
|
||||
EXT_TO_OUT_FORMATS = {
|
||||
"aac": "adts",
|
||||
"m4a": "ipod",
|
||||
"mka": "matroska",
|
||||
"mkv": "matroska",
|
||||
"mpg": "mpeg",
|
||||
"ogv": "ogg",
|
||||
"ts": "mpegts",
|
||||
"wma": "asf",
|
||||
"wmv": "asf",
|
||||
'aac': 'adts',
|
||||
'flac': 'flac',
|
||||
'm4a': 'ipod',
|
||||
'mka': 'matroska',
|
||||
'mkv': 'matroska',
|
||||
'mpg': 'mpeg',
|
||||
'ogv': 'ogg',
|
||||
'ts': 'mpegts',
|
||||
'wma': 'asf',
|
||||
'wmv': 'asf',
|
||||
}
|
||||
ACODECS = {
|
||||
'mp3': 'libmp3lame',
|
||||
'aac': 'aac',
|
||||
'flac': 'flac',
|
||||
'm4a': 'aac',
|
||||
'opus': 'opus',
|
||||
'vorbis': 'libvorbis',
|
||||
'wav': None,
|
||||
}
|
||||
|
||||
|
||||
@ -237,7 +247,7 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
|
||||
acodec = 'copy'
|
||||
extension = 'm4a'
|
||||
more_opts = ['-bsf:a', 'aac_adtstoasc']
|
||||
elif filecodec in ['aac', 'mp3', 'vorbis', 'opus']:
|
||||
elif filecodec in ['aac', 'flac', 'mp3', 'vorbis', 'opus']:
|
||||
# Lossless if possible
|
||||
acodec = 'copy'
|
||||
extension = filecodec
|
||||
@ -256,8 +266,8 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
|
||||
else:
|
||||
more_opts += ['-b:a', self._preferredquality + 'k']
|
||||
else:
|
||||
# We convert the audio (lossy)
|
||||
acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'opus': 'opus', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
|
||||
# We convert the audio (lossy if codec is lossy)
|
||||
acodec = ACODECS[self._preferredcodec]
|
||||
extension = self._preferredcodec
|
||||
more_opts = []
|
||||
if self._preferredquality is not None:
|
||||
|
@ -1,3 +1,3 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
__version__ = '2017.03.06'
|
||||
__version__ = '2017.03.24'
|
||||
|
Reference in New Issue
Block a user