Compare commits

28 Commits

Author SHA1 Message Date
Philipp Hagemeister
190ef07981 release 2016.01.01 2016-01-01 12:17:10 +01:00
Sergey M․
82597f0ec0 [ccc] Extract duration 2016-01-01 15:41:52 +06:00
Sergey M․
8499d21158 [ccc] Fix description extraction and update test 2016-01-01 15:29:42 +06:00
Sergey M․
c9154514c4 [ccc] Fix upload date extraction 2016-01-01 15:22:22 +06:00
Sergey M․
0d5095fc65 [ccc] Update _VALID_URL (Closes #8097) 2016-01-01 15:14:41 +06:00
Yen Chi Hsuan
034caf70b2 [youku] Fix extraction (#8068) 2016-01-01 13:33:01 +08:00
remitamine
e565cf6048 [nextmovie] Add new extractor 2015-12-31 22:47:18 +01:00
remitamine
59f197aec1 Merge branch 'master' of github.com:rg3/youtube-dl 2015-12-31 22:15:14 +01:00
remitamine
a0e5beb0fb [nick] Add new extractor 2015-12-31 22:12:05 +01:00
remitamine
c1e90619bd [mtv] extract mgid extraction and query building into separate methods 2015-12-31 22:10:00 +01:00
Sergey M․
fec09bf15d [einthusan] Improve extraction (Closes #7877) 2016-01-01 02:39:00 +06:00
j
a0d7ede350 Fix einthusan parser 2016-01-01 02:38:50 +06:00
Sergey M․
b26afec81f [einthusan] Improve extraction (Closes #7877) 2016-01-01 02:23:03 +06:00
Sergey M․
8f7c4f7d2e Merge branch 'master' of github.com:rg3/youtube-dl 2016-01-01 02:22:26 +06:00
j
0416006a30 Fix einthusan parser 2016-01-01 01:58:49 +06:00
remitamine
7f9134fb2d [tvland] inherit from MTVServicesInfoExtractor 2015-12-31 20:52:47 +01:00
remitamine
91e274546c [tvland] Add new extractor 2015-12-31 20:23:48 +01:00
Jaime Marquínez Ferrándiz
69f8595256 [espn] Extract better titles 2015-12-31 20:06:21 +01:00
Jaime Marquínez Ferrándiz
930087f2f6 [espn] Support 'intl' videos (#7858) 2015-12-31 20:04:17 +01:00
Jaime Marquínez Ferrándiz
9f9f7664b5 [espn] Update test 2015-12-31 19:52:48 +01:00
Sergey M․
72528252e3 [pandoratv] Add IE names 2016-01-01 00:42:42 +06:00
Sergey M․
e4bd63f9c0 [pandoratv] Improve extraction (Closes #7921) 2016-01-01 00:40:27 +06:00
j
9accfed4e7 [pandoratv] Add new extractor (closes #6884) 2016-01-01 00:18:13 +06:00
remitamine
f1e21efe63 [tlc] remove TlcIE 2015-12-31 18:33:40 +01:00
remitamine
b05641ce40 [discovery] improve _VALID_URL regex 2015-12-31 18:24:49 +01:00
remitamine
fec040e754 [discovery] add support for discovery related sites
- investigationdiscovery.com
- discoverylife.com
- animalplanet.com
- ahctv.com
- destinationamerica.com
- sciencechannel.com
- tlc.com
- velocity.com
2015-12-31 17:29:37 +01:00
Sergey M․
34a9da136f [regiotv] Improve extraction (Closes #7915) 2015-12-31 22:12:47 +06:00
j
c43fda4c1a [regiotv] Add new extractor (closes #7797) 2015-12-31 22:11:13 +06:00
16 changed files with 431 additions and 125 deletions

View File

@@ -28,7 +28,6 @@
- **AlJazeera**
- **Allocine**
- **AlphaPorno**
- **AnimalPlanet**
- **anitube.se**
- **AnySex**
- **Aparat**
@@ -368,11 +367,13 @@
- **Newstube**
- **NextMedia**: 蘋果日報
- **NextMediaActionNews**: 蘋果日報 - 動新聞
- **nextmovie.com**
- **nfb**: National Film Board of Canada
- **nfl.com**
- **nhl.com**
- **nhl.com:news**: NHL news
- **nhl.com:videocenter**: NHL videocenter category
- **nick.com**
- **niconico**: ニコニコ動画
- **NiconicoPlaylist**
- **njoy**: N-JOY
@@ -411,6 +412,7 @@
- **orf:iptv**: iptv.ORF.at
- **orf:oe1**: Radio Österreich 1
- **orf:tvthek**: ORF TVthek
- **pandora.tv**: 판도라TV
- **parliamentlive.tv**: UK parliament videos
- **Patreon**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
@@ -459,6 +461,7 @@
- **RBMARadio**
- **RDS**: RDS.ca
- **RedTube**
- **RegioTV**
- **Restudy**
- **ReverbNation**
- **RingTV**
@@ -582,7 +585,6 @@
- **THVideo**
- **THVideoPlaylist**
- **tinypic**: tinypic.com videos
- **tlc.com**
- **tlc.de**
- **TMZ**
- **TMZArticle**
@@ -611,6 +613,7 @@
- **TVC**
- **TVCArticle**
- **tvigle**: Интернет-телевидение Tvigle.ru
- **tvland.com**
- **tvp.pl**
- **tvp.pl:Series**
- **TVPlay**: TV3Play and related services

View File

@@ -19,7 +19,6 @@ from .aftonbladet import AftonbladetIE
from .airmozilla import AirMozillaIE
from .aljazeera import AlJazeeraIE
from .alphaporno import AlphaPornoIE
from .animalplanet import AnimalPlanetIE
from .anitube import AnitubeIE
from .anysex import AnySexIE
from .aol import AolIE
@@ -435,6 +434,7 @@ from .nextmedia import (
NextMediaActionNewsIE,
AppleDailyIE,
)
from .nextmovie import NextMovieIE
from .nfb import NFBIE
from .nfl import NFLIE
from .nhl import (
@@ -442,6 +442,7 @@ from .nhl import (
NHLNewsIE,
NHLVideocenterIE,
)
from .nick import NickIE
from .niconico import NiconicoIE, NiconicoPlaylistIE
from .ninegag import NineGagIE
from .noco import NocoIE
@@ -498,6 +499,7 @@ from .orf import (
ORFFM4IE,
ORFIPTVIE,
)
from .pandoratv import PandoraTVIE
from .parliamentliveuk import ParliamentLiveUKIE
from .patreon import PatreonIE
from .pbs import PBSIE
@@ -552,6 +554,7 @@ from .rai import (
from .rbmaradio import RBMARadioIE
from .rds import RDSIE
from .redtube import RedTubeIE
from .regiotv import RegioTVIE
from .restudy import RestudyIE
from .reverbnation import ReverbNationIE
from .ringtv import RingTVIE
@@ -695,7 +698,7 @@ from .thesixtyone import TheSixtyOneIE
from .thisamericanlife import ThisAmericanLifeIE
from .thisav import ThisAVIE
from .tinypic import TinyPicIE
from .tlc import TlcIE, TlcDeIE
from .tlc import TlcDeIE
from .tmz import (
TMZIE,
TMZArticleIE,
@@ -738,6 +741,7 @@ from .tvc import (
TVCArticleIE,
)
from .tvigle import TvigleIE
from .tvland import TVLandIE
from .tvp import TvpIE, TvpSeriesIE
from .tvplay import TVPlayIE
from .tweakers import TweakersIE

View File

@@ -1,53 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
parse_duration,
parse_iso8601,
)
class AnimalPlanetIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?animalplanet\.com/([^/]+/)*(?P<id>[^/\?#]+)'
_TESTS = [{
'url': 'http://www.animalplanet.com/tv-shows/i-shouldnt-be-alive/videos/dog-saves-injured-owner/',
'info_dict': {
'id': '10608',
'ext': 'mp4',
'title': 'Dog Saves Injured Owner',
'description': 'A world class athlete is put to the test when she falls into a canyon and breaks her hip. Her only companion is her dog, Taz, who is on a mission to save her!',
'upload_date': '20100410',
'timestamp': 1270857727,
'duration': 220,
},
'params': {
# m3u8 download
'skip_download': True,
}
}, {
'url': 'http://www.animalplanet.com/longfin-eels-maneaters/',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_data = self._parse_json(self._search_regex(
r'initialVideoData\s*=\s*({.+?});',
webpage, 'initialVideoData'), display_id)['playlist'][0]
return {
'id': compat_str(video_data['id']),
'display_id': display_id,
'title': video_data['title'],
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnailURL'),
'duration': parse_duration(video_data.get('video_length')),
'timestamp': parse_iso8601(video_data.get('publishedDate')),
'formats': self._extract_m3u8_formats(
video_data['src'], display_id, 'mp4',
'm3u8_native', m3u8_id='hls')
}

View File

@@ -5,6 +5,7 @@ import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
qualities,
unified_strdate,
)
@@ -12,21 +13,25 @@ from ..utils import (
class CCCIE(InfoExtractor):
IE_NAME = 'media.ccc.de'
_VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/[^?#]+/[^?#/]*?_(?P<id>[0-9]{8,})._[^?#/]*\.html'
_VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/v/(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://media.ccc.de/browse/congress/2013/30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor.html#video',
_TESTS = [{
'url': 'https://media.ccc.de/v/30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor#video',
'md5': '3a1eda8f3a29515d27f5adb967d7e740',
'info_dict': {
'id': '20131228183',
'id': '30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor',
'ext': 'mp4',
'title': 'Introduction to Processor Design',
'description': 'md5:5ddbf8c734800267f2cee4eab187bc1b',
'description': 'md5:80be298773966f66d56cb11260b879af',
'thumbnail': 're:^https?://.*\.jpg$',
'view_count': int,
'upload_date': '20131229',
'upload_date': '20131228',
'duration': 3660,
}
}
}, {
'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -40,14 +45,17 @@ class CCCIE(InfoExtractor):
title = self._html_search_regex(
r'(?s)<h1>(.*?)</h1>', webpage, 'title')
description = self._html_search_regex(
r"(?s)<p class='description'>(.*?)</p>",
r"(?s)<h3>About</h3>(.+?)<h3>",
webpage, 'description', fatal=False)
upload_date = unified_strdate(self._html_search_regex(
r"(?s)<span class='[^']*fa-calendar-o'></span>(.*?)</li>",
r"(?s)<span[^>]+class='[^']*fa-calendar-o'[^>]*>(.+?)</span>",
webpage, 'upload date', fatal=False))
view_count = int_or_none(self._html_search_regex(
r"(?s)<span class='[^']*fa-eye'></span>(.*?)</li>",
webpage, 'view count', fatal=False))
duration = parse_duration(self._html_search_regex(
r'(?s)<span[^>]+class=(["\']).*?fa-clock-o.*?\1[^>]*></span>(?P<duration>.+?)</li',
webpage, 'duration', fatal=False, group='duration'))
matches = re.finditer(r'''(?xs)
<(?:span|div)\s+class='label\s+filetype'>(?P<format>.*?)</(?:span|div)>\s*
@@ -95,5 +103,6 @@ class CCCIE(InfoExtractor):
'thumbnail': thumbnail,
'view_count': view_count,
'upload_date': upload_date,
'duration': duration,
'formats': formats,
}
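
Note on the _VALID_URL change above: a minimal sketch, assuming plain `re` in place of the extractor machinery, of what the new pattern accepts. `match_id` is an illustrative stand-in for `InfoExtractor._match_id`, not the real helper; the two URLs are the ones that appear in the tests in this diff.

```python
import re

# New _VALID_URL, copied from the hunk above.
VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/v/(?P<id>[^/?#&]+)'


def match_id(url):
    """Illustrative stand-in for InfoExtractor._match_id (not the real helper)."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


new_style = 'https://media.ccc.de/v/32c3-7368-shopshifting#download'
old_style = ('http://media.ccc.de/browse/congress/2013/'
             '30C3_-_5443_-_en_-_saal_g_-_201312281830_-_'
             'introduction_to_processor_design_-_byterazor.html#video')

print(match_id(new_style))  # the slug after /v/, without the #fragment
print(match_id(old_style))  # None: /browse/ URLs do not match the new pattern
```

The id is now the whole slug rather than a numeric fragment of it, which is consistent with the updated 'id' value in the test above.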

View File

@@ -9,7 +9,17 @@ from ..compat import compat_str
class DiscoveryIE(InfoExtractor):
_VALID_URL = r'https?://www\.discovery\.com\/[a-zA-Z0-9\-]*/[a-zA-Z0-9\-]*/videos/(?P<id>[a-zA-Z0-9_\-]*)(?:\.htm)?'
_VALID_URL = r'''(?x)http://(?:www\.)?(?:
discovery|
investigationdiscovery|
discoverylife|
animalplanet|
ahctv|
destinationamerica|
sciencechannel|
tlc|
velocity
)\.com/(?:[^/]+/)*(?P<id>[^./?#]+)'''
_TESTS = [{
'url': 'http://www.discovery.com/tv-shows/mythbusters/videos/mission-impossible-outtakes.htm',
'info_dict': {
@@ -21,8 +31,8 @@ class DiscoveryIE(InfoExtractor):
'don\'t miss Adam moon-walking as Jamie ... behind Jamie\'s'
' back.'),
'duration': 156,
'timestamp': 1303099200,
'upload_date': '20110418',
'timestamp': 1302032462,
'upload_date': '20110405',
},
'params': {
'skip_download': True, # requires ffmpeg
@@ -33,27 +43,38 @@ class DiscoveryIE(InfoExtractor):
'id': 'mythbusters-the-simpsons',
'title': 'MythBusters: The Simpsons',
},
'playlist_count': 9,
'playlist_mincount': 10,
}, {
'url': 'http://www.animalplanet.com/longfin-eels-maneaters/',
'info_dict': {
'id': '78326',
'ext': 'mp4',
'title': 'Longfin Eels: Maneaters?',
'description': 'Jeremy Wade tests whether or not New Zealand\'s longfin eels are man-eaters by covering himself in fish guts and getting in the water with them.',
'upload_date': '20140725',
'timestamp': 1406246400,
'duration': 116,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
info = self._download_json(url + '?flat=1', video_id)
display_id = self._match_id(url)
info = self._download_json(url + '?flat=1', display_id)
video_title = info.get('playlist_title') or info.get('video_title')
entries = [{
'id': compat_str(video_info['id']),
'formats': self._extract_m3u8_formats(
video_info['src'], video_id, ext='mp4',
video_info['src'], display_id, 'mp4', 'm3u8_native', m3u8_id='hls',
note='Download m3u8 information for video %d' % (idx + 1)),
'title': video_info['title'],
'description': video_info.get('description'),
'duration': parse_duration(video_info.get('video_length')),
'webpage_url': video_info.get('href'),
'webpage_url': video_info.get('href') or video_info.get('url'),
'thumbnail': video_info.get('thumbnailURL'),
'alt_title': video_info.get('secondary_title'),
'timestamp': parse_iso8601(video_info.get('publishedDate')),
} for idx, video_info in enumerate(info['playlist'])]
return self.playlist_result(entries, video_id, video_title)
return self.playlist_result(entries, display_id, video_title)
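
Note on the widened _VALID_URL above: a small sketch, assuming plain `re` outside youtube-dl, of how the verbose pattern picks the display id. `match_id` is a hypothetical helper written for this note. The pattern is copied from the hunk; the sample URLs come from tests elsewhere in this diff.

```python
import re

# Copied from the hunk above. In (?x) mode the whitespace between the
# alternatives is ignored; '#' inside the character class stays literal.
VALID_URL = r'''(?x)http://(?:www\.)?(?:
        discovery|
        investigationdiscovery|
        discoverylife|
        animalplanet|
        ahctv|
        destinationamerica|
        sciencechannel|
        tlc|
        velocity
    )\.com/(?:[^/]+/)*(?P<id>[^./?#]+)'''


def match_id(url):
    """Hypothetical helper: return the 'id' group or None."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


print(match_id('http://www.animalplanet.com/longfin-eels-maneaters/'))
print(match_id('http://www.tlc.com/tv-shows/cake-boss/videos/too-big-to-fly.htm'))
print(match_id('http://www.tlc.de/sendungen/example/videos/example'))  # None
```

One thing the sketch makes visible: as written, the new pattern starts with `http://`, so an `https://` URL would not match, whereas the replaced pattern used `https?://`.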

View File

@@ -1,9 +1,12 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
remove_start,
sanitized_Request,
)
class EinthusanIE(InfoExtractor):
@@ -34,27 +37,33 @@ class EinthusanIE(InfoExtractor):
]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
video_id = self._match_id(url)
video_title = self._html_search_regex(
r'<h1><a class="movie-title".*?>(.*?)</a></h1>', webpage, 'title')
request = sanitized_Request(url)
request.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 5.2; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0')
webpage = self._download_webpage(request, video_id)
video_url = self._html_search_regex(
r'''(?s)jwplayer\("mediaplayer"\)\.setup\({.*?'file': '([^']+)'.*?}\);''',
webpage, 'video url')
title = self._html_search_regex(
r'<h1><a[^>]+class=["\']movie-title["\'][^>]*>(.+?)</a></h1>',
webpage, 'title')
video_id = self._search_regex(
r'data-movieid=["\'](\d+)', webpage, 'video id', default=video_id)
video_url = self._download_webpage(
'http://cdn.einthusan.com/geturl/%s/hd/London,Washington,Toronto,Dallas,San,Sydney/'
% video_id, video_id)
description = self._html_search_meta('description', webpage)
thumbnail = self._html_search_regex(
r'''<a class="movie-cover-wrapper".*?><img src=["'](.*?)["'].*?/></a>''',
webpage, "thumbnail url", fatal=False)
if thumbnail is not None:
thumbnail = thumbnail.replace('..', 'http://www.einthusan.com')
thumbnail = compat_urlparse.urljoin(url, remove_start(thumbnail, '..'))
return {
'id': video_id,
'title': video_title,
'title': title,
'url': video_url,
'thumbnail': thumbnail,
'description': description,
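
Note on the thumbnail handling above: a rough sketch, assuming Python 3 and the standard library only, of the `remove_start` plus `urljoin` approach. `remove_start` here is a local re-implementation for illustration (youtube-dl ships its own in utils), and the page URL and thumbnail paths are made-up values.

```python
from urllib.parse import urljoin


def remove_start(s, start):
    """Local re-implementation for illustration only."""
    return s[len(start):] if s is not None and s.startswith(start) else s


page_url = 'http://www.einthusan.com/movies/watch.php?id=2447'  # hypothetical
relative_thumb = '../images/covers/example.jpg'                 # hypothetical
absolute_thumb = 'http://cdn.example.invalid/example.jpg'       # hypothetical

# A leading '..' is dropped, then the path is resolved against the page URL.
print(urljoin(page_url, remove_start(relative_thumb, '..')))
# An already absolute URL passes through unchanged.
print(urljoin(page_url, remove_start(absolute_thumb, '..')))
```

Compared with the replaced `thumbnail.replace('..', ...)` call, this only strips a leading '..' and does not hard-code the host.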

View File

@@ -1,6 +1,7 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import remove_end
class ESPNIE(InfoExtractor):
@@ -10,8 +11,20 @@ class ESPNIE(InfoExtractor):
'info_dict': {
'id': 'FkYWtmazr6Ed8xmvILvKLWjd4QvYZpzG',
'ext': 'mp4',
'title': 'dm_140128_30for30Shorts___JudgingJewellv2',
'description': '',
'title': '30 for 30 Shorts: Judging Jewell',
'description': None,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
# intl video, from http://www.espnfc.us/video/mls-highlights/150/video/2743663/must-see-moments-best-of-the-mls-season
'url': 'http://espn.go.com/video/clip?id=2743663',
'info_dict': {
'id': '50NDFkeTqRHB0nXBOK-RGdSG5YQPuxHg',
'ext': 'mp4',
'title': 'Must-See Moments: Best of the MLS season',
},
'params': {
# m3u8 download
@@ -43,12 +56,23 @@ class ESPNIE(InfoExtractor):
r'class="video-play-button"[^>]+data-id="(\d+)',
webpage, 'video id')
cms = 'espn'
if 'data-source="intl"' in webpage:
cms = 'intl'
player_url = 'https://espn.go.com/video/iframe/twitter/?id=%s&cms=%s' % (video_id, cms)
player = self._download_webpage(
'https://espn.go.com/video/iframe/twitter/?id=%s' % video_id, video_id)
player_url, video_id)
pcode = self._search_regex(
r'["\']pcode=([^"\']+)["\']', player, 'pcode')
return self.url_result(
'ooyalaexternal:espn:%s:%s' % (video_id, pcode),
'OoyalaExternal')
title = remove_end(
self._og_search_title(webpage),
'- ESPN Video').strip()
return {
'_type': 'url_transparent',
'url': 'ooyalaexternal:%s:%s:%s' % (cms, video_id, pcode),
'ie_key': 'OoyalaExternal',
'title': title,
}
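
Note on the new return value above: a minimal sketch, assuming no youtube-dl imports, of how the title and the `ooyalaexternal:` URL are put together. `remove_end` is re-implemented locally for illustration, `build_result` is a hypothetical function, and the og:title, video id, pcode and page snippet are invented sample values.

```python
def remove_end(s, end):
    """Local re-implementation for illustration only."""
    return s[:-len(end)] if s is not None and s.endswith(end) else s


def build_result(og_title, video_id, pcode, webpage):
    """Hypothetical helper mirroring the logic in the hunk above."""
    # 'intl' videos are detected by a marker in the page, as in the diff.
    cms = 'intl' if 'data-source="intl"' in webpage else 'espn'
    title = remove_end(og_title, '- ESPN Video').strip()
    return {
        '_type': 'url_transparent',
        'url': 'ooyalaexternal:%s:%s:%s' % (cms, video_id, pcode),
        'ie_key': 'OoyalaExternal',
        'title': title,
    }


result = build_result(
    '30 for 30 Shorts: Judging Jewell - ESPN Video',  # hypothetical og:title
    '12345', 'examplepcode',                          # hypothetical id and pcode
    '<div data-source="intl"></div>')                 # hypothetical page snippet
print(result['title'])
print(result['url'])
```

Because the result is `url_transparent`, the title set here is presumably meant to take precedence over whatever the Ooyala extractor reports, which matches the updated 'title' expectation in the test.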

View File

@@ -167,14 +167,16 @@ class MTVServicesInfoExtractor(InfoExtractor):
'description': description,
}
def _get_feed_query(self, uri):
data = {'uri': uri}
if self._LANG:
data['lang'] = self._LANG
return compat_urllib_parse.urlencode(data)
def _get_videos_info(self, uri):
video_id = self._id_from_uri(uri)
feed_url = self._get_feed_url(uri)
data = compat_urllib_parse.urlencode({'uri': uri})
info_url = feed_url + '?'
if self._LANG:
info_url += 'lang=%s&' % self._LANG
info_url += data
info_url = feed_url + '?' + self._get_feed_query(uri)
return self._get_videos_info_from_url(info_url, video_id)
def _get_videos_info_from_url(self, url, video_id):
@@ -184,9 +186,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
return self.playlist_result(
[self._get_video_info(item) for item in idoc.findall('.//item')])
def _real_extract(self, url):
title = url_basename(url)
webpage = self._download_webpage(url, title)
def _extract_mgid(self, webpage):
try:
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf
# or http://media.mtvnservices.com/{mgid}
@@ -207,7 +207,12 @@ class MTVServicesInfoExtractor(InfoExtractor):
'sm4:video:embed', webpage, 'sm4 embed', default='')
mgid = self._search_regex(
r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid')
return mgid
def _real_extract(self, url):
title = url_basename(url)
webpage = self._download_webpage(url, title)
mgid = self._extract_mgid(webpage)
videos_info = self._get_videos_info(mgid)
return videos_info
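
Note on the refactoring above: a simplified sketch, assuming Python 3's `urllib.parse.urlencode` in place of `compat_urllib_parse.urlencode`, of why pulling the query into `_get_feed_query` helps. `FeedBase`, `CustomFeed`, `info_url` and the feed URL are hypothetical names for this note; only the method name and the query keys come from the diff.

```python
from urllib.parse import urlencode


class FeedBase:
    """Hypothetical, stripped-down stand-in for MTVServicesInfoExtractor."""
    _LANG = None
    _FEED_URL = 'http://example.invalid/feed'  # placeholder URL

    def _get_feed_query(self, uri):
        data = {'uri': uri}
        if self._LANG:
            data['lang'] = self._LANG
        return urlencode(data)

    def info_url(self, uri):
        return self._FEED_URL + '?' + self._get_feed_query(uri)


class CustomFeed(FeedBase):
    """Subclass that needs different query parameters, as NextMovieIE does."""

    def _get_feed_query(self, uri):
        return urlencode({'feed': '1505', 'mgid': uri})


print(FeedBase().info_url('mgid:a:b'))
print(CustomFeed().info_url('mgid:a:b'))
```

With the query in its own method, a subclass can override just that piece instead of re-implementing `_get_videos_info`.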

View File

@@ -0,0 +1,30 @@
# coding: utf-8
from __future__ import unicode_literals
from .mtv import MTVServicesInfoExtractor
from ..compat import compat_urllib_parse
class NextMovieIE(MTVServicesInfoExtractor):
IE_NAME = 'nextmovie.com'
_VALID_URL = r'https?://(?:www\.)?nextmovie\.com/shows/[^/]+/\d{4}-\d{2}-\d{2}/(?P<id>[^/?#]+)'
_FEED_URL = 'http://lite.dextr.mtvi.com/service1/dispatch.htm'
_TESTS = [{
'url': 'http://www.nextmovie.com/shows/exclusives/2013-03-10/mgid:uma:videolist:nextmovie.com:1715019/',
'md5': '09a9199f2f11f10107d04fcb153218aa',
'info_dict': {
'id': '961726',
'ext': 'mp4',
'title': 'The Muppets\' Gravity',
},
}]
def _get_feed_query(self, uri):
return compat_urllib_parse.urlencode({
'feed': '1505',
'mgid': uri,
})
def _real_extract(self, url):
mgid = self._match_id(url)
return self._get_videos_info(mgid)

View File

@@ -0,0 +1,63 @@
# coding: utf-8
from __future__ import unicode_literals
from .mtv import MTVServicesInfoExtractor
from ..compat import compat_urllib_parse
class NickIE(MTVServicesInfoExtractor):
IE_NAME = 'nick.com'
_VALID_URL = r'https?://(?:www\.)?nick\.com/videos/clip/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://udat.mtvnservices.com/service1/dispatch.htm'
_TESTS = [{
'url': 'http://www.nick.com/videos/clip/alvinnn-and-the-chipmunks-112-full-episode.html',
'playlist': [
{
'md5': '6e5adc1e28253bbb1b28ab05403dd4d4',
'info_dict': {
'id': 'be6a17b0-412d-11e5-8ff7-0026b9414f30',
'ext': 'mp4',
'title': 'ALVINNN!!! and The Chipmunks: "Mojo Missing/Who\'s The Animal" S1',
'description': 'Alvin is convinced his mojo was in a cap he gave to a fan, and must find a way to get his hat back before the Chipmunks big concert.\nDuring a costume visit to the zoo, Alvin finds himself mistaken for the real Tasmanian devil.',
}
},
{
'md5': 'd7be441fc53a1d4882fa9508a1e5b3ce',
'info_dict': {
'id': 'be6b8f96-412d-11e5-8ff7-0026b9414f30',
'ext': 'mp4',
'title': 'ALVINNN!!! and The Chipmunks: "Mojo Missing/Who\'s The Animal" S2',
'description': 'Alvin is convinced his mojo was in a cap he gave to a fan, and must find a way to get his hat back before the Chipmunks big concert.\nDuring a costume visit to the zoo, Alvin finds himself mistaken for the real Tasmanian devil.',
}
},
{
'md5': 'efffe1728a234b2b0d2f2b343dd1946f',
'info_dict': {
'id': 'be6cf7e6-412d-11e5-8ff7-0026b9414f30',
'ext': 'mp4',
'title': 'ALVINNN!!! and The Chipmunks: "Mojo Missing/Who\'s The Animal" S3',
'description': 'Alvin is convinced his mojo was in a cap he gave to a fan, and must find a way to get his hat back before the Chipmunks big concert.\nDuring a costume visit to the zoo, Alvin finds himself mistaken for the real Tasmanian devil.',
}
},
{
'md5': '1ec6690733ab9f41709e274a1d5c7556',
'info_dict': {
'id': 'be6e3354-412d-11e5-8ff7-0026b9414f30',
'ext': 'mp4',
'title': 'ALVINNN!!! and The Chipmunks: "Mojo Missing/Who\'s The Animal" S4',
'description': 'Alvin is convinced his mojo was in a cap he gave to a fan, and must find a way to get his hat back before the Chipmunks big concert.\nDuring a costume visit to the zoo, Alvin finds himself mistaken for the real Tasmanian devil.',
}
},
],
}]
def _get_feed_query(self, uri):
return compat_urllib_parse.urlencode({
'feed': 'nick_arc_player_prime',
'mgid': uri,
})
def _extract_mgid(self, webpage):
return self._search_regex(r'data-contenturi="([^"]+)', webpage, 'mgid')

View File

@@ -0,0 +1,78 @@
# encoding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_str,
compat_urlparse,
)
from ..utils import (
ExtractorError,
float_or_none,
parse_duration,
str_to_int,
)
class PandoraTVIE(InfoExtractor):
IE_NAME = 'pandora.tv'
IE_DESC = '판도라TV'
_VALID_URL = r'https?://(?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?'
_TEST = {
'url': 'http://jp.channel.pandora.tv/channel/video.ptv?c1=&prgid=53294230&ch_userid=mikakim&ref=main&lot=cate_01_2',
'info_dict': {
'id': '53294230',
'ext': 'flv',
'title': '頭を撫でてくれる?',
'description': '頭を撫でてくれる?',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 39,
'upload_date': '20151218',
'uploader': 'カワイイ動物まとめ',
'uploader_id': 'mikakim',
'view_count': int,
'like_count': int,
}
}
def _real_extract(self, url):
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
video_id = qs.get('prgid', [None])[0]
user_id = qs.get('ch_userid', [None])[0]
if any(not f for f in (video_id, user_id,)):
raise ExtractorError('Invalid URL', expected=True)
data = self._download_json(
'http://m.pandora.tv/?c=view&m=viewJsonApi&ch_userid=%s&prgid=%s'
% (user_id, video_id), video_id)
info = data['data']['rows']['vod_play_info']['result']
formats = []
for format_id, format_url in info.items():
if not format_url:
continue
height = self._search_regex(
r'^v(\d+)[Uu]rl$', format_id, 'height', default=None)
if not height:
continue
formats.append({
'format_id': '%sp' % height,
'url': format_url,
'height': int(height),
})
self._sort_formats(formats)
return {
'id': video_id,
'title': info['subject'],
'description': info.get('body'),
'thumbnail': info.get('thumbnail') or info.get('poster'),
'duration': float_or_none(info.get('runtime'), 1000) or parse_duration(info.get('time')),
'upload_date': info['fid'][:8] if isinstance(info.get('fid'), compat_str) else None,
'uploader': info.get('nickname'),
'uploader_id': info.get('upload_userid'),
'view_count': str_to_int(info.get('hit')),
'like_count': str_to_int(info.get('likecnt')),
'formats': formats,
}
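
Note on the format loop above: a small sketch, assuming plain `re` and a made-up `info` dict, of how keys such as `v360Url` turn into format entries. `build_formats` is a hypothetical function, and the final sort by height is a simplification standing in for `_sort_formats`, which considers more criteria.

```python
import re


def build_formats(info):
    """Hypothetical helper mirroring the loop in the hunk above."""
    formats = []
    for format_id, format_url in info.items():
        if not format_url:
            continue  # empty URLs are skipped
        mobj = re.search(r'^v(\d+)[Uu]rl$', format_id)
        if not mobj:
            continue  # keys that do not look like v<height>Url are ignored
        height = mobj.group(1)
        formats.append({
            'format_id': '%sp' % height,
            'url': format_url,
            'height': int(height),
        })
    # Simplification: sort by height only.
    formats.sort(key=lambda f: f['height'])
    return formats


info = {  # made-up sample data
    'subject': 'Example',
    'v1080url': 'http://example.invalid/1080.flv',
    'v480Url': '',
    'v360Url': 'http://example.invalid/360.flv',
}
for f in build_formats(info):
    print(f['format_id'], f['url'])
```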

View File

@@ -0,0 +1,62 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
sanitized_Request,
xpath_text,
xpath_with_ns,
)
class RegioTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?regio-tv\.de/video/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://www.regio-tv.de/video/395808.html',
'info_dict': {
'id': '395808',
'ext': 'mp4',
'title': 'Wir in Ludwigsburg',
'description': 'Mit unseren zuckersüßen Adventskindern, außerdem besuchen wir die Abendsterne!',
}
}, {
'url': 'http://www.regio-tv.de/video/395808',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
key = self._search_regex(
r'key\s*:\s*(["\'])(?P<key>.+?)\1', webpage, 'key', group='key')
title = self._og_search_title(webpage)
SOAP_TEMPLATE = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body><{0} xmlns="http://v.telvi.de/"><key xsi:type="xsd:string">{1}</key></{0}></soap:Body></soap:Envelope>'
request = sanitized_Request(
'http://v.telvi.de/',
SOAP_TEMPLATE.format('GetHTML5VideoData', key).encode('utf-8'))
video_data = self._download_xml(request, video_id, 'Downloading video XML')
NS_MAP = {
'xsi': 'http://www.w3.org/2001/XMLSchema-instance',
'soap': 'http://schemas.xmlsoap.org/soap/envelope/',
}
video_url = xpath_text(
video_data, xpath_with_ns('.//video', NS_MAP), 'video url', fatal=True)
thumbnail = xpath_text(
video_data, xpath_with_ns('.//image', NS_MAP), 'thumbnail')
description = self._og_search_description(
webpage) or self._html_search_meta('description', webpage)
return {
'id': video_id,
'url': video_url,
'title': title,
'description': description,
'thumbnail': thumbnail,
}
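
Note on the SOAP request above: a minimal sketch, assuming only the standard library, of what the request body looks like once the template is filled in. The template text is the one from the hunk, split across lines for readability; the key is a made-up value, and nothing here contacts the real service or assumes anything about its response.

```python
import xml.etree.ElementTree as ET

# Same template as in the hunk above, wrapped for readability.
SOAP_TEMPLATE = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
    'xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><{0} xmlns="http://v.telvi.de/">'
    '<key xsi:type="xsd:string">{1}</key>'
    '</{0}></soap:Body></soap:Envelope>')

key = 'abc123'  # hypothetical; the extractor reads this from the web page
body = SOAP_TEMPLATE.format('GetHTML5VideoData', key).encode('utf-8')

# Parse the body back to confirm it is well-formed and carries the key.
root = ET.fromstring(body)
key_el = root.find('.//{http://v.telvi.de/}key')
print(root.tag)
print(key_el.text)
```

The `{0}` placeholder appears twice (opening and closing tag), so the same template could presumably serve other method names as well.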

View File

@@ -4,32 +4,9 @@ import re
from .common import InfoExtractor
from .brightcove import BrightcoveLegacyIE
from .discovery import DiscoveryIE
from ..compat import compat_urlparse
class TlcIE(DiscoveryIE):
IE_NAME = 'tlc.com'
_VALID_URL = r'http://www\.tlc\.com\/[a-zA-Z0-9\-]*/[a-zA-Z0-9\-]*/videos/(?P<id>[a-zA-Z0-9\-]*)(.htm)?'
# DiscoveryIE has _TESTS
_TESTS = [{
'url': 'http://www.tlc.com/tv-shows/cake-boss/videos/too-big-to-fly.htm',
'info_dict': {
'id': '104493',
'ext': 'mp4',
'title': 'Too Big to Fly',
'description': 'Buddy has taken on a high flying task.',
'duration': 119,
'timestamp': 1393365060,
'upload_date': '20140225',
},
'params': {
'skip_download': True, # requires ffmpeg
},
}]
class TlcDeIE(InfoExtractor):
IE_NAME = 'tlc.de'
_VALID_URL = r'http://www\.tlc\.de/sendungen/[^/]+/videos/(?P<title>[^/?]+)'

View File

@@ -0,0 +1,64 @@
# coding: utf-8
from __future__ import unicode_literals
from .mtv import MTVServicesInfoExtractor
class TVLandIE(MTVServicesInfoExtractor):
IE_NAME = 'tvland.com'
_VALID_URL = r'https?://(?:www\.)?tvland\.com/(?:video-clips|episodes)/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://www.tvland.com/feeds/mrss/'
_TESTS = [{
'url': 'http://www.tvland.com/episodes/hqhps2/everybody-loves-raymond-the-invasion-ep-048',
'playlist': [
{
'md5': '227e9723b9669c05bf51098b10287aa7',
'info_dict': {
'id': 'bcbd3a83-3aca-4dca-809b-f78a87dcccdd',
'ext': 'mp4',
'title': 'Everybody Loves Raymond|Everybody Loves Raymond 048 HD, Part 1 of 5',
}
},
{
'md5': '9fa2b764ec0e8194fb3ebb01a83df88b',
'info_dict': {
'id': 'f4279548-6e13-40dd-92e8-860d27289197',
'ext': 'mp4',
'title': 'Everybody Loves Raymond|Everybody Loves Raymond 048 HD, Part 2 of 5',
}
},
{
'md5': 'fde4c3bccd7cc7e3576b338734153cec',
'info_dict': {
'id': '664e4a38-53ef-4115-9bc9-d0f789ec6334',
'ext': 'mp4',
'title': 'Everybody Loves Raymond|Everybody Loves Raymond 048 HD, Part 3 of 5',
}
},
{
'md5': '247f6780cda6891f2e49b8ae2b10e017',
'info_dict': {
'id': '9146ecf5-b15a-4d78-879c-6679b77f4960',
'ext': 'mp4',
'title': 'Everybody Loves Raymond|Everybody Loves Raymond 048 HD, Part 4 of 5',
}
},
{
'md5': 'fd269f33256e47bad5eb6c40de089ff6',
'info_dict': {
'id': '04334a2e-9a47-4214-a8c2-ae5792e2fab7',
'ext': 'mp4',
'title': 'Everybody Loves Raymond|Everybody Loves Raymond 048 HD, Part 5 of 5',
}
}
],
}, {
'url': 'http://www.tvland.com/video-clips/zea2ev/younger-younger--hilary-duff---little-lies',
'md5': 'e2c6389401cf485df26c79c247b08713',
'info_dict': {
'id': 'b8697515-4bbe-4e01-83d5-fa705ce5fa88',
'ext': 'mp4',
'title': 'Younger|Younger: Hilary Duff - Little Lies',
'description': 'md5:7d192f56ca8d958645c83f0de8ef0269'
},
}]

View File

@@ -2,6 +2,9 @@
from __future__ import unicode_literals
import base64
import random
import string
import time
from .common import InfoExtractor
from ..compat import (
@@ -141,6 +144,11 @@ class YoukuIE(InfoExtractor):
return video_urls_dict
@staticmethod
def get_ysuid():
return '%d%s' % (int(time.time()), ''.join([
random.choice(string.ascii_letters) for i in range(3)]))
def get_hd(self, fm):
hd_id_dict = {
'3gp': '0',
@@ -189,6 +197,8 @@ class YoukuIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
self._set_cookie('youku.com', '__ysuid', self.get_ysuid())
def retrieve_data(req_url, note):
headers = {
'Referer': req_url,
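
Note on `get_ysuid` above: a standalone sketch, assuming only the standard library, of the value that ends up in the `__ysuid` cookie. The function body follows the hunk; the surrounding script is illustrative.

```python
import random
import string
import time


def get_ysuid():
    """Current Unix time in seconds followed by three random ASCII letters."""
    suffix = ''.join(random.choice(string.ascii_letters) for _ in range(3))
    return '%d%s' % (int(time.time()), suffix)


ysuid = get_ysuid()
print(ysuid)  # digits followed by three letters; differs on every call
```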

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2015.12.31'
__version__ = '2016.01.01'