How to parse video links from an iframe tag using Python?

HTML

ivanovvan, 2022-03-12 20:36:46

There are movie sites from which I want to parse the temporary links to the video files. The problem is that these links sit inside an iframe tag, and when I pull them from the page they usually do not work: opening them shows something like "movie not found".
How do I get working links?
I used the Firefox webdriver:

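from selenium.webdriver.common.by import By
# find_element_by_tag_name was removed in newer Selenium 4 releases
driver.switch_to.frame(driver.find_element(By.TAG_NAME, "iframe"))
element2 = driver.find_element(By.XPATH, """/html/body/div[3]/div/div[2]/main/div[2]/div/article/div[1]/div[2]/div[3]/iframe""")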

P.S. I worked with sites such as zetflx, lord film, and kinogo.

2 answer(s)

snxx, 2022-03-13
@lppxx

I don't think you are looking in the right place. The site lets you download the videos directly, even without going through player 1/2.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from bs4 import BeautifulSoup as bs
from fake_useragent import UserAgent as usr

import requests
import re


def get_video(url):
    # Calling UserAgent() without arguments works across fake_useragent versions.
    ua = usr()
    hdr = {
        'accept': '*/*',
        'user-agent': ua.chrome
    }

    try:
        resp = requests.get(url=url, headers=hdr)
        sp = bs(resp.text, 'lxml')
        # Walk from the player iframe down to the download link.
        # The parentheses let the chained calls span several lines.
        video = (
            sp.find(re.compile('iframe'), id='cdn-player')
            .find('div', id='qplayer').find('div', id='qplayer_vbox')
            .find('div', id='qplayer_controls').find('div', id='qplayer_download_control')
            .find('div', class_='qp_down_nav').find(re.compile('^a$')).get('href')
        )

        # Stream the file to disk in chunks instead of holding it in memory.
        req = requests.get(video, headers=hdr, stream=True)

        with open('video.mp4', 'wb') as file:
            for chunk in req.iter_content(8192):
                file.write(chunk)

    except Exception:
        # Any failure (network error, missing element) ends up here.
        return 'Oops... Check the URL please!'

def main():
    error = get_video('https://kinogo.biz/28467-hobbit-nezhdannoe-puteshestvie-2012.html')
    if error:
        print(error)


if __name__ == '__main__':
    main()

This is the code I ended up with, but I could not fully test it (I ran into the Cloudflare captcha).
The links to the videos are behind the download icon (the burger menu).
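
One thing that may help when testing the download step: if the request is blocked or the link has expired, the server will typically answer with an HTML page rather than a video, and the code above would still save it as video.mp4. Below is a minimal sketch of a check on the response headers before writing the file. The helper name `looks_like_video` is made up for illustration, and the set of accepted content types is an assumption.

```python
def looks_like_video(headers):
    """Return True if the Content-Type header suggests a media file.

    `headers` is any mapping of header names to values, for example
    `req.headers` from a requests response. Hypothetical helper.
    """
    # Look the header up case-insensitively so a plain dict also works.
    content_type = ''
    for name, value in headers.items():
        if name.lower() == 'content-type':
            content_type = value
            break

    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(';')[0].strip().lower()

    # Assumption: video files arrive as video/* or as a generic binary stream.
    return media_type.startswith('video/') or media_type == 'application/octet-stream'


print(looks_like_video({'Content-Type': 'video/mp4'}))
print(looks_like_video({'Content-Type': 'text/html; charset=UTF-8'}))
```

In get_video this could be called as looks_like_video(req.headers) right after the streaming request, returning an error instead of writing the file when the result is False.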

Kadabrov, 2022-03-13
@Kadabrov

I loaded the page via the driver, saved the full page source, then processed it with bs4:

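from bs4 import BeautifulSoup

# browser is the already running Selenium driver with the page loaded
html = browser.page_source
with open('1.html', 'w', encoding='utf-8') as f:
    f.write(html)
soup = BeautifulSoup(html, 'lxml')
# The player address is stored in data-src
iframe = soup.find('iframe', id="cdn-player").get('data-src')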

//red.uboost.one/start/d3894e13ac7867f0ea0db534fb9762fd/b22f104beb331c4c67ca73d4c4d6bd9a
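
The value in data-src starts with `//`, so it has no scheme and cannot be passed to requests as it is. Here is a small sketch of turning it into a full URL with the standard library, using the page address as the base. The Referer header is an assumption: some players refuse requests that do not come from the embedding page, which might explain the "movie not found" message, but this has not been checked on these sites. The function name `build_iframe_request` is only for illustration.

```python
from urllib.parse import urljoin


def build_iframe_request(page_url, iframe_src):
    """Return (absolute_url, headers) for requesting the iframe content."""
    # urljoin copies the scheme of page_url onto a scheme-relative "//host/path" link.
    absolute_url = urljoin(page_url, iframe_src)
    # Assumption: the player checks that the request comes from the embedding page.
    headers = {'Referer': page_url}
    return absolute_url, headers


page_url = 'https://kinogo.biz/28467-hobbit-nezhdannoe-puteshestvie-2012.html'
iframe_src = '//red.uboost.one/start/d3894e13ac7867f0ea0db534fb9762fd/b22f104beb331c4c67ca73d4c4d6bd9a'

url, headers = build_iframe_request(page_url, iframe_src)
print(url)
# https://red.uboost.one/start/d3894e13ac7867f0ea0db534fb9762fd/b22f104beb331c4c67ca73d4c4d6bd9a
```

The resulting pair could then be passed on as requests.get(url, headers=headers), possibly together with a user-agent header.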