Why is there a 403 error when downloading a video?

Python

DED23, 2019-10-31 19:21:56

Hello. I know this question has been asked many times, but I couldn't find an answer that fits my case. I'm not a pro, so the cause may be a mistake in my code. I've tried several approaches, all without success. Please tell me how to get around the error.
Thanks in advance!
Please don't judge too harshly; here is the code:

from bs4 import BeautifulSoup
import requests
import wget

url = 'https://jut.su/jojo-bizarre-adventure/season-1/episode-16.html'
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 YaBrowser/19.9.3.314 Yowser/2.5 Safari/537.36'}


def dow(url):
    ses = requests.Session()
    r = ses.get(url, headers=header)
    soup = BeautifulSoup(r.content, 'html.parser')
    for i in soup.find_all('source', src=True):
        video = i['src']
        print(video)
        wget.download(video, "JOJO.mp4")
        break


dow(url)

1 answer(s)

Byxo Cyze, 2019-10-31
@DED23

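wget.download doesn't send your browser User-Agent (the header dict is only passed to the requests call that fetches the page), so the site rejects the request for the video file with a 403. Download the file with requests as well and pass the same headers: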
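To see what a client sends when no User-Agent is set, here is a small sketch using only the standard library. It assumes the wget package downloads through urllib (I believe it calls urlretrieve, but check your installed version); the two helper names are made up for this illustration.

```python
import urllib.request


# Hypothetical helper: the User-Agent a plain urllib opener advertises
# when the caller does not set one.
def default_urllib_user_agent():
    opener = urllib.request.build_opener()
    return dict(opener.addheaders)["User-agent"]


# Hypothetical helper: an opener that sends a browser-like User-Agent.
def opener_with_user_agent(user_agent):
    opener = urllib.request.build_opener()
    opener.addheaders = [("User-agent", user_agent)]
    return opener


# Prints something like "Python-urllib/3.x", which sites can easily block.
print(default_urllib_user_agent())
```

If you want to keep using urllib-based tools, passing such an opener to urllib.request.install_opener() should make later urllib downloads use that header, though I have not tested this against jut.su.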

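from bs4 import BeautifulSoup
import requests

url = 'https://jut.su/jojo-bizarre-adventure/season-1/episode-16.html'

# The same headers are sent for the page and for the video file
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0'
}

# Fetch the episode page and collect <source> tags that have a src attribute
page = requests.get(url, headers=headers)
page.raise_for_status()
soup = BeautifulSoup(page.content, 'html.parser')
sources = soup.find_all('source', src=True)
if not sources:
    raise SystemExit('No <source> tag with a src attribute was found')

# Stream the first source to disk in chunks instead of loading it into memory
with requests.get(sources[0]['src'], headers=headers, stream=True) as video:
    video.raise_for_status()
    with open("JOJO.mp4", 'wb') as f:
        for chunk in video.iter_content(chunk_size=8192):
            if chunk:  # skip empty chunks
                f.write(chunk)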
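One optional refinement of the chunked write above: if the connection drops halfway, you are left with a truncated JOJO.mp4. A possible sketch (the save_chunks name and the ".part" suffix are my own choices, not part of requests) writes to a temporary name and renames the file only after every chunk has arrived.

```python
import os


def save_chunks(chunks, dest):
    """Write an iterable of byte chunks to dest and return the bytes written."""
    part = dest + ".part"  # temporary name, an assumption of this sketch
    written = 0
    with open(part, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    # Rename only once the whole stream was written without an exception
    os.replace(part, dest)
    return written
```

With the code above it would be called as save_chunks(video.iter_content(chunk_size=8192), "JOJO.mp4"), where video is the streamed response.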