How to parse all YouTube search result pages?
I need help finishing this script. I want to collect all video links for a keyword search, but I don't understand how to paginate through YouTube's search results. I only get about 20 results from the first page, and I need all of them. Is this possible without the API?
Variant with urllib:
import ssl
import certifi
import urllib.parse
import urllib.request
from bs4 import BeautifulSoup

textToSearch = 'test'
query = urllib.parse.quote(textToSearch)
url = "https://www.youtube.com/results?search_query=" + query

# Fetch the search results page using a certifi-backed SSL context
response = urllib.request.urlopen(url, context=ssl.create_default_context(cafile=certifi.where()))
html = response.read()

soup = BeautifulSoup(html, 'html.parser')
for vid in soup.findAll(attrs={'class': 'yt-uix-tile-link'}):
    # Skip sponsored results that point to the ad network
    if not vid['href'].startswith("https://googleads.g.doubleclick.net/"):
        print('https://www.youtube.com' + vid['href'])
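Note that current YouTube search pages are rendered with JavaScript, so the old yt-uix-tile-link class is usually absent from the raw HTML and the loop above may print nothing. A minimal sketch of a workaround, assuming the page source still embeds its data as JSON containing "videoId":"..." pairs (an assumption about YouTube's current markup, not a guaranteed interface):

import re
import ssl
import certifi
import urllib.parse
import urllib.request

textToSearch = 'test'
query = urllib.parse.quote(textToSearch)
url = "https://www.youtube.com/results?search_query=" + query

ctx = ssl.create_default_context(cafile=certifi.where())
html = urllib.request.urlopen(url, context=ctx).read().decode('utf-8', errors='ignore')

# Assumption: the page embeds JSON with "videoId":"<11 characters>" entries.
# Pull them out with a regex and de-duplicate while preserving order.
video_ids = list(dict.fromkeys(re.findall(r'"videoId":"([\w-]{11})"', html)))
for vid in video_ids:
    print('https://www.youtube.com/watch?v=' + vid)

This still only returns the first batch of results; getting every page without a real browser would mean replaying YouTube's internal continuation requests, which is fragile, so the Selenium route below is usually simpler.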
Variant with Selenium:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

driver = webdriver.Chrome(service=Service('/Users/mkuzminetc/Documents/chromedriver'))
driver.get("https://www.youtube.com/results?search_query=upwork+approve&sp=EgIQAQ%253D%253D")

# Collect the href of every video title currently loaded on the page
user_data = driver.find_elements(By.XPATH, '//*[@id="video-title"]')
links = []
for i in user_data:
    links.append(i.get_attribute('href'))
print(links)
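To get past the first ~20 results without the API, the usual trick is to let Selenium scroll the results page: YouTube loads more videos as you scroll, so you can keep scrolling until the page height stops growing or you have enough links. A rough sketch along those lines, reusing the same chromedriver path and #video-title locator as above (the scroll cap and pauses are arbitrary values, and the scrolling target may need adjusting if YouTube changes its layout):

import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

driver = webdriver.Chrome(service=Service('/Users/mkuzminetc/Documents/chromedriver'))
driver.get("https://www.youtube.com/results?search_query=upwork+approve&sp=EgIQAQ%253D%253D")

last_height = 0
for _ in range(100):  # safety cap on the number of scrolls
    # Scroll to the bottom so YouTube fetches the next batch of results
    driver.execute_script("window.scrollTo(0, document.documentElement.scrollHeight);")
    time.sleep(2)  # give the new results time to load (arbitrary pause)

    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    if new_height == last_height:
        break  # page stopped growing, no more results were loaded
    last_height = new_height

# Gather every video link that has been loaded so far
links = [el.get_attribute('href')
         for el in driver.find_elements(By.XPATH, '//*[@id="video-title"]')]
print(len(links), 'links found')
print(links)
driver.quit()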