How to parse a dynamic site?
https://forum.malinovka.org/topic/13323-list-action...
I need to parse the list of leaders and the information about them from this site.
With a plain requests call I get "Please turn JavaScript on and reload the page." instead of the page content, so I can't get the information I need.
The code I am using:
import requests
from bs4 import BeautifulSoup

# Browser-like User-Agent; the site still answers with the JavaScript notice.
headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"}
res = requests.get("https://forum.malinovka.org/topic/13323-список-действующих-лидеров/", headers=headers)
soup = BeautifulSoup(res.content, "html.parser")
# The BeautifulSoup method is find_all, not findall.
all_liders = soup.find_all("div", class_="ipsType_normal ipsType_richText ipsContained")
Answer:
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time

URL = 'https://forum.malinovka.org/topic/13323-список-действующих-лидеров/'

# Selenium 4 takes the driver path through a Service object instead of executable_path.
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(service=Service("chromedriver.exe"), options=options)
driver.get(url=URL)
time.sleep(2)  # crude fixed wait for the JavaScript check to pass and the page to render
needed_html_code = driver.page_source
driver.quit()  # quit() closes every window and ends the driver session

soup = BeautifulSoup(needed_html_code, "html.parser")
content_div = soup.find('div', class_='cPost_contentWrap ipsPad')
# Skip the first paragraph, then print the text of every node in the remaining ones.
for p in content_div.find_all('p')[1:]:
    for item in p.contents:
        # .string is None for tags without a single text child (e.g. <br>); print an empty line then.
        print(item.string or '')
    print("-" * 15)
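The fixed two-second sleep above may be too short on a slow connection and wasteful on a fast one. As a variation, here is a minimal sketch that waits explicitly until the post container is present and keeps the parsing in a separate function so it can be checked on a saved HTML string. The helper names `fetch_rendered_html` and `extract_paragraph_lines`, the 15-second timeout, and the `div.cPost_contentWrap` selector are assumptions for illustration. It also assumes Selenium 4.6 or newer, where `webdriver.Chrome()` can locate the driver by itself. Note that `stripped_strings` collects nested text as well, so the output can differ slightly from the loop over `p.contents`.

```python
from bs4 import BeautifulSoup


def extract_paragraph_lines(html: str) -> list[list[str]]:
    """Return the text fragments of every <p> in the post body except the first one."""
    soup = BeautifulSoup(html, "html.parser")
    # Match on a single class so extra classes on the div do not break the lookup.
    content_div = soup.select_one("div.cPost_contentWrap")
    if content_div is None:
        # For example, the "Please turn JavaScript on" page has no post container.
        return []
    return [list(p.stripped_strings) for p in content_div.find_all("p")[1:]]


def fetch_rendered_html(url: str, timeout: float = 15.0) -> str:
    """Open the page in Chrome and return the HTML once the post container exists."""
    # Imported here so the parsing helper can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Poll until the element appears; raises TimeoutException after `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div.cPost_contentWrap"))
        )
        return driver.page_source
    finally:
        # Always end the browser session, even if the wait times out.
        driver.quit()
```

Usage would look like `for lines in extract_paragraph_lines(fetch_rendered_html(URL)): print(*lines, sep="\n")`, with `URL` being the topic address from the answer above.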