How to add a delay for a site parser?
Good afternoon. I have a problem: the site https://www.coinglass.com/pro/cme/cftc used to show its values immediately, but recently, if you refresh the page, you can see that for the first second all the values on the page are 0, and the BS4 library parses exactly those zeros. Is there any way to add a delay, so that the page loads first and only then gets parsed?
I'm new to programming; I tried adding a timer before requests.get, but it didn't help.
import time
import requests
from bs4 import BeautifulSoup

html = requests.get(URL, headers=HEADERS)  # URL and HEADERS are defined above
time.sleep(3)  # sleeping here doesn't help: the HTML has already been downloaded
soup = BeautifulSoup(html.text, 'lxml')

# all values sit in the second table with class 'code133741'
cells = soup.find_all('table', class_='code133741')[1].find_all('td')
long_inst = cells[25].text
long_inst_changes = cells[42].text
short_inst = cells[26].text
short_inst_changes = cells[43].text
long_funds = cells[28].text
long_funds_changes = cells[45].text
short_funds = cells[29].text
short_funds_changes = cells[46].text
date = soup.find('div', class_='bybt-box').find('div').text[6:]
It's not about a delay: the data is loaded dynamically with JavaScript. On parsing dynamic sites, see:
https://qna.habr.com/q/1038438#answer_2008702
I don't know exactly what you are collecting, but most likely all the data you need is available here:
https://fapi.coinglass.com/api/cme/cot/report
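A minimal sketch of pulling that endpoint directly with requests, assuming it returns JSON and needs no special headers beyond a user agent; the response structure is not shown in this thread, so you have to inspect the payload yourself:

import requests

API_URL = "https://fapi.coinglass.com/api/cme/cot/report"  # endpoint suggested above

resp = requests.get(API_URL, headers={"User-Agent": "Mozilla/5.0"})
resp.raise_for_status()
data = resp.json()  # assuming the endpoint returns JSON

# Print the payload and find the fields that match the table values on the page.
print(data)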
There is a chance that it's the JS that introduces the delay; here is an approach that renders the page in a headless browser first:
from bs4 import BeautifulSoup
from selenium import webdriver

url = "http://legendas.tv/busca/walking%20dead%20s03e02"

# PhantomJS is deprecated and was removed in Selenium 4;
# this works with older Selenium versions.
browser = webdriver.PhantomJS()
browser.get(url)
html = browser.page_source  # HTML after the JavaScript has run
browser.quit()

soup = BeautifulSoup(html, 'lxml')
a = soup.find('section', 'wrapper')
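Since PhantomJS is deprecated, here is a sketch of the same idea with headless Chrome and an explicit wait, adapted to the page from the question. Assumptions: chromedriver is installed, the table class 'code133741' is the one from the asker's code, and waiting for the table to be present is treated as "the data has rendered", which may still need tuning:

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

URL = "https://www.coinglass.com/pro/cme/cftc"

options = Options()
options.add_argument("--headless")
browser = webdriver.Chrome(options=options)
try:
    browser.get(URL)
    # Wait up to 15 s for the table from the question to appear.
    # Presence alone does not guarantee the zeros have been replaced,
    # so a short extra sleep or a stricter condition may still be needed.
    WebDriverWait(browser, 15).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "table.code133741"))
    )
    html = browser.page_source
finally:
    browser.quit()

soup = BeautifulSoup(html, 'lxml')
cells = soup.find_all('table', class_='code133741')[1].find_all('td')
print(cells[25].text)  # e.g. long_inst from the question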