How do I get past a site's protection when parsing it?
I want to build my own library of games I've completed, and I'm thinking of parsing the World Art site (www.world-art.ru). After digging through Google, I found that the site has a protection system. I tried phpQuery, simple_html_dom, as well as XPath and DOM: zero results, an empty page every time. If I wrap the page request in a cURL call, I get redirected to "www.world-art.ru/not_connect.html".
Are there ways to solve this problem?
To parse, you have to become like a browser, think like a browser, send the same requests as a browser.
Open the developer tools, look at the headers the browser sends when it accesses the site, and send the same ones yourself, whether through curl, raw sockets, or file_get_contents.
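As a rough illustration of the advice above, here is a minimal sketch of a curl request that carries browser-like headers. The function names `browser_headers` and `fetch_page` are hypothetical, and the header values are placeholders: copy the real ones from your own browser's devtools. Whether these particular headers are enough for any given site is an assumption you have to verify yourself.

```php
<?php
// Hypothetical helpers for illustration only.

/**
 * Headers resembling what a desktop browser sends.
 * The values are placeholders; replace them with the ones devtools shows.
 */
function browser_headers(): array
{
    return [
        'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: ru-RU,ru;q=0.9,en;q=0.8',
    ];
}

/**
 * Fetch a page with the given headers. Returns the body, or null on failure.
 */
function fetch_page(string $url, array $headers): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects like a browser does
        CURLOPT_HTTPHEADER     => $headers, // the browser-like headers
        CURLOPT_ENCODING       => '',       // accept and decode any supported compression
        CURLOPT_COOKIEFILE     => '',       // keep cookies in memory for this handle
        CURLOPT_TIMEOUT        => 15,
    ]);
    $html = curl_exec($ch);
    return $html === false ? null : $html;
}
```

A call would look like `$html = fetch_page('http://www.world-art.ru/', browser_headers());`. If the result is still empty or redirected, compare your request with the browser's again: cookies, `Referer`, or other headers may also matter.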
As for what you listed (phpQuery, simple_html_dom, XPath / DOM), none of it has anything to do with fetching the page. Once you have the HTML, parse it however you like.
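To make that separation concrete, here is a small sketch in which the parser only ever sees a string of HTML and does not care how it was obtained. The sample markup and the function name `extract_links` are made up for illustration and do not reflect the structure of any real site.

```php
<?php
// Stand-in for HTML already fetched by any means (curl, sockets, file_get_contents).
$html = '<html><head><title>Example page</title></head>'
      . '<body><a href="/games/1">First</a><a href="/games/2">Second</a></body></html>';

/**
 * Return an array mapping each link's href to its text.
 */
function extract_links(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real pages often contain imperfect markup.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[@href]') as $a) {
        $links[$a->getAttribute('href')] = trim($a->textContent);
    }
    return $links;
}

print_r(extract_links($html));
```

If the parser returns nothing, check the fetched string first: an empty page or a redirect stub means the problem is in the request, not in the parsing library.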
Anton gave the most useful answer. To avoid "reading tea leaves", you need to use debugging tools. In this case that means a sniffer such as Fiddler, Wireshark, or Charles, or at least the web inspector (devtools) in the browser. A sniffer is to writing bots what an X-ray machine is to medicine.