Web crawling: where to start?
I'm interested in this topic, but I can't figure out where to start digging. As I understand it, I should look at the Grab and Scrapy libraries, but there is practically no information in Russian, and what does turn up is thoroughly outdated. There is documentation, of course, but documentation is reference material, and what I'm looking for is a tutorial.
Start with requests to fetch a page's HTML from the site, and regexp to parse it.
Then try BeautifulSoup: you will see the difference and understand the value of a specialized library.
Then try Scrapy, and again draw your own conclusions.
After that, go to a freelance exchange, pick any parsing order, and do it with whichever tool you understand best. It can even be a long-closed order: the point is not to make money but to complete a real task.
After that, you can start offering your services for a small fee on the same freelance exchange.
This is the path of a beginner Jedi. It will be difficult but interesting :)
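To make the first two steps concrete, here is a rough sketch of why a regular expression usually loses to a real HTML parser. It is only an illustration: the sample page is made up, and the standard-library `html.parser` stands in for BeautifulSoup so the snippet runs without installing anything. With BeautifulSoup the parser half would be roughly `BeautifulSoup(html, "html.parser").find_all("a", href=True)`, and the HTML itself would normally come from something like `requests.get(url).text`.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page, invented for this comparison.
SAMPLE_HTML = """
<html><body>
  <a href="/page1">First</a>
  <a class="nav" href='/page2'>Second</a>
  <!-- <a href="/hidden">commented out</a> -->
  <A HREF="/page3">Third</A>
</body></html>
"""


def links_with_regex(html):
    """Step 1: naive extraction with a regular expression."""
    # Only matches the exact shape <a href="...">, so it misses other
    # attribute orders, single quotes and upper case, and it cannot tell
    # that a link sits inside an HTML comment.
    return re.findall(r'<a href="([^"]+)"', html)


class LinkCollector(HTMLParser):
    """Step 2: let a parser deal with the HTML syntax."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_with_parser(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


print(links_with_regex(SAMPLE_HTML))   # ['/page1', '/hidden']
print(links_with_parser(SAMPLE_HTML))  # ['/page1', '/page2', '/page3']
```

The regex picks up a link that is commented out and misses two real ones, while the parser gets all three. That is the difference the answer above is pointing at.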
For material in Russian, search for articles on Habr. There are articles about both Grab and Scrapy. In general, though, you need to know English well enough to read documentation. Without that, it will be very difficult.
There is, by the way, a pretty good book in English. It mostly uses BeautifulSoup and standard Python modules, which I think is better for a beginner. It also covers Scrapy a little.
The best way to learn is to pick a site and parse some data from it. Look up anything that is unclear in the documentation and on Stack Overflow (or, if your English is poor, on Toster and various forums dedicated to Python).
The simplest crawler can easily be put together with Grab. From there, dig deeper as the need arises. By the way, the author of this library is very responsive on forums and elsewhere. He has also written his own articles on Habr (see everything by Habr user lorien).
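For a sense of what "the simplest crawler" boils down to, here is a minimal sketch. Note that this is not Grab's API: it uses only the Python standard library, and the names `crawl`, `fetch`, `extract_links` and the `max_pages` limit are made up for the example. The idea is a queue of URLs to visit, a set of URLs already seen, and a rule to stay on the starting host.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class _LinkCollector(HTMLParser):
    """Collects the href values of all <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs for every link found in the page."""
    collector = _LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, href) for href in collector.links]


def fetch(url):
    """Download a page as text (assumes UTF-8, which is not always true)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=20):
    """Breadth-first crawl limited to the host of start_url."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            # Network and HTTP errors from urlopen: skip the page.
            continue
        visited.append(url)
        for link in extract_links(url, html):
            link = link.split("#", 1)[0]  # ignore #fragment parts
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Something like `crawl("https://example.com/")` would return the list of pages visited. Passing your own `fetch_page` function lets you try the logic on canned HTML before pointing it at a real site, and a library such as Grab or Scrapy takes care of the parts this sketch leaves out (retries, encodings, delays between requests).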