Parsing
Maxim Ermak, 2018-11-20 18:28:15

How do I write a bot for parsing websites?

I want to parse websites professionally in order to compile analytical reports for businesses. As a beginner, I am asking for a roadmap: where should I start?
1. What language are parsing bots written in?
2. Where to learn the language?
3. What libraries (solutions) have already been developed for these tasks (parsing)?
4. What are the risks associated with web scraping?
5. Which databases should be used for data storage?
6. Is there a guide, courses, or are there mentors who can teach this craft?
I would be glad if you shared any additional information beyond the questions above!


1 answer(s)
Alexander Shilov, 2018-11-22
@tabbols95

1. It depends on what functionality you have in mind. Python is quite good for extracting the information you need from sites (I use it myself). I use plain Python with the BeautifulSoup, requests, selenium, and pyautogui libraries. It is also worth getting familiar with a version control system, pip, etc.
2. The Internet will help: articles on Habr, videos on YouTube. The best way to learn is through practice, working on specific tasks.
3. I listed them in the first point; in addition, you will need a library for writing .csv files. That is enough to start with, and from there you can develop in whatever direction you like.
4. With each site, of course, you need to agree on automated data collection; otherwise you may be sued.
Learn, learn and learn again.
You can also look at code on GitHub; many people post theirs there.
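
To make points 1 and 3 a bit more concrete, here is a minimal sketch of the "extract data from HTML, then save it as .csv" step. It is only an illustration under a few assumptions: the sample HTML, the `ProductTableParser` class, and the `html_table_to_csv` function are made up for this example, and the parsing is done with the standard library's `html.parser` instead of BeautifulSoup so that it runs without installing anything. With BeautifulSoup the parsing part would likely be shorter.

```python
import csv
import io
from html.parser import HTMLParser


class ProductTableParser(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td> cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without <td> cells, e.g. header rows
                self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html, header):
    """Parse a simple HTML table and return its rows as CSV text."""
    parser = ProductTableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample page; a real bot would download the HTML first.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget, large</td><td>24.50</td></tr>
</table>
"""

csv_text = html_table_to_csv(SAMPLE_HTML, ["product", "price"])
print(csv_text)
```

In a real bot the HTML string would typically come from something like `requests.get(url).text`, and the CSV text would be written to a file rather than printed. The `csv` module takes care of quoting values that contain commas, which is why it is preferable to joining strings by hand.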
