Python
iiiideb, 2019-02-27 16:29:35

What knowledge do you need to have to write a parser in Python?

I have almost finished learning the basics of Python. What else do I need to learn to comfortably write a website parser in Python? And which resources are best suited for this?

4 answer(s)
Vladimir Proskurin, 2019-02-27
@Vlad_IT

Your question is like asking "What knowledge do you need in order to build vehicles?". Parsers differ: in some cases you just collect a piece of text from one block (plain requests is enough); in others you copy material from many pages (Scrapy or lxml is more convenient); in others authorization gets in the way, or a captcha does (you will need to write captcha recognition or use a paid captcha-recognition API); and some sites already have serious protection against parsing.
Write simple parsers first, then gradually move on to more complex ones; along the way you will see what else you need.

Denis Melnikov, 2019-02-27
@Mi11er

I started with Requests + bs4.
Beyond that, you should know:
HTML DOM
CSS
SQL (the data has to go somewhere)
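
To make the Requests + bs4 + SQL combination a bit more concrete, here is a minimal sketch. It assumes bs4 (the beautifulsoup4 package) is installed. The HTML snippet, the class names, the `books` table, and the helpers `parse_books` and `save_books` are all made up for illustration; the inline string stands in for a page you would normally download with Requests.

```python
import sqlite3

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page. With Requests it
# would come from something like requests.get(url, timeout=10).text
HTML = """
<ul id="books">
  <li class="book"><span class="title">Dune</span><span class="price">9.50</span></li>
  <li class="book"><span class="title">Solaris</span><span class="price">7.25</span></li>
</ul>
"""


def parse_books(html):
    """Return a list of (title, price) tuples found in the markup."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # CSS selector: every <li> element that has the class "book"
    for item in soup.select("li.book"):
        title = item.select_one(".title").get_text(strip=True)
        price = float(item.select_one(".price").get_text(strip=True))
        rows.append((title, price))
    return rows


def save_books(conn, rows):
    """Store the parsed rows in a table (the table name is an assumption)."""
    conn.execute("CREATE TABLE IF NOT EXISTS books (title TEXT, price REAL)")
    conn.executemany("INSERT INTO books (title, price) VALUES (?, ?)", rows)
    conn.commit()


# An in-memory database keeps the sketch self-contained
conn = sqlite3.connect(":memory:")
save_books(conn, parse_books(HTML))
stored = conn.execute("SELECT title, price FROM books ORDER BY price").fetchall()
print(stored)
```

For a real site the selectors have to be adapted to the page's actual markup, and the database would normally be a file rather than `:memory:`.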

Alexey, 2019-02-27
@APodgorny

The easiest way is with bs4. Frankly, parsers built this way will not deliver industrial-grade performance, but for learning and one-off tasks it will do.
Here is what it looks like: https://www.youtube.com/watch?v=KPXPr-KS-qk
It is also worth learning the lxml library:
https://lxml.de/index.html
There is also the Scrapy framework.
There are even books on the subject:
mirknig.su/knigi/programming/114900-skraping-web-s...
And of course you need to know the basics of markup and understand what XPath and CSS selectors are.
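
For a rough idea of what an XPath query looks like next to a CSS selector, here is a small sketch. To stay dependency-free it uses the standard library's `xml.etree.ElementTree`, which understands only a limited XPath subset and needs well-formed markup; it is a stand-in here, not lxml itself. The page fragment and its class names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment (real-world HTML is often not
# well-formed, which is one reason dedicated HTML parsers exist)
PAGE = """
<html>
  <body>
    <div class="post">
      <h2>First post</h2>
      <a href="/posts/1">Read</a>
    </div>
    <div class="ad">
      <a href="/promo">Buy</a>
    </div>
    <div class="post">
      <h2>Second post</h2>
      <a href="/posts/2">Read</a>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# XPath: <a> elements that are direct children of a <div> whose class
# attribute is exactly "post". A roughly equivalent CSS selector would be
#   div.post > a
# (CSS matches class tokens, while this XPath compares the whole attribute.)
links = [a.get("href") for a in root.findall(".//div[@class='post']/a")]

# XPath: the <h2> headings of the same blocks (CSS: div.post > h2)
titles = [h2.text for h2 in root.findall(".//div[@class='post']/h2")]

print(links)
print(titles)
```

The `div class="ad"` block is skipped by both queries, which is the whole point of selectors: you describe where the data sits in the markup instead of cutting strings by hand.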

Aleksandr, 2019-02-27
@flyingpandasdiyingslow

If you are looking for examples of web page parsers with analysis and explanations, here is a link.
Or here is a long, detailed article on Habr by the same authors.
