C++ / C#
Floki_SMD, 2020-12-04 12:19:22

Web scraping (parsing) for beginners: what should I learn?

Good afternoon.
I have set out to scrape a bookmaker's website, which is written in Java.
As far as I have figured out so far, I will need Selenium to read the data that appears only after clicking on the match statistics. I could not find that data in the page source, though perhaps I lack the experience and knowledge of how the site is structured.
I started by learning the basics of C#, but after reading forums on this topic I see that people advise writing scrapers in Python or Java instead.
I am ready to learn, but there is not much information on this topic on the Internet.
At the moment I understand what arrays are and how to work with them in theory, but I have little practice.
If there are experts here, please tell me what to read and study. The goal is to write it myself: I am not asking for scraping code, only for recommended reading. After that it is just a matter of technique, and I will learn.
Thanks in advance to everyone who gives at least a minute to my request.


3 answers
Developer, 2020-12-04
@Floki_SMD

It doesn't really matter which language you write in; thinking that it does is the main mistake all beginners make.
What matters is understanding how the HTTP/HTTPS protocol works: what headers and cookies are, how authentication and sessions work, the request methods (GET, POST, PUT, ...), and the response status codes. You should also understand what proxies and VPNs are, what they are for, and how to use them, as well as how a server works and how it can protect itself against scraping.
Learn the OSI model and which layers it consists of.
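
To make the pieces above concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under assumptions, not code for any real site: the local server, the `/login` and `/stats` paths, and the `session=abc123` cookie are all hypothetical. It shows a request being refused with status 403, the server issuing a cookie, and the same request succeeding once the cookie is sent back in a header.

```python
import http.server
import threading
import urllib.error
import urllib.request


class DemoHandler(http.server.BaseHTTPRequestHandler):
    """Tiny local server standing in for a real site (hypothetical)."""

    def do_GET(self):
        if self.path == "/login":
            # The server starts a session by sending a cookie in a header.
            self.send_response(200)
            self.send_header("Set-Cookie", "session=abc123")
            self.end_headers()
            self.wfile.write(b"logged in")
        elif self.headers.get("Cookie") == "session=abc123":
            # The cookie came back, so the protected data is served.
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"match statistics")
        else:
            # No valid session cookie: refuse with 403 Forbidden.
            self.send_response(403)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


# An opener with an empty ProxyHandler ignores any system proxy settings.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url, headers=None):
    """Send a GET request; return (status code, response headers, body text)."""
    request = urllib.request.Request(url, headers=headers or {}, method="GET")
    try:
        with OPENER.open(request) as response:
            return response.status, response.headers, response.read().decode()
    except urllib.error.HTTPError as error:
        # 4xx and 5xx statuses are raised as exceptions by urllib.
        return error.code, error.headers, ""


def run_demo():
    server = http.server.HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        denied_status, _, _ = fetch(base + "/stats")
        _, login_headers, _ = fetch(base + "/login")
        cookie = login_headers.get("Set-Cookie")
        ok_status, _, body = fetch(base + "/stats", {"Cookie": cookie})
        return denied_status, cookie, ok_status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Real sites usually set several cookies with attributes such as `Path` or `HttpOnly`, so in practice a cookie jar (for example `http.cookiejar` in the standard library) is a better fit than copying the header by hand.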

Vladimir Korotenko, 2020-12-04
@firedragon

For C#, there is Html Agility Pack; try it. Also search for "C# scraping".
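
The general idea behind an HTML parsing library is to turn markup into data you can work with. The sketch below shows that idea in Python with the standard-library `html.parser` module; it is not the Html Agility Pack API, and the sample table and the `CellCollector` and `parse_rows` names are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical fragment of a statistics page.
SAMPLE_HTML = (
    "<table>"
    "<tr><td>Shots</td><td>12</td></tr>"
    "<tr><td>Corners</td><td>5</td></tr>"
    "</table>"
)


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:
                # Tolerate a cell that appears outside any row.
                self.rows.append([])
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            # Trim surrounding whitespace once the cell is complete.
            self.rows[-1][-1] = self.rows[-1][-1].strip()
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def parse_rows(html):
    """Return the table cells of an HTML string as a list of rows."""
    collector = CellCollector()
    collector.feed(html)
    collector.close()
    return collector.rows


if __name__ == "__main__":
    print(parse_rows(SAMPLE_HTML))
```

This only works on HTML that is already in the response. Data that appears after a click is typically loaded by a script, which is why a browser automation tool such as Selenium was mentioned in the question.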

Alexander, 2020-12-11
@avorsa

If you choose Python, here is a fishing rod for you: https://www.litres.ru/r-mitchell/skraping-veb-sayt...
