Python
Alexey, 2017-09-04 13:11:59

Is it possible to combine Selenium with pandas or BeautifulSoup for parsing?

Hello. I need to parse a restricted part of a site, namely its admin panel. I log in using Selenium, but I don't know what to do next to extract the data. How do I pass the authorized session on to pandas or another library for parsing (if I'm even phrasing this correctly)? I'm still a beginner, so maybe there are other options, or maybe all of the parsing can be done with Selenium itself?


2 answer(s)
Anton B, 2017-09-07
@nizzit

You can get the page's HTML through the driver's page_source property and then pass it to BeautifulSoup for parsing.
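A minimal sketch of that idea, not a definitive implementation. The helper names (extract_cells, cells_from_driver) and the sample table are made up for illustration. The sample string stands in for driver.page_source so the snippet runs without a browser.

```python
from bs4 import BeautifulSoup


def extract_cells(html):
    """Return the text of every table cell (<td>) found in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True) for td in soup.find_all("td")]


def cells_from_driver(driver):
    # `driver` is assumed to be a Selenium WebDriver that has already logged in;
    # page_source holds the HTML of the page currently loaded in the browser.
    return extract_cells(driver.page_source)


# Hypothetical stand-in for driver.page_source.
sample_html = """
<table>
  <tr><th>User</th><th>Orders</th></tr>
  <tr><td>alice</td><td>3</td></tr>
  <tr><td>bob</td><td>5</td></tr>
</table>
"""

cells = extract_cells(sample_html)
print(cells)
```

With a real session you would call cells_from_driver(driver) after the login steps, once the page you need has finished loading.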
Or pass the HTML directly to pandas using the read_html function. This only works if the page contains tabular data, that is, table elements.
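A rough sketch of the pandas route, under a few assumptions: the function name tables_from_html and the sample table are made up, and read_html needs a parser backend installed (lxml, or bs4 together with html5lib). The sample string again stands in for driver.page_source.

```python
from io import StringIO

import pandas as pd


def tables_from_html(html):
    """Parse every <table> in an HTML string into a list of DataFrames."""
    # Wrapping the string in StringIO avoids the deprecation warning that
    # recent pandas versions emit for literal HTML strings.
    return pd.read_html(StringIO(html))


# Hypothetical stand-in for driver.page_source.
sample_html = """
<table>
  <tr><th>User</th><th>Orders</th></tr>
  <tr><td>alice</td><td>3</td></tr>
  <tr><td>bob</td><td>5</td></tr>
</table>
"""

tables = tables_from_html(sample_html)
users = tables[0]  # one DataFrame per <table> on the page
print(users)
```

Note that read_html raises a ValueError when the page has no tables at all, so it is worth wrapping the call in a try/except if that can happen.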
In general, as the other answer notes, Selenium is not well suited for these purposes. If you plan to keep scraping sites, I recommend learning Scrapy, or the combination of requests and BeautifulSoup.
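If you do switch to requests but still log in through Selenium, one common approach is to copy the browser's cookies into a requests.Session. The sketch below assumes the site tracks the login purely through cookies, which is not always true (some sites also check headers or tokens). The function name session_from_cookies and the cookie values are made up, and the sample list stands in for what driver.get_cookies() returns.

```python
import requests


def session_from_cookies(cookies):
    """Build a requests.Session carrying cookies exported from Selenium.

    `cookies` is a list of dicts in the shape returned by
    driver.get_cookies(): each has at least "name" and "value".
    """
    session = requests.Session()
    for cookie in cookies:
        session.cookies.set(
            cookie["name"],
            cookie["value"],
            domain=cookie.get("domain", ""),
            path=cookie.get("path", "/"),
        )
    return session


# Hypothetical stand-in for driver.get_cookies() after logging in.
sample_cookies = [
    {"name": "sessionid", "value": "abc123", "domain": "example.com", "path": "/"},
]

session = session_from_cookies(sample_cookies)
print(session.cookies.get("sessionid"))
```

After that, session.get(...) on the admin pages should send the same cookies the browser had, and the response text can go to BeautifulSoup or pandas as above.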

cgxcwojf, 2017-09-04
@cgxcwojf

Selenium is meant for other purposes.
Use PhantomJS for parsing (it is headless),
or SlimerJS if you need to visually watch the process.
