How to parse site pages?
I need to parse values from a large number of site pages and write them to MySQL. The divs that hold the values are known in advance, and their id / class does not change (the pages are static; only the information differs from page to page). Please tell me the easiest way to do this today (maybe there are tools that simplify it).
I am only superficially familiar with PHP, so links to materials on parsing requests / responses and the like are very welcome.
Thank you.
You are the 4th one this week.
But since you don't know how to use search, then...
Scrapy, for example, is designed for exactly this: extracting information from sites. Writing to MySQL is a separate task that Scrapy does not solve for you.
https://scrapy.org/
But that is for Python.
For Go there are:
https://github.com/PuerkitoBio/gocrawl
https://github.com/PuerkitoBio/goquery
Surely there is something similar for PHP.
You can also use ready-made services, such as 80legs or Mozenda.
They will scrape everything to your specification and hand it over in a convenient format; you then write the data from that format to wherever you need it.
They have free trial plans.
I would say that PHP is not the best solution for the task.
First, check whether the resource has a usable AJAX interface; you can see this in the browser console.
If it does not and you really have to parse HTML, then probably the right approach today is Python + requests + BeautifulSoup (there are alternatives, but this combination definitely works and works well):
Install Python (I prefer 2.7, but the version is not important)
Install requests and BeautifulSoup
Install lxml
Next, write something like this
import csv
import requests
from bs4 import BeautifulSoup

page = requests.get('http://www.mysite.com/1').content  # Fetch the page
page = BeautifulSoup(page, 'lxml')  # Parse the HTML into a navigable tree
parsedData = page.find_all('div', {'class': 'my-data-class'})  # Select tags by attribute (class, as an example)

# Python 3: open in text mode with newline=''; on Python 2 use open('myfile.csv', 'wb') instead
with open('myfile.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=';', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for row in parsedData:
        writer.writerow([row.get_text(strip=True)])  # One CSV row per matched div, text only
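The example above stops at a CSV file, while the question asks for MySQL. Below is a rough sketch of the database step, not a definitive implementation. It uses sqlite3 from the standard library as a stand-in so that it runs without a server. The table name scraped_values, its columns, and the save_values helper are all made up for illustration. With a MySQL driver such as PyMySQL or mysql-connector-python the pattern is the same, but the placeholder is %s instead of ? and the table definition would have to be adapted to MySQL syntax.

```python
import sqlite3


def save_values(connection, page_url, values):
    """Insert one row per scraped value using a parameterized query.

    Hypothetical helper: table and column names are assumptions.
    """
    connection.execute(
        "CREATE TABLE IF NOT EXISTS scraped_values ("
        "id INTEGER PRIMARY KEY, page_url TEXT NOT NULL, value TEXT NOT NULL)"
    )
    # Placeholders keep scraped text from being interpreted as SQL.
    connection.executemany(
        "INSERT INTO scraped_values (page_url, value) VALUES (?, ?)",
        [(page_url, value) for value in values],
    )
    connection.commit()


# In-memory database, so the sketch runs without any server (stand-in for MySQL).
connection = sqlite3.connect(":memory:")
save_values(connection, "http://www.mysite.com/1", ["first value", "second value"])

rows = connection.execute(
    "SELECT page_url, value FROM scraped_values ORDER BY id"
).fetchall()
print(rows)
```

In the real script the list of values would come from the text of each matched div, with save_values called once per page.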