XPath
xenofeel, 2015-01-24 18:54:35

How to compose an XPath?

I need to extract the price from pages like this one into a Google spreadsheet. I put together this XPath formula:

=IMPORTXML("http://electrozon.ru/catalog/videokarty-i-3d-ochki/90064-Videokarta-PCI-E-ASUS-nVidia-GeForce-GT-210-1024Mb-DDR3-210-SL-TC1GD3-L-Retail.html"; "//strong@class = 'price']")
This doesn't work: it returns an XML processing error. Where should I start looking?


2 answer(s)
Ilya, 2015-01-28
@xenofeel

In this particular case, the problem is neither Google Sheets nor the query (apart from the forgotten opening square bracket; it should be //strong[@class = 'price']), but the fact that the page's HTML cannot be parsed as well-formed XML. One option is to run the pages through tidy on your own server: the output is valid HTML, and XPath queries can then be applied to it easily.
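
To illustrate the clean-then-query idea, here is a minimal sketch in Python. It is not tidy itself: it swaps in the standard library's forgiving `html.parser` to rebuild sloppy markup as a well-formed tree, then queries it with the limited XPath subset that `xml.etree.ElementTree` supports. The class name `LenientTreeParser`, the helper `extract_text`, and the sample markup are all made up for the example; real pages may need a proper tool such as tidy or a full HTML5 parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LenientTreeParser(HTMLParser):
    """Hypothetical helper: builds a well-formed tree from sloppy HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("document")   # synthetic wrapper element
        self.stack = [self.root]              # currently open elements

    def handle_starttag(self, tag, attrs):
        attrib = {name: value or "" for name, value in attrs}
        element = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID_TAGS:
            self.stack.append(element)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: add the element, never open it.
        attrib = {name: value or "" for name, value in attrs}
        ET.SubElement(self.stack[-1], tag, attrib)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open tag; ignore stray closers.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index].tag == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):                       # text following a child element
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def extract_text(html, path):
    """Return the text of the first element matching `path`, or None."""
    parser = LenientTreeParser()
    parser.feed(html)
    parser.close()
    node = parser.root.find(path)
    return None if node is None else "".join(node.itertext()).strip()


# Deliberately broken sample: unclosed <p>, unquoted attribute, no </html>.
sample = """<html><body><p>Unclosed paragraph<br>
<div class=box><strong class="price">1 990 rub.</strong></div>
</body>"""

print(extract_text(sample, ".//strong[@class='price']"))  # prints: 1 990 rub.
```

Note that ElementTree paths are relative, hence `.//strong[...]` rather than `//strong[...]`.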

Roman Lipatov, 2018-12-21
@lipatovroman

What if the server is not yours?
I need to track new versions of programs. I have a list of about 2,500 addresses.
I started in Google Sheets, using IMPORTXML in a separate column, but I get many errors.
For example, I take data from the page adionsoft.net/fastimageresize
Here is a screenshot
In the sheet I get #N/A
Second example: the page https://biblsoft.ru/windows/system/file-managers/7...
Here I also copy the XPath, but first I check the page for validity with the service https://validator.w3.org
There are no critical errors, yet in Google Sheets I still get #N/A.
And this is only a small share of the sites from which the data cannot be extracted.
You and many others recommend Tidy.
But how can Tidy be applied directly in Google Sheets? Is that even possible?
We don't have our own server, or even hosting. We just want to fetch and monitor a single element on a variety of sites. There may be many addresses.
For example, there is a desktop program for Mac called Dejal Simon.
It works on this principle: you specify a unique piece of code that comes before the element being checked and another that comes after it.
It looks like this:
The check runs once a day.
If whatever sits between those two pieces of code has changed, a notification is sent by email.
Can this be implemented in Google Sheets? Or can the invalid-HTML problem be solved right in Google Sheets after all?
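
For what it's worth, the before/after-marker principle can be sketched outside of Google Sheets in a few lines of Python. This is only an illustration of the idea, not how Dejal Simon actually works: the function names `extract_between` and `check_for_change`, the markers, and the sample page are invented, and fetching the page, scheduling the daily run, and sending the email are left out.

```python
import hashlib


def extract_between(page, before, after):
    """Return the text found between two unique markers, or None."""
    start = page.find(before)
    if start == -1:
        return None
    start += len(before)
    end = page.find(after, start)
    if end == -1:
        return None
    return page[start:end].strip()


def check_for_change(previous_digest, fragment):
    """Compare the fragment's hash with the one stored from the last run.

    Returns (changed, new_digest); store new_digest for the next check.
    """
    digest = hashlib.sha256(fragment.encode("utf-8")).hexdigest()
    return digest != previous_digest, digest


# Hypothetical page source and markers, for illustration only.
page = "<tr><td>Version:</td><td> 4.2.1 </td></tr>"
version = extract_between(page, "<td>Version:</td><td>", "</td>")

changed, digest = check_for_change(None, version)   # first run: no stored hash
print(version, changed)
```

Because this works on the raw page text, it does not care whether the HTML is valid, which is the main appeal of the marker approach.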
Sorry for so much text, but I tried to explain the issue as fully as possible. Thank you.
PS By the way, I checked an app page on Google Play - https://play.google.com/store/apps/details?id=com....
I copied this XPath: //*[@id="fcxH9b"]/div[4]/c-wiz/div/div[2]/div/div[1]/div/c-wiz[3]/div[1]/div[2]/div/div[4]/span/div/span
This is the path to the program version, exactly as copied. And nothing... the cell is empty, even though the same formula pulls data from other sites without problems.
Basically, I need help with this issue.
Thank you.
