Parsing https+post (iaai.com)?
I am parsing American auto auctions. For some reason they do not offer an API, although it would make economic sense for them, because the resellers who win lots from them need one.
copart.com is already done.
Next up is iaai.com, and here things get more interesting: it uses HTTPS and AJAX (POST method). AJAX is not a problem, because I understand perfectly well what it is and how it works. HTTPS confuses me a little, though, mainly because I can't work out what the browser actually sends to the server.
And so we have:
Page www.iaai.com/Vehicles/Search.aspx?RefinerSetName=V...
The task is to get from this page to the next page ("next>" from the bottom in pagination);
What do I know:
To begin with, I checked whether cookies are needed to go to the next page: I deleted them and tried to open the next page in the browser, and got an error in response. I more or less expected this (I had already stepped on the same rake with copart.com). By removing cookies one at a time, I determined that the only cookie needed for these operations is the ASP.NET session cookie (the second one is not needed).
In general, this is how it looks.
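The one-at-a-time cookie elimination described above can be sketched roughly as follows. This is only an illustration: `find_required_cookies` and `fake_request_works` are hypothetical names, the cookie values are made up, and `ASP.NET_SessionId` is assumed to be the session cookie's name because that is the ASP.NET default. In real use, `request_works` would send the pagination request with the given cookies and check whether the response is an error.

```python
from typing import Callable, Dict, List


def find_required_cookies(
    cookies: Dict[str, str],
    request_works: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Return the names of the cookies the server actually needs.

    Drops one cookie at a time: if the request still succeeds without it,
    the cookie stays dropped, otherwise it is kept.
    """
    kept = dict(cookies)
    for name in list(cookies):
        trial = {k: v for k, v in kept.items() if k != name}
        if request_works(trial):
            kept = trial  # the request survived, so this cookie is not needed
    return sorted(kept)


# Stand-in for a real HTTP call (assumption): pretends the server
# only checks for the presence of the session cookie.
def fake_request_works(cookies: Dict[str, str]) -> bool:
    return "ASP.NET_SessionId" in cookies


browser_cookies = {
    "ASP.NET_SessionId": "abc123",  # hypothetical values
    "tracking": "xyz",
    "prefs": "lang=en",
}
required = find_required_cookies(browser_cookies, fake_request_works)
print(required)
```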
Then I started to analyze POST data.
- The request goes to the same page
- The response sets no cookies that matter for further work
- I found the line responsible for switching to the next page (highlighted)
The value of this line is the parameter passed to the JS function when the link I need is clicked.
Moreover, it is the same on every page, which suggests that the currently selected page is stored in the session.
In general, everything would seem simple if it weren't for HTTPS. I read up on it and gathered that the data is transmitted in encrypted form. But since I can see the __EVENTTARGET value in plain text, I assumed that what I am looking at has not been encrypted yet. Still, as far as I can tell this is the only line that is not encrypted: the rest of the parameters look to me like a jumble of letters and digits that changes with every request.
So the question is: what do I do with the parameters that I cannot analyze and therefore cannot send, or are they not important?
PS: if anyone has experience parsing iaai.com (the topic is quite popular), I will be grateful, and as a thank-you I can offer my copart.com parser.
I assume you are using curl as the HTTPS client. In that case the library performs all the SSL operations for you, provided, of course, that you set the correct options.
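The same holds for other HTTP clients. As a rough sketch in Python (the answer itself talks about curl; `urllib` is swapped in here only for illustration), the request is assembled in plain text and encryption happens inside the library at send time. The URL, field value, and session id below are placeholders, and `build_post` is a hypothetical helper.

```python
import urllib.parse
import urllib.request
from typing import Dict


def build_post(url: str, fields: Dict[str, str], session_id: str) -> urllib.request.Request:
    """Build (but do not send) a form POST that carries the session cookie."""
    # The body is ordinary url-encoded text; nothing is encrypted at this stage.
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            # Assumption: the session cookie uses the ASP.NET default name.
            "Cookie": f"ASP.NET_SessionId={session_id}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


req = build_post(
    "https://example.com/Vehicles/Search.aspx",  # placeholder host
    {"__EVENTTARGET": "next$link"},              # hypothetical value
    "abc123",
)
print(req.get_method(), req.full_url)
print(req.data)
```

Passing `req` to `urllib.request.urlopen` would send it. Because the URL starts with `https://`, the library would negotiate TLS on its own, and the calling code would never handle the encryption directly.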
As for the request: about 25 KB of data is sent in the POST. At first I thought all of it was generated by JS, in which case you would have to either reverse-engineer the algorithm or execute the script.
But as far as I could tell, all the transmitted fields are static: you only need to parse them out of the page and send them back. To find out which fields are meaningful, I recommend the Tamper Data plugin for Firefox, which lets you edit the request, so you can determine by trial and error what has to be sent and what can be ignored.
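A minimal sketch of the "parse the fields and pass them back" approach, assuming the opaque values live in hidden `<input>` elements of the form, as is usual for ASP.NET Web Forms pages (`__VIEWSTATE` is the typical example). The sample HTML, the field values, and the `ctl00$next` target are invented for illustration and are not taken from iaai.com.

```python
from html.parser import HTMLParser
from typing import Dict


class HiddenFieldParser(HTMLParser):
    """Collects the name/value pair of every <input type="hidden"> on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def next_page_payload(html: str, event_target: str) -> Dict[str, str]:
    """Copy the hidden fields unchanged and set only __EVENTTARGET."""
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = event_target
    return payload


# Invented sample page (assumption), standing in for the real search page.
sample_html = """
<form method="post" action="Search.aspx">
  <input type="hidden" name="__EVENTTARGET" value="" />
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ozs+" />
  <input type="text" name="keyword" value="ignored" />
</form>
"""
payload = next_page_payload(sample_html, "ctl00$next")
print(payload)
```

The opaque values are never decoded here. They are read from the last response and echoed back, which is why it does not matter that they change with every request.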
Judging by the font smoothing in your screenshot, you are on a Mac. Nevertheless, I would suggest Fiddler for Windows, which can proxy HTTPS and display all the data sent to the remote server. By the way, you can install it in a virtual machine and simply set it as the proxy for the browser on the Mac.
Someone here recently asked about a parser with a built-in JS engine, which is exactly what you need. Look...
And what is the benefit of all this? Even if you (or rather, your resellers) buy a car in the USA, how do you import it into Russia? It is probably very expensive.
If you give up or don't feel like connecting the auto auctions to your site yourself, we can do it for a couple of thousand: https://web-studio.pro/sozdanie_sajtov/avto-aukciony
I have a Copart and IAAI parser ready.
It has been running stably for a long time and quickly fetches all the data for all cars (about a million current lots).
It returns more data than the auctions display: for example, the seller's reserve and the future auction date, which are hidden on the site.
The parser can also send lots to Telegram.
Still relevant in 2022. The parser continues to work stably even after the auctions recently introduced a new bot-protection system.
Telegram: @JWprogrammer
Email: [email protected]
Write to me on Telegram or by email.