Is it possible to parse products from a store preserved in the Web Archive and export the data as SQL, so that it can be loaded into another store running WooCommerce?
Hi all. The essence of the question is in the title. I need to transfer a lot of products from the Web Archive to a new site, too many to do by hand, so I decided to make my life easier. I'm going to use PHP libraries for parsing. What are the chances that this will work? Which library would you recommend? I would also be grateful for advice on how to do it all quickly and correctly. I have a week, but I would like to finish sooner.
I have done a similar job for several stores that lost all their data.
1) Download the data as HTML pages from different sources. I paid for a Web Archive download here: www.webarchivedownloader.com. There are alternatives on torrents, and there are online solutions.
2) Download the Google and Yandex caches (useful if the site disappeared recently and the search engines have not yet dropped it from their index).
3) Deploy the site locally from the downloaded HTML pages (XAMPP, Denwer, etc.).
4) Parse the local site with the settings and parameters you need. I used Screaming Frog.
Parse everything: Title, Meta Description, Breadcrumbs, H1, URL, Images, Description, Content, Price, etc.
5) The parsed data can be saved as CSV and processed manually in Excel (clean up unnecessary tags, remove spam left in the titles by previous SEOs, etc.).
6) Import the CSV into WordPress WooCommerce.
Something like this.
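If you would rather script steps 4 to 6 than use a crawler, here is a minimal sketch in Python (the same idea carries over to a PHP parsing library). Treat it as an illustration, not a finished tool. It assumes the old store marked prices with class="price" and that a comma in a price is the decimal separator; both are guesses you must check against your own pages. The names ProductPageParser, parse_product and to_woocommerce_csv are made up for this example. The headers Name, Regular price and Short description are, as far as I know, ones the built-in WooCommerce CSV importer maps automatically; verify them on the column mapping screen.

```python
import csv
import io
import re
from html.parser import HTMLParser

# Hypothetical sample page; real archived pages will be messier.
SAMPLE_HTML = """<html><head><title>Red Mug | Old Shop</title>
<meta name="description" content="A red ceramic mug."></head>
<body><h1>Red Mug</h1><span class="price">1 290,50 RUB</span></body></html>"""


class ProductPageParser(HTMLParser):
    """Collects <title>, meta description, the first <h1> and a price element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.fields = {"title": "", "meta_description": "", "h1": "", "price": ""}
        self._current = None  # field whose text is being collected right now

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.fields["meta_description"] = (attrs.get("content") or "").strip()
        elif tag == "title":
            self._current = "title"
        elif tag == "h1" and not self.fields["h1"]:
            self._current = "h1"
        # Assumption: prices live in an element with class="price".
        elif "price" in classes and not self.fields["price"]:
            self._current = "price"

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: stop at the first closing tag, so text after a
        # nested element (e.g. <h1><b>Red</b> Mug</h1>) is not captured.
        self._current = None


def normalize_price(text):
    """Turn '1 290,50 RUB' into '1290.50' (comma assumed to be decimal)."""
    match = re.search(r"\d[\d\s]*(?:[.,]\d+)?", text)
    if not match:
        return ""
    return re.sub(r"\s", "", match.group()).replace(",", ".")


def parse_product(html):
    parser = ProductPageParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left over from the HTML layout.
    row = {key: " ".join(value.split()) for key, value in parser.fields.items()}
    row["price"] = normalize_price(row["price"])
    return row


def to_woocommerce_csv(rows):
    """Write parsed rows as CSV text for the WooCommerce product importer."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["Name", "Regular price", "Short description"]
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "Name": row["h1"] or row["title"],
            "Regular price": row["price"],
            "Short description": row["meta_description"],
        })
    return out.getvalue()


if __name__ == "__main__":
    print(to_woocommerce_csv([parse_product(SAMPLE_HTML)]))
```

In practice you would loop over the HTML files saved in step 1, call parse_product on each, and still open the resulting CSV to clean it by eye as described in step 5.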
Are you doing quantum physics or parsing? What chances? Everything here is deterministic. If you know how to do it, it will work. If you don't, it won't.
And learn to state tasks clearly, at least for yourself. The details should not be redundant.
I'll try to do it for you:
Now go through the steps
1. Does the web archive contain all the necessary pages? Probably not. Is all the information up to date? Probably not. This means the information cannot be collected in full, but everything that is in the web archive is available to us.
2. Can you open a page and tell from the information on it which product group the product belongs to? What its characteristics are? Its price? Other options? Everything you can find on the page can be parsed. So the answer is yes.
3. Yes
4. Yes
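Point 1 can be checked before you write any parser. A rough sketch in Python: ask the Wayback Machine CDX API which URLs of the old domain were captured, then compare that against the product URLs you expect to exist. The domain old-shop.example and the expected URL list are placeholders, and the helper names are invented for this example. The comparison is deliberately naive: it ignores http/https and www differences, so expect some false "missing" results.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain):
    """Build a CDX query listing one successful capture per URL."""
    params = {
        "url": domain,
        "matchType": "prefix",       # everything under this prefix
        "output": "json",
        "fl": "timestamp,original",  # only the fields we need
        "filter": "statuscode:200",  # skip redirects and errors
        "collapse": "urlkey",        # one row per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def rows_to_snapshots(rows):
    """CDX JSON output is a list of lists whose first row is the header."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def snapshot_url(snapshot):
    """URL of the archived copy; 'id_' asks for the page without the
    Wayback toolbar and link rewriting."""
    return "https://web.archive.org/web/{}id_/{}".format(
        snapshot["timestamp"], snapshot["original"]
    )


def missing_pages(expected_urls, snapshots):
    """Expected URLs that have no capture (trailing slashes ignored)."""
    archived = {s["original"].rstrip("/") for s in snapshots}
    return sorted(u for u in expected_urls if u.rstrip("/") not in archived)


def fetch_snapshots(domain):
    """Network call: query the CDX API and parse the response."""
    with urlopen(build_cdx_query(domain), timeout=30) as response:
        return rows_to_snapshots(json.load(response))


if __name__ == "__main__":
    # Placeholder domain; replace with the lost store's address.
    for snap in fetch_snapshots("old-shop.example/"):
        print(snapshot_url(snap))
```

Whatever missing_pages returns is the part of the catalogue you will have to recover from somewhere else or give up on.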
Now the most important question: do you know how to code? Have you actually written parsers, or only heard about them? Do you write SQL queries? If not, it will cost you either a lot of time or a lot of money; which to spend is up to you. If the answer were "yes", the question would probably not exist. The choice of library is not fundamental here; you will have to learn this either way.
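To give an idea of how little SQL is involved, here is a hedged sketch in Python that loads already-parsed rows into a staging table and dumps it as SQL text. The table staged_products and its columns are invented for this example. Note that WooCommerce keeps products spread over the WordPress tables wp_posts and wp_postmeta, so a dump like this is only an intermediate format, not something to load into the shop database as is.

```python
import sqlite3


def rows_to_sql_dump(rows):
    """Load parsed product rows into an in-memory SQLite staging table
    and return the whole table as SQL statements."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staged_products "
        "(name TEXT NOT NULL, price TEXT, description TEXT)"
    )
    # Parameterised inserts: the driver handles quoting, so names with
    # apostrophes or other special characters are stored safely.
    conn.executemany(
        "INSERT INTO staged_products (name, price, description) VALUES (?, ?, ?)",
        [(r["name"], r["price"], r["description"]) for r in rows],
    )
    conn.commit()
    dump = "\n".join(conn.iterdump())
    conn.close()
    return dump


if __name__ == "__main__":
    # Hypothetical row, as it might come out of a parser.
    sample = [{"name": "O'Brien mug", "price": "12.50", "description": "Ceramic"}]
    print(rows_to_sql_dump(sample))
```

The point of going through executemany and iterdump rather than gluing strings together is that the escaping is done for you.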