V
V
VicTHOR2019-08-07 22:02:11
Database design
VicTHOR, 2019-08-07 22:02:11

Database or xml?

It's a product catalog. The catalog is uploaded to the server as an xml file.
There are products in the catalog, the product has properties, properties can be multiple (for example, color).
It was proposed to store catalog data in just 2 tables, as I understand it, in the form

id | product_xml_id | xmlNodeName
xmlNodeName_id | Value

Or somehow wrong, if someone knows the variant of 2 tables, correct me.
As I understand it, in this case, the multiple property "color" of each product must have a unique identifier in xml.
You can also store in one table, pulling out in one line
id | product_xml_id | xml
. At the same time, when building a catalog, get everything, and then get what you need through xPath.
Or standard parse xml and build tables for the product
id | product_xml_id | property1 | property2
id | product_id | colors 
id | colors_id | sizes

Or do not enter the values ​​into the database at all, get the necessary data for the product through xPath.
I heard about xml-injection, I don’t think that you can do anything with the directory, but in any case, you can prevent the file from being read to users, allow the script.
Total 4 options.
How relevant will the indices be in the first 2?
In any of the cases, caching is needed, i.e. the data will be taken once, and the next time will be when updating the xml.
xml needs to be updated. In the case of a large file, I suppose it can be long (at least 1s can take a long time to build it), which is undesirable. But you can unload parts and update the database (is it correct in each of the first 2 cases?).
I see caching like this: once a page is built, the response is duplicated in a new file, at the beginning of the page, check the existence of this file, if it exists, output it. After updating xml, delete the file. The cached file will have the correct last-modified.
In this case, there will be non-cached pages (for example, a filter - show products of 2 manufacturers, 1 size, 2 colors).
I need to understand the nuances of different options, cons, optimization (which is faster - read xml or make a selection from 3 tables, for example?), and point out what I may be missing, logic errors.

Answer the question

In order to leave comments, you need to log in

2 answer(s)
A
Adamos, 2019-08-07
@VicTHOR

There are no options. XML is a convenient format for exchanging information between fundamentally different systems (like 1C and a website, for example). This is where its advantages end, and it makes no sense to use it anywhere else.
As soon as it comes to constant data processing and the speed of this processing, it is naturally logical to do this with the help of databases specially designed for this purpose and polished for a long time better than any of your bikes.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question