PHP
Yevhenii K, 2020-04-13 20:57:55

How to properly parse a large xml file and save the data?

I need to "read" a large XML file and add its values to the database.

For parsing I use XMLReader + DOMDocument, and as a result I get an array like this:

[
    'categories' => [
        0 => [
            'name' => 'Category',
            'id' => 1456,
            'parent_id' => 284
        ]
    ],
    'products' => [
        0 => [
            'category_id' => 1456,
            'id' => 135,
            'available' => true,
            'price' => 22,
            ...
        ],
        ....
    ]
]

  1. How do I parse the file correctly without hitting the script execution time limit or the memory limit?

    Here is the order I came up with:
    • parse the categories
    • catch the closing `categories` tag and write the data to JSON
    • then collect the products, saving them to a file not once at the end but in batches, for example after every 100 entries.


    But something about this order bothers me.

  2. How do I insert a large amount of data into the database with minimal impact on performance?

    Right now, adding one product takes at least 5 `INSERT` queries (the product itself, its link to a category, pictures, properties, etc.), plus `SELECT` queries to check uniqueness.

Please tell me how to organize everything.

1 answer(s)
nokimaro, 2020-04-14
@nokimaro

The fastest way to insert a huge amount of data into MySQL is to import it from a CSV file.
In a similar task, where I had to process 10+ GB of XML and insert 150+ million rows, I solved it in 2 stages:
1. Stream-parse the XML so the script does not run out of memory, and write the results to CSV.
2. Import the data from the CSV into the database via LOAD DATA LOCAL INFILE. No other insertion method could match the speed of loading from a CSV file.
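Stage 1 could be sketched roughly as follows. This is only an illustration, not the code I actually ran: the function name `exportProductsToCsv`, the `<product>` element and its child names (`id`, `category_id`, `available`, `price`) are assumptions based on the array in the question, so adjust them to the real feed.

```php
<?php
// Sketch only: the <product> element and its child names are assumed, not taken from a real feed.
function exportProductsToCsv(string $xmlPath, string $csvPath): int
{
    $reader = new XMLReader();
    if (!$reader->open($xmlPath)) {
        throw new RuntimeException("Cannot open XML file: {$xmlPath}");
    }
    $csv = fopen($csvPath, 'w');
    if ($csv === false) {
        throw new RuntimeException("Cannot open CSV file: {$csvPath}");
    }

    // Header row; the LOAD DATA statement skips it via IGNORE 1 ROWS.
    fputcsv($csv, ['id', 'category_id', 'available', 'price'], ',', '"', '');

    $doc = new DOMDocument();
    $count = 0;
    $more = $reader->read();
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'product') {
            // Expand only this one element into a small DOM fragment.
            $product = simplexml_import_dom($doc->importNode($reader->expand(), true));
            fputcsv($csv, [
                (int) $product->id,
                (int) $product->category_id,
                ((string) $product->available === 'true') ? 1 : 0,
                (string) $product->price,
            ], ',', '"', '');
            $count++;
            // Jump past this element's subtree instead of walking its children.
            $more = $reader->next();
        } else {
            $more = $reader->read();
        }
    }

    fclose($csv);
    $reader->close();
    return $count;
}
```

Each row is written as soon as its element has been read, so there is no need to collect 100 entries in an array first, and memory use should stay roughly flat regardless of file size. Running the script from the command line also helps with the time limit, since `max_execution_time` defaults to 0 (unlimited) in the CLI.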

LOAD DATA LOCAL INFILE '{$csv_file}'
INTO TABLE `{$table_tmp}`
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\\n'
IGNORE 1 ROWS;
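
For completeness, here is one way the statement above might be built and executed from PHP with PDO. Treat it as a sketch under assumptions: `buildLoadDataSql` and `loadCsvIntoTable` are hypothetical helper names, and the connection details in the usage note are placeholders.

```php
<?php
// Hypothetical helper: builds the LOAD DATA statement from an already-quoted file path.
function buildLoadDataSql(string $quotedCsvFile, string $tableTmp): string
{
    // Identifiers cannot be bound as parameters, so allow only a conservative character set.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $tableTmp)) {
        throw new InvalidArgumentException("Unexpected table name: {$tableTmp}");
    }

    // In a double-quoted PHP string, \" becomes " and \\n becomes the two characters \n,
    // which MySQL then interprets as a newline.
    return "LOAD DATA LOCAL INFILE {$quotedCsvFile}
        INTO TABLE `{$tableTmp}`
        FIELDS TERMINATED BY ','
        ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        IGNORE 1 ROWS";
}

// Hypothetical wrapper: quotes the path with PDO and runs the import.
function loadCsvIntoTable(PDO $pdo, string $csvFile, string $tableTmp)
{
    $sql = buildLoadDataSql($pdo->quote($csvFile), $tableTmp);
    // exec() returns the affected-row count reported by the driver, or false on failure.
    return $pdo->exec($sql);
}
```

To use it, the connection has to be created with `PDO::MYSQL_ATTR_LOCAL_INFILE => true` in the driver options passed to the `PDO` constructor, and the MySQL server must have `local_infile` enabled; otherwise the statement is rejected. A call would then look like `loadCsvIntoTable($pdo, '/tmp/products.csv', 'products_tmp')`, where both the path and the table name are placeholders.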
