PHP
Alexey Fr, 2016-05-25 16:12:23

How to reduce memory usage when processing a large XML file?

This code generates an SQL query from an XML file.

// Read the whole file into a string, then parse it into an in-memory tree.
$data=file_get_contents($uploads.$filename);
$xmlf=new SimpleXMLElement($data);
$g_table='SET NAMES "utf8";';
$g_table.='INSERT INTO groups (group_id,group_name) VALUES ';
// Build one multi-row INSERT; attribute values are interpolated as-is (no quoting or escaping).
foreach ($xmlf->gr as $group) {
  $g_table.='('.$group['group_id'].','.$group['group_name'].'),';
}
$g_table=rtrim($g_table,',');
$g_table.=' ON DUPLICATE KEY UPDATE group_id=VALUES(group_id),group_name=VALUES(group_name)';
// Save the query to a file, then execute it.
$g_table_file=fopen('sqlfiles/groups.sql','w');
$file_bool['g_table']['file']=fwrite($g_table_file,$g_table);
fclose($g_table_file);
$file_bool['g_table']['query']=$this->query($g_table);
// Free what is no longer needed before moving on to the next table.
unset($g_table);
unset($g_table_file);
unset($xmlf->gr);

A similar construction is used for other tables.
XML looks like this:
<Main>
<!-- ~200 lines -->
<gr group_id="101" group_dependence="0" group_name="Such-and-such group" group_description="''"/>
<!-- ~11000 lines -->
<gp group_id="1897" ppl_name="Ivanov Ivan Ivanych" tgl="5"/>
<!-- ~30k more lines with various names and parameters -->
</Main>

Why is everything under the same parent node? Because access with SimpleXML is easier and faster that way.
Why is the code so crude, when a single block could have generated all the queries? Because it is temporary, although nothing is more permanent than a temporary solution.
This approach consumes a lot of memory, and splitting the XML into several files and processing them separately is not an option.
So the actual question: is it possible to process the XML without needing over 9000 memory? Or do I still have to raise the memory limit? I'd rather not raise it, and not just out of stubbornness: if I need to raise it, it seems I'm doing something wrong and could do better.

4 answer(s)
Alexander Aksentiev, 2016-05-25
@Sanasol

However, if it is split and executed in one script, it may happen that there is not enough memory.

Have you tried clearing variables after use?
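
One way to check whether clearing variables helps is to measure it. Below is a minimal sketch using PHP's memory_get_usage() around an unset(). The 5 MB string is only a stand-in for the raw XML returned by file_get_contents(), and the helper name measureUnsetEffect is hypothetical, not something from the question.

```php
<?php
// Hypothetical helper: reports how much memory a buffer of $bytes holds
// and how much unset() gives back.
function measureUnsetEffect(int $bytes): array
{
    $before = memory_get_usage();
    $data   = str_repeat('x', $bytes); // stand-in for file_get_contents(...)
    $loaded = memory_get_usage();
    unset($data);                      // release the buffer
    $after  = memory_get_usage();

    return [
        'held'  => $loaded - $before,  // bytes held while the buffer exists
        'freed' => $loaded - $after,   // bytes returned by unset()
    ];
}

$r = measureUnsetEffect(5 * 1024 * 1024);
printf("held: %d bytes, freed: %d bytes\n", $r['held'], $r['freed']);
```

In the question's code, $data is still alive while the SimpleXMLElement tree exists, so calling unset($data) right after the tree is built should release the raw string. The parsed tree itself stays in memory either way.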

DarkMatter, 2016-05-25
@darkmatter

Show us the query, and run it in a loop, for example.

Night, 2016-05-25
@maxtm

However, if it is split and executed in one script, it may happen that there is not enough memory.

If you split it up, why would there suddenly not be enough memory?
Pick a query size small enough that it fits.
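
A sketch of what "pick a query size" could look like, assuming the file has the structure shown in the question. Instead of SimpleXML it uses XMLReader, which reads one node at a time rather than building the whole tree, and it emits the INSERT in batches. The function names groupInsertBatches and buildGroupInsert, and the batch size, are hypothetical. addslashes() is only a placeholder here: real code should use the database driver's own escaping or prepared statements.

```php
<?php
// Streams <gr> elements one by one and yields multi-row INSERT statements
// of at most $batchSize rows each. Names and batch size are illustrative.
function groupInsertBatches(string $xmlPath, int $batchSize = 500): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($xmlPath)) {
        throw new RuntimeException("Cannot open $xmlPath");
    }

    $rows = [];
    while ($reader->read()) {
        // Skip everything that is not an opening <gr> element.
        if ($reader->nodeType !== XMLReader::ELEMENT || $reader->name !== 'gr') {
            continue;
        }
        $id   = (int) $reader->getAttribute('group_id');
        // Placeholder escaping only; use the DB driver's escaping in real code.
        $name = addslashes((string) $reader->getAttribute('group_name'));
        $rows[] = "($id,'$name')";

        if (count($rows) >= $batchSize) {
            yield buildGroupInsert($rows);
            $rows = []; // drop the finished batch so memory stays bounded
        }
    }
    $reader->close();

    if ($rows) {
        yield buildGroupInsert($rows); // remaining partial batch
    }
}

function buildGroupInsert(array $rows): string
{
    return 'INSERT INTO groups (group_id,group_name) VALUES '
        . implode(',', $rows)
        . ' ON DUPLICATE KEY UPDATE group_id=VALUES(group_id),group_name=VALUES(group_name)';
}

// Demo on a tiny temporary file with a batch size of 2.
$tmp = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents(
    $tmp,
    '<Main><gr group_id="101" group_name="A"/><gr group_id="102" group_name="B"/>'
    . '<gr group_id="103" group_name="C"/><gp group_id="101" ppl_name="X" tgl="5"/></Main>'
);
$statements = iterator_to_array(groupInsertBatches($tmp, 2), false);
unlink($tmp);
echo count($statements), " statements\n";
```

In the original script each yielded statement would go to $this->query() (and, if needed, be appended to the .sql file) inside a foreach, so only one batch is held in memory at a time.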

#algooptimize #bottize, 2016-05-25
@user004

Good lord, please describe the task and the input data specifically, plus the solution code :)
