Java
aidkz, 2017-07-03 20:01:12

Java: how do I merge large XML files?

The XML files are 100 MB to 1 GB each. DOM fails with an out-of-memory error (disabling the check with -XX:-UseGCOverheadLimit or giving the JVM more memory is not an option). Doing it from scratch with StAX (reading each XML file and building a new one on the fly) is not an option either. JAXB marshalling and unmarshalling? The volume is too large, and with that many classes it will most likely run out of memory as well. What other approaches are there, and in which direction should I dig? (Without frameworks, if possible.)
Thanks in advance.


2 answer(s)
al_gon, 2017-07-04
@al_gon

StAX + JAXB
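The streaming half of this suggestion could look roughly like the sketch below, which uses only the JDK's StAX event API. It copies the children of each input's root element under a single new root, so no document is ever held in memory as a whole. The class name `StaxMerge` and the `rootName` parameter are made up for illustration. JAXB is deliberately left out so the sketch runs on a bare JDK; in a full StAX + JAXB variant, each record subtree would presumably be handed to `Unmarshaller.unmarshal(XMLStreamReader, Class)` instead of being copied event by event.

```java
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.Reader;
import java.io.Writer;
import java.util.List;

public class StaxMerge {

    /**
     * Streams every input and copies the content of its root element
     * under one new root element named rootName (a hypothetical name).
     */
    public static void merge(List<Reader> inputs, Writer out, String rootName)
            throws XMLStreamException {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        XMLEventFactory events = XMLEventFactory.newInstance();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);

        writer.add(events.createStartDocument());
        writer.add(events.createStartElement("", "", rootName));

        for (Reader input : inputs) {
            XMLEventReader reader = inputFactory.createXMLEventReader(input);
            int depth = 0; // 0 = outside the root element, 1 = directly inside it
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    depth++;
                    if (depth == 1) {
                        continue; // drop the original root start tag
                    }
                } else if (event.isEndElement()) {
                    depth--;
                    if (depth == 0) {
                        continue; // drop the original root end tag
                    }
                }
                // Anything at depth 0 is prolog or epilog (declaration, comments): skip it.
                if (depth >= 1) {
                    writer.add(event);
                }
            }
            reader.close();
        }

        writer.add(events.createEndElement("", "", rootName));
        writer.add(events.createEndDocument());
        writer.close();
    }
}
```

For real files, the readers and the writer would wrap buffered file streams rather than in-memory strings. Namespace declarations on the dropped root elements are not carried over, so this sketch assumes the records do not depend on them.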

Odissey Nemo, 2017-07-07
@odissey_nemo

If I understand correctly that there are many small XML files with a total volume of 100 MB to 1 GB, then these are the methods I have used:
1. A StringBuilder that starts with a header, to which I append the content pulled out of each XML file with a regex. At the end I write into the StringBuilder whatever the header requires (for example, the number of files processed), append the closing tail of the combined XML, and voilà.
2. If the header does not need to change at the end of the run, open a new ZIP file and write to its stream, which goes straight to disk (or to memory, as the situation requires). The result is also much more compact, roughly 20-30 times smaller, if memory serves.
3. You can also simply write to a buffered file stream, which likewise goes straight to disk (see point 2).
I always pulled the pieces out with a regex only because the conditions were simple: find the start and end tags of the desired XML fragment. If formatting is wanted, you can cheat a little by adding extra spaces, tabs and \n (\r) when appending each fragment you find.
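A minimal sketch of the regex-and-stream approach from points 2 and 3 might look like this. The tag names `item` and `items` and the class `RegexMerge` are hypothetical. The pattern assumes records are never nested inside one another and never self-closing, which is the "simple conditions" case described above; it is not a general XML parser.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMerge {

    // Hypothetical record tag. DOTALL lets a record span several lines;
    // the reluctant .*? stops at the first closing tag.
    private static final Pattern RECORD =
            Pattern.compile("<item\\b[^>]*>.*?</item>", Pattern.DOTALL);

    /**
     * Writes a fixed header, every record found in the given documents,
     * and a closing tail. Returns the number of records written.
     */
    public static int merge(List<String> documents, Writer target) throws IOException {
        BufferedWriter out = new BufferedWriter(target);
        int count = 0;
        out.write("<items>\n");
        for (String document : documents) {
            Matcher matcher = RECORD.matcher(document);
            while (matcher.find()) {
                out.write("  ");            // cosmetic indentation only
                out.write(matcher.group());
                out.write('\n');
                count++;
            }
        }
        out.write("</items>\n");
        out.flush(); // the caller owns the target and decides when to close it
        return count;
    }
}
```

Because each record is written as soon as it is found, only one small source file needs to be in memory at a time. Passing a writer that wraps a `ZipOutputStream` entry instead of a plain file stream would give the compressed variant from point 2.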
