How to read data from a huge XML file?
I need to read data from a huge XML file, say 50 GB in size. The file is a root element containing a collection of similar nodes (same name, almost always the same set of attributes and nested nodes). The structure of each node is not known in advance: some fields must be read as-is, others are converted or calculated on the fly, or after all the other fields have been read. Before parsing, the structure of the entities in the file is unknown. Everything goes into a database, and the list of tables and columns can change over the lifetime of the application. There are rules by which the XML is parsed, for example:
- For PrimaryKey, compute a hash of the value, if there is one, and put it in the KeyHash cell.
- For each XXXId field of a table, look at the parent and child nodes named XXX and take the value from there.

Writing each record to the database immediately via SqlBulkCopy is not an option, so the current code reads with while(reader.Read()) { ... }
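The node-by-node streaming read described above can be sketched with Python's `xml.etree.ElementTree.iterparse` (the asker's stack is .NET `XmlReader`, but the pull-parse-and-discard pattern is the same). The node name `record` and the `PrimaryKey`/`KeyHash` hashing rule below are illustrative assumptions, not the asker's actual schema:

```python
import hashlib
import xml.etree.ElementTree as ET
from io import BytesIO

def stream_rows(source, node="record"):
    """Yield one dict per record node without loading the whole document."""
    # iterparse fires an event per element; only one record is
    # materialized at a time, so memory stays flat even for a
    # multi-gigabyte file.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != node:
            continue
        row = {child.tag: child.text for child in elem}
        # Illustrative rule: hash the primary key into a KeyHash column.
        pk = row.get("PrimaryKey")
        if pk is not None:
            row["KeyHash"] = hashlib.sha256(pk.encode("utf-8")).hexdigest()
        yield row
        elem.clear()  # discard the finished node to keep memory flat

# Tiny in-memory document standing in for the 50 GB file.
doc = (b"<root>"
       b"<record><PrimaryKey>42</PrimaryKey><Name>a</Name></record>"
       b"<record><PrimaryKey>43</PrimaryKey><Name>b</Name></record>"
       b"</root>")
rows = list(stream_rows(BytesIO(doc)))
```

The key point is the `elem.clear()` call after each record: without it, the tree accumulates every parsed node and memory grows with file size.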
All parsed data is stored in memory in arrays and then dumped into the database once enough has accumulated, via SqlBulkCopy.WriteToServer(IDataReader). The problem is that parsing even a couple of gigabytes takes a very long time. There is a page describing a parser with impressive processing figures, but I can't find anything about this there either.
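The batch-and-flush approach described (accumulate rows in memory, bulk-write when a threshold is reached) can be sketched as follows. This is a minimal illustration, not the asker's code: `sqlite3` stands in for SQL Server and `SqlBulkCopy`, and the `items` table and its columns are made up:

```python
import sqlite3

def bulk_load(rows, conn, batch_size=1000):
    """Buffer parsed rows and flush them to the database in batches,
    the same shape as collecting arrays and calling
    SqlBulkCopy.WriteToServer once enough data has accumulated."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (pk TEXT, name TEXT)")
    buffer = []
    for row in rows:
        buffer.append((row["pk"], row["name"]))
        if len(buffer) >= batch_size:
            conn.executemany("INSERT INTO items VALUES (?, ?)", buffer)
            buffer.clear()
    if buffer:  # flush the final, partially filled batch
        conn.executemany("INSERT INTO items VALUES (?, ?)", buffer)
    conn.commit()

conn = sqlite3.connect(":memory:")
bulk_load([{"pk": "1", "name": "a"},
           {"pk": "2", "name": "b"},
           {"pk": "3", "name": "c"}], conn, batch_size=2)
count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

Batching amortizes per-round-trip overhead; the batch size is a tuning knob, and the final partial batch must be flushed explicitly or those rows are silently lost.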