How do I parse a complex RTF document, extract tabular data, and split it into pages?
Initially the task seemed simple, but neither a head-on approach nor a couple of parsers found on GitHub got anywhere, which was somewhat disconcerting.
By the way, there is one critical constraint: all components must be legal and free.
I would be grateful for any tips!
Update, based on the answers and comments: the input is an auto-generated, multi-page report that contains several documents of the same type with tabular forms. It needs to be cut into pages, and some information has to be extracted selectively, say the document date and part of the tabular data. The fields do not have any tags.
... and the element tree built by https://github.com/sgolivernet/nrtftree runs to 620331 lines ))
1 - https://github.com/SourceCodeBackup/RtfDomParser is the best candidate for data extraction, and certainly for quick exploration.
I just had to learn how to use it. Fortunately, the structure of the document is quite clear, so everything can be solved. But either it cannot save modified documents, or I still do not understand how to use its Writer.
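
For anyone who only needs extraction and not saving, here is a rough sketch of the same idea without any library. It is not RtfDomParser's API and not a full RTF parser. It assumes the report marks page breaks with `\page` and builds tables from the standard `\cell` and `\row` control words; a generator that breaks pages with `\sect` or `\pagebb` would need that branch adjusted. The function name `parse_pages`, the `SKIP` list, the `cp1252` default, and the sample document are all made up for illustration, and `\u` Unicode escapes are simply ignored.

```python
import re

# One token: control word (+ optional numeric parameter), \'hh hex escape,
# control symbol, group brace, or a run of plain text.
TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?|\\'([0-9a-fA-F]{2})|\\([^a-zA-Z])|([{}])|([^\\{}]+)"
)

# Destinations whose content is not body text (partial list, an assumption).
SKIP = {"fonttbl", "colortbl", "stylesheet", "info", "pict", "header", "footer"}


def parse_pages(rtf, codepage="cp1252"):
    """Return a list of pages: {"text": [paragraphs], "rows": [[cell, ...], ...]}."""
    pages = [{"text": [], "rows": []}]
    buf, row, stack, skipping = [], [], [], False

    def take():
        # Return the text collected so far and reset the buffer.
        s = "".join(buf).strip()
        buf.clear()
        return s

    for m in TOKEN.finditer(rtf):
        word, _param, hexch, symbol, brace, text = m.groups()
        if brace == "{":
            stack.append(skipping)          # remember state of the outer group
        elif brace == "}":
            skipping = stack.pop() if stack else False
        elif symbol == "*" or word in SKIP:
            skipping = True                 # ignore the rest of this group
        elif skipping:
            continue
        elif text:
            buf.append(text.replace("\r", "").replace("\n", ""))
        elif hexch:
            buf.append(bytes.fromhex(hexch).decode(codepage, errors="replace"))
        elif symbol in ("\\", "{", "}"):
            buf.append(symbol)              # escaped literal character
        elif word == "cell":
            row.append(take())              # end of a table cell
        elif word == "row":
            pages[-1]["rows"].append(row)   # end of a table row
            row = []
        elif word in ("par", "page"):
            s = take()
            if s:
                pages[-1]["text"].append(s)
            if word == "page":
                pages.append({"text": [], "rows": []})
    s = take()
    if s:
        pages[-1]["text"].append(s)
    return pages


# Tiny hand-written sample, not taken from the real report.
SAMPLE = (
    r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}"
    r"Date: 2020-01-15\par "
    r"\trowd\cellx1000\cellx2000 Item\cell 42\cell\row "
    r"\page Date: 2020-02-01\par "
    r"\trowd\cellx1000\cellx2000 Other\cell 7\cell\row }"
)

pages = parse_pages(SAMPLE)
for number, page in enumerate(pages, start=1):
    # The date format is an assumption; adapt the pattern to the real report.
    date = re.search(r"\d{4}-\d{2}-\d{2}", " ".join(page["text"]))
    print(number, date.group(0) if date else None, page["rows"])
```

Since the fields have no tags, the cells can only be addressed by position (row index and column index on a given page), so it is worth dumping a few pages first to see which positions hold the values you need.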