excel
D, 2021-01-13 10:13:37

How to parse data from excel file?

Hello!
I apologize in advance if my explanation of the problem is not entirely clear.
I am working on a university project that parses data from Excel and then saves it to PostgreSQL.
I was given one table and asked to write a program for it. I wrote it, and the data was parsed successfully. But then I was handed another dozen tables, and this is where the problems started. The tables differ somewhat: data that sits in the nth row and jth column of the first table (screenshot 1) may sit in a different row or column in the other tables (screenshot 2).

Screenshots

5ffe96f2ce1e8098016565.png
5ffe970b1b362024088238.png

There are many such cells whose coordinates do not match between tables.
My program starts parsing from a specific column and a specific row of a specific table, because I assumed all the tables would have the same structure.

Question: how do I write a parser that is not tied to particular rows and columns when searching for specific data, and so does not break if the data it needs is in, say, cell C16 rather than B16 as expected? How can all these inconsistencies be accounted for?
I am asking not because I want to avoid the work, but because I am genuinely interested in whether such an "adaptive" program can be written without hacks built from piles of if-else branches and for loops, and whether it is possible in principle.

I don't know if this information is needed or not, but:
1) I use the Java language and the apache.poi library.
2) The project itself is on GitHub


3 answer(s)
datka, 2021-01-13
@datka

In theory: what these documents have in common is a cell labeled Protocol number. Start looking for that cell in column A. Take cell A1: if it is not empty, compare its contents with Protocol number. If they do not match, move on to A2, and so on down to An. If the label is not found in column A, move on to column B.
Once the protocol cell is found, step an offset from it: the non-empty cell you land on should hold the date. Check that the value really is a date, either with standard library tools if they are available, or by hand (for example, the cell contains only digits and separators such as dots or slashes).
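The search described above could be sketched roughly as follows. This is only an illustration under assumptions not taken from the question: the sheet is modeled as a plain `String[][]` grid (with Apache POI the strings could be produced with `DataFormatter.formatCellValue`), the class and method names are hypothetical, and the date pattern `dd.MM.uuuu` is a guess about what the real files contain.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.Optional;

/** Locates values by the label next to them instead of by a fixed address. */
class LabelSearch {

    // Assumed date pattern; adjust to whatever the real sheets contain.
    private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("dd.MM.uuuu");

    /** Returns zero-based {row, col} of the first cell containing the label, scanning column A, then B, ... */
    static Optional<int[]> findLabel(String[][] grid, String label) {
        int width = 0;
        for (String[] row : grid) width = Math.max(width, row.length);
        String needle = label.trim().toLowerCase(Locale.ROOT);
        for (int c = 0; c < width; c++) {
            for (int r = 0; r < grid.length; r++) {
                if (cell(grid, r, c).toLowerCase(Locale.ROOT).contains(needle)) {
                    return Optional.of(new int[] {r, c});
                }
            }
        }
        return Optional.empty();
    }

    /** Returns the first non-blank cell to the right of (row, col) in the same row. */
    static Optional<String> firstValueRightOf(String[][] grid, int row, int col) {
        for (int c = col + 1; c < grid[row].length; c++) {
            String text = cell(grid, row, c);
            if (!text.isEmpty()) return Optional.of(text);
        }
        return Optional.empty();
    }

    /** Parses the text as a date, or returns empty if it does not match the assumed pattern. */
    static Optional<LocalDate> parseDate(String text) {
        try {
            return Optional.of(LocalDate.parse(text.trim(), DATE));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // Treats missing and null cells as empty strings so ragged rows are safe to scan.
    private static String cell(String[][] grid, int r, int c) {
        if (c >= grid[r].length || grid[r][c] == null) return "";
        return grid[r][c].trim();
    }
}
```

With this shape the parser depends only on the label text and on the relative position of the value, so the same code handles a sheet where the label is in B16 and one where it is in C16. For numeric date cells, POI's `DateUtil.isCellDateFormatted` may be a more reliable check than parsing text.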

mystifier, 2021-01-13
@mystifier

An Excel file that is to be imported into a database must have a strictly specified format.
When loading it, you need to check (as far as possible) that it complies with that format.
If a mismatch is found, throw an error and load nothing.
Attempts to hunt for the required data in files of the wrong format often end with a pile of garbage in the database.
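A minimal sketch of this "validate first, load nothing on mismatch" approach might look like the following. The grid model (`String[][]`), the `Expected` helper type, and the exception name are all hypothetical and only serve the illustration; the actual layout rules would come from the agreed file format.

```java
import java.util.ArrayList;
import java.util.List;

/** Checks a sheet against a fixed layout before anything is written to the database. */
class LayoutValidator {

    /** One expected label at a fixed zero-based position (hypothetical helper type). */
    static class Expected {
        final int row;
        final int col;
        final String label;

        Expected(int row, int col, String label) {
            this.row = row;
            this.col = col;
            this.label = label;
        }
    }

    /** Thrown when the sheet does not match the layout; the caller should load nothing. */
    static class FormatException extends RuntimeException {
        FormatException(String message) {
            super(message);
        }
    }

    /** Collects every mismatch, then throws once so the report lists all problems together. */
    static void validate(String[][] grid, List<Expected> layout) {
        List<String> problems = new ArrayList<>();
        for (Expected e : layout) {
            String actual = cellText(grid, e.row, e.col);
            if (!actual.equalsIgnoreCase(e.label)) {
                problems.add("expected '" + e.label + "' at row " + e.row
                        + ", col " + e.col + " but found '" + actual + "'");
            }
        }
        if (!problems.isEmpty()) {
            throw new FormatException(String.join("; ", problems));
        }
    }

    // Out-of-range and null cells count as empty text.
    private static String cellText(String[][] grid, int r, int c) {
        if (r >= grid.length || c >= grid[r].length || grid[r][c] == null) return "";
        return grid[r][c].trim();
    }
}
```

Calling `validate` before any insert statement runs (or doing all inserts in a single transaction that is rolled back on failure) keeps a malformed file from leaving partial data behind.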

Korben5E, 2021-01-13
@Korben5E

To begin with, check whether the cells have defined names; if they do, pull the values out by name rather than by address.
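As a rough illustration of lookup by name, the sketch below resolves a defined name to a cell through its A1-style reference. Everything here is an assumption made for the example: the grid is a plain `String[][]`, the defined names are a `Map` from name to reference, and the class is hypothetical. In a real workbook, Apache POI's `Workbook.getName`, `Name.getRefersToFormula` and `CellReference` should cover the same ground.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Resolves defined names to cell values, so the parser never hard-codes an address. */
class NamedCells {

    // Optional "Sheet!" prefix, then a single-cell A1 reference such as $C$16 or C16.
    private static final Pattern REF = Pattern.compile("(?:.*!)?\\$?([A-Za-z]+)\\$?(\\d+)");

    /** Converts an A1-style reference to zero-based {row, col}. */
    static int[] toRowCol(String reference) {
        Matcher m = REF.matcher(reference.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a single-cell reference: " + reference);
        }
        int col = 0;
        // Column letters are a base-26 number with digits A=1 .. Z=26.
        for (char ch : m.group(1).toUpperCase(Locale.ROOT).toCharArray()) {
            col = col * 26 + (ch - 'A' + 1);
        }
        int row = Integer.parseInt(m.group(2));
        return new int[] {row - 1, col - 1};
    }

    /** Looks up the cell that the given defined name points to. */
    static String valueOf(String[][] grid, Map<String, String> definedNames, String name) {
        String ref = definedNames.get(name);
        if (ref == null) {
            throw new IllegalArgumentException("No such defined name: " + name);
        }
        int[] rc = toRowCol(ref);
        return grid[rc[0]][rc[1]];
    }
}
```

If two files define the same name at different addresses, for instance `Sheet1!$B$16` in one and `Sheet1!$C$16` in the other, the calling code stays identical. This only helps when the people producing the files actually assign names to the cells.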
