How to properly store/select data in PostgreSQL?
Good day!
This is a follow-up to my previous questions:
[PostgreSQL] How to cast strings to INT and other data types?
How to cast an array of numbers from a VARCHAR field to INTEGER type in PostgreSQL?
The essence of the problem: there is a data set (from 1 to 1000+ records) strictly tied to another record (a simple one-to-many relationship). The data is of mixed type: values can be strings (with all that entails) or numbers. For that reason everything is stored as VARCHAR.
At the same time, we need to work with this data either as strings or as numbers, depending on the situation. If the search condition is "search by strings", we search everything at once, strings and numbers alike, as if they were all strings (formally, they are). If the condition is a range search, for example: ... WHERE n >= 10 AND n <= 100;
then we need to select only the numbers and compare them as numbers.
The solutions as I see them:
Option 1. We store string data in a table for strings and numeric data in a table for numbers (and apparently a separate table will be needed for fractional values), and depending on the search conditions we select from the two tables. There are minor problems here:
a) The data is fragmented.
b) The system has to detect the format of the incoming data and write it to the appropriate table, and there is some probability of misdetection: "333555" is not necessarily an amount; it could be a phone number or something else entirely.
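For illustration, Option 1 might be laid out roughly as follows. This is only a sketch: the table and column names (parent, attr_text, attr_num, val) are hypothetical and do not come from the question, and numeric is assumed for the number table, which would also hold fractional values without a third table.

```sql
-- Hypothetical schema: parent, attr_text and attr_num are illustrative names.
CREATE TABLE parent (
    id bigint PRIMARY KEY
);

CREATE TABLE attr_text (
    parent_id bigint NOT NULL REFERENCES parent (id),
    val       text   NOT NULL
);

CREATE TABLE attr_num (
    parent_id bigint  NOT NULL REFERENCES parent (id),
    val       numeric NOT NULL  -- numeric stores integers and fractions alike
);

-- "Search by strings": look in both tables, treating numbers as text.
SELECT parent_id, val FROM attr_text WHERE val LIKE '%555%'
UNION ALL
SELECT parent_id, val::text FROM attr_num WHERE val::text LIKE '%555%';

-- Range search: only the numeric table is involved, no casts needed.
SELECT parent_id, val FROM attr_num WHERE val BETWEEN 10 AND 100;
```

The sketch does nothing about problem b): deciding which table "333555" belongs in still has to happen somewhere before the INSERT.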
Option 2.1. We store all the data in one table as VARCHAR and pick out the numbers by an indirect sign, for example like this:
SELECT field1::integer FROM table1 WHERE field1 ~ E'^\\d+$' AND field1::integer > 3;
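One caveat with the query above: PostgreSQL does not guarantee that conditions joined by AND are evaluated left to right, so the cast may still be attempted on a non-numeric row and raise an error. A more defensive variant, as a sketch only, wraps the test and the cast in CASE. It reuses table1 and field1 from the example; the switch to numeric and the range bounds are assumptions made for illustration.

```sql
-- Sketch only: table1/field1 come from the example above.
-- CASE checks the regex before attempting the cast for each row.
-- numeric is used instead of integer so that long digit strings
-- (phone numbers, for example) do not overflow the integer range.
SELECT field1::numeric AS n
FROM table1
WHERE CASE
        WHEN field1 ~ '^[0-9]+$' THEN field1::numeric BETWEEN 10 AND 100
        ELSE false
      END;
```

This still treats every all-digit string as a number, so the "amount or phone number" ambiguity remains.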
Third normal form
So you don't know what set of fields you actually have,
you don't know which queries predominate,
you don't know the set of field types
...
and it's not clear what "better", "fast" and so on mean.
It seems to me that all the problems come from storing completely different entities in one column. Until you put them in different places, you will suffer. Even regular expressions, flags, or attempts to store numbers separately from strings will not solve the problem 100%: you yourself gave an example where it is unclear whether a string holds an amount or a phone number. Something is wrong in the storage scheme itself. Only you know the subject area, so it's up to you to decide what goes where.
If these are dynamic attributes, it might be worth looking at jsonb with a GIN index. But you need to understand in more detail what kind of analytics will run over these fields and why the numbers matter.
I agree with Denis Smirnov. In your case, JSONB may well be the best solution. The downside is data denormalization. If you need referential integrity checks inside the JSONB data, the task becomes much more complicated. It can be solved, for example, with triggers that call validation functions.
If that kind of control can be neglected, then I recommend taking a closer look at JSONB.
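The JSONB suggestion from the answers above could look roughly like this. It is a minimal sketch under assumptions: the item table, the attrs column, and the amount and phone keys are all hypothetical names, not something from the question.

```sql
-- Hypothetical table: item and attrs are illustrative names.
CREATE TABLE item (
    id    bigint PRIMARY KEY,
    attrs jsonb  NOT NULL DEFAULT '{}'
);

-- GIN index supporting containment (@>) and key-existence (?) operators.
CREATE INDEX item_attrs_gin ON item USING gin (attrs);

-- JSON keeps the type: 333555 is a number, "333555" is a string.
INSERT INTO item (id, attrs) VALUES
    (1, '{"amount": 333555}'),
    (2, '{"phone": "333555"}');

-- Exact-match search that can use the GIN index.
SELECT id FROM item WHERE attrs @> '{"phone": "333555"}';

-- Range search over numeric values only; jsonb_typeof guards the cast.
SELECT id
FROM item
WHERE CASE
        WHEN jsonb_typeof(attrs -> 'amount') = 'number'
        THEN (attrs ->> 'amount')::numeric BETWEEN 10 AND 1000000
        ELSE false
      END;
```

Because the writer chooses between a JSON number and a JSON string at insert time, the type no longer has to be guessed from the characters. The range query as written would not be served by the GIN index, so whether it is fast enough depends on the analytics mentioned above.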