How to merge multiple rows of a related table into one in PostgreSQL 9.3?

decenter, 2014-12-11 11:08:08

PostgreSQL

Good afternoon!
There is a PostgreSQL 9.3 database with a branching structure, about 10 GB in total (7K-2M records per table). The task is to combine several rows of a related table into one. I'll try to explain with an abstract example. Suppose there is a films table with a few attributes (film id and title). There is also an actors table that stores the list of actors. The two tables are linked through an intermediate table (to provide a many-to-many relationship).
films

id	title
1	Matrix
2	Johnny Mnemonic
3	Bikers

actor_in_films

id_film	id_actor
1	1
1	2
2	1
3	2

actors

id	name
1	Keanu Reeves
2	Laurence Fishburne
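
For anyone who wants to reproduce the example, the three tables above could be set up roughly as follows. This is only a sketch: the column types, keys, and the column name title are assumptions, since the post shows only the data.

```sql
-- Sketch of the example schema; types and constraints are assumed.
CREATE TABLE films (
    id    integer PRIMARY KEY,
    title text NOT NULL
);

CREATE TABLE actors (
    id   integer PRIMARY KEY,
    name text NOT NULL
);

-- Link table providing the many-to-many relationship.
CREATE TABLE actor_in_films (
    id_film  integer NOT NULL REFERENCES films (id),
    id_actor integer NOT NULL REFERENCES actors (id),
    PRIMARY KEY (id_film, id_actor)
);

INSERT INTO films (id, title) VALUES
    (1, 'Matrix'),
    (2, 'Johnny Mnemonic'),
    (3, 'Bikers');

INSERT INTO actors (id, name) VALUES
    (1, 'Keanu Reeves'),
    (2, 'Laurence Fishburne');

INSERT INTO actor_in_films (id_film, id_actor) VALUES
    (1, 1),
    (1, 2),
    (2, 1),
    (3, 2);
```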

An additional column, actors, is created in the films table, and it needs to store information about the actors (actor identifier and name), for example in the format "id-name" with "|" as the separator. Storing the data as JSON is an alternative.
films

id	title	actors
1	Matrix	1-Keanu Reeves | 2-Laurence Fishburne
2	Johnny Mnemonic	1-Keanu Reeves
3	Bikers	2-Laurence Fishburne

Is it possible to do this in a single query in PostgreSQL 9.3?

1 answer(s)

decenter, 2014-12-11
@decenter

This task arose from the need to build an intermediate table from which the web application displays the necessary information. Initially the selection was made from 4 related tables, but such a query takes several seconds, which is too long for the application. Tests with an intermediate table gave 0.8-0.9 seconds, which is more or less acceptable. In the future it is planned to combine this with a NoSQL solution, but migrating the data and rewriting the code to use PostgreSQL and NoSQL together will take significant time. That is why this intermediate solution came about.
Reading the manuals and trial and error have so far led to this option:

UPDATE films SET actors = (
    SELECT json_agg(ALL (actors.id, actors.name))
    FROM actor_in_films
    INNER JOIN actors ON actors.id = actor_in_films.id_actor
    WHERE actor_in_films.id_film = films.id
    GROUP BY actor_in_films.id_film
);
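
For the "id-name" format with a "|" separator described in the question, a minimal sketch could use string_agg instead of json_agg. It assumes the target column is called actors, as in the question, and has type text. The aliases aif and a are only for illustration.

```sql
-- Sketch: fill films.actors with "id-name | id-name" strings.
-- Assumes films.actors is a text column.
UPDATE films
SET actors = (
    SELECT string_agg(a.id::text || '-' || a.name, ' | ' ORDER BY a.id)
    FROM actor_in_films AS aif
    JOIN actors AS a ON a.id = aif.id_actor
    WHERE aif.id_film = films.id
);
```

Films with no rows in actor_in_films get NULL in this column, because the aggregate has nothing to combine.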

If anyone can suggest a better option, I would be immensely grateful.
Caching is a good option, but the problem is that the data is updated daily (on average, about 100K-200K records are updated per day, and about the same number are added). Caching would be great if the same record in the table were accessed 5 or more times. But alas, the same record is very rarely accessed more than 2-3 times. The main load comes from a large number of single accesses.
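
Since the column has to be rebuilt as the data changes, one possible variant is a set-based UPDATE ... FROM that aggregates all actors once and then joins the result to films, rather than running a subquery per film. This is only a sketch under assumptions: the column is assumed to be named actors and to have type json, and whether it is actually faster on this data would need checking with EXPLAIN ANALYZE.

```sql
-- Sketch: aggregate once per film, then update films from the result.
-- Assumes films.actors is a json column.
UPDATE films AS f
SET actors = agg.actors_json
FROM (
    SELECT aif.id_film,
           -- Aggregating the subquery row "a" keeps the column names,
           -- so each array element is an object with "id" and "name" keys.
           json_agg(a ORDER BY a.id) AS actors_json
    FROM actor_in_films AS aif
    JOIN (SELECT id, name FROM actors) AS a ON a.id = aif.id_actor
    GROUP BY aif.id_film
) AS agg
WHERE f.id = agg.id_film;
```

Two differences from the query above are worth noting. A bare row constructor such as (actors.id, actors.name) is serialized with generic keys f1 and f2, while the subquery row here keeps id and name. Also, films without any actors are not touched by this statement, so their actors value stays whatever it was before.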