How parvily to adjust distribution of the data?

A

arturgspb2012-07-17 12:56:41

PostgreSQL

arturgspb, 2012-07-17 12:56:41

Want to know if this is even possible?
What we have:

DBMS Postgresql
10-20 tables with different data in which up to 7 billion records are expected in each
All entries somehow belong to the entity - "Advertising Campaign"

What do you want:

Break the data into parts within one server and, if new data does not fit on this server (according to our criteria, for example: the number of campaigns on one server is more than 1 million), write them to another free one (it doesn’t matter yet who decides on which exactly).
Necessarily! One "Advertising Campaign" and all records subordinate to it on one server, as there are many related requests. This is necessary to exclude requests to more than one server for any information about the campaign.
The presentation of each table should be simple—i.e. client code should simply query/add/update data from a single table. Something like that - just SELECT * FROM “data”.”campaigns”, etc., although at the same time, as I wrote above, this table should be broken into pieces for a quick search, as it usually works with one “Ad campaign” at a time.
I would like to work without stored procedures.

What is it all for?
I don’t want to store such a large amount of data on one server in one table (by this I mean each of “10-20 tables with different data”), since the search for data slows down significantly - for example, sometimes you need to search by text, but at the same time we know in which "advertising campaign" she is. But since the index is general, large campaigns will interfere with small ones.

Reply

Answer the question

In order to leave comments, you need to log in

4 answer(s)

A

alz, 2012-07-17
@alz

And just sharding doesn't suit you?

A

alz, 2012-07-17
@alz

All logic on the client. If a client for a certain id works with the same server all the time, then nothing will happen on other servers

D

dbmaster, 2012-07-17
@dbmaster

In my opinion, there is a contradiction in the requirements: on the one hand, the separation of data, and on the other hand, a selection from one place.
Possible options:
* partitioning + replication
* additional indexes including campaign id
* exotic solutions sqlrelay.sourceforge.net/sqlrelay/router.html (not tested)

T

ToSHiC, 2012-07-18
@ToSHiC

www.postgresql.org/docs/8.1/static/ddl-partitioning.html
The campaign id must be specified in all tables as the key by which partitioning is done.