Select/where/group by on 100m-200m tables?

V

vittore2012-04-12 18:22:18

MySQL

vittore, 2012-04-12 18:22:18

Feeling that I am missing something with noSql solutions, comments are welcome
There is a table with 100-200 million records, sorted by time, on which I want to quickly execute queries like

select time, field, sum(field2) 
from t inner join t2 on ...
where field3 = x and field  = y 
and time between time1 and time2
group by time, field 
order by time

now all this lives on one server, on sql server and the problem is that indexes to this table take up as much space as data.
What noSql solutions are good for such a task? I have only managed to look at CoachDb so far - it doesn’t look like it will take off

Reply

Answer the question

In order to leave comments, you need to log in

7 answer(s)

S

shagguboy, 2012-04-12
@shagguboy

I mean, such tasks are solved much easier simply by an additional table (field, sum_cache) and updating based on triggers or independently

it's called math.

E

edogs, 2012-04-12
@edogs

The answer is off topic.
But since your query seems to be completely non-SQL-based, and the indexes take up a lot of space ... it may make sense to either stupidly round the time to minutes (the indexes will immediately take up 60 times less space), or enter an additional. a field with time rounded to minutes (the table will inflate, but the indexes will be smaller).

R

rPman, 2012-04-12
@rPman

Does sum(field2) change for each field and how often? Is the speed of its recording critical?
More specifically, does field2 change? Or just adding and removing new entries?
I mean, such tasks are solved much easier with just an additional table (field, sum_cache) and an update based on triggers or on their own ... by the way, as far as I know, there are databases that support cache indexes based on expressions (in fact, they create a field and fill it with triggers )

S

shagguboy, 2012-04-12
@shagguboy

covering the clustered index will save you.

S

shagguboy, 2012-04-12
@shagguboy

msdn.microsoft.com/en-us/library/dd171921%28v=sql.100%29.aspx

S

shagguboy, 2012-04-12
@shagguboy

well, or if you want to make an OLAP cube in a completely adult way and work with it. In MS SQL OLAP goes free to the server

K

kuzemchik, 2012-04-14
@kuzemchik

Cubes are needed for multidimensional slices, they have their own cache, which is updated by sql queries. If there are many different fields by which you group, then the cube will be useful.
If not, then I think Partitioning tables would be better for you.