What is the difference between ORDER BY and WITHIN GROUP (ORDER BY) in plain language?

U

user_of_toster2020-11-30 07:39:26

PostgreSQL

user_of_toster, 2020-11-30 07:39:26

I read the documentation about aggregation expression.

aggregation_function(expression, ORDER BY a) - first sorts the table by a, then applies aggregation_function to the resulting list.

What does aggregation_function(expression), WITHIN GROUP (ORDER BY a) do? I read this part of the documentation many times, the explanations are not clear for a beginner.

Reply

Answer the question

In order to leave comments, you need to log in

1 answer(s)

G

galaxy, 2020-11-30
@user_of_toster

aggregation_function(expression, ORDER BY a) - first sorts the table by a, then applies the aggregation_function to the resulting list.

Yes, only the comma in brackets is superfluous. This syntax allows the aggregates in the SELECT list to sort the input values (recall that the ORDER BY in the query is calculated after GROUP BY and aggregation, i.e. to do this earlier, a subquery was needed to sort the data array), even each aggregate has its own ( previously it was impossible to do it right). Aggregate functions often do not care about the order of input values (COUNT, SUM, MAX - do not depend on the order), but there are functions for which it matters (string_agg, etc, you can also write your own).

What does aggregation_function(expression), WITHIN GROUP (ORDER BY a) do?

This is the syntax for a special kind of aggregates: Ordered-Set Aggregate Functions and Hypothetical-Set Aggregate Functions (you also have an extra comma here). Ordinary ones will not work with him.
They take as input a sorted array of data specified in the ORDER BY (i.e. a , not expression!), plus, possibly, the value of expression (calculated once per query).
The scope is still quite specific: Ordered-Set Aggregate Functions is essentially used only to calculate percentiles (but it has become easy and pleasant to calculate them:

SELECT percentile_disc (0.2) WITHIN GROUP (ORDER BY ...)

).
Hypothetical-Set Aggregate Functions are such analogues of some WINDOW aggregates (rank type):

SELECT rank() OVER (ORDER BY value) ...; -- вернет ранк данной записи в отсортированном по value списке
SELECT rank(55) WITHIN GROUP (ORDER BY value) ...; -- вернет ранк числа 55, как если бы оно было в  отсортированном по value списке

In general, the thing is peculiar. Read the documentation, it can be difficult to understand right off the bat.