What is the difference between the following two query approaches?

Vitaly Rybchenko, 2017-07-14 15:08:17

SQL Server

Good day, everyone!
The two approaches below return the same result, but the first query takes 30 ms and the second takes 30 s. What is the fundamental difference?
Option 1

SELECT F.[FormId], COUNT(F.Id)
  FROM [Application].[FormFieldsDataContent] F WITH (NOLOCK) JOIN
    (SELECT Id FROM [Schema].ProjectForms WITH (NOLOCK) WHERE Period_id IN (902, 855)) A ON F.FormId = A.Id
  WHERE F.PeriodId IN (902, 855)
GROUP BY F.[FormId]

Option 2

SELECT F.[FormId], COUNT(F.Id)
  FROM [Application].[FormFieldsDataContent] F WITH (NOLOCK) JOIN
    [Schema].ProjectForms A WITH (NOLOCK) ON F.FormId = A.Id AND F.PeriodId = A.Period_id
  WHERE A.Period_id IN (902, 855)
GROUP BY F.[FormId]

1 answer(s)

res2001, 2017-07-14
@res2001

In the second query, you need to move the Period_id in (902,855) condition into the ON clause of the join, like this:

SELECT F.[FormId], COUNT(F.Id)
  FROM [Application].[FormFieldsDataContent] F WITH (NOLOCK) JOIN
    [Schema].ProjectForms A WITH (NOLOCK) ON A.Period_id IN (902, 855) AND F.FormId = A.Id AND F.PeriodId = A.Period_id
GROUP BY F.[FormId]

Apparently your ProjectForms table is large. Without this condition in the right place, every row that satisfies the join condition (F.FormId=A.Id and F.PeriodId=A.Period_id) is selected first, and only then are the rows with A.Period_id in (902,855) picked out of that large intermediate result.
In the first example the condition sits inside the nested query, so the nested selection is much smaller before the join happens.
Subqueries also tend to take longer than joins, so my version is likely to be even faster; compare the execution plans to confirm.
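
One way to check these guesses is to measure them. The sketch below uses SQL Server's SET STATISTICS IO and SET STATISTICS TIME to report reads and elapsed time for a query, and runs a variant of Option 2 that repeats the period filter directly on F, which is what Option 1 effectively does. The extra predicate is an assumption about where the time goes, not something confirmed in the question. For an inner join, moving a filter between ON and WHERE does not change the result, so the plans may well turn out identical.

```sql
-- Report logical reads and CPU/elapsed time for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant of Option 2 (an assumption, not from the thread):
-- the period filter is repeated on F so the large table can be filtered directly.
-- It is logically redundant because F.PeriodId = A.Period_id already holds.
SELECT F.FormId, COUNT(F.Id)
  FROM [Application].[FormFieldsDataContent] F WITH (NOLOCK)
  JOIN [Schema].ProjectForms A WITH (NOLOCK)
    ON F.FormId = A.Id AND F.PeriodId = A.Period_id
 WHERE A.Period_id IN (902, 855)
   AND F.PeriodId IN (902, 855)
 GROUP BY F.FormId;

-- Switch the reporting off again.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running each option from the question the same way and comparing the logical reads and elapsed times in the Messages output shows which table accounts for the 30 s.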