Developer42

2016-12-19

Filtering on a Date Window

Filed under: SQL Server, Technology — Tags: , , , , , — Developer42 @ 17:21

This post is to cover an explanation I have to repeat quite often when demoing code, or when suggesting improvements to others’ logic.

A lot of the time we’re asked to filter for results falling within a date window and you’ll see code implemented such as this:
(I’ve used SQL in this example; but the concept illustrated here is the same regardless of implementation language).

declare @StartDate datetime = '2016-01-01 00:00'
, @EndDate datetime = '2016-12-31 23:59:59.997' --997 instead of 999 due to sql date quirk: see http://stackoverflow.com/a/3584913/361842

select *
from myTable
where myStartDate >= @StartDate
and myEndDate <= @EndDate

--or

select *
from myTable
where myStartDate between @StartDate and @EndDate
and myEndDate between @StartDate and @EndDate

That’s fine… sometimes.
The logic above will return any results where the entire result’s duration falls within the window’s range; i.e. the record starts after the window opens and completes before it’s closed.
This is best illustrated in the below graphic.

Example of Time Window with Whole Record falling within Window

Example of Time Window with Whole Record falling within Window

However, a lot of the time the requirement is that any part of the record falls into our window. i.e. the results we want are:

Date window where any part of record falls within window

Date window where any part of record falls within window

Given this requirement, lots of people still implement the above logic, causing any records which cross the window’s boundaries to be incorrectly missed off.
Those who do spot this issue often also make a mistake by overcomplicating the logic, providing rules for each scenario (and often missing off the last scenario illustrated below):

select *
from myTable
where 
--whole record in range
(
	myStartDate between @StartDate and @EndDate
	and myEndDate between @StartDate and @EndDate
)
--record starts in range
or
(
	myStartDate between @StartDate and @EndDate
)
--record ends in range
or
(
	myEndDate between @StartDate and @EndDate
)
--record crosses entire range
or
(
	myStartDate <= @StartDate 
	and myEndDate >= @EndDate
)

Preferred Solution

However, this can be drastically simplified, covering all of the above scenarios with a lot less code, and a lot more efficiently:

select *
from myTable
where myStartDate <= @EndDate --choice of < vs <= depending on requirement
and myEndDate >= @StartDate  --choice of > vs >= depending on requirement

To illustrate why this works, and why we use an AND rather than an OR, take a look at the previous image once we add lines showing where the condition’s met (green) or not met (red); here you can see those records where both conditions are met are shown as green, and where one condition’s not met as red (there is no scenario where both conditions are not met, since that would require a record which finished before it started).

Records with any part in date window with lines showing condition results

Records with any part in date window with lines showing condition results

Bonus Info

NB: In the above code’s comments I mention that you can use < or vs >=).
I’d also closed my datetime window with the inclusive-last value.
In many scenarios I’d use an inclusive start date and an exclusive end date; such as is demoed below:

declare @StartDate datetime = '2016-01-01 00:00' --inclusive
, @EndDate datetime = '2017-01-01 00:00'  --exclusive

select *
from myTable
where myStartDate < @EndDate 
and myEndDate >= @StartDate  

The advantage here is that we don’t need to think of how many days there are in a given month, or how many microseconds we can have before we see rounding errors; we just work with whole numbers. It also means that the end date of our previous window can be the start date of our next window without risking issues caused by the windows overlapping (though note: we may still get results which legitimately fall into both windows).
That said; always think about these decisions; as you need to understand what requirement you need to meet, and how your code will meet that requirement for any given data.

Blog at WordPress.com.