This post is to cover an explanation I have to repeat quite often when demoing code, or when suggesting improvements to others’ logic.
A lot of the time we’re asked to filter for results falling within a date window and you’ll see code implemented such as this:
(I’ve used SQL in this example; but the concept illustrated here is the same regardless of implementation language).
declare @StartDate datetime = '2016-01-01 00:00' , @EndDate datetime = '2016-12-31 23:59:59.997' --997 instead of 999 due to sql date quirk: see http://stackoverflow.com/a/3584913/361842 select * from myTable where myStartDate >= @StartDate and myEndDate <= @EndDate --or select * from myTable where myStartDate between @StartDate and @EndDate and myEndDate between @StartDate and @EndDate
That’s fine… sometimes.
The logic above will return any results where the entire result’s duration falls within the window’s range; i.e. the record starts after the window opens and completes before it’s closed.
This is best illustrated in the below graphic.
However, a lot of the time the requirement is that any part of the record falls into our window. i.e. the results we want are:
Given this requirement, lots of people still implement the above logic, causing any records which cross the window’s boundaries to be incorrectly missed off.
Those who do spot this issue often also make a mistake by overcomplicating the logic, providing rules for each scenario (and often missing off the last scenario illustrated below):
select * from myTable where --whole record in range ( myStartDate between @StartDate and @EndDate and myEndDate between @StartDate and @EndDate ) --record starts in range or ( myStartDate between @StartDate and @EndDate ) --record ends in range or ( myEndDate between @StartDate and @EndDate ) --record crosses entire range or ( myStartDate <= @StartDate and myEndDate >= @EndDate )
However, this can be drastically simplified, covering all of the above scenarios with a lot less code, and a lot more efficiently:
select * from myTable where myStartDate <= @EndDate --choice of < vs <= depending on requirement and myEndDate >= @StartDate --choice of > vs >= depending on requirement
To illustrate why this works, and why we use an AND rather than an OR, take a look at the previous image once we add lines showing where the condition’s met (green) or not met (red); here you can see those records where both conditions are met are shown as green, and where one condition’s not met as red (there is no scenario where both conditions are not met, since that would require a record which finished before it started).
NB: In the above code’s comments I mention that you can use < or vs >=).
I’d also closed my datetime window with the inclusive-last value.
In many scenarios I’d use an inclusive start date and an exclusive end date; such as is demoed below:
declare @StartDate datetime = '2016-01-01 00:00' --inclusive , @EndDate datetime = '2017-01-01 00:00' --exclusive select * from myTable where myStartDate < @EndDate and myEndDate >= @StartDate
The advantage here is that we don’t need to think of how many days there are in a given month, or how many microseconds we can have before we see rounding errors; we just work with whole numbers. It also means that the end date of our previous window can be the start date of our next window without risking issues caused by the windows overlapping (though note: we may still get results which legitimately fall into both windows).
That said; always think about these decisions; as you need to understand what requirement you need to meet, and how your code will meet that requirement for any given data.