﻿ ﻿Date and Time Functions - The Essential Guide to SAS Dates and Times, Second Edition (2014)

# The Essential Guide to SAS Dates and Times, Second Edition (2014)

### Chapter 5. Date and Time Functions

SAS has many functions to manipulate dates, times, and datetime values. The functions can be categorized according to what they do. You can obtain the current date, time, or datetime (as specified by the computer's clock). You can also easily extract pieces of dates, times, or datetimes as numerical values from their corresponding SAS values, or you can assemble SAS date, time, and datetime values from SAS variables or constants. Another set of functions operates with intervals such as weeks or months.

5.1 Current Date and Time Functions

Current date and time functions have no arguments and return SAS values as noted in the following table. The values are obtained from the operating system's clock.

Table 5.1: Current Date and Time Functions

 Date and Time Function Description DATE(), TODAY() These functions are identical, and both return the current date as a SAS date value. TIME() This returns the current time as a SAS time value. DATETIME() This returns the current date and time as a SAS datetime value.

5.2 Extracting Pieces from SAS Date, Time, and Datetime Values

The extraction functions all use a single argument (represented by arg in the following table), which represents a SAS date, time, or datetime value. This can be either a SAS variable name or the appropriate constant. If two-digit year values are used, the result will be subject to the YEARCUTOFF= option value in effect. They all return a numeric value as the result. The following tables are separated into functions that use dates as an argument, those that use datetimes as the argument, and those that use times as an argument. Each table gives examples of how to apply each function, along with relevant comments for each example.

Table 5.2: Functions Returning a Date Component and Requiring a SAS Date Value as an Argument

 Function Name Explanation Example DAY(arg) Extracts the number of the day of the month from a SAS date value. DAY("14OCT2015"d) = 14 JULDATE(arg) Extracts the Julian date from a SAS date value. It will return a four- or five-digit value with a one- or two-digit year, if the year portion of the date falls within the 100-year span defined by the YEARCUTOFF= option. If you want to ensure four-digit year values, you should use the JULDATE7() function. JULDATE("09MAY2004"d) = 4130 (result is returned as a numeric value, so there are no leading zeros) JULDATE("09MAY2014"d)=14129 (note difference caused by 2014 versus 2004, above) JULDATE("09MAY1890"d) = 1890129 JULDATE7(arg) Extracts the Julian date with a four-digit year from a SAS date value. This always returns a seven-digit number, regardless of the year. JULDATE7("09MAY2004"d) = 2004130 JULDATE7("09MAY2014"d) = 2014129 JULDATE7("09MAY1890"d) = 1890129 MONTH(arg) Extracts the numerical month from a SAS date value. MONTH("22AUG2015"d) = 8 QTR(arg) Extracts the quarter of the year from a SAS date value. QTR("8JAN2013"d) = 1 WEEK(arg) Extracts the week number from a SAS date value, where Sunday is the first day of the week, which is the "U" algorithm. This function has been augmented in SAS versions 9.1.3 and above by the WEEK(arg,descriptor) function described immediately below. WEEK("02JAN2005"d) = 1 (Version 9.1.3 and up) WEEK(arg,descriptor) Extracts the week number from a SAS date value. descriptor can be "U," "V," or "W" (case-insensitive), and it refers to the algorithm used to calculate the first week of the year. The U algorithm calculates weeks based on Sunday being the first day of the week. WEEK("02JAN2005"d,"U") = 1. January 2, 2005 was a Sunday, so the first week of the year has started. The V algorithm calculates weeks to the ISO standard. Monday is the first day of the week, and the first week of the year is defined as the one that contains both January 4 and the first Thursday of the year. WEEK("02JAN2005"d,"V") = 53. This week is defined as being the 53rd week in 2004, because it doesn't contain the first Monday or Thursday of the year. The W algorithm calculates weeks based on Monday being the first day of the week without restriction. WEEK("02JAN2005"d,"W") = 0. The year 2005 has started, but weeks are calculated with Monday as the first day of the week. Therefore, the first week of 2005 doesn't start until January 3, 2005, so this is week 0 of 2005. WEEKDAY(arg) Extracts the number of the day of the week, where Sunday=1, Monday=2, and so on from a SAS date value. WEEKDAY("14APR2011"d) = 4 (Wednesday, April 14, 2011) YEAR(arg) Extracts the year from a SAS date value. If you use a date constant (as in the example) and not a SAS date value, it is important to remember that the YEARCUTOFF= option affects two-digit years. If OPTIONS YEARCUTOFF=1920; YEAR("19JUL10"d) = 2010; YEAR("19JUL1910"d) = 1910

Table 5.3: Functions Returning a Datetime Component and Requiring a SAS Datetime Value as an Argument

 Function Name Explanation Example DATEPART(arg) Extracts the date from a SAS datetime value as a SAS date value. DATEPART('21MAR2012:17:07:00'dt) = 19073 (March 21, 2012) TIMEPART(arg) Extracts the time portion from a SAS datetime value as a SAS time value. TIMEPART("06SEP2012:13:36:33"dt) = 48993 (1:36:33 PM)

Table 5.4: Functions Returning a Time Component and Requiring a SAS Time Value as an Argument

 Function Name Explanation Example HOUR(arg) Extracts the hour from a SAS time value. HOUR("7:35:00"t) =7 MINUTE(arg) Extracts the minutes from a SAS time value. MINUTE("12:17:43 PM"t) = 17 SECOND(arg) Extracts the seconds from a SAS time value. SECOND("2:17:43"t) = 43

5.3 Creating Dates, Times, and Datetimes from Numbers or Other Information

5.3.1 Introduction

This series of functions will create SAS date, time, and datetime values from numerical variables or constants. While informats take complete date, time, and datetime references and translate them to their corresponding SAS value, these functions will create a SAS value from discrete pieces such as month, day, and year.

5.3.2 List of Functions and Their Descriptions

DATEJUL(Julian-date);

DATEJUL(Julian-date); creates a SAS date value from a numeric value representing a Julian date. Julian-date must be of the type yy(yy)ddd, where yy(yy) is two or four digits representing the year, and ddd must be a number from 1 to 365 (366 if a leap year). If you use two digits for the year, the YEARCUTOFF= option will be used to determine the century. The following table gives examples of how to apply this function:

 Sample Function Call SAS Date Value Formatted with MMDDYY10. Format Comments OPTIONS YEARCUTOFF=1920; DATEJUL(21286) –13959 10/13/1921 With the YEARCUTOFF value of 1920, the 21 is interpreted as 1921. OPTIONS YEARCUTOFF=2000; DATEJUL(21286) 22566 10/13/2021 If YEARCUTOFF is 2000, the 21 is interpreted as 2021. DATEJUL(2014174) 19897 06/23/2014 DATEJUL(1989005) 10597 01/05/1989 DATEJUL(00368) . . 368 is not a valid value for a Julian day, so the function returns a missing value.

DHMS(date,hour,minute,second);

DHMS(date,hour,minute,second); creates a SAS datetime value. All four arguments are required. date is a SAS date value, which can be either a numeric value or a date constant. If you use a two-digit year in a date constant, the date will be translated according to the YEARCUTOFF= option. hour, minute, and second are all numeric variables and/or constants. Hour, minute, and second are not restricted to their clock times. Therefore, hour can be greater than 24, while minute and second can be greater than 60. The following table gives examples of how to apply this function:

 Sample Function Call DatetimeValue Formatted with DATETIME19. Format Comments DHMS("08JUN2015"d,15,24,0) 1749396240 08JUN2015:15:24:00 DHMS("02FEB2014"d,11,54,15) 1706961255 02FEB2014:11:54:15 DHMS("30JUN2012"d,8,7,93) 1656662913 30JUN1992:08:08:33 The value 93 is just an argument. The function ultimately returns the datetime value in seconds and the formatted value converts the result. Therefore, 93 seconds becomes 1 minute, 33 seconds, which adds 1 to the minute value of 7, and reduces the seconds to 33.

Example 5.1: Using DHMS() When You Already Have a SAS Date and Time

If you need to create a datetime from a SAS date and a SAS time value, you do not have to use the HOUR(), MINUTE(), and SECOND() functions to extract those components from the time value. Remember that SAS keeps track of time in seconds since midnight and that the secondsargument in the DHMS function can be greater than 59. Therefore, if you supply the SAS time value as the seconds argument and set hour and minute to zeros, that will work just fine. The following example demonstrates:

1 DATA ex5_1;

2 INPUT row \$ sasdate :date9. sastime :time8.;

3 result = DHMS(sasdate,0,0,sastime);

4 fmt_result = result;

5 format sasdate date9. sastime timeampm.;

6 DATALINES;

7 A 16oct2015 17:30

8 B 08jun2016 11:00

9 C 14apr2015 00:00

10 ;;;

11 RUN;

12 PROC PRINT DATA=ex5_1 NOOBS LABEL SPLIT='\';

13 ID row;

14 VAR sasdate sastime result fmt_result;

15 FORMAT fmt_result datetime19. sasdate sastime;

16 LABEL row='Row'

17 result = '~{text_align=c}SAS Datetime Value\Calculated
Using\DHMS(sasdate,0,0,sastime)'

18 fmt_result = "Result Formatted with\DATETIME19. Format";

19 RUN;

The Result

 Row sasdate sastime SAS Datetime ValueCalculated UsingDHMS(sasdate,0,0,sastime) Result Formatted withDATETIME19. Format A 20377 63000 1760635800 16OCT2015:17:30:00 B 20613 39600 1781002800 08JUN2016:11:00:00 C 20192 0 1744588800 14APR2015:00:00:00

The calculation of the datetime value is done in line 3 of the above program. The formats have been removed from the sasdate and sastime variables in the table for the purpose of illustrating the actual values that are being sent to the function. Row C above has been included to demonstrate that any valid SAS time value will work, even 0 seconds. What good is that? If you need to create a SAS datetime value and you only have a date without a time value, then you can use DHMS(sasdate,0,0,0) to convert your SAS date into a SAS datetime.

HMS(hour,minute,second);

HMS(hour,minute,second); creates a SAS time value. hour, minute, and second are all numeric variables and/or constants. None of the parameters are restricted to their clock times. Therefore, hour can be greater than 24, while minute and second can be greater than 60. All three arguments must be present or you will get a missing value as the result. The following table provides examples:

 Sample Function Call Time Value Formatted with TIME. Format Formatted with TIMEAMPM. Format Comments HMS(18,0,9) 64809 18:00:09 6:00:09 PM HMS(7,45,80) 27980 7:46:20 7:46:20 AM The time is not displayed as "7:45:80" because the value is returned as the total number of seconds, and the format is applied to that. Neither the TIME. nor the TIMEAMPM. formats display minute or second values greater than 59. HMS(15,03,35.56) 54215.56 15:03:36 3:03:36 PM HMS(8,17,33) 29853 8:17:33 8:17:33 AM HMS(21,14,28) 76468 21:14:28 9:14:28 PM

MDY(month,day,year);

MDY(month,day,year); creates a SAS date value from the arguments. All three arguments are required. month, day, and year are all numeric variables or constants. If year is two digits, the century will be determined by the YEARCUTOFF= option. If a value given for any of the arguments is not valid or missing, such as MDY(2,31,2014) (February 31, 2014), the function will return a missing value and give you an "invalid argument to function" message in the log. The following table shows examples:

 Sample Function Call SAS Date Value Formatted with WEEKDATE. Format MDY(9,3,1876) –30434 Sunday, September 3, 1876 MDY(12,14,15) 20436 Monday, December 14, 2015 MDY(3,26,1915) –16352 Friday, March 26, 1915 MDY(5,22,2014) 19865 Thursday, May 22, 2014

NWKDOM(n, weekday, month, year);

NWKDOM(n, weekday, month, year); creates a SAS date value for a given weekday in a given week number from a given month and year. All of the arguments are numeric and can be represented by constants or numeric variables. If any of the arguments are missing, or not valid, then the function will return a missing value. n ranges from 1 (first) to 5 (last). Sometimes using the value of 5 will give the same result as 4, if the fourth week of the month is also the last week of the month. weekday ranges from 1 to 7, where 1 represents Sunday, and 7 represents Saturday.month can range from 1 (January) to 12 (December). year represents the year and is subject to the YEARCUTOFF= option if you use two digits for the year.

You can use this function to find dates that are expressed as "The first Saturday in May," or the "last Tuesday in August," easily. The following table provides examples of how this function works.

 Date Description Sample Function Call SAS Date Value Date Formatted using WEEKDATE. First Tuesday in November. NWKDOM(1,3,11,2014) 20031 Tuesday, November 4, 2014 Second Tuesday in December. NWKDOM(2,3,12,2014) 20066 Tuesday, December 9, 2014 Third Wednesday in April. NWKDOM(3,4,4,2014) 19829 Wednesday, April 16, 2014 Fourth Sunday in May. NWKDOM(4,1,5,2014) 19868 Sunday, May 25, 2014 Last Sunday in May. NWKDOM(5,1,5,2014) 19868 Sunday, May 25, 2014 First Sunday in June. NWKDOM(1,1,6,2018) 21338 Sunday, June 3, 2018 Second Saturday in December. NWKDOM(2,7,12,2018) 21526 Saturday, December 8, 2018 Third Saturday in May. NWKDOM(3,7,5,2018) 21323 Saturday, May 19, 2018 Fourth Monday in October. NWKDOM(4,2,10,2018) 21479 Monday, October 22, 2018 Last Monday in October. NWKDOM(5,2,10,2018) 21486 Monday, October 29, 2018

The bolded and italicized text above shows that the last week of the month and the fourth week of the month may or may not produce the same date.

YYQ(year,qtr);

YYQ(year,qtr); creates a SAS date value from the arguments. Both arguments are required. year is a numeric variable or constant representing the year, and qtr is a numeric variable or constant between 1 and 4, representing the quarter of the year. If year is two digits, the century will be determined by the YEARCUTOFF= system option. This function returns the date of the first day of the quarter in the given year. The following table provides examples of how this function works.

 Sample Function Call SAS Date Value Formatted with mmddyy10. Format Comment YYQ(2015,1) 20089 01/01/2015 YYQ(99,3) 14426 07/01/1999 YYQ(25,2) –12693 04/01/1925 When YEARCUTOFF=1920, 25 translates to 1925 because the range runs from 1920 through 2019. YYQ(25,2) 23832 04/01/2025 In SAS 9.4, the YEARCUTOFF= option default is 1926, so the range is now 1926 through 2025, and 25 is now translated as 2025. YYQ(2015,2) 20179 07/01/2015

5.4 Calculating Elapsed Time, and the HOLIDAY() Function

Because SAS uses simple math as the basis for dates and times, you might think that calculating elapsed time or projecting into the future would be easy. It should be a matter of simple addition or subtraction. However, SAS provides several functions that deal with calculating elapsed time, and for the most part the function is going to be more accurate than simple math. For example, one of the mathematical equations for calculating age is (current date–date of birth)/365.25. This approximation uses the .25 to account for leap years, but it fails to take into account the exception for years that are divisible by 100 but not by 400. While you might rarely need such accuracy when calculating elapsed years, the efficiency of a SAS function might speed things perceptibly when working with big data.

5.4.1 Calculating Elapsed Time with DATDIF() and YRDIF()

These two functions were originally developed for use with securities calculations for specific financial instruments.

DATDIF(start,end,basis);

DATDIF(start,end,basis); calculates the number of days between two dates. start is the starting date, which can be a date constant, a numeric variable, or a SAS expression. end is the ending date, also a date constant, a numeric variable, or a SAS expression. basis is a character constant or variable that tells SAS how to calculate the difference. The start and end arguments are required, while basis is optional. basis has two possible values. Note that if you use a character constant for basis, remember that it will need to be enclosed in quotation marks, or you will get an error.

1. '30/360', which sets each month to 30 days, and the year to 360 days, regardless of how many days are in each month or year in the span between the two dates. If a day is at the end of a month (for example, February 28/29 or March 31), it will be considered as the 30th of the month.

2. 'ACT/ACT', which uses the actual number of days in each month and year in the span between the two dates. This is the default, and it is identical to subtracting start from end.

 Sample Function Call Result Comment DATDIF('19JUL2015'd,'19JUL2016'd,'30/360') 360 basis is "30/360," indicating a year of 360 days by definition. DATDIF('19JUL2015'd,'19JUL2016'd,'ACT/ACT') 366 2016 is a leap year, so 366 days have elapsed between July 19, 2015, and July 19, 2016.

YRDIF(start,end,basis);

YRDIF(start,end,basis); calculates the number of years between two dates. It is almost always more accurate than using mathematical approximation, depending on the basis used and the desired result. start is the starting date, which can be a date constant, a numeric variable, or a SAS expression. end is the ending date, also a date constant, a numeric variable, or a SAS expression. basis is a character constant or variable that tells SAS how to calculate the difference. basis has five possible values, as compared with the two possibilities in the DATDIF() function:

1. '30/360', which sets each month to 30 days, and the year to 360 days, regardless of how many days are in each month or year in the span between the two dates. If a day is at the end of a month (for example, February 28/29 or March 31), it will be considered as the 30th of the month.

2. 'ACT/ACT', which uses the actual number of days in each month and year in the span between the two dates. This was the default basis through SAS version 9.2. You can use the alias "Actual," not 'ACT.'

3. 'ACT/360', which uses the actual number of days between the two dates to calculate the number of years, but it uses a 360-day year, regardless of how many days are in each year, so the result is number of days divided by 360.

4. 'ACT/365', which uses the actual number of days between the two dates to calculate the number of years, but uses a 365-day year, regardless of how many days are in each year, so the result is number of days divided by 365.

5. 'AGE', which is used to calculate a person's age. This is available starting with SAS version 9.3, and as of that release, 'AGE' is now the default.

 Sample Function Call Resulting value Comment YRDIF('07AUG1967'd,'24MAY2014'd,'30/360') 46.797222 A month is defined as having 30 days, and the year is 360 days long. YRDIF('07AUG1967'd,'24MAY2014'd,'ACT/ACT') 46.794521 Actual days in a month and actual days in a year are used. YRDIF('07AUG1967'd,'24MAY2014'd,'ACT/360') 47.477778 Actual number of days in a month are used; year is defined as having 360 days, YRDIF('07AUG1967'd,'24MAY2014'd,'ACT/365') 46.827397 Actual number of days in a month are used; year is defined as having 365 days

As you can see, all four results are different, and this is due to the way they were calculated. Prior to version 9.3, this function was often used to calculate ages, but even the 'ACT/ACT' basis doesn't calculate ages precisely. The 'ACT/ACT' basis averages leap year days across the four years. Now let's examine the YRDIF function when the 'ACT/ACT' basis and the 'AGE' basis are used.

 Sample Function Call Resulting value YRDIF('07AUG1967'd,'24MAY2014'd,'ACT/ACT') 46.794521 YRDIF('07AUG1967'd,'24MAY2014'd,'AGE') 46.794521

This looks as if the YRDIF() function will yield the same result for both the 'AGE' basis and the 'ACT/ACT' basis. Where is the difference? Let's look at another series of dates.

 Sample Function Call Resulting value A YRDIF('07AUG1968'd,'24MAY2014'd,'ACT/ACT') 45.79342 YRDIF('07AUG1968'd,'24MAY2014'd,'AGE') 45.794521 B YRDIF('07AUG1969'd,'24MAY2014'd,'ACT/ACT') 44.794521 YRDIF('07AUG1969'd,'24MAY2014'd,'AGE') 44.794521 C YRDIF('07AUG1971'd,'24MAY2014'd,'ACT/ACT') 42.794521 YRDIF('07AUG1971'd,'24MAY2014'd,'AGE') 42.794521 D YRDIF('07AUG1972'd,'24MAY2014'd,'ACT/ACT') 41.79342 YRDIF('07AUG1972'd,'24MAY2014'd,'AGE') 41.794521

In groupings A and D, you see that the basis makes a difference, while in groupings B and C, the 'ACT/ACT' and 'AGE' return identical results. What makes groupings A and D so different? These are leap years, and the leap day is accounted for as a whole day for that given year, as opposed to the averaging of a quarter day per year performed by the 'ACT/ACT' algorithm.

HOLIDAY(holiday,year); provides the date of selected holidays in any given year as a SAS date value. This function is valid for U.S. and Canada holidays only. holiday can be a character string enclosed in quotation marks or a character variable containing one of the arguments listed below. The valid values of holiday are listed in the table below. Note: If you use a character variable instead of a string, your variable should be at least 18 characters long to accommodate the longest argument.

 Argument Used in Function Holiday Observed Date BOXING Boxing Day December 26 CANADA Canada Day July 1 CANADAOBSERVED Canada Day observed July 1, or July 2 if July 1 is a Sunday CHRISTMAS Christmas December 25 COLUMBUS Columbus Day 2nd Monday in October EASTER Easter Sunday date varies FATHERS Father's Day 3rd Sunday in June HALLOWEEN Halloween October 31 LABOR Labor Day 1st Monday in September MLK Martin Luther King, Jr. 's birthday 3rd Monday in January beginning in 1986 MEMORIAL Memorial Day last Monday in May (since 1971) MOTHERS Mother's Day 2nd Sunday in May NEWYEAR New Year's Day January 1 THANKSGIVING U.S. Thanksgiving Day 4th Thursday in November THANKSGIVINGCANADA Canadian Thanksgiving Day 2nd Monday in October USINDEPENDENCE U.S. Independence Day July 4 USPRESIDENTS Abraham Lincoln's and George Washington's birthdays observed 3rd Monday in February (since 1971) VALENTINES Valentine's Day February 14 VETERANS Veterans Day November 11 VETERANSUSG Veterans Day (U.S. government-observed) U.S. government-observed date for Monday–Friday schedule VETERANSUSPS Veterans Day (U.S. post office observed) U.S. government-observed date for Monday–Saturday schedule (U.S. Post Office) VICTORIA Victoria Day Monday on or preceding May 24
 Sample Function Call SAS Date Value Date Formatted using WEEKDATE. HOLIDAY("EASTER",2014) 19833 Sunday, April 20, 2014 HOLIDAY("EASTER",2019) 21660 Sunday, April 21, 2019 HOLIDAY("EASTER",2026) 24201 Sunday, April 5, 2026 HOLIDAY("EASTER",2039) 28954 Sunday, April 10, 2039 HOLIDAY("THANKSGIVINGCANADA",2014) 20009 Monday, October 13, 2014 HOLIDAY("THANKSGIVINGCANADA",2019) 21836 Monday, October 14, 2019 HOLIDAY("THANKSGIVINGCANADA",2026) 24391 Monday, October 12, 2026 HOLIDAY("THANKSGIVINGCANADA",2039) 29137 Monday, October 10, 2039 HOLIDAY("USINDEPENDENCE",2014) 19908 Friday, July 4, 2014 HOLIDAY("USINDEPENDENCE",2019) 21734 Thursday, July 4, 2019 HOLIDAY("USINDEPENDENCE",2026) 24291 Saturday, July 4, 2026 HOLIDAY("USINDEPENDENCE",2039) 29039 Monday, July 4, 2039

5.5 The Basics of SAS Intervals

Some of the SAS functions described in section 5.4, such as DATDIF, are very good at calculating the exact amount of elapsed time between two SAS dates, and as demonstrated in some of the above examples, you can see the difference between the function and simple math. There aren't functions to project future dates, because it would seem simple enough: you just add a number of days, hours, or minutes, and you come up with an answer.

However, we frequently need to refer to units of time that are not uniform, such as months, which can be 28, 29, 30, or 31 days long. SAS provides functions to calculate intervals because, in many cases, simple math is still only an approximation. SAS has several standard interval definitions that are used with dates, times, and datetimes that represent many of the normal periods of time that we refer to, such as weeks or quarters. You are not restricted to the intervals given in the standard definitions, because you also have the ability to easily modify them. You can use multipliers and/or a shift index in conjunction with the standard intervals. Multipliers enable you to define intervals that are multiples of a standard interval and are not already defined, such as a decade or a century. A shift index enables you to define intervals that do not correspond with the starting values used by an interval (standard OR with a multiplier), such as a fiscal year that begins in July instead of January, or, to give you an example with a multiplier, a decade that starts in years ending in the number '5,' instead of years ending in 0. Section 5.6 will discuss the concepts of multipliers and the shift index in detail. If you need intervals that cannot be described by using multipliers and/or a shift index with the standard SAS intervals, SAS has the capacity for you to define your own intervals, and this is covered in depth in Section 5.7.

For the remainder of this book, when the term "interval" is used in a function definition, it means a SAS interval name, along with any multiplier and/or shift index unless explicitly specified otherwise. We will begin our discussion of intervals by providing a list of all the standard interval definitions and the periods that they describe in Table 5.5.

Table 5.5 SAS Interval Definitions Used with Dates, Times, and Datetimes

 Category Interval Name Definition Default Starting Point Date DAY Daily intervals Each day WEEK Weekly intervals of seven days Each Sunday WEEKDAYdaysW Daily intervals with Friday-Saturday-Sunday counted as the same day (five-day work week with a Saturday-Sunday weekend). days identifies the individual numbers of the weekend day(s) by number (1=Sunday … 7=Saturday). By default, days="17," so the default interval is WEEKDAY17W. Each day TENDAY Ten-day intervals (a U.S. automobile industry convention) 1st, 11th, and 21st of each month SEMIMONTH Half-month intervals First and sixteenth of each month MONTH Monthly intervals First of each month QTR Quarterly (three-month) intervals 1-Jan1-Apr1-Jul1-Oct SEMIYEAR Semi-annual (six-month) intervals 1-Jan1 Jul YEAR Yearly intervals 1-Jan Datetime DTDAY Daily intervals Each day DTWEEK Weekly intervals of seven days Each Sunday DTWEEKDAYdaysW Daily intervals with Friday-Saturday-Sunday counted as the same day (five-day work week with a Saturday-Sunday weekend). days identifies the individual weekend days by number (1=Sunday … 7=Saturday). By default, days="17," so the default interval is DTWEEKDAY17W. Each day DTTENDAY Ten-day intervals (a U.S. automobile industry convention) 1st, 11th, and 21st of each month DTSEMIMONTH Half-month intervals First and sixteenth of each month DTMONTH Monthly intervals First of each month DTQTR Quarterly (three-month) intervals 1-Jan1-Apr1-Jul1-Oct DTSEMIYEAR Semiannual (six-month) intervals 1- Jan1 Jul DTYEAR Yearly intervals 1-Jan DTSECOND Second intervals Seconds DTMINUTE Minute intervals Minutes DTHOUR Hour intervals Hours Time SECOND Second intervals Seconds MINUTE Minute intervals Minutes HOUR Hourly intervals Hours

5.5.1 The Interval Calculation Functions: INTCK() and INTNX()

The interval calculation functions INTCK() and INTNX() use SAS interval definitions. INTCK() counts the number of intervals between two given dates, times, or datetimes. INTNX() calculates the date, time, or datetime that results after a given number of intervals have been added to an initial date, time, or datetime value.

The syntax for the INTCK function is as follows.

INTCK(interval, start-of-period,end-of-period,method);

interval is the SAS designation for a period of time, and can be a character literal or character variable that corresponds to one of the defined time intervals (see Table 5.5). start-of-period is the beginning date, time, or datetime value, while end-of-period is the ending one. Both start-of-period and end-of-period can be anything that evaluates to a valid SAS date, time, or datetime value.

As of SAS version 9, method determines how SAS is going to count the intervals. There are two possible values: CONTINUOUS (or C or CONT) and DISCRETE or (D or DISC). The default is DISCRETE, and this has been how intervals have traditionally been calculated in SAS. When method is DISCRETE, the INTCK() function is counting the number of times that the period interval begins between start-of-period and end-of-period, inclusive. It does not count the number of complete intervals between start-of-period and end-of-period. This also means that the count does not begin with start-of-period, but at the beginning of the first interval after that. The following example demonstrates how INTCK() counts using the DISCRETE method. For example, take the dates Saturday, December 31, 2011, and Sunday, January 1, 2012.

Example 5.2: How the INTCK() Function Counts by Default (method Is DISCRETE)

 Function Call Result INTCK('DAY','31dec2011'd,'01jan2012'd); 1 INTCK('WEEK','31dec2011'd,'01jan2012'd) 1 INTCK('MONTH','31dec2011'd,'01jan2012'd) 1 INTCK('YEAR','31dec2011'd,'01jan2012'd) 1

All of the intervals are equal to 1 even though only one day has passed! January 1, 2012, is the start of day, week, month, and year intervals. The starting day occurred on December 31, and the ending day began on January 1, so it is obvious that the result for the DAY interval should be 1, because the DAY interval boundary was crossed on January 1. However, when it comes to the WEEK interval, Sunday is the beginning of the week. Therefore, the week containing December 31st started on Sunday, December 25. Sunday, January 1, is the beginning of the next week, so you cross the WEEK interval boundary at January 1. This means that one WEEK interval has elapsed between the start of the week in December and the start of the week in January, so that causes the result for WEEK to be 1. Similarly, the month containing December 31 started on December 1, 2011, and the month for January 1, 2012, started on January 1. You are crossing the MONTH interval boundary on January 1, so one MONTH interval has elapsed between the start of the intervals for the two dates, and therefore, that result is 1 as well. Finally, the year for December 31, 2011, started on January 1 of 2011, while the year for January 1, 2012, starts on the same date. You cross the YEAR interval boundary at January 1, which causes the YEAR interval result to be 1. All of the values are equal to 1 because the INTCK() function is counting the DAY, WEEK, MONTH, and YEAR interval boundary, which occurs at Sunday, January 1, 2012, and not because of the number of days, weeks, months, or years that have passed.

Now let's look at the corresponding results when you use CONTINUOUS for the method.

Example 5.3: How the INTCK() Function Counts When method Is CONTINUOUS

 Function Call Result INTCK('DAY','31dec2011'd,'01jan2012'd,'C'); 1 INTCK('WEEK','31dec2011'd,'01jan2012'd,'C') 0 INTCK('MONTH','31dec2011'd,'01jan2012'd,'C' 0 INTCK('YEAR','31dec2011'd,'01jan2012'd,'C') 0

One day has passed, as in the DISCRETE example. However, this looks more like what you might expect when you ask how many weeks, months, and years have passed between December 31, 2011, and January 1, 2012. Only one day has passed, not an entire week, month, or year. The CONTINUOUS method calculates continuous time, and it uses the calendar definition of a week, month, or year, starting on the date supplied as the first argument. Let's change the ending dates appropriately to reproduce the results in Example 5.2.

Example 5.4: How the INTCK() Function Counts When method Is CONTINUOUS

 Function Call Result Days Elapsed INTCK('DAY','31dec2011'd,'01jan2012'd,'C'); 1 1 INTCK('WEEK','31dec2011'd,'07jan2012'd,'C') 1 7 INTCK('MONTH','31dec2011'd,'31jan2012'd,'C' 1 31 INTCK('YEAR','31dec2011'd,'31dec2012'd,'C') 1 366

Now the picture should be a little more clear that using CONTINUOUS causes the INTCK() function to look at continuous elapsed time, not interval boundaries. To firm up this demonstration of the CONTINUOUS method for the INTCK() function, let's look at how the definition of the MONTH interval can change.

Example 5.5: How the INTCK() Function Counts When method Is CONTINUOUS

 Function Call Result Days Elapsed A INTCK('MONTH','01jan2012'd,'01feb2012'd,'C'); 1 31 B INTCK('MONTH','01jan2012'd,'31jan2012'd,'C'); 0 30 C INTCK('MONTH','01apr2012'd,'01may2012'd,'C'); 1 30 D INTCK('MONTH','28feb2012'd,'28mar2012'd,'C'); 1 29 E INTCK('MONTH','29feb2012'd,'28mar2012'd,'C'); 0 28 F INTCK('MONTH','28feb2013'd,'28mar2013'd,'C'); 1 28

Rows A and B look normal: 31 days is a month, while 30 days is not a month. However, row C states that 30 days is a month. This is only true for the months of April, June, September, and November. Finally, rows D through F demonstrate that, when you are using the CONTINUOUS method, the MONTH interval changes based on the month and the year. In leap years, a period of 29 days across February is considered a month (row D). However, a period of 28 days is not considered a month in leap years (row E). On the other hand, if it is not a leap year, 28 days in February is a month.

The CONTINUOUS method is useful for calculating anniversaries and milestones tied to dates, times, and datetimes. Although this method seems much more intuitive than the traditional (and still default) way that SAS handles intervals, you will have to be very cautious in choosing which method you use and when. It is not recommended that you replace all of your existing code to use this method instead of the default, because the two methods differ in their definition of interval boundaries. The CONTINUOUS method shifts the interval by the starting date provided. With the DISCRETE method, all values within an interval boundary are considered to be equivalent. That makes it possible to group observations occurring within an interval for analysis.

Since the CONTINUOUS method is more intuitive to the way we tend to look at periods of time, the remainder of this section on the INTCK() function will be devoted to the traditional DISCRETE method. Please remember that the DISCRETE method is the default, and in order to use the CONTINUOUS method, you will have to supply the proper method argument in the INTCK() function. All of the following examples rely on the default method, so there is no method argument used in the function calls.

To complete the picture of how the traditional use of the INTCK() function works, let's look at the effect that the ending date has on INTCK().

Example 5.6: How the INTCK() Function Counts

 Function Call Result INTCK('DAY','31dec2014'd,'06jan2015'd); 6 INTCK('WEEK','31dec2014'd,'06jan2015'd) 1 INTCK('MONTH','31dec2014'd,'06jan2015'd 1 INTCK('YEAR','31dec2014'd,'06jan2015'd) 1

Here you can see that although 6 days have elapsed, still only 1 week, month, and year have elapsed according to INTCK()! That is because the start of the week, month, and year interval for January 6, 2015, is still January 1, 2015, and that is what INTCK() is counting. Here are some more examples of the use of the INTCK() function with its default of the DISCRETE method.

Example 5.7: The INTCK Function: The Basics

 Function Call Result INTCK('DAY','15jun2013'd,'22jun2013'd); 7 INTCK('WEEK','01jan2016'd,'01jan2017'd); 53 INTCK('DTDAY','01oct1872:08:00:00'dt,'20dec1872:18:00:00'dt); 80 INTCK('MONTH','05mar2000'd,'01may2000'd); 2

Example 5.8: The INTCK Function: Counting Backward

 Function Call Result INTCK('YEAR','22dec2015'd,'16jul2010'd); -5

In this example, the from date is after the to date, and therefore, the answer is negative. Since INTCK() counts interval boundaries, the answer is –5 because it starts counting at the start of a year. The start of the year interval for December 22, 2015, is January 1, 2015, so it is not counted. January 1, 2014; January 1, 2013; January 1, 2012; January 1, 2011; and January 1, 2010, are the boundaries of the YEAR intervals that it is counting, which corresponds to the beginning dates of each year.

Example 5.9: The INTCK Function: Counting Weekdays

 Function Call Result A INTCK('WEEKDAY17W','12JAN2014'd,'14JAN2014'd); 2 B INTCK('WEEKDAY17W','13JAN2014'd,'14JAN2014'd); 1 C INTCK('WEEKDAY17W','17JAN2014'd,'18JAN2014'd); 0

If you are counting the number of work weekdays that have elapsed, you must be careful to remember that INTCK() counts interval boundaries and that the starting date is not counted in the answer. Row A above seems perfectly reasonable. You would expect that there would be two weekdays between Sunday, January 12, 2014, and Tuesday, January 14, 2014. Even though the starting boundary for WEEKDAY17W is Monday, the starting date is Sunday, January 12, 2014, so it counts Monday and Tuesday. However, you might think that there are two weekdays in the span Monday, January 13, 2014, to Tuesday, January 14, 2014. In row B, the INTCK() function is counting only 1 weekday, because it doesn't include Monday, January 13, 2014 (the start argument used in the function), when it counts the interval boundaries. Why is row C equal to zero then? First, Friday is the starting day, and the starting day is never included in the count. January 18, 2014, is a Saturday, and since we've defined the weekend days as Saturday and Sunday, those days are outside of any interval boundaries, so by definition, no interval boundaries have been passed.

Example 5.10 illustrates the difference between the three methods that have been discussed for calculating elapsed years using SAS. We will use the YRDIF() function using both the 'AGE' basis and 'ACT/ACT' basis. Since the INTCK() function also counts elapsed intervals, we will use it and show both the DISCRETE and CONTINUOUS methods, and compare those with mathematical estimation.

Example 5.10: The YRDIF() Function as Opposed to Mathematical Estimation and INTCK()

This uses the same dates, August 7, 1963, and May 8, 2014, for all five methods, but row A uses the YRDIF() function with 'ACT/ACT,' row B uses the same function, but with the 'AGE' basis. The two values are equivalent with the YRDIF() function, but as we have seen earlier, 'AGE' and 'ACT/ACT' only yield different results when the year being tested is a leap year. The mathematical approximation divides the number of days between the two dates by 365.25, and although the discrepancy is minute in this example, the difference is caused by the fact that the number of leap years in the period (11) is not evenly divisible by 4, rendering the value 365.25 an approximation. The INTCK function using the DISCRETE method counts interval boundaries from their beginning, so it is counting the number of January firsts between January 1, 1963, and January 1, 2015. In effect, unless you were born on January 1, using the DISCRETE method with INTCK() to calculate your age will make you old before your time! It is more appropriate in this type of scenario to use the CONTINUOUS method with the INTCK() function, which yields the answer 50 years old.

 Calculation method Result A Using YRDIF() with actual days in month, actual year 50.750685 B Using YRDIF() with AGE basis 50.750685 C Using ('08may2014'd - '07aug1963'd)/365.25 50.75154 D Using INTCK('YEAR','07aug1963'd,'08may2014'd,'DISCRETE') 51 E Using INTCK('YEAR','07aug1963'd,'08may2014'd,'CONTINUOUS') 50

Those are some examples of the use of the INTCK() function. Remember, as of SAS version 9, the INTCK() function has an additional argument that you can use to change the way that INTCK counts interval boundaries. However, any existing code using INTCK() does not need to be changed, because the default method is the 'DISCRETE' method, which is how INTCK() has always worked. Once again, it is not a good idea to change existing code without considering how INTCK is being used, because the methods differ significantly in their calculations.

The next of the interval functions that we will consider is the INTNX() function. This function takes a given SAS date, time, or datetime value and calculates a new value by incrementing by the given number of intervals. Where INTCK() calculates the number of intervals between any two date, time, or datetime values, INTNX() takes the start of the period and increments it by a number of intervals to give the end of the period. The syntax of the INTNX() function is as follows:

INTNX(interval,start-from,number-of-increments,alignment);

INTNX(interval,start-from,number-of-increments,alignment); is one of the SAS intervals defined in Table 5.5 and can be a character literal or character variable that evaluates to one of the defined intervals. start-from is the starting date, time, or datetime value, which can be a constant, numeric variable, or a SAS expression. number-of-increments is an integer constant or a numeric variable that indicates how many intervals to advance. If it is not an integer, only the integer portion of the value will be used. alignment sets the returned date, time, or datetime value according to one of four predefined settings. The function calculates the dates at the beginning of the interval period, and then the alignment argument adjusts the result. The values are: Beginning or B, Middle or M, End or E, and Same or S (Sameday is also acceptable as an alias). The default value for alignment is Beginning. The Sameday argument cannot be used with the DTQTR, DTSEMIYEAR, or DTYEAR intervals, and it has no effect on time values.

The next series of examples demonstrates various uses and effects of the INTNX() function. The first example is the default use of the function with each of the date, time, and datetime intervals, while the second shows what happens when you use a non-integer as the increment to the function. Example 5.14 illustrates the use of the alignment arguments to yield specific dates.

Example 5.11: The INTNX() Function with Default Alignment

Each of the examples below increments the date, time, or datetime value by 3 of the interval shown in bold italics. For dates and datetimes, the start date is the same: Thursday, November 6, 2014. The only difference is that datetime intervals (interval names starting with DT) return datetime values, not dates. The interval values for datetime values are calculated in seconds, not days. By default, the INTNX() function returns the beginning of the interval given. If you want to change this, use the alignment argument discussed above.

 Function Call Result Comment INTNX('DAY','06NOV2014'd,3) November 9, 2014 INTNX('DTDAY','06NOV2014:15:00:00'dt,3) 11/9/2014 12:00 AM The time returned is midnight of 11/09/2014, not 3 p.m. INTNX('WEEKDAY17W','06NOV2014'd,3) November 11, 2014 The returned date is the following Tuesday. Saturday (7) and Sunday (1) don't count since they are defined as the weekend days. INTNX('DTWEEKDAY17W','06NOV2014:15:00:00'dt,3) 11/11/2014 12:00 AM INTNX('WEEK','06NOV2014'd,3) November 23, 2014 The value returned is the Sunday of the week, not 21 calendar days. INTNX('DTWEEK','06NOV2014:15:00:00'dt,3) 11/23/2014 12:00 AM INTNX('TENDAY','06NOV2014'd,3) December 1, 2014 The value returned is the first of the month, 25 calendar days, not 30. This interval will always return either the 1st, 11th, or 21st of the month. INTNX('DTTENDAY','06NOV2014:15:00:00'dt,3) 12/1/2014 12:00 AM INTNX('SEMIMONTH','06NOV2014'd,3) December 16, 2014 Although November has 30 days, the returned date is the 16th, not 45 calendar days, which would be the 21st. INTNX('DTSEMIMONTH','06NOV2014:15:00:00'dt,3) 12/16/2014 12:00 AM INTNX('MONTH','06NOV2014'd,3) February 1, 2015 The date returned is the beginning of the following month, not the same date. To get the same date, use the "S" (Sameday) alignment argument. INTNX('DTMONTH','06NOV2014:15:00:00'dt,3) 2/1/2015 12:00 AM INTNX('QTR','06NOV2014'd,3) July 1, 2015 The first day of the quarter is returned. INTNX('DTQTR','06NOV2014:15:00:00'dt,3) 7/1/2015 12:00 AM INTNX('SEMIYEAR','06NOV2014'd,3) January 1, 2016 The beginning of the 3rd semi-year after 11/02/2014 is January 1, 2016. INTNX('DTSEMIYEAR','06NOV2014:15:00:00'dt,3) 1/1/2016 12:00 AM INTNX('YEAR','06NOV2014'd,3) January 1, 2017 The beginning of the 3rd year after 11/02/2014 is January 1, 2017. INTNX('DTYEAR','06NOV2014:15:00:00'dt,3) 1/1/2017 12:00 AM INTNX('SECOND','8:00:00 AM't,3) 8:00:03 AM INTNX('MINUTE','8:00:00 AM't,3) 8:03:00 AM INTNX('HOUR','8:00:00 AM't,3) 11:00:00 AM

Example 5.12: The INTNX() Function: Using Non-Integer Increments

 INTNX('HOUR','16:45't,2.5) 06:00 PM (18:00)

When the number of increments argument is not an integer, SAS will take the integer part as the value. In this case, 2.5 becomes 2. The beginning of the first hour increment is 17:00, and the second is 18:00. Since alignment is the default value of B or beginning, that sets the answer to the beginning of the hour, which gives the result of 18:00.

Example 5.13: Counting Backward with the INTNX() Function

 INTNX('DAY','29NOV2014'd,-4) 25NOV2014

INTNX() always adds the number of intervals to the starting date, time, or datetime. If you want to find a prior date, then use a negative increment.

Example 5.14: The INTNX() Function with Alignment Arguments

 INTNX Call Date Comment INTNX('WEEK','02NOV2014'd,3,'B') 11/23/2014 Beginning of the week interval. Sunday, November 23, 2014 INTNX('WEEK','02NOV2014'd,3,'M') 11/26/2014 Middle of the week interval. Wednesday, November 26, 2014 INTNX('WEEK','02NOV2014'd,3,'E') 11/29/2014 End of the week interval. Saturday, November 29, 2014 INTNX('WEEK','02NOV2014'd,3,'S') 11/23/2014 Same day, so the week interval ends on the same day of the week (Sunday), leaving the duration at 21 days. Sunday, November 23, 2014

Although the alignment arguments do change the value returned by the INTNX() function, it must be reiterated that the function performs the calculation based on the beginning of the interval, and the adjustment to the final result only occurs after that first calculation has been done.

5.6 Modifying SAS Intervals

The default SAS intervals don't match every situation. However, there are two ways to customize SAS intervals: You can modify the existing SAS intervals with multipliers and a shift index, or you can create your own intervals from a SAS data set. This section will discuss multipliers and the shift index. Section 5.7 will discuss creating your own intervals.

The interval function INTCK(interval,start-of-period,end-of-period,method) counts the number of intervals between two given SAS dates, times, or datetimes, while the INTNX(interval,start-from,number-of-increments,alignment) function advances a given date, time, or datetime by a specified number of intervals. Both functions use the standard SAS interval definitions. (See Table 5.6, below.) With the DISCRETE method, the INTCK() function counts at the starting point of a given interval, while the CONTINUOUS method adjusts the starting point of the interval based on the starting date and counts from there. Each method has its merits, and which one you use will depend on your analysis or reporting needs.

The INTNX() function always advances to the beginning of the given interval. The alignment argument available with INTNX() will adjust the result to the middle, end, or the same day of the interval. Nonetheless, the intervals are still measured from their starting point, and the adjustment is only applied after INTNX() has moved the specified number of intervals.

This behavior can lead to problems when your definition of an interval doesn't exactly match the standard SAS definition of that same interval. For example, the standard SAS definition of year has the first day of the year defined as starting January 1. What happens if your fiscal year starts on July 1, and that is how you want to measure your year? What if your semi-month period is truly 14 days long?

A shift index and interval multipliers enable you to do this. You have the ability to define the start of an interval definition by adding a shift index to it. The shift index defines the number of shift points to move the start of the interval. There is one restriction on the value of the shift index: It must be less than the number of shift points within the interval. For example, you can shift the start of a YEAR interval by up to 12 months, but not 13, as there are only 12 months in a year. This makes sense as if you were to shift a year by 13 months, you have shifted into the next YEAR interval, and this is much less intuitive than adding 1 year to the starting date(s) that you send to the interval function.

Table 5.6 shows the SAS interval name, definition, and the shift increment period for each one.

Table 5.6: SAS Intervals and Their Shift Points

 Category Interval Definition Shift Point Date DAY Daily intervals Days WEEK Weekly intervals of seven days Days WEEKDAY Daily intervals with Friday-Saturday-Sunday counted as the same day. days identifies the individual weekend days by number (1=Sunday … 7=Saturday). By default, days="17," so the default interval is WEEKDAY17W. Days TENDAY Ten-day intervals Ten-day periods SEMIMONTH Half-month intervals Semimonthly periods MONTH Monthly intervals Months QTR Quarterly (three-month) intervals Months SEMIYEAR Semi-annual (six-month) intervals Months YEAR Yearly intervals Months Datetime DTDAY Daily intervals Days DTWEEK Weekly intervals of seven days Days DTWEEKDAY Daily intervals with Friday-Saturday-Sunday counted as the same day. days identifies the individual weekend days by number (1=Sunday … 7=Saturday). By default, days="17," so the default interval is DTWEEKDAY17W. Days DTTENDAY Ten-day intervals Ten-day periods DTSEMIMONTH Half-month intervals Semimonthly periods DTMONTH Monthly intervals Months DTQTR Quarterly (three-month) intervals Months DTSEMIYEAR Semiannual (six-month) intervals Months DTYEAR Yearly intervals Months DTSECOND Second intervals Seconds DTMINUTE Minute intervals Minutes DTHOUR Hour intervals Hours Time SECOND Second intervals Seconds MINUTE Minute intervals Minutes HOUR Hourly intervals Hours

For our first example, let's use the standard case of a fiscal year that starts on July 1. Technically, a shift index says that you are moving the start of the interval to the beginning of one of the subperiods within that interval. Years are subdivided into months, so that is the shift unit. The start point is always included in the count of subperiods to be shifted. The easiest way for me to remember this is that shifting an interval by one subperiod is the same as the unshifted base interval. Think of this another way: while the start of the year is January 1, the beginning of the first month in the year is also January 1. If we use a shift index of 1 with our YEAR interval, we are shifting to the beginning of the first month in the year.

Therefore, in order to move your YEAR interval by seven months, you need to move it to the beginning of the seventh subperiod. (January 1, February 1, March 1, April 1, May 1, June 1, July 1—that's seven). The shift value is added to the interval name by appending it with a decimal point.

Therefore, in order to advance the start of the YEAR interval by seven months to July 1, you would use the interval "YEAR.7," as illustrated by the following:

Example 5.15: Moving the Start of a Year Interval from January 1 to July 1

 Sample Function Call Result A INTCK('YEAR','01jan2013'd,'06jul2014'd,'DISCRETE'); 1 B INTCK('YEAR','01jan2013'd,'06jul2014'd,'CONTINUOUS'); 1 C INTCK('YEAR.7','01jan2013'd,'06jul2014'd,'DISCRETE'); 2 D INTCK('YEAR.7','01jan2013'd,'06jul2014'd,'CONTINUOUS'); 1

INTCK() counts the start of interval boundaries with the DISCRETE method (rows A and C). When the interval is "YEAR," it is measuring from January 1, 2013, until January 1, 2014 (the start of the year containing July 6, 2014). When the interval is "YEAR.7," you have shifted the beginning of the YEAR interval by 7 months, which declares that the year starts on July 1. In the example above, that shift causes the measurement to begin on July 1, 2012 (the start of the shifted "year" containing January 1, 2013) and end on July 1, 2014, which is the start of the year containing July 6, 2004. That is why the value returned in row C is 2, not 1. With the CONTINUOUS method, the shifted year starts on July 1, 2013, so the start date is before the start of the measurement. Therefore, the only interval boundary crossed is on July 1, 2014, so the answer in row D is 1.

While a shift index enables you to move the starting point of any given SAS interval, an interval multiplier enables you to define the length of your own intervals. You define your new interval by applying a multiplier value to an interval. For example, if you want to measure biweekly (14-day) periods, you would take the WEEK interval and multiply it by 2. You add the numeric multiplier to the end of the SAS interval name, so in this case your interval name would be "WEEK2." It measures 14-day periods starting on Sunday because the regular SAS interval WEEK starts on Sunday. Example 5.16 demonstrates the creation and use of a custom interval, by using the example of a multiplier of 2 for the WEEK interval. It also illustrates some of the features of custom intervals.

Example 5.16: Using an Interval Multiplier to Create a Custom Interval

 Sample Function Call Result INTNX('WEEK','09feb2014'd,2,'S'); Sunday, February 23, 2014 INTNX('WEEK2','09feb2014'd,2,'S'); Sunday, March 9, 2014

In the above example, the start of the WEEK interval two weeks from February 9 is February 23, so it makes sense that the start of the second biweekly period from February 9 is March 9, because February 2014 has 28 days, which is the same as two of your WEEK2 intervals. This is not as simple as it seems. When using SAS intervals with the INTNX() function, you must keep in mind that intervals are always measured from the beginning of the starting interval to the beginning of the ending interval. This is independent of the starting and ending dates that you supply. Let's move the starting date in Example 5.16 backward by one week to February 2, 2014.

 Sample Function Call Result 1 INTNX('WEEK','02feb2014'd,2,'S'); Sunday, February 16, 2014 2 INTNX('WEEK2','02feb2014'd,2,'S'); Sunday, February 23, 2014

Advancing by two WEEKS (row 1) is still a difference of 14 days, as expected. What happened in row 2? Shouldn't you get 28 days instead of 21? No, because the start of the WEEK2 interval containing February 2, 2014, is January 24, 2014, and that is where the INTNX() function begins its count. Here is a sample program to produce the starting dates of the intervals:

Example 5.17: WEEK2 Intervals from January 1, 2014

DATA tricky;

DO intervals= 0 TO 5;

interval_start = INTNX('WEEK2','01jan2014'd,intervals);

OUTPUT;

END;

RUN;

PROC REPORT DATA=tricky NOWD SPLIT='\';

COLUMNS intervals interval_start;

DEFINE intervals / DISPLAY "Number of WEEK2\intervals from\January 1, 2014"
STYLE={CELLWIDTH=1.25in TEXT_ALIGN=C};

DEFINE interval_start / FORMAT=weekdate32. "Starting Date of Interval";

RUN;

The above program creates a table that shows the starting dates for the first five WEEK2 intervals of 2014. If the date(s) supplied fall between the starting dates of any two boundaries, the interval count (or incrementing) will begin from the starting date of the previous interval. This is why moving the starting date of the INTNX function by one week doesn't necessarily move the result by one week. You can move up to 14 days within a WEEK2 interval before you change the interval boundary.

 Number of WEEK2intervals fromJanuary 1, 2014 Starting Date of Interval 0 Sunday, December 29, 2013 1 Sunday, January 12, 2014 2 Sunday, January 26, 2014 3 Sunday, February 9, 2014 4 Sunday, February 23, 2014 5 Sunday, March 9, 2014

How does SAS determine what the starting date of a given interval is if you use a multiplier? It takes the multiplied interval that you've created and starts counting beginning with January 1, 1960. This is true for all multiplied intervals except multiplied WEEK intervals. Multiplied WEEK intervals are counted starting from Sunday, December 27, 1959, because weeks are defined as starting on Sundays, and January 1, 1960, was a Friday.

You can also use a multiplier and a shift indicator together if needed. Interval multipliers are directly appended to the interval name (for example, WEEK2), and shift indicators are appended to the interval name with a leading decimal point (for example, WEEK.5). To use both the multiplier and a shift index, you first append the multiplier to the interval name to create a new interval name (for example, WEEK2), and then you append the shift indicator to the interval name. Given a multiplier of 2, and a shift of 5, the interval name becomes "WEEK2.5." Again, you cannot use a shift index that is greater than the number of shift points in the interval. In the case of a WEEK2 interval, you would not be able to use a shift index greater than 14, as that would place the starting point of your interval into the next WEEK2 interval. In that case, it would be more intuitive to change the starting date(s) that you will supply to the interval function.

To demonstrate, let's expand on the scenario used in example 5.17. What if you wanted your biweekly periods to start on January 1, 2014? January 1, 2014, was a Wednesday, so you want to move the starting date to the fourth day of the week (Sunday=1, therefore, Wednesday=4). Now we'll generate the same table as above, using a shift indicator of 4 days in addition to the biweekly interval of WEEK2.

Example 5.18: Using Both an Interval Multiplier and Shift Index to Create a Custom Interval

DATA tricky;

DO intervals= 0 TO 5;

plain_interval_start = INTNX('WEEK2','01jan2014'd,intervals);

shifted_interval_start = INTNX('WEEK2.4','01jan2014'd,intervals);

OUTPUT;

END;

RUN;

PROC REPORT DATA=tricky NOWD SPLIT='\';

COLUMNS intervals plain_interval_start shifted_interval_start;

DEFINE intervals / "Number of WEEK2\intervals from\January 1, 2014"

STYLE={CELLWIDTH=1.25in TEXT_ALIGN=C} DISPLAY;

DEFINE plain_interval_start /. "Starting Date of WEEK2 Interval"

FORMAT=weekdate32.;

DEFINE shifted_interval_start / "Starting Date of WEEK2.4 Interval"

FORMAT=weekdate32.;

RUN;

 Number of WEEK2Intervals fromJanuary 1, 2014 Starting Date of WEEK2 Interval Starting Date of WEEK2.4 Interval 0 Sunday, December 29, 2013 Wednesday, January 1, 2014 1 Sunday, January 12, 2014 Wednesday, January 15, 2014 2 Sunday, January 26, 2014 Wednesday, January 29, 2014 3 Sunday, February 9, 2014 Wednesday, February 12, 2014 4 Sunday, February 23, 2014 Wednesday, February 26, 2014 5 Sunday, March 9, 2014 Wednesday, March 12, 2014

When you use a multiplier, you also have the ability to define your shifts within the entire interval created by the multiplier. For example, let's create a decade interval by using YEAR10 as the interval. Remember that intervals start at the beginning of the boundary, so the decades would start at the beginning of the first year of the decade. What can you do if you want to define the decade as starting in May of the middle year in the decade (for example, May of 1955 as opposed to January of 1950)?

To shift intervals across years, you need to use the first nested interval within the YEAR interval, which is MONTH. So you would use (number of years to shift*12) to calculate the number of months that you need to shift. If you wanted to shift 5 years, you would use 5*12=60. Add 5 to that, which shifts the starting month from January to May, and your interval definition is now YEAR10.65. That would be decades starting in May of years that end in 5. The code below shows the effect of moving the interval by 65 months.

Example 5.19: Shifting a Multiplied Interval

DATA tricky3;

DO intervals= 0 TO 7;

plain_interval_start = INTNX('YEAR10','01sep1950'd,intervals);

shifted_interval_start =
INTNX('YEAR10.65','01sep1950'd,intervals);

OUTPUT;

END;

RUN;

ods rtf file="c:\book\2ndEd\examples\ex5.6.5.rtf";

PROC REPORT DATA=tricky3 NOWD SPLIT='\';

COLUMNS intervals plain_interval_start shifted_interval_start;

DEFINE intervals / DISPLAY "Number of\intervals from\September 1, 1950"

STYLE={CELLWIDTH=1in TEXT_ALIGN=C};

DEFINE plain_interval_start / FORMAT=weekdate32.

"Starting Date of YEAR10 Interval";

DEFINE shifted_interval_start / FORMAT=weekdate32.

"Starting Date of Shifted YEAR10 Interval";

RUN;

 Number ofIntervals fromSeptember 1, 1950 Starting Date of YEAR10 Interval Starting Date of Shifted YEAR10 Interval 0 Sunday, January 1, 1950 Tuesday, May 1, 1945 1 Friday, January 1, 1960 Sunday, May 1, 1955 2 Thursday, January 1, 1970 Saturday, May 1, 1965 3 Tuesday, January 1, 1980 Thursday, May 1, 1975 4 Monday, January 1, 1990 Wednesday, May 1, 1985 5 Saturday, January 1, 2000 Monday, May 1, 1995 6 Friday, January 1, 2010 Sunday, May 1, 2005 7 Wednesday, January 1, 2020 Friday, May 1, 2015

The first thing to note is that even though the date that we specified is September 1, the starting date of the interval is January 1, because that is the start of the YEAR interval. In the second column, you can see that the interval has been shifted. Even though the starting date of the YEAR10. interval is January 1, 1950, the shifted interval itself starts on Tuesday, May 1, 1945 (bold italics added), not May 1, 1955. Why? Because it is the start of the interval that contains January 1, 1950.

The next example will demonstrate the importance of making sure that you use the correct shift index and that you have to be aware of the starting date and/or datetime for your shifted, multiplied interval.

Example 5.20: The Interaction between the Starting Date and the Interval Starting Point: The "Working Shift" Interval

A company has a 24-hour production cycle that consists of three 8-hour working shifts per day. It is easy to create an interval that represents each shift by using HOUR8 as your base interval, but shifts do not start at midnight, 8 a.m., and 4 p.m., which would be the default for the HOUR8 interval. The table below shows when the HOUR8 interval starts based on the default.

 Function Call Starting Time INTNX('HOUR8','00:00't,0) 12:00:00 AM INTNX('HOUR8','00:00't,1) 8:00:00 AM INTNX('HOUR8','00:00't,2) 4:00:00 PM

However, in this case, the actual shifts start at 6 a.m., 2 p.m., and 10 p.m., so we will need to use a shift index to align our HOUR8 interval with the actual shift hours. It is important to remember that when you are shifting intervals, a shift index of 1 moves the start of the interval to the start of the first subperiod within the interval—which is always the same as the start of the unshifted interval. Look at the following table, which will tell us what interval shift will give us the starting point at 6 a.m.

 Function Call Interval Name Starting Time Comment INTNX('HOUR8','6:00't,0); HOUR8 12:00 AM INTNX(HOUR8.1,'6:00't,0) HOUR8.1 12:00 AM Same as HOUR8 interval. INTNX(HOUR8.2,'6:00't,0) HOUR8.2 1:00 AM INTNX(HOUR8.3,'6:00't,0) HOUR8.3 2:00 AM INTNX(HOUR8.4,'6:00't,0) HOUR8.4 3:00 AM INTNX(HOUR8.5,'6:00't,0) HOUR8.5 4:00 AM INTNX(HOUR8.6,'6:00't,0) HOUR8.6 5:00 AM INTNX(HOUR8.7,'6:00't,0) HOUR8.7 6:00 AM This is what we want.

Notice that in the above table, we've used 6 a.m. as our reference time in the INTNX() function call, and we get the starting point of the interval by adding 0 intervals to our reference point. This method works for finding the starting point of any interval in relation to the reference point. Now we'll repeat our first table, showing the starting points for a 24-hour period starting at 6 a.m., with our HOUR8.7 interval to demonstrate that everything is correct.

 Function Call Starting Time INTNX('HOUR8.7','06:00't,0) 6:00:00 AM INTNX('HOUR8.7','06:00't,1) 2:00:00 PM INTNX('HOUR8.7','06:00't,2) 10:00:00 PM

Although it might seem odd that a shift index of 7 is needed to get a change of 6 hours, you must remember that the shift index is the number of subperiods within the interval to move. The shift is always to the beginning of each subperiod within the interval. Therefore, the first subperiod within an interval always starts at the beginning of the interval. In this case, the HOUR8 interval starts at midnight, and the first hour within the HOUR8 interval also starts at midnight. That is why the shift index necessary is 7, not 6.

In summary, the most important thing to remember about using intervals, multipliers, and the shift index is that all intervals, no matter how they are defined, are measured from their beginning. If you are using the INTCK() function, the definition of "beginning" is dependent upon the method that you use, DISCRETE or CONTINUOUS. With DISCRETE, it is considered to be the beginning of the interval boundary, but with CONTINUOUS, beginning is defined based on the start date provided. The INTNX() function measures the beginning from the interval boundary; if it is easier to think of INTNX as always using the DISCRETE method, then you can do that. However, remember that the alignment arguments "B", "M", "E", and "S" for the INTNX() function do not adjust the date until after the function has executed and calculated its start-of-interval result (also remember that the alignment arguments only work with the INTNX() function). No matter what, you can use the interval multipliers and a shift index to move the starting point of an interval, and you can use them anywhere that you can use a date, time, or datetime interval.

5.7 Creating Your Own SAS Intervals

Another way to create intervals for use with interval functions is by creating a SAS data set that defines the boundaries of your interval. This data set defines the duration of periods within the interval and the period across which the interval applies as well, which is an important concept that will be discussed when it comes to errors with using user-defined intervals. User-defined intervals enable you to create intervals that do not correspond with shifted and/or multiplied SAS intervals, such as intervals with irregular boundaries, those that do not encompass an entire predefined SAS interval, or intervals that are specific to your job, company, or industry. However, if you needed to count working shifts in a situation where there are two 10-hour work shifts per day with a four-hour break in-between while the machinery resets, you could not do it by using multipliers and a shift index.

You can also redefine an interval. For example, the shift point for a day is the day; what happens if you want your "day" to start at 6 a.m. instead of midnight? DTHOUR24.7 would create an interval that runs 24 hours starting at 6 a.m. (See example 5.20, which does the same for an HOUR8 interval. The only difference is that there are 24-hour-long subperiods in a DTHOUR24 interval.) Unfortunately, the original question was to shift a DAY interval by 6 hours, but you can't do that with a date value. DTHOUR intervals only apply to datetime values. You would have to convert all of your date data to datetime data in order to use that interval. Defining your own interval where days start at the desired time would solve this problem.

The SAS data set you have to create consists of at least one variable named begin. It can also have two other optional variables, end and season. If you do not include the end variable, it is assumed to be one less than the start of the next period within that interval; for example, if your user-defined interval is based on days, the end will be the day before the begin value for the next record. While seasonality is a time series concept outside of the scope of this book, it can be accounted for with the variable season. One additional requirement is that begin (and end, if present in the data set) must have an appropriate date, time, or datetime format assigned at the time that the data set is created. In order to associate this data set with a user-defined interval, you need to use the INTERVALDS system option.

OPTIONS INTERVALDS=(interval-name1, dataset-with-interval-records1, interval-name2, dataset-with-interval-records2…)

This is where you name your user-defined interval and tell SAS which data set to use for that interval. You can specify as many user-defined intervals as you need in a single OPTIONS INTERVALDS statement. interval-name must conform to standard SAS naming conventions, and it cannot be a SAS reserved word. If you try to use a reserved word, you will get an error at the OPTIONS statement. This applies to names of existing intervals defined by SAS as well. You cannot change the definition of an existing SAS interval by creating one of your own with the same name. dataset-with-interval-records is a standard SAS data set reference of the type libref.dataset-name. Since this statement merely identifies a data set for association with your interval name, data set options are not allowed.

For our first example, let's define a SEMESTER interval, based on an academic calendar with two semesters and a summer session. The sessions are defined as follows: The fall semester begins the day following the first Monday in September and ends on the third Friday in December. The spring semester begins on the second Monday in January, unless that day falls on the Martin Luther King, Jr., holiday. In that case, it begins on the day following the holiday. The summer session starts on the last Tuesday in May and proceeds through the third Friday in August. The following example uses both the NWKDOM() and HOLIDAY() functions described in sections 5.3 and 5.4, respectively.

Example 5.21: Academic Calendar: Creating a SEMESTER Interval

OPTIONS INTERVALDS=(semester=semester); ❶

DATA semester;

DO year=2014 TO 2017; ❷

/* Fall Semester */

begin = NWKDOM(5,2,8,year); /* Last Monday in August */

end = NWKDOM(3,6,12,year); /* Third Friday in December */

OUTPUT;

/* Spring Semester */

begin = NWKDOM(2,2,1,year+1); /* Second Monday in January of
following calendar year */

if begin eq HOLIDAY('MLK',year+1) then /* If MLK day, then Tuesday */

begin = begin + 1;

end = NWKDOM(2,6,5,year+1); /* Second Friday in May */

OUTPUT;

/* Summer Session */

begin = NWKDOM(1,2,6,year+1); /* First Monday in June */

end = NWKDOM(3,6,8,year+1); /* Third Friday in August */

OUTPUT;

END;

FORMAT begin end WEEKDATE.; /* Required */

RUN;

❶ tells SAS that we are defining an interval named "semester," and it is to be based on the data set WORK.SEMESTER.

❷ sets a definite limit on the interval, from the start of the 2014 academic year through the 2017 academic year, which ends in August 2018. Below is the resulting data set:

 begin end Monday, August 25, 2014 Friday, December 19, 2014 Monday, January 12, 2015 Friday, May 8, 2015 Monday, June 1, 2015 Friday, August 14, 2015 Monday, August 31, 2015 Friday, December 18, 2015 Monday, January 11, 2016 Friday, May 13, 2016 Monday, June 6, 2016 Friday, August 12, 2016 Monday, August 29, 2016 Friday, December 16, 2016 Monday, January 9, 2017 Friday, May 12, 2017 Monday, June 5, 2017 Friday, August 11, 2017 Monday, August 28, 2017 Friday, December 15, 2017 Monday, January 8, 2018 Friday, May 11, 2018 Monday, June 4, 2018 Friday, August 10, 2018

You can then use the SEMESTER interval that you have created in the INTCK or INTNX interval functions. The following table shows how many semesters have elapsed from September 15, 2014, to the respective end dates shown.

Example 5.22: Using a User-Defined Interval with INTCK

 Function Call Result INTCK('SEMESTER',"15SEP2014"d,"22JUN2015"d); 2 INTCK('SEMESTER',"15SEP2014"d,"24NOV2016"d); 6 INTCK('SEMESTER',"15SEP2014"d,"01MAY2018"d); 10

For a second example, let's consider a case where we need to count the actual number of working days for each year. The WEEKDAY interval is a useful approximation, but it doesn't account for holidays, so if you need to know the exact number of working days between two dates, this will solve that problem.

Example 5.23: Customized Company Working Days

OPTIONS INTERVALDS=(WorkingDays=Workdays);

DATA workdays (KEEP=begin); /* End variable not needed */

start = '01JAN2014'D;

stop = '31DEC2015'D;

nweekdays = INTCK('WEEKDAY',start,stop); /* Based on Mon-Fri weekdays */

DO i = 0 TO nweekdays;

begin = INTNX('WEEKDAY',start,i);

year = YEAR(begin);

/* Company-specific holidays */

/* Company closes day after thanksgiving */

xthanks = HOLIDAY("THANKSGIVING",year) + 1;

/* Christmas Eve */

xmaseve = HOLIDAY('CHRISTMAS',year) - 1;

/* Friday before Easter */

sprng = HOLIDAY("EASTER",year) - 2;

/* Founders Day Company Holiday. If on weekend, move forward or back */

founders = MDY(8,6,year);

SELECT(WEEKDAY(founders));

WHEN(6) founders = founders - 1;

WHEN(1) founders = founders + 1;

OTHERWISE founders = founders;

END;

/* Exclude dates of standard and company holidays from interval data set */

IF BEGIN ne HOLIDAY("NEWYEAR",year) AND

BEGIN ne HOLIDAY("MLK",year) AND

BEGIN ne HOLIDAY("USPRESIDENTS",year) AND

BEGIN ne HOLIDAY("MEMORIAL",year) AND

BEGIN ne HOLIDAY("USINDEPENDENCE",year) AND

BEGIN ne HOLIDAY("LABOR",year) AND

BEGIN ne HOLIDAY("VETERANS",year) AND

BEGIN ne HOLIDAY("THANKSGIVING",year) AND

BEGIN ne HOLIDAY("CHRISTMAS",year) AND

BEGIN ne xmaseve AND

BEGIN ne xthanks AND

BEGIN ne sprng AND

BEGIN ne founders THEN OUTPUT;

END;

FORMAT begin DATE9.;

RUN;

The following code generates a table showing the difference between the number of calendar days, the number of weekdays, and the number of working days for this company for the calendar year 2014.

DATA CountDays;

start = '01JAN2014'D;

stop = '01JAN2015'D;

ActualDays = INTCK('DAYS',start,stop);

Weekdays = INTCK('WEEKDAYS',start,stop);

ProductionDays = INTCK('WORKINGDAYS',start,stop);

LABEL ActualDays="Actual Days"

Weekdays="Weekdays"

ProductionDays = "Production Days"

start = "Starting Date"

end = "Ending Date"

;

FORMAT start stop DATE9.;

RUN;

PROC PRINT DATA=CountDays NOOBS;

RUN;

 start stop ActualDays Weekdays ProductionDays 01JAN2014 01JAN2015 365 261 248

How much of a difference can this make? Consider that this company has six product lines and the time to manufacture each product varies. A customer comes in with a large order and wants to know what the delivery date would be for each product. It is critical that the delivery be guaranteed to the day, so the contract calls for a substantial discount if the delivery date is not met. In the following table, the order date is June 27, 2014. We use the INTNX() function to add the corresponding number of production days for each product to the order date. The approximate delivery date is calculated using the SAS WEEKDAY interval, while the true delivery date uses our user-defined WORKINGDAYS interval.

DATA production_lines;

LENGTH product \$ 40;

orderDate = "27JUN2014"d;

Product = "Std Product 1";

days_from_order = 23;

OUTPUT;

Product = "Std Product 2";

days_from_order = 32;

OUTPUT;

Product = "Std Product 3";

days_from_order = 35;

OUTPUT;

Product = "Custom Product 1";

days_from_order = 33;

OUTPUT;

Product = "Custom Product 2";

days_from_order = 42;

OUTPUT;

Product = "Custom Product 3";

days_from_order = 56;

OUTPUT;

FORMAT orderdate weekdate.;

RUN;

DATA ordertime;

SET production_lines;

deliveryDate = INTNX('WORKINGDAYS',orderdate,days_from_order,'S');

approx_deliveryDate = INTNX('WEEKDAY',orderdate,days_from_order,'S');

FORMAT deliverydate approx_deliveryDate weekdate.;

RUN;

PROC REPORT DATA=ordertime NOWD SPLIT='\';

COLUMNS product days_from_order approx_deliveryDate deliveryDate;

DEFINE product / "Product";

DEFINE days_from_order / "Production Days\Required"

STYLE={TEXT_ALIGN=C};

DEFINE approx_deliveryDate / STYLE={TEXT_ALIGN=L}

"~S={text_align=c}Approximate Delivery Date\using WEEKDAYS Interval";

DEFINE deliveryDate / "~S={TEXT_ALIGN=C}Actual Delivery Date\using Custom\WORKINGDAYS Interval" STYLE={TEXT_ALIGN=L};

RUN;

 Product Production DaysRequired Approximate Delivery Date Using WEEKDAYS Interval Actual Delivery DateUsing CustomWORKINGDAYS Interval Std Product 1 23 Wednesday, July 30, 2014 Thursday, July 31, 2014 Std Product 2 32 Tuesday, August 12, 2014 Thursday, August 14, 2014 Std Product 3 35 Friday, August 15, 2014 Tuesday, August 19, 2014 Custom Product 1 33 Wednesday, August 13, 2014 Friday, August 15, 2014 Custom Product 2 42 Tuesday, August 26, 2014 Thursday, August 28, 2014 Custom Product 3 56 Monday, September 15, 2014 Thursday, September 18, 2014

There is one specific problem to be aware of when you are working with user-defined intervals. If a specific date, time, or datetime is not covered in the interval data set, and you try to use one of these values in a function, or one of these values is to be returned by a function, you will not get the answer that you expect. Such unexpected results can be avoided by ensuring that the dates that you define in the data set encompass all the dates that you may encounter. The next two examples will demonstrate what happens when you use a date that is out of the range of your user-defined interval. We will use the WORKINGDAYS interval as defined in the previous example. That interval is only defined for a period of 2 years, from January 1, 2014, through December 31, 2015. What happens if you try to calculate the number of working days using the INTCK() function with ending dates outside of that range, or you try to project a date using the INTNX() function that would be beyond that range?

Example 5.24: Out-of-Interval Calculation Using INTCK() and "WORKINGDAYS" Custom Interval

 Obs startDate endDate result 1 Wednesday, December 23, 2015 Thursday, December 24, 2015 0 2 Wednesday, December 23, 2015 Friday, December 25, 2015 0 3 Wednesday, December 23, 2015 Saturday, December 26, 2015 0 4 Wednesday, December 23, 2015 Sunday, December 27, 2015 0 5 Wednesday, December 23, 2015 Monday, December 28, 2015 1 6 Wednesday, December 23, 2015 Tuesday, December 29, 2015 2 7 Wednesday, December 23, 2015 Wednesday, December 30, 2015 3 8 Wednesday, December 23, 2015 Thursday, December 31, 2015 4 9 Wednesday, December 23, 2015 Friday, January 1, 2016 4 10 Wednesday, December 23, 2015 Saturday, January 2, 2016 4 11 Wednesday, December 23, 2015 Sunday, January 3, 2016 4 12 Wednesday, December 23, 2015 Monday, January 4, 2016 4 13 Wednesday, December 23, 2015 Tuesday, January 5, 2016 4 14 Wednesday, December 23, 2015 Wednesday, January 6, 2016 4 15 Wednesday, December 23, 2015 Thursday, January 7, 2016 4

Rows 1 through 4 make sense. There are no working days from Thursday through Sunday, as December 24 is a company holiday. Rows 5 through 8 count a working day for Monday through Thursday as you would expect. Friday, January 1, 2016, is also a company holiday. Therefore, you would not expect it to count as a working day, nor would you expect the following Saturday and Sunday to count as workdays. But what about that first full week in January, rows 12 through 15? Those days aren't company holidays, so why aren't they being counted as working days?

The answer is that they aren't part of the WORKINGDAYS interval. The definition of the interval ended on December 31, 2015. Because they are not in the data set, any dates before or after the dates specified in the data set will not be considered working days. This makes sense when you consider how we initially created the company holidays. We did not include those dates in the data set that created the interval. Therefore, this problem actually starts with January 1, 2016. It is merely coincidental that January 1st is a holiday, and the 2nd and 3rd are weekend days; that is not what is preventing them from being counted as working days. Their absence from the data set we used to create the interval is the only reason that they are not being counted. This next example shows the type of problem that can arise when using the INTNX() function in a similar fashion.

Example 5.25: Out-of-Interval Calculation with INTNX() and "WORKINGDAYS" Custom Interval

 Obs Function Call Result 1 INTNX('workingdays','26dec2015'd,1,'S'); Monday, December 28, 2015 2 INTNX('workingdays','26dec2015'd,2,'S'); Tuesday, December 29, 2015 3 INTNX('workingdays','26dec2015'd,3,'S'); Wednesday, December 30, 2015 4 INTNX('workingdays','26dec2015'd,4,'S'); Thursday, December 31, 2015 5 INTNX('workingdays','26dec2015'd,5,'S'); . 6 INTNX('workingdays','26dec2015'd,6,'S'); . 7 INTNX('workingdays','26dec2015'd,7,'S'); . 8 INTNX('workingdays','26dec2015'd,8,'S'); .

What happened here? Rows 1 through 4 count the working days as expected, Monday through Thursday. However, if you try to increment the starting date by more than 4 working days, the function call returns a missing value. SAS cannot do the calculation because the result is undefined; it ran out of possible results to return at December 31, 2015, because you did not provide any in the data set.

The ability to define your own intervals will cover situations when the combination of standard SAS intervals, a multiplier, and a shift index does not suffice. It is not difficult to create the data set upon which SAS will base your user-defined interval, but be aware that dates that are not included in the interval can cause problems when you try to use the intervals that you create. If you see unexpected results from using your user-defined interval, the first thing to investigate is to make sure that the range that you have defined for your interval encompasses the range of possible results.

SAS has a series of functions that provide information about intervals that are being used. They can tell you which interval is the best fit between two or three dates, or suggest a format for a given interval. There is a function to tell you the shift point in effect for a given interval. (For basic intervals, this returns the values in Table 5.5, but interval multipliers can change the returned value.) There is also a function that will tell you if the interval name that you are using is valid. This can allow for increased automation and building robust applications.

5.8.1 INTFIT(argument-1,argument-2,type)

INTFIT(argument-1,argument-2,type) takes two date, datetime, or observation numbers as the first two arguments and returns the interval between them deemed to be "the best fit." This function assumes that the alignment is "SAME," so that the exact values are matched between the two arguments. type can be one of three case-insensitive values:

 Interval Name Definition D Use when the two arguments are SAS date values. These can be date literals or variables containing SAS date values DT Use when the two arguments are SAS datetime values. These can be datetime literals or variables containing SAS datetime values OBS Use this when you want to find the interval between two observations by providing observation numbers. This will return an interval with the base name of "OBS."

You can use this function to give you the interval between two points in time. The interval returned might contain a multiplier and/or a shift index. As you can see in the table below, some of the intervals are not as you might anticipate.

 start end result A Tuesday, April 1, 2014 Wednesday, April 2, 2014 DAY B Tuesday, April 1, 2014 Thursday, April 3, 2014 DAY2 C Tuesday, April 1, 2014 Friday, April 4, 2014 DAY3.3 D Tuesday, April 1, 2014 Saturday, April 5, 2014 DAY4.3 E Tuesday, April 1, 2014 Sunday, April 6, 2014 DAY5.5 F Tuesday, April 1, 2014 Tuesday, April 8, 2014 WEEK.3 G Tuesday, April 1, 2014 Wednesday, April 9, 2014 DAY8.7 H Tuesday, April 1, 2014 Friday, April 11, 2014 TENDAY I Tuesday, April 1, 2014 Sunday, April 13, 2014 DAY12.3 J Tuesday, April 1, 2014 Tuesday, April 15, 2014 WEEK2.10 K Tuesday, April 1, 2014 Wednesday, April 16, 2014 SEMIMONTH L Tuesday, April 1, 2014 Thursday, April 17, 2014 DAY16.7 M Tuesday, April 1, 2014 Saturday, April 19, 2014 DAY18.15 N Tuesday, April 1, 2014 Monday, April 21, 2014 TENDAY2.2 O 5/1/2014 9:00 AM 5/1/2014 2:00 PM DTHOUR5 P 5/1/2014 9:00 AM 5/1/2014 8:45 PM DTMINUTE705.136 Q 5/1/2014 9:00 AM 5/2/2014 3:30 AM DTMINUTE1110.61

The italicized rows show examples of intervals that can be puzzling. Remember that the function provides the exact intervals between the pair of dates or datetimes, and the lowest common denominator will be used for base interval. In addition, the dates are measured using the sameday alignment, which means that the interval is measured from the beginning of the interval, and then adjusted. At first glance, the DAY5.5 result in row E doesn't seem to be right; Tuesday is only the third day of the week, so shouldn't the shift be 3 as it is for the DAY3 and DAY4 intervals? The start of DAY 5 intervals are Sunday, then Friday. If you want to measure a DAY5 interval to a Sunday from a Tuesday, then you will have to measure from the Friday, which gives you a shift of 5. Rows P and Q demonstrate that it might become even more confusing with times and/or datetimes, as the base interval shift points become minutes and/or seconds, so the resulting multipliers and shift indices might be large enough so that you lose the context of the reference point.

The following table shows a sample result of using the INTFIT() function with the 'OBS' type. The records are from a sequential data set with Wednesday, April 2, 2014, as the first date. Since the dates are keyed by observation, the result of the INTFIT() function shows the interval as the relationship between observations, not the actual date values themselves.

 Sample Function Call Date Result INTFIT(1,1,'OBS') Wednesday, April 2, 2014 INTFIT(1,4,'OBS') Saturday, April 5, 2014 OBS3.2 INTFIT(1,5,'OBS') Sunday, April 6, 2014 OBS4.2 INTFIT(1,10,'OBS') Friday, April 11, 2014 OBS9.2 INTFIT(1,12,'OBS') Sunday, April 13, 2014 OBS11.2 INTFIT(1,13,'OBS') Monday, April 14, 2014 OBS12.2 INTFIT(1,14,'OBS') Tuesday, April 15, 2014 OBS13.2 INTFIT(1,16,'OBS') Thursday, April 17, 2014 OBS15.2 INTFIT(1,19,'OBS') Sunday, April 20, 2014 OBS18.2

5.8.2 INTFMT('interval','size')

INTFMT('interval','size') takes the interval that you give and returns a suggested format for date, time, or datetime values using this interval. interval can be any standard SAS interval (with multipliers and/or a shift index if desired), or a user-defined interval that you have created, as shown in Section 5.6. If interval is represented by a character string, then it must be enclosed in quotation marks, but you can use a character variable containing the name of an interval instead. size refers to whether the format returned will have a two-digit year ('short'/'s') or a four-digit year ('long'/'l'). This argument must be enclosed in quotation marks also, but as with the interval argument, you can use a character variable for the argument. There is one important usage note about the size argument: If you do not use lowercase, the function might return an unpredictable result (see Row A) in the example table below:

 Sample Function Call Result Comment A INTFMT('WEEK','LONG') size argument is in uppercase B INTFMT('WEEK','long') WEEKDATX17. C INTFMT('MONTH','long') MONYY7. D INTFMT('QTR','short') YYQC4. YYQC. format width is only 4 because it displays two-digit year when the size argument='short' E INTFMT('QTR','long') YYQC6. F INTFMT('YEAR','long') YEAR4. G INTFMT('YEAR10','l') YEAR4. Using alias 'l' for 'long' H INTFMT('YEAR100','l') YEAR4. Using alias 'l' for 'long' I INTFMT('DTMONTH','l') DTMONYY7. Using alias 'l' for 'long' J INTFMT('DTYEAR','l') DTYEAR4. Using alias 'l' for 'long' K INTFMT('HOUR','l') DATETIME10. Using alias 'l' for 'long' L INTFMT('WORKINGDAYS','l') DATE11. Using alias 'l' for 'long'

This function is most useful when you use the INTFIT or INTGET functions to determine the interval dynamically, so that you will always display a result in the proper context. See example 5.23 to see how this can be used to dynamically format results.

5.8.3 INTGET(argument1,argument2,argument3)

INTGET(argument1,argument2,argument3) determines an interval from three date or datetime values that you provide. The arguments can be variables or date/datetime literals, but they must be of the same type; you cannot mix dates and datetimes as arguments. The function calculates all intervals between the first two arguments, and then between the second and third arguments. If the intervals are the same, it will return that interval. If the intervals are not the same, then the function will test the interval between the second and third arguments to see whether it is a multiple of the interval between the first two arguments. If this is true, then the function will return the smaller of the two intervals. If neither of these cases are true, then the INTGET() function will return a missing value.

 Sample Function Call Result A INTGET('05SEP2013'd,20SEP2013'd,05OCT2013'd) SEMIMONTH B INTGET('15JAN2014'd,15APR2014'd,15OCT2015'd) QTR C INTGET('15JAN2014'd,15APR2015'd,15OCT2015'd) D INTGET('09JUL2015'd,09SEP2015'd,09MAR2016'd) MONTH2 E INTGET('09JUL2015'd,29AUG2015'd,15SEP2015'd) DAY17.15

In row A, the period between the dates is 15 days, which corresponds to the SEMIMONTH interval. Row B demonstrates that the dates do not have to have the same interval. The gap between January and April is one quarter, but the gap between April and October of the following year is five quarters. Since the interval for the second pair of arguments would be 'QTR5,' since it is a multiple of the QTR interval, the function returns the unmultiplied interval. What happened in row C? Simply, the interval between January 2014 and April 2015 is now QTR5, but the interval between April and October 2015 is QTR2. Since QTR2 is not a multiple of QTR5, the result is missing. Row D shows that if the interval between two arguments is itself a multiplied interval (MONTH2 versus MONTH6), the function will return the interval with the smallest multiplier, while row E shows the ingenuity of the function: The DAY17 interval works if it is shifted by 15 days.

5.8.4 INTSHIFT('interval')

INTSHIFT('interval') takes the interval that you give and returns the shift point for that interval. interval can be any standard SAS interval (with multipliers and/or a shift index if desired). If interval is represented by a character string, then it must be enclosed in quotation marks, but you can use a character variable containing the name of an interval instead. If you try to use this function with a user-defined interval that you have created, the function will return a missing value.

 Sample Function Call Result Comment INTSHIFT('WEEK') DAY INTSHIFT('MONTH') MONTH INTSHIFT('QTR3.4') MONTH INTSHIFT('YEAR') MONTH INTSHIFT('YEAR10.5') MONTH INTSHIFT('YEAR100') MONTH INTSHIFT('DTMONTH') DTMONTH INTSHIFT('DTYEAR.7') DTMONTH INTSHIFT('HOUR') DTHOUR Note that shift points for time intervals are expressed as datetime shift points. INTSHIFT('WORKINGDAYS') User-defined interval, so no shift points can be calculated.

5.8.5 INTTEST('interval')

INTTEST('interval') takes the interval that you give and returns a 1 if it is a valid interval name, or 0 if it is not. You can use this to determine whether you have created a valid interval with multipliers and/or a shift index. interval can be any standard SAS interval (with multipliers and/or a shift index if desired). If interval is represented by a character string, then it must be enclosed in quotation marks, but you can use a character variable containing the name of an interval instead. If you try to use this function with a user-defined interval that you have created, the function will return a missing value.

 Sample Function Call Result Comment INTTEST('WEEK') 1 INTTEST('QTR.1') 1 You can shift a quarter by up to 3 months (counting the initial starting point as .1). INTTEST('QTR3.13') 0 The shift point for quarters is months, so the maximum number of months that you can shift in a 3-quarter period would be 9. INTTEST('YEAR.7') 1 INTTEST('YEAR10.14') 1 INTTEST('DTMONTH') 1 INTTEST('HOUR2.3') 0 Cannot shift 3 hours in a 2-hour interval. INTTEST('WORKINGDAYS') 0 This function does not recognize user-defined intervals.

5.9 Retail Calendar Intervals and Seasonality

This section covers intervals and functions that are most frequently used with time series analyses. Although the topic of time series analysis is beyond the scope of this book, these intervals and functions are described here because the intervals can be used with INTNX() and INTCK(), while the functions are a part of Base SAS. However, they are presented without context.

5.9.1 Retail Calendar Intervals

SAS has added intervals that are specifically designed for the retail industry. These intervals are ISO 8601 compliant, and can be used with any of the interval functions. They facilitate comparisons across years, because the weeks remain consistent between years. In order to facilitate this, some years will have leap weeks. Year definitions are based on the ISO 8601 definition of a week, which is the first Monday preceding January 4, which in some cases might place the beginning of the week in December. These intervals enable you to define the structure of your 52-week year, expecting that for the first 13-week period of your interval, there will be 1 month that has 5 weeks in it. This means that you can set the month pattern to 5-4-4, 4-5-4, or 4-4-5. You can work with years, months, or quarters in this fashion. The full list of retail intervals is given inTable 5.7.

Table 5.7: Retail Calendar Intervals

 Interval Description YEARV Specifies ISO 8601 yearly intervals. The ISO 8601 year begins on the Monday on or immediately preceding January 4. Note that it is possible for the ISO 8601 year to begin in December of the preceding year. Also, some ISO 8601 years contain a leap week. The beginning subperiod s is written in ISO 8601 weeks (WEEKV). R445YR Is the same as YEARV except that in the retail industry the beginning subperiod s is 4-4-5 months (R445MON). R454YR Is the same as YEARV except that in the retail industry the beginning subperiod s is 4-5-4 months (R454MON). R544YR Is the same as YEARV except that in the retail industry the beginning subperiod s is 5-4-4 months (R544MON). R445QTR Specifies retail 4-4-5 quarterly intervals (every 13 ISO 8601 weeks). Some fourth quarters contain a leap week. The beginning subperiod s is 4-4-5 months (R445MON). R454QTR Specifies retail 4-5-4 quarterly intervals (every 13 ISO 8601 weeks). Some fourth quarters contain a leap week. The beginning subperiod s is 4-5-4 months (R454MON). R544QTR Specifies retail 5-4-4 quarterly intervals (every 13 ISO 8601 weeks). Some fourth quarters contain a leap week. The beginning subperiod s is 5-4-4 months (R544MON). R445MON Specifies retail 4-4-5 monthly intervals. The 3rd, 6th, 9th, and 12th months are five ISO 8601 weeks long with the exception that some 12th months contain leap weeks. All other months are four ISO 8601 weeks long. R445MON intervals begin with the 1st, 5th, 9th, 14th, 18th, 22nd, 27th, 31st, 35th, 40th, 44th, and 48th weeks of the ISO year. The beginning subperiod s is 4-4-5 months (R445MON). R454MON Specifies retail 4-5-4 monthly intervals. The 2nd, 5th, 8th, and 11th months are five ISO 8601 weeks long with the exception that some 12th months contain leap weeks. R454MON intervals begin with the 1st, 5th, 10th, 14th, 18th, 23rd, 27th, 31st, 36th, 40th, 44th, and 49th weeks of the ISO year. The beginning subperiod s is 4-5-4 months (R454MON). R544MON Specifies retail 5-4-4 monthly intervals. The 1st, 4th, 7th, and 10th months are five ISO 8601 weeks long. All other months are four ISO 8601 weeks long with the exception that some 12th months contain leap weeks. R544MON intervals begin with the 1st, 6th, 10th, 14th, 19th, 23rd, 27th, 32nd, 36th, 40th, 45th, and 49th weeks of the ISO year. The beginning subperiod s is 5-4-4 months (R544MON). WEEKV Specifies ISO 8601 weekly intervals of seven days. Each week begins on Monday. The beginning subperiod s is calculated in days (DAY). Note that WEEKV differs from WEEK in that WEEKV.1 begins on Monday, WEEKV.2 begins on Tuesday, and so on.

5.9.2 Seasonality Functions

Seasonality is used in time series analysis and can be used in SAS/ETS. It helps to account for normal seasonal variations in patterns inside an analysis. While it is not strictly a date and time matter, it does use intervals. Therefore, the seasonality functions are documented here, but without context. For more information about seasonality and its application, you can refer to the SAS/ETS documentation, and support.sas.com is a great place to find more help on this topic.

INTCINDEX('interval',argument)

INTCINDEX('interval',argument) returns the index of the seasonal cycle based on interval for argument, where argument is a SAS date, time, or datetime value. interval can be any standard SAS interval (with multipliers and/or a shift index if desired).

INTCYCLE('interval',seasonality)

INTCYCLE('interval',seasonality) returns the interval of the seasonal cycle. interval can be any standard SAS interval (with multipliers and/or a shift index if desired). seasonality is an optional argument that enables you to define seasonal cycles, and it can be a number or a cycle (such as 'YEAR'). For example, you can use the seasonality argument to specify your year as having 53 weeks instead of 52.

INTINDEX('interval',argument,seasonality)

INTINDEX('interval',argument,seasonality) returns the seasonal index when given an interval, a SAS date, time, or datetime of the seasonal cycle. interval can be any standard SAS interval (with multipliers and/or a shift index if desired. argument is a SAS date, time, or datetime value. Note that the interval specified must be appropriate for the argument. seasonality is an optional argument that enables you to define seasonal cycles, and it can be a number or a cycle (such as 'YEAR'). The INTINDEX function returns the seasonal index, while the INTCINDEX function returns the cycle index.

INTSEAS('interval',seasonality)

INTSEAS('interval',seasonality) returns the number of intervals in a seasonal cycle. interval can be any standard SAS interval (with multipliers and/or a shift index if desired). seasonality is an optional argument that enables you to define seasonal cycles. This is a good function to be aware of if you are not familiar with concepts of seasonality, because the number of intervals in a seasonal cycle might not be intuitive. For example, while you might expect the number of intervals for the QTR interval to be 4 (because there are 4 quarters in a year), the number of intervals for a DAY interval is 7, for the number of days in the week. This can help you check your expectations.

﻿