Excel 2016 Formulas (2016)
PART II
Leveraging Excel Functions
Chapter 7
Counting and Summing Techniques
In This Chapter
· Counting and summing cells
· Counting and summing records in databases and pivot tables
· Basic counting formulas
· Advanced counting formulas
· Formulas for common summing tasks
· Conditional summing formulas using a single criterion
· Conditional summing formulas using multiple criteria
· Using VBA for counting and summing tasks
Many of the most frequently asked spreadsheet questions involve counting and summing values and other worksheet elements. It seems that people are always looking for formulas to count or sum various items in a worksheet. This chapter will answer the majority of such questions.
Counting and Summing Worksheet Cells
Generally, a counting formula returns the number of cells in a specified range that meet certain criteria. A summing formula returns the sum of the values of the cells in a range that meet certain criteria. The range that you want counted or summed may or may not consist of a worksheet database or a table.
Table 7.1 lists the worksheet functions that come into play when you are creating counting and summing formulas. If none of the functions in Table 7.1 can solve your problem, an array formula can likely come to the rescue.
Table 7.1 Excel Counting and Summing Functions
Function |
Description |
AGGREGATE |
Can be used for counting and summing, with options to ignore hidden cells, error values, and nested SUBTOTAL or AGGREGATE functions. |
COUNT |
Returns the number of cells in a range that contain a numeric value. |
COUNTA |
Returns the number of nonblank cells in a range. |
COUNTBLANK |
Returns the number of blank cells in a range. |
COUNTIF |
Returns the number of cells in a range that meet a single specified criterion. |
COUNTIFS* |
Returns the number of cells in a range that meet one or more specified criteria. |
DCOUNT |
Counts the number of records in a worksheet database that meet specified criteria. |
DCOUNTA |
Counts the number of nonblank records in a worksheet database that meet specified criteria. |
DEVSQ |
Returns the sum of squares of deviations of data points from the sample mean; used primarily in statistical formulas. |
DSUM |
Returns the sum of a column of values in a worksheet database that meet specified criteria. |
FREQUENCY |
Calculates how often values occur within a range of values and returns a vertical array of numbers; used only in a multicell array formula. |
SUBTOTAL |
When used with a first argument of 2 or 3, returns a count of cells that make up a subtotal; when used with a first argument of 9, returns the sum of cells that make up a subtotal. Ignores other nested SUBTOTAL functions. |
SUM |
Returns the sum of its arguments. |
SUMIF |
Returns the sum of cells in a range that meet a specified criterion. |
SUMIFS* |
Returns the sum of the cells in a range that meet one or more specified criteria. |
SUMPRODUCT |
Multiplies corresponding cells in two or more ranges and returns the sum of those products. |
* These functions were introduced in Excel 2007.
Getting a quick count or sum
The Excel status bar can display useful information about the currently selected cells—no formulas required. Normally, the status bar displays the sum and count of the values in the selected range. You can, however, right-click the status bar to bring up a menu with other options. You can choose any or all the following: Average, Count, Numerical Count, Minimum, Maximum, and Sum.
Other Counting Methods
As is often the case, Excel provides more than one way to accomplish a task. This chapter deals with standard formulas to count and sum cells.
Other parts of this book cover additional methods that are used for counting and summing:
§ Filtering: If your data is in the form of a table, you can use AutoFilter to accomplish many counting and summing operations. Just set the AutoFilter criteria, and the table displays only the rows that match your criteria: The nonqualifying rows in the table are hidden. Then you can select formulas to display counts or sums in the table’s total row. See Chapter 9, “Working with Tables and Lists,” for more information on using tables.
§ Advanced filtering: Special database functions provide additional ways to achieve counting and summing. Excel’s DCOUNT and DSUM functions are database functions. They work in conjunction with a worksheet database and require a special criterion range that holds the counting or summing criteria. See Chapter 9 for more information.
§ Pivot tables: Creating a pivot table is a quick way to get a count or sum of items without using formulas. Using a pivot table is appropriate when your data is in the form of a worksheet database or table. See Chapter 18, “Pivot Tables,” for information about pivot tables.
Basic Counting Formulas
The basic counting formulas presented here are all straightforward and relatively simple. They demonstrate how to count the number of cells in a range that meet specific criteria. Figure 7.1 shows a worksheet that uses formulas (in column E) to summarize the contents of range A1:B10—a 20-cell range named Data.
Figure 7.1 Formulas provide various counts of the data in A1:B10.
On the Web
You can access the basic counting.xlsx workbook shown in Figure 7.1 at this book’s website.
About this chapter’s examples
Most of the examples in this chapter use named ranges for function arguments. When you adapt these formulas for your own use, you’ll need to substitute either the actual range address or a range name defined in your workbook.
Also, some examples are array formulas. An array formula, as explained in Chapter 14, “Introducing Arrays,” is a special type of formula. You can spot an array formula because it is enclosed in brackets when it is displayed in the Formula bar. For example:
{=Data*2}
When you enter an array formula, press Ctrl+Shift+Enter (not just Enter). You don’t need to type the brackets—Excel inserts the brackets for you. And if you need to edit an array formula, don’t forget to press Ctrl+Shift+Enter after you finish editing. Otherwise, the array formula will revert to a normal formula, and it will return an incorrect result.
Counting the total number of cells
To get a count of the total number of cells in a range, use the following formula. This formula returns the number of cells in a range named Data. It simply multiplies the number of rows (returned by the ROWS function) by the number of columns (returned by the COLUMNS function).
=ROWS(Data)*COLUMNS(Data)
Counting blank cells
The following formula returns the number of blank (empty) cells in a range named Data:
=COUNTBLANK(Data)
This function works only with a contiguous range of cells. If Data is defined as a noncontiguous range, the function returns a #VALUE! error.
The COUNTBLANK function also counts cells containing a formula that returns an empty string. For example, the formula that follows returns an empty string if the value in cell A1 is greater than 5. If the cell meets this condition, the COUNTBLANK function counts that cell:
=IF(A1>5,"",A1)
Note
The COUNTBLANK function does not count cells that contain a zero value, even if you clear the Show a Zero in Cells That Have Zero Value option in the Excel Options dialog box. (Choose File ➜ Options and navigate to the Display Options for This Worksheet section of the Advanced tab.)
You can use the COUNTBLANK function with an argument that consists of entire rows or columns. For example, this next formula returns the number of blank cells in column A:
=COUNTBLANK(A:A)
The following formula returns the number of empty cells on the entire worksheet named Sheet1. You must enter this formula on a sheet other than Sheet1, or it will create a circular reference.
=COUNTBLANK(Sheet1!1:1048576)
Counting nonblank cells
The following formula uses the COUNTA function to return the number of nonblank cells in a range named Data:
=COUNTA(Data)
The COUNTA function counts cells that contain values, text, or logical values (TRUE or FALSE).
Note
If a cell contains a formula that returns an empty string, that cell is included in the count returned by COUNTA even though the cell appears to be blank.
Counting numeric cells
To count only the numeric cells in a range, use the following formula, which assumes that the range is named Data:
=COUNT(Data)
Cells that contain a date or a time are considered to be numeric cells. Cells that contain a logical value (TRUE or FALSE) are not considered to be numeric cells.
Counting text cells
To count the number of text cells in a range, you need to use an array formula. The array formula that follows returns the number of text cells in a range named Data:
{=SUM(IF(ISTEXT(Data),1))}
Counting nontext cells
The following array formula uses Excel’s ISNONTEXT function, which returns TRUE if its argument refers to any nontext cell (including a blank cell). This formula returns the count of the number of cells not containing text (including blank cells):
{=SUM(IF(ISNONTEXT(Data),1))}
Counting logical values
The following array formula returns the number of logical values (TRUE or FALSE) in a range named Data:
{=SUM(IF(ISLOGICAL(Data),1))}
Counting error values in a range
Excel has three functions that help you determine whether a cell contains an error value:
§ ISERROR: Returns TRUE if the cell contains any error value (#N/A, #VALUE!, #REF!, #DIV/0!, #NUM!, #NAME?, or #NULL!)
§ ISERR: Returns TRUE if the cell contains any error value except #N/A
§ ISNA: Returns TRUE if the cell contains the #N/A error value
Note
Notice that the #N/A error value is treated separately. In most cases, #N/A is not a "real" error. #N/A is often used as a placeholder for missing data. You can enter the #N/A error value directly or use the NA function:
=NA()
You can use these functions in an array formula to count the number of error values in a range. The following array formula, for example, returns the total number of error values in a range named Data:
{=SUM(IF(ISERROR(Data),1))}
Depending on your needs, you can use the ISERR or ISNA function in place of ISERROR.
If you need to count specific types of errors, you can use the COUNTIF function. The following formula, for example, returns the number of #DIV/0! error values in the range named Data:
=COUNTIF(Data,"#DIV/0!")
Note that the COUNTIF function works only with a contiguous range argument. If Data is defined a noncontiguous range, the formula returns a #VALUE! error.
Advanced Counting Formulas
Most of the basic examples we presented previously use functions or formulas that perform conditional counting. The advanced counting formulas that we present here represent more complex examples for counting worksheet cells based on various types of selection criteria.
Counting cells with the COUNTIF function
Excel’s COUNTIF function is useful for single-criterion counting formulas. The COUNTIF function takes two arguments:
§ range: The range that contains the values that determine whether to include a particular cell in the count
§ criteria: The logical criteria that determine whether to include a particular cell in the count
Table 7.2 contains several examples of formulas that use the COUNTIF function. All these formulas work with a range named Data. As you can see, the criteria argument is quite flexible. You can use constants, expressions, functions, cell references, and even wildcard characters (* and ?).
Table 7.2 Examples of Formulas Using the COUNTIF Function
Function |
Description |
=COUNTIF(Data,12) |
Returns the number of cells containing the value 12 |
=COUNTIF(Data,"<0") |
Returns the number of cells containing a negative value |
=COUNTIF(Data,"<>0") |
Returns the number of cells not equal to 0 |
=COUNTIF(Data,">5") |
Returns the number of cells greater than 5 |
=COUNTIF(Data,A1) |
Returns the number of cells equal to the contents of cell A1 |
=COUNTIF(Data,">"&A1) |
Returns the number of cells greater than the value in cell A1 |
=COUNTIF(Data,"*") |
Returns the number of cells containing text |
=COUNTIF(Data,"???") |
Returns the number of text cells containing exactly three characters |
=COUNTIF(Data,"budget") |
Returns the number of cells containing the single word budget and nothing else (not case sensitive) |
=COUNTIF(Data,"*budget*") |
Returns the number of cells containing the text budget anywhere within the text (not case sensitive) |
=COUNTIF(Data,"A*") |
Returns the number of cells containing text that begins with the letter A (not case sensitive) |
=COUNTIF(Data,TODAY()) |
Returns the number of cells containing only the current date |
=COUNTIF(Data,">"&AVERAGE(Data)) |
Returns the number of cells with a value greater than the average |
=COUNTIF(Data,">"&AVERAGE(Data)+ STDEV(Data)*3) |
Returns the number of values exceeding three standard deviations above the mean |
=COUNTIF(Data,3)+COUNTIF(Data,-3) |
Returns the number of cells containing the value 3 or –3 |
=COUNTIF(Data,TRUE) |
Returns the number of cells containing or returning logical TRUE |
=COUNTIF(Data,TRUE)+COUNTIF (Data,FALSE) |
Returns the number of cells containing or returning a logical value (TRUE or FALSE) |
=COUNTIF(Data,"#N/A") |
Returns the number of cells containing the #N/A error value |
Counting cells that meet multiple criteria
In many cases, your counting formula will need to count cells only if two or more criteria are met. These criteria can be based on the cells that are being counted or based on a range of corresponding cells.
Figure 7.2 shows a simple worksheet that we use for the examples in this section. This sheet shows sales figures (Amount) categorized by Month, SalesRep, and Type. The worksheet contains named ranges that correspond to the labels in row 1.
Figure 7.2 This worksheet demonstrates various counting techniques that use multiple criteria.
On the Web
The workbook multiple criteria counting.xlsx is available at this book’s website.
Note
Several of the examples in this section use the COUNTIFS function, which was introduced in Excel 2007. We also present alternative versions of the formulas, which you should use if you plan to share your workbook with others who use a version prior to Excel 2007.
Using And criteria
An And criterion counts cells if all specified conditions are met. A common example is a formula that counts the number of values that fall within a numerical range. For example, you may want to count cells that contain a value greater than 100 and less than or equal to 200. For this example, the COUNTIFS function will do the job:
=COUNTIFS(Amount,">100",Amount,"<=200")
The COUNTIFS function accepts any number of paired arguments. The first member of the pair is the range to be counted (in this case, the range named Amount); the second member of the pair is the criterion. The preceding example contains two sets of paired arguments and returns the number of cells in which Amount is greater than 100 and less than or equal to 200.
Prior to Excel 2007, you would need to use a formula like this:
=COUNTIF(Amount,">100")-COUNTIF(Amount,">200")
The preceding formula counts the number of values that are greater than 100 and then subtracts the number of values that are greater than 200. The result is the number of cells that contain a value greater than 100 and less than or equal to 200.
Creating this type of formula can be confusing because the formula refers to a condition ">200" even though the goal is to count values that are less than or equal to 200. An alternate technique is to use an array formula, such as the one that follows. You may find creating this type of formula easier:
{=SUM((Amount>100)*(Amount<=200))}
Note
When you enter an array formula, remember to use Ctrl+Shift+Enter—and don’t type the curly brackets.
Sometimes, the counting criteria will be based on cells other than ones being counted. You may, for example, want to count the number of sales that meet the following criteria:
§ Month is January, and
§ SalesRep is Brooks, and
§ Amount is greater than 1,000
The following formula returns the number of items that meet all three criteria. Note that the COUNTIFS function uses three sets of pairs of arguments:
=COUNTIFS(Month,"January",SalesRep,"Brooks",Amount,">1000")
An alternative formula, which works with versions prior to Excel 2007, uses the SUMPRODUCT function. The following formula returns the same result as the previous formula:
=SUMPRODUCT((Month="January")*(SalesRep="Brooks")*(Amount>1000))
Yet another way to perform this count is to use an array formula:
{=SUM((Month="January")*(SalesRep="Brooks")*(Amount>1000))}
Using Or criteria
To count cells by using an Or criterion, you can sometimes use multiple COUNTIF functions. The following formula, for example, counts the number of sales made in January or February:
=COUNTIF(Month,"January")+COUNTIF(Month,"February")
You can also use the COUNTIF function in an array formula. The following array formula, for example, returns the same result as the previous formula:
{=SUM(COUNTIF(Month,{"January","February"}))}
But if you base your Or criteria on cells other than the ones being counted, the COUNTIF function won’t work. Referring to Figure 7.2, suppose that you want to count the number of sales that meet at least one of the following criteria:
§ Month is January, or
§ SalesRep is Brooks, or
§ Amount is greater than 1,000
If you attempt to create a formula that uses COUNTIF, some double counting will occur. The solution is to use an array formula like this:
{=SUM(IF((Month="January")+(SalesRep="Brooks")+(Amount>1000),1))}
Combining And and Or criteria
In some cases, you may need to combine And and Or criteria when counting. For example, perhaps you want to count sales that meet both of the following criteria:
§ Month is January, and
§ SalesRep is Brooks, or SalesRep is Cook
You can add two COUNTIFS functions to get the desired result:
=COUNTIFS(Month,"January",SalesRep,"Brooks")+
COUNTIFS(Month,"January",SalesRep,"Cook")
Because you have to repeat the And portion of the criteria in each function’s arguments, using COUNTIFS can produce long formulas with more criteria. When you have a lot of criteria, it makes sense to use an array formula, like this one that produces the same result:
{=SUM((Month="January")*((SalesRep="Brooks")+(SalesRep="Cook")))}
Counting the most frequently occurring entry
Excel’s MODE function returns the most frequently occurring value in a range. Figure 7.3 shows a worksheet with values in range A1:A9 (named Data). The formula that follows returns 10 because that value appears most frequently in the Data range:
Figure 7.3 The MODE function returns the most frequently occurring value in a range.
=MODE(Data)
The formula returns an #N/A error if the Data range contains no duplicated values.
To count the number of times the most frequently occurring value appears in the range—in other words, the frequency of the mode—use the following formula:
=COUNTIF(Data,MODE(Data))
This formula returns 3 because the modal value (10) appears three times in the Data range.
The MODE function works only for numeric values, and it ignores cells that contain text. To find the most frequently occurring text entry in a range, you need to use an array formula.
To count the number of times the most frequently occurring item (text or values) appears in a range named Data, use the following array formula:
{=MAX(COUNTIF(Data,Data))}
This next array formula operates like the MODE function except that it works with both text and values:
{=INDEX(Data,MATCH(MAX(COUNTIF(Data,Data)),COUNTIF(Data,Data),0))}
Warning
If there is a tie for the most frequent value, the preceding formula returns only the first in the list.
Counting the occurrences of specific text
The examples in this section demonstrate various ways to count the occurrences of a character or text string in a range of cells. Figure 7.4 shows a worksheet that demonstrates these examples. Various text appears in the range A1:A10 (named Data); cell B1 is namedText.
Figure 7.4 This worksheet demonstrates various ways to count characters in a range.
On the Web
This book’s website contains a workbook named counting text in a range.xlsx that demonstrates the formulas in this section.
Entire cell contents
To count the number of cells containing the contents of the Text cell (and nothing else), you can use the COUNTIF function. The following formula demonstrates:
=COUNTIF(Data,Text)
For example, if the Text cell contains the string Alpha, the formula returns 2 because two cells in the Data range contain this text. This formula is not case sensitive, so it counts both Alpha (cell A2) and alpha (cell A10). Note, however, that it does not count the cell that contains Alpha Beta (cell A8).
The following array formula is similar to the preceding formula, but this one is case sensitive:
{=SUM(IF(EXACT(Data,Text),1))}
Partial cell contents
To count the number of cells that contain a string that includes the contents of the Text cell, use this formula:
=COUNTIF(Data,"*"&Text&"*")
For example, if the Text cell contains the text Alpha, the formula returns 3 because three cells in the Data range contain the text alpha (cells A2, A8, and A10). Note that the comparison is not case sensitive.
An alternative is a longer array formula that uses the SEARCH function:
{=SUM(IF(NOT(ISERROR(SEARCH(text,data))),1))}
The SEARCH function returns an error if Text is not found in Data. The preceding formula counts one for every cell where SEARCH does not find an error. Because SEARCH is not case sensitive, neither is this formula.
If you need a case-sensitive count, you can use the following array formula:
{=SUM(IF(LEN(Data)-LEN(SUBSTITUTE(Data,Text,""))>0,1))}
If the Text cell contains the text Alpha, the preceding formula returns 2 because the string appears in two cells (A2 and A8).
Like the SEARCH function, the FIND function returns an error if Text is not found in Data, as in this alternative array formula:
{=SUM(IF(NOT(ISERROR(FIND(text,data))),1))}
Unlike SEARCH, the FIND function is case sensitive.
Total occurrences in a range
To count the total number of occurrences of a string within a range of cells, use the following array formula:
{=(SUM(LEN(Data))-SUM(LEN(SUBSTITUTE(Data,Text,""))))/LEN(Text)}
If the Text cell contains the character B, the formula returns 7 because the range contains seven instances of the string. This formula is case sensitive.
The following array formula is a modified version that is not case sensitive:
{=(SUM(LEN(Data))-SUM(LEN(SUBSTITUTE(UPPER(Data),UPPER(Text),""))))/LEN(Text)}
Counting the number of unique values
The following array formula returns the number of unique values in a range named Data:
{=SUM(1/COUNTIF(Data,Data))}
To understand how this formula works, you need a basic understanding of array formulas. (See Chapter 14 for an introduction to this topic.) In Figure 7.5, range A1:A12 is named Data. Range C1:C12 contains the following multicell array formula. A single formula was entered into all 12 cells in the range:
{=COUNTIF(Data,Data)}
Figure 7.5 Using an array formula to count the number of unique values in a range.
On the Web
You can access the workbook count unique.xlsx shown in Figure 7.5 at this book’s website.
The array in range C1:C12 consists of the count of each value in Data. For example, the number 125 appears three times, so each array element that corresponds to a value of 125 in the Data range has a value of 3.
Range D1:D12 contains the following array formula:
{=1/C1:C12}
This array consists of each value in the array in range C1:C12, divided into 1. For example, each cell in the original Data range that contains a 200 has a value of 0.5 in the corresponding cell in D1:D12.
Summing the range D1:D12 gives the number of unique items in Data. The array formula presented at the beginning of this section essentially creates the array that occupies D1:D12 and sums the values.
This formula has a serious limitation: if the range contains any blank cells, it returns an error. The following array formula solves this problem:
{=SUM(IF(COUNTIF(Data,Data)=0,"",1/COUNTIF(Data,Data)))}
Cross-Ref
To create an array formula that returns a list of unique items in a range, see Chapter 15, “Performing Magic with Array Formulas.”
Creating a frequency distribution
A frequency distribution basically comprises a summary table that shows the frequency of each value in a range. For example, an instructor may create a frequency distribution of test scores. The table would show the count of As, Bs, Cs, and so on. Excel provides a number of ways to create frequency distributions. You can
§ Use the FREQUENCY function.
§ Create your own formulas.
§ Use the Analysis ToolPak add-in.
§ Use a pivot table.
On the Web
The frequency distribution.xlsx workbook that demonstrates these four techniques is available at this book’s website.
The FREQUENCY function
The first method that we discuss uses the FREQUENCY function. This function always returns an array, so you must use it in an array formula entered into a multicell range.
Figure 7.6 shows some data in range A1:E25 (named Data). These values range from 1 to 500. The range G2:G11 contains the bins used for the frequency distribution. Each cell in this bin range contains the upper limit for the bin. In this case, the bins consist of <=50, 51–100, 101–150, and so on. See the sidebar, “Creating bins for a frequency distribution,” to discover an easy way to create a bin range.
Figure 7.6 Creating a frequency distribution for the data in A1:E25.
To create the frequency distribution, select a range of cells that corresponds to the number of cells in the bin range. Then enter the following array formula in H2:H11:
{=FREQUENCY(Data,G2:G11)}
The array formula enters the count of values in the Data range that fall into each bin. To create a frequency distribution that consists of percentages, use the following array formula:
{=FREQUENCY(Data,G2:G11)/COUNT(Data)}
Figure 7.7 shows two frequency distributions—one in terms of counts, and one in terms of percentages. The figure also shows a chart (histogram) created from the frequency distribution.
Figure 7.7 Frequency distributions created using the FREQUENCY function.
Creating bins for a frequency distribution
When creating a frequency distribution, you must first enter the values into the bin range. The number of bins determines the number of categories in the distribution. Most of the time, each of these bins will represent an equal range of values.
To create 10 evenly spaced bins for values in a range named Data, enter the following array formula into a range of 10 cells in a column:
{=MIN(Data)+(ROW(INDIRECT("1:10"))*(MAX(Data)-MIN(Data)+1)/10)-1}
This formula creates 10 bins, based on the values in the Data range. The upper bin will always equal the maximum value in the range.
To create more or fewer bins, use a value other than 10 and enter the array formula into a range that contains the same number of cells. For example, to create 5 bins, enter the following array formula into a 5-cell vertical range:
{=MIN(Data)+(ROW(INDIRECT("1:5"))*(MAX(Data)-MIN(Data)+1)/5)-1}
Using formulas to create a frequency distribution
Figure 7.8 shows a worksheet that contains test scores for 50 students in column B. (The range is named Grades.) Formulas in columns G and H calculate a frequency distribution for letter grades. The minimum and maximum values for each letter grade appear in columns D and E. For example, a test score between 80 and 89 (inclusive) qualifies for a B.
Figure 7.8 Creating a frequency distribution of test scores.
The formula in cell G2 that follows is an array formula that counts the number of scores that qualify for an A:
{=SUM((Grades>=D2)*(Grades<=E2))}
You may recognize this formula from a previous section in this chapter. (See “Counting cells that meet multiple criteria.”) This formula was copied to the four cells below G2.
The formulas in column H calculate the percentage of scores for each letter grade. The formula in H2, which was copied to the four cells below H2, is
=G2/SUM($G$2:$G$6)
Using the Analysis ToolPak to create a frequency distribution
After you install the Analysis ToolPak add-in, you can use the Histogram option to create a frequency distribution. Start by entering your bin values in a range. Then choose Data ➜ Analysis ➜ Data Analysis to display the Data Analysis dialog box. Next, select Histogram and click OK. You should see the Histogram dialog box shown in Figure 7.9.
Figure 7.9 The Analysis ToolPak’s Histogram dialog box.
Is the Analysis ToolPak installed?
To make sure that the Analysis ToolPak add-in is installed, click the Data tab. If the Ribbon displays the Data Analysis command in the Analysis group, you’re all set. If not, you’ll need to install the add-in:
1. Choose File ➜ Options to display the Excel Options dialog box.
2. Click the Add-Ins tab on the left.
3. Select Excel Add-Ins from the Manage drop-down list.
4. Click Go to display the Add-Ins dialog box.
5. Place a check mark next to Analysis ToolPak.
6. Click OK.
Note: In the Add-Ins dialog box, you see an additional add-in, Analysis ToolPak - VBA. This add-in is for a programmer, and you don’t need to install it.
Specify the ranges for your data (Input Range), bins (Bin Range), and results (Output Range), and then select any options. Figure 7.10 shows a frequency distribution (and chart) created with the Histogram option.
Figure 7.10 A frequency distribution and chart generated by the Analysis ToolPak’s Histogram option.
Warning
Note that the frequency distribution consists of values, not formulas. Therefore, if you make any changes to your input data, you need to rerun the Histogram procedure to update the results.
Using a pivot table to create a frequency distribution
If your data is in the form of a table, you may prefer to use a pivot table to create a histogram. Figure 7.11 shows the student grade data summarized in a pivot table and a pivot chart.
Cross-Ref
We cover pivot tables in Chapter 18.
Figure 7.11 Summarizing grades with a pivot table and pivot chart.
Using adjustable bins to create a histogram
Figure 7.12 shows a worksheet with student grades listed in column B (67 students total). Columns D and E contain formulas that calculate the upper and lower limits for bins, based on the entry in cell E1 (named BinSize). For example, if BinSize is 12 (as in the figure), then each bin contains 12 scores (1–12, 13–24, and so on).
Figure 7.12 The chart displays a histogram; the contents of cell E1 determine the number of categories.
On the Web
The workbook adjustable bins.xlsx, shown in Figure 7.12, is available at this book’s website.
The chart uses two dynamic names in its SERIES formula. You can define the name Categories with the following formula:
=OFFSET(Sheet1!$E$4,0,0,ROUNDUP(100/BinSize,0))
You can define the name Frequencies with this formula:
=OFFSET(Sheet1!$F$4,0,0,ROUNDUP(100/BinSize,0))
The net effect is that the chart adjusts automatically when you change the BinSize cell.
Cross-Ref
See Chapter 17, “Charting Techniques,” for more about creating a chart that uses dynamic names in its SERIES formula.
Summing Formulas
The examples in this section demonstrate how to perform common summing tasks by using formulas. The formulas range from very simple to relatively complex array formulas that compute sums of cells that match multiple criteria.
Summing all cells in a range
It doesn’t get much simpler than this. The following formula returns the sum of all values in a range named Data:
=SUM(Data)
The SUM function can take up to 255 arguments. The following formula, for example, returns the sum of the values in five noncontiguous ranges:
=SUM(A1:A9,C1:C9,E1:E9,G1:G9,I1:I9)
You can use complete rows or columns as an argument for the SUM function. The formula that follows, for example, returns the sum of all values in column A. If this formula appears in a cell in column A, it generates a circular reference error.
=SUM(A:A)
The following formula returns the sum of all values on Sheet1. To avoid a circular reference error, this formula must appear on a sheet other than Sheet1.
=SUM(Sheet1!1:1048576)
The SUM function is versatile. The arguments can be numerical values, cells, ranges, text representations of numbers (which are interpreted as values), logical values, array constants, and even embedded functions. For example, consider the following formula:
=SUM(B1,5,"6",,SQRT(4),{1,2,3},A1:A5,TRUE)
This formula, which is a perfectly valid formula, contains all the following types of arguments, listed here in the order of their presentation:
§ A single cell reference
§ A literal value
§ A string that looks like a value
§ A missing argument
§ An expression that uses another function
§ An array constant
§ A range reference
§ A logical TRUE value
Warning
The SUM function is versatile, but it’s also inconsistent when you use logical values (TRUE or FALSE). Logical values stored in cells are always treated as 0. But logical TRUE, when used as an argument in the SUM function, is treated as 1.
Summing a range that contains errors
The SUM function does not work if the range to be summed includes errors. For example, if one of the cells to be summed displays #N/A, the SUM function will also return #N/A.
To add the values in a range and ignore the error cells, use the AGGREGATE function. For example, to sum a range named Data (which may have error values), use this formula:
=AGGREGATE(9,6,Data)
The AGGREGATE function is versatile and can do a lot more than just add values. In this example, the first argument (9) specifies SUM. The second argument (6) means to ignore error values.
The arguments are described in the Excel Help. Excel also provides good autocomplete assistance when you enter a formula that uses this function.
Note
The AGGREGATE function was introduced in Excel 2010. For compatibility with earlier versions, use this array formula:
{=SUM(IF(ISERROR(Data),"",Data))}
Computing a cumulative sum
You may want to display a cumulative sum of values in a range—sometimes known as a running total. Figure 7.13 illustrates a cumulative sum. Column B shows the monthly amounts, and column C displays the cumulative (year-to-date) totals.
Figure 7.13 Simple formulas in column C display a cumulative sum of the values in column B.
The formula in cell C2 is
=SUM(B$2:B2)
Notice that this formula uses a mixed reference. The first cell in the range reference always refers to the same row: in this case, row 2. When this formula is copied down the column, the range argument adjusts so that the sum always starts with row 2 and ends with the current row. For example, after copying this formula down column C, the formula in cell C8 is
=SUM(B$2:B8)
You can use an IF function to hide the cumulative sums for rows in which data hasn’t been entered. The following formula, entered in cell C2 and copied down the column, is
=IF(ISBLANK(B2),"",SUM(B$2:B2))
Figure 7.14 shows this formula at work.
Figure 7.14 Using an IF function to hide cumulative sums for missing data.
On the Web
The workbook cumulative sum.xlsx is available at this book’s website.
Summing the “top n” values
In some situations, you may need to sum the n largest values in a range—for example, the top 10 values. If your data resides in a table, you can use AutoFilter to hide all but the top n rows and then display the sum of the visible data in the table’s Total row.
Another approach is to sort the range in descending order and then use the SUM function with an argument consisting of the first n values in the sorted range.
A better solution—which doesn’t require a table or sorting—uses an array formula like this one:
{=SUM(LARGE(Data,{1,2,3,4,5,6,7,8,9,10}))}
This formula sums the 10 largest values in a range named Data. To sum the 10 smallest values, use the SMALL function instead of the LARGE function:
{=SUM(SMALL(Data,{1,2,3,4,5,6,7,8,9,10}))}
These formulas use an array constant comprising the arguments for the LARGE or SMALL function. If the value of n for your top-n calculation is large, you may prefer to use the following variation. This formula returns the sum of the top 30 values in the Datarange. You can, of course, substitute a different value for 30. Figure 7.15 shows this array formula in use:
{=SUM(LARGE(Data,ROW(INDIRECT("1:30"))))}
Cross-Ref
See Chapter 14 for more information about array constants.
Figure 7.15 Using an array formula to calculate the sum of the 30 largest values in a range.
Conditional Sums Using a Single Criterion
Often, you need to calculate a conditional sum. With a conditional sum, values in a range that meet one or more conditions are included in the sum. This section presents examples of conditional summing using a single criterion.
The SUMIF function is useful for single-criterion sum formulas. The SUMIF function takes three arguments:
§ range: The range containing the values that determine whether to include a particular cell in the sum.
§ criteria: An expression that determines whether to include a particular cell in the sum.
§ sum_range: (Optional) The range that contains the cells you want to sum. If you omit this argument, the function uses the range specified in the first argument.
The examples that follow demonstrate the use of the SUMIF function. These formulas are based on the worksheet shown in Figure 7.16, set up to track invoices. Column F contains a formula that subtracts the date in column E from the date in column D. A negative number in column F indicates a past-due payment. The worksheet uses named ranges that correspond to the labels in row 1. Various summing formulas begin in row 15.
Figure 7.16 A negative value in column F indicates a past-due payment.
On the Web
All the examples in this section also appear at this book’s website in the file named conditional summing.xlsx.
Summing only negative values
The following formula returns the sum of the negative values in column F. In other words, it returns the total number of past-due days for all invoices. For this worksheet, the formula returns –63:
=SUMIF(Difference,"<0")
Because you omit the third argument, the second argument ("<0") applies to the values in the Difference range.
You do not need to hard-code the arguments for the SUMIF function into your formula. For example, you can create a formula such as the following, which gets the criteria argument from the contents of cell G2:
=SUMIF(Difference,G2)
This formula returns a new result if you change the criteria in cell G2.
Note
You can also use the following array formula to sum the negative values in the Difference range:
{=SUM(IF(Difference<0,Difference))}
Summing values based on a different range
The following formula returns the sum of the past-due invoice amounts (see column C in Figure 7.16):
=SUMIF(Difference,"<0",Amount)
This formula uses the values in the Difference range to determine whether the corresponding values in the Amount range contribute to the sum.
Note
You can also use the following array formula to return the sum of the values in the Amount range, where the corresponding value in the Difference range is negative:
{=SUM(IF(Difference<0,Amount))}
Summing values based on a text comparison
The following formula returns the total invoice amounts for the Oregon office:
=SUMIF(Office,"=Oregon",Amount)
Using the equal sign is optional. The following formula has the same result:
=SUMIF(Office,"Oregon",Amount)
To sum the invoice amounts for all offices except Oregon, use this formula:
=SUMIF(Office,"<>Oregon",Amount)
Text comparisons are not case sensitive.
Summing values based on a date comparison
The following formula returns the total invoice amounts that have a due date after May 1, 2013:
=SUMIF(DateDue,">="&DATE(2013,5,1),Amount)
Notice that the second argument for the SUMIF function is an expression. The expression uses the DATE function, which returns a date. Also, the comparison operator, enclosed in quotation marks, is concatenated (using the & operator) with the result of the DATE function.
The formula that follows returns the total invoice amounts that have a future due date (including today):
=SUMIF(DateDue,">="&TODAY(),Amount)
Conditional Sums Using Multiple Criteria
All the examples in the preceding section use a single comparison criterion. The examples in this section involve summing cells based on multiple criteria.
Figure 7.17 shows the sample worksheet again, for your reference. The worksheet also shows the result of several formulas that demonstrate summing by using multiple criteria.
Figure 7.17 This worksheet demonstrates summing based on multiple criteria.
The SUMIFS function (introduced in Excel 2007) can be used to sum a range when multiple conditions are met. The first argument of SUMIFS is the range to be summed. The remaining arguments are 1 to 127 range/criterion pairs that determine which values in the sum range are included. In the following examples, alternatives to SUMIFS are presented for those workbooks that are required to work in versions prior to 2007.
Using And criteria
Suppose you want to get a sum of both the invoice amounts that are past due as well as associated with the Oregon office. In other words, the value in the Amount range will be summed only if both of the following criteria are met:
§ The corresponding value in the Difference range is negative.
§ The corresponding text in the Office range is Oregon.
The SUMIFS function was designed for just this task:
=SUMIFS(Amount,Difference,"<0",Office,"Oregon")
For use with versions prior to Excel 2007, the following array formula also does the job:
{=SUM((Difference<0)*(Office="Oregon")*Amount)}
This formula creates two new arrays (in memory):
§ A Boolean array that consists of TRUE if the corresponding Difference value is less than zero; FALSE otherwise
§ A Boolean array that consists of TRUE if the corresponding Office value equals Oregon; FALSE otherwise
Multiplying Boolean values result in the following:
§ TRUE * TRUE = 1
§ TRUE * FALSE = 0
§ FALSE * FALSE = 0
Therefore, the corresponding Amount value returns nonzero only if both the corresponding values in the memory arrays are TRUE. The result produces a sum of the Amount values that meet the specified criteria.
Note
You may think that you can rewrite the previous array function as follows, using the SUMPRODUCT function to perform the multiplication and addition:
=SUMPRODUCT((Difference<0),(Office="Oregon"),Amount)
For some reason, the SUMPRODUCT function does not handle Boolean values properly, so the formula does not work. The following formula, which multiplies the Boolean values by 1, does work:
=SUMPRODUCT(1*(Difference<0),1*(Office="Oregon"),Amount)
Using Or criteria
Suppose you want to get a sum of past-due invoice amounts, or ones associated with the Oregon office. In other words, the value in the Amount range will be summed if either of the following criteria is met:
§ The corresponding value in the Difference range is negative.
§ The corresponding text in the Office range is Oregon.
The following array formula does the job:
{=SUM(IF((Office="Oregon")+(Difference<0),1,0)*Amount)}
A plus sign (+) joins the conditions; you can include more than two conditions.
Using And and Or criteria
As you might expect, things get a bit tricky when your criteria consists of both And and Or operations. For example, you may want to sum the values in the Amount range when both of the following conditions are met:
§ The corresponding value in the Difference range is negative.
§ The corresponding text in the Office range is Oregon or California.
Notice that the second condition actually consists of two conditions, joined with Or. Using multiple SUMIFS can accomplish this:
=SUMIFS(Amount,Difference,"<0",Office,"Oregon")
+SUMIFS(Amount,Difference,"<0",Office,"California")
The following array formula also does the trick:
{=SUM((Difference<0)*((Office="Oregon")+(Office="California"))*(Amount))}