Excel 2016 All-in-One For Dummies (2016)
Book VI
Data Management
Chapter 2
Filtering and Querying a Data List
In This Chapter
Understanding how to filter and query a data list
Using AutoFilter to filter out unwanted data
Filtering a list with custom criteria
Filtering a list on font color, fill color, or cell icons
Using Database functions to compute statistics from records that match your filter criteria
Performing external data queries with text files, web pages, and data files kept in other database sources
It’s one thing to set up a data list and load it with tons of data and quite another to get just the information that you need out of the list. How you go about extracting the data that’s important to you is the subject of this chapter. The procedure for specifying the data that you want displayed in an Excel data list is called filtering the data list or database. The procedure for extracting only the data that you want from the database or data list is called querying the database.
In addition to helping you with filtering and querying the data in your list, this chapter explains how you can use Excel’s Database functions to perform calculations on particular numerical fields for only the records that meet the criteria that you specify. These calculations can include getting totals (DSUM), averages (DAVERAGE), the count of the records (DCOUNT and DCOUNTA), and the like.
Finally, this chapter introduces you to querying external databases in order to bring some of their data into the more familiar worksheet setting. These external databases that you query in Excel can be in other Windows database programs, such as Microsoft Access 2016 or in even more sophisticated, server-based database-management systems, such as those provided by SQL Analysis Services, Microsoft Windows Azure Marketplace, and OData Data connections.
Data List Filtering 101
If you ever have the good fortune to attend my class on database management, you’ll hear my spiel on the difference between data and information in the tables in a database (or data list, in Excel-speak). In case you’re the least little bit interested, it goes like this: A data list or the tables that make up a database consist of a vast quantity of raw data, which simply represents all the stuff that everybody in the company wants stored on a given subject (employees, sales, clients, you name it). For example, suppose that you keep a data list on the sales transactions made by your customers. This data list can very well track such stuff as the customers’ identification numbers, names, addresses, telephone numbers, whether they have a charge account with the store, the maximum amount that they can charge, the purchases that they’ve made (including the dates and amounts), and whether their accounts are due (or overdue).
However, this vast quantity of data stored in the customer data list is not to be confused with the information that particular people in the office want out of the data. For example, suppose that you’re working in the marketing department and you’re about to introduce a line of expensive household items that you want to advertise. You want to limit the advertising to those customers who have a charge account with the store and have purchased at least $5,000 of merchandise in the last six months. Use the data provided in the data list to supply that information by selecting only those customers out of the entire data list.
On the other hand, suppose that you work in the accounting department and you need to send out nasty notices to all the customers who have charge accounts that are more than 90 days past due. In this case, you want only the data identifying those customers whose accounts are overdue. You couldn’t care less about what was actually purchased. All you care about is reaching these folks and convincing them to pay up. You again use the data provided in the data list to supply the information to select only those customers that you need from the data list.
From these simple examples, it should be clear that the data that supplies information to one group in the company at a particular time is often not the same data that supplies information to another group. In other words, for most people, the data list dispenses information only when you are able to filter out the stuff that you currently don’t want to see, and leaves behind just the stuff that interests you.
Filtering Data
Filtering the data list to leave behind only the information that you want to work with is exactly the procedure that you follow in Excel. At the most basic level, you use the AutoFilter feature to temporarily hide the display of unwanted records and leave behind only the records that you want to see. Much of the time, the capabilities of the AutoFilter feature are all that you need, especially when your main concern is simply displaying just the information of interest in the data list.
You will encounter situations, however, in which the AutoFilter feature is not sufficient, and you must do what Microsoft refers to as advanced filtering in your data list. You need to use advanced filtering to filter the data list when you use computed criteria (such as when you want to see all the records where the entry in the Sales column is twice the amount in the Owed column) and when you need to save a copy of the filtered data in a different part of the worksheet (Excel’s version of querying the data in a data list).
Using AutoFilter
Excel’s AutoFilter feature makes filtering out unwanted data in a data list as easy as clicking the AutoFilter button on the column on which you want to filter the data and then choosing the appropriate filtering criteria from that column’s drop-down menu.
If you open a worksheet with a data list and you don’t find AutoFilter buttons attached to each of the field names at the top of the list, you can display them simply by positioning the cell pointer in one of the cells with the field names and then clicking the Filter command button on the Ribbon’s Data tab or pressing Ctrl+Shift+L or Alt+AT.
The filter options on a column’s AutoFilter drop-down menu depend on the type of entries in the field. On the drop-down menu in a column that contains only date entries, the menu contains a Date Filters option to which a submenu of the actual filters is attached. On the drop-down menu in a column that contains only numeric entries (besides dates) or a mixture of dates with other types of numeric entries, the menu contains a Number Filters option. On the drop-down menu in a column that contains only text entries or a mixture of text, date, and other numeric entries, the menu contains a Text Filters option.
Doing basic filtering by selecting specific field entries
In addition to the Date Filters, Text Filters, or Number Filters options (depending on the type of field), the AutoFilter drop-down menu for each field in the data list contains a list box with a complete listing of all entries made in that column, each with its own check box. At the most basic level, you can filter the data list by clearing the check box for all the entries whose records you don’t want to see in the list.
This kind of basic filtering works best in fields such as City, State, or Country, which contain many duplicates, so you can see a subset of the data list that contains only the cities, states, or countries you want to work with at the time.
The easiest way to perform this basic type of filtering on a field is to first deselect the check box in front of the (Select All) option at the top of the field’s list box to clear the check boxes, and then select each of the check boxes containing the entries for the records you do want displayed in the filtered data list. After you finish selecting the check boxes for all the entries you want to keep, you click OK to close the AutoFilter drop-down menu.
Excel then hides rows in the data list for all records except for those that contain the entries you just selected. The program also lets you know which field or fields have been used in the filtering operation by adding a cone filter icon to the column’s AutoFilter button. To restore all the records to the data list, you can remove the filtering by clicking the Clear command button in the Sort & Filter group of the Data tab of the Ribbon or by pressing Alt+AC.
When doing this basic kind of list filtering, you can select specific entries from more than one field in this list. Figure 2-1 illustrates this kind of situation. Here, I want only the employees in the company who work in the Engineering and Information Services departments in the Chicago and Seattle offices. To do this, I selected only the Engineering and Information Services entries in the list box on the Dept field’s AutoFilter drop-down menu and only the Chicago and Seattle entries in the list box on the Location field’s AutoFilter drop-down menu.
Figure 2-1: The employee data list after filtering the Dept and Location fields.
As you can see in Figure 2-1, after filtering the Employee data list so that only the records for employees in either the Engineering or Information Services department in either the Chicago or Seattle office locations are listed, Excel adds the cone filter icon to the AutoFilter buttons on both the Dept and Location fields in the top row, indicating that the list is filtered using criteria involving both fields.
Keep in mind that after filtering the data list in this manner, you can then copy remaining records that make up the desired subset of the data list to a new area in the same worksheet or to a new sheet in the workbook. You can then sort the data (by adding AutoFilter buttons with the Filter command button on the Data tab), chart the data (see Book V, Chapter 1), analyze the data (see “Using the Database Functions” later in this chapter), or summarize the data in a pivot table (covered in Book VII, Chapter 2).
Using the Text Filters options
The AutoFilter drop-down menu for a field that contains only text or a combination of text, date, and numeric entries contains a Text Filters option that when you click or highlight displays its submenu containing the following options:
· Equals: Opens the Custom AutoFilter dialog box with the Equals operator selected in the first condition.
· Does Not Equal: Opens the Custom AutoFilter dialog box with the Does Not Equal operator selected in the first condition.
· Begins With: Opens the Custom AutoFilter dialog box with the Begins With operator selected in the first condition.
· Ends With: Opens the Custom AutoFilter dialog box with the Ends With operator selected in the first condition.
· Contains: Opens the Custom AutoFilter dialog box with the Contains operator selected in the first condition.
· Does Not Contain: Opens the Custom AutoFilter dialog box with the Does Not Contain operator selected in the first condition.
· Custom Filter: Opens the Custom AutoFilter dialog box where you can select your own criteria for applying more complex AND or OR conditions.
Using the Date Filters options
The AutoFilter drop-down menu for a field that contains only date entries contains a Date Filters option that when you click or highlight displays its submenu containing the following options:
· Equals: Opens the Custom AutoFilter dialog box with the Equals operator selected in the first condition.
· Is Before: Opens the Custom AutoFilter dialog box with the Is Before operator selected in the first condition.
· Is After: Opens the Custom AutoFilter dialog box with the Is After operator selected in the first condition.
· Between: Opens the Custom AutoFilter dialog box with the Is After or Equal To operator selected in the first condition and the Is Before or Equal To operator selected in the second AND condition.
· Tomorrow: Filters the data list so that only records with tomorrow’s date in this field are displayed in the worksheet.
· Today: Filters the data list so that only records with the current date in this field are displayed in the worksheet.
· Yesterday: Filters the data list so that only records with yesterday’s date in this field are displayed in the worksheet.
· Next Week: Filters the data list so that only records with date entries in the week ahead in this field are displayed in the worksheet.
· This Week: Filters the data list so that only records with date entries in the current week in this field are displayed in the worksheet.
· Last Week: Filters the data list so that only records with date entries in the previous week in this field are displayed in the worksheet.
· Next Month: Filters the data list so that only records with date entries in the month ahead in this field are displayed in the worksheet.
· This Month: Filters the data list so that only records with date entries in the current month in this field are displayed in the worksheet.
· Last Month: Filters the data list so that only records with date entries in the previous month in this field are displayed in the worksheet.
· Next Quarter: Filters the data list so that only records with date entries in the three-month quarterly period ahead in this field are displayed in the worksheet.
· This Quarter: Filters the data list so that only records with date entries in the current three-month quarterly period in this field are displayed in the worksheet.
· Last Quarter: Filters the data list so that only records with date entries in the previous three-month quarterly period in this field are displayed in the worksheet.
· Next Year: Filters the data list so that only records with date entries in the calendar year ahead in this field are displayed in the worksheet.
· This Year: Filters the data list so that only records with date entries in the current calendar year in this field are displayed in the worksheet.
· Last Year: Filters the data list so that only records with date entries in the previous calendar year in this field are displayed in the worksheet.
· Year to Date: Filters the data list so that only records with date entries in the current year up to the current date in this field are displayed in the worksheet.
· All Dates in the Period: Filters the data list so that only records with date entries in the quarter (Quarter 1 through Quarter 4) or month (January through December) that you choose from its submenu are displayed in the worksheet.
· Custom Filter: Opens the Custom AutoFilter dialog box where you can select your own criteria for more complex AND or OR conditions.
When selecting dates for conditions using the Equals, Is Before, Is After, Is Before or Equal To, or Is After or Equal To operator in the Custom AutoFilter dialog box, you can select the date by clicking the Date Picker button (the one with the calendar icon) and then clicking the specific date on the drop-down date palette. When you open the date palette, it shows the current month and the current date selected. To select a date in an earlier month, click the Previous button (the one with the triangle pointing left) until its month is displayed in the palette. To select a date in a later month, click the Next button (the one with the triangle pointing right) until its month is displayed in the palette.
Using the Number Filters options
The AutoFilter drop-down menu for a field that contains only number entries besides dates or a combination of dates and other numeric entries contains a Number Filters option that when you click or highlight it displays its submenu containing the following options:
· Equals: Opens the Custom AutoFilter dialog box with the Equals operator selected in the first condition.
· Does Not Equal: Opens the Custom AutoFilter dialog box with the Does Not Equal operator selected in the first condition.
· Is Greater Than: Opens the Custom AutoFilter dialog box with the Is Greater Than operator selected in the first condition.
· Is Greater Than or Equal To: Opens the Custom AutoFilter dialog box with the Is Greater Than or Equal To operator selected in the first condition.
· Is Less Than: Opens the Custom AutoFilter dialog box with the Is Less Than operator selected in the first condition.
· Is Less Than or Equal To: Opens the Custom AutoFilter dialog box with the Is Less Than or Equal to operator selected in the first condition.
· Between: Opens the Custom AutoFilter dialog box with the Is Greater Than or Equal To operator selected in the first condition and the Is Less Than or Equal To operator selected in the second AND condition.
· Top 10: Opens the Top 10 AutoFilter dialog box so that you can filter the list to just the ten or so top or bottom values or percentages in the field. (See “Making it to the Top Ten!” that follows in this chapter for details.)
· Above Average: Filters the data list to display only records where the values in the field are greater than the average of the values in this field.
· Below Average: Filters the data list to display only records where the values in the field are less than the average of the values in this field.
· Custom Filter: Opens the Custom AutoFilter dialog box where you can select your own criteria for more complex AND or OR conditions.
Making it to the Top Ten!
The Top Ten option on the Number Filters option’s submenu enables you to filter out all records except those whose entries in that field are at the top or bottom of the list by a certain number (10 by default) or in a certain top or bottom percent (10 by default). Of course, you can only use the Top Ten item in numerical fields and date fields; this kind of filtering doesn’t make any sense when you’re dealing with entries in a text field.
When you click the Top Ten option on the Number Filters option’s submenu, Excel opens the Top 10 AutoFilter dialog box where you can specify your filtering criteria. By default, the Top 10 AutoFilter dialog box is set to filter out all records except those whose entries are among the top ten items in the field by selecting Top in the drop-down list box on the left, 10 in the middle combo box, and Items in the drop-down list box on the right. If you want to use these default criteria, you simply click OK in the Top 10 AutoFilter dialog box.
Figure 2-2 shows you the sample employee data list after using the Top 10 Items AutoFilter to display only the records with the top ten salaries in the data list.
Figure 2-2: Using the Top 10 Items AutoFilter to filter out all records except for those with the top ten salaries.
You can also change the filtering criteria in the Top 10 AutoFilter dialog box before you filter the data. You can choose between Top and Bottom in the leftmost drop-down list box and between Items and Percent in the rightmost one. You can also change the number in the middle combo box by clicking it and entering a new value or using the spinner buttons to select one.
Filtering a data list on a field’s font and fill colors or cell icons
Just as you can sort a data list using the font or fill color or cell icons that you’ve assigned with the Conditional Formatting feature to values in the field that are within or outside of certain parameters (see the section on the Conditional Formatting feature in Book II, Chapter 2 for details), you can also filter the list.
To filter a data list on a font color, fill color, or cell icon used in a field, you click its AutoFilter button and then select the Filter by Color option from the drop-down menu. Excel then displays a submenu from which you choose the font color, fill color, or cell icon to use in the sort:
· To filter the data list so that only the records with a particular font color in the selected field — assigned with the Conditional Formatting Highlight Cell Rules or Top/Bottom Rules options — appear in the list, click its color swatch in the Filter by Font Color submenu.
· To filter the data list so that only the records with a particular fill color in the selected field — assigned with the Conditional Formatting Highlight Cell Rules, Top/Bottom Rules, Data Bars, or Color Scales options — appear in the list, click its color swatch in the Filter by Font Color submenu.
· To filter the data list so that only the records with a particular cell icon in the selected field — assigned with the Conditional Formatting Icon Sets options — appear in the list, click the icon in the Filter by Cell Icon submenu.
Custom AutoFilter at your service
You can click the Custom Filter option on a field’s Text Filters, Date Filters, or Number Filters continuation menu to open the Custom AutoFilter dialog box, where you can specify your own filtering criteria by using conditions with the AND and OR logical operators (called AND and OR conditions for short). When you click the Custom Filter option, Excel opens the Custom AutoFilter dialog box, similar to the one shown in Figure 2-3.
Figure 2-3: Using Custom AutoFilter to filter out records except for those within a range of salaries.
Here, you select the type of operator to use in evaluating the first and second conditions in the top and bottom drop-down list boxes and the values to be evaluated in the first and second conditions in the associated combo boxes. You also specify the type of relationship between the two conditions with the And or Or option button. (The And option button is selected by default.)
When selecting the operator for the first and second condition in the leftmost drop-down list boxes at the top and bottom of the Custom AutoFilter dialog box, you have the following choices, depending on the types of entries in the selected field:
· Equals: Matches records where the entry in the field is identical to the text, date, or number you enter in the associated combo box.
· Does Not Equal: Matches records where the entry in the field is anything other than the text, date, or number you enter in the associated combo box.
· Is After: Matches records where the entry in the date field comes after the date you enter or select in the associated combo box.
· Is After or Equal To: Matches records where the entry in the date field comes after or is the same as the date you enter or select in the associated combo box.
· Is Before: Matches records where the entry in the date field precedes the date you enter or select in the associated combo box.
· Is Before or Equal To: Matches records where the entry in the date field precedes or is the same as the date you enter or select in the associated combo box.
· Is Greater Than: Matches records where the entry in the field follows the text in the alphabet, comes after the date, or is larger than the number you enter in the associated combo box.
· Is Greater Than or Equal To: Matches records where the entry in the field follows the text in the alphabet or is identical, the date comes after or is identical, or the number is larger than or equal to the one you enter in the associated combo box.
· Is Less Than: Matches records where the entry in the field comes before the text in the alphabet, comes before the date, or is less than the number you enter in the associated combo box.
· Is Less Than or Equal To: Matches records where the entry in the field comes before the text in the alphabet or is identical, the date comes before or is identical, or the number is less than or equal to the one you enter in the associated combo box.
· Begins With: Matches records where the entry in the field starts with the text, the part of the date, or the number you enter in the associated combo box.
· Does Not Begin With: Matches records where the entry in the field starts with anything other than the text, the part of the date, or the number you enter in the associated combo box.
· Ends With: Matches records where the entry in the field ends with the text, the part of the date, or the number you enter in the associated combo box.
· Does Not End With: Matches records where the entry in the field ends with anything other than the text, the part of the date, or the number you enter in the associated combo box.
· Contains: Matches records where the entry in the field contains the text, the part of the date, or the number you enter in the associated combo box.
· Does Not Contain: Matches records where the entry in the field contains anything other than the text, the part of the date, or the number you enter in the associated combo box.
Note that you can use the Begins With, Ends With, and Contains operators and their negative counterparts when filtering a text field — you can also use the question mark (?) and asterisk (*) wildcard characters when entering the values for use with these operators. (The question mark wildcard stands for individual characters, and the asterisk stands for one or more characters.) You use the other logical operators when dealing with numeric and date fields.
When specifying the values to evaluate in the associated combo boxes on the right side of the Custom AutoFilter dialog box, you can type in the text, number, or date, or you can select an existing field entry by clicking the box’s drop-down list button and then clicking the entry on the drop-down menu. In date fields, you can select the dates directly from the date drop-down palette opened by clicking the box’s Date Picker button (the one with the calendar icon).
Figure 2-3 illustrates setting up filtering criteria in the Custom AutoFilter dialog box that selects records whose Salary values fall within two separate ranges of values. In this example, I’m using an OR condition to filter out all records where the salaries fall below $35,000 or are greater than $75,000 by entering the following complex condition:
Salary Is Greater Than 75000 OR Is Less Than 35000
Using the Advanced Filter
When you use advanced filtering, you don’t use the field’s AutoFilter buttons and associated drop-down menu options. Instead, you create a so-called Criteria Range somewhere on the worksheet containing the data list to be filtered before opening the Advanced Filter dialog box.
If you use the Advanced Filter feature to do a query, you extract copies of the records that match your criteria by creating a subset of the data list. You can locate the Criteria Range in the top rows of columns to the right of the data list and then specify the Copy To range underneath the Criteria Range, similar to the arrangement shown in Figure 2-4.
Figure 2-4: Using Advanced Filter to copy records that meet the criteria in the Criteria Range.
To create a Criteria Range, you copy the names of the fields in the data list to a new part of the worksheet and then enter the values (text, numbers, or formulas) that are to be used as the criteria in filtering the list in rows underneath. When setting up the criteria for filtering the data list, you can create either comparison criteria or calculated criteria.
After you’ve set up your criteria range with all the field names and the criteria that you want used, you click the Advanced command button on the Ribbon’s Data tab (or press Alt+AQ) to open the Advanced Filter dialog box similar to the one shown in Figure 2-4. Here, you specify whether you just want to filter the records in the list (by hiding the rows of all those that don’t meet your criteria) or you want to copy the records that meet your criteria to a new area in the worksheet (by creating a subset of the data list).
To just filter the data in the list, leave the Filter the List, In-Place option button selected. To query the list and copy the data to a new place in the same worksheet (note that the Advanced Filter feature doesn’t let you copy the data to another sheet or workbook), you select the Copy to Another Location option button. When you select this option button, the Copy To text box becomes available, along with the List Range and Criteria Range text boxes.
To specify the data list that contains the data that you want to filter or query, click the List Range text box and then enter the address of the cell range or select it directly in the worksheet by dragging through its cells. To specify the range that contains a copy of the field names along with the criteria entered under the appropriate fields, you click the Criteria Range text box and then enter the range address of this cell range or select it directly in the worksheet by dragging through its cells. When selecting this range, be sure that you include all the rows that contain the values that you want evaluated in the filter or query.
Only the unique need apply!
To filter out duplicate rows or records that match your criteria, select the Unique Records Only check box in the Advanced Filter dialog box before you start the filtering operation. You can remove the display of all duplicate records from a data list by selecting this check box and removing all cell references from the Criteria Range text box before you click OK or press Enter.
If you’re querying the data list by copying the records that meet your criteria to a new part of the worksheet (indicated by clicking the Copy to Another Location option button), you also click the Copy To text box and then enter the address of the cell that is to form the upper-left corner of the copied and filtered records or click this cell directly in the worksheet.
After specifying whether to filter or query the data and designating the ranges to be used in this operation, click OK to have Excel apply the criteria that you’ve specified in the Criteria Range in either filtering or copying the records.
After filtering a data list, you may feel that you haven’t received the expected results — for example, no records are listed under the field names that you thought should have several. You can bring back all the records in the list by clicking the Clear command button on the Data tab of the Ribbon or by pressing Alt+AC. Now you can fiddle with the criteria in the Criteria Range text box and try the whole advanced filtering thing all over again.
Specifying comparison criteria
Entering selection criteria in the Criteria Range for advanced filtering is very similar to entering criteria in the data form after selecting the Criteria button. However, you need to be aware of some differences. For example, if you are searching for the last name Paul and enter the labelPaul in the criteria range under the cell containing the field name Last Name, Excel will match any last name that begins with P-a-u-l such as Pauley, Paulson, and so on. To avoid having Excel match any other last name beside Paul, you would have to enter a formula in the cell below the one with the Last Name field name, as in
="Paul"
When entering criteria for advanced filtering, you can also use the question mark (?) or the asterisk (*) wildcard character in your selection criteria just like you do when using the data form to find records. If, for example, you enter J*n under the cell with the First Name field name, Excel will consider any characters between J and n in the First Name field to be a match including Joan, Jon, or John as well as Jane or Joanna. To restrict the matches to just those names with characters between J and n and to prevent matches with names that have trailing characters, you need to enter the following formula in the cell:
="J*n"
When you use a selection formula like this, Excel will match names such as Joan, Jon, and John but not names such as Jane or Joanna that have a character after the n.
When setting up selection criteria, you can also use the other comparative operators, including >, >=, <, <=, and <>, in the selection criteria. See Table 2-1 for descriptions and examples of usage in selection criteria for each of these logical operators.
Table 2-1 The Comparative Operators in the Selection Criteria
Operator |
Meaning |
Example |
Locates |
= |
Equal to |
=“CA” |
Records where the state is CA |
> |
Greater than |
>m |
Records where the name starts with a letter after M (that is, N through Z) |
>= |
Greater than |
>=3/4/02 |
Records where the date is on or after or equal to March 4, 2002 |
< |
Less than |
<d |
Records where the name begins with a letter before D (that is, A, B, or C) |
<= |
Less than |
<=12/12/04 |
Records where the date is on or before or equal to December 12, 2004 |
<> |
Not equal to |
<>“CA” |
Records where the state is not equal to CA |
To find all the records where a particular field is blank in the database, enter = and press the spacebar to enter a space in the cell beneath the appropriate field name. To find all the records where a particular field is not blank in the database, enter <> and press the spacebar to enter a space in the cell beneath the appropriate field name.
Setting up logical AND and logical OR conditions
When you enter two or more criteria in the same row beneath different field names in the Criteria Range, Excel treats the criteria as a logical AND condition and selects only those records that meet both of the criteria. Figure 2-5 shows an example of the results of a query that uses an AND condition. Here, Excel has copied only those records where the location is Boston and the date hired is before January 1, 2000, because both the criteria Boston and <1/1/00 are placed in the same row (row 2) under their respective field names, Location and Date Hired.
Figure 2-5: Copied records where the location is Boston and the date hired is prior to January 1, 2000.
When you enter two or more criteria in different rows of the Criteria Range, Excel treats the criteria as a logical OR and selects records that meet any one of the criteria they contain. Figure 2-6 shows you an example of the results of a query using an OR condition. In this example, Excel has copied records where the location is either Boston or San Francisco because Boston is entered under the Location field name in the second row (row 2) of the Criteria Range above San Francisco entered in the third row (row 3).
Figure 2-6: Copied records where the location is Boston and the date hired prior to January 1, 2000 or the location is San Francisco location for any date hired .
When creating OR conditions, you need to remember to redefine the Criteria Range to include all the rows that contain criteria, which in this case is the cell range L2:U3. (If you forget, Excel uses only the criteria in the rows included in the Criteria range.)
When setting up your criteria, you can combine logical AND and logical OR conditions (again, assuming that you expand the Criteria Range sufficiently to include all the rows containing criteria). For example, if you enter Boston in cell R2 (under Location) and <1/1/00 in cell S2 (under Date Hired) in row 2 and enter San Francisco in cell R3 and then repeat the query, Excel copies the records where the location is Boston and the date hired is before January 1, 2000, as well as the records where the location is San Francisco (regardless of the date hired).
Setting up calculated criteria
You can use calculated criteria when filtering or querying your data list. All you need to do is enter a logical formula that Excel can evaluate as either TRUE or FALSE in the Criteria Range under a made-up name that is not any field name used in the data list (I repeat, is not a field name in the data list). Calculated criteria enable you to filter or query records based on a comparison of entries in a particular field with entries in other fields of the list or based on a comparison with entries in the worksheet that lie outside the data list itself.
Figure 2-7 shows an example of using a calculated criterion that compares values in a field to a calculated value that isn’t actually entered in the data list. Here, you want to perform a query that copies all the records from the Employee data list where the employee’s salary is above the average salary. In this figure, cell V2 contains the formula that uses the AVERAGE function to compute average employee salary and then compares the first salary entry in cell F2 of the data list to that average with the following formula:
=F2>AVERAGE($F$2:$F$33)
Figure 2-7: Copied records extracted from the data list for employees whose salaries are above the salary average.
Note that this logical formula is placed under the label Calculated Criteria in cell V2, which has been added to the end of the Criteria Range. Cell F2 is the first cell in the data list that contains a salary entry. The cell range, $F$2:$F$33, used as the argument of the AVERAGE function, is the range in the Salary field that contains all the salary entries.
To use this calculated criterion, you must remember to place the logical formula under a name that isn’t used as a field name in the data list itself. (In this example, the label Calculated Criteria does not appear anywhere in the row of field names.) You must include this label and formula in the Criteria Range. (For this query example, the Criteria Range is defined as the cell range L2:V2.)
When you then perform the query by using the Advanced Filter feature, Excel applies this calculated criterion to every record in the database. Excel does this by adjusting the first Salary field cell reference F2 (entered as a relative reference) as the program examines the rest of the records below. Note, however, that the range reference specified as the argument of the AVERAGE function is entered as an absolute reference ($F$2:$F$33) in the criterion formula so that Excel won’t adjust this reference but compare the Salary entry for each record to AVERAGE computed for this entire range (which just happens to be 40,161). Note in Figure 2-7 how Excel 2016 automatically converts the cell references ($F$2:$F$33) in the AVERAGE function’s argument to the range name equivalent, (Table2[Salary]).
When entering formulas for calculated criteria that compare values outside the data list to values in a particular field, you should always reference the cell containing the very first entry for that field in order to ensure that Excel applies your criteria to every record in the data list.
You can also set up calculated criteria that compare entries in one or more fields to other entries in the data list. For example, to extract the records where the Years of Service entry is at least two years greater than the record above it (assuming that you have sorted the data list in ascending order by years of service), you would enter the following logical formula under the cell labeled Calculated Criteria:
=I3>I2+2
Most often, when referencing cells within the data list itself, you want to leave the cell references relative so that they can be adjusted, because each record is examined, and the references to the cells outside the database are absolute so that these won’t be changed when making the comparison with the rest of the records.
When you enter the logical formula for a calculated criterion, Excel returns the logical value TRUE or FALSE. This logical value applies to the field entry for the first record in the data list that you refer to in the logical formula. By inspecting this field entry in the database and seeing whether it does indeed meet your intended selection criteria, you can usually tell whether your logical formula is correct.
Using the AND and OR functions in calculated criteria
You can also use Excel’s AND, OR, and NOT functions with the logical operators in calculated criteria to find records that fall within a range. For example, to find all the records in the employee database where the salaries range between $55,000 and $75,000, you would enter the following logical formula with the AND function under the cell with the label Calculated Criteria:
=AND(F2>=55000,F2<=75000)
To find all the records in the Employee data list where the salary is either below $29,000 or above $45,000, you would enter the following logical formula with the OR function under the cell with the label Calculated Criteria:
=OR(F2<29000,F2>45000)
Using the Database Functions
Excel includes a number of database functions that you can use to calculate statistics, such as the total, average, maximum, minimum, and count in a particular field of the data list only when the criteria that you specify are met. For example, you could use the DSUM function in the sample Employee data list to compute the sum of all the salaries for employees who were hired after January 1, 2000, or you could use the DCOUNT function to compute the number of records in the data list for the Human Resources department.
The database functions, regardless of the difference in names (and they all begin with the letter D) and the computations that they perform, all take the same three arguments as illustrated by the DAVERAGE function:
DAVERAGE(database,field,criteria)
The arguments for the database functions require the following information:
· database is the argument that specifies the range containing the list and it must include the row of field names in the top row.
· field is the argument that specifies the field whose values are to be calculated by the database function (averaged in the case of the DAVERAGE function). You can specify this argument by enclosing the name of the field in double quotes (as in “Salary” or “Date Hired”), or you can do this by entering the number of the column in the data list (counting from left to right with the first field counted as 1).
· criteria is the argument that specifies the address of the range that contains the criteria that you’re using to determine which values are calculated. This range must include at least one field name that indicates the field whose values are to be evaluated and one cell with the values or expression to be used in the evaluation.
Note that in specifying the field argument, you must refer to a column in the data list that contains numeric or date data for all the database functions with the exception of DGET. All the rest of the database functions can’t perform computations on text fields. If you mistakenly specify a column with text entries as the field argument for these database functions, Excel returns an error value or 0 as the result. Table 2-2 lists the various database functions available in Excel along with an explanation of what each one calculates. (You already know what arguments each one takes.)
Table 2-2 The Database Functions in Excel
Database Function |
What It Calculates |
DAVERAGE |
Averages all the values in a field of the data list that match the criteria you specify. |
DCOUNT |
Counts the number of cells with numeric entries in a field of the data list that match the criteria you specify. |
DCOUNTA |
Counts the number of nonblank cells in a field of the data list that match the criteria you specify. |
DGET |
Extracts a single value from a record in the data list that matches the criteria you specify. If no record matches, the function returns the #VALUE! error value. If multiple records match, the function returns the #NUM! error value. |
DMAX |
Returns the highest value in a field of the data list that matches the criteria you specify. |
DMIN |
Returns the lowest value in a field of the data list that matches the criteria you specify. |
DPRODUCT |
Multiplies all the values in a field of the data list that match the criteria you specify. |
DSTDEV |
Estimates the standard deviation based on the sample of values in a field of the data list that match the criteria you specify. |
DSTDEVP |
Calculates the standard deviation based on the population of values in a field of the data list that match the criteria you specify. |
DSUM |
Sums all the values in a field of the data list that match the criteria you specify. |
DVAR |
Estimates the variance based on the sample of values in a field of the data list that match the criteria you specify. |
DVARP |
Calculates the variance based on the population of values in a field of the data list that match the criteria you specify. |
The Database functions are too rarely used to rate their own command button on the Ribbon’s Formulas tab. As a result, to use them in a worksheet, you must click the Function Wizard (fx) button on the Formula bar and then select Database from the Select a Category drop-down list box and then click the function to use or type the Database function directly into the cell.
Figure 2-8 illustrates the use of the Database function, DSUM. Cell C2 in the worksheet shown in this figure contains the following formula:
=DSUM(Table1[#All],"Salary",F1:F2)
Figure 2-8: Using the DSUM to total the salaries over $55,000 in the Employee data list.
This DSUM function computes the total of all the salaries in the data list that are above $55,000. This total is $468,500, as shown in cell C2, which contains the formula.
To perform this calculation, I specified the range A3:J35, which contains the entire data list. This range includes the top row of field names as the database argument (which Excel 2016 automatically converted to its range name equivalent, Table1[#All]). I then specified “Salary” as the field argument of the DSUM function because this is the name of the field that contains the values that I want totaled. Finally, I specified the range F1:F2 as the criteria argument of the DSUM function because these two cells contain the criteria range that designate that only the values exceeding 55000 in the Salary field are to be summed.
External Data Query
Excel makes it possible to query data lists (tables) stored in external databases to which you have access and then extract the data that interests you into your worksheet for further manipulation and analysis. Excel 2016 makes it easy to acquire data from a variety of different data sources, including Microsoft Access database files, web pages on the Internet, text files, and other data sources such as database tables on SQL Servers and Analysis Services, XML data files, and data tables from online connections to Microsoft Windows Azure DataMarket and OData Data feeds.
When importing data from such external sources into your Excel worksheets, you may well be dealing with data stored in multiple related tables all stored in the database (what is referred to in Excel 2016 as a Data Model). The relationship between different tables in the same database is based on a common field (column) that occurs in each related data table, which is officially known as a key field, but in Excel is generally known as a lookup column. When relating tables on a common key field, in at least one table, the records for that field must all be unique with no duplicates, such as Clients data table where the Customer ID field is unique and assigned only once (where it’s known as the primary key). In the other related data table, the common field (known as the foreign key) may or may not be unique as in an Orders data table where entries in its Customer ID may not all be unique, as it’s quite permissible (even desirable) to have the same client purchasing multiple products multiple times.
There’s only one other thing to keep in mind when working with related data tables and that is the type of relationship that exists between the two tables. There are two types of relationships supported in an Excel Data Model:
· One-to-one relationship where the entries in both the primary and foreign key fields are totally unique such as a relationship between a Clients data list and Discount data list where the Customer ID field occurs only once in each table (as each client has only one discount percentage assigned)
· One-to-many relationship where duplicate entries in the foreign key field are allowed and even expected as in a relationship between a Clients data list and an Orders data list where the Customer ID field may occur multiple times (as the client makes multiple purchases)
Most of the time Excel 2016 is able to figure out the relationship between the data tables you import. However, if Excel should ever get it wrong or your tables contain more than one common field that could possibly serve as the key, you can manually define the proper relationship. Simply select the Relationships button in the Data Tools group on the Ribbon’s Data tab (Alt+AA) to open the Manage Relationships dialog box. There you click New to open the Create Relationship dialog box, where you define the common field in each of the two related data tables. After creating this relationship, you can use any of the fields in either of the two related tables in reports that you prepare or pivot tables that you create. (See Book VII, Chapter 2 for details on creating and using pivot tables.)
To import external data, you select the Get External Data command button on the Ribbon’s Data tab (Alt+AZX). When you do this, Excel displays a menu with the following choices:
· From Access to import database tables saved in Microsoft Access
· From Web to perform a web page query to import data lists from web pages on the Internet
· From Text to import data saved in a text file by defining how to parse its data into particular columns and rows of your worksheet
· From Other Sources to open a drop-down menu that offers a variety of choices: From SQL Server, From Analysis Services, From Windows Azure Marketplace, From OData Data Feed, From XML Data Import, From Data Connection Wizard, and From Microsoft Query
· Existing Connections to reuse a connection to a data service or data feed (using one of the import options, especially in the From Other Sources section) that you’ve already established either to retrieve more data or refresh previously imported data
Retrieving data from Access database tables
To make an external data query to an Access database table, you click the From Access command button on the Get External Data button’s drop-down menu on the Ribbon’s Data tab or press Alt+AZXFA. Excel opens the Select Data Source dialog box, where you select the name of the Access database (using an *.mdb file extension) and then click the Open button.
The Select Table dialog box then appears, and you can select the data table that you want to import into the worksheet. To import multiple (related) data tables from the selected Access database, select the Enable Selection of Multiple Tables check box. Excel then displays check boxes before the name of each table in the database. After you select the check boxes for all the tables you want to import, you click OK.
As soon as you click OK, Excel opens the Import Data dialog box. This dialog box contains the following option buttons that you can use to determine how to view the data in your worksheet as well as where to import it:
· Table option button to have the data in the Access data table(s) imported into an Excel data table in either the current or new worksheet — see the “Existing Worksheet” and “New Worksheet” bullets that follow. Note that when you import more than one data table, the Existing Worksheet option is no longer available, and the data from each imported data table will be imported to a separate new worksheet in the current workbook.
· PivotTable Report option button (the default) to have the data in the Access data table(s) imported into a new pivot table (see Book VII, Chapter 2) that you can construct with the Access data.
· PivotChart option button to have the data in the Access data table(s) imported into a new pivot table (see Book VII, Chapter 2) with an embedded pivot chart that you can construct with the Access data.
· Only Create Connection option button to create a connection to the Access database table(s) that you can use later to actually import its data.
· Existing Worksheet option button to have the data in the Access data table(s) imported into the current worksheet starting at the current cell address listed in the text box below.
· New Worksheet option button to have the data in the Access data table(s) imported into new sheet(s) that’s added to the end of the sheets already in the workbook.
· Add This Data to the Data Model check box to add the imported data in the Access data table(s) to the Data Model already defined in the Excel workbook using relatable, key fields.
· Properties drop-down button to open the drop-down menu with the Import Relationships Between Tables check box (selected by default) and Properties item. Deselect the check box to prevent Excel from recognizing the relationship established between the data tables in Access. Click the Properties button to open the Connection Properties dialog box, where you can modify all sorts of connection properties, including when the Access data’s refreshed in the Excel worksheet and how the connection is made.
Figure 2-9 shows you a new Northwind Customer Orders workbook after importing both the Customers and Orders data tables from the sample Northwind Access database as new data tables on separate worksheets. When I imported the two data tables, Excel automatically added two new worksheets (Sheet2 and Sheet3) to the workbook, while at the same time importing the Customers data table to Sheet2 (which I renamed Customers) and the Orders data table to Sheet3 (renamed Orders). I then deleted Sheet1 (which was blank) prior to taking the screenshot forFigure 2-9.
Figure 2-9: Northwind Customer Orders workbook with the customer data imported from the Customers data table in the sample Northwind database.
Figure 2-10 shows same new workbook, this time with the Orders worksheet selected and the Manage Relationships dialog box open (by clicking the Relationships button on the Data tab or pressing Alt+AA). When Excel imported these two data tables, it automatically picked up on and retained the original relationship between them in the Northwind database, where the CustomerID field is the primary key field in the Customers data table and a foreign key field in the Orders data table.
Figure 2-10: Orders worksheet with the data imported from the Orders data table in the sample Northwind database showing the relationship with the Customers table.
After importing the data, you can then use the Filter buttons attached to the various fields to sort the data (as described in Book VI, Chapter 1) and filter the data (as described earlier in this chapter).
Excel keeps a list of all the external data queries you make so that you can reuse them to import updated data from another database or web page. To reuse a query, click the Connections button on the Data tab (Alt+AO) to open the Workbook Connections dialog box to access this list and then click the name of the query to repeat. You can also use a connection to an external data source when creating a new pivot table so that you can cross tabulate data from related tables in the data model (see Book VII, Chapter 2 for more on creating and using pivot tables).
Retrieving data from the web
To make a web page query, you click the From Web command button on the Get External Data button’s drop-down menu on the Ribbon’s Data tab or press Alt+AZXFW. Excel then opens the New Web Query dialog box containing the home page for your computer’s default web browser (Internet Explorer in many cases).
To select the web page containing the data you want to import into Excel, you select or type the URL web address of the website containing the data you want to import in the Address text box at the top of the home page in the New Web Query dialog box, as in
http://money.msn.com
If you’ve visited the website before, you can select its URL address by clicking the drop-down button attached to the Address text box and then clicking it in the drop-down list.
When you have the main page of the website displayed in the New Web Query dialog box, you can use the site’s links to find the page that actually contains the data tables you want to import. Excel indicates which tables of information you can import from the web page into the worksheet by adding a check box with an arrowhead pointing right. To import these tables, you simply select this check box to add a check mark to it. (See Figure 2-11.)
Figure 2-11: Selecting stock data tables to import from the MSN Money web page into a new Excel worksheet.
After you finish checking all the tables you want to import on the page, click the Import button to close the New Web Query dialog box. Excel then opens a version of the Import Data dialog box, where you indicate where the table data is to be imported by selecting one of the following two option buttons:
· Existing Worksheet option button (default) to have the data in the data table(s) selected on the web page imported into the current worksheet starting at the current cell address listed in the text box below.
· New Worksheet option button to have the data in the Access data table imported into a new sheet that’s added to the beginning of the workbook.
You can also select the Add This Data to the Data Model check box to have Excel automatically relate the imported web data tables to data lists in the workbook’s existing Data Model.
After you click OK in the Import Data dialog box, Excel closes the box and then imports all the tables of data you selected on the web page into a new worksheet starting at cell A1 or in the existing worksheet starting at the cell whose address was entered in the text box in the Import Data dialog box.
You can only make web queries when your computer has Internet access. So, if you’re using Excel on a device that can’t connect to the web at the moment, you won’t be able to perform a new web query until you get to a place where you can get online.
Retrieving data from text files
If you have a text file containing data you need to bring into your worksheet, you can import it by clicking the From Text command button on the Get External Data button’s drop-down menu on the Ribbon’s Data tab (Alt+AZXFT) and then selecting the file to use in the Import Text File dialog box. After you select the text file containing the data you need to retrieve in this dialog box and click its Import button, Excel opens the first dialog box of the Text Import Wizard, which you can then use to tell Excel how to split up (or parse) its data into separate cells of the worksheet.
Most text files containing lists of related data use some sort of standard character to separate each data item (such as a comma or tab) in every line, just as it uses the character for the Enter key to mark the separation of each line of data within the file. Those text files that use the comma to separate data items are known as CSV files (for Comma Separated Values). Those that use tabs to separate the individual data items are known as Tab delimited files. Note that some programs use the generic term, delimited files, to refer to any text file that uses a standard character, such as a comma or tab, to separate its individual data items.
The Text Import Wizard uses these facts about text files to analyze the structure of incoming text files to help you to determine how to parse the data in the text file. Because the Text Import Wizard always imports the parsed text data into the current worksheet starting at the active cell and then uses as many subsequent columns and rows as necessary, you should always select an empty cell at the beginning of a blank region in the worksheet (or better yet, in a blank worksheet) before you invoke the Text Import Wizard. That way, you never run the risk of the incoming text file data wiping out existing data in the worksheet.
Figure 2-12 illustrates how the Text Import Wizard works and can help you to successfully import a text file that consists of a data list into an Excel worksheet. This figure shows the Text Import Wizard — Step 1 of 3 dialog box that first appears when you try to open a text file from the Import Text File dialog box.
Figure 2-12: The first of three Text Import Wizard dialog boxes for parsing a text file.
As this figure shows, Excel has analyzed the data and determined that it uses some sort of delimiting character. If you’re dealing with a text file in which the data items all use the same number of characters (such as 11 spaces for SSN and 10 spaces for ID number), click the Fixed Width option button.
The Text Import Wizard always assumes that you want to start importing the data from the first to the very last line in the text file. If you don’t need the first line or lines imported (because they contain data, such as titles, that you’d only have to eliminate from the worksheet if you did bring them in), use the preview list box to determine the number of the first line to import and then enter that number in the Start Import Row text box or use its spinner buttons to select this number.
Figure 2-13 shows the second Text Import Wizard dialog box that appears when you click the Next button in the Step 1 of 3 dialog box. As you can see, the Text Import Wizard — Step 2 of 3 dialog box contains a Data Preview section that shows your text data aligned (simulating its column arrangement in your Excel worksheet).
Figure 2-13: The second of three Text Import Wizard dialog boxes for parsing the text file.
In the Step 2 of 3 dialog box, you need to select the delimiting character in the event that the wizard selects the wrong character in the Delimiters section. If your text file uses a custom delimiting character, you need to select the Other check box and then enter that character in its text box. If your file uses two consecutive characters (such as a comma and a space), you need to select their check boxes as well as the Treat Consecutive Delimiters As One check box.
By default, the Text Import Wizard treats any characters enclosed in a pair of double quotes as text entries (as opposed to numbers). If your text file uses a single quote, click the single quote (‘) character in the Text Qualifier drop-down list box.
Figure 2-14 shows you the third Text Import Wizard dialog box that appears when you click the Next button in the Step 2 of 3 dialog box. In the Step 3 of 3 dialog box, you get to assign a data type to the various columns of text data or indicate that a particular column of data should be skipped and therefore not imported into your Excel worksheet.
Figure 2-14: The third of three Text Import Wizard dialog boxes for parsing the text file.
When setting data formats for the columns of the text file, you can choose among the following three data types:
· General (the default) to convert all numeric values to numbers, entries recognized as date values to dates, and everything else in the column to text
· Text to convert all the entries in the column to text
· Date to convert all the entries to dates by using the date format shown in the associated drop-down list box
To assign one of the three data types to a column, click its column in the Data Preview section and then click the appropriate radio button (General, Text, or Date) in the Column Data Format section in the upper-right corner.
In determining values when using the General data format, Excel uses the period (.) as the decimal separator and the comma (,) as the thousands separator. If you’re dealing with data that uses these two symbols in just the opposite way (the comma for the decimal and the period for the thousands separator), as is the case in many European countries, click the Advanced button in the Step 3 of 3 dialog box to open the Advanced Text Import Settings dialog box. There, select the comma (,) in the Decimal Separator drop-down list box and the period (.) in the Thousands Separator drop-down list box before you click OK. If your text file uses trailing minus signs (as in 100-) to represent negative numbers (as in -100), make sure that the Trailing Minus for Negative Numbers check box contains a check mark.
If you want to change the date format for a column to which you’ve assigned the Date data format, click its M-D-Y code in the Date drop-down list box (where M stands for the month, D for the day, and Y for the year).
To skip the importing of a particular column, click it in the Data Preview and then select the Do Not Import Column (Skip) option button at the bottom of the Column Data Format section.
After you have all the columns formatted as you want, click the Finish button. Excel then displays the Import Data dialog box, where you specify where you want the parsed text data imported Existing Worksheet or New Worksheet (the options for determining how to view your data are all grayed out) and whether or not to add this parsed data to workbook’s Data Model.
After you click OK in the Import Data dialog box, Excel imports and parses the text file data starting at the current cell. Figure 2-15 shows the rows of the imported and parsed text data that appeared in the new worksheet (Sheet2) starting at cell A1.
Figure 2-15: New worksheet with data after opening the parsed text file.
Querying data from other data sources
Database tables created and maintained with Microsoft Office Access are not, of course, the only external database sources on which you can perform external data queries. To import data from other sources, you click the From Other Sources button on the Get External Data button’s drop-down menu on the Data tab or press Alt+AZXFO to open a drop-down menu with the following options:
· From SQL Server to import data from an SQL server table.
· From Analysis Services to import data from an SQL Server Analysis cube.
· From Windows Azure Marketplace to import data from any of the various marketplace service providers. Note that you must provide the file location (usually a URL address) and your account key before you can import any marketplace data into Excel. (Visitwww.windowsazure.com for details on using this service.)
· From OData Data Feed to import data any database table following the Open Data Protocol (shortened to OData) — note that you must provide the file location (usually a URL address) and user ID and password to access the OData data feed before you can import any of its data into Excel.
· From XML Data Import to import data from an XML file that you open and map.
· From Data Connection Wizard to import data from a database table using the Data Connection Wizard that follows the OLEDB (Object Linked Embedded Database) standards.
· From Microsoft Query to import data from a database table using Microsoft Query that follows the ODBC (Open DataBase Connectivity) standards.
Retrieving external data with Microsoft Query
To retrieve data from an external database using Microsoft Query, you must complete two procedures. In the first procedure, you define the data source — that is, the database that contains the data you want to query. In the second procedure, you specify the data query itself, including all the columns of data that you want extracted along with the criteria for selecting particular records.
Creating a new data source definition
To perform the first procedure that creates the new data source, follow these steps:
1. Click the From Other Sources command button on the Get External Data button’s drop-down menu on the Ribbon’s Data tab or press Alt+AFO and then select From Microsoft Query from the drop-down menu.
The Choose Data Source dialog box, shown in Figure 2-16, appears with the <New Data Source> item at the top of the list box on the Databases tab.
2. Click OK to accept the default <New Data Source> setting in the Choose Data Source dialog box.
This action opens the Create New Data Source dialog box where you need to name your new database query and specify the driver to be used in accessing the external database.
3. Enter a descriptive name for the database query in the What Name Do You Want to Give Your Data Source text box.
By naming the data source definition, you can reuse it without having to go through all these tedious steps for defining it. Next, you need to select a driver for your data source from the Select a Driver for the Type of Database You Want to Access drop-down list box. This driver list lets you select between Microsoft Access, Microsoft Access Text, Microsoft Excel, and SQL Server.
4. Select the name of the driver to be used from the Select a Driver for the Type of Database You Want to Access drop-down list box.
Now you’re ready to select the database to be accessed.
5. Click the Connect button in the Create New Data Source dialog box.
This action opens a dialog box where you can select the database to be used. For example, if you select Microsoft Access Driver (*.mdb) as the driver in the Create New Data Source dialog box, Excel opens an ODBC Microsoft Access Setup dialog box.
6. Click the Select button, locate the folder that contains the database file that you want to query in the Select Database dialog box, and then click OK.
After you’ve selected the database to work with, you have completed the first major step of specifying the data source to use in your external query.
7. Click OK in the Setup dialog box for the type of database that you’re accessing.
This action returns you to the Create New Data Source dialog box (shown in Figure 2-17), which now displays the name of the database that you selected. If you want, you can specify a default table to use in the database and, if you had to specify a username and password to gain access to the database, you can have this information saved as part of the data source definition.
8. (Optional) Select the name of the default table from the Select a Default Table for Your Data Source (Optional) drop-down list box and select the Save My User ID and Password in the Data Source Definition check box to save this information.
Now you’re ready to close the Create New Data Source dialog box and return to the Choose Data Source dialog box from which you can perform the second procedure of specifying your database query.
9. Click the OK button in the Create New Data Source dialog box.
This action closes the Create New Data Source dialog box, returning you to the Choose Data Source dialog box, where the name that you’ve given to the data source definition that you’ve just completed appears selected.
Figure 2-16: Using the Choose Data Source dialog box to create a new database query.
Figure 2-17: Creating a new data source for the external database query.
Specifying the database query
Now that you’ve finished the data source definition, you can use it with the Query Wizard to specify which fields in the database to acquire. At this point, the Choose Data Source dialog box is still open, and the name of your new data source definition is selected along with the Use the Query Wizard to Create/Edit Queries check box.
To perform the second procedure in which you specify the conditions of the query by using your new data source definition, follow these steps:
1. Make sure that the name of your data source is highlighted on the Databases tab and the Use the Query Wizard to Create/Edit Queries check box has a check mark in it and then click OK in the Choose Data Source dialog box.
This action closes the Choose Data Source dialog box and opens the Query Wizard — Choose Columns dialog box, similar to the one shown in Figure 2-18. This dialog box contains two list boxes: the Available Tables and Columns list box on the left side and the Columns in Your Query list box on the right side.
To select the fields that you want to acquire, click the Expand button (+) in front of the name of each table in the external database that contains fields that you want. Then, click the name of the field followed by the > button to copy the field name to the Columns in Your Query list box. To preview the data in that field, click the Preview Now button when the field name is selected in the Columns in Your Query list box.
Note that the order in which you add the fields determines their column order in your Excel worksheet. To change the order after copying the fields to the Columns in Your Query list box, click the field and then click the button to the right with the triangle pointing upward to promote the field in the list, or click the button with the triangle pointing down to demote it in the list.
2. Select the fields that you want to use in the Available Tables and Columns list box on the left and then copy them to the Columns in Your Query list box on the right.
After you finish selecting the fields to use in the query, you’re ready to move on to the next dialog box in the Query Wizard where you specify how the data is to be filtered.
3. Click Next to open the Query Wizard — Filter Data dialog box.
To set up the criteria by which records are selected in the Filter Data dialog box (similar to the one shown in Figure 2-19), click the field for which you want to set criteria and then select the criteria to use from the drop-down list box on the left and the value to be evaluated in the combo box on the right.
The criteria available in the drop-down list boxes on the left are the same as those used with the Custom AutoFilter (see the “Custom AutoFilter at your service” section earlier in this chapter for details) with the exception of the “like” and “not like” and “is Null” and “is Not Null” operators, which are not available when setting criteria for the Custom AutoFilter. (The “like” operator refers to entries that sound like one that you enter in the associated combo box on the right, and “Null” refers to empty entries in the field.)
When entering the values to be evaluated in the associated combo boxes on the right side of the Filter Data dialog box, you can use the question mark (?) and asterisk (*) wildcard characters (question marks for single characters and the asterisk for multiple characters) in the text that you enter in these boxes. You can also select data entries in a field from which to compare to by selecting them from the drop-down list.
To set up an AND condition, make sure that the AND option button is selected when you specify the second and even third set of filtering criteria. (Remember that, in an AND condition, records are selected only when all sets of criteria are TRUE.) To set up a logical OR condition, click the Or option button before you specify the second or even third set of criteria. (Remember that, in an OR condition, records are selected when any one of the sets of criteria are TRUE.)
Note that if you want to acquire all data in a selected field, you don’t specify any filtering criteria for that field in the Filter Data dialog box.
4. Specify the filtering criteria, including any AND and OR condition, in the criteria drop-down list boxes on the left and evaluation combo boxes on the right for each field that should be filtered in the Column to Filter list box.
After you finish specifying the filtering to be used on the fields, you’re ready to specify the order in which matching records are to be sorted.
5. Click the Next button to open the Query Wizard — Sort Order dialog box.
To sort the data that you acquire in the external database query, click the name of the field in the Sort By drop-down list box and then select either the Ascending (default) or Descending option button. (See Figure 2-20.) To sort any duplicates in the field that you specify as the primary sorting key, you select the tie-breaking field for the secondary key from the Then By drop-down list box and then select between its Ascending (default) and Descending option buttons.
You repeat this procedure to sort the incoming data on up to three fields. If you don’t want the data sorted, click the Next button without selecting any fields as sorting keys.
6. Specify the field or fields on which the external data is to be sorted and then click the Next button.
Clicking Next opens the Query Wizard — Finish dialog box. This dialog box contains two option buttons that determine what happens next. You can return the data to the current or a new worksheet by leaving the Return Data to Microsoft Excel option button selected. You can view the data and/or edit the query in the Microsoft Query dialog box by clicking the View Data or Edit Query in Microsoft Query option button.
You can also save your new query by selecting the Save Query button in the Query Wizard — Finish dialog box. When you do this, Excel saves it as a separate query file (indicated by the .dqy file extension) so that you can reuse the query from any workbook file. Note that Excel automatically saves the data source definition as a file (indicated by the .dsn) when you next save the current workbook but not the query.
7. Click the Save Query button and then enter the filename for the new query file in the Save As dialog box before you click the Save button.
If you want to see the data and review your query before you bring it into your Excel worksheet, you select the View Data or Edit Query in Microsoft Query option button before you click OK in the Query Wizard — Finish dialog box. When you do this, Excel opens a Microsoft Query window similar to the one shown in Figure 2-21, where you can preview the way the acquired data will appear when you bring it into Excel. You can also edit the database query in this window.
8. (Optional) Select the View Data or Edit Query in Microsoft Query option button and then click the Finish button to open the Microsoft Query window showing the fields and records to be acquired by the query along with the filtering criteria.
After you finish viewing the data in the Microsoft Query window, you can click its Close button. Doing this opens the Import Data dialog box described in Step 9.
The Import Data dialog box is also displayed when you click the Finish button when the Return Data to Microsoft Excel option button is selected in the Query Wizard — Finish dialog box, which is described in Step 9.
9. Click the Finish button in the Query Wizard — Finish dialog box.
This action closes this dialog box and opens the Import Data dialog box. Here, you indicate where you want to put the data acquired from the external database.
By default, the Table option is selected to import the data as a simple data list, the Existing Worksheet option button is selected, and cell A1 is designated as the start of this list’s range. To import the external data as a new pivot table (see Book VII, Chapter 2), click the PivotTable Report option button. To import the data as a new pivot table and pivot chart, click the PivotChart and PivotTable Report option button. To change the starting cell, click it in the worksheet. To import the data into a brand-new worksheet in the current workbook, select the New Worksheet option button instead. To add the data to the current Data Model in your Excel worksheet, select the Add This Data to the Data Model check box.
10. Indicate how you want to view the data and where you want the data imported during the query and then click OK to start importing the data.
As soon as you click OK, Excel executes the database query and acquires the data from the external database. (See Figure 2-22.) After the program finishes importing all the records that match your filtering criteria (when the Table option is selected), the program also displays the Table Tools contextual tab and selects its sole Design tab on the Ribbon. You can use the command buttons on the Design tab to further format the table (by selecting a new table style in the Quick Styles gallery), refresh the data by performing the query again (with the Refresh command button), or summarize the data list by creating a pivot table for it (with the Summarize with Pivot command button).
Figure 2-18: Specifying the fields to acquire in the new external database query.
Figure 2-19: Specifying how the data are to be filtered in the new external database query.
Figure 2-20: Specifying how the data are to be sorted in the new external database query.
Figure 2-21: Viewing the data to be acquired by the query in the Microsoft Query window.
Figure 2-22: Worksheet after importing the data using the external database query.
After you’ve saved the data query, you can use it to connect to an external database and to acquire its data according to the query’s parameters. To do this, click the Existing Connections command button on the Get External Data button’s drop-down menu on the Data tab of the Ribbon (or press Alt+AZXX) to open the Existing Connections dialog box and there click the descriptive name given to your query before you click the Open button.
After you perform an external query using Microsoft Query that extracts the data as a table, you can use the following command buttons on the Design tab of the Table Tools contextual tab:
· Table Name text box to change the name of the data table containing the external data.
· Resize the Table command button to select a new data range for the data retrieved by the external query.
· Summarize with PivotTable command button to open the Create PivotTable dialog box that enables you to create a new pivot table using the data retrieved by the external query. (See Book VII, Chapter 2.)
· Remove Duplicates command button to remove all duplicate entries from the data table.
· Convert to Range command button to convert the data table containing the external data into a regular range of data (which removes the AutoFilter buttons and table formatting).
· Insert Slicer command button to create slicers for particular fields in the data list, that is, graphic objects that enable to you further filter your data. (See Book VII, Chapter 2.)
· Export command button to export the external data to a SharePoint list or a Visio diagram.
· Refresh command button to have Excel redo the query and automatically update the data table according to any changes made to the external database table(s).
· Properties command button to open the External Data Properties dialog box, where you can change formatting and layout options for the data retrieved in the worksheet.
· Open in Browser command button to open the external data as a table in your computer’s web browser.
· Unlink command button to sever a link between the data retrieved in the worksheet by the external query and its original data source on a server so that the data are no longer automatically kept up to date.