SQL - The Shortest Route For Beginners (2015)

Chapter 6. Query Data From Multiple Tables

Referential Integrity in Relational Databases

All modern DBMSs are also called Relational Database Management Systems (RDBMS), because they all follow a relational database model, which ensures that all of the data in the database is valid according to a set of rules. Referential integrity is also an elemental data integrity rule of the relational database model. Referential integrity defines the relationships among different columns and tables in a relational database. It is called referential integrity because the values in one column or set of columns in a table refers to or must match the values in a related column or set of columns in other tables. For example, the DEPARTMENTS and EMPLOYEES tables in the following figure track different, but related information. The DEPARTMENTS table stores information (such as id, name etc.) about different departments. Each department’s record has a unique department ID. The EMPLOYEES table stores employees information along with the department they belong to. EMPLOYEES has a column (DEPARTMENT_ID) to indicate the department of each employee. As you can see, the department ID column in EMPLOYEES refers to the department ID column in DEPARTMENTS. Referential integrity simply makes sure that at all times, whenever a new employee’s record is created, it must have a department ID that matches a department ID in the DEPARTMENTS table.

Referential Integrity in Relational Databases

What is a Join?

From the previous discussion it is evident that rows in one table may be joined to rows in another table according to corresponding column values existing in both tables. In database terminology, these columns are referred to as Primary and Foreign key columns, that you already went through in chapter 1.

Whenever you need data from more than one table in the database, you make use of JOIN. Join is one of the most important feature in SQL which has the following main types:

1. Equijoin

2. Inner Join

3. Outer Join

Points to Remember:

· Databases performs a join whenever multiple tables appear in the FROM clause of the query.

· The select list of the query can select any columns from any of these tables. The columns in the join conditions need not also appear in the select list.

· If any two of these tables have a column name in common, then you must qualify all references to these columns (by prefixing table names) throughout the query to avoid ambiguity.

· To join tables together, you need a minimum of the number of join conditions summarized as the number of tables minus one. For example, to join four tables, a minimum of three joins would be required.

· As far as SQL is concerned, there is no limit to the number of tables that may be specified in a SELECT statement for creating the join. However, the rules to follow are that all the tables must be listed in the FROM clause, and their relationship is defined in the WHERE clause. Refer to your DBMS documentation which might impose some restrictions on the maximum number of tables per join.

· A WHERE clause that contains a join condition can also contain other conditions that refer to columns of only one table. These conditions can further restrict the rows returned by the join query.

The EQUIJOIN

Equijoin is one of the most commonly used join with a join condition containing an equality operator. It is the most simple join that is used either in the FROM clause or in the WHERE clause of a SELECT statement to show data from multiple related tables. Here is its syntax:

NOTE:Using complete table names with qualifying column names can be very time consuming, especially with lengthy table names. You can overcome this problem by using table aliases like this:

SELECT e.first_name, d.department_name

FROM employees e, departments d;

Syntax :

SELECT table.column, table.column …

FROM table1, table2

WHERE table1.column = table2.column;

The following statement is similar to all the

previous SELECT statements except for two things. First of all, this statement is now carrying two columns from Employees table and one column from Departments table, as compared to previous examples where all the columns were listed from a single table. Secondly, the FROM clause list two tables. These are the tables that are being joined in the query using the WHERE clause which instructs the DBMS to match department ID in the Employees table with department ID in the Departments table. One more thing that is different from previous examples is the use of table names in the SELECT and WHERE clauses. This expression is called fully qualified column names and is used to inform DBMS which department id you are referring to. It is always mandatory to use such qualifiers to avoid ambiguity, which you averted in the following statement due to the presence of the department id column in both tables.

SQL Statement :

SELECT employees.first_name, employees.department_id, departments.department_name

FROM employees, departments

WHERE employees.department_id=departments.department_id;

Output :

The INNER JOIN

The INNER JOIN is similar to the EQUIJOIN and uses the same equality operator to establish a join between two tables. The difference is that it uses the INNER JOIN keyword in its syntax which explicitly specifies the join. The keyword is used in the FROM clause which makes it different from the previous example. Another difference between the two joins is the use of a special ON clause which is used instead of WHERE to pass the same condition as specified in the WHERE clause previously. You can use both the simple (EQUIJOIN) and the standard (INNER JOIN) syntax formats to produce the same result. However, ANSI SQL standard prefers the INNER JOIN over the simple EQUIJOIN. Depending on your platform, you can use one of the following formats:

Syntax :

SELECT table.column, table.column

FROM table1 INNER JOIN table2

ON table1.column = table2.column;

SELECT table.column, table.column

FROM table1 JOIN table2

ON table1.column = table2.column;

The INNER JOIN also returns all rows from the specified tables as long as there is a match between the columns. If there are some rows in the DEPARTMENTS table that do not have matches in EMPLOYEES, these departments will not be fetched. For example, the DEPARTMENTS table has some more department records (120 through 270) that do not have an associated record in the EMPLOYEES table, so these records are eliminated by the INNER JOIN query. See OUTER JOIN in the next section to incorporate these missing departments in the result.

SQL Statement :

SELECT D.department_id “ID”, D.department_name “Name”, E.first_name “Employee”

FROM departments D INNER JOIN Employees E ON D.department_id=E.department_id

ORDER BY d.department_id;

Output :

The OUTER JOIN

In the previous two exercises, only those rows were returned by DBMS which satisfied the join condition (e.department_id=d.department_id), and didn’t display departments numbering from 120 through 270 because, there were no employees enrolled in these departments. In order to get these missing rows, you are provided with OUTER JOIN operator, which is used in the join condition. Under Oracle implementation, the OUTER JOIN is represented by a plus sign (+) enclosed in parentheses, and is placed on the side of the join that is deficient in information. The operator creates that may NULL rows to which rows from the non-deficient table can be joined, as done in the following example in which all the missing departments are fetched in the output with null values in the Employee column.

Syntax :

SELECT table.column, table.column …

FROM table1, table2

ON table1.column = table2.column(+);

SELECT table.column, table.column

FROM table1, table2

ON table1.column(+) = table2.column;

SQL Statement :

SELECT D.department_id “ID”, D.department_name “Name”, E.first_name “Employee”

FROM departments D INNER JOIN Employees E ON D.department_id=E.department_id(+)

ORDER BY d.department_id;

Output :

OUTER JOIN (continued)

Some implementations use LEFT JOIN and RIGHT JOIN keywords to create outer joins. The query with the LEFT JOIN get all rows from the left table (table1), with the matching rows in the right table (table2). When there is no match, the right side result displays NULL. The RIGHT JOIN is just the opposite, which displays NULL with no matching rows in the left table. In some implementations, these two are called LEFT OUTER JOIN and RIGHT OUTER JOIN, respectively.

NOTE: In SQLite, there is no RIGHT OUTER JOIN. If you do need this join, simply reverse the order of the tables defined in the FROM and WHERE clauses.

Here is the syntax to use OUTER JOIN for non-Oracle platforms:

Syntax :

SELECT table.column, table.column …

FROM table1 LEFT JOIN/RIGHT JOIN table2

ON table1.column = table2.column;

To produce the same output as illustrated on the previous page, you will structure the statement like this:

SQL Statement :

SELECT D.department_id, E.employee_id

FROM departments LEFT OUTER JOIN Employees

ON department.department_id = employees.employee_id;

JOINS vs. SUBQUERIES

The data that you retrieve using joins can also be retrieved through subqueries. The question arises that which one is better? From technical point of view you must use joins when you can retrieve the same information using either a query that joins multiple tables or a query that has one or more subqueries. In some situations you might use nested subqueries, spreading multiple levels, to get the desired information. However, as a general rule, nested queries or queries with nested subqueries execute slower than queries that join multiple table. So, you should probably not use nested queries unless you cannot retrieve the desired result using joins.

Test Your Skill

1. Which integrity constraint would you enforce to create a joint statement:

a. Entity

b. Referential

c. Column

d. User Defined

2. The columns in the join condition must also exist in the SELECT list. Yes / No

3. You can select a column from a table not used in the FROM clause. Yes / No

4. What would you do if two tables in a join query have a common column name.

5. How many conditions will you add to a query which uses five tables.

6. Create a query which displays name of employees with their department names as follows:

7. Write a query to display the name of employees and department names for John.

8. Form a statement which shows the first name, department name, and city (from the Locations table) of all employees who earn a commission.

9. Form a statement by joining Countries and Locations tables to display the following output. Use the OUTERJOIN operator to find missing countries in the Locations table.