Using INNER JOIN versus CROSS JOIN in SQL
Using the right JOIN statement in your SQL code is critical for accurate results. While it might be easy to identify a bad data set when only a few results are returned, it’s not so easy when you have thousands of results. The syntax for INNER JOIN and CROSS JOIN are similar, but both statements produce radically different results. It’s important to understand the differences to create accurate reports and user results from your database.
The INNER JOIN Statement
We’ll start with the INNER JOIN statement because it’s the easiest to understand and the most common. An INNER JOIN returns all rows from joined tables where there is a match. Let’s use the customer and order example. Each customer can have multiple orders, which makes the relationship a one-to-many. In a one-to-many relationship, all rows are returned from the order table where there is a link with a customer record.
Let’s take a look at a query and results. First, we’ll write an INNER JOIN statement.
SELECT c.FirstName, c.LastName, o.OrderId FROM customer c INNER JOIN order o on c.CustomerId = o.CustomerId
And the results:
As you can see from the results, you have 10 orders that match a customer. One thing to note about an INNER JOIN statement is that if a customer does not have an order, you won’t see the customer in the result set at all. There must be a match in both tables for an INNER JOIN to return a record. You’ll use INNER JOINs in several of your queries when you write SQL code for applications and reports.
In most cases, you want to filter records using the WHERE clause. Let’s add a WHERE clause to the statement and only return records for a customer with an ID of 6. We now can see all orders for this customer, which is only 1 record according to the SQL output.
The CROSS JOIN Statement
A CROSS JOIN statement is much more complex, but once you understand the result set, you’ll know when you need one. A CROSS JOIN takes all records from one table and all records from a second table and gives you a Cartesian product. In case you’re not familiar with a Cartesian product, it’s the multiplication of one set and a second set.
Let’s get a better understanding of a Cartesian product using our customer and order tables. You can calculate the number of records in your result set by knowing the number of records in each table. The results are always the number of records in the customer table multiplied by the number of records in the order table.
For instance, if we have 5 customers and 10 orders, the result set returned by the SQL Server contains 50 records. You should also note that there is no relationship between the records like an INNER JOIN. Remember an INNER JOIN links based on a primary-foreign key relationship. The Cartesian product links all records from one table with all records from the second regardless of the relationship.
Let’s look at a real-world sample using the same customer and order table.
Here is the SQL command we’ll use. Notice that the only change is to the JOIN statement – it was previously INNER JOIN and now it’s CROSS JOIN. Since we’re not using relationships, we also remove the CustomerId link statement.
Here are the results:
Our record set returned 70 records, which is much more than the INNER JOIN statement. You’ll notice a pattern in the way the records are returned. Notice that for each customer, the order IDs are displayed, and for each order ID, SQL lists a customer.
Just like the INNER JOIN statement, you can also use the WHERE clause with the CROSS JOIN statement. Let’s use the same WHERE filter for the CROSS JOIN statement that we used with the INNER JOIN statement.
What do we get out of this result set? We know that there is one customer with the ID of 6, and we know that we have orders with IDs between 5 and 14, so a total of 10 orders in the order table. While this type of statement isn’t used often, there are times when you’ll need to use this type of JOIN statement for reports and more complex queries.
When to Use CROSS JOIN versus INNER JOIN
For the most part, you’ll use an INNER JOIN much more than a CROSS JOIN. Think about the logic of the two statements. If a customer logs in to an application and needs to view orders, you’d use the INNER JOIN statement. Most people wonder why you would ever use a CROSS JOIN.
A CROSS JOIN is used when no simple INNER JOIN option exists. You might also be surprised that CROSS JOIN runs very efficiently. If you find that an INNER JOIN becomes too convoluted and slow, it might be time for a CROSS JOIN statement. Usually, CROSS JOIN statements are used in reporting.
CROSS JOIN statements are used in reports for retrieving averages and projections. They are mostly used to retrieve total amounts for each customer or order (for example) relative to the number of records in a specific table.
As you create applications or even just SQL procedures, you’ll need to understand both of these statements. While the INNER JOIN is the most common and understood statement, if you’re responsible for any reports, you’ll run into a CROSS JOIN at some point in your career.