Friday, 17 January 2014

Self Join with an example



There are 3 different types of joins available in SQL Server:
1. Cross Join 
2. Inner Join or Join 
3. Outer Join

Outer Join is again divided into 3 types as shown below.
1. Left Outer Join or Left Join 
2. Right Outer Join or Right Join 
3. Full Outer Join or Full Join 

I strongly recommend learning the basics and types of joins before reading this article. Read the articles below before proceeding with self join.
1. Basics of Joins
2. Inner Join
3. Left Outer Join
4. Right Outer Join
5. Full Outer Join


Self join is not a different type of join. Self join means joining a table with itself. We can have an inner self join or outer self join. Let us try to understand with an example.

To set up the data for the example, use the script below to create the Employee Table and populate it with some sample data. We will be using the Employee Table to understand Self Join.

CREATE TABLE EMPLOYEE
(
[EMPLOYEEID] INT PRIMARY KEY,
[NAME] NVARCHAR(50),
[MANAGERID] INT
)
GO

INSERT INTO EMPLOYEE VALUES(101,'Mary',102)
INSERT INTO EMPLOYEE VALUES(102,'Ravi',NULL)
INSERT INTO EMPLOYEE VALUES(103,'Raj',102)
INSERT INTO EMPLOYEE VALUES(104,'Pete',103)
INSERT INTO EMPLOYEE VALUES(105,'Prasad',103)
INSERT INTO EMPLOYEE VALUES(106,'Ben',103)
GO

We use Self Join when we have a table that references itself. For example, in the Employee Table created above, the MANAGERID column references the EMPLOYEEID column, so the table is said to be referencing itself. This is the right scenario for a Self Join. Now I want to write a query that will give me the list of all Employee Names and their respective Manager Names. To achieve this I can use Self Join. In the sample data, Raj is the manager for Pete, Prasad and Ben. Ravi is the manager for Raj and Mary. Ravi does not have a manager as he is the president of the company.
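
The setup script does not declare this relationship, so nothing stops MANAGERID from holding a value that is not a valid EMPLOYEEID. If you want SQL Server to enforce it, a self-referencing foreign key is one option. The statement below is only a sketch and is not part of the original script; the constraint name FK_EMPLOYEE_MANAGER is an assumed name, and it relies on the EMPLOYEE table and sample rows created above.

```sql
-- Sketch: make MANAGERID reference EMPLOYEEID in the same table.
-- FK_EMPLOYEE_MANAGER is a hypothetical constraint name chosen for this example.
-- NULL is still allowed in MANAGERID, so Ravi's row (no manager) remains valid.
ALTER TABLE EMPLOYEE
ADD CONSTRAINT FK_EMPLOYEE_MANAGER
FOREIGN KEY (MANAGERID) REFERENCES EMPLOYEE (EMPLOYEEID)
GO
```

The self join queries in this article work the same way with or without this constraint.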


The query below is an example of Self Join. Both E1 and E2 refer to the same Employee Table. In this query we are joining the Employee Table with itself. 


SELECT E1.[NAME],E2.[NAME] AS [MANAGER NAME]
FROM EMPLOYEE E1 
INNER JOIN EMPLOYEE E2 
ON E2.EMPLOYEEID =E1.MANAGERID

If we run the above query we get only 5 of the 6 rows, as shown below. The row for Ravi is missing.

Inner Self Join

This is because Ravi does not have a Manager. MANAGERID column for Ravi is NULL. If we want to get all the rows then we can use LEFT OUTER JOIN as shown below.


SELECT E1.[NAME],E2.[NAME] AS [MANAGER NAME]
FROM EMPLOYEE E1 
LEFT OUTER JOIN EMPLOYEE E2 
ON E2.EMPLOYEEID =E1.MANAGERID

If we execute the above query we get all the rows, including the row that has a NULL value in the MANAGERID column. The results are shown below. The MANAGER NAME for the 2nd record is NULL, as Ravi does not have a Manager.
Left Outer Self Join

Let us now slightly modify the above query using COALESCE as shown below. Read COALESCE function in SQL Server to understand COALESCE in greater detail.


SELECT E1.[NAME],COALESCE(E2.[NAME],'No Manager') AS [MANAGER NAME]
FROM EMPLOYEE E1 
LEFT JOIN EMPLOYEE E2 
ON E2.EMPLOYEEID =E1.MANAGERID

If we execute the above query the output will be as shown in the image below. This is how COALESCE can be used.

Left Outer Self Join with COALESCE
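
As a small standalone illustration of what COALESCE does in the query above: it returns the first argument that is not NULL. The sketch below uses literal values instead of the Employee Table, and the column aliases are made up for this example.

```sql
-- COALESCE returns its first non-NULL argument.
-- The aliases below are illustrative names, not part of the article's schema.
SELECT COALESCE(NULL, 'No Manager')   AS [FIRST ARGUMENT NULL],     -- 'No Manager'
       COALESCE('Ravi', 'No Manager') AS [FIRST ARGUMENT NOT NULL]  -- 'Ravi'
```

This is why employees who have a manager show the manager's name, while Ravi's row shows 'No Manager'.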
