SQL Quick Revisit

Common Relational DB SQL concepts for quick revisit, recapped mainly from w3school and go/Bard, not every DB supports the listed syntax, so please use them accordingly.

SQL Clauses Layout Order

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
SELECT column_name(s)
FROM table1
JOIN table2 ON table1.column = table2.column
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);

-- you can use multiple tables, it is the same as inner join
SELECT column_name(s)
FROM table1, table2
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);

Distinct

1
2
3
4
5
6
7
SELECT COUNT(DISTINCT Name) FROM Customers;

-- Distinct applies on the combination in SELECT, not just the first column
SELECT DISTINCT collectors.first_name, collectors.last_name
FROM collectors
JOIN sales
ON collectors.id = sales.collector_id;

Operators in Where

WHERE operators:

  • -, >, <, >=, <=, !=(<>)
  • BETWEEN
  • LIKE
  • IN

Quotes for Text

1
2
3
4
5
6
7
-- Single quote for text value
SELECT * FROM Customers
WHERE Country='Mexico';

-- No quote around number value
SELECT * FROM Customers
WHERE CustomerID=1;

Using Parenthesis

1
2
SELECT * FROM Customers
WHERE Country='Germany' AND (City='Berlin' OR City='München');

Min and Max

1
2
SELECT MAX(Price) AS LargestPrice
FROM Products;

Count, Avg and Sum

1
2
SELECT COUNT(ProductID)
FROM Products;

Order By

1
2
3
4
5
6
7
SELECT * FROM Customers
ORDER BY Country ASC, CustomerName DESC;

-- order by field can be irrelevant from select
SELECT Address
FROM Customers
ORDER BY CustomerName ASC

NULL

Not every DB supports NULL for empty values, be aware.

1
2
3
4
5
6
7
SELECT CustomerName, ContactName, Address
FROM Customers
WHERE Address IS NULL;

SELECT CustomerName, ContactName, Address
FROM Customers
WHERE Address IS NOT NULL;

Limit

1
2
SELECT * FROM Customers
LIMIT 3;

In

The IN operator is a shorthand for multiple OR conditions.

1
2
3
4
5
6
SELECT * FROM Customers
WHERE Country NOT IN ('Germany', 'France', 'UK');

-- subquery
SELECT * FROM Customers
WHERE Country IN (SELECT Country FROM Suppliers);

Alias

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
-- New column with a Literal value
SELECT 'Customer' AS Cus, ContactName, City, Country
FROM Customers

-- CONCAT is a SQL function
SELECT CustomerName,
CONCAT(Address,', ',PostalCode,', ',City,', ',Country) AS Address
FROM Customers;

-- you can omit AS for table alias
SELECT o.OrderID, o.OrderDate, c.CustomerName
FROM Customers AS c, Orders AS o
WHERE c.CustomerName='Around the Horn' AND c.CustomerID=o.CustomerID;
-- the same as
SELECT o.OrderID, o.OrderDate, c.CustomerName
FROM Customers c, Orders o
WHERE c.CustomerName='Around the Horn' AND c.CustomerID=o.CustomerID;

Joins

  • (INNER) JOIN: Returns records that have matching values in all tables.
  • LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table.
  • RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table.
  • FULL (OUTER) JOIN: Returns all records when there is a match in either left or right table.

See explanation diagrams here.

Inner Join

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
SELECT column_name(s)
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;
-- the same as JOIN
SELECT column_name(s)
FROM table1
JOIN table2
ON table1.column_name = table2.column_name;

-- Inner join 3 tables
SELECT Orders.OrderID, Customers.CustomerName, Shippers.ShipperName
FROM
(
(
Orders
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID
)
INNER JOIN Shippers
ON Orders.ShipperID = Shippers.ShipperID
);

You can use USING to subsitute ON if the columns name are the same, please see here

1
2
3
4
SELECT column_name(s)
FROM table1
INNER JOIN table2
USING (id, score);

Left Join

1
2
3
4
5
6
7
8
9
10
11
12
-- Show both CustomerID can better explain the left join
SELECT
Customers.CustomerName, Customers.CustomerID as id1,
Orders.CustomerID as id2, Orders.OrderID
FROM
Customers
LEFT JOIN
Orders
ON
Customers.CustomerID = Orders.CustomerID
ORDER BY
Customers.CustomerName;

The LEFT JOIN keyword returns all records from the left table (Customers), even if there are no matches in the right table (Orders) and if there is no that CustomerID in Orders table, you get null for the Orders selected fields.

Right Join

1
2
3
4
5
6
7
8
9
SELECT
Orders.OrderID, Orders.EmployeeID as id1,
Employees.EmployeeID as id2,
Employees.LastName,
Employees.FirstName
FROM Orders
RIGHT JOIN Employees
ON Orders.EmployeeID = Employees.EmployeeID
ORDER BY Orders.OrderID;

The RIGHT JOIN keyword returns all records from the right table (Employees), even if there are no matches in the left table (Orders).

Full Join

1
2
3
4
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
FULL OUTER JOIN Orders ON Customers.CustomerID=Orders.CustomerID
ORDER BY Customers.CustomerName;

The FULL OUTER JOIN keyword returns all matching records from both tables whether the other table matches or not. So, if there are rows in “Customers” that do not have matches in “Orders”, or if there are rows in “Orders” that do not have matches in “Customers”, those rows will be listed as well.

Self Join

1
2
3
4
5
6
7
8
9
10
11
12
13
-- there will be duplicates introduced by switch CustomerName in A and B tables
SELECT
A.CustomerName AS CustomerName1,
B.CustomerName AS CustomerName2,
A.City
FROM
Customers A, Customers B -- can omit AS in table alias
WHERE
A.CustomerID <> B.CustomerID
AND
A.City = B.City
ORDER BY
A.City;

Union

The UNION operator is used to combine the result-set of two or more SELECT statements.

  • Every SELECT statement within UNION must have the same number of columns.
  • The columns must also have similar data types.
  • The columns in every SELECT statement must also be in the same order.
1
2
3
4
5
6
7
8
-- 'Customer' and 'Supplier' are literal value under the 'Type' column
-- here is used to distinguish the rows are from which select
SELECT 'Customer' AS Type, ContactName, City, Country
FROM Customers
-- UNION ALL -- can have duplicates
UNION -- union distinct row only
SELECT 'Supplier', ContactName, City, Country
FROM Suppliers;

Group By

The GROUP BY statement is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result-set by one or more columns.

1
2
3
4
5
6
7
8
9
10
11
12
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
ORDER BY column_name(s);

-- Example
SELECT Shippers.ShipperName, COUNT(Orders.OrderID) AS NumberOfOrders
FROM Orders
LEFT JOIN Shippers
ON Orders.ShipperID = Shippers.ShipperID
GROUP BY ShipperName;

Having

The HAVING clause was added to SQL because the WHERE keyword cannot be used with aggregate functions.

1
2
3
4
5
6
7
8
9
10
11
12
13
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);

-- Example
SELECT COUNT(CustomerID) as CustomerNumber, Country
FROM Customers
GROUP BY Country
HAVING CustomerNumber > 5
ORDER BY CustomerNumber;

Exist

The EXISTS operator is used to test for the existence of any record in a subquery.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
SELECT column_name(s)
FROM table_name
WHERE EXISTS
(
SELECT column_name
FROM table_name
WHERE condition
);

-- Example
SELECT SupplierName
FROM Suppliers
WHERE EXISTS
(
SELECT ProductName
FROM Products
-- Suppliers is from outer layer
WHERE Products.SupplierID = Suppliers.supplierID
AND Price < 20
);

Any and ALL

The ANY and ALL operators allow you to perform a comparison between a single column value and a range of other values.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
SELECT column_name(s)
FROM table_name
WHERE column_name operator ANY
(SELECT column_name
FROM table_name
WHERE condition);

-- Example
SELECT ProductName
FROM Products
WHERE ProductID = ANY -- can be rewritten as IN operator
(SELECT ProductID
FROM OrderDetails
WHERE Quantity = 10);

Subquery

This is how you can boost your query, basically you can use subquery in below clauses:

  • SELECT
  • FROM
  • WHERE
  • HAVING
  • JOIN

Examples:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
-- SELECT clause
SELECT
first_name,
last_name,
( -- inner query runs for every row of collectors table
SELECT count(*) AS paintings
FROM sales
WHERE collectors.id = sales.collector_id -- correlated with outer query
)
FROM collectors;

-- WHERE clause
SELECT customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
WHERE order_date > '2023-03-08'
);

-- HAVING clause
SELECT product_category, average_price
FROM products
GROUP BY product_category
-- no where as where cannot use aggragate functions
HAVING average_price > (
SELECT AVG(price)
FROM products
WHERE product_category = 'Electronics'
);

-- FROM clause
SELECT product_name, quantity
FROM (
/* Generate a new table from subquery */
SELECT product_name, SUM(quantity) AS quantity
FROM order_items
GROUP BY product_name
);

-- with JOIN, the example no special meaning.
SELECT
res.region,res.availability_zone,
res.machine_type,
res.billing_account_id,
res.reservation_code
FROM
(
(
SELECT DISTINCT r.earmark.region,
r.earmark.availability_zone,
r.earmark.machine_type,
r.earmark.billing_account_id,
r.earmark.reservation_code
FROM
Reservations AS r
WHERE
r.ProvenanceMetadata.version_state = 'LATEST' AND
r.earmark.reservation_code!=""
) AS res
JOIN
(
SELECT distinct p.PlacementGroup.region,
p.PlacementGroup.availability_zone,
p.PlacementGroup.machine_type
FROM
PlacementGroupCapacity AS p
WHERE
p.ProvenanceMetadata.version_state = 'LATEST' AND
p.PlacementGroup.pg_pool=""
) AS pgc
ON
res.region=pgc.region AND
res.availability_zone=pgc.availability_zone AND
res.machine_type=pgc.machine_type
);
0%