As a data analyst, SQL (Structured Query Language) is an essential tool for managing and analyzing data. SQL allows data analysts to retrieve, manipulate, and transform data stored in databases. In this project, I will explore some common SQL scenarios that data analysts frequently encounter in their work. By understanding these scenarios, data analysts can effectively utilize SQL to extract valuable insights from large datasets and make data-driven decisions.
Sample Database: AdventureWorks2017.bak
Tool: Azure Data Studio was connected to Microsoft SQL Server by launching the application, selecting "New Connection" to open the Connection pane, and entering the necessary server details and authentication credentials to establish the connection
These scenarios can range from simple data retrieval to complex data transformations and aggregations. Here are 30 scenario SQL cases that you may come across in your job.
Problem Statement: Order the 'country names' in ascending order from A-Z
select * from Person.CountryRegion order by Name ASC
Problem Statement: Order the 'country names' in descending order from Z-A
select * from Person.CountryRegion order by Name DESC
Problem Statement: List the "comments with more words" to get more insight about a product
select ProductID, ReviewerName, Rating, Comments from Production.ProductReview order by len(Comments) DESC
select TransactionID, ProductID, TransactionDate, TransactionType from Production.TransactionHistory
-- 113,443 rows returned * 10% = 11,345
Problem Statement: Display only the first 10% rows
select Top 10 percent TransactionID, ProductID, TransactionDate, TransactionType from Production.TransactionHistory
-- 11,345 rows returned
select * from Sales.CreditCard
select CardType from Sales.CreditCard
Problem Statement: Find the total Card Types without any duplication
select distinct CardType from Sales.CreditCard
select ProductID, ReviewerName, Rating from Production.ProductReview
Problem Statement: Display the output to be more understandable (1=Poor, 2=Fair, 3=Good, 4=Very Good, 5=Excellent)
select ProductID, ReviewerName, case Rating when 1 then 'Poor' when 2 then 'Fair' when 3 then 'Good' when 4 then 'Very Good' when 5 then 'Excellent' end as Rating from Production.ProductReview
select BillOfMaterialsID, ProductAssemblyID, StartDate from Production.BillOfMaterials
Problem Statement: Instead of NULL values we need to have value 0 for 'ProductAssemblyID' column "without changing any value" in the table
select BillOfMaterialsID, ISNULL(ProductAssemblyID,0) AS ProductAssemblyID, StartDate from Production.BillOfMaterials
select ProductModelID, ProductDescriptionID from Production.ProductModelProductDescriptionCulture
Problem Statement Name columns and table name with an Alias
select ProductModelID AS ID, ProductDescriptionID AS DescID from Production.ProductModelProductDescriptionCulture AS Table1
select * from Person.AddressType
Problem Statement: Filter out rows that has the name 'Archive' from the display
select * from Person.AddressType where NOT Name = 'Archive'
select * from Purchasing.PurchaseOrderDetail
Problem Statement: Find all purchase order for the ProductID = 512 that costs less than $35 unit price
select * from Purchasing.PurchaseOrderDetail where ProductID = 512 AND UnitPrice <35
select Name, ProductNumber, ListPrice from Production.Product
Problem Statement: Find the name of products that has a list price in the range of $10-$20
select Name, ProductNumber, ListPrice from Production.Product where ListPrice BETWEEN 10 AND 20
select * from Production.WorkOrder
Problem Statement:
- Find records for Products with ProductID = 995
- Find records for Products with ProductID = 995 that has more than 500 orders
- Find records for Products with ProductID = 995 that has more than 500 orders and received before May 3, 2013
select * from Production.WorkOrder where ProductID = 995
select * from Production.WorkOrder where ProductID = 995 AND OrderQty > 500
select * from Production.WorkOrder where ProductID = 995 AND OrderQty > 500 AND StartDate < '2013-05-03'
select Name, ListPrice from Production.Product
Problem Statement: Find the name of products that has these 3 ListPrice values: 106.50, 1003.91, 333.42
select Name, ListPrice from Production.Product where ListPrice IN (106.50, 1003.91, 333.42)
select * from Person.CountryRegion
Problem Statement: Need to find the name of the countries that start with the letter 'V'
select * from Person.CountryRegion where Name like 'V%'
Problem Statement: Need to find the name of the countries that start with the letter 'Vi'
select * from Person.CountryRegion where Name like 'Vi%'
select * from Production.WorkOrder
Problem Statement: Find Work Orders that has a Scrap Reason
select * from Production.WorkOrder where ScrapReasonID IS NOT NULL
select * from Production.WorkOrder
Problem Statement: Find the Name of Products having more than 20,000 Order Quantity
select ProductID from Production.WorkOrder where OrderQty > 20000
-- Query 1
-- Name of Products from another Table
select ProductID, Name from Production.Product
-- Query 2
select ProductID, Name from Production.Product where ProductID = ANY( select ProductID from Production.WorkOrder where OrderQty > 20000 )
-- Query 1 inside Query 2
select Name, ProductNumber from Production.Product
Problem Statement: The "ProductNumber" has Characters in uppercase.
So, Covert characters in "Name" column also in uppercase
select UPPER(Name) AS Name, ProductNumber from Production.Product
-- Converting to "ProductNumber" lowercase characters
select Name, LOWER(ProductNumber) AS ProductNumber from Production.Product
select Name, ProductNumber from Production.Product
Problem Statement: Get(extract) only the 2 characters from the "ProductName" for each "Name" column
select Name, LEFT(ProductNumber,2) AS ProductNumber from Production.Product
-- Left function
Problem Statement: Get(extract) only the 4 numbers from the "ProductName" for each "Name" column
select Name, RIGHT(ProductNumber,4) AS ProductNumber from Production.Product
-- Right function
select WorkOrderID, ProductID from Production.WorkOrder
Problem Statement: Find the "Product Name" of each ProductID along the with the WorkOrderID
select ProductID, Name from Production.Product
-- Getting the Product Name
select A.WorkOrderID, A.ProductID, B.Name from Production.WorkOrder AS A INNER JOIN Production.Product AS B ON A.ProductID = B.ProductID
-- Innner Join
select ProductID, Name from Production.Product
Problem Statement: Find the Sales Orders for all ProductID's along with the ProductID and Name
select ProductID, SalesOrderID from Sales.SalesOrderDetail
-- Sales Order details
select A.ProductID, A.Name, B.SalesOrderID from Production.Product AS A LEFT JOIN Sales.SalesOrderDetail AS B ON A.ProductID = B.ProductID
-- Left Join
select ProductID, Name from Production.Product
-- First Table
select ProductID, Comments from Production.ProductReview
-- Second Table
Problem Statement: Find the reviews of products along with the product name (Right Join)
select B.ProductID, B.Comments, A.Name from Production.Product AS A RIGHT JOIN Production.ProductReview AS B ON A.ProductID = B.ProductID
select ProductID, Name, ProductSubcategoryID from Production.Product
-- Product Table
Problem Statement: Find the Sub-category name to which each Product belongs, and also find if any Sub-category name is not assigned to a Product name
select ProductSubcategoryID, Name from Production.ProductSubcategory
-- Sub-category Table
select A.ProductID, A.Name, A.ProductSubcategoryID, B.Name from Production.Product AS A FULL JOIN Production.ProductSubcategory AS B ON A.ProductSubcategoryID = B.ProductSubcategoryID
-- Full Join
select COUNT(ProductNumber) from Production.Product
Problem Statement: Check how many product Numbers are there without any duplication
select distinct COUNT(ProductNumber) from Production.Product
select SalesOrderID, ProductID, LineTotal, ModifiedDate from Sales.SalesOrderDetail
Problem Statement: Find the Total revenue from the Product 777 sold in the year 2011
select SalesOrderID, ProductID, LineTotal, ModifiedDate from Sales.SalesOrderDetail where ProductID = 777 and ModifiedDate BETWEEN '2011-01-01' AND '2011-12-31'
select SUM(LineTotal) from Sales.SalesOrderDetail where ProductID = 777 and ModifiedDate BETWEEN '2011-01-01' AND '2011-12-31'
-- Total Revenue
select SalesOrderID, ProductID, LineTotal, ModifiedDate from Sales.SalesOrderDetail
Problem Statement: Find the Average Price on which the Product 777 got sold in 2011
select SalesOrderID, ProductID, LineTotal, ModifiedDate from Sales.SalesOrderDetail where ProductID = 777 and ModifiedDate BETWEEN '2011-01-01' AND '2011-12-31'
select AVG(LineTotal) from Sales.SalesOrderDetail where ProductID = 777 and ModifiedDate BETWEEN '2011-01-01' AND '2011-12-31'
-- Average
Problem Statement: Check how many product Numbers are there without any duplication
select * from Production.ProductInventory where ProductID = 944
select MIN(Quantity) from Production.ProductInventory where ProductID = 944
Problem Statement: Find the largest quantity in stock for the ProductID 747
select * from Production.ProductInventory where ProductID = 747
select MAX(Quantity) from Production.ProductInventory where ProductID = 747
select StateProvinceID, StateProvinceCode, Name from Person.StateProvince
Problem Statement: Display State Code and State Name in this format: State Code-State Name. Example: for State Code-AK and State Name- Alaska, need to display as 'Ak-Alaska'
select StateProvinceID, CONCAT(StateProvinceCode, '-', Name) from Person.StateProvince
select StateProvinceID, CONCAT(StateProvinceCode, '-', Name) AS State from Person.StateProvince
-- Using 'State' as the column name
select ProductID, UnitPrice, OrderQty, LineTotal, RejectedQty from Purchasing.PurchaseOrderDetail
Problem Statement: Find the Amount lost due to the rejected quantity
select ProductID, UnitPrice, OrderQty, LineTotal, RejectedQty, (UnitPrice * RejectedQty) AS LossAmount from Purchasing.PurchaseOrderDetail
select ProductID, Quantity from Production.ProductInventory
Problem Statement: Find the lowest Quantity for each ProductID in the Inventory
select ProductID, MIN(Quantity) AS MinQuantity from Production.ProductInventory GROUP BY ProductID
select ProductID, Quantity, LocationID from Production.ProductInventory
Problem Statement: If the location 'LocationID' is less than 3 for a ProductID, then we need to find the lowest stock quantity 'Quantity' for only that ProductID
select ProductID, MIN(Quantity) AS MinCount, COUNT(LocationID) AS Locations from Production.ProductInventory GROUP BY ProductID HAVING COUNT(LocationID) < 3
SQL is a powerful tool for data analysts to manage and analyze data. By mastering SQL and understanding various scenarios, data analysts can efficiently retrieve, manipulate, and transform data to derive valuable insights. Whether it is querying databases, aggregating data, or performing complex calculations, SQL provides the necessary capabilities to handle diverse data analysis tasks.