Dataset name: warehouse_orders
Table names: orders, warehouse
I have uploaded these tables in to bigquery and here are contents of these tables
Order table:
warehouse table:
As we can see, warehouse_id is a common column in both tables.
Orders table gives information about the orders fulfilled and warehouse table gives information about warehouse parameters.
Our goal is to calculate how much % of orders are being fulfilled by each warehouse.
Our expected end result should be something like this...
% orders fulfilled = number of orders per warehouse/total orders X 100
- No of orders fulfilled can be obtained using COUNT function from orders table
- Total number of orders can be obtained from orders table using SUM function
- Warehouse name can be obtained from warehouse table (it is already available in column name warehouse_alias
- Warehouse_id is in both tables so we can join both tables using this column
Let us join tables in the first
query so we get warehouse_id and warehouse name columns for our end result
SELECT
warehouse.warehouse_id,
warehouse.warehouse_alias
FROM
`my-sql-project-336018.warehouse_orders.warehouse` AS warehouse
LEFT JOIN
`my-sql-project-336018.warehouse_orders.orders` AS orders
ON
warehouse.warehouse_id = orders.warehouse_id
I am joining tables with LEFT JOIN...why?
Because I want to ensure that all warehouse_id in the warehouse table must be there and the corresponding warehouse_id data in orders table should be there.
Which means, if there are warehouse_id in orders table but not in
warehouse table, they are not required for our task.
Also always check that subquery in itself should run independently because we are going to use its result in the main query. We are getting below result when we run this subquery.
Next, let us find out number of orders by using COUNT function in orders table.
COUNT(orders.order_id) AS number_of_orders_fulfilled
And since using aggregation, add GROUP BY function too...
SELECT
warehouse.warehouse_id,
warehouse.warehouse_alias,
COUNT(orders.order_id) AS number_of_orders_fulfilled
FROM
`my-sql-project-336018.warehouse_orders.warehouse` AS warehouse
LEFT JOIN
`my-sql-project-336018.warehouse_orders.orders` AS orders
ON
warehouse.warehouse_id = orders.warehouse_id
GROUP BY warehouse.warehouse_id, warehouse.warehouse_alias
This is the result we got so far...
Some warehouse/s has zero fulfilled orders. So when we checked the details of the data, it says some warehouses are newly built and are not in operation yet. So we will have to remove those from the list.
In order to get % of orders fulfilled, we still need total orders fulfilled.
If we run an independent query for that, it would be something like this...
SELECT
COUNT(*) as total_orders_fulfilled
FROM
`my-sql-project-336018.warehouse_orders.orders`
And the result of this query is...
Now adding this into our main query...
SELECT
warehouse.warehouse_id,
warehouse.warehouse_alias,
COUNT(orders.order_id) AS number_of_orders_fulfilled,
ROUND(COUNT(orders.order_id)/(SELECT COUNT(*) FROM warehouse_orders.orders)*100,2) AS percent_order_fulfilled
FROM
`my-sql-project-336018.warehouse_orders.warehouse` AS warehouse
LEFT JOIN
`my-sql-project-336018.warehouse_orders.orders` AS orders
ON
warehouse.warehouse_id = orders.warehouse_id
GROUP BY warehouse.warehouse_id, warehouse.warehouse_alias
The end result we are getting is...
If we calculate a couple of samples to make sure the result is right, out of total 9999 orders, when 548 are fulfilled, it is 5.48% of fulfillment.
This is good but we want to do some cosmetic changes to this query so it looks more presentable.
First, we want to remove warehouses with zero fulfillment which we will do using HAVING function.
Second, warehouse_alias column gives only name of the warehouse but does not indicate where this warehouse is located. So, using CONCAT function, we can have warehouse name and state name together.
Third, instead of giving exact fulfillment percentages, how about giving it in a range?
For example, we can say 0 to 20% , 20 to 60% and above 60% fulfillment
We can do this using CASE statement
Below is the CONCAT function we are going to use so we get
State:Warehouse_alias format and we will call that column as warehouse_name
CONCAT(warehouse.state,':',warehouse.warehouse_alias) AS warehouse_name
And, to remove warehouses with zero orders,
HAVING COUNT(orders.order_id) > 0
Now result looks like this...
This looks much better. We will still try one last change here by adding fulfillment ranges from 0 to 20 to 60 using CASE statement...something like this...and we will call this column "fulfillment_summary"
CASE
WHEN COUNT(orders.order_id)/(SELECT COUNT(*) FROM my-sql-project-336018.warehouse_orders.orders AS orders) <= 0.20
THEN "Fulfilled 0-20% of orders"
WHEN COUNT(orders.order_id)/(SELECT COUNT(*) FROM my-sql-project-336018.warehouse_orders.orders AS orders) > 0.20 AND
COUNT(orders.order_id)/(SELECT COUNT(*) FROM my-sql-project-336018.warehouse_orders.orders AS orders) <= 0.60
THEN "Fulfilled 21-60% or orders"
ELSE
"Fulfilled more than 60% of orders"
END AS fulfillment_summary
So the full query looks like this...
SELECT
warehouse.warehouse_id,
CONCAT(warehouse.state,':',warehouse.warehouse_alias) AS warehouse_name,
COUNT(orders.order_id) AS total_orders_fulfilled,
CASE
WHEN COUNT(orders.order_id)/(SELECT COUNT(*) FROM my-sql-project-336018.warehouse_orders.orders AS orders) <= 0.20
THEN "Fulfilled 0-20% of orders"
WHEN COUNT(orders.order_id)/(SELECT COUNT(*) FROM my-sql-project-336018.warehouse_orders.orders AS orders) > 0.20 AND
COUNT(orders.order_id)/(SELECT COUNT(*) FROM my-sql-project-336018.warehouse_orders.orders AS orders) <= 0.60
THEN "Fulfilled 21-60% or orders"
ELSE
"Fulfilled more than 60% of orders"
END AS fulfillment_summary
FROM
`my-sql-project-336018.warehouse_orders.warehouse` AS warehouse
LEFT JOIN
`my-sql-project-336018.warehouse_orders.orders` AS orders
ON
warehouse.warehouse_id = orders.warehouse_id
GROUP BY warehouse.warehouse_id, warehouse_name
HAVING COUNT(orders.order_id) > 0
ORDER BY fulfillment_summary DESC
And our final result looks like this...
No comments:
Post a Comment