Working with numbers in a dataset often requires cumulative (running) values, so that totals to date can be read straight from the data without any further calculation.
This is where the cumulative functions in pandas, a Python library, come in.
cumsum(), cummax(), cummin(), cumprod(), cumcount()
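As a quick illustration of what the first four functions return, here is a minimal sketch on a small made-up Series (the values are invented for this example and are not from the datasets discussed in this post). cumcount() is a groupby method, so it is not shown here.

```python
import pandas as pd

# Hypothetical values, chosen only to show how each function behaves.
s = pd.Series([2, 1, 3, 4])

running_total = s.cumsum()     # 2, 3, 6, 10
running_max = s.cummax()       # 2, 2, 3, 4
running_min = s.cummin()       # 2, 1, 1, 1
running_product = s.cumprod()  # 2, 2, 6, 24

print(pd.DataFrame({
    "value": s,
    "cumsum": running_total,
    "cummax": running_max,
    "cummin": running_min,
    "cumprod": running_product,
}))
```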
Let's say that we have data on the sales persons in a company and their sales quantities. There are 3 sales persons, and every time one of them makes a sale, it is captured as a row in this dataset.
Now we create a new column, cumulative_sales, which gives the running total of sales after every new sale, and another column, cs_person, which gives the running total achieved by each individual sales person after each of their sales.
Any time we want to see the total sales so far, we just look at the value in the last row of the cumulative_sales column (33).
And the quantity sold so far by each sales person can be found by looking at that person's latest row in the cs_person column.
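The two columns described above could be built roughly like this. This is only a sketch: the sales person names, the column names sales_person and sales_qty, and the quantities are assumptions made up for illustration (picked so that the grand total comes to 33), not the original dataset.

```python
import pandas as pd

# Hypothetical data: names and quantities are invented for illustration.
df = pd.DataFrame({
    "sales_person": ["A", "B", "C", "A", "B", "C", "A"],
    "sales_qty":    [5, 3, 4, 6, 2, 7, 6],
})

# Running total over all sales, in row order.
df["cumulative_sales"] = df["sales_qty"].cumsum()

# Running total within each sales person's own rows.
df["cs_person"] = df.groupby("sales_person")["sales_qty"].cumsum()

print(df)
```

The last row of cumulative_sales holds the overall total, and the last row for each sales person in cs_person holds that person's total.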
There is another way we can use these functions, too. Let's say we have a dataset of employees at a Toyota dealership who have been selling Camry and Corolla cars.
The dataframe we are given shows the employee and the model sold for every sale made. Each row represents one sales transaction done by that employee.
Something like this...
Now our requirement is to show, at every row (sale), how many cars each employee has sold so far.
Using cumcount() + 1 (cumcount() starts counting from 0), the new column count_by_employee shows, whenever an employee sells a model, the total number of sales made by that employee to date.
For example, when Shah sold a Camry, we know that this is Shah's 2nd sale for this month (1 Corolla and 1 Camry).
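A minimal sketch of this step is shown here. The rows and the column names employee and model are assumptions invented for illustration, arranged so that Shah's Camry sale is his 2nd sale, as in the example above.

```python
import pandas as pd

# Hypothetical transactions: one row per sale, in the order they happened.
df = pd.DataFrame({
    "employee": ["Patel", "Shah", "Patel", "Shah", "Patel"],
    "model":    ["Camry", "Corolla", "Camry", "Camry", "Corolla"],
})

# cumcount() numbers the rows within each group starting at 0,
# so adding 1 turns it into "sales made by this employee so far".
df["count_by_employee"] = df.groupby("employee").cumcount() + 1

print(df)
# count_by_employee is 1, 1, 2, 2, 3: Shah's Camry (4th row) is his 2nd sale.
```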
We can take this further by adding another column so the dealership can keep track of the total number of each model sold to date.
Now when Shah sells a Camry, the dealership manager knows immediately that so far the dealership has sold 3 Camrys.
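The per-model count can be sketched the same way, grouping by model instead of employee. The rows and the column names employee, model and count_by_model are assumptions for illustration, arranged so that Shah's Camry is the dealership's 3rd Camry.

```python
import pandas as pd

# Hypothetical transactions: one row per sale, in the order they happened.
df = pd.DataFrame({
    "employee": ["Patel", "Shah", "Patel", "Shah", "Patel"],
    "model":    ["Camry", "Corolla", "Camry", "Camry", "Corolla"],
})

# Running count of units sold per model across the whole dealership.
df["count_by_model"] = df.groupby("model").cumcount() + 1

print(df)
# count_by_model is 1, 1, 2, 3, 2: Shah's Camry (4th row) is the 3rd Camry sold.
```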
One more way we can use it is to count sales per employee per model.
Now when Patel sells a Camry, the manager knows that this is the 2nd Camry sold by him, and when Shah sells a Camry, it is his 1st Camry sold this month (he had already sold a Corolla before).
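Grouping by both columns gives the per employee, per model count. As before, this is a sketch on made-up rows; the column names employee, model and count_by_employee_model are assumptions, not names from the original dataframe.

```python
import pandas as pd

# Hypothetical transactions: one row per sale, in the order they happened.
df = pd.DataFrame({
    "employee": ["Patel", "Shah", "Patel", "Shah", "Patel"],
    "model":    ["Camry", "Corolla", "Camry", "Camry", "Corolla"],
})

# Running count within each (employee, model) pair.
df["count_by_employee_model"] = (
    df.groupby(["employee", "model"]).cumcount() + 1
)

print(df)
# count_by_employee_model is 1, 1, 2, 1, 1:
# Patel's second Camry shows 2, while Shah's Camry shows 1.
```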
As we can see, there are many ways to apply cumulative functions to make the data more meaningful.