We are given the data below from an online grocery store, and we want to create a new column that holds an accumulated value for each order. The user_id column identifies the customer.
This customer (user_id=1) has placed a total of 11 orders so far, and each order has a unique order_id.
We want to store the total number of orders placed by each customer in a separate column, so that we can identify our top 100 customers by number of orders placed.
One way to obtain this information is to use the count function.
orders.groupby('user_id').order_number.count()
This means customer 1 placed 11 orders, customer 2 placed 15 orders, and so on.
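To make the shape of that result concrete, here is a minimal sketch on a small made-up table. The rows and values are hypothetical stand-ins for the real orders data; only the column names user_id, order_id and order_number come from the post.

```python
import pandas as pd

# Hypothetical toy data; the real orders table has many more rows and columns.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104, 105],
    "user_id": [1, 1, 1, 2, 2],
    "order_number": [1, 2, 3, 1, 2],
})

# groupby + count collapses the table to one value per customer.
# The result is a Series indexed by user_id, not by the original row labels.
orders_per_user = orders.groupby("user_id").order_number.count()
print(orders_per_user)
```

Note that the result has one entry per customer (two here), while the original table has one row per order (five here).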
This is fine, but we want to show it as a separate column on the orders dataframe so we can see it alongside the other data.
orders['ord_count'] = orders.groupby('user_id').order_number.count()
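As we can see, this does not give the correct result. The grouped counts are indexed by user_id, while column assignment aligns on the row index of orders, so each count lands on the row whose index label happens to equal a user_id, and every other row gets NaN.
Another way to accomplish this is to use transform(), which returns a result with the same index as the original dataframe.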
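The problem is easy to reproduce on a small made-up table. The sketch below uses hypothetical data and assumes orders has the default 0, 1, 2, ... row index.

```python
import pandas as pd

# Hypothetical toy data with the default row index 0..4.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104, 105],
    "user_id": [1, 1, 1, 2, 2],
    "order_number": [1, 2, 3, 1, 2],
})

# The counts are indexed by user_id (1 and 2).
counts = orders.groupby("user_id").order_number.count()

# Assignment matches index labels, not user_id values:
# row label 1 receives the count for user_id 1, row label 2 the count
# for user_id 2, and row labels 0, 3 and 4 have no match and become NaN.
orders["ord_count"] = counts
print(orders)
```

Row 2 belongs to user_id 1 but is given the count for user_id 2, which shows that the values are matched on row labels rather than on the customer.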
orders['order_count'] = orders.groupby('user_id').order_number.transform('count')
And now we can see that the total order count is presented properly.
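As a minimal sketch of the whole idea, the example below applies transform('count') to a small made-up table and then picks the top customers by order count. The data, the top_customers name and the cutoff of 2 (instead of 100) are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data; three customers with 3, 2 and 1 orders.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104, 105, 106],
    "user_id": [1, 1, 1, 2, 2, 3],
    "order_number": [1, 2, 3, 1, 2, 1],
})

# transform returns one value per original row, so it can be assigned directly.
orders["order_count"] = orders.groupby("user_id").order_number.transform("count")

# One possible way to list the top customers: keep one row per customer,
# then take the largest order counts (2 here, 100 in the real data).
top_customers = (
    orders.drop_duplicates("user_id")
          .nlargest(2, "order_count")[["user_id", "order_count"]]
)
print(orders)
print(top_customers)
```

Every row of a given customer now carries that customer's total, so the column can be filtered or sorted like any other.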