Using transform() in Python

We are given the data below from an online grocery store, and we want to create a new column that shows, on every order row, the total number of orders placed by that customer. Here, user_id identifies the customer.




This customer (user_id=1) has placed a total of 11 orders so far, and each order has a unique order_id.

We want the total number of orders placed by each customer in a separate column, so that we can identify our top 100 customers by number of orders placed.

One way to obtain this information is to group by user_id and use the count() function.

orders.groupby('user_id').order_number.count()


This means customer 1 placed 11 orders, customer 2 placed 15 orders, and so on.
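The grouped count can be reproduced on a small frame. This is a minimal sketch: the column names follow the post, but the values are made up for illustration and are not the store's data.

```python
import pandas as pd

# Hypothetical sample data: column names follow the post, values are invented.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104, 105],
    "user_id": [1, 1, 1, 2, 2],
    "order_number": [1, 2, 3, 1, 2],
})

# One row per customer: the number of non-null order_number values in each group.
counts = orders.groupby("user_id").order_number.count()
print(counts)
```

Note that the result is a Series indexed by user_id, with one entry per customer rather than one entry per order.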

This is fine, but we want to show it as a separate column on the orders DataFrame so we can see it alongside the other data.

orders['ord_count'] = orders.groupby('user_id').order_number.count()


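As we can see, this does not give the correct result. The grouped count is indexed by user_id, so pandas aligns it with the row index of orders rather than with the user_id column: the counts land on whichever rows have an index label equal to a user_id, and the remaining rows get NaN.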
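The mismatch is easier to see on a small example. The sketch below uses invented values and assumes orders has the default 0-based row index; the real data may be indexed differently.

```python
import pandas as pd

# Hypothetical sample data with the default row index 0..4.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104, 105],
    "user_id": [1, 1, 1, 2, 2],
    "order_number": [1, 2, 3, 1, 2],
})

# The grouped result is indexed by user_id (labels 1 and 2).
counts = orders.groupby("user_id").order_number.count()

# Assignment aligns on index labels, not on the user_id column:
# row label 1 receives the count for user_id 1, row label 2 the count
# for user_id 2, and every other row is filled with NaN.
orders["ord_count"] = counts
print(orders)
```

Here the row at index 1 shows a count of 3 only because its row label happens to match user_id 1, not because the row belongs to that customer.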

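Another way to accomplish this is to use transform(), which returns a result with the same index as the original DataFrame, so every row receives the count for its own group.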

orders['order_count'] = orders.groupby('user_id').order_number.transform('count')


And now we can see that the total order count is presented properly.
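As a rough illustration, the transform() approach can be checked on a small frame and then used to pick the top customers. The data values and the top_n variable are assumptions made for this sketch; the post's goal of 100 customers would use top_n = 100.

```python
import pandas as pd

# Hypothetical sample data: column names follow the post, values are invented.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104, 105],
    "user_id": [1, 1, 1, 2, 2],
    "order_number": [1, 2, 3, 1, 2],
})

# transform('count') broadcasts each group's count back to every row of that group.
orders["order_count"] = orders.groupby("user_id").order_number.transform("count")

# Keep one row per customer, then take the customers with the most orders.
top_n = 1  # illustrative value; use 100 for the top 100 customers
top_customers = (
    orders.drop_duplicates("user_id")[["user_id", "order_count"]]
    .nlargest(top_n, "order_count")
)
print(orders)
print(top_customers)
```

Because the new column has one value per order row, it can be filtered or sorted together with the rest of the order data.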
