Calculate Mean by Group in Python (2 Examples)

 

In this article, you’ll learn how to calculate the mean by group in the Python programming language.

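The page is structured as follows: example data and add-on libraries, two examples of computing the mean by group, and a summary with further resources.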

Here’s the step-by-step process…

 

Example Data & Add-On Libraries

We first need to import the pandas library so that we can use its functions:

import pandas as pd                                  # Load pandas library

The following data will be used as a basis for this Python tutorial:

data = pd.DataFrame({'x1':[6, 5, 3, 5, 2, 7, 2, 8],  # Create pandas DataFrame
                     'x2':range(11, 19),
                     'group1':['A', 'B', 'A', 'C', 'C', 'C', 'B', 'A'],
                     'group2':['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b']})
print(data)                                          # Print pandas DataFrame

 

Table 1: The example pandas DataFrame created by the previous code.

 

As you can see based on Table 1, our example data is a DataFrame containing eight rows and four columns.
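Before grouping, it can help to confirm the dimensions of the data and to check which columns are numeric, since only those have a meaningful mean. The following is a minimal sketch that rebuilds the example data; the variable name numeric_cols is introduced here for illustration only:

```python
import pandas as pd

# Same example data as in the tutorial
data = pd.DataFrame({'x1': [6, 5, 3, 5, 2, 7, 2, 8],
                     'x2': range(11, 19),
                     'group1': ['A', 'B', 'A', 'C', 'C', 'C', 'B', 'A'],
                     'group2': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b']})

print(data.shape)                                    # Number of rows and columns
# (8, 4)

# numeric_cols is an illustrative name: columns for which a mean can be computed
numeric_cols = data.select_dtypes(include='number').columns.tolist()
print(numeric_cols)
# ['x1', 'x2']
```

The two remaining columns, group1 and group2, contain strings and serve as grouping variables.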

 

Example 1: Mean by Group in pandas DataFrame

The following syntax illustrates how to calculate the mean of all pandas DataFrame columns by group.

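For this task, we can use the groupby and mean functions as shown below. Because the column group2 contains strings, we set numeric_only=True; since pandas 2.0, the mean function otherwise raises a TypeError for non-numeric columns:

print(data.groupby('group1').mean(numeric_only=True))  # Get mean by group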
#               x1    x2
# group1                
# A       5.666667  14.0
# B       3.500000  14.5
# C       4.666667  15.0

The previous console output shows the result of our Python syntax. You can see the averages for each group and column in our pandas DataFrame.
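If only one column is of interest, one option is to select it after groupby, which returns a pandas Series instead of a DataFrame. This is a sketch that assumes the example data from above; the name x1_means is only illustrative:

```python
import pandas as pd

# Same example data as in the tutorial
data = pd.DataFrame({'x1': [6, 5, 3, 5, 2, 7, 2, 8],
                     'x2': range(11, 19),
                     'group1': ['A', 'B', 'A', 'C', 'C', 'C', 'B', 'A'],
                     'group2': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b']})

# Select the column x1 after grouping, then compute its mean per group
x1_means = data.groupby('group1')['x1'].mean()
print(x1_means)
# group1
# A    5.666667
# B    3.500000
# C    4.666667
# Name: x1, dtype: float64
```

Selecting the column explicitly also avoids any dependence on how non-numeric columns are handled.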

 

Example 2: Mean by Group & Subgroup in pandas DataFrame

Example 1 has shown how to get the mean for different groups based on one grouping column.

Example 2, in contrast, demonstrates how to return the mean by group based on multiple group and subgroup columns.

To achieve this, we have to specify a list of group columns within the groupby function.

Consider the Python syntax below:

print(data.groupby(['group1', 'group2']).mean())     # Get mean by multiple groups
#                 x1    x2
# group1 group2           
# A      a       4.5  12.0
#        b       8.0  18.0
# B      a       5.0  12.0
#        b       2.0  17.0
# C      a       5.0  14.0
#        b       4.5  15.5

The previous output shows the mean values for each subgroup in both of our numeric columns.
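The result of grouping by two columns is indexed by a MultiIndex made of group1 and group2. If regular columns are preferred, for instance for exporting or merging, one possible approach is the as_index argument of groupby. A minimal sketch, with means_flat as an illustrative name:

```python
import pandas as pd

# Same example data as in the tutorial
data = pd.DataFrame({'x1': [6, 5, 3, 5, 2, 7, 2, 8],
                     'x2': range(11, 19),
                     'group1': ['A', 'B', 'A', 'C', 'C', 'C', 'B', 'A'],
                     'group2': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b']})

# as_index=False keeps the grouping variables as ordinary columns
means_flat = data.groupby(['group1', 'group2'], as_index=False).mean()
print(means_flat)
#   group1 group2   x1    x2
# 0      A      a  4.5  12.0
# 1      A      b  8.0  18.0
# 2      B      a  5.0  12.0
# 3      B      b  2.0  17.0
# 4      C      a  5.0  14.0
# 5      C      b  4.5  15.5
```

The mean values are identical to the ones shown before; only the layout of the result differs.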

 

Video, Further Resources & Summary

Do you want to learn more about the computation of the mean value by group? Then you could watch the accompanying video on my YouTube channel, in which I explain the examples of this article.

 


 


 

To summarize: This article has illustrated how to calculate the mean by group in the Python programming language. If you have any further comments or questions, don’t hesitate to let me know in the comments below. In addition, please subscribe to my email newsletter to get updates on new articles.

 
