Variance in Python (5 Examples)

 

In this article you’ll learn how to calculate the variance in the Python programming language.

Let’s dive right into the examples:

 

Example 1: Variance of List Object

The following Python syntax shows how to get the variance of a list object in Python.

To do this, we first have to create an example list:

my_list = [1, 5, 3, 9, 5, 8, 3, 1, 1]           # Create example list
print(my_list)                                  # Print example list
# [1, 5, 3, 9, 5, 8, 3, 1, 1]

Furthermore, we have to import the NumPy library:

import numpy as np                              # Load NumPy

Next, we can apply the np.var function to compute the variance of our list object:

print(np.var(my_list))                          # Get var of list
# 8.0

The variance of our list is 8.0.

Note that this result is the population variance, because np.var divides by the number of values by default (ddof = 0). If you want the sample variance instead, which divides by the number of values minus 1, set the ddof argument of the var function to 1.
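As a quick check of the two formulas, the following sketch computes both versions with np.var and then repeats the calculation by hand. The helper names n, mean, and squared_devs are only chosen for this illustration and do not appear elsewhere in the tutorial.

```python
import numpy as np

my_list = [1, 5, 3, 9, 5, 8, 3, 1, 1]      # Example list from above

pop_var = np.var(my_list)                  # ddof = 0 (default): divide by n
sample_var = np.var(my_list, ddof=1)       # ddof = 1: divide by n - 1

# Same calculation by hand
n = len(my_list)                           # Number of values
mean = sum(my_list) / n                    # Arithmetic mean
squared_devs = sum((x - mean) ** 2 for x in my_list)  # Sum of squared deviations

print(pop_var, squared_devs / n)           # 8.0 8.0
print(sample_var, squared_devs / (n - 1))  # 9.0 9.0
```

Both approaches agree: the sum of squared deviations is divided by 9 for the population variance and by 8 for the sample variance.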

 

Example 2: Variance of One Particular Column in pandas DataFrame

It is also possible to compute the variance for a column of a pandas DataFrame in Python.

This example explains how to do that.

First, we have to import the pandas library:

import pandas as pd                             # Import pandas library in Python

In the next step, we have to create an example pandas DataFrame:

data = pd.DataFrame({'x1':range(32, 1, - 2),    # Create pandas DataFrame
                     'x2':[5, 2, 7, 3, 1, 4, 3, 4, 4, 2, 3, 3, 1, 1, 7, 5],
                     'x3':range(100, 116),
                     'group':['A', 'A', 'B', 'C', 'B', 'C', 'C', 'A', 'C', 'A', 'C', 'A', 'B', 'B', 'B', 'A']})
print(data)                                     # Print pandas DataFrame

 

Table 1: The example pandas DataFrame created by the code above (columns x1, x2, x3, and group).

 

By executing the previously shown Python programming syntax, we have created Table 1, i.e. a pandas DataFrame with 16 rows and four columns. The variables x1, x2, and x3 contain integers, and the variable group is a group indicator.

Now, we are set up and can calculate the variance for one of the columns in our data set as shown below:

print(data['x1'].var())                         # Get variance of one column
# 90.66666666666667

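The previous Python code has returned the variance of the column x1, i.e. 90.667. Note that, unlike np.var, the pandas var function returns the sample variance by default (ddof = 1).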
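Because pandas and NumPy use different default values for ddof, the same data can give two different results. The sketch below rebuilds only the values of the column x1 as a standalone Series (an assumption made to keep the example self-contained) and compares the defaults.

```python
import numpy as np
import pandas as pd

x1 = pd.Series(range(32, 1, -2))         # Same values as column x1

sample_var = x1.var()                    # pandas default: ddof = 1
pop_var = x1.var(ddof=0)                 # Population variance in pandas
numpy_var = np.var(x1.to_numpy())        # NumPy default: ddof = 0

print(sample_var)                        # 90.66666666666667
print(pop_var)                           # 85.0
print(numpy_var)                         # 85.0
```

If results from pandas and NumPy have to match, it is safest to pass ddof explicitly in both calls.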

 

Example 3: Variance of All Columns in pandas DataFrame

The following Python syntax illustrates how to calculate the variance of all columns in a pandas DataFrame.

For this, we simply have to apply the var function to our entire data set:

print(data.var(numeric_only = True))            # Get variance of all columns
# x1    90.666667
# x2     3.595833
# x3    22.666667
# dtype: float64

After executing the previous Python code, the variance of each numeric column is shown in the console. The argument numeric_only = True excludes the non-numeric group column from the calculation.
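In case you prefer to choose the columns yourself, you can subset the DataFrame before calling var. The following sketch shows two possible alternatives, assuming the same example data as above; the object names var_all, var_num, and var_sub are only used for this illustration.

```python
import pandas as pd

data = pd.DataFrame({'x1': range(32, 1, -2),
                     'x2': [5, 2, 7, 3, 1, 4, 3, 4, 4, 2, 3, 3, 1, 1, 7, 5],
                     'x3': range(100, 116),
                     'group': ['A', 'A', 'B', 'C', 'B', 'C', 'C', 'A',
                               'C', 'A', 'C', 'A', 'B', 'B', 'B', 'A']})

var_all = data.var(numeric_only=True)                 # Skip non-numeric columns
var_num = data.select_dtypes(include='number').var()  # Select numeric columns first
var_sub = data[['x1', 'x3']].var()                    # Only the chosen columns

print(var_num)                                        # Should match var_all
print(var_sub)                                        # Variances of x1 and x3 only
```

The first two calls should return the same three variances, while the third one is restricted to the columns x1 and x3.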

 

Example 4: Variance of Rows in pandas DataFrame

Example 4 demonstrates how to get the variance for each row of a pandas DataFrame.

To accomplish this, we have to set the axis argument within the var function to be equal to 1:

print(data.var(axis = 1, numeric_only = True))  # Get variance of rows
# 0     2396.333333
# 1     2604.333333
# 2     2490.333333
# 3     2743.000000
# 4     2923.000000
# 5     2902.333333
# 6     3049.000000
# 7     3121.000000
# 8     3237.333333
# 9     3436.333333
# 10    3522.333333
# 11    3652.333333
# 12    3864.333333
# 13    4003.000000
# 14    3926.333333
# 15    4146.333333
# dtype: float64
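Each value in this output is the sample variance of the three numbers x1, x2, and x3 within one row. As a minimal sketch, assuming the same example data, we can check the first row by hand; the names first_row and manual are only introduced for this check.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'x1': range(32, 1, -2),
                     'x2': [5, 2, 7, 3, 1, 4, 3, 4, 4, 2, 3, 3, 1, 1, 7, 5],
                     'x3': range(100, 116),
                     'group': ['A', 'A', 'B', 'C', 'B', 'C', 'C', 'A',
                               'C', 'A', 'C', 'A', 'B', 'B', 'B', 'A']})

row_var = data.var(axis=1, numeric_only=True)      # Variance of each row

# Values of the first row: 32, 5, and 100
first_row = data.loc[0, ['x1', 'x2', 'x3']].to_numpy(dtype=float)
manual = np.var(first_row, ddof=1)                 # Sample variance, like pandas

print(row_var.loc[0], manual)                      # Both approx. 2396.33
```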

 

Example 5: Variance by Group in pandas DataFrame

This example shows how to print the variance by group.

For this task, we have to use the groupby function. Within the groupby function, we have to specify the name of our group indicator (i.e. ‘group’):

print(data.groupby('group').var())              # Get variance by group
#                x1    x2         x3
# group                             
# A      135.066667   1.9  33.766667
# B      124.000000  10.8  31.000000
# C       29.200000   0.3   7.300000

Looks good!
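If only one column is of interest, the grouped variance can also be restricted to that column. The sketch below assumes the same example data and additionally reports the group sizes via agg, since a variance based on only a few rows should be interpreted with care. The names x1_by_group and summary are chosen for this illustration only.

```python
import pandas as pd

data = pd.DataFrame({'x1': range(32, 1, -2),
                     'x2': [5, 2, 7, 3, 1, 4, 3, 4, 4, 2, 3, 3, 1, 1, 7, 5],
                     'x3': range(100, 116),
                     'group': ['A', 'A', 'B', 'C', 'B', 'C', 'C', 'A',
                               'C', 'A', 'C', 'A', 'B', 'B', 'B', 'A']})

x1_by_group = data.groupby('group')['x1'].var()              # Variance of x1 by group
summary = data.groupby('group')['x1'].agg(['count', 'var'])  # Group size and variance

print(x1_by_group)                                           # One value per group
print(summary)                                               # Columns count and var
```

The variances of x1 should be the same as in the x1 column of the grouped output above.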

 

Video, Further Resources & Summary

Do you want to learn more about the computation of the variance of a list or the columns and rows of a pandas DataFrame? Then you may watch the accompanying video on my YouTube channel, in which I demonstrate the Python programming syntax of this article.

 



In this Python programming tutorial you have learned how to calculate the variance of a list or the columns of a pandas DataFrame. Don’t hesitate to let me know in the comments section, in case you have any further questions.

 
