Variance in Python (5 Examples)
In this article you’ll learn how to calculate the variance in the Python programming language.
Let’s dive right into the examples:
Example 1: Variance of List Object
The following Python syntax shows how to get the variance of a list object in Python.
To do this, we first have to create an example list:
my_list = [1, 5, 3, 9, 5, 8, 3, 1, 1]    # Create example list
print(my_list)                           # Print example list
# [1, 5, 3, 9, 5, 8, 3, 1, 1]
Furthermore, we have to load the NumPy library to Python:
import numpy as np # Load NumPy
Next, we can apply the var function to find the variance of our list object:
print(np.var(my_list))                   # Get var of list
# 8.0
The variance of our list is 8.0.
Note that this result reflects the population variance. In case you want to use the formula of the sample variance, you have to set the ddof argument within the var function to the value 1.
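As a quick illustration of the difference between the two formulas, the following sketch applies the var function to the same example list twice, once with the default settings and once with ddof set to 1. The variable names pop_var and samp_var are only chosen for this illustration:

```python
import numpy as np                        # Load NumPy

my_list = [1, 5, 3, 9, 5, 8, 3, 1, 1]     # Same example list as above

pop_var = np.var(my_list)                 # Population variance (divides by n)
samp_var = np.var(my_list, ddof = 1)      # Sample variance (divides by n - 1)

print(pop_var)                            # Print population variance
# 8.0
print(samp_var)                           # Print sample variance
# 9.0
```

The sum of squared deviations from the mean is 72 in both cases. It is divided by 9 for the population variance and by 8 for the sample variance.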
Example 2: Variance of One Particular Column in pandas DataFrame
It is also possible to compute the variance for a column of a pandas DataFrame in Python.
This example explains how to do that step by step.
First, we have to import the pandas library:
import pandas as pd # Import pandas library in Python
In the next step, we have to create an example pandas DataFrame:
data = pd.DataFrame({'x1':range(32, 1, - 2),    # Create pandas DataFrame
                     'x2':[5, 2, 7, 3, 1, 4, 3, 4, 4, 2, 3, 3, 1, 1, 7, 5],
                     'x3':range(100, 116),
                     'group':['A', 'A', 'B', 'C', 'B', 'C', 'C', 'A', 'C', 'A', 'C', 'A', 'B', 'B', 'B', 'A']})
print(data)                                     # Print pandas DataFrame
By executing the previously shown Python programming syntax, we have created Table 1, i.e. a pandas DataFrame with 16 rows and four columns. The variables x1, x2, and x3 are integers and the variable group is a group indicator.
Now, we are set up and can calculate the variance for one of the columns in our data set as shown below:
print(data['x1'].var())                         # Get variance of one column
# 90.66666666666667
The previous Python code has returned the variance of the column x1, i.e. 90.667.
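Note that the var function of pandas uses ddof = 1 by default, i.e. it returns the sample variance, whereas the var function of NumPy returns the population variance by default. The following sketch compares both settings for the values of the column x1. The object names are only used for this illustration:

```python
import numpy as np                        # Load NumPy
import pandas as pd                       # Import pandas library

x1 = pd.Series(range(32, 1, - 2))         # Same values as column x1

sample_var = x1.var()                     # pandas default (ddof = 1)
population_var = x1.var(ddof = 0)         # Population variance in pandas
numpy_var = np.var(x1.to_numpy())         # NumPy default (ddof = 0)

print(sample_var)                         # Sample variance, approx. 90.667
print(population_var)                     # Population variance, approx. 85.0
print(numpy_var)                          # Matches the ddof = 0 result
```

If you want pandas and NumPy to return the same value, you have to specify the same ddof argument in both functions.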
Example 3: Variance of All Columns in pandas DataFrame
The following Python syntax illustrates how to calculate the variance of all columns in a pandas DataFrame.
For this, we simply have to apply the var function to our entire data set:
print(data.var(numeric_only = True))            # Get variance of all columns
# x1    90.666667
# x2     3.595833
# x3    22.666667
# dtype: float64
After executing the previous Python code, the variance for each of our numeric columns is shown in the console. The numeric_only argument makes sure that the non-numeric column group is ignored.
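In case you prefer to choose the columns yourself, you may subset the DataFrame before applying the var function. The following code is a minimal sketch of two alternatives: selecting all numeric columns with select_dtypes, and selecting columns by name. The object names numeric_data, var_all, and var_selected are only used for this illustration:

```python
import pandas as pd                                      # Import pandas library

data = pd.DataFrame({'x1':range(32, 1, - 2),             # Same DataFrame as above
                     'x2':[5, 2, 7, 3, 1, 4, 3, 4, 4, 2, 3, 3, 1, 1, 7, 5],
                     'x3':range(100, 116),
                     'group':['A', 'A', 'B', 'C', 'B', 'C', 'C', 'A',
                              'C', 'A', 'C', 'A', 'B', 'B', 'B', 'A']})

numeric_data = data.select_dtypes(include = 'number')    # Keep numeric columns only
var_all = numeric_data.var()                             # Variance of x1, x2, and x3
var_selected = data[['x1', 'x3']].var()                  # Variance of chosen columns

print(var_all)                                           # Print variances
print(var_selected)
```

The first alternative should return the same values as the numeric_only approach, and the second alternative restricts the output to the columns x1 and x3.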
Example 4: Variance of Rows in pandas DataFrame
Example 4 demonstrates how to get the variance for each row of a pandas DataFrame.
To accomplish this, we have to set the axis argument within the var function to be equal to 1:
print(data.var(axis = 1, numeric_only = True))  # Get variance of rows
# 0     2396.333333
# 1     2604.333333
# 2     2490.333333
# 3     2743.000000
# 4     2923.000000
# 5     2902.333333
# 6     3049.000000
# 7     3121.000000
# 8     3237.333333
# 9     3436.333333
# 10    3522.333333
# 11    3652.333333
# 12    3864.333333
# 13    4003.000000
# 14    3926.333333
# 15    4146.333333
# dtype: float64
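To make clear what these numbers mean, the following sketch recomputes the variance of the first row by hand. The first row of our DataFrame contains the values 32 (x1), 5 (x2), and 100 (x3). The code uses the sample variance formula, i.e. the same formula that pandas applies by default:

```python
row = [32, 5, 100]                                # Values of x1, x2, x3 in row 0

n = len(row)                                      # Number of values
mean = sum(row) / n                               # Mean of the row
squared_dev = [(v - mean) ** 2 for v in row]      # Squared deviations from mean
row_var = sum(squared_dev) / (n - 1)              # Sample variance (ddof = 1)

print(round(row_var, 6))                          # Print rounded variance
# 2396.333333
```

This matches the value that was returned for the row with the index 0.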
Example 5: Variance by Group in pandas DataFrame
This example shows how to print the variance by group.
For this task, we have to use the groupby function. Within the groupby function, we have to specify the name of our group indicator (i.e. 'group'):
print(data.groupby('group').var())              # Get variance by group
#                x1    x2         x3
# group
# A      135.066667   1.9  33.766667
# B      124.000000  10.8  31.000000
# C       29.200000   0.3   7.300000
Looks good!
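In case you need the variance by group for only one variable, you may select this column after the groupby function. The following code is a minimal sketch of this for the column x1. The object name x1_by_group is only used for this illustration:

```python
import pandas as pd                                      # Import pandas library

data = pd.DataFrame({'x1':range(32, 1, - 2),             # Same DataFrame as above
                     'x2':[5, 2, 7, 3, 1, 4, 3, 4, 4, 2, 3, 3, 1, 1, 7, 5],
                     'x3':range(100, 116),
                     'group':['A', 'A', 'B', 'C', 'B', 'C', 'C', 'A',
                              'C', 'A', 'C', 'A', 'B', 'B', 'B', 'A']})

x1_by_group = data.groupby('group')['x1'].var()          # Variance of x1 by group
print(x1_by_group)                                       # Print result
```

The result is a pandas Series with one value per group, and these values should correspond to the x1 column of the previous output.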
Video, Further Resources & Summary
Do you want to learn more about the computation of the variance of a list or the columns and rows of a pandas DataFrame? Then you may watch the following video on my YouTube channel. In the video, I demonstrate the Python programming syntax of this article.
Additionally, you may read the other tutorials on my website:
- Variance by Group in Python
- Variance of NumPy Array in Python
- pvariance & variance Functions of statistics Module
- Standard Deviation by Group
- Summary Statistics of pandas DataFrame
- pandas Library Tutorial in Python
- Introduction to Python
In this Python programming tutorial, you have learned how to calculate the variance of a list as well as the variance of the columns, rows, and groups of a pandas DataFrame. Don’t hesitate to let me know in the comments section if you have any further questions.