Replace Blank Values by NaN in pandas DataFrame in Python (Example)

In this Python post you’ll learn how to replace empty cells in a pandas DataFrame with NaN values.

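The article consists of one example for the replacement of empty cells in a pandas DataFrame by NaN values. To be more specific, the tutorial first sets up the example data and the pandas library, then shows the replacement using the replace() function, and closes with a short summary.

You’re here for the answer, so let’s get straight to the example.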

Example Data & Add-On Libraries

We first need to import the pandas library:

import pandas as pd                                                # Import pandas library

We’ll use the following data as the basis for this tutorial:

data = pd.DataFrame({'x1':[1, '', '   ', 2, 3],                    # Create example DataFrame
                     'x2':['', '', 'a', 'b', 'c'],
                     'x3':['    ', 'a', 'b', 'c', 'd']})
print(data)                                                        # Print example DataFrame
#     x1 x2    x3
# 0    1         
# 1             a
# 2       a     b
# 3    2  b     c
# 4    3  c     d

The previous output of the Python console shows the structure of the example data: a pandas DataFrame in which some of the cells are empty. Note that some of these empty cells are not truly empty strings but contain several whitespace characters, even though both kinds look the same when printed.
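
Because empty strings and whitespace-only strings are indistinguishable in the printed output, it can be useful to count the blank cells per column before replacing anything. The following is only a minimal sketch and not part of the original example: the helper name is_blank is made up for illustration, and the example data is created again so that the snippet runs on its own.

```python
import pandas as pd

# Recreate the example data so that this snippet is self-contained
data = pd.DataFrame({'x1': [1, '', '   ', 2, 3],
                     'x2': ['', '', 'a', 'b', 'c'],
                     'x3': ['    ', 'a', 'b', 'c', 'd']})

def is_blank(value):
    # Hypothetical helper: True for '' and for strings that contain only whitespace
    return isinstance(value, str) and value.strip() == ''

blank_mask = data.apply(lambda col: col.map(is_blank))  # Boolean DataFrame marking blank cells
blank_counts = blank_mask.sum()                         # Number of blank cells per column
print(blank_counts)
```

The isinstance check is there because the column x1 mixes integers and strings, and integers have no strip method.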

Example: Replace Blanks by NaN in pandas DataFrame Using replace() Function

In this example, I’ll show how to convert blank cells in a pandas DataFrame to NaN values.

For this task, we can use the replace function as shown below:

data_new = data.copy()                              # Create duplicate of example data
data_new = data_new.replace(r'^\s*$', float('NaN'), regex = True)  # Replace blanks by NaN
print(data_new)                                     # Print updated data
#     x1   x2   x3
# 0  1.0  NaN  NaN
# 1  NaN  NaN    a
# 2  NaN    a    b
# 3  2.0    b    c
# 4  3.0    c    d

Have a look at the previous output of the Python console: We have set all blank cells to NaN, including the cells that contained only white spaces.
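
The regular expression ^\s*$ is what decides which cells count as blank: ^ and $ anchor the match to the start and end of the string, and \s* allows zero or more whitespace characters in between. The short sketch below tests this pattern against a few sample strings using Python’s built-in re module. The sample strings are made up for illustration, and the snippet only demonstrates the pattern itself, not how pandas applies it internally.

```python
import re

# Same pattern as in the replace() call: only whitespace (or nothing) from start to end
pattern = re.compile(r'^\s*$')

samples = ['', '   ', '\t', 'a', ' a ']                   # Made-up sample strings
matches = {s: bool(pattern.match(s)) for s in samples}    # True if the string counts as blank
print(matches)
```

Strings that contain at least one non-whitespace character, such as ' a ', do not match, so cells with real content that merely have leading or trailing spaces should be left untouched.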

Summary

In this Python tutorial you have learned how to replace empty character strings in a pandas DataFrame with NaN values. If you have any further questions, let me know in the comments section.
