Replace Blank Values by NaN in pandas DataFrame in Python (Example)

 

In this Python post you’ll learn how to replace empty cells in a pandas DataFrame with NaN values.

The article consists of one example for the replacement of empty cells in a pandas DataFrame by NaN values.

You’re here for the answer, so let’s get straight to the example.

 

Exemplifying Data & Add-On Libraries

We first need to import the pandas library:

import pandas as pd                                                # Import pandas library

We’ll use the following data as the basis for this tutorial:

data = pd.DataFrame({'x1':[1, '', '   ', 2, 3],                    # Create example DataFrame
                     'x2':['', '', 'a', 'b', 'c'],
                     'x3':['    ', 'a', 'b', 'c', 'd']})
print(data)                                                        # Print example DataFrame
#     x1 x2    x3
# 0    1         
# 1             a
# 2       a     b
# 3    2  b     c
# 4    3  c     d

The previous output of the Python console shows the structure of the example data: a pandas DataFrame in which some of the cells are blank. Note that some of these blank cells are empty strings, while others contain only white spaces.
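To see how many blank cells each column contains before replacing anything, we can count them. The following is only a minimal sketch: the helper is_blank and the objects blank_mask and blank_counts are names introduced for this illustration, and the example DataFrame is created again so that the snippet runs on its own.

```python
import pandas as pd

# Recreate the example DataFrame from above
data = pd.DataFrame({'x1': [1, '', '   ', 2, 3],
                     'x2': ['', '', 'a', 'b', 'c'],
                     'x3': ['    ', 'a', 'b', 'c', 'd']})

def is_blank(value):
    # Hypothetical helper: True for strings that are empty or whitespace-only
    return isinstance(value, str) and value.strip() == ''

# Apply the helper to every cell, column by column
blank_mask = data.apply(lambda col: col.map(is_blank))

# Count the blank cells per column
blank_counts = blank_mask.sum()
print(blank_counts)
# Counts per column: x1 = 2, x2 = 2, x3 = 1
```

The isinstance check is there because the column x1 mixes integers and strings, and integers have no strip method.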

 

Example: Exchange Blank by NaN in pandas DataFrame Using replace() Function

In this example, I’ll show how to convert blank cells in a pandas DataFrame to NaN values.

For this task, we can use the replace function together with the regular expression ^\s*$, which matches cells that are empty or contain only white spaces:

data_new = data.copy()                              # Create duplicate of example data
data_new = data_new.replace(r'^\s*$', float('NaN'), regex = True)  # Replace blanks by NaN
print(data_new)                                     # Print updated data
#     x1   x2   x3
# 0  1.0  NaN  NaN
# 1  NaN  NaN    a
# 2  NaN    a    b
# 3  2.0    b    c
# 4  3.0    c    d

Have a look at the previous output of the Python console: We have set all blank cells to NaN.
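If you want to double-check the result, you can count the NaN values per column after the replacement. The snippet below is a small sketch under the assumption that the example data from this tutorial is used; the names nan_counts and total_nan are introduced only for this illustration.

```python
import pandas as pd

# Recreate the example DataFrame from above
data = pd.DataFrame({'x1': [1, '', '   ', 2, 3],
                     'x2': ['', '', 'a', 'b', 'c'],
                     'x3': ['    ', 'a', 'b', 'c', 'd']})

# Replace empty and whitespace-only cells by NaN
data_new = data.replace(r'^\s*$', float('NaN'), regex=True)

# Count the NaN values in each column
nan_counts = data_new.isna().sum()
print(nan_counts)
# Counts per column: x1 = 2, x2 = 2, x3 = 1

# Count the NaN values in the entire DataFrame
total_nan = int(nan_counts.sum())
print(total_nan)
# 5
```

The five NaN values correspond to the five blank cells of the example data, i.e. the empty strings as well as the cells that contain only white spaces.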

 

Video, Further Resources & Summary

Would you like to know more about how to substitute empty cells in a pandas DataFrame by NaN values? Then I can recommend having a look at the following video on my YouTube channel. In the video, I show and explain the Python programming code of this article:

 

 

Have a look at the following video on the Minsuk Heo YouTube channel. It shows how to check for NaN values and how to replace them in Python – a task that you might need after replacing your blank values by NaN:

 

 

Furthermore, you might read the related tutorials on this website.

 

In this Python tutorial you have learned how to replace empty character strings in a pandas DataFrame by NaNs. Let me know in the comments section if you have any further questions.

 
