Change Data Type of pandas DataFrame Column in Python (8 Examples)

 

This tutorial illustrates how to convert DataFrame variables to a different data type in Python.


Let’s jump right to the examples…

 

Construction of Exemplifying Data

As the first step, we have to load the pandas library in Python:

import pandas as pd                               # Load pandas

Next, we have to create some example data. I’ll use the following data as a basis for this Python tutorial:

data = pd.DataFrame({"x1":["10", "9", "8", "7"],  # Create example data
                     "x2":["1.1", "2.1", "3.1", "4.1"],
                     "x3":range(1, 5)})
print(data)                                       # Print example data
#    x1   x2  x3
# 0  10  1.1   1
# 1   9  2.1   2
# 2   8  3.1   3
# 3   7  4.1   4

Have a look at the previous console output: As you can see, we have created a pandas DataFrame consisting of four rows and three columns.

We can check the data types of our DataFrame variables by printing the dtypes attribute:

print(data.dtypes)                                # Return data types of columns
# x1    object
# x2    object
# x3     int64
# dtype: object

The previous output shows that the first and second columns of our DataFrame are objects (i.e. character strings), and the third column has the integer class.
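If you prefer a programmatic check over reading the printed dtypes, the helper functions in pandas.api.types can be used. The following is a minimal sketch that rebuilds the example data; the name numeric_flags is only chosen for illustration:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

data = pd.DataFrame({"x1": ["10", "9", "8", "7"],
                     "x2": ["1.1", "2.1", "3.1", "4.1"],
                     "x3": range(1, 5)})

# Map each column name to True if pandas treats the column as numeric
numeric_flags = {col: is_numeric_dtype(data[col]) for col in data.columns}
print(numeric_flags)
```

Only x3 should be flagged as numeric, because x1 and x2 still contain character strings.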

In the following examples, I’ll explain how to convert some or all of our DataFrame variables to a different data type.

 

Example 1: Convert pandas DataFrame Column to Integer

Example 1 demonstrates how to change the data type of a DataFrame column to the integer class.

For this task, we have to specify “int” within the astype function as shown in the following Python code:

data["x1"] = data["x1"].astype(int)               # Convert column to integer

After running the previous code, our data set has been updated. We can check this by printing the data types of our variables once again:

print(data.dtypes)                                # Return data types of columns
# x1     int32
# x2    object
# x3     int64
# dtype: object

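Compare this output with the previous output. The data type of the variable x1 has been converted from the character string class to the integer class. Note that the exact integer width returned by astype(int) depends on your platform and library versions: you may see int32 (as shown here) or int64.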
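In case you need the same integer width on every machine, one option is to pass the dtype as an explicit string instead of the built-in int. This is a small sketch on a fresh copy of the x1 column, not a replacement for the code above:

```python
import pandas as pd

data = pd.DataFrame({"x1": ["10", "9", "8", "7"]})

# Request a fixed 64-bit integer instead of the platform default
data["x1"] = data["x1"].astype("int64")
print(data["x1"].dtype)
```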

 

Example 2: Convert pandas DataFrame Column to Float

Example 2 illustrates how to set a column of a pandas DataFrame to the float data type.

Similar to Example 1, we can use the astype function. This time, however, we have to specify “float” within the function:

data["x2"] = data["x2"].astype(float)             # Convert column to float

Let’s check the classes of our variables again:

print(data.dtypes)                                # Return data types of columns
# x1      int32
# x2    float64
# x3      int64
# dtype: object

As you can see, the data type of x2 has been changed to the float class.
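One practical benefit of the float class is that numeric methods become available. The following sketch repeats the conversion on a fresh copy of the x2 column and computes its mean; the variable name x2_mean is only illustrative:

```python
import pandas as pd

data = pd.DataFrame({"x2": ["1.1", "2.1", "3.1", "4.1"]})

# Convert the strings to floats, then compute the column mean
data["x2"] = data["x2"].astype(float)
x2_mean = data["x2"].mean()
print(round(x2_mean, 2))
```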

 

Example 3: Convert pandas DataFrame Column to String

Example 3 demonstrates how to use the astype function to convert a pandas DataFrame column to the character string class by specifying “str” within the astype function.

Have a look at the following Python syntax:

data["x3"] = data["x3"].astype(str)               # Convert column to string

Again, let’s check the data types of our columns by printing the dtypes attribute:

print(data.dtypes)                                # Return data types of columns
# x1      int32
# x2    float64
# x3     object
# dtype: object

The column x3 has been transformed to the character string class (represented by “object”).
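Once a column holds character strings, the .str accessor can be applied to it. As a brief illustration (the zero-padding step is my own example and not part of the tutorial data), the following sketch converts x3 to strings and pads each value to three characters:

```python
import pandas as pd

data = pd.DataFrame({"x3": range(1, 5)})

# Convert integers to strings, then left-pad with zeros
data["x3"] = data["x3"].astype(str)
padded = data["x3"].str.zfill(3)
print(padded.tolist())
```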

 

Example 4: Convert Multiple Columns of pandas DataFrame to Different Data Types

So far, we have only converted one single variable to a different data type. The following code demonstrates how to change the class of multiple variables in one line of code.

For this, we have to pass a dictionary (written in curly brackets) to the astype function. The dictionary maps the names of the variables we want to change to the corresponding data types:

data = data.astype({"x2": int, "x3": complex})    # Convert multiple columns

Let’s have another look at the classes of our DataFrame:

print(data.dtypes)                                # Return data types of columns
# x1         int32
# x2         int32
# x3    complex128
# dtype: object

As you can see, we have changed the classes of the columns x2 and x3. Note that we have converted the variable x3 to the complex class, i.e. a new class that we have not used yet.
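Please be aware that converting floats to integers with astype drops the decimal places instead of rounding. The sketch below uses hypothetical values (not the tutorial data) to show the difference between truncating and rounding first:

```python
import pandas as pd

# Hypothetical values chosen to make the truncation visible
values = pd.DataFrame({"a": [1.9, 2.4, -3.7]})

# astype cuts off the decimal places (truncation toward zero)
truncated = values.astype({"a": "int64"})["a"]

# Rounding before the conversion gives the nearest integer instead
rounded = values["a"].round().astype("int64")

print(truncated.tolist())
print(rounded.tolist())
```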

 

Example 5: Convert All Columns of pandas DataFrame to Other Data Type

We can also use the astype function to convert all variables of a pandas DataFrame to the same data type.

To do this, we simply have to apply the astype function to our entire DataFrame, not only to one column:

data = data.astype(str)                           # Convert all columns

Let’s print the data types of our updated data set:

print(data.dtypes)                                # Return data types of columns
# x1    object
# x2    object
# x3    object
# dtype: object

All variables have the object, i.e. character string, data type.
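A quick way to confirm that the values really are character strings is to combine two columns with the + operator. In this sketch, which uses a small hypothetical DataFrame, the operator concatenates the text instead of adding numbers:

```python
import pandas as pd

data = pd.DataFrame({"x1": [10, 9], "x2": [1.1, 2.1]})

# Convert every column to strings in one step
as_text = data.astype(str)

# "+" now joins the strings instead of adding the numbers
combined = as_text["x1"] + as_text["x2"]
print(combined.tolist())
```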

 

Example 6: Convert pandas DataFrame Column to Other Data Type Using to_numeric Function

In the previous examples, we have used the astype function to convert our DataFrame columns to a different class. However, the pandas library also provides other functions to switch between data types.

This example explains how to use the to_numeric function to change the class of a variable.

Consider the following Python code:

data["x1"] = pd.to_numeric(data["x1"])            # Using to_numeric function

Let’s check the classes of our updated data once again:

print(data.dtypes)                                # Return data types of columns
# x1     int64
# x2    object
# x3    object
# dtype: object

As you can see, we have changed the first column of our data set to the integer class.
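An advantage of to_numeric over astype is its errors argument. As a sketch with a hypothetical column that contains one invalid entry, errors="coerce" replaces values that cannot be parsed by NaN instead of raising an error:

```python
import pandas as pd

# Hypothetical column with one value that is not a number
raw = pd.Series(["10", "9", "abc", "7"])

# Invalid entries become NaN, so the result is a float column
cleaned = pd.to_numeric(raw, errors="coerce")
print(cleaned)
```

Because NaN is a float value, the resulting column has the float class rather than the integer class.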

 

Example 7: Convert All pandas DataFrame Columns to Other Data Type Using infer_objects Function

Another function that is provided by the pandas library is the infer_objects function.

The infer_objects command attempts to infer better data types for object columns. It performs only a soft conversion: an object column that already holds Python integers or floats is converted to the corresponding numeric class, but character strings such as "10" are not parsed into numbers.

The infer_objects function can be applied as shown below:

data = data.infer_objects()                       # Using infer_objects function

In our specific case, this doesn’t change anything, because the columns x2 and x3 contain character strings:

print(data.dtypes)                                # Return data types of columns
# x1     int64
# x2    object
# x3    object
# dtype: object

However, depending on your input data, the infer_objects function can improve your data types, for example when an object column contains only Python integers.
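To illustrate a case where infer_objects does make a difference, consider the following sketch. The two Series are hypothetical examples: the first stores integers in an object column, the second stores numeric-looking strings:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Integers stored with the generic object dtype
ints_as_object = pd.Series([1, 2, 3], dtype=object)
inferred_ints = ints_as_object.infer_objects()

# Numeric-looking strings are not parsed by infer_objects
strings = pd.Series(["1", "2", "3"], dtype=object)
inferred_strings = strings.infer_objects()

print(inferred_ints.dtype)
print(is_numeric_dtype(inferred_strings))
```

For numeric-looking strings, to_numeric or astype is the appropriate tool.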

 

Example 8: Convert All pandas DataFrame Columns to Other Data Type Using convert_dtypes Function

Another function that is used to convert columns to the best possible data types is the convert_dtypes function. It can be applied as follows:

data = data.convert_dtypes()                      # Using convert_dtypes function

Let’s check the updated data types of our columns:

print(data.dtypes)                                # Return data types of columns
# x1     Int64
# x2    string
# x3    string
# dtype: object

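As you can see, we have managed to convert the second and third variables of our DataFrame explicitly to the string class. In addition, the first variable now has the nullable integer class Int64 (with a capital I) instead of int64.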
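The nullable classes returned by convert_dtypes are useful when a column contains missing values. The following sketch uses a hypothetical Series with one missing entry; without convert_dtypes, pandas stores it as float64, whereas the converted version keeps whole numbers together with a missing-value marker:

```python
import pandas as pd

# Hypothetical integer data with one missing value (stored as float64)
with_gap = pd.Series([1, None, 3])

# convert_dtypes switches to the nullable Int64 class
converted = with_gap.convert_dtypes()

print(with_gap.dtype)
print(converted.dtype)
```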

 

Video & Further Resources

In case you need more explanations on the handling of data types in Python, I recommend having a look at the data types video on the Telusko YouTube channel.

 

 


 

In this article, I have explained how to transform the class of a pandas DataFrame column in the Python programming language. In case you have additional questions, tell me about it in the comments.

 
