Change Data Type of pandas DataFrame Column in Python (8 Examples)
This tutorial illustrates how to convert DataFrame variables to a different data type in Python.
Let’s jump right to the examples…
Construction of Exemplifying Data
As the first step, we have to load the pandas library to Python:
import pandas as pd # Load pandas
Next, we have to create some example data. I’ll use the following data as a basis for this Python tutorial:
data = pd.DataFrame({"x1":["10", "9", "8", "7"],         # Create example data
                     "x2":["1.1", "2.1", "3.1", "4.1"],
                     "x3":range(1, 5)})
print(data)                                              # Print example data
#    x1   x2  x3
# 0  10  1.1   1
# 1   9  2.1   2
# 2   8  3.1   3
# 3   7  4.1   4
Have a look at the previous console output: As you can see, we have created a pandas DataFrame consisting of four rows and three columns.
We can check the data types of our DataFrame variables by printing the dtypes attribute:
print(data.dtypes)                                       # Return data types of columns
# x1    object
# x2    object
# x3     int64
# dtype: object
The previous output shows that the first and second columns of our DataFrame are objects (i.e. character strings), and the third column has the integer class.
In the following examples, I’ll explain how to convert some or all of our DataFrame variables to a different data type.
Example 1: Convert pandas DataFrame Column to Integer
Example 1 demonstrates how to change the data type of a DataFrame column to the integer class.
For this task, we have to specify “int” within the astype function as shown in the following Python code:
data["x1"] = data["x1"].astype(int) # Convert column to integer
After running the previous code, our data set has been updated. We can check this by printing the data types of our variables once again:
print(data.dtypes)                                       # Return data types of columns
# x1     int32
# x2    object
# x3     int64
# dtype: object
Compare this output with the previous output. The data type of the variable x1 has been converted from the character string class to the integer class. Note that the exact integer width (int32 or int64) depends on your platform and NumPy version.
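If you need the same integer width on every platform, you can pass a dtype string such as "int64" to astype instead of int. The following is only a minimal sketch; it uses a fresh DataFrame called df (a name chosen for illustration) so that the tutorial's data object is left untouched:

```python
import pandas as pd

# Illustrative copy of the x1 column from the example data
df = pd.DataFrame({"x1": ["10", "9", "8", "7"]})

# Request an explicit 64-bit integer instead of the platform default
df["x1"] = df["x1"].astype("int64")

print(df["x1"].dtype)  # int64
```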
Example 2: Convert pandas DataFrame Column to Float
Example 2 illustrates how to set a column of a pandas DataFrame to the float data type.
Similar to Example 1, we can use the astype function. This time, however, we have to specify “float” within the function:
data["x2"] = data["x2"].astype(float) # Convert column to float
Let’s check the classes of our variables again:
print(data.dtypes)                                       # Return data types of columns
# x1      int32
# x2    float64
# x3      int64
# dtype: object
As you can see, the data type of x2 has been changed to the float class.
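Keep in mind that astype only works if every value can be parsed. The sketch below uses a hypothetical Series called prices that contains one invalid entry; under this assumption, the conversion raises an error instead of returning a float column:

```python
import pandas as pd

# Hypothetical column with one entry that is not a valid number
prices = pd.Series(["1.1", "2.1", "n/a"])

try:
    prices.astype(float)
    conversion_failed = False
except ValueError as err:
    # astype refuses to convert the whole column if a single value is invalid
    conversion_failed = True
    print("Conversion failed:", err)
```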
Example 3: Convert pandas DataFrame Column to String
Example 3 demonstrates how to use the astype function to convert a pandas DataFrame column to the character string class by specifying “str” within the astype function.
Have a look at the following Python syntax:
data["x3"] = data["x3"].astype(str) # Convert column to string
Again, let’s check the data types of our columns by printing the dtypes attribute:
print(data.dtypes)                                       # Return data types of columns
# x1      int32
# x2    float64
# x3     object
# dtype: object
The column x3 has been transformed to the character string class (represented by “object”).
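To confirm that the values themselves are now Python strings, you can inspect them directly. This is a small sketch on a separate DataFrame named df (an illustrative name, not part of the example above):

```python
import pandas as pd

# Illustrative copy of the x3 column from the example data
df = pd.DataFrame({"x3": range(1, 5)})
df["x3"] = df["x3"].astype(str)

values = df["x3"].tolist()
print(values)                                  # ['1', '2', '3', '4']
print(all(isinstance(v, str) for v in values)) # True
```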
Example 4: Convert Multiple Columns of pandas DataFrame to Different Data Types
So far, we have only converted one single variable to a different data type. The following code demonstrates how to change the class of multiple variables in one line of code.
For this, we have to pass a dictionary (written in curly brackets) to the astype function. The dictionary maps the names of the variables we want to change to the corresponding target data types:
data = data.astype({"x2": int, "x3": complex})           # Convert multiple columns
Let’s have another look at the classes of our DataFrame:
print(data.dtypes)                                       # Return data types of columns
# x1         int32
# x2         int32
# x3    complex128
# dtype: object
As you can see, we have changed the classes of the columns x2 and x3. Note that we have converted the variable x3 to the complex class, i.e. a new class that we have not used yet. Also note that converting the float column x2 to the integer class truncates the decimal part (e.g. 1.1 becomes 1).
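The dictionary form of astype only touches the columns it names, and a float-to-integer conversion drops the decimal part rather than rounding. A minimal sketch with made-up values in a DataFrame called df (both are assumptions for illustration):

```python
import pandas as pd

# Made-up data: x2 holds floats, x3 holds integers, x1 holds strings
df = pd.DataFrame({"x1": ["10", "9"], "x2": [1.1, 4.9], "x3": [1, 2]})

# Convert only x2 and x3; x1 is not listed and stays unchanged
converted = df.astype({"x2": "int64", "x3": "float64"})

print(converted["x2"].tolist())  # [1, 4] - decimals are truncated, not rounded
print(converted["x3"].tolist())  # [1.0, 2.0]
```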
Example 5: Convert All Columns of pandas DataFrame to Other Data Type
We can also use the astype function to convert all variables of a pandas DataFrame to the same data type.
To do this, we simply have to apply the astype function to our entire DataFrame, not only to one column:
data = data.astype(str) # Convert all columns
Let’s print the data types of our updated data set:
print(data.dtypes)                                       # Return data types of columns
# x1    object
# x2    object
# x3    object
# dtype: object
All variables have the object, i.e. character string, data type.
Example 6: Convert pandas DataFrame Column to Other Data Type Using to_numeric Function
In the previous examples, we have used the astype function to convert our DataFrame columns to a different class. However, the pandas library also provides other functions to switch between data types.
This example explains how to use the to_numeric function to change the class of a variable.
Consider the following Python code:
data["x1"] = pd.to_numeric(data["x1"]) # Using to_numeric function
Let’s check the classes of our updated data once again:
print(data.dtypes)                                       # Return data types of columns
# x1     int64
# x2    object
# x3    object
# dtype: object
As you can see, we have changed the first column of our data set to the integer class.
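One practical difference to astype is that to_numeric can tolerate invalid entries through its errors argument. The following sketch assumes a hypothetical Series named raw with one entry that cannot be parsed; with errors="coerce", that entry becomes NaN instead of raising an error:

```python
import pandas as pd

# Hypothetical column; "eight" cannot be parsed as a number
raw = pd.Series(["10", "9", "eight", "7"], dtype=object)

# Invalid entries are replaced by NaN, so the result is a float column
numbers = pd.to_numeric(raw, errors="coerce")

print(numbers.isna().tolist())  # [False, False, True, False]
print(numbers.dtype)            # float64
```

Because NaN is a float value, the result is a float column even though the remaining entries are whole numbers.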
Example 7: Convert All pandas DataFrame Columns to Other Data Type Using infer_objects Function
Another function that is provided by the pandas library is the infer_objects function.
The infer_objects command attempts to infer better data types for object columns. For example, it converts an object column that holds Python integers to an integer column. It does not parse character strings such as "1.1" into numbers.
The infer_objects function can be applied as shown below:
data = data.infer_objects() # Using infer_objects function
In our specific case, this doesn’t change much:
print(data.dtypes)                                       # Return data types of columns
# x1     int64
# x2    object
# x3    object
# dtype: object
However, if your input data contains object columns whose values are actually numbers (rather than character strings), the infer_objects function assigns them a more specific data type.
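To see a case where infer_objects does make a difference, consider an object column that stores real Python integers. The sketch below is only an illustration; the DataFrame df and its column names are made up:

```python
import pandas as pd

# Made-up data: both columns are stored with the generic object dtype
df = pd.DataFrame({
    "a": pd.Series([1, 2, 3], dtype=object),        # Python integers
    "b": pd.Series(["1", "2", "3"], dtype=object),  # character strings
})

inferred = df.infer_objects()

print(inferred["a"].dtype)     # int64
print(inferred["b"].tolist())  # ['1', '2', '3'] - strings are not parsed
```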
Example 8: Convert All pandas DataFrame Columns to Other Data Type Using convert_dtypes Function
Another function that is used to convert columns to the best possible data types is the convert_dtypes function. It can be applied as follows:
data = data.convert_dtypes() # Using convert_dtypes function
Let’s check the updated data types of our columns:
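print(data.dtypes)                                       # Return data types of columns
# x1     Int64
# x2    string
# x3    string
# dtype: object
As you can see, we have managed to convert the second and third variables of our DataFrame explicitly to the string class. The first variable now has the Int64 class (with a capital I), i.e. the nullable integer type of pandas that supports missing values.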
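The nullable types returned by convert_dtypes are mainly useful when a column contains missing values. A minimal sketch, assuming a made-up DataFrame df with one missing entry per column:

```python
import pandas as pd

# Made-up data with missing values; "n" is stored as float64 because of None
df = pd.DataFrame({"n": [1, 2, None], "s": ["a", "b", None]})

converted = df.convert_dtypes()

print(converted["n"].dtype)            # Int64
print(converted["n"].isna().tolist())  # [False, False, True]
print(converted.dtypes)
```

Here the whole numbers in n are stored as integers again, while the missing entry is kept as a missing value.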
Video & Further Resources
In case you need more explanations on the handling of data types in Python, I recommend having a look at the data types video on the Telusko YouTube channel.
Besides that, you may read the related tutorials on this website:
- Introduction to the pandas Library in Python
- Check Data Type of Columns in pandas DataFrame
- Get List of Column Names Grouped by Data Type in Python
- Check if Column Exists in pandas DataFrame in Python
- Modify & Edit pandas DataFrames in Python
- pandas DataFrames Operations in Python
- Python Programming Examples
In this article, I have explained how to transform the class of a pandas DataFrame column in the Python programming language. In case you have additional questions, tell me about it in the comments.