Python ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z (Example)
In this tutorial you’ll learn how to fix the “ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z” in the Python programming language.
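The article consists of the following information:
- Example Data & Software Libraries
- Reproduce the ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z
- Debug the ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z
- Video & Further Resources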
Let’s get started.
Example Data & Software Libraries
Consider the CSV file illustrated below as a basis for this tutorial:
You may already have noticed that rows 4 and 6 contain one value too many: those two rows contain four values, whereas the other rows contain only three.
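To follow along without the original file, a stand-in can be created as sketched below. The column names and values are made up for illustration and are not the tutorial's actual data; the only property carried over from the description above is that lines 4 and 6 of the file (counting the header as line 1) hold four values instead of three.

```python
import tempfile
from pathlib import Path

# Hypothetical example data: the values are invented for illustration.
# File lines 4 and 6 (counting the header as line 1) carry a fourth value.
csv_text = (
    "x1,x2,x3\n"
    "1,a,10\n"
    "2,b,11\n"
    "3,c,12,extra\n"
    "4,d,13\n"
    "5,e,14,extra\n"
    "6,f,15\n"
)

# Written to a temporary folder here; save it as 'data.csv' in your
# working directory instead to run the tutorial code unchanged.
csv_path = Path(tempfile.mkdtemp()) / "data.csv"
csv_path.write_text(csv_text)

# Count the comma-separated values on each line of the file
field_counts = [len(line.split(",")) for line in csv_text.splitlines()]
print(field_counts)
```

The printed list contains one count per file line, so the two malformed lines stand out as the entries that differ from the header's count.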
Let’s assume that we want to read this CSV file as a pandas DataFrame into Python.
For this, we first have to import the pandas library:
import pandas as pd # Load pandas
Let’s move on to the examples!
Reproduce the ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z
In this section, I’ll show how to replicate the error message “ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z”.
Let’s assume that we want to read our example CSV file using the default settings of the read_csv function. Then, we might try to import our data as shown below:
data_import = pd.read_csv('data.csv')    # Try to import CSV file
# ParserError: Error tokenizing data. C error: Expected 3 fields in line 4, saw 4
Unfortunately, the “ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z” is returned after executing the Python syntax above.
The reason for this is that our CSV file contains too many values in some of the rows.
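The error can also be caught and inspected in code instead of stopping the script. The following is a minimal sketch that uses in-memory text with made-up values (an assumption, not the tutorial's file) and the pd.errors.ParserError exception class:

```python
import io

import pandas as pd

# Hypothetical in-memory CSV: file line 4 has four values instead of three
csv_text = "x1,x2,x3\n1,a,10\n2,b,11\n3,c,12,extra\n4,d,13\n"

try:
    data_import = pd.read_csv(io.StringIO(csv_text))  # Default settings
    error_message = None
except pd.errors.ParserError as err:
    # The message names the expected field count and the offending line
    error_message = str(err)
    print(error_message)
```

The line number in the message counts the header as line 1, which helps to locate the first malformed row in a text editor.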
In the next section, I’ll show an easy solution for this problem. So keep on reading…
Debug the ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z
In this example, I’ll explain an easy fix for the “ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z” in the Python programming language.
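We can skip all lines in our CSV file that are formatted incorrectly by setting the error_bad_lines argument to False. Note that error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0; in these versions, the equivalent is on_bad_lines = 'skip'.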
Have a look at the example code below:
data_import = pd.read_csv('data.csv',              # Remove rows with errors
                          on_bad_lines = 'skip')   # pandas >= 1.3; older versions: error_bad_lines = False
print(data_import)                                 # Print imported pandas DataFrame
As shown in Table 2, we have created a valid pandas DataFrame output using the previous code. As you can see, we have simply skipped the rows with too many values.
This is a simple trick that usually works. However, please use it with care, since the error message typically points to more general issues with your data.
For that reason, it’s advisable to investigate why some of the rows are not formatted properly.
For this, I can also recommend this thread on Stack Overflow. It discusses how to identify wrong lines, and it also discusses other less common reasons for the error message “ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z”.
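One way to identify the wrong lines yourself is to scan the file with Python's built-in csv module and compare each row's length to the header's. The sketch below is only an illustration: find_bad_lines is a hypothetical helper name, and the in-memory data uses made-up values rather than the tutorial's file.

```python
import csv
import io

# Hypothetical in-memory CSV: file lines 4 and 6 have four values
csv_text = (
    "x1,x2,x3\n"
    "1,a,10\n"
    "2,b,11\n"
    "3,c,12,extra\n"
    "4,d,13\n"
    "5,e,14,extra\n"
    "6,f,15\n"
)


def find_bad_lines(handle):
    """Return (line_number, field_count) for rows that differ from the header."""
    reader = csv.reader(handle)
    expected = len(next(reader))  # The header defines the expected field count
    bad = []
    for row in reader:
        if len(row) != expected:
            bad.append((reader.line_num, len(row)))
    return bad


bad_lines = find_bad_lines(io.StringIO(csv_text))
print(bad_lines)
```

For a file on disk, the same helper could be called as find_bad_lines(open('data.csv', newline='')), ideally inside a with block so that the file is closed afterwards.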
Video & Further Resources
Have a look at the following video on my YouTube channel. In the video, I explain the Python code of this tutorial:
The YouTube video will be added soon.
Furthermore, you might read the other articles on this website. You can find some interesting tutorials below:
- Read CSV File as pandas DataFrame in Python
- Skip Rows but Keep Header when Reading CSV File
- Skip First Row when Reading pandas DataFrame from CSV File
- Specify Delimiter when Reading pandas DataFrame from CSV File
- Ignore Header when Reading CSV File as pandas DataFrame
- Check Data Type of Columns in pandas DataFrame in Python
- Change Data Type of pandas DataFrame Column in Python
- Basic Course for the pandas Library in Python
- Introduction to Python Programming
Summary: In this article, I have explained how to handle the “ParserError: Error tokenizing data. C error: Expected X fields in line Y, saw Z” in the Python programming language. If you have any further questions or comments, let me know in the comments.