Read Only Certain Columns of CSV File as pandas DataFrame in Python (Example)
In this tutorial, I’ll illustrate how to import only some specific columns from a CSV file in the Python programming language.
The article covers the example data and add-on libraries, one example of reading specific columns from a CSV file, and further resources.
You’re here for the answer, so let’s get straight to the example!
Example Data & Add-On Libraries
First, we need to import the pandas library:
import pandas as pd # Load pandas library
The following data is used as a basis for this Python programming tutorial:
data = pd.DataFrame({'x1':range(14, 22),                 # Create pandas DataFrame
                     'x2':[7, 3, 7, 5, 3, 9, 1, 8],
                     'x3':['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                     'x4':[9, 2, 1, 1, 4, 1, 4, 8],
                     'x5':range(35, 27, - 1)})
As you can see, our example data set contains the five columns x1, x2, x3, x4, and x5.
We can export this data set to a CSV file using the to_csv function:
data.to_csv('data.csv') # Export pandas DataFrame
Once the previous Python syntax has been executed, a new CSV file containing our example data appears in our current working directory.
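To check what was actually written, the following minimal sketch rebuilds the example data, exports it, and reads back the header line of the file. It writes to a temporary directory rather than the current working directory; that choice, and the names tmp_dir, csv_path, and header, are assumptions made for this illustration only. Note the leading empty field in the header: by default, to_csv also writes the row index as an unnamed first column.

```python
import os
import tempfile

import pandas as pd

# Rebuild the example data from this tutorial
data = pd.DataFrame({'x1': range(14, 22),
                     'x2': [7, 3, 7, 5, 3, 9, 1, 8],
                     'x3': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                     'x4': [9, 2, 1, 1, 4, 1, 4, 8],
                     'x5': range(35, 27, -1)})

# Hypothetical location: a temporary directory instead of the working directory
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, 'data.csv')

data.to_csv(csv_path)                     # Export pandas DataFrame

# Read only the first line of the file to inspect the column header
with open(csv_path) as f:
    header = f.readline().strip()

print(os.path.exists(csv_path))           # True
print(header)                             # ,x1,x2,x3,x4,x5
```

If the unnamed index column is not wanted in the file, passing index = False to to_csv omits it.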
Example: Read Only Specific Columns from CSV File
The following Python programming syntax shows how to import only certain variables of a CSV file as a new pandas DataFrame in Python.
To accomplish this, we apply the read_csv function together with its usecols argument, as shown below. The usecols argument takes a list of the column names that we want to load:
data_import3 = pd.read_csv('data.csv',                   # Read pandas DataFrame from CSV
                           usecols = ['x2', 'x4', 'x5'])
print(data_import3)                                      # Print imported pandas DataFrame
Table 1 displays the Python console output and shows that we have created a new pandas DataFrame that contains eight rows and the three columns x2, x4, and x5.
The variables x1 and x3 from our input CSV have been ignored.
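Besides a list of names, usecols also accepts a list of integer column positions or a callable that is evaluated against each column name. The sketch below compares these variants. To stay self-contained, it keeps the CSV text in memory via io.StringIO instead of reading data.csv from disk; this, and the names csv_text, by_name, by_position, by_callable, and reordered, are assumptions made for illustration.

```python
import io

import pandas as pd

# Rebuild the example data and keep the CSV text in memory
data = pd.DataFrame({'x1': range(14, 22),
                     'x2': [7, 3, 7, 5, 3, 9, 1, 8],
                     'x3': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                     'x4': [9, 2, 1, 1, 4, 1, 4, 8],
                     'x5': range(35, 27, -1)})
csv_text = data.to_csv()                  # Returns a string when no path is given

# Variant 1: select columns by name (as in the example above)
by_name = pd.read_csv(io.StringIO(csv_text),
                      usecols=['x2', 'x4', 'x5'])

# Variant 2: select columns by position; position 0 is the unnamed index column
by_position = pd.read_csv(io.StringIO(csv_text),
                          usecols=[2, 4, 5])

# Variant 3: select columns with a callable applied to each column name
wanted = {'x2', 'x4', 'x5'}
by_callable = pd.read_csv(io.StringIO(csv_text),
                          usecols=lambda name: name in wanted)

# The order of names in usecols is ignored; columns keep their order in the file
reordered = pd.read_csv(io.StringIO(csv_text),
                        usecols=['x5', 'x2'])

print(list(by_name.columns))              # ['x2', 'x4', 'x5']
print(by_name.shape)                      # (8, 3)
print(list(reordered.columns))            # ['x2', 'x5']
```

Because usecols does not reorder columns, a specific column order has to be applied after the import, for instance by indexing the result with [['x5', 'x2']].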
Video & Further Resources
Do you want to know more about reading particular variables from a CSV file as a pandas DataFrame? Then I recommend having a look at the following video on my YouTube channel. In the video, I show the Python syntax of this tutorial in a live programming session:
The YouTube video will be added soon.
In addition, you might want to read some of the other tutorials on this website:
- Read CSV File as pandas DataFrame in Python
- Read CSV File without Unnamed Index Column
- Rename Columns of pandas DataFrame in Python
- Append Columns to pandas DataFrame in Loop in Python
- Iterate Over Columns of pandas DataFrame in Python
- Sort pandas DataFrame by Multiple Columns in Python
- Add Multiple Columns to pandas DataFrame in Python
- Sum of Columns & Rows of pandas DataFrame in Python
- Basic Course for the pandas Library in Python
- The Python Programming Language
In this tutorial, you have learned how to load only some specific columns from a CSV file into a pandas DataFrame in the Python programming language. If you have additional questions or comments, don’t hesitate to let me know in the comments section.