Difference Between List & pandas DataFrame in Python
Hi! This short tutorial will explain the difference between lists and pandas DataFrames in the Python programming language.
Let’s dive into the discussion and code!
Python List
A Python list can store elements of any data type, such as strings, integers, or other lists. Below are examples of Python lists containing, from top to bottom, integers, strings, a mix of integers, a nested list, and strings, and user-defined objects.
# list of integers
int_list = [2,4,6,8,10,12,14]
print(int_list)
# [2, 4, 6, 8, 10, 12, 14]

# list of strings
str_list = ["zucchini", "asparagus", "spinach", "broccoli", "lettuce"]
print(str_list)
# ['zucchini', 'asparagus', 'spinach', 'broccoli', 'lettuce']

# list of mixed data types
mixed_list = [4,2,5,[1.8,4.6,8.2],"airplane", "ship", "train"]
print(mixed_list)
# [4, 2, 5, [1.8, 4.6, 8.2], 'airplane', 'ship', 'train']

# list of objects
class CarBrand:
    def __init__(self, brand, price):
        self.brand = brand
        self.price = price

my_list = []
my_list.append(CarBrand("Toyota",3500))
my_list.append(CarBrand("Honda",3000))
my_list.append(CarBrand("Mercedes",4200))
my_list.append(CarBrand("Ford",5000))
my_list.append(CarBrand("Kia",2500))

# for loop to print object list
for c in my_list:
    print(f"{c.brand}({c.price})")
# Toyota(3500)
# Honda(3000)
# Mercedes(4200)
# Ford(5000)
# Kia(2500)
We can also create a Python list using the list() function:
# list of integers
int_list = list((2,4,6,8,10,12,14))
print(int_list)
# [2, 4, 6, 8, 10, 12, 14]

# list of strings
str_list = list(("zucchini", "asparagus", "spinach", "broccoli", "lettuce"))
print(str_list)
# ['zucchini', 'asparagus', 'spinach', 'broccoli', 'lettuce']

# list of mixed data types
mixed_list = list((4,2,5,[1.8,4.6,8.2],"airplane", "ship", "train"))
print(mixed_list)
# [4, 2, 5, [1.8, 4.6, 8.2], 'airplane', 'ship', 'train']
In the examples above, we passed a tuple to the list() constructor, which returned a list containing the tuple's elements. The elements can all share one type, as in int_list and str_list, or have mixed types, as in mixed_list.
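The list() constructor is not limited to tuples; it accepts any iterable. Here is a minimal sketch of that behavior. The variable names from_range, from_string, and copy_of_range are only illustrative and do not appear elsewhere in this tutorial.

```python
# list() accepts any iterable, not only tuples
from_range = list(range(2, 15, 2))
print(from_range)
# [2, 4, 6, 8, 10, 12, 14]

# a string is an iterable of its characters
from_string = list("kale")
print(from_string)
# ['k', 'a', 'l', 'e']

# calling list() on an existing list returns a new (shallow) copy
copy_of_range = list(from_range)
print(copy_of_range == from_range)
# True
print(copy_of_range is from_range)
# False
```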
Attributes of Python List
Below are the main characteristics of a Python list:
- A list is an ordered collection of elements, where each element has a specific index starting from 0.
- Lists are mutable, meaning you can add, remove, or modify elements after creating the list.
- The elements in a list can be accessed by their index values.
- The indexes of a list are always integers.
- A list can hold duplicate values.
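The properties listed above can be checked with a few lines of code. This is a small sketch using a hypothetical list called veggies, which was made up for this illustration.

```python
veggies = ["zucchini", "asparagus", "spinach"]

# ordered, with integer indexes starting at 0
print(veggies[0])
# zucchini
print(veggies[-1])  # negative indexes count from the end
# spinach

# mutable: elements can be modified, added, and removed
veggies[1] = "broccoli"
veggies.append("spinach")   # duplicates are allowed
veggies.remove("zucchini")  # removes the first matching element
print(veggies)
# ['broccoli', 'spinach', 'spinach']

# indexes must be integers (or slices); anything else raises a TypeError
try:
    veggies["first"]
except TypeError as err:
    print(type(err).__name__)
# TypeError
```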
pandas DataFrame
A pandas DataFrame is a two-dimensional data structure, that is, a table made up of rows and columns. Before we can create a DataFrame, however, we need to install and import the pandas library. Note that pip install is a terminal command, not Python code, so run it in a terminal first and then import pandas in your Python session:
# install pandas (run this in a terminal)
pip install pandas

# import pandas (run this in Python)
import pandas as pd
With pandas installed and imported into our Python programming environment, we can now create example DataFrames:
# DataFrame holding homogeneous data
data = {"Name":["Sally","Tayo","Ali","Morgan","James"],
        "Sex":["Female","Male","Male","Female","Male"],
        "Pet":["cat", "dog", "rabbit", "pig", "dog"],
        "Color":["pink", "green", "blue", "yellow", "black"]}

df = pd.DataFrame(data)
print(df)
#      Name     Sex     Pet   Color
# 0   Sally  Female     cat    pink
# 1    Tayo    Male     dog   green
# 2     Ali    Male  rabbit    blue
# 3  Morgan  Female     pig  yellow
# 4   James    Male     dog   black

# DataFrame holding heterogeneous data
data = {"Name":["Sally","Tayo","Ali","Morgan","James"],
        "Age":[22,24,25,20,18],
        "Sex":["Female","Male","Male","Female","Male"],
        "Height":[5.1,6.2,6.5,5.5,6.0]}

df = pd.DataFrame(data)
print(df)
#      Name  Age     Sex  Height
# 0   Sally   22  Female     5.1
# 1    Tayo   24    Male     6.2
# 2     Ali   25    Male     6.5
# 3  Morgan   20  Female     5.5
# 4   James   18    Male     6.0
In both examples, we first created a Python dictionary whose keys are the column names and whose values are lists of column entries. We then passed that dictionary to the pd.DataFrame() constructor to create the pandas DataFrame.
Attributes of pandas DataFrame
Below are the attributes of DataFrames in general, including those made in pandas:
- A DataFrame is made up of rows or observations. Each row represents a single record or entity, and each column represents a variable or attribute of that record.
- A DataFrame is size-mutable, and can hold potentially heterogeneous data. The Python pandas library can be used to create a DataFrame.
- A DataFrame is also made up of columns or variables. Each column contains a specific data type, such as numeric, character, or boolean data.
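- DataFrames usually have a header containing the columns’ names. The header labels each column. In pandas it is stored separately from the data, and when a DataFrame is read from a file, it is typically taken from the first row of that file.
- A DataFrame also has an index that labels each row. By default, pandas numbers the rows with integers starting from 0, but the index can also be a unique identifier, such as a customer ID or a timestamp.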
- The size of a DataFrame is the total number of cells, which is equal to the number of rows times the number of columns.
- DataFrames have a shape that describes the number of rows and columns. For example, a DataFrame with 100 rows and 5 columns has a shape of (100, 5).
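Several of the properties listed above can be inspected directly in pandas. The sketch below rebuilds the heterogeneous example DataFrame so that it runs on its own. The name df_by_name is only illustrative, and choosing "Name" as the index is an assumption made for this demonstration.

```python
import pandas as pd

data = {"Name": ["Sally", "Tayo", "Ali", "Morgan", "James"],
        "Age": [22, 24, 25, 20, 18],
        "Sex": ["Female", "Male", "Male", "Female", "Male"],
        "Height": [5.1, 6.2, 6.5, 5.5, 6.0]}
df = pd.DataFrame(data)

# shape is (rows, columns); size is rows times columns
print(df.shape)
# (5, 4)
print(df.size)
# 20

# header (column names) and default row index
print(list(df.columns))
# ['Name', 'Age', 'Sex', 'Height']
print(list(df.index))
# [0, 1, 2, 3, 4]

# each column has its own data type
print(df["Height"].dtype)
# float64

# size-mutable: adding a column changes the shape
df["Pet"] = ["cat", "dog", "rabbit", "pig", "dog"]
print(df.shape)
# (5, 5)

# the index can be replaced by a column of unique labels
df_by_name = df.set_index("Name")
print(df_by_name.loc["Ali", "Age"])
# 25
```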
In short, both lists and DataFrames can store homogeneous or heterogeneous data. Lists are suitable for simple, one-dimensional collections where access by position is the primary operation. DataFrames, on the other hand, store tabular data with labeled rows and columns and provide more sophisticated methods for data cleaning, transformation, and analysis.
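To make the contrast concrete, here is a small sketch that performs the same two tasks, a filter and an average, first with plain lists and then with a DataFrame. The threshold of 21 and the variable names are assumptions chosen only for this illustration.

```python
import pandas as pd

names = ["Sally", "Tayo", "Ali", "Morgan", "James"]
ages = [22, 24, 25, 20, 18]

# with lists: pair the elements by position and loop over them
older_names = [n for n, a in zip(names, ages) if a > 21]
print(older_names)
# ['Sally', 'Tayo', 'Ali']
print(sum(ages) / len(ages))
# 21.8

# with a DataFrame: filter by a column condition and use built-in methods
df = pd.DataFrame({"Name": names, "Age": ages})
older_df = df.loc[df["Age"] > 21, "Name"].tolist()
print(older_df)
# ['Sally', 'Tayo', 'Ali']
print(df["Age"].mean())
# 21.8
```

Both approaches give the same result. The list version relies on keeping two separate lists aligned by position, while the DataFrame keeps related values together in one table.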
With that, we have examined the difference between lists and pandas DataFrame in Python. I hope you found this tutorial helpful!
Video, Further Resources & Summary
Furthermore, I encourage you to check out other interesting Python list tutorials on Statistics Globe, starting with these:
- Change Index of Element in List in Python (Example)
- Add Float to Each Element in List in Python (4 Examples)
- Learn Python Programming
- Access Elements in List within Dictionary in Python (2 Examples)
- Convert List to Matrix & Vice-Versa in Python (Examples)
This post has shown the difference between lists and pandas DataFrame in Python. In case you have further questions, you may leave a comment below.
This page was created in collaboration with Ifeanyi Idiaye. You might check out Ifeanyi’s personal author page to read more about his academic background and the other articles he has written for the Statistics Globe website.