How to Create a Pandas DataFrame in Python

A DataFrame is a two-dimensional pandas object that stores data in a structured, tabular format of rows and columns, much like a spreadsheet or an SQL table.

There are several ways to create a DataFrame in Python. You first need to import pandas; a DataFrame is then created with the pd.DataFrame() constructor. The data can be supplied in several forms, including a dictionary of lists or a list of lists.

Create DataFrame from Dictionary of Lists

import pandas as pd
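# Each key becomes a column name; each list holds that column's values.
data = {'id': ['202301', '202302', '202303', '202304', '202305', '202306'],
        'name': ['Minhaj', 'Ridhwan', 'Tanveer', 'Sharodia', 'Alve', 'Intisar'],
        'math_score': [92, 86, 76, 89, 99, 99]}
df = pd.DataFrame(data, index=None)  # index=None (the default) gives a 0-based integer index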
print(df)

The output is a table with three columns named 'id', 'name' and 'math_score', filled with the provided data:

       id      name  math_score
0  202301    Minhaj          92
1  202302   Ridhwan          86
2  202303   Tanveer          76
3  202304  Sharodia          89
4  202305      Alve          99
5  202306   Intisar          99
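
As a quick check on the result, the short sketch below rebuilds the same DataFrame and inspects its dimensions and column types. It uses only the names from the example above; the helper variables n_rows, n_cols and column_names are introduced here for illustration. Note that the id values were entered as quoted strings, so pandas stores them as text rather than as numbers.

```python
import pandas as pd

data = {'id': ['202301', '202302', '202303', '202304', '202305', '202306'],
        'name': ['Minhaj', 'Ridhwan', 'Tanveer', 'Sharodia', 'Alve', 'Intisar'],
        'math_score': [92, 86, 76, 89, 99, 99]}
df = pd.DataFrame(data)

n_rows, n_cols = df.shape          # (number of rows, number of columns)
column_names = list(df.columns)    # column labels, in dictionary key order
print(n_rows, n_cols)
print(column_names)
print(df.dtypes)                   # one data type per column
```

If the ids should behave as numbers, they can be written without quotes in the dictionary.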

Create DataFrame from List of Lists

import pandas as pd
# Each inner list is one row: [id, name, math_score].
data = [['202301', 'Minhaj', 92], ['202302', 'Ridhwan', 86], ['202303', 'Tanveer', 76],
        ['202304', 'Sharodia', 89], ['202305', 'Alve', 99], ['202306', 'Intisar', 99]]
df = pd.DataFrame(data, columns=['id', 'name', 'math_score'], index=None)
print(df.head())

This produces the same table; only the form of the input data differs. Note that head() prints just the first five rows, so the sixth row does not appear below. Because a list of lists carries no column names, you need to pass them through the columns argument of pd.DataFrame(); otherwise pandas labels the columns with integers (0, 1, 2).

       id      name  math_score
0  202301    Minhaj          92
1  202302   Ridhwan          86
2  202303   Tanveer          76
3  202304  Sharodia          89
4  202305      Alve          99
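
To see what happens when the column names are left out, here is a minimal sketch using the first three rows of the same data. The names df_default and default_labels are chosen here for illustration only.

```python
import pandas as pd

data = [['202301', 'Minhaj', 92], ['202302', 'Ridhwan', 86], ['202303', 'Tanveer', 76]]

df_default = pd.DataFrame(data)            # no columns argument
default_labels = list(df_default.columns)  # integer labels 0, 1, 2
print(default_labels)

# Assign names afterwards; the list length must match the number of columns.
df_default.columns = ['id', 'name', 'math_score']
print(df_default)
```

Assigning to df_default.columns is one way to name the columns after the fact; passing columns to pd.DataFrame() up front, as in the example above, achieves the same result in one step.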

Create DataFrame with a Custom Index

import pandas as pd
data = {'id': ['202301', '202302', '202303', '202304', '202305', '202306'],
        'name': ['Minhaj', 'Ridhwan', 'Tanveer', 'Sharodia', 'Alve', 'Intisar'],
        'math_score': [92, 86, 76, 89, 99, 99]}
df = pd.DataFrame(data, columns=['id', 'name', 'math_score'],
                  index=['i1', 'i2', 'i3', 'i4', 'i5', 'i6'])
print(df.head())

The data is the same as before, but the rows are now labeled i1 to i6 instead of 0 to 5. Since head() prints only the first five rows, the output ends at i5:

        id      name  math_score
i1  202301    Minhaj          92
i2  202302   Ridhwan          86
i3  202303   Tanveer          76
i4  202304  Sharodia          89
i5  202305      Alve          99
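
One practical use of a custom index is looking up rows by label. The sketch below assumes the same data and index labels as the example above and uses the .loc accessor; the variable names second_name and last_row are illustrative.

```python
import pandas as pd

data = {'id': ['202301', '202302', '202303', '202304', '202305', '202306'],
        'name': ['Minhaj', 'Ridhwan', 'Tanveer', 'Sharodia', 'Alve', 'Intisar'],
        'math_score': [92, 86, 76, 89, 99, 99]}
df = pd.DataFrame(data, index=['i1', 'i2', 'i3', 'i4', 'i5', 'i6'])

second_name = df.loc['i2', 'name']   # single value by row label and column name
last_row = df.loc['i6']              # whole row as a Series; not shown by head()
print(second_name)
print(last_row)
print(len(df))                       # all six rows are still in the DataFrame
```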

This tutorial gave a brief overview of how to create a pandas DataFrame in Python. I hope you enjoyed it. For updates, like my Facebook page https://www.facebook.com/LearningBigDataAnalytics and stay connected.
