How to Create a Pandas DataFrame in Python
A DataFrame is the two-dimensional data structure in pandas for storing data in tabular form, with rows and columns. It is much like a spreadsheet or a SQL table.
There are several ways to create a DataFrame in Python. First import pandas, then call the pd.DataFrame() constructor. The input data can take several forms, including a dictionary of lists or a list of lists.
Create DataFrame from Dictionary of Lists
import pandas as pd
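# Each dictionary key becomes a column name; each list holds that column's values
data = {'id': ['202301', '202302', '202303', '202304', '202305', '202306'],
        'name': ['Minhaj', 'Ridhwan', 'Tanveer', 'Sharodia', 'Alve', 'Intisar'],
        'math_score': [92, 86, 76, 89, 99, 99]}
# index=None is the default, so rows are labelled 0, 1, 2, ...
df = pd.DataFrame(data, index=None)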
print(df)
The output is a table with three columns named 'id', 'name' and 'math_score', filled with the provided data:
       id      name  math_score
0  202301    Minhaj          92
1  202302   Ridhwan          86
2  202303   Tanveer          76
3  202304  Sharodia          89
4  202305      Alve          99
5  202306   Intisar          99
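As a quick check on the description above, the following minimal sketch inspects the shape, column names and default row labels of a DataFrame built from a dictionary of lists. It uses only the first three rows of the sample data to stay short, and the variable names (n_rows, column_names, index_labels) are illustrative choices, not part of the original example.

```python
import pandas as pd

# Shortened copy of the tutorial's sample data (first three rows only)
data = {
    'id': ['202301', '202302', '202303'],
    'name': ['Minhaj', 'Ridhwan', 'Tanveer'],
    'math_score': [92, 86, 76],
}
df = pd.DataFrame(data)

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape

# Column names come from the dictionary keys, in insertion order
column_names = list(df.columns)

# With no index given, rows are labelled 0, 1, 2, ...
index_labels = list(df.index)

print(n_rows, n_cols)
print(column_names)
print(index_labels)
```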
Create DataFrame from List of Lists
import pandas as pd
# Each inner list is one row: [id, name, math_score]
data = [['202301', 'Minhaj', 92], ['202302', 'Ridhwan', 86], ['202303', 'Tanveer', 76],
        ['202304', 'Sharodia', 89], ['202305', 'Alve', 99], ['202306', 'Intisar', 99]]
df = pd.DataFrame(data, columns=['id', 'name', 'math_score'], index=None)
print(df.head())
This produces the same columns and values as the first example. The only difference is the form in which the data is provided. Because head() returns only the first five rows by default, the sixth row is not printed. A list of lists carries no column names, so they must be passed through the columns argument of pd.DataFrame(); otherwise pandas labels the columns with integers (0, 1, 2).
       id      name  math_score
0  202301    Minhaj          92
1  202302   Ridhwan          86
2  202303   Tanveer          76
3  202304  Sharodia          89
4  202305      Alve          99
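To illustrate the point about column names, here is a small sketch that builds the same list of lists with and without the columns argument, then compares the named version with a DataFrame built from a dictionary of lists. It uses two rows of the sample data for brevity, and the variable names (unnamed, named, from_dict) are illustrative.

```python
import pandas as pd

# Two rows of the tutorial's sample data, one inner list per row
rows = [['202301', 'Minhaj', 92], ['202302', 'Ridhwan', 86]]

# Without column names, pandas falls back to integer labels
unnamed = pd.DataFrame(rows)
unnamed_columns = list(unnamed.columns)

# With column names passed explicitly
named = pd.DataFrame(rows, columns=['id', 'name', 'math_score'])

# The same data expressed as a dictionary of lists
from_dict = pd.DataFrame({
    'id': ['202301', '202302'],
    'name': ['Minhaj', 'Ridhwan'],
    'math_score': [92, 86],
})

# equals() compares labels, values and dtypes
same = named.equals(from_dict)

print(unnamed_columns)
print(same)
```

Either input form can be used; the dictionary form keeps each column's name next to its values, while the list-of-lists form reads more naturally when the data arrives row by row.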
Create DataFrame with a Custom Index
import pandas as pd
data = {'id': ['202301', '202302', '202303', '202304', '202305', '202306'],
        'name': ['Minhaj', 'Ridhwan', 'Tanveer', 'Sharodia', 'Alve', 'Intisar'],
        'math_score': [92, 86, 76, 89, 99, 99]}
# index supplies one label per row; it must have as many entries as there are rows
df = pd.DataFrame(data, columns=['id', 'name', 'math_score'],
                  index=['i1', 'i2', 'i3', 'i4', 'i5', 'i6'])
print(df.head())
The values are the same as before, but the rows are now labelled i1, i2, and so on instead of 0, 1, 2. Because head() returns only the first five rows by default, row i6 is not printed:
        id      name  math_score
i1  202301    Minhaj          92
i2  202302   Ridhwan          86
i3  202303   Tanveer          76
i4  202304  Sharodia          89
i5  202305      Alve          99
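One reason to set a custom index is that rows can then be looked up by label. The sketch below assumes the same sample data (first three rows only) and uses the standard .loc accessor. The lookups shown are an illustration added here, not part of the original example.

```python
import pandas as pd

# First three rows of the tutorial's sample data with custom row labels
data = {
    'id': ['202301', '202302', '202303'],
    'name': ['Minhaj', 'Ridhwan', 'Tanveer'],
    'math_score': [92, 86, 76],
}
df = pd.DataFrame(data, index=['i1', 'i2', 'i3'])

# Select a whole row by its label (returns a Series)
row_i2 = df.loc['i2']

# Select a single cell by row label and column name
name_i2 = df.loc['i2', 'name']
score_i3 = df.loc['i3', 'math_score']

print(row_i2)
print(name_i2, score_i3)
```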
This tutorial briefly covered how to create a pandas DataFrame in Python. I hope you enjoyed it. For updates, like my Facebook page https://www.facebook.com/LearningBigDataAnalytics and stay connected.