CBSE CS and IP

CBSE Class 11 & 12 Computer Science and Informatics Practices Python Materials, Video Lecture

What is a Pandas DataFrame? How to Create It?

Python Pandas DataFrame



What is DataFrame?

A DataFrame is a two-dimensional, labelled data structure whose columns can be of different types. It is similar to a spreadsheet, an SQL table, or a dict of Series objects.
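To make the "columns of different types" idea concrete, here is a minimal sketch. The column names and values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical sample data: one text column, one integer column, one float column
df = pd.DataFrame({
    'name': ['Asha', 'Ravi', 'Meena'],
    'roll_no': [1, 2, 3],
    'marks': [88.5, 92.0, 79.25],
})

print(df.shape)    # (number of rows, number of columns)
print(df.dtypes)   # each column reports its own data type
```

Like a spreadsheet, every row has the same set of columns, but each column keeps its own type.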


Characteristics of DataFrame Object:
  • It has two indexes (axes): the row index (axis = 0) and the column index (axis = 1)
  • The row index is known as the index, and the column index is known as the column name
  • Index (row index) and column (column index) labels can be numbers, letters or strings
  • Different columns can hold values of different types
  • A DataFrame is value-mutable and size-mutable
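The characteristics listed above can be tried out with a small sketch. The names, marks and row labels ('r1', 'r2', 'r3') are hypothetical and only serve the example.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Asha', 'Ravi'], 'marks': [88, 92]},
                  index=['r1', 'r2'])

print(df.index)     # row labels (axis = 0)
print(df.columns)   # column labels (axis = 1)

# Value mutable: change an existing cell
df.loc['r1', 'marks'] = 90

# Size mutable: add a new column, then a new row
df['grade'] = ['A', 'A']
df.loc['r3'] = ['Meena', 79, 'B']

print(df.shape)
```

After these steps the DataFrame has grown from 2 rows and 2 columns to 3 rows and 3 columns.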

Creation of DataFrame

Now we will discuss how to create a Pandas DataFrame. Before creating a DataFrame object, we have to import the pandas library.

import pandas

Syntax for DataFrame Creation:

<df_object> = pandas.DataFrame(data = None, index = None, 
                            columns = None, dtype = None, copy = False)
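As a rough illustration of the parameters in the syntax above, the sketch below passes data, index, columns and dtype together. The labels 'a', 'b', 'x' and 'y' are arbitrary choices for this example.

```python
import pandas as pd

# data: the values, index: row labels, columns: column labels,
# dtype: a single type forced on every column
df = pd.DataFrame(data=[[1, 2], [3, 4]],
                  index=['a', 'b'],
                  columns=['x', 'y'],
                  dtype=float)
print(df)
print(df.dtypes)
```

Because dtype=float is given, the integers in data are stored as floating-point values.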

1. Dictionary of List / Series

Dictionary of List

import pandas as pd

d = {'first_name': ['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy'],
        'last_name': ['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler'],
        'age': [42, 38, 36, 41, 35],
        'Comedy_Score': [9, 7, 8, 8, 5],
        'Rating_Score': [25, 25, 49, 62, 70]}

df = pd.DataFrame(d)
print(df)


Output
------
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
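Two details are worth trying here, shown in the sketch below on a shortened version of the same data: the optional index argument replaces the default 0, 1, 2, ... row labels, and all the lists in the dictionary must have the same length. The row labels 's1' to 's3' are assumptions made for this example.

```python
import pandas as pd

d = {'first_name': ['Sheldon', 'Raj', 'Leonard'],
     'age': [42, 38, 36]}

# index= supplies our own row labels instead of 0, 1, 2
df = pd.DataFrame(d, index=['s1', 's2', 's3'])
print(df)

# Lists of unequal length are rejected with a ValueError
try:
    pd.DataFrame({'first_name': ['Sheldon', 'Raj'], 'age': [42]})
except ValueError as err:
    print('ValueError:', err)
```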

Dictionary of Series

import pandas as pd
d = {'first_name': pd.Series(['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy']),
        'last_name': pd.Series(['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler']),
        'age': pd.Series([42, 38, 36, 41, 35]),
        'Comedy_Score': pd.Series([9, 7, 8, 8, 5]),
        'Rating_Score': pd.Series([25, 25, 49, 62, 70])}

df = pd.DataFrame(d)
print(df)

Output
------
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
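One difference from a dictionary of lists is that Series are matched by their index labels, so they do not need to be the same length. The sketch below uses names as index labels (an assumption made for this example) and leaves one score out on purpose.

```python
import pandas as pd

d = {'age': pd.Series([42, 38, 36], index=['Sheldon', 'Raj', 'Leonard']),
     'Comedy_Score': pd.Series([9, 7], index=['Sheldon', 'Raj'])}

df = pd.DataFrame(d)
print(df)

# Leonard has no Comedy_Score, so that cell is filled with NaN
print(df.loc['Leonard', 'Comedy_Score'])
```

The resulting DataFrame has one row for every label that appears in any of the Series.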


2. From List of List / Dictionaries

List of List

import pandas as pd
l= [ ['Sheldon', 'Copper', 42, 9, 25],
     ['Raj', 'Koothrappali', 38, 7, 25],
     ['Leonard', 'Hofstadter', 36, 8, 49],
     ['Howard', 'Wolowitz', 41, 8, 62],
     ['Amy', 'Fowler', 35, 5, 70] ]

df = pd.DataFrame(l)
print(df)


Output
------
         0             1   2  3   4
0  Sheldon        Copper  42  9  25
1      Raj  Koothrappali  38  7  25
2  Leonard    Hofstadter  36  8  49
3   Howard      Wolowitz  41  8  62
4      Amy        Fowler  35  5  70
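With a list of lists, the columns are numbered 0, 1, 2, ... because no names were given. A minimal sketch of naming them with the columns argument, using the first two rows of the same data:

```python
import pandas as pd

l = [['Sheldon', 'Copper', 42, 9, 25],
     ['Raj', 'Koothrappali', 38, 7, 25]]

# columns= gives each position in the inner lists a name
df = pd.DataFrame(l, columns=['first_name', 'last_name', 'age',
                              'Comedy_Score', 'Rating_Score'])
print(df)
```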


List of Dictionary

import pandas as pd

l = [{'first_name': 'Sheldon', 'last_name': 'Copper', 'age': 42, 'Comedy_Score': 9, 'Rating_Score': 25},
     {'first_name': 'Raj', 'last_name': 'Koothrappali', 'age': 38, 'Comedy_Score': 7, 'Rating_Score': 25},
     {'first_name': 'Leonard', 'last_name': 'Hofstadter', 'age': 36, 'Comedy_Score': 8, 'Rating_Score': 49},
     {'first_name': 'Howard', 'last_name': 'Wolowitz', 'age': 41, 'Comedy_Score': 8, 'Rating_Score': 62},
     {'first_name': 'Amy', 'last_name': 'Fowler', 'age': 35, 'Comedy_Score': 5, 'Rating_Score': 70}]

df = pd.DataFrame(l)
print(df)

Output
------
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
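In a list of dictionaries, each dictionary becomes one row and its keys become the column names. The dictionaries do not all need the same keys. The sketch below, a cut-down version of the data above, drops one key on purpose to show what happens.

```python
import pandas as pd

l = [{'first_name': 'Sheldon', 'age': 42, 'Comedy_Score': 9},
     {'first_name': 'Raj', 'age': 38}]   # no 'Comedy_Score' key here

df = pd.DataFrame(l)
print(df)

# The missing value is stored as NaN
print(df.loc[1, 'Comedy_Score'])
```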



3. Text / CSV Files

file: data.csv
--------------
first_name,last_name,age,Comedy_Score,Rating_Score
Sheldon, Copper, 42, 9, 25
Raj, Koothrappali, 38, 7, 25
Leonard, Hofstadter, 36, 8, 49
Howard, Wolowitz, 41, 8, 62
Amy, Fowler, 35, 5, 70

import pandas as pd
df = pd.read_csv("data.csv")
# a .txt file works too, as long as the values are comma-separated
# (for any other separator, pass it with the sep parameter)
print(df)

Output
------
  first_name      last_name  age  Comedy_Score  Rating_Score
0    Sheldon         Copper   42             9            25
1        Raj   Koothrappali   38             7            25
2    Leonard     Hofstadter   36             8            49
3     Howard       Wolowitz   41             8            62
4        Amy         Fowler   35             5            70
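Notice that the last_name column in this output is one character wider than before. The file has a space after each comma, and read_csv keeps that space as part of the text values. A minimal sketch of removing it with the skipinitialspace parameter follows. It uses io.StringIO as a stand-in for data.csv (an assumption made so the example runs without a file on disk).

```python
import io
import pandas as pd

# Stand-in for data.csv: same layout, with a space after each comma
csv_text = """first_name,last_name,age
Sheldon, Copper, 42
Raj, Koothrappali, 38
"""

raw = pd.read_csv(io.StringIO(csv_text))
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(repr(raw.loc[0, 'last_name']))    # leading space is kept
print(repr(clean.loc[0, 'last_name']))  # leading space is dropped
```

With a real file, the same parameter can be passed as pd.read_csv("data.csv", skipinitialspace=True).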

