What is a DataFrame?
A DataFrame is a two-dimensional, labeled data structure whose columns can have different types. It is similar to a spreadsheet, a SQL table, or a dict of Series objects.
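Characteristics of a DataFrame object:
- It has two axes: the row index (axis = 0) and the column index (axis = 1).
- The row labels are known as the index, and the column labels are known as the column names.
- Row labels and column labels can be numbers, letters, or strings.
- Different columns can hold values of different types. Each column has a single dtype; a column with mixed values is stored with the object dtype.
- A DataFrame is value-mutable and size-mutable: values can be changed in place, and rows or columns can be added or removed.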
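The characteristics listed above can be checked on a small frame. The following is a minimal sketch; the frame df, its column names, and the added city column are made up for illustration only.

```python
import pandas as pd

# Hypothetical example data, chosen only to illustrate the points above.
df = pd.DataFrame({'name': ['Sheldon', 'Raj'], 'age': [42, 38]})

# Two axes: rows (axis = 0) and columns (axis = 1).
n_rows, n_cols = df.shape
row_labels = list(df.index)      # default integer row labels
col_labels = list(df.columns)    # column names

# Each column has its own dtype.
age_is_integer = pd.api.types.is_integer_dtype(df['age'])

# Value mutable: change a single cell in place.
df.loc[0, 'age'] = 43

# Size mutable: add a column, then drop a row.
df['city'] = ['Pasadena', 'Pasadena']
df = df.drop(index=1)

print(df)
```

Note that drop returns a new DataFrame by default, which is why the result is assigned back to df.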
Creating a DataFrame
This section shows how to create a pandas DataFrame. Before creating a DataFrame object, import the pandas library.
Syntax for DataFrame creation:
<df_object> = pandas.DataFrame(data = None, index = None,
                               columns = None, dtype = None, copy = None)
(Since pandas 1.3 the default for copy is None; earlier releases used False.)
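As a rough illustration of the constructor parameters, the sketch below passes data, index, columns, and dtype explicitly. The row labels r1 and r2 are hypothetical names picked for this example.

```python
import pandas as pd

# Two rows and two columns of scores, used only for illustration.
data = [[9, 25], [7, 25]]

df = pd.DataFrame(data=data,
                  index=['r1', 'r2'],                        # row labels
                  columns=['Comedy_Score', 'Rating_Score'],  # column names
                  dtype='float64')                           # one dtype for all columns

print(df)
print(df.dtypes)
```

When index and columns are omitted, pandas falls back to integer labels starting at 0.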
1. Dictionary of Lists / Series
Dictionary of Lists
import pandas as pd

d = {'first_name': ['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy'],
     'last_name': ['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler'],
     'age': [42, 38, 36, 41, 35],
     'Comedy_Score': [9, 7, 8, 8, 5],
     'Rating_Score': [25, 25, 49, 62, 70]}

df = pd.DataFrame(d)
print(df)

Output
------
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
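One detail worth knowing with a dictionary of lists is that every list needs the same length. The sketch below uses a deliberately broken, hypothetical dictionary named bad to show that pandas raises a ValueError in that case.

```python
import pandas as pd

# Hypothetical input where 'age' has one value fewer than 'first_name'.
bad = {'first_name': ['Sheldon', 'Raj', 'Leonard'],
       'age': [42, 38]}

try:
    pd.DataFrame(bad)
    error_message = None
except ValueError as exc:
    # pandas refuses to build the frame because the lists differ in length.
    error_message = str(exc)

print(error_message)
```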
Dictionary of Series
import pandas as pd

d = {'first_name': pd.Series(['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy']),
     'last_name': pd.Series(['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler']),
     'age': pd.Series([42, 38, 36, 41, 35]),
     'Comedy_Score': pd.Series([9, 7, 8, 8, 5]),
     'Rating_Score': pd.Series([25, 25, 49, 62, 70])}

df = pd.DataFrame(d)
print(df)

Output
------
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
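Unlike plain lists, Series carry their own index, and a DataFrame built from a dictionary of Series aligns the values on those labels. The following sketch assumes two hypothetical Series with partially overlapping labels to show the effect.

```python
import pandas as pd

# Hypothetical Series with partially overlapping index labels.
d = {'age': pd.Series([42, 38], index=['Sheldon', 'Raj']),
     'Comedy_Score': pd.Series([7, 8], index=['Raj', 'Leonard'])}

df = pd.DataFrame(d)

# The row index is the union of both Series indexes; cells with no
# matching label are filled with NaN.
print(df)
```

Because of this alignment, the Series in the dictionary do not need to have the same length.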
2. List of Lists / Dictionaries
List of Lists
import pandas as pd

l = [['Sheldon', 'Copper', 42, 9, 25],
     ['Raj', 'Koothrappali', 38, 7, 25],
     ['Leonard', 'Hofstadter', 36, 8, 49],
     ['Howard', 'Wolowitz', 41, 8, 62],
     ['Amy', 'Fowler', 35, 5, 70]]

df = pd.DataFrame(l)
print(df)

Output
------
         0             1   2  3   4
0  Sheldon        Copper  42  9  25
1      Raj  Koothrappali  38  7  25
2  Leonard    Hofstadter  36  8  49
3   Howard      Wolowitz  41  8  62
4      Amy        Fowler  35  5  70
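A list of lists carries no column names, so pandas labels the columns 0, 1, 2, and so on. As a small sketch, the columns parameter can supply names; the shortened two-row list here is only for illustration.

```python
import pandas as pd

l = [['Sheldon', 'Copper', 42],
     ['Raj', 'Koothrappali', 38]]

# Without columns=, the columns would be labelled 0, 1, 2.
df = pd.DataFrame(l, columns=['first_name', 'last_name', 'age'])

print(df)
```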
List of Dictionaries
import pandas as pd

l = [{'first_name': 'Sheldon', 'last_name': 'Copper', 'age': 42, 'Comedy_Score': 9, 'Rating_Score': 25},
     {'first_name': 'Raj', 'last_name': 'Koothrappali', 'age': 38, 'Comedy_Score': 7, 'Rating_Score': 25},
     {'first_name': 'Leonard', 'last_name': 'Hofstadter', 'age': 36, 'Comedy_Score': 8, 'Rating_Score': 49},
     {'first_name': 'Howard', 'last_name': 'Wolowitz', 'age': 41, 'Comedy_Score': 8, 'Rating_Score': 62},
     {'first_name': 'Amy', 'last_name': 'Fowler', 'age': 35, 'Comedy_Score': 5, 'Rating_Score': 70}]

df = pd.DataFrame(l)
print(df)

Output
------
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
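With a list of dictionaries, each dictionary becomes one row and the keys become the column names. The dictionaries do not all need the same keys. The sketch below uses hypothetical records in which the second one has no age key.

```python
import pandas as pd

# Hypothetical records: the second dictionary has no 'age' key.
records = [{'first_name': 'Sheldon', 'age': 42},
           {'first_name': 'Raj'}]

df = pd.DataFrame(records)

# Column names are taken from the dictionary keys; the missing 'age'
# value for the second row becomes NaN.
print(df)
```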
3. Text / CSV Files
file: data.csv
--------------
first_name,last_name,age,Comedy_Score,Rating_Score
Sheldon, Copper, 42, 9, 25
Raj, Koothrappali, 38, 7, 25
Leonard, Hofstadter, 36, 8, 49
Howard, Wolowitz, 41, 8, 62
Amy, Fowler, 35, 5, 70

import pandas as pd

# A .txt file works too, as long as the data is comma-separated
# (or the delimiter is passed with sep=).
# skipinitialspace=True drops the space after each comma, so a value
# such as " Copper" is read as "Copper".
df = pd.read_csv("data.csv", skipinitialspace=True)
print(df)

Output
------
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
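Text files that use a delimiter other than a comma can be read by passing sep to read_csv. The following sketch uses io.StringIO as an in-memory stand-in for a file so that it runs without anything on disk; the semicolon-separated content is hypothetical.

```python
import io
import pandas as pd

# In-memory stand-in for a semicolon-separated text file
# (hypothetical content, used so the example needs no file on disk).
text = io.StringIO(
    "first_name;last_name;age\n"
    "Sheldon;Copper;42\n"
    "Raj;Koothrappali;38\n"
)

# sep tells read_csv which delimiter separates the fields.
df = pd.read_csv(text, sep=';')

print(df)
```

With a real file, the path string would take the place of the StringIO object.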