What is a DataFrame?
- It has two axes: the row index (axis = 0) and the column index (axis = 1)
- The row index is known as the index, and the column index is known as the column name
- Row labels (index) and column labels (column names) can be numbers, letters or strings
- Different columns can hold values of different types
- A DataFrame is value-mutable and size-mutable: existing values can be changed, and rows or columns can be added or removed
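The properties listed above can be checked on a small DataFrame. The following is a minimal sketch; the data, the row labels `a`, `b`, `c` and the `city` column are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {"name": ["Sheldon", "Raj"], "age": [42, 38]},
    index=["a", "b"],  # row labels can be strings as well as numbers
)

print(df.shape)          # (rows, columns): sizes along axis 0 and axis 1
print(list(df.index))    # row labels (axis = 0)
print(list(df.columns))  # column labels (axis = 1)
print(df.dtypes)         # each column carries its own dtype

# Value mutable: change a single cell in place.
df.loc["a", "age"] = 43

# Size mutable: add a column, then add a row.
df["city"] = ["Pasadena", "Pasadena"]
df.loc["c"] = ["Leonard", 36, "Pasadena"]

print(df)
```

Here `df.loc[row_label, column_label]` addresses a single cell, while assigning to a new column name or a new row label enlarges the DataFrame.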
Creation of DataFrame
```python
import pandas
```
Syntax for DataFrame Creation:
```
<df_object> = pandas.DataFrame(data = None, index = None, columns = None, dtype = None, copy = False)
```
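As a rough illustration of the syntax above, the sketch below passes `data`, `index`, `columns` and `dtype` explicitly. The values and labels are hypothetical, and `copy` is left at its default.

```python
import pandas as pd

# Hypothetical values, chosen only to show what each parameter does.
data = [[42, 9], [38, 7], [36, 8]]

df = pd.DataFrame(
    data=data,
    index=["Sheldon", "Raj", "Leonard"],  # row labels
    columns=["age", "Comedy_Score"],      # column labels
    dtype="float64",                      # force a single dtype on all columns
)
print(df)
print(df.dtypes)
```

When `index` or `columns` is omitted, pandas falls back to integer labels starting at 0.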
1. Dictionary of Lists / Series
Dictionary of Lists
```python
import pandas as pd

# Each key becomes a column name; each list holds that column's values.
d = {'first_name': ['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy'],
     'last_name': ['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler'],
     'age': [42, 38, 36, 41, 35],
     'Comedy_Score': [9, 7, 8, 8, 5],
     'Rating_Score': [25, 25, 49, 62, 70]}

df = pd.DataFrame(d)
print(df)
```

Output:

```
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
```
Dictionary of Series
```python
import pandas as pd

# Each key becomes a column name; each Series holds that column's values.
d = {'first_name': pd.Series(['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy']),
     'last_name': pd.Series(['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler']),
     'age': pd.Series([42, 38, 36, 41, 35]),
     'Comedy_Score': pd.Series([9, 7, 8, 8, 5]),
     'Rating_Score': pd.Series([25, 25, 49, 62, 70])}

df = pd.DataFrame(d)
print(df)
```

Output:

```
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
```
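One practical difference from a dictionary of lists is that each Series carries its own index, and pandas aligns the columns on those labels. The sketch below assumes two hypothetical Series whose labels only partly overlap; a label missing from one Series shows up as NaN in that column.

```python
import pandas as pd

# Hypothetical labels; the two Series deliberately do not share all of them.
age = pd.Series([42, 38, 36], index=["sheldon", "raj", "leonard"])
score = pd.Series([9, 7], index=["sheldon", "raj"])

# Columns are aligned on the index labels, not on position.
df = pd.DataFrame({"age": age, "Comedy_Score": score})
print(df)
```

The row index of the result is the union of the labels found in the Series.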
2. List of Lists / Dictionaries
List of Lists
```python
import pandas as pd

# Each inner list is one row; no column names are given,
# so pandas labels the columns 0, 1, 2, ...
l = [
    ['Sheldon', 'Copper', 42, 9, 25],
    ['Raj', 'Koothrappali', 38, 7, 25],
    ['Leonard', 'Hofstadter', 36, 8, 49],
    ['Howard', 'Wolowitz', 41, 8, 62],
    ['Amy', 'Fowler', 35, 5, 70]
]

df = pd.DataFrame(l)
print(df)
```

Output:

```
         0             1   2  3   4
0  Sheldon        Copper  42  9  25
1      Raj  Koothrappali  38  7  25
2  Leonard    Hofstadter  36  8  49
3   Howard      Wolowitz  41  8  62
4      Amy        Fowler  35  5  70
```
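A list of lists carries no column names, which is why the columns above are labelled 0 to 4. A minimal sketch of supplying names through the `columns` parameter, reusing two rows of the same data:

```python
import pandas as pd

rows = [
    ['Sheldon', 'Copper', 42, 9, 25],
    ['Raj', 'Koothrappali', 38, 7, 25],
]

# One name per position in the inner lists.
names = ['first_name', 'last_name', 'age', 'Comedy_Score', 'Rating_Score']

df = pd.DataFrame(rows, columns=names)
print(df)
```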
List of Dictionaries
```python
import pandas as pd

# Each dictionary is one row; its keys become the column names.
l = [
    {'first_name': 'Sheldon', 'last_name': 'Copper', 'age': 42, 'Comedy_Score': 9, 'Rating_Score': 25},
    {'first_name': 'Raj', 'last_name': 'Koothrappali', 'age': 38, 'Comedy_Score': 7, 'Rating_Score': 25},
    {'first_name': 'Leonard', 'last_name': 'Hofstadter', 'age': 36, 'Comedy_Score': 8, 'Rating_Score': 49},
    {'first_name': 'Howard', 'last_name': 'Wolowitz', 'age': 41, 'Comedy_Score': 8, 'Rating_Score': 62},
    {'first_name': 'Amy', 'last_name': 'Fowler', 'age': 35, 'Comedy_Score': 5, 'Rating_Score': 70}
]

df = pd.DataFrame(l)
print(df)
```

Output:

```
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
```
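With a list of dictionaries, the records do not all need the same keys. The sketch below uses two hypothetical records, one of which lacks the `age` key, to show that pandas fills the gap with NaN.

```python
import pandas as pd

records = [
    {'first_name': 'Sheldon', 'age': 42},
    {'first_name': 'Raj'},  # no 'age' key in this record
]

df = pd.DataFrame(records)
print(df)
print(df.dtypes)  # 'age' is stored as float because NaN is a float value
```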
3. Text / CSV Files
file: data.csv

```
first_name,last_name,age,Comedy_Score,Rating_Score
Sheldon, Copper, 42, 9, 25
Raj, Koothrappali, 38, 7, 25
Leonard, Hofstadter, 36, 8, 49
Howard, Wolowitz, 41, 8, 62
Amy, Fowler, 35, 5, 70
```

```python
import pandas as pd

# skipinitialspace=True drops the blank after each comma in data.csv,
# so last_name holds 'Copper' rather than ' Copper'.
# A .txt file works too, as long as its values are comma-separated.
df = pd.read_csv("data.csv", skipinitialspace=True)
print(df)
```

Output:

```
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
```
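Text files that use another delimiter can usually be read by passing `sep` to `read_csv`. The sketch below is self-contained: it assumes a hypothetical semicolon-separated text and uses `io.StringIO` as a stand-in for a file on disk.

```python
import io
import pandas as pd

# Stand-in for a text file on disk; the content is hypothetical.
text = "first_name;age\nSheldon;42\nRaj;38\n"

# sep tells read_csv which character separates the values.
df = pd.read_csv(io.StringIO(text), sep=";")
print(df)
```

With a real file, the first argument would be the file path, as in the `data.csv` example.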