In Python, missing data is represented by two values:
- None: a Python singleton object that is often used for missing data in Python code.
- NaN: short for "Not a Number", a special floating-point value.
NaN is a numeric value used to represent a result that is undefined or unrepresentable. None marks a missing value of any kind, whereas NaN is specifically a floating-point value that does not correspond to a valid number.
For example, 0/0 has no defined value. With NumPy floats the result is represented by NaN, while plain Python raises ZeroDivisionError for 0/0 instead.
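A minimal sketch of how undefined results behave with built-in Python numbers, using only the standard library. The variable names are illustrative. Dividing zero by zero raises an exception here, while an undefined floating-point operation such as infinity minus infinity does yield NaN.

```python
import math

# Plain Python does not return NaN for 0/0; it raises an exception.
try:
    result = 0 / 0
except ZeroDivisionError:
    result = None  # no value was produced

# An undefined floating-point operation does produce NaN.
inf = float("inf")
undefined = inf - inf

print(result)                 # None
print(math.isnan(undefined))  # True
```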
If you check the data type of NaN, it is float, so NaN values in Python always have the float data type. NaN is available in the math module (math.nan) and in NumPy (numpy.nan). When working with pandas, we generally use NumPy's NaN.
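A short sketch of checking NaN's type with the standard library alone (variable names are illustrative). It also shows a practical consequence: NaN never compares equal to itself, so math.isnan() is the reliable test, whereas None is checked with an identity comparison.

```python
import math

nan_from_math = math.nan
nan_from_float = float("nan")

print(type(nan_from_math))   # <class 'float'>
print(type(nan_from_float))  # <class 'float'>

# NaN is not equal to anything, including itself, so use math.isnan().
print(nan_from_math == nan_from_math)  # False
print(math.isnan(nan_from_math))       # True

# None is a singleton, so an identity check is the idiomatic test.
value = None
print(value is None)  # True
```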
The following code shows the difference between None and NaN.
```python
import pandas as pd
import numpy as np

# A Series built from numbers and None: pandas converts None to NaN
s = pd.Series([1, 2, None])
print(s)
# 0    1.0
# 1    2.0
# 2    NaN
# dtype: float64

print(type(None))
# <class 'NoneType'>

# NaN from the NumPy module (np.NaN was removed in NumPy 2.0; use np.nan)
print(type(np.nan))
# <class 'float'>
```
In the code above we created a Series s. It contains one undefined value (None) in a list of numbers, so pandas represents it as NaN and stores the whole Series as float64.
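As a follow-up sketch, assuming pandas and NumPy are installed, the missing entry in such a Series can be detected with isna(). The names missing_mask and stored_is_float are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, None])

# isna() flags the missing entry in the Series.
missing_mask = s.isna().tolist()
print(missing_mask)  # [False, False, True]

# pd.isna() treats both None and NaN as missing.
print(pd.isna(None))    # True
print(pd.isna(np.nan))  # True

# The value stored in the Series is a float NaN, not None.
stored_is_float = isinstance(s.iloc[2], float)
print(stored_is_float)  # True
```

Because NaN does not compare equal to itself, a check such as s == np.nan would not find the missing entry; isna() is the safer choice.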