CBSE CS and IP

CBSE Class 11 & 12 Computer Science and Informatics Practices Python Materials, Video Lecture

Remove or Replace any character from Python Pandas DataFrame Column



If you are looking for a way to remove a character from a Pandas DataFrame column, you have come to the right place.

You can remove or replace any character in any column of your Pandas DataFrame using the following code:


dataframe_name    = df   ## your dataframe name
dataframe_col_idx = 0    ## integer position of the column you want to operate on
char_to_replace   = 'a'  ## character you want to replace
replaced_char     = 'XX' ## replacement character/string; use '' to remove

## look up the column label from its position (the loop below compares against it)
dataframe_col_name = dataframe_name.columns[dataframe_col_idx]

n = 0
for (i,j) in dataframe_name.items():   ## items() replaces iteritems(), which was removed in pandas 2.0
    if i == dataframe_col_name:
        for name in j:
            ## n is the row position of the current value
            dataframe_name.iloc[n,dataframe_col_idx] = name.replace(char_to_replace,replaced_char)
            n = n + 1



You can use the above code to remove or replace any character in a DataFrame column. Here is how to use it:

  1. Set the dataframe_name variable to your DataFrame.
  2. Set dataframe_col_idx to the position of your column, as an integer starting from 0.
  3. Set char_to_replace to the character you want to replace.
  4. Set replaced_char to the character or string you want to put in its place.
If you just want to remove a character, set replaced_char to '' (an empty string).
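
The steps above can also be wrapped in a small reusable function. The following is only a sketch, not the code from this post: the name replace_in_column is hypothetical, and it swaps the explicit loop for pandas' built-in Series.str.replace, assuming the column contains only strings.

```python
import pandas as pd

def replace_in_column(frame, col_idx, old, new=''):
    """Hypothetical helper: replace `old` with `new` in the column at position `col_idx`."""
    col_name = frame.columns[col_idx]   # column label at that position
    # regex=False treats `old` as literal text, not as a regular expression
    frame[col_name] = frame[col_name].str.replace(old, new, regex=False)
    return frame

# small made-up DataFrame just for this demonstration
demo = pd.DataFrame({'Name': ['Sachin', 'Virat'], 'Age': [26, 25]})
replace_in_column(demo, 0, 'a', 'XX')
print(demo['Name'].tolist())   # ['SXXchin', 'VirXXt']
```

Leaving out the last argument removes the character instead of replacing it, because new defaults to an empty string.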

Consider the following example:

import pandas as pd

d = {'Name':['Sachin','Dhoni','Virat','Rohit','Shikhar','Sachin'],
     'Age':[26,25,25,24,31,33],
     'Score':[87,67,89,55,47,90]}

df = pd.DataFrame(d,index = ['A','B','C','D','E','F'])

df

Output:
      Name  Age  Score
A   Sachin   26     87
B    Dhoni   25     67
C    Virat   25     89
D    Rohit   24     55
E  Shikhar   31     47
F   Sachin   33     90
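
The code in this post expects the column's integer position in dataframe_col_idx. If you only know the column's name, the short sketch below (using the same example data) shows one way to convert between the two; the variable name name_idx is just an illustration.

```python
import pandas as pd

d = {'Name': ['Sachin', 'Dhoni', 'Virat', 'Rohit', 'Shikhar', 'Sachin'],
     'Age': [26, 25, 25, 24, 31, 33],
     'Score': [87, 67, 89, 55, 47, 90]}
df = pd.DataFrame(d, index=['A', 'B', 'C', 'D', 'E', 'F'])

# column name -> integer position (the value to use for dataframe_col_idx)
name_idx = df.columns.get_loc('Name')
print(name_idx)               # 0

# integer position -> column name
print(df.columns[name_idx])   # Name
```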


To remove every 'a' from the 'Name' column, we can use the following code:
dataframe_name    = df   ## your dataframe name
dataframe_col_idx = 0    ## integer position of the column you want to operate on
char_to_replace   = 'a'  ## character you want to replace
replaced_char     = ''   ## replacement character/string; '' removes the character

## look up the column label from its position (the loop below compares against it)
dataframe_col_name = dataframe_name.columns[dataframe_col_idx]

n = 0
for (i,j) in dataframe_name.items():   ## items() replaces iteritems(), which was removed in pandas 2.0
    if i == dataframe_col_name:
        for name in j:
            ## n is the row position of the current value
            dataframe_name.iloc[n,dataframe_col_idx] = name.replace(char_to_replace,replaced_char)
            n = n + 1



DataFrame df after running the above code:
     Name  Age  Score
A   Schin   26     87
B   Dhoni   25     67
C    Virt   25     89
D   Rohit   24     55
E  Shikhr   31     47
F   Schin   33     90


You can see that the character 'a' has been removed from the DataFrame. Using this code, you can also remove special characters from your DataFrame.
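
As an illustration of removing special characters, here is a minimal sketch that uses pandas' Series.str.replace rather than the loop from this post. The 'Price' column and its values are made up for the example, and the column is assumed to contain only strings.

```python
import re
import pandas as pd

# made-up sample data containing the special characters '$' and ','
prices = pd.DataFrame({'Price': ['$1,200', '$950', '$3,400']})

# one special character: regex=False matches it as literal text
no_dollar = prices['Price'].str.replace('$', '', regex=False)
print(no_dollar.tolist())   # ['1,200', '950', '3,400']

# several special characters at once: build a regex character class,
# escaping the characters so none of them has a special regex meaning
chars_to_remove = '$,'
pattern = '[' + re.escape(chars_to_remove) + ']'
cleaned = prices['Price'].str.replace(pattern, '', regex=True)
print(cleaned.tolist())     # ['1200', '950', '3400']
```

Passing regex=False matters for characters such as '$' or '.', which would otherwise be interpreted as regular expression syntax.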

1 comment:

  1. what about dataframe_col_name ??? is it literally the column name???
