CBSE CS and IP

CBSE Class 11 & 12 Computer Science and Informatics Practices Python Materials, Video Lecture

Accessing Pandas Series Slices

Slicing means extracting a part of a Series. It can be done in the following ways:

  1. Using indexing operator( [ start : stop : step ] )
    1. Position wise (slicing includes stop - 1 data)
    2. Data label wise (slicing includes both ends )
      1. With unique data labels
      2. With duplicate data labels
  2. Using .loc attribute
  3. Using .iloc attribute

Let us now discuss each type one by one:

1. Using indexing operator( [ start : stop : step ] )

The indexing operator slices a Series in much the same way as it slices a list or a string. It takes three values: start, stop and step. The slice begins at start, moves in increments of step, and ends just before stop when positions are used (when data labels are used, stop itself is included, as explained in part b).

Start and stop can be either Series data labels (index) or index positions. Step must be a non-zero integer, positive or negative; its default value is 1.

Before going further, let us see the difference between Series data labels (index) and Series index positions, using the Series student given below:

import pandas as pd
student = pd.Series(
data = ["BOB", "JHON", "RAM", "MOHAN"],
index = ['S1','S2','S3','S4'])
print(student)

S1      BOB
S2     JHON
S3      RAM
S4    MOHAN
dtype: object

We have created a Series student whose data elements are "BOB", "JHON", "RAM" and "MOHAN" and whose data labels (index) are 'S1', 'S2', 'S3' and 'S4'. Pandas also tracks a position for each element: from 0 up to (length - 1) counting from the top, and from -1 down to -length counting from the bottom. The two terms compare as follows:
 
Position   Index   Data_Values
  0/-4        S1       BOB
  1/-3        S2       JHON
  2/-2        S3       RAM
  3/-1        S4       MOHAN

Since our Series student has 4 elements, we have positions starting from 0 up to 3. I hope you have now understood the difference between the Series index and index positions.
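
The position/label table above can also be produced in code. The following is a minimal sketch, assuming the same student Series; the names n, rows and pos are only illustrative.

```python
import pandas as pd

student = pd.Series(
    data=["BOB", "JHON", "RAM", "MOHAN"],
    index=["S1", "S2", "S3", "S4"],
)

n = len(student)
rows = []
for pos, (label, value) in enumerate(student.items()):
    # pos counts from the top (0 .. n-1); pos - n counts from the bottom (-n .. -1)
    rows.append((pos, pos - n, label, value))
    print(pos, pos - n, label, value)
```

Each printed row shows the position from the top, the position from the bottom, the data label and the data value.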

a) Position wise (slicing includes stop - 1 data)

As discussed above, the position is a number that pandas assigns internally to each element of a Series. We use these positions as start and stop to take the slice.

In this type of slicing, the element at the stop position is excluded: the result runs up to position stop - 1.

syntax:

<Series Object> [start : stop : step]

>>> print(student)
S1      BOB
S2     JHON
S3      RAM
S4    MOHAN
dtype: object

>>> student[0:3:1]
S1     BOB
S2    JHON
S3     RAM
dtype: object

>>> student[-3:-1:1]
S2    JHON
S3     RAM
dtype: object

>>> student[-1:-4:-2]
S4    MOHAN
S2     JHON
dtype: object
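
Since position-wise slicing is described above as very similar to list slicing, a quick side-by-side check may help. This is a small sketch, assuming the same data held both as a plain Python list (called names here, a name chosen only for illustration) and as the student Series.

```python
import pandas as pd

names = ["BOB", "JHON", "RAM", "MOHAN"]
student = pd.Series(data=names, index=["S1", "S2", "S3", "S4"])

# The same start:stop:step triple applied to the list and to the Series
list_slice = names[-1:-4:-2]
series_slice = student[-1:-4:-2]

print(list_slice)              # values picked from the list
print(series_slice.tolist())   # values picked from the Series
print(list(series_slice.index))  # the Series also keeps the data labels
```

Both slices pick the same values; the Series slice additionally carries the data labels of the picked elements.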


b) Data label wise (slicing includes both ends )

Series data labels can also be used for slicing. In this case start and stop are data labels, and step is still an integer.

syntax:

<Series Object> [start : stop : step]

Since the data labels of a Series can be duplicated, we will look at slicing with unique and with duplicate data labels separately.

Note: In this type of slicing, both the start and the stop labels are included in the result.

i) With unique data labels

In the following example, all the data labels of the student Series are unique.

>>> print(student)
S1      BOB
S2     JHON
S3      RAM
S4    MOHAN
dtype: object

>>> student['S1':'S4':2]
S1    BOB
S3    RAM
dtype: object

>>> student['S4':'S1':1]
Series([], dtype: object)

>>> student['S4':'S1':-1]
S4    MOHAN
S3      RAM
S2     JHON
S1      BOB
dtype: object
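
To see why a label slice includes both ends, it can be translated into positions. The sketch below assumes the same student Series with unique labels and uses Index.get_loc to look up the position of each label; the variable names are illustrative.

```python
import pandas as pd

student = pd.Series(
    data=["BOB", "JHON", "RAM", "MOHAN"],
    index=["S1", "S2", "S3", "S4"],
)

# Index.get_loc returns the position of a unique label
start_pos = student.index.get_loc("S2")
stop_pos = student.index.get_loc("S4")

by_label = student["S2":"S4"]                       # stop label is included
by_position = student.iloc[start_pos:stop_pos + 1]  # + 1 so the stop position is included too

print(by_label.equals(by_position))
```

The label slice matches the position slice only after adding 1 to the stop position, which is the "both ends included" rule in another form.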


ii) With duplicate data labels 

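In the following example, the student Series has the data label S1 twice. Slicing with a duplicate label as start or stop raises a KeyError when the occurrences of that label are not next to each other in the index, as in the example below. Slicing between labels that are unique, such as S2 and S3, still works.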

>>> print(student)
S1      BOB
S2     JHON
S3      RAM
S1    MOHAN
dtype: object

>>> student['S1':'S2']
KeyError: "Cannot get left slice bound for non-unique label: 'S1'"

>>> student['S2':'S3']
S2    JHON
S3     RAM
dtype: object
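
The example above does not show how the Series with duplicate labels was built. The following sketch is one way to reproduce it; the name student_dup is an assumption made here so that it does not clash with the original student. It also shows one possible workaround, sorting the index so that the duplicate labels become adjacent.

```python
import pandas as pd

# Same data as above, but the label 'S1' is used twice
student_dup = pd.Series(
    data=["BOB", "JHON", "RAM", "MOHAN"],
    index=["S1", "S2", "S3", "S1"],
)

try:
    student_dup["S1":"S2"]
    slice_failed = False
except KeyError as err:
    slice_failed = True
    print("KeyError:", err)

# After sorting, both 'S1' rows sit next to each other,
# so 'S1' can be used as a slice bound
sorted_dup = student_dup.sort_index()
result = sorted_dup["S1":"S2"]
print(result)
```

After sorting, the slice returns both S1 elements followed by S2. Whether sorting is acceptable depends on whether the original order of the Series matters.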


2. Using ".loc" attribute

The .loc attribute accesses elements by label(s) or by a Boolean array. For slicing, it is used in two ways:
  1. Series.loc[ start : stop : step ]
  2. Series.loc[[<list of labels>]]
Consider the following Series object:

import pandas as pd
student = pd.Series(
data = ["BOB", "JHON", "RAM", "MOHAN"],
index = ['S1','S2','S3','S4'])
print(student)

  1. Series.loc[ start : stop : step ] : This form extracts a slice of the Series by index labels. Here start is the label where the slice begins, stop is the label where it ends, and step is the step size. The element at the stop label is included in the result.
    Example:
    student.loc['S1':'S4':2]
    
    
    '''
    S1    BOB
    S3    RAM
    dtype: object
    '''
    

  2. Series.loc[[<list of labels>]] : Use this form to access particular elements of a Series object. The labels are given as a list, and the elements are returned in the order of that list.
    Example:
    student.loc[['S1','S4']]
    
    '''
    S1      BOB
    S4    MOHAN
    dtype: object
    '''
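
The description of .loc above also mentions a Boolean array, which the two forms do not cover. As a short sketch, assuming the same student Series, a Boolean mask (the name mask and the length condition are chosen here only for illustration) selects the elements where the mask is True.

```python
import pandas as pd

student = pd.Series(
    data=["BOB", "JHON", "RAM", "MOHAN"],
    index=["S1", "S2", "S3", "S4"],
)

# Boolean Series: True where the name has more than 3 characters
mask = student.str.len() > 3

long_names = student.loc[mask]
print(long_names)
```

Only the elements labelled S2 and S4 are kept, because "JHON" and "MOHAN" are the only names longer than three characters.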
    

3. Using ".iloc" attribute

The .iloc attribute provides purely integer-location-based indexing, that is, selection by position. For slicing, it is used in two ways:
  1. Series.iloc[ start : stop : step ]   
  2. Series.iloc[[<list of positions>]]
Consider the following Series object:

import pandas as pd
student = pd.Series(
data = ["BOB", "JHON", "RAM", "MOHAN"],
index = ['S1','S2','S3','S4'])
print(student)

    1. Series.iloc[ start : stop : step ] : This form extracts a slice of the Series by index positions. Here start is the position where the slice begins, stop is the position just after the last element taken, and step is the step size. The result runs up to position stop - 1.
      Example:
      student.iloc[0:3:1]
      
      '''
      S1     BOB
      S2    JHON
      S3     RAM
      dtype: object
      '''
      

    2. Series.iloc[[<list of positions>]] : Use this form to access particular elements of a Series object. The index positions are given as a list.
      Example:
      student.iloc[[1,2,0]]
      
      '''
      S2    JHON
      S3     RAM
      S1     BOB
      dtype: object
      '''
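
To contrast the two attributes, the sketch below takes the same three elements once with .loc and once with .iloc, assuming the same student Series. The variable names are illustrative.

```python
import pandas as pd

student = pd.Series(
    data=["BOB", "JHON", "RAM", "MOHAN"],
    index=["S1", "S2", "S3", "S4"],
)

by_label = student.loc["S1":"S3"]   # stop label 'S3' is included
by_position = student.iloc[0:3]     # stop position 3 is excluded

print(by_label.equals(by_position))

# Negative positions work with .iloc as well: the last two elements
last_two = student.iloc[-2:]
print(last_two)
```

The two slices are equal even though the stop values differ, because .loc includes its stop label while .iloc excludes its stop position.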
      
