---
title: "Selecting Rows in Pandas: A Complete Guide to DataFrame Filtering Techniques"
description: "How to use loc, iloc, slice, and row filtering in Pandas"
author: "Bartosz Mikulski"
author_bio: "Principal AI Engineer & MLOps Architect. I bridge the gap between \"it works in a notebook\" and \"it works for 200 million users.\""
author_url: https://mikulskibartosz.name
author_linkedin: https://www.linkedin.com/in/mikulskibartosz/
author_github: https://github.com/mikulskibartosz
canonical_url: https://mikulskibartosz.name/selecting-rows-in-pandas
---

Pandas offers several ways to select rows from a DataFrame. Let's take a look at the four most popular ones: `loc`, `iloc`, boolean masks, and slicing.

We will start with a DataFrame containing five rows:

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  0 |       1 | A       |
|  1 |       2 | B       |
|  2 |       3 | C       |
|  3 |       4 | D       |
|  4 |       5 | E       |
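The article does not show how this DataFrame was created, so here is a minimal sketch of one way to build it. The column names and values are taken from the table above; the default numeric index is an assumption based on the first column of that table.

```python
import pandas as pd

# Values copied from the table above; the default RangeIndex (0..4) is assumed.
data = pd.DataFrame({
    "col_A": [1, 2, 3, 4, 5],
    "col_B": ["A", "B", "C", "D", "E"],
})

print(data)
```

The remaining examples refer to this DataFrame as `data`.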

## The loc function

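First, we will use `loc`. Strictly speaking, `loc` is an indexer rather than a function, which is why we use square brackets instead of parentheses. It selects rows by their index labels. For example, `data.loc[[0, 1, 4]]` returns the rows labeled 0, 1, and 4, which are the first, the second, and the last row of our DataFrame.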

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  0 |       1 | A       |
|  1 |       2 | B       |
|  4 |       5 | E       |

Of course, it's difficult to spot the benefit of `loc` when we have the default numeric index, because the labels are the same as the row positions. Because of that, we will set the `col_B` column as the index and use its values to select the rows:

```python
data.set_index('col_B').loc[['A', 'B', 'E']]
```

| col_B   |   col_A |
|:--------|--------:|
| A       |       1 |
| B       |       2 |
| E       |       5 |

## The iloc function

`iloc` looks similar to `loc` with a numeric index, but it always selects rows by their position in the DataFrame, no matter what the index labels are. Let's retrieve the last two rows:

```python
data.iloc[[3,4]]
```

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  3 |       4 | D       |
|  4 |       5 | E       |
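The difference between labels and positions is easier to see when the index is not numeric. The sketch below rebuilds the example DataFrame (an assumption, since the article does not show its construction) and selects the same two rows once by label and once by position. The variable names are made up for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    "col_A": [1, 2, 3, 4, 5],
    "col_B": ["A", "B", "C", "D", "E"],
})
indexed = data.set_index("col_B")

# loc looks up index labels, iloc looks up row positions.
by_label = indexed.loc[["D", "E"]]
by_position = indexed.iloc[[3, 4]]

# Negative positions count from the end, like in Python lists.
last_two = indexed.iloc[-2:]

print(by_label["col_A"].tolist())     # [4, 5]
print(by_position["col_A"].tolist())  # [4, 5]
print(last_two.index.tolist())        # ['D', 'E']
```

All three selections return the rows with `col_A` equal to 4 and 5.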

## Using a boolean mask

In Pandas, we can pass a boolean array to the DataFrame selector to retrieve the rows at the positions where the array contains `True`.

We are going to need an array of bool values. The array must have the same length as our DataFrame.

```python
binary = [True, False, True, True, False]
data[binary]
```

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  0 |       1 | A       |
|  2 |       3 | C       |
|  3 |       4 | D       |

The most popular data selection method generates the boolean array from the values in the DataFrame. A comparison such as `data['col_A'] < 3` returns a boolean Series with one value per row. For example, we can retrieve the rows in which `col_A` has values smaller than 3:

```python
data[data['col_A'] < 3]
```

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  0 |       1 | A       |
|  1 |       2 | B       |
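To see what happens behind that one-liner, we can split it into two steps. This is a minimal sketch that rebuilds the example DataFrame (an assumption, since its construction is not shown in the article); `mask` and `selected` are illustrative names.

```python
import pandas as pd

data = pd.DataFrame({
    "col_A": [1, 2, 3, 4, 5],
    "col_B": ["A", "B", "C", "D", "E"],
})

# Step 1: the comparison produces one boolean per row.
mask = data["col_A"] < 3
print(mask.tolist())  # [True, True, False, False, False]

# Step 2: the mask keeps only the rows marked True.
selected = data[mask]
print(selected["col_B"].tolist())  # ['A', 'B']

# The same mask also works with loc.
print(data.loc[mask]["col_B"].tolist())  # ['A', 'B']
```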

## Slicing a DataFrame

Finally, we can slice a DataFrame with the same `start:stop:step` syntax we use for Python lists. The slice selects rows by position, and the row at the `stop` position is not included.

```python
data[2:3]
```

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  2 |       3 | C       |

```python
data[:2]
```

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  0 |       1 | A       |
|  1 |       2 | B       |

```python
data[1:]
```

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  1 |       2 | B       |
|  2 |       3 | C       |
|  3 |       4 | D       |
|  4 |       5 | E       |

```python
data[::2]
```

|    |   col_A | col_B   |
|---:|--------:|--------:|
|  0 |       1 | A       |
|  2 |       3 | C       |
|  4 |       5 | E       |
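As a quick check of the list analogy, the sketch below applies the same slice to a plain Python list and to the DataFrame. It rebuilds the example DataFrame (an assumption, since the article does not show its construction), and the variable names are illustrative only.

```python
import pandas as pd

data = pd.DataFrame({
    "col_A": [1, 2, 3, 4, 5],
    "col_B": ["A", "B", "C", "D", "E"],
})
letters = data["col_B"].tolist()  # a plain Python list

# The same slice picks the same positions in both structures.
from_list = letters[::2]
from_frame = data[::2]["col_B"].tolist()
print(from_list)   # ['A', 'C', 'E']
print(from_frame)  # ['A', 'C', 'E']

# For integer slices, data[start:stop] selects the same rows as iloc.
same_rows = data[1:3].equals(data.iloc[1:3])
print(same_rows)   # True
```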