---
title: "How to select a random sample of rows using Athena"
description: "How to use a window function to select random rows from Athena"
author: "Bartosz Mikulski"
author_bio: "Principal AI Engineer & MLOps Architect. I bridge the gap between \"it works in a notebook\" and \"it works for 200 million users.\""
author_url: https://mikulskibartosz.name
author_linkedin: https://www.linkedin.com/in/mikulskibartosz/
author_github: https://github.com/mikulskibartosz
canonical_url: https://mikulskibartosz.name/select-random-sample-from-athena
---

This article shows you how to use a window function with random ordering to select a random sample of rows from every group defined by a column.

First, I use a window function to partition the rows by a given column and number them in random order within each partition. Let's assume that I have an Athena table called `the_table` with a column called `column_A`. In this case, the window function looks like this:

```sql
ROW_NUMBER() OVER (PARTITION BY column_A ORDER BY RANDOM()) AS rn
```

I will put that window function in a common table expression (a `WITH` clause):

```sql
WITH random_order AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY column_A ORDER BY RANDOM()) AS rn
    FROM the_table
)
```
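The `WITH` clause above is not a complete query on its own. As a quick sanity check, a sketch like the following previews how the rows get numbered within each group. The `LIMIT 20` value is an arbitrary choice for illustration, and only `column_A` and `rn` are selected to keep the output short:

```sql
-- Preview the random numbering assigned within each column_A group.
-- LIMIT 20 is an arbitrary preview size, not part of the technique.
WITH random_order AS (
    SELECT column_A,
           ROW_NUMBER() OVER (PARTITION BY column_A ORDER BY RANDOM()) AS rn
    FROM the_table
)
SELECT column_A, rn
FROM random_order
ORDER BY column_A, rn
LIMIT 20
```

Because `RANDOM()` is evaluated again on every run, the same row will usually receive a different `rn` each time the query runs.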

If I want to get up to 1000 random rows from every group, I have to select the rows whose `rn` value is less than or equal to 1000. Groups with fewer than 1000 rows are returned in full:

```sql
WITH random_order AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY column_A ORDER BY RANDOM()) AS rn
    FROM the_table
)
SELECT *
FROM random_order
WHERE rn <= 1000
```
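The same pattern can be adapted to sample a fraction of every group instead of a fixed number of rows. The sketch below is a variation, not part of the original query: it assumes a 10% sample, and the `group_size` alias is a name introduced here for illustration. It adds a second window function that counts the rows in each group and keeps the rows whose random position falls within the first 10% of that count:

```sql
-- Variation: keep roughly 10% of the rows from every column_A group.
-- The 0.1 fraction and the group_size alias are assumptions for this example.
WITH random_order AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY column_A ORDER BY RANDOM()) AS rn,
           COUNT(*) OVER (PARTITION BY column_A) AS group_size
    FROM the_table
)
SELECT *
FROM random_order
WHERE rn <= CEIL(0.1 * group_size)
```

Rounding up with `CEIL` means that every group contributes at least one row, even when 10% of its size is less than one.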

