---
title: "How to find the Hive partition closest to a given date"
description: "How to use Airflow to find the Hive partition closest to a given date"
author: "Bartosz Mikulski"
author_bio: "Principal AI Engineer & MLOps Architect. I bridge the gap between \"it works in a notebook\" and \"it works for 200 million users.\""
author_url: https://mikulskibartosz.name
author_linkedin: https://www.linkedin.com/in/mikulskibartosz/
author_github: https://github.com/mikulskibartosz
canonical_url: https://mikulskibartosz.name/find-hive-partition-closest-to-given-date
---

Airflow has a built-in function that finds the Hive partition closest to a given date. However, it works only with partition identifiers in the YYYY-mm-dd format, so it will not help you if you partition your tables differently.

To find the closest Hive partition, use the `closest_ds_partition` function:

```python
from airflow.macros.hive import closest_ds_partition

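closest_ds_partition(
    hive_table_name,  # name of the Hive table (a string)
    the_date,  # the date as a string in the YYYY-mm-dd format
    before=True,  # True: closest before, False: closest after, None: either side
    schema='hive_schema',
    metastore_conn_id='metastore_connection_id'
)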
```

Be careful with the `before` parameter because it accepts three values, not two. As you may expect, `True` returns the closest partition before the given date, and `False` returns the closest partition after it. When `before` is set to `None`, the function returns the closest partition regardless of whether it is before or after the given date.
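To see the three modes side by side without a Hive metastore, here is a minimal pure-Python sketch of the selection logic. It is not Airflow's implementation: the `closest_date` function is a hypothetical stand-in, and returning `None` when no partition matches is my own assumption.

```python
from datetime import date
from typing import Optional, Sequence


def closest_date(
    target: date,
    partitions: Sequence[date],
    before: Optional[bool] = True,
) -> Optional[date]:
    """Hypothetical stand-in for the selection logic; not Airflow's code."""
    if before is None:
        # Either side: every partition is a candidate.
        candidates = list(partitions)
    elif before:
        # Only partitions on or before the target date.
        candidates = [p for p in partitions if p <= target]
    else:
        # Only partitions on or after the target date.
        candidates = [p for p in partitions if p >= target]

    if not candidates:
        # Assumption of this sketch: no match means None.
        return None

    # Pick the candidate with the smallest distance to the target.
    return min(candidates, key=lambda p: abs(p - target))


partitions = [date(2020, 1, 1), date(2020, 1, 5), date(2020, 1, 20)]
target = date(2020, 1, 15)

print(closest_date(target, partitions, before=True))   # 2020-01-05
print(closest_date(target, partitions, before=False))  # 2020-01-20
print(closest_date(target, partitions, before=None))   # 2020-01-20
```

Note that the default `before=True` and `before=None` give different results here because the closest partition happens to be after the target date.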

Please don't follow this coding practice. Three-valued "boolean" logic is a terrible, **terrible** idea. It is much better to use an enum with descriptive names.
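If you want descriptive names in your own code, one option is a small enum that you translate into the `before` value at the call site. This is only a sketch: `PartitionSide` and `to_before_flag` are hypothetical names that I made up for illustration, and they are not part of Airflow.

```python
from enum import Enum
from typing import Optional


class PartitionSide(Enum):
    """Hypothetical enum naming the three modes of the `before` parameter."""
    BEFORE = "before"
    AFTER = "after"
    EITHER = "either"


def to_before_flag(side: PartitionSide) -> Optional[bool]:
    """Translate the descriptive name into the True/False/None value."""
    mapping = {
        PartitionSide.BEFORE: True,
        PartitionSide.AFTER: False,
        PartitionSide.EITHER: None,
    }
    return mapping[side]
```

The call then reads `closest_ds_partition(hive_table_name, the_date, before=to_before_flag(PartitionSide.EITHER))`, which makes the intent visible without checking the documentation.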

