---
title: "How to use one SparkSession to run all Pytest tests"
description: "How to speed up Pytest tests by reusing the same SparkSession in all of them"
author: "Bartosz Mikulski"
author_bio: "Principal AI Engineer & MLOps Architect. I bridge the gap between \"it works in a notebook\" and \"it works for 200 million users.\""
author_url: https://mikulskibartosz.name
author_linkedin: https://www.linkedin.com/in/mikulskibartosz/
author_github: https://github.com/mikulskibartosz
canonical_url: https://mikulskibartosz.name/use-one-spark-session-to-run-all-pytest-tests
---

When we test a PySpark application, we need a way to pass a SparkSession into the tests. Of course, we can instantiate a separate session in every test function, but that slows the tests down significantly. Such a solution may be acceptable when we have only one or two PySpark tests inside a larger application, but what if we want to run a few hundred tests and every one of them uses PySpark?

In this situation, we should **instantiate the SparkSession once and pass it to every test as a parameter**. In pytest, we can do it using fixtures. Fixtures configure the test environment and clean up after the tests.

To configure a fixture that every test file can use, we create a `conftest.py` file in the tests directory and implement a function that returns the value of that fixture. Pytest discovers `conftest.py` automatically, so the tests do not need to import it. The function must be decorated with the `pytest.fixture` decorator. Inside the function, we can also register a finalizer that releases the resources allocated by the fixture.

To reuse the same SparkSession in all of the tests, we must specify the scope of the fixture and set its value to "session".

The following example shows a complete SparkSession fixture. Note that the function name becomes the fixture name.

```python
import pytest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark_session(request):
    spark_session = SparkSession.builder \
        .master("local[*]") \
        .appName("some-app-name") \
        .getOrCreate()

    request.addfinalizer(lambda: spark_session.sparkContext.stop())

    return spark_session
```

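A test receives the fixture when it names the fixture as a parameter, so a separate declaration is not strictly required. If we want the fixture applied to every test in a module, including tests that do not take it as a parameter, we can declare it at the top of the test file. The function that creates a SparkSession is called `spark_session`, so we use the same name to declare the fixture.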

```python
import pytest

pytestmark = pytest.mark.usefixtures("spark_session")
```

Now, we can add the `spark_session` parameter to every test function that needs a SparkSession.

```python
def test_name(spark_session):
    ...
```