PostgresOperator allows us to use a SQL file as the query. However, when we do that, the standard way of passing template parameters no longer works.

For example, if I have the following SQL query:

 SELECT column_a, column_b FROM table_name WHERE column_a = {{ some_value }} 

Airflow will not automatically make the some_value variable available to the template.

In this situation, I have to use the params argument of the PostgresOperator:

PostgresOperator(
    task_id='postgres_task',
    postgres_conn_id='postgress_sophi',
    params={"some_value":"'the value I want'"},
    sql='query.sql',
    dag=dag)
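
For context, here is a minimal, self-contained sketch of how this task could sit inside a DAG. The DAG id, schedule, and start date below are placeholders, and the import path depends on the Airflow version:

from datetime import datetime

from airflow import DAG
from airflow.operators.postgres_operator import PostgresOperator
# In Airflow 2.x, the operator lives in the postgres provider package instead:
# from airflow.providers.postgres.operators.postgres import PostgresOperator

dag = DAG(
    dag_id='postgres_params_example',
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    # If query.sql is not next to the DAG file, tell Airflow where to find it:
    # template_searchpath='/path/to/sql/files',
)

PostgresOperator(
    task_id='postgres_task',
    postgres_conn_id='postgress_sophi',
    # The single quotes make the rendered value a SQL string literal (see below).
    params={"some_value": "'the value I want'"},
    sql='query.sql',
    dag=dag,
)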

Note that I had to wrap the value in single quotes! In Postgres, double quotes are used to reference columns, tables, or functions; string literals, on the other hand, go inside single quotes.

In the query.sql file, I have to change the template variable to params.some_value:

 SELECT column_a, column_b FROM table_name WHERE column_a = {{ params.some_value }} 
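
To see why the quoting matters, here is a quick standalone illustration (plain jinja2, outside Airflow) of what the file renders to, using the params value from above:

from jinja2 import Template

sql = "SELECT column_a, column_b FROM table_name WHERE column_a = {{ params.some_value }}"

# Airflow renders the SQL file with Jinja before sending it to Postgres;
# this reproduces that substitution step on its own.
rendered = Template(sql).render(params={"some_value": "'the value I want'"})
print(rendered)
# SELECT column_a, column_b FROM table_name WHERE column_a = 'the value I want'

# Without the wrapping single quotes, the value would render unquoted and
# Postgres would try to interpret it as an identifier rather than a string.

You can also double-check the result in the Rendered tab of the task instance in the Airflow web UI, which shows the templated fields after rendering.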
