When we look at the documentation of regexp_replace, we see that it accepts three parameters:
- the name of the column
- the regular expression
- the replacement text
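For example, here is a minimal sketch of the usual usage, assuming a DataFrame df with a string column a_column (the masked column name is just an illustration):

from pyspark.sql import functions as F

# replace every three-digit sequence with the literal text 'X'
df.withColumn('masked', F.regexp_replace('a_column', r'\d{3}', 'X'))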
Unfortunately, we cannot specify the column name as the third parameter and use the column value as the replacement.
If I have the following DataFrame and use the regexp_replace function to substitute the numbers with the content of b_column:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType
from pyspark.sql import functions as F

spark_session = SparkSession.builder.getOrCreate()

df_schema = StructType([StructField('a_column', StringType()), StructField('b_column', StringType())])
test_list = [
    ['aaa123', 'aaa'],
    ['bbb243', 'bbb']
]
df = spark_session.createDataFrame(test_list, schema=df_schema)

# 'b_column' is passed as a plain string, so it is used as literal replacement text
df \
    .withColumn('replaced', F.regexp_replace('a_column', r'\d{3}', 'b_column')) \
    .show()
The resulting DataFrame is nothing like the expected output:
+--------+--------+-----------+
|a_column|b_column| replaced|
+--------+--------+-----------+
| aaa123| aaa|aaab_column|
| bbb243| bbb|bbbb_column|
+--------+--------+-----------+
As we can see, the third argument was treated as literal text, not as a column reference. The first solution that comes to mind is using the col function as the third parameter:
.withColumn('replaced', F.regexp_replace('a_column', r'\d{3}', F.col('b_column'))) \
This attempt fails too, and we get a TypeError: Column is not iterable error because the Python function expects the replacement to be a plain string, not a Column object.
To solve the problem, we have to use the expr function and define the operation as a string:
.withColumn('replaced', F.expr("regexp_replace(a_column, '([0-9]{3})', b_column)"))
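After this change, the replacement is resolved as a column reference, and the replaced column should contain the value of b_column in place of the digits:

+--------+--------+--------+
|a_column|b_column|replaced|
+--------+--------+--------+
|  aaa123|     aaa|  aaaaaa|
|  bbb243|     bbb|  bbbbbb|
+--------+--------+--------+

Side note: if I remember correctly, Spark 3.4 and later versions also accept Column arguments for the pattern and the replacement in the Python regexp_replace function, so the expr workaround is needed only on older versions.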