The other day, I was porting an Athena query to run it on Hive, and I realized that the regexp_like function does not exist in Hive. Fortunately, there is a nice replacement that offers precisely the same behavior.

First, lets take a look at the regexp_like function in Presto/Athena:

regexp_like(column_name, 'regexp')

For example, we can use it like this to return the value of another column when a value in the row matches the regexp and return 0 when it doesn’t:

CASE WHEN (regexp_like(column_name, '^[0-9]+$')) THEN cast(other_column as double) ELSE 0 END

To get the same behavior in Hive, I have to use the RLIKE keyword. Instead of WHEN CASE with only two branches, I can use the IF function:

IF(column_name RLIKE '^[0-9]+$', cast (other_column as double), 0)
Older post

How to check whether a YARN application has finished

How to use Airflow PythonSensor to check whether a YARN application finished running

Newer post

How to write Hive queries with column position number in the GROUP BY or ORDER BY clauses

How to enable column position support in Hive GROUP BY or ORDER BY