By default, Hive does not support referring to columns in GROUP BY or ORDER BY by their position instead of their name. Therefore, when I try to run this query:
SELECT column_A, count(*) FROM table_name GROUP BY 1
I will see an error message telling me that column_A is not in the GROUP BY expression, because Hive treats the 1 as a constant value rather than as a column position.
To get the behavior I want, I have to enable position aliases before running my query. I can do that by setting the hive.groupby.orderby.position.alias property to true:
SET hive.groupby.orderby.position.alias=true;
SELECT column_A, count(*) FROM table_name GROUP BY 1
Now Hive correctly recognizes that I want to group by the first column of the SELECT list, looks up its name, and uses it to execute the query.
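The same property also covers ORDER BY, so both clauses can use positions in one query. Below is a minimal sketch, assuming the same table_name and column_A as above; the alias cnt is something I introduce here only for readability. As far as I know, Hive 2.2.0 and later deprecate this property in favor of the separate hive.groupby.position.alias and hive.orderby.position.alias settings, so check the configuration documentation for the version you run.

```sql
-- Enable positional references in GROUP BY and ORDER BY for this session.
SET hive.groupby.orderby.position.alias=true;

-- Position 1 refers to column_A and position 2 to the count,
-- following the order of the SELECT list.
-- "cnt" is an illustrative alias, not part of the original query.
SELECT column_A, count(*) AS cnt
FROM table_name
GROUP BY 1
ORDER BY 2 DESC;
```

Positions follow the order of the SELECT list, so reordering the selected columns silently changes what the query groups and sorts by. That is worth keeping in mind before using positions in queries that other people will edit.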