My name is Thais Vaz. I’ve been a data engineer for 8+ years.
I started at Itaú Unibanco, where I received the MÉRITO Prize for data quality work. I then moved to EBANX, where I built ETL pipelines at scale. After that, I worked on the Apple project in Silicon Valley through HCL Technologies. Today I’m a senior data engineer at Bradesco.
My core stack is Databricks. Not because I read the docs. Because it’s what runs in production where I’ve worked.
Why this blog exists
In 2025 I joined the Numerical Methods in Engineering program at UFPR (Federal University of Paraná) as an auditing student, and in 2026 I was admitted as a full Master’s student. My research is on AI-driven predictive monitoring using LLMs for operational systems.
Along the way I noticed something that bothered me. Almost nobody was writing about production data engineering in Portuguese, at least not the way I wanted to read it: with depth, from someone who actually shipped it at a real bank, with real LGPD compliance constraints, SLA requirements, and regulatory oversight.
So I started writing.
What you’ll find here
The first track is production data engineering. Databricks, Delta Lake, Spark, dbt, Airflow. Real architecture decisions, mistakes I made and what I learned. Brazilian context where it’s relevant.
The second is the crypto AI agent, built in public. Architecture, code, backtesting, on-chain analysis. Every step documented. If something breaks, you’ll know why.
The third is the master’s research translated to practice. What academic research has to say about the problems you face every day, no filter.
Published in Portuguese and English, every week.
Where to find me
- Newsletter on Substack: vazdeng.substack.com (a summary of what’s published here, straight to your inbox)
- GitHub: @thaiscvaz
- LinkedIn: thacvaz
- Contact: reply to any post or ping me on LinkedIn