Skip to content

Home

  • Profile photo for Paul Brabban Paul Brabban, Lead Consultant at Equal Experts


    With experience in software development, data engineering and machine learning, I specialise in data-intensive problems and decentralised data engineering at scale. My experience extends to leading teams, technical architecture and product development. Find out more about my experience and publications in my portfolio.


    Contact me to see how I can help at paul@tempered.works.

Testing stored procedures

hero image

Whilst I've used and written about UDFs a lot, I can't recall ever having a reason to work with stored procedures. When I was asked how I'd go about wrapping them in automation and testing, I thought it'd be a good excuse to take a look and see how I might go about testing them!

Avoiding CAST ROW in AWS Athena SQL

The Trino issue that makes row aggregations so much easier to work with

There is more than one way to build a ROW in AWS Athena and the underlying Trino engine. It turns out I was doing it the verbose, brittle and really annoying way. Row subqueries are so much better!

  • Thanks to Equal Experts logo for supporting this content.

Time travelling with change data capture

Visualisation of effects of end timestamps on a series of transactions

In the last article of the series, I ran into difficulties as I tried to "time-travel" back to earlier points in time. That's an important capability for correct functioning and reproducing results. This article shows how to use window functions (also known as analytic functions) to simplify handling of processing time and avoid the previous problems.

  • Thanks to Equal Experts logo for supporting this content.

Map over an array in BigQuery

SQL snippet for map explained in post

This walkthrough shows how I can use the functional programming techniques map and filter that I already know and love in SQL engines like BigQuery. These techniques give me a lot of processing power whilst keeping my SQL simple and relatively easy to understand. Unlike custom code, I can use the same SQL and infratructure I'd use to process ten rows to process ten billion rows in seconds.

Breaking change data capture with primary keys

A SQL statement that updates a row's primary key

My work on dealing with multiple tables was interrupted when I discovered a subtle scenario that leads to DMS CDC output that cannot be correctly interpreted. I was unable to find a solution, but I will update this post if new information emerges.

  • Thanks to Equal Experts logo for supporting this content.

Disambiguating transactions in change data capture

Example query against disambiguated view

In the CDC output, I get a row for each statement executing in the transaction. Each row reflects the state of the database when that statement is executed. How do I filter out all the transient statements to get the final state of the row when a transaction has finished?

  • Thanks to Equal Experts logo for supporting this content.