Skip to content

Home

  • Profile photo for Paul Brabban Paul Brabban, Lead Consultant at Equal Experts


    With experience in software development, data engineering and machine learning, I specialise in data-intensive problems and decentralised data engineering at scale. My experience extends to leading teams, technical architecture and product development. Find out more about my experience and publications in my portfolio.


    Contact me to see how I can help at paul@tempered.works.

dbt-core vulnerability PVE-2024-73530

SafetyCLI check output for the vulnerability

I noticed failing builds this week, with a dbt-core vulnerability flagged by SafetyCLI as the culprit. There was no actionable information in the command line output, so here's what I found and my action.

  • Thanks to Equal Experts logo for supporting this content.

Time travelling with change data capture

Visualisation of effects of end timestamps on a series of transactions

In the last article of the series, I ran into difficulties as I tried to "time-travel" back to earlier points in time. That's an important capability for correct functioning and reproducing results. This article shows how to use window functions (also known as analytic functions) to simplify handling of processing time and avoid the previous problems.

  • Thanks to Equal Experts logo for supporting this content.

Map over an array in BigQuery

SQL snippet for map explained in post

This walkthrough shows how I can use the functional programming techniques map and filter that I already know and love in SQL engines like BigQuery. These techniques give me a lot of processing power whilst keeping my SQL simple and relatively easy to understand. Unlike custom code, I can use the same SQL and infratructure I'd use to process ten rows to process ten billion rows in seconds.

Breaking change data capture with primary keys

A SQL statement that updates a row's primary key

My work on dealing with multiple tables was interrupted when I discovered a subtle scenario that leads to DMS CDC output that cannot be correctly interpreted. I was unable to find a solution, but I will update this post if new information emerges.

  • Thanks to Equal Experts logo for supporting this content.

Disambiguating transactions in change data capture

Example query against disambiguated view

In the CDC output, I get a row for each statement executing in the transaction. Each row reflects the state of the database when that statement is executed. How do I filter out all the transient statements to get the final state of the row when a transaction has finished?

  • Thanks to Equal Experts logo for supporting this content.

Handling CVE-2019-8341 for dbt and mkdocs

Safety output for CVE-2019-8341

Yesterday, Safety told me about CVE-2019-8341, a security issue affecting Jinja2. I'll walk through how I investigated and assessed the risk to my website and a dbt pipeline I operate in the public domain. I finish up with a commentary on why I think this vulnerability is real and should be fixed, and why I think we need to risk breaking potentially insecure usage to make vulnerability management manageable in the real world.

  • Thanks to Equal Experts logo for supporting this content.

Exploring transactions in change data capture

Screenshot of a transaction containing insert, update and delete

Last time, I set up a CDC system using AWS RDS and DMS services. Now, I'll run some operations through the source database and show what that looks like in the CDC output. I'll introduce some metadata fields that are critical to figuring out what this CDC output means and set us up to look at the specific challenges I've had with interpreting these CDC outputs robustly to solve real-world problems reliably.

  • Thanks to Equal Experts logo for supporting this content.