Floe Blog

Floe dev update - Q1 2026

Written by Neil | Mar 16, 2026 2:37:59 AM

Most readers know that at Floe we're busy building a SQL compute service on the data lakehouse. Floe is built to eat up ad-hoc queries from humans and AIs, and to deliver results reliably, with consistent quality of service whatever the workload mix.

We just celebrated a major development milestone on the road to delivering the first version of our product. This first milestone was to get end-to-end TPC-H queries running against Delta Lake and Iceberg lakehouses. We're using a micro-services architecture, which I'll outline in a subsequent blog post, and an end-to-end flow means traversing dozens of these little programs written in Golang, Rust, C, C++ and Java (at least). We got through creating a new account, signing up, importing a data catalogue or two (in this case, a Delta view and an Iceberg view of the same data, hosted on AWS Glue), creating a small compute cluster, firing up a query editor, and executing some queries. All of this ran in our Production AWS account and delivered correct results.

This milestone matters because we can now start to test the correctness of the entire system, enforce continuous integration and deployment of micro-services to Production, and start to build system regression tests. With those tests in place, we can build out new functionality without worrying about regressions.

To pull this together, the team implemented all sorts of functions, listed below, each of which is further subdivided into multiple micro-services:

  • A new cloud platform and SIEM ready for SOC 2 Type II compliance.
  • An extremely restrictive multi-tenant security stack based on Cilium to isolate account networks, Istio to route and protect cross-service communication, and Falco to monitor suspicious activity.
  • A new session service for authentication and tracking session state, backed by scale-out key/value stores and a secret manager.
  • A brand new Vue 3.5 UI stack.
  • A new PostgreSQL-compatible connection handler written in Golang, which supports standard and extended protocols, including binary.
  • FloeCat, our open-source meta-catalogue, which provides uniform access to Iceberg and Delta catalogues, captures extended statistics, and implements traditional database artefacts such as system views and constraints, all accessible via Iceberg and Arrow Flight.
  • The query parser and planner were a big lift. We disentangled them from the old PostgreSQL binary (and each other!) and added clean abstractions and interfaces to enable delegating statistics and catalogue functions to FloeCat.
  • A new cluster manager and cluster allocator for provisioning and managing compute clusters.
  • The Yellowbrick worker was decoupled from the old appliance-based storage engine and ported to the ARM architecture. Right now we have a Rust-based "shim" Parquet scanning service as a temporary placeholder.
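
For readers wondering what "standard and extended protocols, including binary" means for the connection handler, here is a minimal sketch of the client-side messages involved. It is written in Python purely for illustration (the handler itself is in Golang) and is not Floe's code. The helper names (frame, simple_query, parse_msg, bind_msg) are made up for this example; the message layouts follow the public PostgreSQL frontend/backend protocol.

```python
import struct


def frame(msg_type: bytes, payload: bytes) -> bytes:
    """Wrap a payload as: 1-byte type, Int32 length, payload.

    The length counts itself (4 bytes) plus the payload, but not the type byte.
    """
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


def cstr(text: str) -> bytes:
    """Encode a null-terminated string, as the protocol requires."""
    return text.encode("utf-8") + b"\x00"


def simple_query(sql: str) -> bytes:
    """Standard (simple) protocol: one 'Q' message carrying the SQL text."""
    return frame(b"Q", cstr(sql))


def parse_msg(statement: str, sql: str, param_oids=()) -> bytes:
    """Extended protocol, step 1: 'P' prepares a named statement."""
    payload = cstr(statement) + cstr(sql) + struct.pack("!h", len(param_oids))
    for oid in param_oids:
        payload += struct.pack("!i", oid)
    return frame(b"P", payload)


def bind_msg(portal: str, statement: str, params, binary_results=True) -> bytes:
    """Extended protocol, step 2: 'B' binds parameter values to a portal.

    Format code 1 means binary and 0 means text. A single format code
    applies to every parameter (or every result column).
    """
    payload = cstr(portal) + cstr(statement)
    payload += struct.pack("!hh", 1, 1)            # all parameters are binary
    payload += struct.pack("!h", len(params))
    for value in params:
        payload += struct.pack("!i", len(value)) + value
    payload += struct.pack("!hh", 1, 1 if binary_results else 0)
    return frame(b"B", payload)
```

A client using the standard protocol would send simple_query("SELECT 1") and read text results back. With the extended protocol it would send parse_msg("s1", "SELECT $1::int4", [23]) (23 is the type OID for int4), then bind_msg("", "s1", [struct.pack("!i", 42)]), followed by Execute and Sync messages that are omitted here.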

This took just over two months and came alive on 6th March, approximately 3 weeks ahead of schedule. AI tools like Claude Code and Codex – wielded carefully – have accelerated everything we do, and we're running faster and more nimbly as an early-stage business where everyone's having fun cranking out code.

To celebrate, we drank plenty of beer and wine at the office bar, ate too much cheese, and threw a party at Amazonico. One team member (who preferred to remain anonymous) even got quietly married. Congrats!

Team Floe chilling at Amazonico

We're now heads-down planning our next major milestone, the key component of which is a new scale-out scan-planning service that we expect to set some world records pruning Parquet files on object stores. We're going to take a page out of the ClickHouse Cloud journey and "dog-food" Floe, running it alongside Databricks for OpenTelemetry, as well as for our marketing warehouse and our QA warehouse. These projects also require us to implement the Iceberg VARIANT type for interoperability, as a replacement for the old Yellowbrick JSON type.
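
If "pruning Parquet files" is unfamiliar, the basic idea can be sketched in a few lines. This is a toy illustration of min/max pruning in general, not a description of how the Floe scan planner will work. The FileStats type, the prune function and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Per-file column statistics, as a catalogue might record them (hypothetical)."""
    path: str
    min_value: int
    max_value: int


def prune(files, lo, hi):
    """Keep only files whose [min, max] range overlaps the predicate range [lo, hi].

    A file whose maximum is below lo, or whose minimum is above hi, cannot
    contain a matching row, so it never needs to be read from the object store.
    """
    return [f for f in files if f.max_value >= lo and f.min_value <= hi]


# Example: a predicate such as WHERE order_id BETWEEN 150 AND 250
files = [
    FileStats("part-0.parquet", 0, 99),
    FileStats("part-1.parquet", 100, 199),
    FileStats("part-2.parquet", 200, 299),
    FileStats("part-3.parquet", 300, 399),
]
survivors = prune(files, 150, 250)
```

Here only part-1.parquet and part-2.parquet survive, so half of the files are skipped without being opened. The payoff grows with the number of files, which is why doing this planning step quickly and at scale matters.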

Stay tuned for more updates!

- The Floe team