Databases

Last week

PSA: Your SQLite Connection Pool Might Be Ruining Your Write Performance

- 17 Feb, 2026 Update (Feb 18, 2026): After a productive discussion on Reddit and additional benchmarking, I found that the solutions I originally proposed (batched writes or using a synchronous connection) don't actually help. The real issue is simpler and more fundamental than I...

Databases

about 2 months ago

Scour Year End Update 2025

- Happy New Year and thanks for your support in 2025!...

Databases Product dev

2 months ago

Short-Circuiting Correlated Subqueries in SQLite

- Skipping expensive per-row subqueries to speed up my average query ~17%....

Databases

What Does a Database for SSDs Look Like?

- Maybe not what you think. Over on X, Ben Dicken asked: What does a relational database designed specifically for local SSDs look like? Postgres, MySQL, SQLite and many others were invented in the 90s and 00s, the era of spinning disks. A...

Databases

3 months ago

RAG Isn’t a Vector Search Problem

- Through market forces, embeddings became the singular framework through which we understood RAG. It's the wrong lens for thinking about the problem...

AI / LLMs Databases

sqlite-utils 4.0a1 has several (minor) backwards incompatible changes

Backend dev Databases

Why Strong Consistency?

- Eventual consistency makes your life harder. When I started at AWS in 2008, we ran the EC2 control plane on a tree of MySQL databases: a primary to handle writes, a secondary to take over from the primary, a handful of read replicas to scale reads, and...

Databases

4 months ago

Scaling HNSWs

- I’m taking a few weeks of pause on my HNSWs developments (now working on some other data structure, news soon). At this point, the new type I added to Redis is stable and complete enough, it’s the perfect moment to reason about what I learned about HNSWs, and turn it into a blog...

Databases

A hypothetical search engine on S3 with Tantivy and warm cache on NVMe

- I’ve been curious about how far you can push object storage as a foundation for database-like systems. In previous posts, I explored moving JSON data from PostgreSQL to Parquet on S3 and building MVCC-style tables with constant-time deletes using S3’s conditional writes. These...

Backend dev Databases

A new SQL-powered permissions system in Datasette 1.0a20

Backend dev Databases

DSQL: Simplifying Architectures

- Complexity is a choice. While we were designing and building Aurora DSQL, we spent a lot of time thinking about our experience building and running database-backed systems. We saw that building great, fast, cost-effective, highly-available,...

Databases

Fixing UUIDv7 (for database use-cases)

- How do I even balance a V7? RFC9562 defines UUID Version 7. This has made a lot of people very angry and been widely regarded as a bad move. More seriously, UUIDv7 has received a lot of criticism, despite seemingly achieving what it set...

Backend dev Databases

5 months ago

Mutable atomic deletes with Parquet backed columnar tables on S3

- In the previous post, I explored a Parquet on S3 design with tombstones for constant-time deletes and a CAS-updated manifest for snapshot isolation. This post extends that design. The focus is on in-file delete operations where we replace a Parquet row group and publish a new...

Databases

Locality, and Temporal-Spatial Hypothesis

- Good fences make good neighbors? Last week at PGConf NYC, I had the pleasure of hearing Andres Freund talking about the great work he’s been doing to bring async IO to Postgres 18. One particular result caught my eye: a large difference...

CompSci Databases

An MVCC-like columnar table on S3 with constant-time deletes

- Parquet is excellent for analytical workloads: columnar layout, aggressive compression, predicate pushdown. But deletes require rewriting entire files. Systems like Apache Iceberg and Delta Lake solve this by adding metadata layers that track delete files separately from data...

Backend dev Databases

Exploring PostgreSQL to Parquet archival for JSON data with S3 range reads

- PostgreSQL handles large JSON payloads reasonably well until you start updating or deleting them frequently. Once payloads cross the 8 KB TOAST threshold and churn becomes high, autovacuum can dominate your I/O budget and cause other issues. I have been exploring the idea of...

Backend dev Databases

Top picks — 2025 September

- Holy moly, September was hectic, mostly good and definitely memorable. Family came over for a visit from Poland, we got married, we travelled to northern Italy, and the most recent meetup I organised was a huge success. It was intense and I’m ready for a chill and quiet...

CompSci Databases

Subtleties of SQLite Indexes

- Understanding query planner quirks yielded a ~35% speedup....

Databases

You Want Technology With Warts

- I normally skip presentations because I prefer reading, but Building the Hundred-Year Web Service (YouTube) was worth the time. Note that despite “htmx” featuring in the title, very little of the presentation is actually about htmx. It is about choosing and using technology in...

Backend dev Databases

6 months ago

Why UUIDs Beat Integers as Primary Keys (And Why Performance Isn’t the Issue)

- One of the first schema decisions you face when designing a database table is: Should I use an INT or a UUID as the primary key? Most developers default to an auto-incrementing integer. It’s simple, compact, and familiar. But UUIDs (a.k.a. GUIDs) are increasingly popular — and...

Databases

Strong Eventual Consistency - The Big Idea behind CRDTs

- 9/8/2025 CRDTs. Data structures that can be replicated across multiple nodes, edited independently, merged back together, and it all just works. But collaborative document editing and multiplayer TODO lists are just the tip of the iceberg - I believe the big application is...

Databases

Materialized views are obviously useful

Backend dev Databases

7 months ago

Dynamo, DynamoDB, and Aurora DSQL

- Names are hard, ok? People often ask me about the architectural relationship between Amazon Dynamo (as described in the classic 2007 SOSP paper), Amazon DynamoDB (the serverless distributed NoSQL database from AWS), and Aurora DSQL (the...

Databases

You should add debug views to your DB

- This one will be quick. Imagine this, you get a report from your bug tracker: Sophie got an error when viewing the diff after her most recent push to her contribution to the @unison/cloud project on Unison Share (BTW, contributions are like pull requests, but for Unison code)...

Databases
