Last week
- Happy New Year and thanks for your support in 2025!...
About a month ago
- Skipping expensive per-row subqueries to speed up my average query by ~17%....
- What Does a Database for SSDs Look Like? Maybe not what you think. Over on X, Ben Dicken asked: What does a relational database designed specifically for local SSDs look like? Postgres, MySQL, SQLite and many others were invented in the 90s and 00s, the era of spinning disks. A...
about 1 month ago
- Through market forces, embeddings became the singular framework through which we understood RAG. It's the wrong lens for thinking about the problem...
about 2 months ago
- Why Strong Consistency? Eventual consistency makes your life harder. When I started at AWS in 2008, we ran the EC2 control plane on a tree of MySQL databases: a primary to handle writes, a secondary to take over from the primary, a handful of read replicas to scale reads, and...
2 months ago
- I’ve been curious about how far you can push object storage as a foundation for database-like systems. In previous posts, I explored moving JSON data from PostgreSQL to Parquet on S3 and building MVCC-style tables with constant-time deletes using S3’s conditional writes. These...
- DSQL: Simplifying Architectures Complexity is a choice. While we were designing and building Aurora DSQL, we spent a lot of time thinking about our experience building and running database-backed systems. We saw that building great, fast, cost-effective, highly-available,...
3 months ago
- Fixing UUIDv7 (for database use-cases) How do I even balance a V7? RFC9562 defines UUID Version 7. This has made a lot of people very angry and been widely regarded as a bad move. More seriously, UUIDv7 has received a lot of criticism, despite seemingly achieving what it set...
- In the previous post, I explored a Parquet on S3 design with tombstones for constant time deletes and a CAS updated manifest for snapshot isolation. This post extends that design. The focus is in file delete operations where we replace a Parquet row group and publish a new...
- Locality, and Temporal-Spatial Hypothesis Good fences make good neighbors? Last week at PGConf NYC, I had the pleasure of hearing Andres Freund talking about the great work he’s been doing to bring async IO to Postgres 18. One particular result caught my eye: a large difference...
- Parquet is excellent for analytical workloads. Columnar layout, aggressive compression, predicate pushdown, but deletes require rewriting entire files. Systems like Apache Iceberg and Delta Lake solve this by adding metadata layers that track delete files separately from data...
- PostgreSQL handles large JSON payloads reasonably well until you start updating or deleting them frequently. Once payloads cross the 8 KB TOAST threshold and churn becomes high, autovacuum can dominate your I/O budget and cause other issues. I have been exploring the idea of...
- Holy moly, September was hectic, mostly good and definitely memorable. Family came over for a visit from Poland, we got married, we travelled to northern Italy, and the most recent meetup I organised was a huge success. It was intense and I’m ready for a chill and quiet...
- Understanding query planner quirks yielded a ~35% speedup....
4 months ago
- I normally skip presentations because I prefer reading, but Building the Hundred-Year Web Service (YouTube) was worth the time. Note that despite “htmx” featuring in the title, very little of the presentation is actually about htmx. It is about choosing and using technology in...
- One of the first schema decisions you face when designing a database table is: Should I use an INT or a UUID as the primary key? Most developers default to an auto-incrementing integer. It’s simple, compact, and familiar. But UUIDs (a.k.a. GUIDs) are increasingly popular — and...
5 months ago
- Dynamo, DynamoDB, and Aurora DSQL Names are hard, ok? People often ask me about the architectural relationship between Amazon Dynamo (as described in the classic 2007 SOSP paper), Amazon DynamoDB (the serverless distributed NoSQL database from AWS), and Aurora DSQL (the...
- This one will be quick. Imagine this, you get a report from your bug tracker: Sophie got an error when viewing the diff after her most recent push to her contribution to the @unison/cloud project on Unison Share (BTW, contributions are like pull requests, but for Unison code)...
- This article is about a code-transformation technique I used to get 100x-300x performance improvements on a particularly slow bit of code which was loading Unison code from Postgres in Unison Share. I haven't seen it documented anywhere else, so wanted to share the trick! It's a...
- 1. Mental Model: D1 = SQLite running inside your Worker process; not a separate database server (zero network latency); one logical database, replicated globally by Cloudflare; env.DB injected at runtime via binding system. 2. Basic Setup: Wrangler Co......
- Quick Mental Model: KV Namespace = the actual database/storage (has ID like abc123def456); Binding = variable name in your code (like TODO, USERS); Key-Value Store = simple hash map, not a relational database; Eventually Consistent = changes take tim......