· 17 min read
Alex DeBrie

I've done a lot of work with DynamoDB over the years, so my mental model of how distributed databases work is very much based on DynamoDB. But not all distributed databases work like DynamoDB! Sometimes when I venture out into the world of other distributed databases, I'm surprised by how they handle things differently.

In this post, I want to look at a specific aspect of distributed databases: how they handle secondary indexes.
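
As a point of reference for the comparison, here's what a secondary index read looks like in DynamoDB itself -- a minimal boto3 sketch with a hypothetical table and index:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Orders")  # hypothetical table

# A base-table read goes through the table's primary key.
order = table.get_item(Key={"CustomerId": "c-123", "OrderId": "o-456"})

# A global secondary index lets you query on different attributes --
# here, a hypothetical "StatusIndex" keyed on order status. GSI reads
# are eventually consistent with the base table.
pending = table.query(
    IndexName="StatusIndex",
    KeyConditionExpression=Key("OrderStatus").eq("PENDING"),
)
```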

· 16 min read
Alex DeBrie

Last week, someone emailed me to ask about a potential cost optimization mechanism in DynamoDB. More on the specifics of that situation below, but the basic point is that they were considering adding application and architectural complexity because they were concerned about high DynamoDB costs for a particular use case.

I responded the way I always respond to these requests -- "have you done the math?"

One of my favorite things about DynamoDB is that you can easily do the math when considering how much it will cost you. I use this all the time in a few different ways, from getting a rough estimate of how much DynamoDB will cost for my application to deciding between different approaches to solving a specific access pattern.
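
To make that concrete, here's the kind of back-of-the-envelope math I mean. This is a sketch with a hypothetical workload, and the rates are illustrative on-demand prices -- check the current DynamoDB pricing page before relying on them.

```python
# Back-of-the-envelope DynamoDB on-demand cost estimate.
# Rates are illustrative (roughly us-east-1 on-demand pricing);
# check the AWS pricing page for current numbers.
WRITE_PRICE_PER_MILLION = 1.25  # $ per million write request units
READ_PRICE_PER_MILLION = 0.25   # $ per million read request units

monthly_writes = 50_000_000   # hypothetical workload
monthly_reads = 200_000_000

# One write request unit covers an item up to 1 KB; one read request
# unit covers a strongly consistent read of an item up to 4 KB
# (eventually consistent reads cost half that).
write_cost = monthly_writes / 1_000_000 * WRITE_PRICE_PER_MILLION
read_cost = monthly_reads / 1_000_000 * READ_PRICE_PER_MILLION

print(f"Writes: ${write_cost:,.2f}/mo -- Reads: ${read_cost:,.2f}/mo")
```

Five minutes of this kind of arithmetic will often show whether the cost you're worried about justifies the complexity you'd add to avoid it.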

· 16 min read
Alex DeBrie

I recently delivered serverless training to some engineers, and there was confusion between two concepts that come up in discussions of serverless architectures.

On one hand, I describe AWS Lambda as event-based compute, which has significant implications for how you write the code and design the architecture in your serverless applications.

On the other hand, many serverless applications use an event-driven architecture that relies on decoupled, asynchronous processing of events across your application.

These two concepts -- event-driven architectures and event-based compute -- sound similar and are often used together in serverless applications on AWS, but they're not the same thing. Further, the patterns you use for one will not necessarily apply if you're not using the other.
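
To illustrate the first concept: every Lambda function is written as a handler that receives an event, whether or not the surrounding architecture is event-driven. Here's a minimal Python sketch, assuming an API Gateway proxy integration as the event source:

```python
import json

def handler(event, context):
    # Lambda is event-based compute: the runtime hands your code an
    # event object whose shape depends on the invoking service. With
    # an API Gateway proxy integration, the HTTP request arrives as a
    # dict with the request body as a JSON string.
    body = json.loads(event.get("body") or "{}")

    # ... application logic would go here ...

    return {
        "statusCode": 200,
        "body": json.dumps({"received": body}),
    }
```

The same handler shape applies when the event comes from SQS or DynamoDB Streams -- that's the event-based compute model. Whether those events flow through decoupled, asynchronous channels is the separate, event-driven architecture question.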

· 21 min read
Alex DeBrie

Over the past year or two, I've seen the AWS CDK turn a lot of my friends into converts. And in the very recent past, a few of these people have written up their (mostly positive) experiences with the CDK. See Maciej Radzikowski on why he stopped being a CDK skeptic here, or Corey Quinn's list of the CDK's hard edges, which is ultimately still favorable. For a more skeptical view of the CDK, check out Mike Roberts' excellent thoughts here.

Some of these articles describe similar concerns I have, but none of them quite nails my thoughts on the matter. I thought I'd throw my hat in the ring and describe why I still prefer the Serverless Framework over the CDK.

· 18 min read
Alex DeBrie

In 2007, a group of engineers from Amazon published The Dynamo Paper, which described an internal database used by Amazon to handle the enormous scale of its retail operation. This paper helped launch the NoSQL movement and led to the creation of NoSQL databases like Apache Cassandra, MongoDB, and, of course, AWS's own fully managed service, DynamoDB.

Fifteen years later, the folks at Amazon have released a new paper about DynamoDB. Most of the names have changed (except for AWS VP Swami Sivasubramanian, who appears on both!), but it's a fascinating look at how the core concepts from Dynamo were updated and altered to provide a fully managed, highly scalable, multi-tenant cloud database service.

In this post, I want to discuss my key takeaways from the new DynamoDB Paper.

· 22 min read
Alex DeBrie

One of the core complaints I hear about DynamoDB is that it can't be used for critical applications because it only provides eventual consistency.

It's true that eventual consistency can add complications to your application, but I've found these problems can be handled in most situations. Further, even your "strongly consistent" relational databases can result in issues if you're not careful about isolation or your choice of ORM. Finally, the benefits from accepting a bit of eventual consistency can be pretty big.
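
For the reads where eventual consistency genuinely won't do, DynamoDB lets you opt in to strongly consistent reads on the base table. A minimal boto3 sketch, with a hypothetical table and key:

```python
import boto3

table = boto3.resource("dynamodb").Table("Users")  # hypothetical table

# Default reads are eventually consistent (and cost half as much).
eventual = table.get_item(Key={"UserId": "u-123"})

# ConsistentRead=True returns a response reflecting all successful
# writes that completed before the read. Note it costs twice as much
# and isn't available on global secondary indexes.
strong = table.get_item(Key={"UserId": "u-123"}, ConsistentRead=True)
```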

In this post, I want to dispel some of the fear around eventual consistency in DynamoDB.

· 19 min read
Alex DeBrie

I've written and spoken a lot about data modeling with DynamoDB over the years. I love DynamoDB, and I feel like I understand its pros and cons pretty well.

Recently, the topic of GraphQL's compatibility with single-table design in DynamoDB came up again. It was sparked by a tweet from Rick Houlihan, which led to a follow-up tweet suggesting I update my post on single-table design, where I had mentioned that GraphQL might be an area where you don't want to use single-table design with DynamoDB.

Twitter is a bad medium for nuance, and I think the question of whether to use single-table design with GraphQL is a nuanced one that depends a lot on why you are choosing to use GraphQL.

· 14 min read
Alex DeBrie

Prefer video? View this post on YouTube!

DynamoDB powers some of the highest-traffic systems in the world, including Amazon.com's shopping cart, real-time bidding for ad platforms, and low-latency gaming applications. These teams use DynamoDB because of its fast, consistent performance at any scale.

Before I understood DynamoDB, I thought AWS had a giant supercomputer that was faster than everything else out there. Turns out that's not true. They're not defying the laws of physics -- they're using basic computer science principles to provide the consistent, predictable scaling properties of DynamoDB.

In this post, we'll take a deep look at DynamoDB partitions -- what they are, why they matter, and how they should affect your data modeling. The most important reason to learn about DynamoDB partitions is that they shape your understanding of why DynamoDB acts the way it does. At first glance, the DynamoDB API feels unnecessarily restrictive and the principles of single-table design seem bizarre. Once you understand DynamoDB partitions, you'll see why these things are necessary.
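
The core mechanism is worth previewing: DynamoDB hashes each item's partition key to decide which partition stores it. The sketch below illustrates the idea with MD5 -- I'm not claiming this is DynamoDB's exact internal hash function, just the concept:

```python
import hashlib

def choose_partition(partition_key: str, num_partitions: int) -> int:
    # Hash the partition key onto the keyspace; each partition owns a
    # slice of that keyspace, so routing a request is one cheap
    # computation rather than a lookup that touches every node.
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

for key in ["user#alexdebrie", "user#jane", "order#1234"]:
    print(f"{key} -> partition {choose_partition(key, 3)}")
```

Once you see that every request is routed by a hash of the partition key, the requirement that most operations include that key stops looking arbitrary.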