· 17 min read
Alex DeBrie

I've done a lot of work with DynamoDB over the years, so my mental model of how distributed databases work is very much based on DynamoDB. But not all distributed databases work like DynamoDB! Sometimes when I venture out into the world of other distributed databases, I'm surprised by how they handle things differently.

In this post, I want to look at a specific aspect of distributed databases: how they handle secondary indexes.
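
As a point of reference for the comparison, here's what a secondary index read looks like in DynamoDB itself -- a minimal boto3 sketch with a hypothetical table and index:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Orders")  # hypothetical table

# A base-table read goes through the table's primary key.
order = table.get_item(Key={"CustomerId": "c-123", "OrderId": "o-456"})

# A global secondary index lets you query on different attributes --
# here, a hypothetical "StatusIndex" keyed on order status. GSI reads
# are eventually consistent with the base table.
pending = table.query(
    IndexName="StatusIndex",
    KeyConditionExpression=Key("OrderStatus").eq("PENDING"),
)
```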

· 16 min read
Alex DeBrie

Last week, someone emailed me to ask about a potential cost optimization mechanism in DynamoDB. More on the specifics of that situation below, but the basic point is that they were considering adding application and architectural complexity because they were concerned about high DynamoDB costs for a particular use case.

I responded the way I always respond to these requests -- "have you done the math?"

One of my favorite things about DynamoDB is that you can easily do the math when considering how much it will cost you. I use this all the time in a few different ways, from getting a rough estimate of how much DynamoDB will cost for my application to deciding between different approaches to solving a specific access pattern.
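
To make that concrete, here's the kind of back-of-the-envelope math I mean. This is a sketch with a hypothetical workload, and the rates are illustrative on-demand prices -- check the current DynamoDB pricing page before relying on them.

```python
# Back-of-the-envelope DynamoDB on-demand cost estimate.
# Rates are illustrative (roughly us-east-1 on-demand pricing);
# check the AWS pricing page for current numbers.
WRITE_PRICE_PER_MILLION = 1.25  # $ per million write request units
READ_PRICE_PER_MILLION = 0.25   # $ per million read request units

monthly_writes = 50_000_000   # hypothetical workload
monthly_reads = 200_000_000

# One write request unit covers an item up to 1 KB; one read request
# unit covers a strongly consistent read of an item up to 4 KB
# (eventually consistent reads cost half that).
write_cost = monthly_writes / 1_000_000 * WRITE_PRICE_PER_MILLION
read_cost = monthly_reads / 1_000_000 * READ_PRICE_PER_MILLION

print(f"Writes: ${write_cost:,.2f}/mo -- Reads: ${read_cost:,.2f}/mo")
```

Five minutes of this kind of arithmetic will often show whether the cost you're worried about justifies the complexity you'd add to avoid it.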

· 16 min read
Alex DeBrie

I recently delivered serverless training to some engineers, and there was confusion between two concepts that come up in discussions of serverless architectures.

On one hand, I describe AWS Lambda as event-based compute, which has significant implications for how you write the code and design the architecture in your serverless applications.

On the other hand, many serverless applications use an event-driven architecture that relies on decoupled, asynchronous processing of events across your application.

These two concepts -- event-driven architectures and event-based compute -- sound similar and are often used together in serverless applications on AWS, but they're not the same thing. Further, the patterns you use for one will not necessarily apply if you're not using the other.
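
To illustrate the first concept: every Lambda function is written as a handler that receives an event, whether or not the surrounding architecture is event-driven. Here's a minimal Python sketch, assuming an API Gateway proxy integration as the event source:

```python
import json

def handler(event, context):
    # Lambda is event-based compute: the runtime hands your code an
    # event object whose shape depends on the invoking service. With
    # an API Gateway proxy integration, the HTTP request arrives as a
    # dict with the request body as a JSON string.
    body = json.loads(event.get("body") or "{}")

    # ... application logic would go here ...

    return {
        "statusCode": 200,
        "body": json.dumps({"received": body}),
    }
```

The same handler shape applies when the event comes from SQS or DynamoDB Streams -- that's the event-based compute model. Whether those events flow through decoupled, asynchronous channels is the separate, event-driven architecture question.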

· 21 min read
Alex DeBrie

Over the past year or two, I've seen the AWS CDK turn a lot of my friends into converts. And in the very recent past, a few of these people have written up their (mostly positive) experiences with the CDK. See Maciej Radzikowski on why he stopped being a CDK skeptic here, or Corey Quinn's list of the CDK's hard edges, which is ultimately still favorable. For a more skeptical view of the CDK, check out Mike Roberts' excellent thoughts here.

Some of these articles describe similar concerns I have, but none of them quite nails my thoughts on the matter. I thought I'd throw my hat in the ring and describe why I still prefer the Serverless Framework over the CDK.

· 18 min read
Alex DeBrie

In 2007, a group of engineers from Amazon published The Dynamo Paper, which described an internal database used by Amazon to handle the enormous scale of its retail operation. This paper helped launch the NoSQL movement and led to the creation of NoSQL databases like Apache Cassandra, MongoDB, and, of course, AWS's own fully managed service, DynamoDB.

Fifteen years later, the folks at Amazon have released a new paper about DynamoDB. Most of the names have changed (except for AWS VP Swami Sivasubramanian, who appears on both!), but it's a fascinating look at how the core concepts from Dynamo were updated and altered to provide a fully managed, highly scalable, multi-tenant cloud database service.

In this post, I want to discuss my key takeaways from the new DynamoDB Paper.

· 22 min read
Alex DeBrie

One of the core complaints I hear about DynamoDB is that it can't be used for critical applications because it only provides eventual consistency.

It's true that eventual consistency can add complications to your application, but I've found these problems can be handled in most situations. Further, even your "strongly consistent" relational databases can result in issues if you're not careful about isolation or your choice of ORM. Finally, the benefits from accepting a bit of eventual consistency can be pretty big.
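
For the reads where eventual consistency genuinely won't do, DynamoDB lets you opt in to strongly consistent reads on the base table. A minimal boto3 sketch, with a hypothetical table and key:

```python
import boto3

table = boto3.resource("dynamodb").Table("Users")  # hypothetical table

# Default reads are eventually consistent (and cost half as much).
eventual = table.get_item(Key={"UserId": "u-123"})

# ConsistentRead=True returns a response reflecting all successful
# writes that completed before the read. Note it costs twice as much
# and isn't available on global secondary indexes.
strong = table.get_item(Key={"UserId": "u-123"}, ConsistentRead=True)
```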

In this post, I want to dispel some of the fear around eventual consistency in DynamoDB.

· 19 min read
Alex DeBrie

I've written and spoken a lot about data modeling with DynamoDB over the years. I love DynamoDB, and I feel like I understand its pros and cons pretty well.

Recently, the topic of GraphQL's compatibility with single-table design in DynamoDB came up again. It was sparked by a tweet from Rick Houlihan, which led to a follow-up tweet suggesting I update my post on single-table design, where I had mentioned that GraphQL might be an area where you don't want to use single-table design with DynamoDB.

Twitter is a bad medium for nuance, and I think the question of whether to use single-table design with GraphQL is a nuanced one that depends a lot on why you are choosing to use GraphQL.

· 14 min read
Alex DeBrie

Prefer video? View this post on YouTube!

DynamoDB powers some of the highest-traffic systems in the world, including Amazon.com's shopping cart, real-time bidding for ad platforms, and low-latency gaming applications. These teams use DynamoDB because of its fast, consistent performance at any scale.

Before I understood DynamoDB, I thought AWS had a giant supercomputer that was faster than everything else out there. Turns out that's not true. They're not defying the laws of physics -- they're using basic computer science principles to provide the consistent, predictable scaling properties of DynamoDB.

In this post, we'll take a deep look at DynamoDB partitions -- what they are, why they matter, and how they should affect your data modeling. The most important reason to learn about DynamoDB partitions is that they shape your understanding of why DynamoDB acts the way it does. At first glance, the DynamoDB API feels unnecessarily restrictive and the principles of single-table design seem bizarre. Once you understand DynamoDB partitions, you'll see why these things are necessary.
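
The core mechanism is worth previewing: DynamoDB hashes each item's partition key to decide which partition stores it. The sketch below illustrates the idea with MD5 -- I'm not claiming this is DynamoDB's exact internal hash function, just the concept:

```python
import hashlib

def choose_partition(partition_key: str, num_partitions: int) -> int:
    # Hash the partition key onto the keyspace; each partition owns a
    # slice of that keyspace, so routing a request is one cheap
    # computation rather than a lookup that touches every node.
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

for key in ["user#alexdebrie", "user#jane", "order#1234"]:
    print(f"{key} -> partition {choose_partition(key, 3)}")
```

Once you see that every request is routed by a hash of the partition key, the requirement that most operations include that key stops looking arbitrary.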