At the time of writing, we’ve reimplemented most of Digg’s functionality using Cassandra as our primary datastore. We’ve supplemented Cassandra-based indexing using full text, relational and graph indexing systems. We’re getting used to dealing with eventual consistency.
We’ve been working on Cassandra itself too. We’ve made massive performance improvements: increased comparator speed, added better compaction threading, reduced logging overhead, added row-level caching and implemented multi-get capability. We’ve also implemented native atomic counters using Zookeeper (you can probably guess why we were motivated to add that feature :)
We’ve tested and improved the operational capabilities of Cassandra, upgrading its Rackaware capability, adding slow query logging, improving the bulk import functionality and implementing Scribe support for improved logging. We’ve also done a ton of operational testing.
http://about.digg.com/node/564
Sounds fun. Did I mention I love Amazon SimpleDB?
