
August 27, 2011



Jeremy Hanna

Why don't you email the Cassandra user or dev list with these questions?

Chip Salzenberg

Jeremy: You think they don't know? I've spoken to many of the devs in person. My company hired Riptano. It's not like this is a secret, it's their freaking DESIGN PHILOSOPHY: "Lose all the data you want--the user can always make more."


Your title isn't a correct generalization. I'm using Cassandra in production to serve more than 20k operations per second. Many others are also using it successfully.

Read repair does a good job picking up where node availability was diminished, and general repair does a good job picking up where read repair hasn't.

Your statement about the manual process (I assume you mean "repair") amounting to a read repair of everything all at once is incorrect. Repair calculates a highly compact Merkle tree that's cheap to broadcast and compare, and only the data identified as missing is relayed back to the node that lacks it.
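The Merkle tree comparison described above can be sketched roughly as follows. This is a toy illustration, not Cassandra's implementation: the function names `build_tree` and `diff_ranges`, the use of SHA-256, and the representation of each token range as a single bytes blob are all assumptions made for the example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(buckets):
    """Build a Merkle tree over per-range data blobs (hypothetical layout).

    buckets: list of bytes, one blob per token range; length must be a power of two.
    Returns a list of levels: levels[0] holds leaf hashes, levels[-1] holds the root.
    """
    n = len(buckets)
    assert n > 0 and n & (n - 1) == 0, "bucket count must be a power of two"
    level = [_h(b) for b in buckets]
    levels = [level]
    while len(level) > 1:
        # Each parent hash covers the concatenation of its two children.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def diff_ranges(tree_a, tree_b):
    """Return indexes of leaf ranges whose hashes differ between two trees.

    Walks down from the root and descends only into subtrees that mismatch,
    so matching subtrees are never inspected below their top hash.
    """
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []
    mismatched = [0]
    for depth in range(top - 1, -1, -1):
        nxt = []
        for idx in mismatched:
            for child in (2 * idx, 2 * idx + 1):
                if tree_a[depth][child] != tree_b[depth][child]:
                    nxt.append(child)
        mismatched = nxt
    return mismatched


# Replica B is missing the data for range 2.
replica_a = [b"k1=a", b"k2=b", b"k3=c", b"k4=d"]
replica_b = [b"k1=a", b"k2=b", b"", b"k4=d"]
stale = diff_ranges(build_tree(replica_a), build_tree(replica_b))
```

Only the ranges listed in `stale` would need to be sent between the replicas; the trees themselves contain hashes, not row data, which is why they are cheap to exchange.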

There are lots of tunables (both in the Cassandra server and in the client) that allow you to enforce or relax certain behaviors.

Cassandra does have its pain points (tweaking till stable, ring reconfiguration, client backpressure as you've mentioned, overzealous disk IO in some cases), but in my experience data loss isn't an issue when all the settings are properly configured.

Chip Salzenberg

That people are using something doesn't mean that thing is fit for the use. cf Windows.

I am curious how you know read repair is working. Seems to me that it could be working very badly and you might never know, unless you are so overprovisioned that write replication never fails you and nodes never die.

Full repair is conceptually identical to mass read repair, in that it compares what the nodes have and makes sure they end up sharing what any of them has. It doesn't require sending the full data, but that's not a relevant difference to me. It still requires READING the full data, and IOPS are the more precious resource.
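The distinction between rows read and rows sent can be made concrete with a toy model. This is only a sketch under stated assumptions, not a measurement of Cassandra: `repair_cost`, `range_hashes`, the MD5-based bucketing, and the dict-of-rows replica model are all hypothetical.

```python
import hashlib


def bucket_of(key, num_ranges):
    # Hypothetical partitioner: hash the key and map it onto a range index.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_ranges


def range_hashes(rows, num_ranges):
    """Hash every range of one replica; every row must be read to do so."""
    buckets = [[] for _ in range(num_ranges)]
    reads = 0
    for key, value in rows.items():
        reads += 1
        buckets[bucket_of(key, num_ranges)].append((key, value))
    hashes = []
    for bucket in buckets:
        m = hashlib.sha256()
        for key, value in sorted(bucket):
            m.update(f"{key}={value};".encode())
        hashes.append(m.digest())
    return hashes, buckets, reads


def repair_cost(source, target, num_ranges=4):
    """Return (rows_read, rows_streamed) for one source-to-target comparison."""
    src_hashes, src_buckets, src_reads = range_hashes(source, num_ranges)
    dst_hashes, _, dst_reads = range_hashes(target, num_ranges)
    # Only ranges whose hashes differ are streamed from the source.
    streamed = sum(
        len(src_buckets[i])
        for i in range(num_ranges)
        if src_hashes[i] != dst_hashes[i]
    )
    return src_reads + dst_reads, streamed


source = {f"key{i}": f"value{i}" for i in range(8)}
target = {k: v for k, v in source.items() if k != "key3"}  # one row missing
rows_read, rows_streamed = repair_cost(source, target)
```

In this model `rows_read` is always the full row count of both replicas, no matter how little differs, while `rows_streamed` covers only the mismatched ranges. That is the shape of the argument above: the comparison saves network traffic, not disk reads.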
