Let's start from the key point - we have databases to manage DATA. One of the key elements of this task is making sure that data is reliably stored and retrieved. And here is the catch - what do we mean by reliable? Or, to be precise - what happens to you, your company, or your customers if some piece of data is lost forever? The answer to this question drives the whole technology stack! For example, if you work with medical/legal/official data - a small chunk of lost information (if noticed) could mean litigation at best and people's lives at worst!
Let's be clear - the majority of current NoSQL DB solutions are explicitly not ACID-compliant (or at least not 100% ACID-compliant). For example, I found a pretty good analysis of MongoDB and CouchDB - and it is clear that even their proponents say that there are always trade-offs between performance and data reliability. Some articles even suggest a dual-environment implementation, where you have a NoSQL database for non-critical data plus an RDBMS for critical data.
Just to clarify what I mean by ACID compliance:
- Atomicity requires that each transaction is executed in its entirety, or fails without any change being applied.
  - I.e. if you have a successful INSERT and a successful DELETE in the same transaction, either both of them are committed or neither is.
- Consistency requires that the database only passes from one valid state to the next, without intermediate points.
  - I.e. it is impossible to catch the database in a state where some rules (for example, a PK) are not yet enforced for the stored data.
- Isolation requires that if transactions are executed concurrently, the result is equivalent to their serial execution. A transaction cannot see the partial results of another one.
  - I.e. each transaction works in its own realm until it tries to commit the data.
- Durability means that the result of a committed transaction is permanent, even if the database crashes immediately afterwards or there is a power loss.
  - I.e. it is impossible to have a situation where the application/user thinks the data is committed, but after a power failure it is gone.
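To make the first two properties a bit more tangible, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table `items` and the helper names (`make_db`, `insert_then_delete`, `try_duplicate_key`) are made up for illustration, and SQLite is just a convenient stand-in for any ACID-compliant RDBMS. Isolation and durability are deliberately left out, since they need concurrent sessions and a real crash to demonstrate properly.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding a single row with id = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO items (id, name) VALUES (1, 'old')")
    conn.commit()
    return conn


def insert_then_delete(conn, fail_midway):
    """Atomicity: an INSERT and a DELETE in one transaction.

    If the simulated failure fires between the two statements, the
    already executed INSERT is rolled back as well.
    """
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("INSERT INTO items (id, name) VALUES (2, 'new')")
            if fail_midway:
                raise RuntimeError("simulated failure between statements")
            conn.execute("DELETE FROM items WHERE id = 1")
    except RuntimeError:
        pass
    return [row[0] for row in conn.execute("SELECT id FROM items ORDER BY id")]


def try_duplicate_key(conn):
    """Consistency: a PK violation rejects the whole transaction."""
    try:
        with conn:
            conn.execute("INSERT INTO items (id, name) VALUES (3, 'a')")
            conn.execute("INSERT INTO items (id, name) VALUES (3, 'b')")
    except sqlite3.IntegrityError:
        return False
    return True


if __name__ == "__main__":
    print(insert_then_delete(make_db(), fail_midway=True))   # [1]
    print(insert_then_delete(make_db(), fail_midway=False))  # [2]
    print(try_duplicate_key(make_db()))                      # False
```

In the failing run the table still contains only the original row, even though the INSERT itself succeeded: both statements are applied or neither is. In the duplicate-key case the first, perfectly valid INSERT disappears too, so the database never ends up in a half-applied state.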
I understand that there are environments where that small chance of data loss can be, if not ignored, then at least tolerated: we all know that Craigslist runs on MongoDB - so, what's the impact of one lost ad? Somebody might get annoyed, but that's all!
But when I start hearing about medical systems being built on NoSQL solutions - I start to get nervous. Maybe in a couple of years, before going to the doctor, I will first check what kind of software they use! Just to feel safer...