Comments on This long run: The confusing CAP and ACID wording
Nicolas Liochon (http://www.blogger.com/profile/07943925485349697034)
Updated 2023-12-26T06:47:22.192+01:00

2015-08-10T12:32:06.152+02:00
Thanks for a very informative article. Given that the definition and context of ACID relate to relational databases, should we in fact be more concerned about BASE versus ACID in terms of equivalence?
Posted by Anonymous

2015-05-30T15:11:21.486+02:00
Hi Pawel,

Thanks for the nice feedback.
To me, using ACID as a binary yes/no is a little bit like the CAP categories: it gives a taste of what the database authors are aiming at, but not much more.
Distributed databases with ACID do exist: SQL + 2PC is exactly this, but it comes with so many drawbacks (no real tolerance to partitions, limited performance, ...) that it cannot be a general solution. And all distributed databases have chosen different drawbacks so far.

As such, if you want to select or use a distributed database, you can't use ACID as a boolean value, but using ACID (and CAP) to study the database's behavior helps a lot in identifying the trade-offs actually made. For example:

Atomic:
 - Many (Bigtable clones) are atomic by row, and rows have to fit on a single node.
 - Some (VoltDB) do more, but then the actual cost of cross-node operations can be quite high.

Consistent:
 - Some constraints are easy to implement (checking data types, for example).
 - Foreign key checks are expensive on big data systems. Some try to keep them outside of the DB. There is a paper by Peter Bailis (Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity), with some 'fun' (but expected) findings. (It seems it's not available anymore. As the conference takes place next week, it should be back soon.)

Isolation:
 - Among the default levels plus snapshot isolation, which ones are efficiently implemented by the DB? Are they implemented with locks all over the place, forfeiting concurrency?

Durability:
 - The wildest one. A lot of NoSQL databases forfeit durability to get better performance: they write to a single node instead of two or more, or flush the buffer asynchronously, or keep the data in the client buffer.

So the boolean ACID and CAP are very limited, but taking each letter individually and asking for each one "What does it mean exactly?" and "What are the drawbacks?" works quite well for understanding a distributed system.
Posted by Nicolas Liochon (https://www.blogger.com/profile/07943925485349697034)

2015-05-29T23:41:58.891+02:00
Hi Nicolas, thanks for this really good article. I personally think that because of all those differences between the terms (which you've just described), we shouldn't consider ACID in any way when discussing NoSQL. ACID was created back in the day when systems were hosted on single machines, so its authors didn't even think about distributed processing, and it only confuses people nowadays. I have one question, however (since, contrary to my opinion, ACID is sometimes used when describing NoSQL): do you think that it is possible for NoSQL databases to simultaneously guarantee all of the ACID properties?
Posted by Paweł J.