Update on CloudDS
Here is a progress update to my current main project (we call it 'CloudDS' which stands for cloud data store which is a silly name but it will have to do until we can find a replacement ).
I have been working on the data store component of the service. It has taken at least x4 as much time and effort as I thought it would. A prime reason for underestimating the time requirements is that the initial features list I wanted to implement doubled in size. In addition to that, testing for most of the possible logic paths that could result to failure also took a long time - even if some of that testing was automated, not all of it was and validating results is harder than setting up the test environment.
In such a service, it matters little if most of underlying components fail (I/O and tasks scheduler, garbage collector, cache subsystem, etc) as long as the data management component is not affected. Suffering from a service outage is bad, suffering data corruption and/or data loss is something that has to be prevented by any means necessary.
As it stands, that said component now deals fine with reads and writes, self-healing, caching and performs faster than I hoped it would. The data model is based on BigTable, Dynamo, Cassandra and some earlier prototypes/projects we toyed with in the past. It borrows Cassandra's ColumnFamily/SuperColumn/Column key value representation model. Data are pushed into MemTables and an append only commit log, memtables are flushed into SSTables to disk.
The GCollector merges SSTables whenever required to reclaim space, resolve conflicts and extract a single value out of multiple versions, etc. All operations supported by Cassandra are implemented (query by path, predicate, column names, key ranges, etc ) and CloudDS clients/users will also be able to use a scripting language to describe explicitly down to bytes what they need(i.e give me the first couple of bytes for those values, or gimme a concatenation of those values, etc etc).
Now that that component is out of the way, I can move on to the rest; those are relatively straight forward to implement ( the tasks scheduler and the network I/O subsystems are mostly done ).
Friday, 26 March 2010 9:19 pm
mySQL, noSQL, and Key Value datastores
Monolithic RDBMs are losing ground to key-value data stores, particularly persistent distributed in nature. mySQL mounting problems was perhaps the key reason (pun intended) people looked elsewhere. Google's brilliant engineers realized that a key/value data model can satisfy the needs of almost every class of application that needs a datastore backend.
Key/value datastores are simple to build, easy to understand, easy to optimize, easy to scale. The, now famous, CAP theorem states that it is not practically possible to guarantee consistency, availability and partitioning resilience/tolerance all at the same time; one of those traits has to be sacrificed. Again, most applications really do not require all three to function. The CAP theorem is most likely derived from the Project Triangle mode.
Most web-based applications are built on simple data models. Most web-applications eventually suffer from service capacity and availability issues(i.e scalability woes). It is trivial to scale out(vertically) application logic processing(application servers), HTTP requests processing(web servers, load balancers).
It is not easy to scale out an RDBMS. Some expensive systems(Oracle, etc) provide ways to address those issues (e.g Oracle RAC) but its expensive to deploy them, and most of them rely on a shared everything setup which just doesn't work in the long run. (Shared nothing is really the way to go).
Google released a bunch of papers ( actually, a bazillion of papers ), many of them defining and shaping the development of future related technologies. Namely, the papers describing GFS, BigTable, MapReduce (and of course, the paper the changed everything, "The Anatomy of a Large-Scale Hypertextual Web Search Engine" ) steered everyone to the right direction.
In the datastores domain, Hadoop/HBase, Radix, Cassandra and others, based on BigTable and Amazon's Dynamo papers, all relying on the simple key/value datastore model, are gaining market share - rightly so. Coupled with Memmache and similar services(in-memory key/value stores) they are solving the problems of service capacity and availability. This is a paradigm shift. Its a downhill for heavy-footprint, complex and inflexible datastore systems. They wont go away but will not be such a valuable(pun intended) component in tomorrow's technology landscape.
We are going to gradually migrate from RDBMs - though, we are not relying that much on them nowadays - to a key/value datastore (we are currently building one, also based on BigTable and Dynamo ). If nothing else, those simple systems are both simple and beautiful (for the most part).
Saturday, 13 March 2010 8:14 pm
Silly tip of the day : improve mySQL INSERTs
Instead of using
INSERT INTO table VALUES (columnValue1[,columnValue2...columnValueN])
for each row you wish to insert, use a single ( or a few )
INSERT INTO table VALUES (columnValue1[,columnValue2...columnValueN]), (columnValue1[,columnValue2...columnValueN]), ... .
You will notice a HUGE increase in perfromance.
Friday, 9 June 2006 2:40 pm