Beyond Relational Databases: Is Your Data Relational?

One of the strangest things about technology is how it moves in circles.  The relational database isn’t new technology, and while the storage model and the performance of the system have changed a great deal, the underlying concept is the same.  The leading databases, except for Oracle, all bear SQL in the name, giving the impression that SQL is critical to the concept of the relational database, not merely a front-end language for describing access to relational data.

Web sites fit nicely into a relational model.  They have categories, articles, products, etc.: sets of data.  The idea of applying set theory to data is at the core of the relational database.  I can quickly and easily get all Articles in the Category of SEO, because each article is tagged with its category, and I simply pull the appropriate subset.  You can always get intersections (with JOINs or INTERSECT), unions (UNION), set differences (EXCEPT), and other set operations… if you are using sets of data.
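To make the set operations above concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table names, column names, and article titles are invented for illustration; they are assumptions, not a schema from any real site.

```python
import sqlite3

# Hypothetical schema: articles tagged with categories through a link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE article_categories (article_id INTEGER, category_id INTEGER);

INSERT INTO articles VALUES
    (1, 'Keyword Basics'),
    (2, 'Indexing Your Tables'),
    (3, 'Speeding Up Search Pages');
INSERT INTO categories VALUES (1, 'SEO'), (2, 'Databases');
INSERT INTO article_categories VALUES (1, 1), (2, 2), (3, 1), (3, 2);
""")

# The subset of article titles that belong to one category.
IN_CATEGORY = """
SELECT a.title
FROM articles a
JOIN article_categories ac ON ac.article_id = a.id
JOIN categories c ON c.id = ac.category_id
WHERE c.name = ?
"""


def titles(sql, *params):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql, params))


# Plain subset: everything in the SEO category.
seo = titles(IN_CATEGORY, "SEO")

# Set operations between two category subsets.
both = titles(IN_CATEGORY + " INTERSECT " + IN_CATEGORY, "SEO", "Databases")
either = titles(IN_CATEGORY + " UNION " + IN_CATEGORY, "SEO", "Databases")
only_seo = titles(IN_CATEGORY + " EXCEPT " + IN_CATEGORY, "SEO", "Databases")

print(seo, both, either, only_seo)
```

Each result is just a subset of the same rows, which is the point: once the data is stored as sets, these questions cost one query each.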

Martin Kleppmann asks, on Carsonified, Should you go Beyond Relational Databases? That’s the wrong question to ask.  The question is, “Is your data relational?”  If you have groupings of like data, then you need a relational database.  If you are building an application with non-relational data, then storing it in the database just to have a quick ID lookup is foolish, and you should be looking for persistent data storage that is optimized for that sort of data.

For temporary storage, a system like memcached is perfect: it gives you lightning-fast references to data that may only exist temporarily.  For long-term storage, maybe a database is your answer, or maybe you need something more tied to your data structure.  I wouldn’t suggest Microsoft switch from its DOC format (and the DOCX XML version) to relational databases, but I also wouldn’t put relational data into something more object oriented.  You might use objects to represent it in memory for easier programming, but if the data is essentially relational, keep it in relations.
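As a rough sketch of that last point, the data below stays in relational tables and is only turned into objects at the moment the program needs them. The Article class, the load_articles helper, and the schema are all hypothetical names chosen for this example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Article:
    """In-memory view of one row; the tables remain the source of truth."""
    id: int
    title: str
    category: str


# Hypothetical schema, kept relational on disk (in memory here for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    title TEXT,
    category_id INTEGER REFERENCES categories(id)
);
INSERT INTO categories VALUES (1, 'SEO'), (2, 'Databases');
INSERT INTO articles VALUES
    (1, 'Keyword Basics', 1),
    (2, 'Indexing Your Tables', 2),
    (3, 'Speeding Up Search Pages', 1);
""")


def load_articles(conn, category):
    """Pull the matching subset with SQL, then wrap each row in an object."""
    rows = conn.execute(
        """
        SELECT a.id, a.title, c.name
        FROM articles a
        JOIN categories c ON c.id = a.category_id
        WHERE c.name = ?
        ORDER BY a.id
        """,
        (category,),
    )
    return [Article(*row) for row in rows]


seo_articles = load_articles(conn, "SEO")
for article in seo_articles:
    print(article.id, article.title, article.category)
```

The objects are a convenience for the code that uses them; the filtering and grouping still happen in the database, where the set operations live.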

Data structures are at the core of computer science.  With all the free information out there, there is no excuse for building a large-scale system without knowing the basics.  The fact that Twitter built their operation without knowing what they were doing doesn’t mean that everyone can… Bill Gates dropped out of Harvard and made a fortune, but not every Harvard dropout is so successful.
