Wednesday, July 25, 2012

Query Languages and technology Mashups

SQL (Structured Query Language, See-Kwell) has been around for awhile.  IBM engineers developed SEQUEL (Structured English Query Language) in the early 70s to support building and querying relational models.  It was adopted as an ANSI standard in the mid-80s.  I've been using it since the days of Informix & Sybase, prior to SQL Server 6.5.  Various flavours and implementations have been created since then, with the key ones being SQL Server's T-SQL, Oracle's PL/SQL, and IBM's SQL PL, and PostGres PL/SQL, and the various offshoots of those languages.

MDX (Multidimensional Expressions, or Mosha's Data Expressions) are used to query Analysis Services multidimensional and tabular cube models.  Instead of a single axis of rows being returned, there is the potential to return 128 axes of data.  Usually it's just rows, columns, and sometimes pages, since thinking in 128 dimensions hurts most peoples brains.

DMX (Data Mining Extensions) is a a SQL-like query language used for creating and pulling data from multidimensional mining models.  The original team lead for Data Mining at MS is now CTO at a company called Predixions, implementing mining models using PowerPivot.

DAX (Data Analysis Expressions) supercedes MDX for the Analysis Services / PowerPivot tabular models.  MDX is still used in the background for Excel when querying PowerPivot models, but DAX is the way you write expressions and calculated functions.

NoSQL (Not Only SQL) isn't really a language, it is a technology categorization.  NoSQL databases like those used for Facebook and Google aren't necessarily relational in nature.  Usually they are key-value stores.  Think thin columns and deep rows.  Microsoft has Windows Azure Blob Storage for it's NoSQL offering, with others on the way.  HQL (Hive Query Language) is one way of querying a NoSQL database running on Hadoop.  Not to be mistaken with Hibernate Query Language, used to query the ORM framework NHibernate.  Most NoSQL databases fail the ACID test, and aren't necessarily good for transactional systems.  They're great when you need to distribute and analyze massive amounts of data.

NewSQL is either a Sourceforge LDBC driver implementation created around 2003 that offers a simplified SQL syntax, or yet another buzzword coined last year to describe something that isn't SQL and isn't NoSQL.  It's easier to use, more scalable, and performs better.  It provides ACID support and all the features of new and old database technologies.  Or so the vendors of this technology space suggest...

Some products in the NewSQL space include Akiban Server and (perhaps) Microsoft SQL Azure

More info here:
http://www.readwriteweb.com/cloud/2011/04/the-newsql-movement.php

Augmenting legacy SQL databases and technologies with their new "Big Data" columnstore or key-pair counterparts seems to be the Next Big Thing.  Eventually one of the Big Dogs will fill this space, probably Microsoft, Oracle, IBM or SAP.  For now the niche players will nibble at each other's dogfood, until somebody decides to buy out the dogfood supplier.

What we really need is a database to track and compare the companies, names, technologies, features and query languages used for each of these products.  Maybe this database could just be a query language against Wikipedia.  As more and more players enter the database market, research companies and consultants who understand the "Big Picture" become more important, as do content aggregators and search tools like Wikipedia, Linked-In and Google.  

No comments: