Thursday, November 06, 2008

Legalizing the crack that is Excel spreadmarts – Chris Webb on Project Gemini

There’s nothing wrong with Excel.  Actually, there’s tons of stuff wrong with Excel.  Here is one example of a Spreadmart gone bad.

Excel error leaves Barclays with more Lehman assets than it bargained for

The law firm representing Barclays filed the motion (download PDF) on Friday in U.S. Bankruptcy Court for the Southern District of New York, seeking to exclude 179 Lehman contracts that it said were mistakenly included in the asset purchase agreement. The firm — Cleary Gottlieb Steen & Hamilton LLP — said in the motion that one of its first-year law associates had unknowingly added the contracts when reformatting a spreadsheet in Excel.

Cut-Paste Wealth Destruction! :)

Actually, many of the problems with Excel aren’t really with the product; they’re with how it’s used.  “When the only tool you have is a hammer, everything looks like a nail.”  Ditto when the only tools you are comfortable with for dealing with numbers are Excel and Calc.exe, and it takes two weeks (or more) for the same report to be built using a “reporting tool” by the IT team…

Chris has some great points that I wholeheartedly agree with.  In the past, I have been responsible for promoting the view of a “single source of total knowledge” or “one view” of the organization.  Excel doesn’t fit into this picture as a storage mechanism, but it can be the UI, analysis, modeling, and calculation engine that is supported by a central repository stored in the SQL Server cloud (or wherever you may decide to store your data).  Not knowing the details of Project Gemini, I hope that it continues with this theme of server-based storage and client-based analysis, and doesn’t go the way of the Coleco Gemini.  I also hope that there are options for clients who don’t adopt the latest technologies.

Spreadsheets have long been one of the most popular ways for corporate users to store and analyze data. But over the past few years, they have played an increasing role in data breaches because workers are apt to store them unsecured on laptops. In addition, hackers have actively tried to exploit vulnerabilities in Excel.

Since I’m a DBA at heart, I’m not comfortable seeing thousands of silos of information, each of varying accuracy, multiplying and dividing around a company. With a DBA mindset, it is all about control, security, maintainability, and performance of your data.

This kind of desktop, DIY BI is in a way similar to illegal drugs: there are always some people that want it, a certain number of them are always going to do it even though they know they shouldn't, so you've got two choices - either legalise it and then hope to control it, as with Gemini, or throw all your efforts into outlawing it.

Chris Webb's BI Blog: Last thoughts on Gemini for the moment

Project Gemini – Microsoft’s Brilliant Trojan Horse

The concept of transferring some of the calculations and aggregations to the client does make sense.  My laptop is more powerful than many of the five-year-old servers in use at some of the client sites I work in.  It would be great to be able to quickly build models without delving into Business Intelligence Development Studio, SSIS, SSRS & SSMS, or asking a developer.  There’s not much faster than memory on a PC, so in-memory processing sounds great to me.  I just hope there’s a way to push out a lockdown mechanism, for when that laptop and its information disappear from the company.

I hope they include a connector to perform calcs inside the GPU too.

I like what I see so far with some of the “value-adds” coming out of MS Downloads for Excel like the data mining tools, though in the wrong hands (or even worse, the right hands) the information could be very misleading and lead to disastrous results.  From a marketing and adoption perspective, does it make sense to sell the idea of distributing massive amounts of data down to a client PC?  I’m still waiting on a good forms-based interface to input data from Excel directly into a data repository.  Sort of an InfoPath merged with Excel, without the need to install InfoPath, and with the “always-on” save features from OneNote.  Sure, there are web forms and web services.  But what about just Excel to SQL?
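As a rough illustration of what “Excel to SQL” could mean, here is a minimal Python sketch. Everything in it is an assumption made for the example: the sheet is taken to be exported as CSV, the `sales` table and its columns are made up, and an in-memory SQLite database stands in for whatever central repository you actually use. The point is only that rows get validated on the way in, and the bad ones are handed back instead of silently stored.

```python
import csv
import io
import sqlite3

# Hypothetical sheet contents, as they might look after a "Save As CSV" in Excel.
SHEET_CSV = """region,month,revenue
East,2008-10,1200.50
West,2008-10,980.00
East,2008-11,not a number
"""


def load_sheet(conn, csv_text):
    """Insert the valid rows into the (made-up) sales table; return the rejects."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "region TEXT NOT NULL, month TEXT NOT NULL, revenue REAL NOT NULL)"
    )
    rejected = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 of the file is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        try:
            revenue = float(row["revenue"])
        except (TypeError, ValueError):
            # Keep the bad row so someone can fix it at the source.
            rejected.append((line_no, row))
            continue
        conn.execute(
            "INSERT INTO sales (region, month, revenue) VALUES (?, ?, ?)",
            (row["region"], row["month"], revenue),
        )
    conn.commit()
    return rejected


# SQLite in memory is only a stand-in for a real central database.
conn = sqlite3.connect(":memory:")
rejected = load_sheet(conn, SHEET_CSV)
print("rejected rows:", rejected)
```

A real version would need authentication, typed columns for dates, and a friendlier way to report rejects back to the person who owns the sheet.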

Without proper governance and understanding of the technology, publishing to SharePoint can still lead us to Enterprise Spreadmart solutions and IT maintenance nightmares.

In my opinion, rather than sheets of 20 million rows of raw data crunched and stored in a spreadsheet on a laptop, file share, or document store, there should be sheets (or something else?) of results available with the data being stored, crunched, and transformed in a central, secure, redundant place (“THE CLOUD?”). I hope that this is the approach Project Gemini will provide.  Magic?
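To make the “sheets of results” idea concrete, here is a small Python sketch under stated assumptions: the `sales` table, its columns, and the sample rows are invented for the example, and an in-memory SQLite database stands in for the central store. It says nothing about how Gemini itself works. The aggregation runs where the data lives, and only the summary rows travel back to the client.

```python
import sqlite3


def build_central_store():
    """Create a stand-in central store with a few made-up detail rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, month TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("East", "2008-10", 1200.0),
            ("East", "2008-11", 800.0),
            ("West", "2008-10", 500.0),
            ("West", "2008-11", 700.0),
        ],
    )
    conn.commit()
    return conn


def revenue_by_region(conn):
    """Aggregate inside the database; return only one row per region."""
    cur = conn.execute(
        "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


conn = build_central_store()
results = revenue_by_region(conn)
# These few result rows are what would land in the sheet, not the raw detail.
for region, total in results:
    print(region, total)
```

With 20 million detail rows instead of four, the client would still receive only one row per region, which is the whole appeal.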

One truth, shared by many, common to one. 

Metadata, semantic web technologies and, yes, Gemini again

Gemini IS Analysis Services

Gemini is Inevitability
