Wednesday, July 30, 2008

SQL 2005 Mirroring - Automatic Failover « Brad Marsh’s Weblog

 

This post is a follow-on from a previous post, 'Complete Guide to SQL 2005 mirroring using the GUI'.

SQL 2005 Mirroring - Automatic Failover « Brad Marsh’s Weblog

Removing Duplicate Records using SQL Server 2005 | Servers and Storage | TechRepublic.com

 

Duplicate Records

Duplicate records can occur in numerous ways, such as loading source files too many times, keying the same data more than once, or simply bad database coding. Having a primary key on your table (and you should always have one) will aid in the removal of duplicate records, but even with a primary key it is never a fun task to be handed.
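
The article covers the details, but one common SQL Server 2005 pattern is to number the rows within each duplicate group with ROW_NUMBER() and delete everything past the first row; a minimal sketch, assuming a made-up dbo.Contacts table keyed by ContactID:

;WITH Dupes AS
(
    -- Rows sharing FirstName/LastName/Email are treated as duplicates
    SELECT ROW_NUMBER() OVER (PARTITION BY FirstName, LastName, Email
                              ORDER BY ContactID) AS RowNum
    FROM dbo.Contacts
)
DELETE FROM Dupes      -- deleting from the CTE removes rows from dbo.Contacts
WHERE RowNum > 1;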

Removing Duplicate Records using SQL Server 2005 | Servers and Storage | TechRepublic.com

Tuesday, July 29, 2008

T-SQL 2005: One-to-many comma separated

Efficient string concatenation in SQL?

We have two tables in a one-to-many relationship. How can we fetch a field from the parent table along with a second field containing its children as a comma-separated list? In SQL Server 2000 we had to use a user-defined function, but in SQL Server 2005 the FOR XML PATH feature makes it a lot easier, and the performance of the string concatenation is impressive.

SELECT CustomerID,
       (SELECT CAST(OrderID AS VARCHAR(MAX)) + ',' AS [text()]
        FROM dbo.Orders AS O
        WHERE O.CustomerID = C.CustomerID
        ORDER BY OrderID
        FOR XML PATH('')) AS Orders
FROM dbo.Customers AS C;
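
One small tweak, not part of the original snippet: if the trailing comma bothers you, put the separator in front of each value and strip the first character with STUFF():

SELECT CustomerID,
       STUFF((SELECT ',' + CAST(OrderID AS VARCHAR(MAX)) AS [text()]
              FROM dbo.Orders AS O
              WHERE O.CustomerID = C.CustomerID
              ORDER BY OrderID
              FOR XML PATH('')), 1, 1, '') AS Orders   -- remove the leading comma
FROM dbo.Customers AS C;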



T-SQL 2005: One-to-many comma separated

Monday, July 28, 2008

BizTalk 2006 Comparison

A low-cost alternative to BizTalk?  Sounds useful…

Generally we consider BizTalk to be over-engineered and awkward to use. It's well placed to solve your problem and will ultimately deliver a reliable solution; however, development is complicated (requiring specialist personnel), debugging is difficult, and it can't easily be used straight out of the box. Additionally, relative to SmartsIntegrator, BizTalk is very expensive!

BizTalk 2006 Comparison

Microsoft OLAP by Mosha Pasumansky : Analyze MDX with MDX Studio

A best-practice analysis tool for MDX queries is now available in MDX Studio.

One last thing to mention: best practices are exactly that, best practices. It means that most of the time you will be better off using them. However, they are not absolute rules to be followed always; sometimes there are exceptions. AS is a complex, feature-rich product, and the MDX engine is probably the most sophisticated piece in it. Sometimes features interact in such a way that some best practices might give worse results. For example, one piece of advice MDX Studio will give when it sees Filter(CrossJoin(…)) is to use CrossJoin(Filter(…)) instead. It is common sense to reduce the size of the set before crossjoin'ing. But not only is this not always possible, there are also rare cases where it would result in worse performance. So while you are encouraged to follow the best practices, always use them as a guide, and measure before and after applying the rules to make sure things do improve.

MDX Studio 0.4.0 is released with desktop and online versions simultaneously.

Desktop version can be downloaded from: http://cid-74f04d1ea28ece4e.skydrive.live.com/browse.aspx/MDXStudio/v0.4.0

Online version can be accessed at: http://mdx.mosha.com

As usual – please post your comments, suggestions, bug reports etc at MDX Studio forum.

Microsoft OLAP by Mosha Pasumansky : Analyze MDX with MDX Studio

Lazy developer – Powershell for SSIS

I’m sure this could be applied to a package in order to build a custom solution for SSIS ETL against multiple tables, without manually creating ETL for each table.

Not to be one to admit defeat (and being too lazy to edit 120 columns by hand) I pulled together some C# to work with the interfaces that I need and then exposed this as PowerShell cmdlets. This both simplified the script and got around the issue of PowerShell not casting to the interfaces I needed.

The following script opens a package, grabs a reference to the Columns property of a given CSV connection, loops through the columns changing all the data types, and finally saves the package.

# Load the powerSSIS snap-in and open the package
Add-PSSnapin powerSSIS

$pkg = Get-SSISPackage "C:\temp\Test Package.dtsx"

# Grab the Columns collection of the flat file connection manager
$fileCon = $pkg.Connections["CSV file"]
$col = $fileCon.Properties["Columns"].GetValue($fileCon)

# Rename the generically named columns and switch them to numeric(18,5)
for ($i = 4; $i -lt $col.Count; $i++)
{
    $name = Get-SSISFlatFileColumnName $pkg "CSV File" $i
    if ($name.StartsWith("Column"))
    {
        Set-SSISFlatFileColumnName $pkg "CSV File" $i "Mth$($i-2)"
        $c = Get-SSISFlatFileColumn $pkg "CSV File" $i
        $c.DataType = [Microsoft.SqlServer.Dts.Runtime.Wrapper.DataType]::DT_NUMERIC
        $c.DataPrecision = 18
        $c.DataScale = 5
    }
}

# Save the modified package back to disk
Set-SSISPackage $pkg "C:\temp\Test Package.dtsx"




Darren Gosbell [MVP] - Random Procrastination

Tuesday, July 22, 2008

Andrew Fryer's Blog : Microsoft SQL Server Community Utilities

 

Microsoft SQL Server Community Utilities

As well as the independent SQL communities, there is also a Microsoft-run one, the Microsoft SQL Server Community Worldwide (you can tell it's the Microsoft one by the catchy title). They do tend to assume 'worldwide' means the United States, or even the Seattle area, so you can drop in to Redmond for SQL school. However, don't let that put you off; there is some really good stuff there, including a list of utility scripts here, as long as your arm, for probing the dark corners of your installations:

Andrew Fryer's Blog : Microsoft SQL Server Community Utilities

Charlie Maitland’s Blog

Charlie has a huge number of links on BI, Axapta & Dynamics.

Charlie Maitland’s Blog

Jason Morales' Microsoft BI Update

 

Enterprise & Standard Edition Feature Comparisons:

- Microsoft SQL Server 2008

- Microsoft SQL Server 2005

Business Intelligence features of SQL Server Enterprise Edition:

(that are not in the Standard Edition)

Advanced Analytics

- Account intelligence

- Linked measures and dimensions

- Perspectives

- Semiadditive measures

- Writeback dimensions

Scalability and Performance

- Proactive caching

- Partitioned cubes and distributed partitioned cubes

- Auto parallel partition processing

Integration Services

- Data mining query transformation

- Data mining model training destination adapter

- Fuzzy Grouping transformation

- Fuzzy Lookup transformation

- Term Extraction transformation

- Dimension processing destination adapter

- Partition processing destination adapter

Reporting Services

- Data-driven subscriptions

- Report scale-out deployment

- Infinite clickthrough in ad-hoc reports

- Scale-out operational report configuration**

Data Warehousing

- Data compression**

- Star join query optimizations**

- Change data capture (CDC)**

- Scalable data marts and reporting**

Data Mining

- Parallelism for model processing & prediction

- Advanced Data Mining algorithms with algorithm plug-in API

- Advanced configuration and tuning options for data mining algorithms

- Time series capabilities**

** denotes a SQL Server 2008 feature

Introduction to New Data Warehouse Scalability Features in SQL Server 2008

Jason Morales' Microsoft BI Update

Tuesday, July 15, 2008

Mark Malakanov poorly maintained blog: July 2007

One workaround for the notoriously slow insert times from SSIS into Oracle.

SSIS 2005: How to avoid autocommit when loading into an OLE DB destination.

An example using the Oracle OLE DB driver on the destination side.

I have heard from my friend that he is experiencing very slow loads into an Oracle database. He was using SSIS 2005, which is funny, because Oracle Warehouse Builder is available. Anyway.
The cause of SSIS's slowness when it loads into a non-Microsoft OLE DB target appeared to be a commit after every insert. Since autocommit is set to ON by the OLE DB standard, the driver propagates this mode to the database. The only way to turn it off is to switch the OLE DB data source into transactional mode by issuing a BeginTransaction() call.

Mark Malakanov poorly maintained blog: July 2007

Error Handling in SSIS - SQL Server Central

Borrowing liberally from Jack Corbett's scripts will get you a great custom error handler.  No more creating and maintaining error tables for SSIS data flows!

Anyone who has used SSIS error handling has been frustrated that it returns an Error Code with no description. I decided that I needed a solution. Googling the problem led me to this blog by Jamie Thomson, which had a link to a Script Component that gets the Error Description. Excellent! While this was a good start, I also wanted to see the data that caused the error. I started by including the data and outputting it to a text file, but, being a database pro, this did not sit well with me. I started looking for a solution that would allow me to store the data in a table and be usable in each data flow task, and, once again, the internet provided a solution: this article by jwelch on agilebi.com, which I highly recommend reading.

My solution uses the Input0_ProcessInputRow method and reflection, as shown in jwelch's article, to loop through the columns, building an XML string of name-value pairs which I insert into a table. I can then query the table to see the errors and the data, and verify that I have fixed the errors from earlier loads.
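
As a rough illustration of the storage side only (the table and element names below are hypothetical, not the ones from the article), an XML error column can be shredded back into name-value pairs like this:

-- Hypothetical error-logging table; names are illustrative only
CREATE TABLE dbo.SSISRowErrors
(
    ErrorID          INT IDENTITY(1,1) PRIMARY KEY,
    PackageName      VARCHAR(128),
    ErrorCode        INT,
    ErrorDescription VARCHAR(4000),
    ErrorRow         XML,      -- e.g. <row><col name="..." value="..."/></row>
    LoadDate         DATETIME DEFAULT GETDATE()
);

-- Shred the name-value pairs back out to inspect the failing data
SELECT  e.ErrorID,
        e.ErrorDescription,
        c.value('@name',  'VARCHAR(128)')  AS ColumnName,
        c.value('@value', 'VARCHAR(4000)') AS ColumnValue
FROM    dbo.SSISRowErrors AS e
CROSS APPLY e.ErrorRow.nodes('/row/col') AS x(c);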

The full solution is in the article:

Error Handling in SSIS - SQL Server Central

Call stored procedure inside SSIS Script task

A useful way to build custom data flow scripts, with performance benefits against Oracle and access to other custom data providers and APIs.

I would like to call a stored procedure inside a Script task.
Any example out there?

Call stored procedure inside SSIS Script task

BI Thoughts and Theories : Address Columns Generically In a Script Component

Also useful when building a custom error script in SSIS.

Address Columns Generically In a Script Component

When writing script components in SSIS, it's often useful to be able to process the columns without knowing exactly what they are. For example, you may want to loop through all the columns to check for a conditional, like NULL values in the columns. Or you may want to take a set of columns and concatenate them into a single string for output, or output each one as an XML element. In any of these cases, you don't necessarily care about the individual column details.

One option is to use Reflection. This is fairly easy to set up, but is not the fastest performing option. To use Reflection, add this to the top of the Script class:

BI Thoughts and Theories : Address Columns Generically In a Script Component

BI Thoughts and Theories : XML Transformations Part 2

After a bit of frustration, I got this to work with SqlClient rather than an OLE DB client by parsing the connection string and doing a string replace to remove the Provider and other invalid information.

I'm using something similar to create a custom error handler that exports to an XML column.  No more error tables for each dimension table.

The script is very similar to the one from the previous article, though it is a bit simpler. The XML values are prepared by iterating through the Input Collection metadata, and using the Reflection classes to read the Row object dynamically. The resulting value is put into the XML column on the output.

The OLEDB Destination simply maps the XML column to an XML column on a database table.

This is a good alternate approach to the previous one, particularly if you need to send the XML to a database instead of a flat file.

BI Thoughts and Theories : XML Transformations Part 2

Thursday, July 10, 2008

Magic Quadrant for Business Intelligence Platforms, 2008

Gartner has released the 2008 Magic Quadrant for BI Platforms.

Looks like BI has come of age, according to Gartner.  We have leveled off and are coasting rather than innovating.

Note that for the 2008 Magic Quadrant, given the maturity of the market, the Innovation criterion was not rated separately. Instead, it was factored into the Market Understanding and Offering (Product) Strategy criteria.

Magic Quadrant for Business Intelligence Platforms, 2008

SAS is the visionary leader, not much surprise here.  Microsoft and Cognos are the overall leaders.  Panorama, arcplan, and newcomer Board International are all niche players. 

Truly, the last two years have been a period of consolidation. The next few years should see another round of innovation and new product offerings, if the cycles continue.

One interesting move was IBM bringing Cognos back to the mothership: technology that came out of IBM, now returning to the fold.

The one player that's missing from this puzzle?  Google.

Companies such as Panorama are now promoting integration with Google technologies, though this article from 2006 seems to promote the idea of Google itself as a BI tool.  When I think of BI in terms of the quadrant I think of reporting rather than searching, so Google really doesn't fit into this space just yet - unless you want to count Google Analytics.

It will be interesting to see what the next year brings, with the launch of SQL 2008, PerformancePoint improvements, Microsoft's MDM tools, the integration of Cognos and Applix with IBM software, and whatever happens with SAP and their BusinessObjects acquisition.

My first pick would be an easy-to-use tool that brings relational, unstructured and multidimensional cube data together under one umbrella with point & click or easily maintainable scripts for ETL consolidation and data scrubbing, and a place to put planning numbers and napkin-style forecasts.  Preferably with some fancy charts, graphs, dashboards, and output to PDF, Powerpoint & Excel (and Visio?).  Without any installation or security hassles, java virtual machines, memory leaks, bloatware, or incompatibilities with other software.

What about using the OCZ NIA as an interface and the NVidia Tesla GPU for calculations?  Why not add a little e-Ink to help save some trees?

Too much to ask?

Microsoft Certified Master Product Overviews and Requirements

There are 3 new certifications out of Microsoft... the "Masters" series.

Microsoft Certified Master Product Overviews and Requirements

Published: June 10, 2008

IT professionals who hold Microsoft Certified Master technology-based certifications are recognized as holding the highest level of technical certification for Microsoft server products. Find out if you have what it takes to become a Microsoft Certified Master by reading the program overviews and prerequisites.

Microsoft Certified Master Product Overviews and Requirements

Tuesday, July 08, 2008

Deprecated Database Engine Features in SQL Server 2008

A few of the more common things that will probably break your SQL 2008 upgrade if you have not run the SQL 2005 Upgrade Advisor and corrected them (a couple of before/after rewrites follow the list):

Database compatibility < 80

*= & =* joins

SET ROWCOUNT for INSERT, UPDATE, DELETE

RAISERROR syntax
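
For illustration only (the table and column names are made up), the old-style outer join and the SET ROWCOUNT pattern rewrite roughly like this:

-- Old-style outer join (deprecated; requires compatibility level 80)
SELECT c.CustomerID, o.OrderID
FROM   dbo.Customers c, dbo.Orders o
WHERE  c.CustomerID *= o.CustomerID;

-- ANSI equivalent
SELECT c.CustomerID, o.OrderID
FROM   dbo.Customers c
LEFT OUTER JOIN dbo.Orders o ON o.CustomerID = c.CustomerID;

-- SET ROWCOUNT limiting DML (deprecated for INSERT/UPDATE/DELETE)
SET ROWCOUNT 100;
DELETE FROM dbo.StagingTable WHERE IsProcessed = 1;
SET ROWCOUNT 0;

-- TOP equivalent
DELETE TOP (100) FROM dbo.StagingTable WHERE IsProcessed = 1;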

More here...

Deprecated Database Engine Features in SQL Server 2008

Changes that may break a SQL 2000 to 2005 upgrade:

http://msdn.microsoft.com/en-us/library/bb510680(SQL.100).aspx

Thursday, July 03, 2008

Missing Date Ranges- the Sequel

Time intelligence, a missing component of SQL, becomes available with scripts like these.
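
The linked scripts aren't reproduced here, but the general idea of finding gaps against a generated calendar can be sketched along these lines (dbo.Sales and its SaleDate column are made-up names):

-- Build a calendar for the period of interest, then list the dates
-- that have no matching rows (a minimal sketch, not the article's script)
WITH Calendar (CalDate) AS
(
    SELECT CAST('20080101' AS DATETIME)
    UNION ALL
    SELECT DATEADD(DAY, 1, CalDate)
    FROM Calendar
    WHERE CalDate < '20080731'
)
SELECT c.CalDate AS MissingDate
FROM   Calendar AS c
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Sales AS s WHERE s.SaleDate = c.CalDate)
OPTION (MAXRECURSION 0);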

Missing Date Ranges- the Sequel

Introduction to the Transaction Log - SQL Server Central

Excellent article about transaction logs in SQL. My best tip: always back up the transaction log and clear unused entries. If your log hasn't been backed up before, back it up twice to be able to shrink it. Never use the TRUNCATE option; it is much safer to use the backup facility to maintain the log.
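
As a rough sketch of that routine (the database name, file path, and logical log file name below are placeholders):

-- Back up the log so the inactive portion can be reused
BACKUP LOG AdventureWorks
TO DISK = 'D:\Backups\AdventureWorks_log.trn';

-- If the log file itself has grown huge, shrink it after the backup
-- (look up the logical file name in sys.database_files)
USE AdventureWorks;
DBCC SHRINKFILE (AdventureWorks_Log, 512);   -- target size in MB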

Introduction to the Transaction Log - SQL Server Central