Cannot View Data Mining Model in BIDS – function does not exist

I’d been running some Naive Bayes Data Mining models without problems as part of initiating a Data Mining exercise, so it was time to move on and cut the data some different ways. So I set up a Decision Tree model and it processed fine, but when I tried to view it a message box appeared telling me it wasn’t going to co-operate:

The tree graph cannot be created because of the following error:

‘Query (1,6) The ‘[System].[Microsoft].[AnalysisServices].[System].[DataMining].[DecisionTrees].[GetTreeScores]’ function does not exist.’.

Fortunately someone had hit this before, as the solution is rather obscure. The install I am working against is non-standard, being split across two drives. What had happened is that the path for the Data Mining DLLs set up in the install process didn’t actually match where they were placed.

So when I looked under the assembly location (SSMS > AS Server > Assemblies > System > Properties), the Source Path referenced a DLL that didn’t actually exist, so it appears this incorrect path does not raise an error when the server starts. To fix it, I located where the DLL really was, then updated the config files where this path is stored (System.0.asm.xml and VBAMDX.0.asm.xml) to point to that path.

After a restart of the server, the models reprocessed and I could happily view the output!
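Since the server starts happily with a bad path, it can be worth scanning the config files for DLL paths that don’t exist before restarting. Below is a minimal sketch in Python, not an official tool. It assumes nothing about the XML layout of the .asm.xml files (I haven’t documented their schema here) and simply treats the file as text, looking for anything shaped like a Windows path ending in .dll. The function names and the UTF-8 encoding are my own assumptions.

```python
import os
import re
import sys

# Windows-style absolute path ending in .dll, e.g. D:\some\dir\file.dll.
# This is a text-level guess; it does not parse the XML structure.
DLL_PATH = re.compile(r'[A-Za-z]:\\[^<>:"|?*\r\n]*?\.dll', re.IGNORECASE)


def find_missing_dlls(config_text, exists=os.path.exists):
    """Return the DLL paths mentioned in config_text that are not on disk."""
    paths = DLL_PATH.findall(config_text)
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return [p for p in dict.fromkeys(paths) if not exists(p)]


def check_config_file(config_path):
    """Read one config file (encoding assumed to be UTF-8) and check it."""
    with open(config_path, encoding="utf-8", errors="replace") as fh:
        return find_missing_dlls(fh.read())


if __name__ == "__main__":
    # Pass the full paths to System.0.asm.xml and VBAMDX.0.asm.xml as arguments.
    for config_path in sys.argv[1:]:
        for missing in check_config_file(config_path):
            print(f"{config_path}: missing {missing}")
```

Run it with the two config files as arguments; any line it prints is a path worth correcting before the restart.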

Kilimanjaro, Projects Gemini and Madison Webcast

For those who haven’t seen much of Project Gemini but have heard the buzz, this TechNet Webcast, An Early look at SQL Server ‘Kilimanjaro’ and project ‘Madison’, will give you a good insight. It also has some features on reusable Reporting Services components which look very impressive and info on Project Madison, which provides scalability features. Registration as usual is a pain, and forget trying to use the site in any browser other than IE – I wish Microsoft would make their content easier to access.

Anyway, onto the webcast – about the first quarter of the webcast is the usual generic roadmap blurb, but then the presenter gets into the real meat of Gemini – an Excel based ‘in memory’ analysis tool that allows joining between entities without having to know about such things, superfast analytics (pivoting, calculation, charts etc.) and then being able to publish to Sharepoint. From an OLAP analysis point of view, the Pivot Tables also have slicers (effectively table-wide filters) displayed in the spreadsheet, and it would be good if that made it into Pivot Tables generally in the next release of Office. It looks like an incredible tool and very easy to use – and may be a powerful step towards the realisation of the ‘BI for the masses’ vision. The presenter did let slip one weakness though – much has been made of the 100 million rows of data demo, but that data still has to be loaded into memory first, and that will still take significant time. I also suspect that Gemini’s success will depend on how much it relies on good data structures being in place in an organisation to support it. The Data Warehouse is going to remain the core part of any successful BI delivery.

The next component of interest was the reusable Reporting Services components – there is the concept of a library of components that can be built (e.g. standard charts, logos, gauges etc.) and then dropped into any report, either by a developer or a user in Report Builder 2.0. What really grabbed my attention is that these components are version aware – i.e. if the library version of the component is updated, then when you reopen the report in design mode it will let you know and give you the option to update. Again this points to ‘BI for the masses’ as you can have developers create some great components which any user can then drop into their home-grown reports. Plus, as any developer knows, there’s a lot of repetition and any options for code re-use are always appreciated.
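The version check can be pictured roughly like this. It is only an illustration in Python with made-up names (ComponentRef, outdated_components, integer version numbers), not how Reporting Services actually implements it: the report remembers which version of each component it used, and on opening it compares that against the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentRef:
    """A component as used in a report: its name and the version copied in."""
    name: str
    version: int


def outdated_components(report_refs, library_versions):
    """Return (name, used_version, latest_version) for each stale component."""
    stale = []
    for ref in report_refs:
        latest = library_versions.get(ref.name)
        # Components no longer in the library are left alone.
        if latest is not None and latest > ref.version:
            stale.append((ref.name, ref.version, latest))
    return stale
```

Anything returned by the check is what the designer would prompt you about, leaving the choice to update with the report author.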

Finally, Project Madison was covered – and seems more about scalability up to multi-terabyte warehouses. It was a bit infrastructure focused for me so most of it passed me by, but clearly Microsoft are stepping up to try and address the market perception that they can’t scale.

Anyway, all of this will be released in late 2010 as Kilimanjaro – an interim release of SQL Server.

Microsoft’s secret forecasting tool – the Office Suite

Last night I attended an IAPA presentation on basic forecasting concepts and the tools used, presented by the ever interesting Eugene Dubossarsky (of Presciient, an analytics consultancy). I will skip over the forecasting content because, for the Microsoft BI community, the interesting part is which tool he used for most basic forecasting activities. It was Excel. Then, when he needed to do more advanced work, he used – Excel. Only when he needed to do trickier stuff with larger amounts of data did he pull in a more heavyweight tool – Access.

That’s right – the office suite covers the majority of forecasters’ needs. SQL Server and Analysis Services didn’t get a look in until the really heavyweight analytics processes began. For his purposes however, Eugene much prefers R, an open source stats program that is free, very powerful and now a serious competitor to SAS – much to their annoyance. Microsoft are rumoured to be talking to the people behind R, and an acquisition would make sense for both sides – R is not user friendly, which Microsoft could provide help with – and adding the capabilities of R would allow Microsoft to take a slug at SAS’s BI market.

So, this shows that most users still aren’t fully aware of Excel’s capabilities, let alone using them – otherwise they wouldn’t be paying analytics consultants to use it for them. Microsoft are always pushing Excel further, so now I’ll cover two features of Excel that the power users may not be aware of. It’s easy to forget sometimes that the 2007 Office suite wasn’t just a new, pretty interface – it also added huge BI capabilities.

The Data Mining Add-In for Excel (download for SQL Server 2008 or 2005)

This Add-In allows you to leverage the Data Mining capabilities of Analysis Services through Excel. It allows you to use Excel as the front end for creating and working with Data Mining models that exist on your server. However, what really makes it interesting for Excel users is that it allows you to perform Data Mining on your spreadsheet data.

There is a Virtual Lab here explaining and demonstrating their use.

Project Gemini

This feature is slated for the next release of Excel, and is an in-memory tool for analysing large amounts of data in an OLAP style, but without all the fiddly data modelling normally required. It is a clear slug at other players in the in-memory market, such as QlikTech. The models created will also be able to be ported back to SSAS with minimum effort. For more details read this commentary from the OLAP Report.

Microsoft has one of the most powerful BI tools in the world in Excel; users just need to be made aware!

SSAS Training Resource

I have added a link to Craig Utley’s excellent SSAS training video resource site LearnMicrosoftBI.com, which contains training videos on a variety of subjects in SSAS – dimensional modelling, Actions and, the one I found most useful, a video explaining the tricky but critical subject of Attribute Relationships (Video SSAS 109). Recommended for anyone starting out in SSAS or needing concepts clarified.

Registration is required to download the videos (not sure why) – but it seems to generate no spam so not a big issue, and the content is very high quality for free content.