Datazen first client test!

I got to use Datazen for the first time in anger with a client this week, and my experience was a bit of a mixed bag. There are elements of it which are neat, but also a fair few niggles.

Things to love

It’s pretty. The simple visualisations mean it is easy to create a nice looking, uncluttered dashboard with minimal fuss and tweaking. The responsive design means cross device functionality is pretty good and looks nice. It’s also quick to build and change content.

Quick learning tips

First up, let’s get some practical learnings shared:

  1. Use views or stored procs behind your data if hitting a SQL source. Datazen has no query editor (just a big text box) and doesn’t always handle parameters gracefully. Plus debugging is a pain, as error messages are often less than helpful (e.g. “Data Preview Failed”).
  2. Set report-friendly field names in your query, as you can’t always manage them in the designer – sometimes you can, sometimes you can’t.
  3. Selecting the ‘All’ option on a Selection List Navigator sends back '' (an empty string) as the parameter value to the query, so handle that rather than NULL.
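Tip 3 in practice: below is a minimal sketch of an ‘All’-friendly filter, using a hypothetical Sales table and Region parameter (neither is from the client project). I’ve used SQLite via Python purely so the example is runnable; the same predicate works in a SQL Server view or stored proc behind Datazen.

```python
import sqlite3

# Hypothetical sample data standing in for a real Datazen data source
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Region TEXT, Amount INT)")
conn.executemany("INSERT INTO Sales VALUES (?, ?)",
                 [("NSW", 10), ("VIC", 20), ("QLD", 30)])

def get_sales(region):
    # Datazen's 'All' option arrives as '' (empty string), not NULL,
    # so treat the empty string as "no filter"
    sql = "SELECT Region, Amount FROM Sales WHERE (? = '' OR Region = ?)"
    return conn.execute(sql, (region, region)).fetchall()

print(get_sales(""))     # 'All' selected: every row comes back
print(get_sales("NSW"))  # a specific region: filtered as normal
```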

Now, some major drawbacks:

  1. Consistency of cross-platform behaviour is not great. I found some drillthroughs didn’t work on iOS or Android. Windows 8 seems to be the go-to client. It’s not fatal, but for fussy clients it’s a hard sell that this cross-platform tool doesn’t work as expected.
  2. The Win 7 Publisher app is unstable, buggy and seems to have features missing – such as proper configuration for report drillthrough. It’s only been around a few weeks so it’s forgivable, but if you have to use it seriously, make sure you have a Win 8 client somewhere to do your development work on.
  3. The charting is actually quite limited. There’s no stacked bar, for example. Line charts can only be by time. Labelling and colours are pretty hard to control, often rendering components useless. A great example is the category chart (bar chart) – the renderer will sometimes decide not to display category labels, which means you just have a nice picture of some bars with no context as to what each one is, like this:
Could be cat population by state for all I know

Finally some irritations:

These are some of the things that got annoying as I used the product – not an exhaustive list – and small enough I’d expect them to be fixed relatively soon.

  1. You cannot update columns on a grid component if the underlying column names change – you have to rebuild the component (a small task, but annoying during development)
  2. You cannot set the Low/Neutral/High ranges for gauge columns on indicator grids so they match settings for other gauges
  3. You cannot align numbers – they are left-aligned, which is not good dataviz practice
  4. There is no handling for outliers on heatmaps so one extreme value will screw your shading
  5. You can’t cascade drillthrough to a level below
  6. The data preview after creating a data view has no scroll bar, so if there are a lot of fields you can’t see them all
  7. There are maps provided but you have to work out how they are keyed yourself so you can connect your data (to be addressed in a future post)
  8. You can’t “oversize” the canvas so phone users can scroll down.
  9. Nobody’s using it – or at least talking about it – so help is hard to find.

A lot of irritation boils down to “I want to set that but I can’t”. This I’m sure is partly design, partly product maturity.

My takeaway

After a week with the product I get a real sense that it’s not even a v1 product yet. Maybe v0.9. There are lots of niggles in the product – and not just at the back end where users can’t see them. I could tolerate the back end weaknesses if the end user experience was great, but there are holes there too. Still, there’s a lot of good that can be done with it. It’ll be interesting to see how it fares given Power BI has such a huge feature overlap.


IAPA 2014 Salary Survey

The IAPA salary survey came out a couple of months back, and though it is Analytics focused it has some interesting results for those of us in the BI world. My key takeaways follow.

From a purely self-interested point of view, Analytics is a well paid profession and it’s getting more so. Further, recruiters are reporting that finding people is getting harder, which indicates the talent pool is not all that deep and has been sucked fairly dry already – something I experience regularly when trying to find BI talent.

If you want a job in the field, you’re best off being in Sydney or Melbourne. There also appears to be a minimum education level of a bachelor’s degree, with most professionals holding a masters or higher. Marketing is one of the biggest employers of analysts.

For those in the field there seems to be a mid career slump in satisfaction (around the ten year mark). Fresh starters are all excited and lifers seem happy too, but somewhere in the middle the enthusiasm fades.

Despite all the market enthusiasm, a significant proportion of respondents reported an ongoing challenge in getting their organisation to value or act on analytics findings – supportive of Eugene Dubossarsky’s claim that businesses invest heavily in vanity analytics so they can claim “me too” rather than to derive real value.

Technical takeaways – for all the noise, Big Data is still a Small Concern and regular-sized analytical problems are prevalent. Excel is the #1 tool used to work with data, and if you are more of an integrator, good SQL skills are king.

Last of all, there still seems to be a heavy focus on social media analytics – despite its dubious value – but it pays better. Something which underscores the vanity analytics claims further.


Agility in BI

Over the last couple of years I have been increasingly exposed to Agile methodologies for delivering BI projects. I even accidentally got myself trained as a Scrum Master once. The topic itself isn’t exactly new (I blogged about it at the end of 2009) but adoption is starting to pick up as recognition grows that it can significantly improve the results of BI projects. What follows are some early stage ramblings on my experience so far as I try to pull together a more coherent story, as I think it’s one that needs to be told and sold.

Agile Principles

For those of you unfamiliar with Agile, it stems from the Agile Manifesto drawn up by some experienced developers who figured there was a better way than waterfall. The aim is to deliver faster, earlier and closer to user expectations. This is achieved by focusing on high value deliverables, delivering them piecemeal in descending order of value and engaging very closely with the business. All sounds pretty sensible so far.

Agile People

There are a large variety of Agile methodologies in play – Scrum, Extreme Programming, Lean Startup… and none of them matter. Or at least, which one you pick doesn’t really matter. What matters more is whether your team does – or doesn’t – have the ability to work in an agile fashion. What I mean by this is that team members are self-starting, self-managing individuals with a low level of ego – for example, they are happy to pitch in for the team and test another developer’s work, rather than go “No, I code C++ and that’s all I will do.” Good Agile teams are made up of members who understand they are in a team, and that a team is not a kit of discrete Lego bricks of skills but a fuzzy stew of ingredients where contributions blend into each other.

Now “the team” may just be your little pod of developers going out on a limb and trying something new. Or, as in a case I worked on last year, it may be a colossal financial services organisation where “the team” also encompassed very senior executives. Agility is a fragile thing, and can just as easily be broken by a bottom-rung developer not pulling their weight as by a senior manager not understanding what the team is doing and trying to influence them to behave in non-agile ways.

Adoption Issues

The take-up of Agile is often fraught with frustration and disappointment that a nirvana of productivity doesn’t arise as soon as the Agile team forms. The reasons behind this are manifold – not least because it takes a while for the team to start working together with the new approach. Most methodologies acknowledge that productivity will initially be worse, not better, until the team has formed, stormed and normed its way to a higher productivity level. Many businesses struggle with the idea that they must let the patient get worse before they can get better.

Something else I have seen frequently is that business engagement is poor – and not for want of trying. Before Agile is attempted, the business will often complain that IT doesn’t involve them enough. Once it is attempted, they then grumble that it involves them too much. This is incredibly frustrating to see, as it underscores the business perception that IT is the servant and they are the master, rather than a valued partner trying to help them achieve their goals.

The Road to Success

I’m still gathering my thoughts on this, but I suspect part of the road to success in doing BI in an Agile way will involve raising red flags for the classic modes of failure (e.g. no business sponsor) in the same way you already do for BI projects generally. Too often I’ve seen Agile fail because it is driven out of a development team wanting to do better but unable to get the support needed to make it happen. The other killer is the “WAgile” trap, where the team tries to operate as an Agile unit under a Waterfall project management approach.

I’m also keen to hear any readers’ war stories in the comments.


Top Six Worst Practices in BI – Vendor Nonsense

A while ago I was pointed at this TDWI white paper – “Top Six Worst Practices in Business Intelligence” – which turned out to be a classic TDWI vendor driven pile of steaming misinformation. Verbatim, here are their claims:

  1. Buying what analysts want without considering the needs of other users
  2. Deploying new BI tools without changing the Excel mindset into a BI platform mindset
  3. Making a BI purchasing decision based on one hot feature, or buying a feature list rather than laying the foundation for a comprehensive BI strategy
  4. Lack of a concrete data quality strategy or treating data quality as an afterthought
  5. Not taking a “mobile-first” approach, or not considering the needs of mobile BI users
  6. Ignoring new data, new sources, and new compliance requirements

Just in case anyone is in danger of believing this, I thought I’d give a rebuttal as I have a headache and am feeling grumpy.

1. Buying what analysts want without considering the needs of other users

The paper makes the somewhat bizarre (and unsupported) claim that:

Most companies make business intelligence (BI) purchasing decisions based on input from just one group of users – the business analysts. This single perspective, however, creates many problems.

I can safely say that in every purchasing decision I’ve ever come across, the BAs’ input has been somewhere between nada and zip. The reality is that it’s driven by corporate policies and whatever legacy systems are already in place, licensed and supported – and once in a blue moon some idiot in management sees Tableau or QlikView and buys it without considering any of the underlying problems that will cause.

There is a grain of truth in this point – any purchasing decision that doesn’t consider end users’ preferred way to use information is doomed to failure. The back end is irrelevant – end users do not care about the Data Warehouse platform, the ETL tool or even the Cube platform you use. Just the front end. And most of the time, that front end is Excel.

2. Deploying new BI tools without changing the Excel mindset into a BI platform mindset

This is a vendor problem, not a user problem: users prefer Excel to their tool. Sorry about that to anyone who isn’t Microsoft. It’s also a little odd, because this point contradicts their first.

3. Making a BI purchasing decision based on one hot feature

Yes. This explains Tableau and QlikView’s popularity. However the solution is not – as the vendor claims – their product. In fact, I’m not even sure why this is on the list. Technical issues are rarely the source of BI project failure, so it doesn’t really matter what product you choose – and I’m sure every vendor in the world will recoil in horror at this uncomfortable truth. What matters are people and data. The tool connecting the two is often inconsequential.

4. Lack of a concrete data quality strategy or treating data quality as an afterthought

This I agree with. Data Quality is a huge pain point.

5. Not taking a “mobile-first” approach

If it’s relevant to your organisation. In my current organisation it is utterly irrelevant. In many projects I’ve worked on it’s been a nice to have that got discarded quickly due to its poor value. If it affects adoption, then it’s relevant and must be considered. If it won’t, it doesn’t need to be thought about.

6. Ignoring new data, new sources, and new compliance requirements

I’m ambivalent about this one. The implication is less about ignoring new data and more about being unable to adapt to it. Rigid, locked-down BI systems rapidly become irrelevant, because BI must change as the business changes. However this is as much a function of people and process as it is of technology.

How about ONE worst practice in Business Intelligence?

Try this instead: “Believing your BI implementation’s success or failure will be impacted by technology more than by the people using it”.

…as a cheeky second practice – “Believing anything vendors tell you”.

Merry Christmas and a Happy New Year!


October Sydney training roundup – MS BI, Cloud, Analytics

The end of the year is closing in fast, but there are still plenty of chances to learn from specialist providers Agile BI, Presciient and, of course, me!

Topics cover the full spread of DW, BI and Analytics so there’s something for every role in the data focused organisation.

Build your Data Warehouse in SQL Server & SSIS with the BI Monkey

Nov 24/25 – Are you about to build your Data Warehouse with Microsoft tools and want to do it right first time?

This course is designed to help a novice understand what is involved in building a Data Warehouse both from a technical architecture and project delivery perspective. It also delivers you basic skills in the tools the Microsoft Business Intelligence suite offers you to do that with.

Get more detail here

Agile BI workshops

Power BI specialist Agile BI brings you product updates on this key new self-service BI technology:

Oct 15 – Power BI workshop – Excel new features for reporting and data analysis – more detail here

Oct 30 – What Every Manager Should Know About Microsoft Cloud, Power BI for Office 365 and SQL Server 2014 – more detail here

Presciient Training

Dr Eugene Dubossarsky shares his deep business and technical expertise across a range of advanced and basic analytics. Full details here, but the key list is:

Dec 9/10 – Predictive analytics and data science for big data

Dec 11/12 – Introduction to R and data visualisation

Dec 16/17 – Data analytics for fraud and anomaly detection, security and forensics

Dec 18/19 – Business analytics and data for beginners



Creating effective date ranges from multiple sources using Window Functions

Sometimes dimension data is managed across multiple tables. Just to complicate things, sometimes this data has independent effective date ranges on each source. So when we try to tie our data together, picking which item of data is effective when is a bit of a challenge.

A picture speaks a thousand lines of blog post, so the picture below spells it out:

Date Ranges from multiple sources

Table A has a set of data with different effective periods. Table B has a set of data for the same attribute with a completely independent set of effective periods. In data terms, it looks like this:

Date ranges from multiple sources sample data

The challenge is to join them together so we get the right combination of attributes effective at the right time, as per the first picture. Now, there is a way to do it through a join with careful selection of start/end dates in a CASE statement and filtering out of records using a WHERE clause. However that has the downfall that it cannot cope with records where there is no crossover of data – so records “1-”, “4-” and “5-” have to be added in later through a separate process.

The alternative is to get the window functions voodoo doll out, and stretch the brain a little so you can do it all in one query.

Step one in this exercise is realising that each table’s start dates could also be end dates (less a day) in the other table, and each table’s end dates could also be start dates (plus a day) in the other table. So we need to UNION the dates from Table A with those from Table B, like so:

SELECT ID, [A Start Date] AS [Start Date]
FROM Table_A
UNION
SELECT ID, [B Start Date]
FROM Table_B
UNION
-- All end dates are start dates + 1 too
SELECT ID, ISNULL(DATEADD(d, 1, [A End Date]), '31 Dec 2099')
FROM Table_A
UNION
SELECT ID, ISNULL(DATEADD(d, 1, [B End Date]), '31 Dec 2099')
FROM Table_B
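As a sanity check, here’s a runnable sketch of that union-of-dates step. The table names, columns and dates are hypothetical stand-ins for the post’s sample data (which lives in the images), and I’ve used SQLite via Python rather than SQL Server just to keep it self-contained – IFNULL and date(…, '+1 day') standing in for ISNULL and DATEADD:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_A (ID INT, A_Start_Date TEXT, A_End_Date TEXT);
    CREATE TABLE Table_B (ID INT, B_Start_Date TEXT, B_End_Date TEXT);
    -- Hypothetical effective periods; NULL end date = open-ended
    INSERT INTO Table_A VALUES (1, '2014-01-01', '2014-03-31'),
                               (1, '2014-04-01', NULL);
    INSERT INTO Table_B VALUES (1, '2014-02-01', '2014-05-31'),
                               (1, '2014-06-01', NULL);
""")

# Every start date from both tables, plus every end date + 1 day
# (open-ended rows get the far-future sentinel, as ISNULL provides in T-SQL)
starts = conn.execute("""
    SELECT ID, A_Start_Date AS Start_Date FROM Table_A
    UNION SELECT ID, B_Start_Date FROM Table_B
    UNION SELECT ID, IFNULL(date(A_End_Date, '+1 day'), '2099-12-31') FROM Table_A
    UNION SELECT ID, IFNULL(date(B_End_Date, '+1 day'), '2099-12-31') FROM Table_B
    ORDER BY Start_Date
""").fetchall()
print(starts)
```

Note that UNION (rather than UNION ALL) also deduplicates for us, so a date that is both a start in one table and an end + 1 in the other appears only once.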

Now, this gives us a full set of every possible start date – which is a starting point. The end result looks like this:

Union Results

We can repeat the same trick for end dates and then do a cartesian join on the two sets to get a combination of every possible start and end date. Now we need some criteria by which to select the right date pair. If we add a DATEDIFF to the result set, it becomes obvious we want to pick the smallest date range:

Crossjoin results with DATEDIFF

A window function gives us the intelligence to pick the right row. So if we apply ROW_NUMBER() over a PARTITION BY Start Date, ordering by End Date, we just have to select the first row of each partition:

The final result

Now we have a complete set of effective date ranges on which to join our attribute tables!
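Pulled together, the whole pattern looks like this end-to-end. This is a sketch rather than the post’s actual query: the table, column and date values are hypothetical stand-ins for the sample data shown in the images, and it runs on SQLite (3.25+ for ROW_NUMBER()) via Python rather than SQL Server, with IFNULL/date() in place of ISNULL/DATEADD:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_A (ID INT, A_Start TEXT, A_End TEXT);
    CREATE TABLE Table_B (ID INT, B_Start TEXT, B_End TEXT);
    -- Hypothetical effective periods; NULL end date = open-ended
    INSERT INTO Table_A VALUES (1, '2014-01-01', '2014-03-31'),
                               (1, '2014-04-01', NULL);
    INSERT INTO Table_B VALUES (1, '2014-02-01', '2014-05-31'),
                               (1, '2014-06-01', NULL);
""")

ranges = conn.execute("""
    WITH Starts AS (
        -- every start date, plus every end date + 1 day
        SELECT ID, A_Start AS D FROM Table_A
        UNION SELECT ID, B_Start FROM Table_B
        UNION SELECT ID, IFNULL(date(A_End, '+1 day'), '2099-12-31') FROM Table_A
        UNION SELECT ID, IFNULL(date(B_End, '+1 day'), '2099-12-31') FROM Table_B
    ),
    Ends AS (
        -- the mirror-image trick: every end date, plus every start date - 1 day
        SELECT ID, IFNULL(A_End, '2099-12-31') AS D FROM Table_A
        UNION SELECT ID, IFNULL(B_End, '2099-12-31') FROM Table_B
        UNION SELECT ID, date(A_Start, '-1 day') FROM Table_A
        UNION SELECT ID, date(B_Start, '-1 day') FROM Table_B
    ),
    Pairs AS (
        -- cartesian join of candidate starts and ends; rank each start's
        -- candidate ends so the smallest valid range gets row number 1
        SELECT s.ID, s.D AS Start_Date, e.D AS End_Date,
               ROW_NUMBER() OVER (PARTITION BY s.ID, s.D ORDER BY e.D) AS rn
        FROM Starts s
        JOIN Ends e ON e.ID = s.ID AND e.D >= s.D
        WHERE s.D < '2099-12-31'  -- drop the sentinel-only start
    )
    SELECT ID, Start_Date, End_Date FROM Pairs
    WHERE rn = 1
    ORDER BY Start_Date
""").fetchall()

for r in ranges:
    print(r)
```

Ordering each start’s candidate ends ascending is equivalent to ordering by the DATEDIFF, since the start date is fixed within each partition; partitioning by ID as well as Start Date just generalises the approach to multiple dimension members.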

Grab a copy of the query, including sample data scripts here: DateRanges

Don’t forget to check out my upcoming training “Build your Data Warehouse in SQL Server & SSIS with the BI Monkey” – a primer course for anyone wanting to build a first DW using Microsoft tools.



The APT of BI & Analytics

In the world of Information Security an Advanced Persistent Threat (APT) “usually refers to a group, such as a government, with both the capability and the intent to persistently and effectively target a specific entity”.

I’ve written and tweeted and otherwise socialled the message about the threat automation poses to human cognitive labour. However one thing I’ve skipped over – despite being, through my career choice, an implicit part of it – is the APT to human labour that the application of analytics within business represents.

Attempting to control labour productivity and costs has always been an important activity within any operation – more productivity per unit of labour at the lowest possible cost being the key aim (when was the last time you heard business groups advocating higher minimum wages?).

The Science of the Labour Analytics APT

BI & Analytics have enabled this to move from an art – i.e. “I, Bob’s manager, suspect he is a slacker, and should be fired” – to a science – i.e. “I, Bob’s manager, see he is costing more to employ than he generates in revenue, and should be fired.”  To people working in sales this is nothing new – targets and bonuses have long been part of the way their performance is measured (gleefully ignoring the evidence that this is counter-productive). Now however, everyone in the organisation can be assigned a “worth” which they must justify.

Now traditionally some functions are more easily allocated value – sales people generate revenue, consultants can be sold – but areas that have been harder are starting to fall into a metricisable state. For example, through analytics of customer satisfaction, it can be worked out which aspects of service – billing, support, service levels – actually matter, and consequently what the business should spend to get each function delivering the customer satisfaction needed to keep the customer. If support doesn’t really matter, don’t ask for a pay rise if you work in that department.

It’s not all dark side, of course – part of the metricisation of labour has meant that some improvements to working life have come along. The realisation that happy employees are more productive has led to companies paying more than lip service (read: obligatory Christmas party and awkward team-building events) to keeping people happy and feeling like their work is worth expending effort on. So we can all look forward to fewer beatings.

Analysts are the Architects of Unemployment

It may be a bit harsh to suggest this, but I believe that alongside the roboticists, software developers, visionaries and other people building our future, analysts are key players in removing human labour from daily life. At the futurologists’ end of the deal they are designing the learning systems which will allow machines to think, but in the here and now they are creating the basis for working out which sections of business should be automated first.

At least the good news is that as an analyst you will probably be one of the last to be fired… by the HR algorithm.

Courtesy “The Good Little Robot”



Is ETL Development doomed?

A slightly dramatic title, but over the last few months I’ve been exposed to a number of tools that will provide a strong layer of automation to ETL development, eliminating a lot of the basic need for ETL developers to shift data from System A to Database B and beyond.

I’ve been hands on with a couple:

And also heard about a couple more:

… and I’m certain this list is not comprehensive. The significant takeaway is that software build automation in the BI world is starting to catch up with where other software languages have already been (coded a website lately? Many IDEs do most of the hard work for you now). Much as IDE-driven tools such as DTS, SSIS and so on moved us away from hand-coding SQL and wrapping up those scripts, the latest round of tools is moving us away from the IDEs where we drag and drop our flows.

How will ETL careers be killed?

There seem to be a couple of tracks for this. First is pure development automation tools, such as Varigence MIST. If you are technically minded, take a look at this product demo video – though I suggest skipping to about 25 minutes in to see the real meat, as it does go on a bit. It looks mindbogglingly powerful but is clearly shooting at the ETL pro who wants to churn stuff out faster, more consistently and with less fiddling about. MIST is limited to SSIS/AS (for now) and I’m not sure how far it will go, as it’s clearly aimed at the pro developer market, which is not always where the big buyers are. I expect to be playing with it more over the next few weeks on a live project, so should be able to get a better view.

The second path appears to be more targeted at eliminating ETL developers in their entirety. AnalytixDS wraps up metadata import (i.e. you suck in your source and target metadata from the systems or ERWIN), lets you map fields and apply rules, then “push button, make code”. Obviously there’s a bit more to it than that, but the less you care about your back end and the quality of your ETL code (cough Wherescape cough), the more likely this product will appeal to you. Say hello, business users, who are the big buyers (though I look forward to troubleshooting your non-scalable disasters in the near future).

What’s the diagnosis, doctor?

Long term, the demand for ETL skills will decline on the back of these tools. Simple ETL work will simply go away, but the hard stuff will remain and will become an even more niche skill that pays handsomely – though you may spend more time configuring and troubleshooting products than working with raw code. Which class of tool dominates is uncertain, but I’m leaning towards the business-oriented mapping tools that abstract away from ETL development altogether.

If you’ve had any experience with these tools, I’d love to hear about them in the comments.


Gartner 2014 Magic Quadrant for Business Intelligence

The Gartner Magic Quadrant for Business Intelligence and Analytics Platforms is now available.

Good news for Microsoft again – it remains in the Leaders quadrant, though in line with all the other MegaVendors it has slipped a bit due to a weak story around data discovery. It remains a well-loved platform among users, developers and architects alike, and is increasingly becoming the standard enterprise product. For those of us working in the field it remains a safe bet from a career point of view for a good few years yet.

On the downside there are the same bugbears we are still complaining about – no credible mobile story, metadata management is non-existent (hello Project Barcelona – no news for 2 years now?) and PowerView, while shiny, is no match for the likes of QlikView or Tableau (regardless of how ugly they are behind the shiny screens, the front end is what users see and judge you on).

Anyway, not too shabby a report card – a decent score, with the usual caveat of “could try harder”. But the other big kids (IBM Cognos, SAS, Oracle) are doing pretty much the same, so there’s not much to worry about.


What will we do when White Collar Automation takes our jobs?

While I’m in a bit of a groove about the future of the workplace, I may as well talk about how there may not be a future for the workplace.

Automation destroyed the working class

The Industrial Revolution was so long ago now that it qualifies as history. The replacement of skilled labour with machines wiped out a whole class of skilled workers, but simultaneously expanded opportunities for unskilled workers to such an extent that overall standards of living rose and most people saw this as a Good Thing(tm). However, since the seventies robotics and computing have been stripping humans from the factory, to the point that the modern factory floor workforce is only a tiny proportion of what it used to be. Similar effects can be found in farming, where vast farms are now run by just a handful of people.

Any repetitive physical task can be completed by a robot – and nobody has questioned this too hard. Factory conditions are harsh and most people don’t want to perform the exact same task hundreds of times a day, given the physical and psychic toll that can take.

However a clear upshot of this is that unskilled labour has little place in a modern economy. You could perhaps be a driver (a career with probably less than 20 years left before it becomes automated), work in retail (currently being seriously eroded by ecommerce), or work in construction (safe for now) – but the options are limited and shrinking. If a job doesn’t require physical presence (like a bricklayer’s) or face-to-face interaction (like most sales roles), then it is potentially at risk.

A friend I’ve been debating this with recently thinks that office workers are more immune… but I think she’s being rather optimistic.

Analytics will destroy the middle class

Famed economist John Maynard Keynes once predicted widespread unemployment “due to our discovery of means of economising the use of labour outrunning the pace at which we can find new uses for labour” – i.e. we will make the economy so efficient that we don’t need all available working people to run it any more.

Now this future has been long foreseen by Science Fiction writers and falls across a wide spectrum of possibilities. There’s the wildly optimistic future presented by the late Iain M. Banks in “The Culture”, where machines effectively take care of humanity in a benign manner and give them a life of luxury and freedom. Then there is the darker end, such as UK comic 2000AD’s character Judge Dredd and his dystopian Mega Cities, where wealth is concentrated in the hands of the few, 99% of the population is unemployed and lives off far-from-generous state handouts, and life for most people is pretty dismal.

According to a study by Oxford University, nearly 47% of US jobs are at high risk of being replaced by automation within the next 20 years. So this may be a reality we need to work out sooner rather than later. If your job involves decision making and has routine, repeatable elements to it, then it is at risk of a pattern-detecting engine being applied to it and that decision making process being delegated to a machine. This could be as simple as approving a loan – something that is largely automated anyway – or as complex as diagnosing cancer.

Now many people may resist this and argue that a machine could never replicate the subtlety of human thinking. To some extent that is true, but the quality of human decision making is poor, and it is arguable that handing over things such as medical diagnoses to systems that can absorb a volume of data far beyond our poor human brain’s capacity – and assess it rationally and fairly – may well improve the decisions that do get made.

So, perhaps it is time hail our new AI overlords, and let us pray they are kind to their creators…
