Metrics - A little knowledge can be a dangerous thing (or 'Why you're not clever enough to interpret metrics data')

Posted by Jason Crease on Simple Talk
Published on Thu, 03 May 2012
At RedGate Software, I work on a .NET obfuscator called SmartAssembly. Several of its features use a database to store data (exception reports, name-mappings, etc.). The user is given the option of using either a SQL Server database (which requires them to have Microsoft SQL Server) or a Microsoft Access MDB file (which requires nothing). MDB is the default option, but power-users soon switch to a SQL Server database because it offers better performance and data-sharing.

In the fashionable spirit of optimization and metrics, an obvious product-management question is 'Which is more popular: SQL Server or MDB?'

We've collected data on this, using our 'Feature-Usage-Reporting' technology (available as part of SmartAssembly) and, more recently, our 'Application Metrics' technology:

| Parameter  | Number of users | % of total users | Number of sessions | Number of usages |
|------------|-----------------|------------------|--------------------|------------------|
| SQL Server | 28              | 19.0             | 8115               | 8115             |
| MDB        | 114             | 77.6             | 1449               | 1449             |

(As a disclaimer, please note that SmartAssembly has far more than 132 users. This data is just a sample from one build.)

So, it would appear that SQL Server is used by fewer users, but more often. Great.

But here's why these numbers are useless to me:

Only the original developers understand the data

What does a single 'usage' of 'MDB' mean? Does it happen once per run? Once per option change? On clicking the 'Obfuscate Now' button? When running the command-line version, or just the UI version? Each question could skew the data 10-fold either way, and the answers are known only to the developer who instrumented the application in the first place. In other words, only the original developer can interpret the data - product-managers cannot interpret it unaided.
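To make that concrete, here's a minimal sketch (in Python rather than our actual .NET instrumentation; the function and placement names are invented for illustration) of how the same user behaviour yields wildly different counts depending on where the reporting call was placed:

```python
def usages_recorded(placement: str, runs: int,
                    option_changes: int, obfuscations: int) -> int:
    """Count how many 'usages' of MDB get reported for one user,
    depending on where the developer put the reporting call."""
    return {"per_run": runs,
            "per_option_change": option_changes,
            "per_obfuscation": obfuscations}[placement]

# One user: starts the app once, toggles the database option 5 times,
# and clicks 'Obfuscate Now' 20 times.
for placement in ("per_run", "per_option_change", "per_obfuscation"):
    print(placement,
          usages_recorded(placement, runs=1, option_changes=5,
                          obfuscations=20))
# -> 1, 5 or 20 'usages' for identical behaviour: a 20-fold spread,
#    and only the instrumenting developer knows which one was recorded.
```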

Most of the data is from uninterested users

About half of the people who download and run a free trial from the internet quit it almost immediately. Only a small fraction use it enough to make informed choices. Since MDB is the default option, we don't know how many of those 114 users CHOSE to use MDB, and how many JUST HAPPENED to be using the default for their 20-second trial.

This is a problem we see across all our metrics: are people using X because it's the default, or because they actually want to use X? We need to segment the data further, asking what percentage of each group meets our criteria for an 'established user' or 'informed user'. You end up spending hours writing sophisticated and dubious SQL queries to segment the data further. Not fun.
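Here's a sketch of that kind of segmentation (in Python/pandas rather than SQL; the column names and the five-session threshold for an 'established user' are invented for illustration):

```python
import pandas as pd

# Hypothetical per-user usage data.
usage = pd.DataFrame({
    "user_id":  [1, 2, 3, 4, 5, 6],
    "db_type":  ["MDB", "MDB", "SQL Server", "MDB", "SQL Server", "MDB"],
    "sessions": [1, 2, 40, 1, 25, 3],
})

# Arbitrary cut-off: an 'established user' has at least 5 sessions.
established = usage[usage["sessions"] >= 5]

# The raw split counts everyone, including 20-second trialists
# who never moved off the default.
print(usage["db_type"].value_counts(normalize=True))
# The established-user split can tell a very different story.
print(established["db_type"].value_counts(normalize=True))
```

Of course, the threshold itself is a judgment call, which is exactly why these queries end up 'sophisticated and dubious'.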

You can't find out why they used this feature

Metrics can answer the when and the what, but not the why. Why did people use feature X? If you're anything like me, you often click on random buttons in unfamiliar applications just to explore the feature-set. If we listened uncritically to metrics at RedGate, we would eliminate the most important and most complex features - the ones people actually buy the software for - leaving just big buttons on the main page and the About-Box.

"Ah, that's interesting!" rather than "Ah, that's actionable!"

People do love data. Did you know you eat 1201 chickens in a lifetime? But just 4 cows? Interesting, but useless. Often metrics give you a nice number: '5.8% of users have 3 or more monitors'. But unless the statistic is both SURPRISING and ACTIONABLE, it's useless.

Most metrics are collected, reviewed with lots of cooing, and then forgotten. Unless a piece of data could change things, collecting it is useless.

People get obsessed with significance levels

The first thing lots of people do with this data is run a t-test to get a significance level ("Hey! We know with 99.64% confidence that people prefer SQL Server to MDBs!"). Believe me: other sources of error and misinterpretation in your data are FAR more significant than your t-test could ever comprehend.
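It is trivially easy to manufacture an impressive-looking p-value from the raw user counts above. A sketch, using scipy's binomial test as a stand-in for whatever test you prefer:

```python
from scipy.stats import binomtest

# 28 SQL Server users vs 114 MDB users (142 who used one or the other),
# tested against a 50/50 null hypothesis.
result = binomtest(28, n=28 + 114, p=0.5)
print(f"p-value: {result.pvalue:.2e}")  # tiny: 'overwhelmingly significant'

# The arithmetic is correct, but it only quantifies sampling error.
# Default-option bias, uninterested trialists and opt-in skew are not
# in the model, and each can dwarf what this p-value appears to prove.
```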

Confirmation bias prevents objectivity

If the data appears to match our instinct, we feel satisfied and move on. If it doesn't, we suspect the data and dig deeper, plummeting down a rabbit-hole of segmentation and filtering until we give up and move on. Data is only useful if it can change our preconceptions. Do you trust this dodgy data more than your own understanding, knowledge and intelligence? I don't.

There are always multiple plausible ways to interpret or action any data

Let's say we segment the above data, and get this:

Post-trial users (i.e. those using a paid version after the 14-day free-trial is over):

| Parameter  | Number of users | % of total users | Number of sessions | Number of usages |
|------------|-----------------|------------------|--------------------|------------------|
| SQL Server | 13              | 9.0              | 1115               | 1115             |
| MDB        | 5               | 4.2              | 449                | 449              |

Trial users:

| Parameter  | Number of users | % of total users | Number of sessions | Number of usages |
|------------|-----------------|------------------|--------------------|------------------|
| SQL Server | 15              | 10.0             | 7000               | 7000             |
| MDB        | 114             | 77.6             | 1000               | 1000             |

How do you interpret this data? It's one of:

  1. Mostly SQL Server users buy our software. People who can't afford SQL Server tend to be unable to afford or unwilling to buy our software. Therefore, ditch MDB-support.
  2. Our MDB support is so poor and buggy that our massive MDB user-base doesn't buy it. Therefore, spend loads of money improving it, and think about ditching SQL Server support.
  3. People 'graduate' naturally from MDB to SQL Server as they use the software more. Things are fine the way they are.
  4. We're marketing the tool wrong. The large number of MDB users represent uninformed downloaders. Tell marketing to aggressively target SQL Server users.

To choose an interpretation you need to segment again. And again. And again, and again.

Opting-out is correlated with feature-usage

Metrics collection tends to be opt-in, which skews the data even further. Between 5% and 30% of people choose to opt in to metrics (often called a 'customer improvement program' or something like that). Casual trial-users who are uninterested in your product or company are less likely to opt in. This group is probably also more likely to be MDB users. How much does this skew your data? Who knows?
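A back-of-envelope sketch of how much this can matter (the opt-in rates here are invented, purely for illustration):

```python
# Invented opt-in rates: suppose engaged SQL Server users opt in to
# metrics at 30%, while casual MDB trialists opt in at only 5%.
observed = {"SQL Server": 28, "MDB": 114}
opt_in_rate = {"SQL Server": 0.30, "MDB": 0.05}

# Back out the true population implied by each observed count.
implied = {k: observed[k] / opt_in_rate[k] for k in observed}
total = sum(implied.values())

for k in implied:
    print(f"{k}: observed {observed[k]}, implied ~{implied[k]:.0f} "
          f"({implied[k] / total:.0%} of users)")
# Observed split: roughly 20%/80%. Implied split: roughly 4%/96%.
# Same data, very different picture - and the real opt-in rates
# per segment are unknown.
```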

It's not all doom and gloom.

There are some things metrics can answer well:

  1. Environment facts. How many people have 3 monitors? Have Windows 7? Have .NET 4 installed? Have Japanese Windows?
  2. Minor optimizations. Is the text-box big enough for average user-input?
  3. Performance data. How long does our app take to start? How many databases does the average user have on their server?

As you can see, questions about who-the-user-is rather than what-the-user-does are easier to answer and action.

Conclusion

  1. Use SmartAssembly. If not for the metrics (called 'Feature-Usage-Reporting'), then at least for the obfuscation/error-reporting.
  2. Data raises more questions than it answers.
  3. Questions about environment are the easiest to answer.
