The Business Case for Snowplow Analytics
by Luke Thomas

tl;dr - let me teach you how to sell the value of Snowplow Analytics from a business-person's perspective.

A bit about me

Before I jump into specific tactics and strategies for selling the value of owning your own data collection stack (via Snowplow Analytics), I'd like to provide a bit of backstory to help you understand the trial and error I encountered on my analytics journey.

A bit more about me. I'm the Head of Product at Crystal; we're building software to help individuals and organizations work better through an understanding of personality differences. Previously I worked on the growth team at O'Reilly Media. Before that, I worked at a couple of venture-backed startups. I've also consulted with early-stage companies on their analytics/growth stacks and am building a weekly retrospective tool on the side.

Common Analytics Pitfalls

In the early stages of company-building, you need to spend most of your time collecting qualitative feedback. This comes from user-testing, research calls, and other high-fidelity feedback loops.

As the company starts to grow, you'll find yourself instrumenting analytics to understand where the issues and opportunities can be found. For most startups (especially software-as-a-service), you'll install Mixpanel or Amplitude for product analytics, and Google Analytics to understand where new signups/customers are coming from.

Any post-signup behavior is fed into Mixpanel, while Google Analytics keeps track of pre-signup behavior.

For the most part, these systems work well. There's a low barrier to getting started, and there are valuable features like funnel analytics and cohort analytics that you can run out of the box.

Everything is great. Until it isn't.

You are locked-in.

This is when things start to fail.

The company continues to grow, but you start getting data requests from a variety of people inside the organization. Marketers want to understand how they are acquiring customers, product people want to dig deeper into funnels and more advanced cohort analytics, and the CEO wants to understand the business-level data.

If you're the person in charge of analytics (which tends to fall somewhere between product and marketing), you may spend more and more time writing custom queries, oftentimes hitting the production database because engineers have a ton of other projects that take a higher priority. You're finding that Mixpanel can't answer certain questions you have, so you start learning how to write their crazy querying language. Next, you may hire a data analyst, whose primary task is to write SQL queries all day long.

To make matters worse, you hit a limit with your analytics tools and now they want at least $25k to store your data. There's no end in sight with this either, as you know that the price will keep climbing steeply as the company (and data set) grows.

You've officially been locked into a system that is mediocre at best.

Wrappers = extra $

There's also been a rise of analytics wrappers. Segment.com offers an easy way to abstract the various third-party tools into a single API (which has become popular with early-stage startups), but the price can still be prohibitive, as you're paying for Segment AND all the third-party tools like Mixpanel/Amplitude. Once again, as your data needs grow, so does the price.

I really like the folks at Segment, but the main value of their tool is in the early stages of company-building, when you need to be able to spin up multiple third-party tools and keep engineers writing code vs. installing JS snippets.

So what do you do?

The Case for Snowplow Analytics

Now that story-time is done, let's talk about the issues at play (and the opportunities that exist).

Someone else owns your data

The biggest issue here is that you don't really own your most important data.

Sure, you may have the ability to export the data via API endpoints, but the analytics vendors know that migrating the data can be extremely difficult, which is all part of their business strategy (near-free price in the early days, dramatically higher price later).

You need to own your data, but don't have time to wrangle it.

The biggest value that Snowplow Analytics provides is that you can finally own your own data. Snowplow offers client-side and server-side libraries (just like Mixpanel and Amplitude), but instead of piping your data to someone else's data store, it flows to your own Redshift instance where you can do whatever you want with it. For more info, read the basics of Snowplow guide.

You don't have time to go out and create your own libraries to collect/process/store your data - you have a business to handle and it's probably not a data analytics company. Fortunately Snowplow handles this for you. Yay for open source libraries.

Your data needs to flex with the business

The data you need in the early stages of company-building is very different than the data needs of an established business. As the business grows, there will be many stakeholders that you'll need to keep happy.

If your data store is brittle, you will experience a lot of pain. The CEO won't be able to get the numbers he needs for the investor update, the product manager won't understand if a feature is a success or failure, and the marketing team will spend way too much money if they can't see where the existing dollars are being spent.

So far, the only solution I've found is to own your own data. It's super easy to join additional data sets with Snowplow event data. Not so much with a third-party provider.
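To make the join point concrete, here is a minimal sketch rather than a real setup. It uses an in-memory SQLite database as a stand-in for the warehouse, and the `events` and `subscriptions` tables, their columns, and the sample rows are all hypothetical. The idea is simply that once event data sits next to your other business data, a plain SQL join answers questions like "how active are users on each plan?"

```python
import sqlite3

# In-memory SQLite stands in for the warehouse (an assumption for illustration).
# Table names, column names, and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id TEXT, event_name TEXT, collector_tstamp TEXT);
    CREATE TABLE subscriptions (user_id TEXT, plan TEXT);

    INSERT INTO events VALUES
        ('u1', 'page_view',      '2020-01-01 10:00:00'),
        ('u1', 'report_created', '2020-01-01 10:05:00'),
        ('u2', 'page_view',      '2020-01-02 09:00:00'),
        ('u3', 'report_created', '2020-01-03 12:00:00');

    INSERT INTO subscriptions VALUES
        ('u1', 'pro'), ('u2', 'free'), ('u3', 'pro');
""")

# Join behavioral events with billing data and count events per plan.
rows = conn.execute("""
    SELECT s.plan, COUNT(*) AS event_count
    FROM events AS e
    JOIN subscriptions AS s ON s.user_id = e.user_id
    GROUP BY s.plan
    ORDER BY s.plan
""").fetchall()

events_by_plan = dict(rows)
print(events_by_plan)
```

With a third-party tool, the equivalent usually means exporting the events first or pushing the billing data into the vendor's system.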

(Literally) track all the things

Another major benefit of Snowplow Analytics is that if you get a lot of traffic/usage, you don't have to play the game of "should we track this event?" I've frequently had to make decisions like this because the analytics bill would otherwise get out of control.

While I strongly advocate for only tracking the events that you need, it's nice to be able to track all the things (if you want). This is especially helpful if you are growing a business with a heavy focus on SEO or website traffic.

Ad-Blockers

One less frequently discussed but important detail (especially with the marketing team) is the rise of ad-blockers. I'm pretty sure this isn't discussed because marketers don't understand the technical details, but if you spend money to acquire customers, and many of those visitors are blocking Mixpanel or Amplitude (via ad-blockers), it WILL have an effect on your customer acquisition metrics.

At that point, you're only getting a sample of your data. While this is fine for larger companies who have a lot of volume, this can be disastrous for growing companies with limited budgets.

The beauty of Snowplow is that your JavaScript snippet is far less likely to get picked up by these ad-blockers when you serve it from your own domain, as it's not a third-party tracking script.
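A quick back-of-the-envelope calculation shows how this skews the numbers. Every figure below is made up for illustration (the spend, the signup count, and the 30% blocking rate are assumptions, not measurements). The sketch compares the cost per signup you would report from the tracked data with the true cost.

```python
def cost_per_signup(spend: float, signups: int) -> float:
    """Acquisition cost per signup for a given spend."""
    return spend / signups


# Hypothetical campaign numbers, chosen only to illustrate the effect.
spend = 10_000.0       # dollars spent on acquisition
actual_signups = 100   # signups that really happened
blocked_share = 0.30   # assumed share of visitors whose tracking is blocked

# Signups the analytics tool can actually see.
tracked_signups = round(actual_signups * (1 - blocked_share))

actual_cost = cost_per_signup(spend, actual_signups)
reported_cost = cost_per_signup(spend, tracked_signups)

print(f"tracked signups: {tracked_signups}")
print(f"actual cost per signup:   ${actual_cost:.2f}")
print(f"reported cost per signup: ${reported_cost:.2f}")
```

Under these assumptions the channel looks roughly 43% more expensive than it really is, which is the kind of gap that can get a working campaign cut.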

Super Scalable

Snowplow will scale as your business continues to grow. I've seen it work in small startups (like Crystal) and big ones. My brother is a data engineer at a public company, and they are currently setting up Snowplow there. It will handle as many events as you can throw at it.

Downsides of Snowplow Analytics

At this point you may be wondering - "what's wrong with Snowplow?" I'd be happy to explain the downsides of using Snowplow Analytics. I've seen several different Snowplow instances over the last few years, so I can highlight the pain points below.

Visualization

Don't get me wrong - I really like Mixpanel's funnel visualization and cohorting tools, but I think these can be built using SQL and displayed using visualization tools like Metabase, Chartio, Looker, or Mode. Yes, it takes more time to write these queries, but you can offer far more customization to the stakeholders inside your business. It's more scalable over time, even though it's a bit painful in the early days.
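As a rough idea of what a hand-rolled funnel looks like, here is a minimal sketch. It again uses in-memory SQLite as a stand-in for the warehouse; the `events` table, the three funnel steps (`page_view`, `signup`, `report_created`), and the sample rows are hypothetical. Each step counts users who completed it after completing the previous one.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id TEXT, event_name TEXT, collector_tstamp TEXT);
    INSERT INTO events VALUES
        ('u1', 'page_view',      '2020-01-01 10:00:00'),
        ('u1', 'signup',         '2020-01-01 10:05:00'),
        ('u1', 'report_created', '2020-01-01 10:10:00'),
        ('u2', 'page_view',      '2020-01-02 09:00:00'),
        ('u2', 'signup',         '2020-01-02 09:30:00'),
        ('u3', 'page_view',      '2020-01-03 12:00:00'),
        ('u4', 'signup',         '2020-01-04 08:00:00');
""")

# Each CTE keeps the first time a user reached a step, and only counts the
# step if it happened after the previous one.
funnel = conn.execute("""
    WITH step1 AS (
        SELECT user_id, MIN(collector_tstamp) AS ts
        FROM events
        WHERE event_name = 'page_view'
        GROUP BY user_id
    ),
    step2 AS (
        SELECT e.user_id, MIN(e.collector_tstamp) AS ts
        FROM events AS e
        JOIN step1 AS s ON s.user_id = e.user_id
        WHERE e.event_name = 'signup' AND e.collector_tstamp > s.ts
        GROUP BY e.user_id
    ),
    step3 AS (
        SELECT e.user_id, MIN(e.collector_tstamp) AS ts
        FROM events AS e
        JOIN step2 AS s ON s.user_id = e.user_id
        WHERE e.event_name = 'report_created' AND e.collector_tstamp > s.ts
        GROUP BY e.user_id
    )
    SELECT
        (SELECT COUNT(*) FROM step1),
        (SELECT COUNT(*) FROM step2),
        (SELECT COUNT(*) FROM step3)
""").fetchone()

for name, count in zip(("viewed", "signed up", "created report"), funnel):
    print(f"{name}: {count}")
```

A query like this can be saved in a tool such as Metabase or Mode and charted, and the step definitions can be changed without waiting on a vendor.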

Not plug-and-play

Snowplow Analytics is not as easy to set up and manage as a third-party analytics vendor. This leaves you with a few choices:

  • You hire a data engineer (expensive and in short supply)
  • You hire a consultant to manage the stack (cheaper, but now you need to source someone)
  • You purchase Snowplow's managed service (starts at $1,500/mo; more if you want additional functionality)

If you want to learn more about these options, feel free to contact us and we can give some guidance. If you want advanced tutorials on setting this up, check out the Snowplow Analytics guides.

Difficult to Diagnose/QA at times

Finally, it can be tough to diagnose/lint event issues, especially when adding new calls. We've found ways to work with this (using Snowplow streaming), but it can take time to figure out.

Overall, there are few downsides, but they certainly exist.

Wrapping up

If you are looking for a great analytics provider, Snowplow is the best option I've found (especially if you anticipate growing in the future). I highly recommend it from a product perspective. It may not be necessary in the early days of growing your company, but as you scale, take a serious look at Snowplow and avoid some serious pain that will come in the future.