BI For Experts

The ultimate analytics stack for startups [2020 edition]

One of the most common projects I work on involves mapping out the ideal analytics stack for my client. The right analytics stack will significantly improve the company's ability to collect and leverage their data. It is my job to help them work out the right set of tools based on their budget and specific needs.

Even though each company is slightly different, there is one analytics stack that I consider my "ultimate analytics stack for startups".

This stack is best suited for B2B, non-mobile companies but can address the majority of needs for all online, application-based companies.

If your startup is bootstrapped or you have a small budget for operations then my post, The analytics stack every startup should implement on day one, is a good fit for you.

What should be included in an ultimate analytics stack?

The ultimate analytics stack would include best-in-class solutions that can help us in the following areas:

  • Record user and visitor behavior - Our visitors and users are engaging with our website and products. We need to collect this data so we can learn from it.
  • Data warehousing - A central location where the company's data is housed.
  • Extract, transform, load (ETL) - Help move data from relevant silos to the data warehouse.
  • Data visualization - Allows us to visualize and quickly act on our data.
  • Leverage user data in 3rd party tools - The more info we can leverage in our 3rd party tools, the more useful these tools become.

Which tools / platforms meet the needs of our analytics stack?

The solutions included in my ultimate analytics stack include Segment, BigQuery, Stitch, Tableau, Mixpanel and Redash.

The ultimate analytics stack - tools diagram

The table below shows which need each tool addresses.

The need                            The solution
Record user and visitor behavior    Segment
Data warehousing                    BigQuery
Extract, transform, load (ETL)      Stitch
Data visualization                  Tableau
Self-service analytics              Mixpanel & Redash

In my diagram you'll also notice Postgres and Google Analytics.

I included Postgres since it's a popular option for the application database (the database R&D uses to run your online service / product). Since you'll need a database set up to run your application I didn't include it in the stack. The trick is being able to move the application data into the data warehouse. This is where Stitch comes in as you can see in the diagram.

I included Google Analytics as well since it's a very popular free analytics solution to track traffic, visitors and content performance. It's such an obvious one that I just assume everyone is using it. In my diagram you can see that you have the option of pushing Google Analytics data into BigQuery via Stitch, or simply plugging it directly into Tableau.

Segment - the most ROI positive piece of any analytics stack

Segment is one of those solutions that, if set up and leveraged correctly, will pay for itself a thousand times over. I seriously consider it the most ROI positive piece of any business operations stack that can be built.


Segment is so ROI positive because it tackles three massive headaches for companies.

Headache #1 - Collecting behavioral data from visitors and users

Segment's first value proposition is helping companies collect event data from their visitors and users. Some of these events are out of the box while others need to be written up in code by developers.
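To make "written up in code" a little more concrete, here is a minimal sketch of the kind of payload a custom event carries. The shape (a user ID, an event name and free-form properties) mirrors a Segment-style track call, but the helper `build_track_event` and the "Report Exported" event are hypothetical and only there for illustration. In a real application the developers would pass these values to Segment's own library rather than build the payload by hand.

```python
from datetime import datetime, timezone


def build_track_event(user_id, event, properties=None, timestamp=None):
    """Return a dict shaped like a Segment-style track call (illustrative only)."""
    if not user_id:
        raise ValueError("user_id is required so the event can be tied to a user")
    if not event:
        raise ValueError("event name is required")
    # Default to the current UTC time when no timestamp is supplied.
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "type": "track",
        "userId": user_id,
        "event": event,
        "properties": dict(properties or {}),
        "timestamp": ts.isoformat(),
    }


# Hypothetical custom event fired when a user exports a report.
example_event = build_track_event(
    user_id="user_123",
    event="Report Exported",
    properties={"format": "csv", "rows": 250},
    timestamp=datetime(2020, 1, 15, 9, 30, tzinfo=timezone.utc),
)
print(example_event)
```

The useful part is the discipline rather than the code: every custom event gets a clear name, is tied to a user, and carries the properties an analyst will want to segment by later.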

My post titled, The feature adoption funnel: How to track and measure feature adoption, goes into more detail on how events can be used to better understand users.

Headache #2 - Leveraging user data in 3rd party tools

Before Segment came along it was extremely challenging to move user data between different tools. A ton of custom API code would need to be written or, if you were lucky, your favorite tool would add an integration with a second popular solution.

Segment allows non-technical people to quickly and easily connect hundreds of different tools to the event collection infrastructure so this data can be leveraged further.

A great example is sending events that a specific user generates to a tool like Intercom or HubSpot which then triggers a new series of emails that is sent to that specific user.

The options are endless.

Headache #3 - Loading new scripts within apps and on the website

How frustrating is it when a marketer wants to start using a new solution but requires R&D to manually add new code to the website or application? In most cases this request will be considered low priority or, if implemented, come at the cost of other high priority R&D work.

R&D is the most expensive department in a hi-tech company and every hour that can be saved is worth a lot of money to the company. Segment can save hundreds of man hours each year by making it easy for marketers, analysts and ops people to turn on new tools without any developers needing to get involved.

Segment is similar to Google Tag Manager in that it can load code libraries within itself. This can be done in minutes straight from within the Segment portal.

Costs and vendor lock-in

The biggest risk of using Segment is long-term costs and vendor lock-in. Since Segment is so powerful you'll inevitably scale up your usage of the platform. This will do wonders for your operational capabilities but will also make you more reliant on the solution. You'll need to keep this in mind and balance the risks involved.

Segment was a lot more affordable a few years ago before shifting to a purely metered pricing model. Segment does offer a generous free plan which is great for companies just starting out but once a company has thousands of users it can become costly.

I still consider Segment extremely ROI positive but the caveat is that it is only going to be ROI positive if used correctly and your analysts and ops people are proactive.

Segment is not going to be a great fit for most B2C companies since the lifetime value of each customer will be too low to justify the costs associated with the tool.

BigQuery - auto-scaling, affordable and analyst-friendly

BigQuery is a powerful solution for data warehousing and my go-to solution for the vast majority of my clients. I've always been a fan of Google products and Google has done a fantastic job with BigQuery.

Stitch, Segment and Tableau all have great connectors to BigQuery which means it fits perfectly into our stack.

BigQuery is auto-scaling and really affordable. Depending on the volume of your data and the size of your queries BigQuery could cost you as little as a few dollars a month.
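As a rough illustration of how a bill that small comes about, here is a back-of-the-envelope sketch of on-demand query costs. The rate of $5 per TB scanned and the 1 TB monthly free allowance are assumptions based on the on-demand figures published around the time of writing, so check Google's current price list before relying on them. The function name and the example workload are hypothetical, and storage costs are left out.

```python
# Assumed on-demand figures; verify against Google's current price list.
PRICE_PER_TB_USD = 5.00   # assumption: dollars per TB of data scanned
FREE_TB_PER_MONTH = 1.0   # assumption: TB scanned free of charge each month


def estimate_monthly_query_cost(gb_scanned_per_query, queries_per_month):
    """Estimate the monthly on-demand query bill in USD (storage excluded)."""
    # Treat 1 TB as 1000 GB; good enough for a rough estimate.
    tb_scanned = gb_scanned_per_query * queries_per_month / 1000
    billable_tb = max(0.0, tb_scanned - FREE_TB_PER_MONTH)
    return round(billable_tb * PRICE_PER_TB_USD, 2)


# Hypothetical workload: 600 queries a month, each scanning about 2 GB.
print(estimate_monthly_query_cost(2, 600))
```

Under these assumptions the cost is driven by how much data each query scans, which is why selecting only the columns you need keeps the bill low.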

BigQuery is very analyst-friendly. It has great documentation and its online graphical user interface (GUI) makes it easy to query your data straight from your browser. Creating views and organizing your data into folders (called datasets) is really easy.

Check out the business analyst encyclopedia for some tips on using BigQuery.

Stitch - ETL out of the box

I've been using Stitch for client projects for over 2 years now and I'm a big fan.

Stitch just does what you'd expect it to do. You sign up, connect a source and destination and the data flows from one place into the other.

Stitch has an impressive list of sources including Asana, HubSpot, all the popular database solutions, and even Google Ads.

Stitch integrations and sources

Stitch has a very generous free plan and most startups should be able to use Stitch on either the free plan or the $100 a month plan.

The main use case for Stitch is for moving your application database into your data warehouse. Before Stitch came along you'd need a data engineer to write up a complex script that would ETL the data into the data warehouse.
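To give a feel for what such a hand-written script involves, here is a deliberately tiny sketch of the extract, transform and load steps. Two in-memory SQLite databases stand in for the application database and the data warehouse, and the `users` table, its columns and the `run_etl` helper are all hypothetical. A production script would also have to handle schema changes, incremental loads, retries and scheduling, which is where the complexity comes from.

```python
import sqlite3

# Two in-memory SQLite databases stand in for the application database and
# the data warehouse (assumption: the real ones would be e.g. Postgres and BigQuery).
app_db = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

app_db.execute("CREATE TABLE users (id INTEGER, email TEXT, created_at TEXT)")
app_db.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "Ann@Example.com", "2020-01-03"),
        (2, "bob@example.com", "2020-01-09"),
    ],
)
warehouse.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, signup_date TEXT)"
)


def run_etl(source, destination):
    """Copy users from source to destination and return the number of rows loaded."""
    # Extract: read the rows from the application database.
    rows = source.execute("SELECT id, email, created_at FROM users").fetchall()
    # Transform: normalize emails to lowercase.
    cleaned = [(user_id, email.lower(), created) for user_id, email, created in rows]
    # Load: INSERT OR REPLACE keeps reruns from duplicating rows.
    destination.executemany("INSERT OR REPLACE INTO users VALUES (?, ?, ?)", cleaned)
    destination.commit()
    return len(cleaned)


loaded = run_etl(app_db, warehouse)
print(loaded, warehouse.execute("SELECT * FROM users ORDER BY id").fetchall())
```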

Stitch can be set up within a few minutes and isn't overly complex. Some sources like Google Analytics have API limitations and are thus more nuanced.

When setting up a new source within Stitch you will be required to pick which tables and columns you want to move. This does require you to know your data sources well. If you are new to your company or aren't yet familiar with what data is available then I recommend you conduct a thorough data audit.
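A simple first step in a data audit is listing every table and its columns. The sketch below does this for a SQLite database purely as a stand-in; the helper `list_tables_and_columns` and the sample tables are hypothetical. Against Postgres you would typically query the information schema instead.

```python
import sqlite3


def list_tables_and_columns(conn):
    """Return {table_name: [column_name, ...]} for every table in a SQLite database."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    inventory = {}
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        inventory[table] = [col[1] for col in columns]
    return inventory


# Hypothetical application database with two tables.
sample_db = sqlite3.connect(":memory:")
sample_db.execute("CREATE TABLE users (id INTEGER, email TEXT, plan TEXT)")
sample_db.execute("CREATE TABLE invoices (id INTEGER, user_id INTEGER, amount REAL)")

inventory = list_tables_and_columns(sample_db)
for table, columns in inventory.items():
    print(table, columns)
```

An inventory like this makes it much easier to decide which tables and columns are worth moving.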

Tableau - The global leader in data visualization

If you've been following this blog for a while you'll know that Tableau is my go-to solution for reporting and data visualization.

I've written extensively about Tableau on this blog so I'm not going to go into too much detail here. I'll point you to this guide which answers the question, what is Tableau and how can it be used to help a company become data-driven.

You might be wondering why I choose to include Tableau in my ultimate analytics stack over more affordable options like Power BI or Google Data Studio. The main reason I prefer Tableau over other solutions is that Tableau is extremely ROI positive.

A single license of Tableau Creator (all you need to get started) will set you back $840 a year. This is a relatively small investment to get your hands on the most powerful data visualization platform available today.

Yes, Tableau does have a learning curve, but after roughly 10 solid hours of study you'll already be in a position to create basic reports that can be published to the cloud.

Mixpanel & Redash - self-service analytics at scale

Both Mixpanel and Redash are great self-service solutions that can help you leverage your infrastructure by putting your data in the hands of more people.

Mixpanel is a mature analytics company and especially well known among product analysts. The tool can easily plug into Segment, which can feed it event data. Product managers and analysts can run ad hoc segmentation and cohort reports in seconds within Mixpanel.
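For readers who haven't worked with cohort reports before, the sketch below shows roughly what one computes: users are grouped by the week they signed up, and for each group we measure the share still active in the weeks that follow. The event rows and the `cohort_retention` helper are hypothetical and say nothing about how Mixpanel implements this internally.

```python
from collections import defaultdict

# Hypothetical event rows: (user_id, signup_week, week_of_activity)
events = [
    ("u1", 1, 1), ("u1", 1, 2), ("u1", 1, 3),
    ("u2", 1, 1), ("u2", 1, 2),
    ("u3", 2, 2), ("u3", 2, 3),
    ("u4", 2, 2),
]


def cohort_retention(rows):
    """Return {signup_week: {weeks_since_signup: share of the cohort active}}."""
    cohort_users = defaultdict(set)
    active = defaultdict(lambda: defaultdict(set))
    for user_id, signup_week, activity_week in rows:
        cohort_users[signup_week].add(user_id)
        active[signup_week][activity_week - signup_week].add(user_id)
    return {
        cohort: {
            offset: len(users) / len(cohort_users[cohort])
            for offset, users in sorted(offsets.items())
        }
        for cohort, offsets in sorted(active.items())
    }


report = cohort_retention(events)
for cohort, retention in report.items():
    print(cohort, retention)
```

A tool like Mixpanel produces this kind of report from the event stream without anyone having to write the grouping logic.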

Redash is a more sophisticated self-service analytics solution. Redash acts as a SQL-based querying tool which can be used to run calculations and pull data from a wide range of sources, including BigQuery.

Mixpanel is easier to use than Redash, since only employees who know SQL will be able to use Redash effectively.

If you'd like to learn SQL then check out my post, SQL for dummies: How to learn SQL for free in 30 days or less.

Summary

In this post I've covered what I consider the ultimate analytics stack for startups. The tools I've included in my stack are best-in-class solutions for data collection, ETL and reporting, and most companies would benefit from using them.

If you are looking for more content on setting up business operations infrastructure then visit the BI Infrastructure category page.

If you're in the process of implementing analytics at your company and have any questions on the process you should be following then post them in the comments section below.