How R helps Kami drive Sales Growth in one of New Zealand’s highest-growth Startups

Bob Drummond
8 min read · Feb 18, 2021

At the time of writing, Kami’s 4,500% ARR growth over a 3-year period certainly meets any definition of a high-growth business. For context, this would place Kami among the top 50 fastest-growing tech companies in North America according to the Deloitte list published in 2020.

Kami is an edtech startup reimagining the classroom to meet the needs of a student-driven, 21st-century education, as the world adapts to the need for a lifetime of learning. Kami is used every day — by over 24 million teachers and students — in 180 countries.

In this post, I highlight the role that data science in general, and R in particular, plays in the sales process at Kami, and share some practical insights for others who want to use R to help their teams make the most of their data.

Who has Responsibility for Crunching Data at Kami?

At Kami, our leadership style is hands-on: leading from the front is not the preferred approach, it’s the only approach accepted on our team. And it’s not necessarily an executive leading a particular function or initiative; I just as often find myself following or assisting someone with more expertise than me in a given domain. Here, we expect everyone to pick a lane: “Lead, follow, or GTF out of the way” is our mantra.

So when it comes to applying data science techniques to our sales data, as head of sales I’m the one leading it. Not just by giving instructions, but by writing code. Every day.

We use R heavily in our sales function. I’m responsible for sales, but I’m also an engineer, and I’ve been writing code for almost 40 years. I first started using the somewhat peculiar data science language R about 5 years ago. I found the way it treats tables and lists of data as ‘first class citizens’ immensely powerful when processing datasets of all kinds sourced from databases, spreadsheets, files and REST APIs. Writing hundreds of lines of data-munging code without a single for or while loop is not only efficient, but also empowering in new ways.
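To give a flavour of that loop-free style, here is a minimal sketch using the dplyr package. The file and column names are invented for illustration; the point is the pipeline of whole-table operations:

```r
library(dplyr)

# hypothetical example: roll trial signups up into one row per school domain
leads <- read.csv("signups.csv") %>%            # invented file name
  filter(!is.na(email)) %>%
  mutate(domain = sub(".*@", "", email)) %>%    # extract the email domain
  group_by(domain) %>%
  summarise(users      = n(),
            first_seen = min(as.Date(created_at))) %>%
  arrange(desc(users))
```

Each verb operates on the whole table at once, which is what makes writing hundreds of lines of this kind of code without a loop practical.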

Why such a Focus on Data?

We believe that not only our product, but also our use of data is a competitive differentiator. And like our product, our use of data is constantly evolving, expanding, improving, and adapting to external market and technology changes.

Data is especially vital for Kami because, as a B2B SaaS vendor, our ‘average check size’ falls within what received wisdom calls a ‘Death Valley’. The rule of thumb from the typical B2B SaaS sales playbook of the 2010s went like this: with a price point under $1,000, you can build a sustainable SaaS business using a marketing-led growth model with self-service online purchasing; with a price point over $10,000, you can afford sales and outbound teams to find and engage customers, close deals, and take purchase orders manually. Between those two price points lies the Death Valley, where, the theory goes, no business can achieve sustainability.

At Kami, our average check size is smack in the middle of that feared killing zone. And yet we have been able to achieve both profitability and hyper growth with only $2.5M of external investment.

How? We accomplished this by evolving a new kind of user-adoption-driven, product-led B2B growth model through experiment and evidence-based decision-making; by adopting and merging both B2C and B2B techniques; and through relentless automation.

We weren’t the only ones going through this evolution in the last few years — the parallel emergence of similar fast-growing, capital-efficient B2B startups with user-driven adoption, like Zapier and Calendly, has recognizably mirrored our own journey.

But we didn’t all get here by taking a conscious decision to adopt an off-the-shelf growth model, because five years ago no such model existed. Nor did we experience a moment of epiphany or lead a revolution. Instead, it resulted from continuously analysing and extracting insights from our growth data, and making incremental changes in practice, policy and team structure.

So is this evolutionary process now over — have we settled on this business model? The data will continue to answer that question for us.

Where do we Store all this Data?

We use a number of cloud data providers, but Google Cloud Platform is our primary cloud partner. So GCP hosts our data lake, where all the data extracted from the various systems we use is stored for subsequent workflows and analysis.

Ingestion into our data lake is still a mix of automated and manual exports, downloads and API extracts — basically, whatever it takes to get it in there.
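To illustrate the analysis end of the pipe, here is roughly what pulling a table out of the lake looks like from R with the bigrquery package — assuming, purely for illustration, that the lake sits in BigQuery; the project, dataset and table names are invented:

```r
library(bigrquery)

bq_auth()   # authenticate against GCP; opens a browser on first use

# invented project, dataset and table names
sql <- "SELECT * FROM `my-project.sales_lake.subscriptions`
        WHERE created_at >= '2021-01-01'"

job  <- bq_project_query("my-project", sql)   # run the query in BigQuery
subs <- bq_table_download(job)                # pull the result into a tibble
```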

How do we use R in Sales?

We use R, and data, for four distinct purposes in the sales/growth function:

  1. Workflows automating our lead qualification and quote-to-cash processes
  2. Data cleansing, munging, and verifying data consistency across the various systems we use
  3. Data visualisation
  4. Exploratory data analysis and predictive modelling

Let’s take a look at some examples in each of these categories:

#1 Automating lead qualification and quote-to-cash processes

Automation of previously-manual activities is fundamental to our ability to scale, and when growing quickly, we need to be in a position to immediately identify new bottlenecks as they appear, and automate them away.
I often use R to prototype these workflows and to experiment, before our team implements a more robust and resilient solution; a sketch of this prototyping style follows the list below. Our automated workflows include:

  • Qualifying, classifying, and distributing new inbound leads
  • Generating quotes in the invoicing system and upserting the quote details into our CRM
  • Generating invoices, subscriptions, onboarding instructions, and licenses from purchase orders, using information from the CRM and other systems
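Here is the promised sketch of what such a prototype can look like. Everything in it — field names, scoring rules, segments — is invented; it only shows the shape of a lead-qualification step in R:

```r
library(dplyr)

# invented scoring rules: classify inbound leads by role, size and domain
qualify_leads <- function(leads) {
  leads %>%
    mutate(score = 5 * (role %in% c("principal", "it_admin")) +
                   3 * (n_teachers >= 20) +
                   2 * grepl("\\.edu$|\\.k12\\.", email_domain),
           segment = case_when(score >= 7 ~ "sales_touch",
                               score >= 3 ~ "nurture",
                               TRUE       ~ "self_serve"))
}

# distribute the high-scoring segment across reps, round-robin
assign_reps <- function(leads, reps) {
  leads %>%
    filter(segment == "sales_touch") %>%
    mutate(rep = reps[(row_number() - 1) %% length(reps) + 1])
}
```

Once a prototype like this has proven the rules are right, it is straightforward for the team to reimplement it as a production workflow.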

#2 Data cleansing and verifying inter-system data consistency

Let’s face it, salespeople are not universally as dedicated to maintaining information in the CRM as they are to selling. But the business needs accurate and consistent data, so this is too important to rely on badgering or on distracting financial incentives. In the past, you might have hired sales operations staff to regularly review and clean the CRM data. Today, we automate all that as much as possible:

  • Relating customers and opportunities in our CRM to licenses in our production database, and to quotes, POs and invoices in our invoicing system or to online subscriptions and payments in Stripe. For each critical data object we define the fields we use as a unique key across all our systems; on occasion these can get out of sync between systems.
  • Generating and assigning new subscription renewals in the CRM each year
  • Verifying data consistency across our systems: identifying missing information, errors, typos, or unauthorised changes, and correcting them or flagging them for investigation (see the sketch after this list)
  • Calculating sales commissions, including accounting for PO cancellations, credit notes, refunds, and other exceptions
  • Automatically reassigning opportunities and renewals in the CRM when sales regions are redrawn — something you have to do repeatedly as the business, and the sales team, grows
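A minimal sketch of the consistency-checking idea, using dplyr’s joins; the file names and key columns are illustrative, not our actual schema:

```r
library(dplyr)

# invented extracts: each system keyed by the same license_id
crm      <- read.csv("crm_export.csv")        # opportunities from the CRM
invoices <- read.csv("invoice_export.csv")    # invoices from the billing system

# records present in one system but missing from the other
missing_in_crm      <- anti_join(invoices, crm,      by = "license_id")
missing_in_invoices <- anti_join(crm,      invoices, by = "license_id")

# records whose key fields disagree between the two systems
mismatched <- inner_join(crm, invoices, by = "license_id",
                         suffix = c("_crm", "_inv")) %>%
  filter(tolower(customer_crm) != tolower(customer_inv))
```

The anti-joins surface records to create or flag; the inner join surfaces conflicts a human may need to investigate.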

#3 Data visualisation

  • Calculating and forecasting the usual SaaS metrics, plus the new metrics critical to our particular business model
  • Charting funnel metrics and sales results, with historical comparisons and trend lines, from a range of perspectives to help identify patterns or anomalies (a minimal example follows this list)
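For example, a faceted trend chart with historical comparisons takes only a few lines of ggplot2. The funnel data frame and its columns are invented for illustration:

```r
library(ggplot2)

# invented data frame 'funnel' with columns: month, stage, count, year
ggplot(funnel, aes(x = month, y = count, colour = factor(year))) +
  geom_line() +
  geom_smooth(method = "lm", se = FALSE, linetype = "dashed") +  # trend line
  facet_wrap(~ stage, scales = "free_y") +    # one panel per funnel stage
  labs(title = "Funnel metrics by stage, year over year", colour = "Year")
```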

#4 Exploratory data analysis and predictive modelling

While R is an extremely effective tool for data cleaning and munging, it is an even more powerful language for data modelling. Its power lies in the wide range of machine learning and analytical techniques the R community has built over the years. We generally run these models, scripts, and ad hoc explorations of data in RStudio.

  • Predictive modelling and analysis — often sales forecasts for planning purposes (a minimal forecasting sketch follows this list)
  • Ad hoc data projects. Recent examples include mapping customer locations by zip code to explore evidence of word-of-mouth propagation patterns IRL, and a calculation showing that if all the documents loaded each month by Kami users were printed out, the resulting stack would reach beyond the altitude of the International Space Station.
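A typical forecasting sketch, assuming a monthly revenue series and using the forecast package; the data frame is hypothetical:

```r
library(forecast)

# invented data frame 'monthly_arr' holding three years of monthly new ARR
arr <- ts(monthly_arr$amount, start = c(2018, 1), frequency = 12)

fit <- auto.arima(arr)         # let the package choose a model
fc  <- forecast(fit, h = 12)   # project the next 12 months
plot(fc)                       # point forecast with confidence bands
```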

Where, How and When?

There are three different ways we apply R code, depending on its purpose:

  • Cron jobs that run hourly, daily, weekly or monthly — typically workflow and cleansing jobs.
  • Occasional scripts and ad hoc exploration of data, run in RStudio
  • R Markdown or Shiny for presentations, live dashboards, and interactive tools where non-technical team members can explore aggregated datasets (a minimal Shiny sketch follows this list)
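For the dashboard case, a Shiny app can be as small as the sketch below; the sales data frame and its fields are invented:

```r
library(shiny)
library(ggplot2)

# invented pre-aggregated data frame 'sales': region, month, bookings
ui <- fluidPage(
  selectInput("region", "Region:", choices = unique(sales$region)),
  plotOutput("trend")
)

server <- function(input, output) {
  output$trend <- renderPlot({
    ggplot(subset(sales, region == input$region),
           aes(month, bookings)) +
      geom_col()
  })
}

shinyApp(ui, server)
```

Because it only exposes aggregated data, a tool like this can safely be shared with the whole team.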

R and the Copper CRM Platform

I couldn’t find an API package for accessing the data in the Copper CRM system we use, so I wrote and published one. If you’re also using Copper, this blog post provides an introduction: https://bobvondrummond-53692.medium.com/copperr-my-r-package-for-copper-crm-55ea2f7910e2, and you can find the R package to install from GitHub here: https://github.com/fatkahawai/copperr. In addition to the data extraction functions, the package provides useful CRUD functions for automating your data cleansing, migrations and manipulations.
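Installation is straightforward from GitHub; a minimal sketch, assuming you have the remotes package. Note that the extraction call shown is illustrative only — the actual function names and configuration (API key, user email) are documented in the package README:

```r
# install the package from GitHub (requires the remotes package)
remotes::install_github("fatkahawai/copperr")
library(copperr)

# configuration (API key, user email) and the real function names are in
# the package README; the call below is a hypothetical illustration of
# the kind of extraction the package supports, not a documented signature
# opportunities <- cppr_getOpportunities()
```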

Advice for Data Mungers and Sales Leaders

We have certainly learned a lot from this journey — we ourselves have fully embraced the “lifetime of learning” coming to everyone in the 21st century!

I feel I should wrap up with some recommendations to other sales leaders and data practitioners who are interested in using R to help their team make the most of their data:

First, I recommend you do at least some of your data exploration and analysis yourself. If you don’t know how, now’s a great time to learn. It’s easy to delegate to someone else to “go run a data script” or “build a dashboard”, but what others can’t bring to the task is your experience and understanding of the context and key drivers of success — priceless when identifying meaningful patterns and extracting actionable insights from data.

I recommend building, as early as possible, a library of helper functions to connect R to the various systems that hold your data, such as your CRM and production databases.
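These helpers don’t need to be fancy. Here is a sketch of the idea, with invented names and a BigQuery-backed lake assumed purely for illustration — every later script then starts from clean data frames instead of raw API calls:

```r
# helpers.R — invented wrappers, source()'d at the top of every script

library(bigrquery)
library(httr)

# read one table from the data lake (BigQuery assumed for illustration)
read_lake <- function(table, project = "my-project") {
  sql <- sprintf("SELECT * FROM `%s.sales_lake.%s`", project, table)
  bq_table_download(bq_project_query(project, sql))
}

# generic authenticated GET against an internal REST endpoint
get_api <- function(path, base = "https://api.example.com") {
  resp <- GET(paste0(base, path),
              add_headers(Authorization = paste("Bearer", Sys.getenv("API_TOKEN"))))
  stop_for_status(resp)              # fail loudly on HTTP errors
  content(resp, as = "parsed")       # parsed JSON as R lists
}
```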

When you can’t find an R package for accessing a particular system, create and publish one for the community.

Also consider creating an internal R package to share code across your team and encourage a culture of collaborative tool-building. Before building that internal package, agree a fundamental data model across the business.

We recommend maintaining your scripts in a GitHub repository, and making full use of R Markdown, ggplot2, and Shiny. Finally, remember that R is not the only language for data work, and allow your team to explore data using other tools and languages. Happy data munging!


Bob Drummond

Father, partner, engineer, artist, and fisher. Exec Chairman, Revenue guy & Co-founder at Kami.