Freemium Economics: Leveraging Analytics and User Segmentation to Drive Revenue (2014)

Chapter 5. Lifetime Customer Value

This chapter, “Lifetime Customer Value,” takes a thorough look at one of the most fundamental concepts in the freemium model: lifetime customer value, or the total value of each customer. Lifetime customer value is an important consideration in freemium products, where user acquisition costs can impact the degree to which products can profitably grow. The chapter begins with an overview of lifetime customer value at a conceptual level, describing its component parts and what it is used for in an organization. The chapter then proceeds into a set of methodologies that can be used to calculate lifetime customer value: the spreadsheet method, in which the product team collects data about the product in a spreadsheet and constructs a macro-level, top-down assessment of how much each user is worth, and the analytics method, in which sophisticated algorithms are utilized. The chapter ends with a guide to making the best use of the lifetime customer value metric in product design, marketing, and product portfolio management.

Keywords

user acquisition; retention; user value; regression; freemium product development; analytics model; lifetime customer value analytics; discounted LTV

Lifetime customer value

In the freemium model, full intellectual ownership of data is the only means through which optimized improvements can be made to the product in development iterations. The concrete action points that user data produces should direct the product team's efforts.

Analytics is used to improve the various behavioral metrics that correlate with product enjoyment. But increasing the value proposition of a product requires some insight into the extent to which a user can be expected to contribute revenue; for some users, revenue contributions are not realistic, and thus the behavioral data they produce doesn't inform revenue-oriented product iterations. To that end, in order to optimize the product's value proposition to the user, the product team requires some estimation of that user's value proposition to the product. In other words, what can a specific user or group of users be expected to contribute in revenue? This measure is known as lifetime customer value (most commonly denoted as LTV, although the acronym doesn't correspond to the first letters of each word).

Lifetime customer value and the freemium model

LTV is defined as the present value of all future cash flows attributable to a user for a given product. Put another way, LTV is the total amount of money a user is expected to spend on a product, adjusted as if that money were received as a lump sum today. The LTV equation was most popularly defined in "Modeling Customer Lifetime Value," a 2006 article by Gupta et al. whose authors included two prominent academics in the field of LTV estimation, Bruce Hardie and Dominique Hanssens. The equation the article uses to define LTV is depicted in Figure 5.1. Hardie and another academic, Peter Fader, have contributed perhaps the greatest volume of practical, actionable literature to the academic study of lifetime customer value within the field of marketing.

FIGURE 5.1 The lifetime customer value (LTV) calculation, as defined by Gupta et al.
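For reference, the commonly cited form of the equation from the Gupta et al. article is

$$\mathrm{LTV} = \sum_{t=0}^{T} \frac{(p_t - c_t)\, r_t}{(1+i)^t} - \mathrm{AC}$$

where $p_t$ is the price paid by the customer at time $t$, $c_t$ is the direct cost of servicing the customer at time $t$, $r_t$ is the probability that the customer is still active ("alive") at time $t$, $i$ is the discount rate, $\mathrm{AC}$ is the acquisition cost, and $T$ is the time horizon of the estimation.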

This conceptual description measures LTV over a predetermined, finite period. In the case of the freemium model, as with most practical applications of LTV, the period over which a user will remain active is unknowable. And while the LTV equation does a capable job of relaying what, exactly, the LTV metric measures, the implementation of a mechanism for calculating LTV relies very little on a mathematical equation.

The aggregate sum of the LTV metrics for all current and future users is known as customer equity (CE), which is useful mostly as a component of a firm's enterprise value. Customer equity is rarely used as a decision-making apparatus at the product level; more commonly, it is utilized by early-stage technology companies to derive a transparent company valuation ahead of a round of financing. Since the value of a firm is the sum of all of its projected future revenues (in addition to various intangibles like goodwill and brand value), the customer equity value provides a quantitative framework for producing a valuation for a young company.

In a freemium scenario in which revenues are composed, at least in part, of recurring purchases such as subscription services, estimating LTV is fairly straightforward. It is simply the user's expected lifetime multiplied by the recurring price of that user's subscription. But in freemium products with very large, diverse product catalogues designed to take full advantage of freemium dynamics, estimating LTV is resource-intensive and prone to error—and potentially very rewarding.

The analytics infrastructure required to launch freemium products fulfills the preconditions for estimating LTV: behavioral data around spending patterns, demographic data, usage statistics, etc. The barrier most firms face in employing predictive LTV models is twofold: (1) experts who deliver actionable predictive models of this nature are in short supply, and (2) given their complexity, LTV models can be described using only black box argumentation, which many firms are loath to pay for.

Nonetheless, understanding LTV at a conceptual, if not quantitatively actionable, level is helpful in making marketing and, to a lesser extent, product development and prioritization decisions. LTV, as a deterministic measure of what a user will contribute to the product in its current state, explains the return on marketing efforts from a customer-centric rather than a product-centric perspective, providing depth to the process by which the product is grown and iterated upon. Whether or not a product team has the resources to predict LTV with authoritative accuracy, as a theoretical metric it serves to guide the product in the direction of highest possible return.

Making use of LTV

Commanding LTV to the degree that an accurate value can be calculated and audited isn’t a necessity within the freemium model. But deriving a numerical LTV metric is a valuable task for any firm producing freemium products. The thought exercise orients development around the constraints of return on investment, and it provides the firm some guidance for developing a marketing budget. So while precision in deriving an LTV metric isn’t a requirement, the process itself should be; LTV forms the backbone of marketing freemium products and helps inform decisions around continued development. The capability of calculating an accurate LTV metric is only truly useful under circumstances in which users can be shifted from one product to another based on their predicted LTVs. This practice is known as cross-promotion.

Cross-promotion can be between either two products owned by the same firm or one firm’s product and another firm’s product. The latter method is usually done in exchange for money; in effect, one firm “sells” a user to another firm based on the user’s predicted LTV in the first product or on the basis of a partnership, in which both firms agree to the terms of a trade.

When a firm estimates that a cross-promoted user will generate less revenue in the product through purchases than the user’s sale price or trade value, then cross-promoting that user to another firm’s application is an economically rational choice. Cross-promotion is generally the domain of large firms with either diverse product catalogues (so that users can be cross-promoted within the firm’s system of applications) or sophisticated analytics systems (so that the exact value of the user can be estimated with confidence before the user is sold or traded).

Understanding whether a user is a good candidate for cross-promotion to outside the firm’s system of applications—in essence, sold to another firm—requires that the cross-promotion sale price exceeds the user’s expected LTV and that the sale price exceeds the implicit value of keeping the user in the firm’s system as a non-paying user. Recall that all users, whether they contribute revenue to the product or not, have value.

Small firms generally do not possess the infrastructure or domain expertise needed to accurately predict LTV. But small firms—indeed, all firms—building freemium products benefit from understanding the mechanics of LTV, because understanding the principal components of LTV and their directional impact on its value allows for building products that maximize the amount of money users spend on them.

Since the freemium development process is iterative and data-driven, forward progress in LTV maximization can be measured even if the final metric isn’t precisely estimated. Understanding the exact LTV value of a user is only valuable if that user is to be sold; understanding, in the abstract, what can be done to increase the LTV of a user always provides valuable insight that leads to the development of freemium products that are better monetized.

LTV can be calculated in any number of ways and, given that any predictive metric possesses at least some level of error, none is perfect. But no matter what calculation methodology is used, the primary components are always the same: retention and monetization. The retention metrics describe how long a user is predicted to remain active within a product; the monetization metrics can be extrapolated to describe how much money that user is expected to spend on the product.

Even though LTV is not precisely accurate, it belongs in the stable of freemium metrics because it represents the confluence of all of the minimum viable metrics, for which increased and continued accuracy should always be a goal. LTV is predictive, and any predictive model is prone to error; retention and monetization are descriptive and should be measured with complete confidence. So, in pursuing an ever more accurate LTV measurement, the product team is implicitly ensuring that its retention and monetization metrics, on which many product decisions are based, remain trustworthy. This side effect, in and of itself, is well worth the effort put into modeling LTV.

LTV in, LTV out

Traditionally, the concept of LTV has been useful primarily for establishing a baseline understanding of how much money can be spent, on a per-person basis, to acquire new users. This aspect of LTV certainly holds true in the freemium model. It is a useful benchmark for setting the marketing budget, which determines how much money can be spent in user acquisition. But the dynamics of freemium product development, and the presupposition of at least a minimally effective analytics infrastructure upon launch of a freemium product, contribute another dimension of utility to the LTV metric: a customer segment benchmark. Users can be evaluated on their way into a freemium product’s user base, and they can also be evaluated on their way out (i.e., upon churn), thus assisting in the segmentation process and shaping product development.

Performance user acquisition—paid marketing campaigns undertaken with the sole purpose of introducing new users into a product’s user base (as opposed to building brand awareness or reengaging existing users)—is wholly dependent on at least a basic, reasonable estimate of LTV. If no quantifiable understanding exists of the amount of money any given user, acquired through any given channel, will contribute to a product, then performance user acquisition is conducted blindly without regard to return on investment. Given the dynamics of the freemium model, in which the vast majority of users are expected to not contribute to the product monetarily, blind user acquisition campaigns have the potential to result in disastrous, near-total losses.

The LTV metric is the basis for performance user acquisition because it represents the maximum price that can be paid to acquire a user without incurring a loss. In some cases, a firm may pursue user acquisition on a loss basis to build an initial user base or generate publicity preceding some sort of liquidity event or fund-raising initiative. In other cases, a firm may undertake a loss-producing user acquisition campaign to lift its user count ranking on a platform store to a highly prominent position, with the expectation that the product's discovery will benefit from greater visibility (and thus yield an appreciable degree of organic adoption). But these are calculated initiatives and usually take place only on a short timeline; no marketing professional should engage in user acquisition without being able to identify an informed, defensible estimate of LTV.

For an LTV estimate to qualify as reasonable, it should be calculated around a set of behaviors that lend themselves to monetization. Demographic data such as age, gender, profession, or level of disposable income isn’t enough; these describe broad characteristics that don’t necessarily relate to a user’s relationship with a specific product or a user’s propensity to spend money on that product. Demographic data plays a role in a diverse, thorough array of features used to segment users, but it should not be relied on exclusively to derive an LTV metric.

Performance user acquisition campaigns are generally undertaken on the basis of volume, with a specific number of advertising impressions being purchased for a set price. This type of advertising campaign is known as cost per mille (CPM), where mille represents one thousand impressions. In a CPM campaign, an advertiser sets a bid price, or a price it is willing to pay for the opportunity to display an ad, and the entity brokering the advertising campaign shows only the advertiser’s ad to parties willing to sell ad displays at that price or lower. A bid does not represent the price of acquiring a user in a CPM campaign; a bid in this context merely represents the price paid to show an ad.

When an ad is shown to a potential user, the user chooses whether or not to click on it; the historical proportion of people who see an ad and subsequently click on it is known as its click-through rate, or CTR. Once a user has clicked on an ad, that user generally must complete some further action to be considered acquired—usually either registering with the product or installing it. This final acquisition threshold (usually named by the action it is measuring, e.g., install rate or registration rate) is measured as a percentage of total users who see the ad and is almost always lower than CTR, given that some users will discover the product but have no interest in it.

Some advertising networks undertake user acquisition on a cost per acquisition (CPA) basis; that is, the advertiser submits bids not for an ad display but for product adoption. This means that the advertiser only pays for a user who has successfully and verifiably interacted with the product at least once. Advertising networks facilitate this by using algorithmic optimization with the aim of achieving click-through and install rates for an ad campaign that match the CPA bid submitted by the advertiser. In cost-per-acquisition campaigns, interplay between LTV and the marketing budget is straightforward: the price paid for each user’s acquisition cannot exceed LTV. A full table of an advertising campaign’s metrics is shown in Figure 5.2.

FIGURE 5.2 An advertising campaign performance overview.

In the figure, CTR is measured by dividing ad clicks by ad views. Similarly, install rate is calculated by dividing installs by ad views. CPA is calculated by dividing the campaign's total cost by the number of installs: it measures how much each install costs, given the ad's performance. Performance marketing is conceptually grounded in achieving an acquisition price that is less than a product's LTV; when this does not hold true, an advertising campaign is being run at a loss.
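As a concrete illustration, the arithmetic behind these campaign metrics can be sketched in a few lines of Python; all of the figures below are hypothetical placeholders, not values taken from Figure 5.2.

# A minimal sketch of the campaign arithmetic described above.
# All numbers are hypothetical placeholders.

ad_views = 100_000       # impressions purchased
ad_clicks = 2_500        # users who clicked the ad
installs = 500           # users who installed the product
campaign_cost = 1_000.0  # total spend, in dollars

ctr = ad_clicks / ad_views          # click-through rate
install_rate = installs / ad_views  # final acquisition threshold
cpa = campaign_cost / installs      # effective cost per install

estimated_ltv = 2.50  # hypothetical per-user LTV estimate, in dollars

print(f"CTR: {ctr:.2%}, install rate: {install_rate:.2%}, CPA: ${cpa:.2f}")
# The campaign runs at a loss whenever CPA exceeds estimated LTV.
print("Profitable" if cpa < estimated_ltv else "Running at a loss")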

Once a user has been acquired, the estimated LTV used to acquire that user is no longer relevant to that user’s behavior; by its nature, LTV is predictive, and as a description of any particular user it is only useful in determining the price that user should be acquired for. But that user’s predicted LTV becomes relevant again once that user churns out of the product; it becomes a measure of how accurate the LTV prediction mechanism was in the first place. (See Figure 5.3.)

FIGURE 5.3 A diagram of LTV estimation, user lifetime, and model optimization.

When a user churns from a product, an actual LTV can be calculated by summing the total value of purchases over the user’s lifetime. The difference between the user’s actual and initially predicted LTV metrics represents the error present in the LTV estimation mechanism.

A churned user’s actual LTV should be contrasted with that user’s behaviors to hone the LTV estimation mechanism evaluating users entering the system. Actual LTV metrics should cycle back into the model that is used to formulate estimated LTV in order to increase its predictive capability. This cycle, whereby predictions become more accurate over time as more data becomes available to disprove those predictions, is a distinct feature of the freemium model, in which total instrumentation delivers compounding benefit.
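A minimal sketch of this error measurement, assuming hypothetical per-user pairs of predicted LTV (recorded at acquisition) and actual LTV (purchases summed at churn):

# Hypothetical churned-user records: (predicted LTV, actual LTV at churn).
churned_users = [(2.50, 0.00), (2.50, 9.99), (1.20, 0.00), (4.00, 2.99)]

# The signed error reveals whether the model systematically over- or
# under-predicts; the absolute error measures overall inaccuracy.
errors = [actual - predicted for predicted, actual in churned_users]
mean_signed_error = sum(errors) / len(errors)
mean_absolute_error = sum(abs(e) for e in errors) / len(errors)

print(f"Mean signed error: ${mean_signed_error:.2f}")
print(f"Mean absolute error: ${mean_absolute_error:.2f}")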

Retention versus acquisition

One reason understanding the value of a user in a product’s user base is important is that product teams should be equipped to make quantitative decisions about the lengths they should go to retain users. In almost all businesses, retaining an existing user is far more economical than acquiring a new user, and the freemium business is no exception. What is different, however, is that, given robust analytics, a respectable estimate can be made about every user’s potential total value to the product, given that user’s archive of behavioral data.

Acquiring a new user involves uncertainty. A person is only within the scope of a product’s analysis regarding likelihood to spend once the person crosses a product’s “event horizon” by interacting with the product for the first time. A user outside of that scope, for whom the product team has no meaningful understanding of likelihood to spend, represents a position on a very wide spectrum of behavioral profiles, most of which do not lead to direct contributions of revenue to the product. Thus, a user with an LTV of $0 who is engaged with the product is still more valuable to the product than a single newly acquired user, given three statements of fact:

1. The likelihood of any one newly acquired user spending money is low; the likelihood of that user spending vast sums of money approaches zero.

2. The data the product has about the existing user holds value in terms of its utility in creating more predictive models about future users. Again, the new user is a mystery; tenure cannot be predicted, and all things being equal, data about a user with a long tenure is more valuable than data from a short-tenured user (because longer tenures in freemium products are scarcer).

3. The new user must be acquired, which represents an expense. Even if the existing user was acquired through a paid campaign in the past, that cost is sunk and thus not relevant to the current decision point. Generally, in software, the cost of retaining a single user versus the cost of acquiring a new user is negligible.

While it is somewhat convoluted, the decision point in choosing to replace (via paid acquisition) or retain any given user will almost always favor retention. Retention contributes to LTV not only in its calculation but in its accuracy; a greater volume of data about users who have remained engaged with a product produces a more robust and actionable LTV metric.

Discounting LTV

Because LTV is a calculation of a stream of revenues over a variable length of time (the user's lifetime), its numerical value at the time of estimation isn't strictly the sum of the estimated revenue contributions projected into the future. This is because money has a time value owing to the opportunity cost of allocating finite resources. Money invested into one set of revenue streams is not invested into another, and thus the return of each stream of revenues must be compared in determining the true value of each stream with respect to the others. In other words, money received today is more valuable than money received in the future because money received today can be invested in interest-bearing activities. The opportunity cost of not allocating money to an activity that could produce returns must be taken into account when making financial decisions. This concept deserves consideration when formulating an LTV calculation.

The time value of money is incorporated into corporate decision-making and project financing by determining the minimum amount of money an investment should earn to make the investment preferred over not investing. This is usually quantified by calculating what is known as a risk-free rate, which is the rate of return an investment would generate if placed in risk-free government bonds (so-called because they face an almost nonexistent risk of defaulting). The rate of return generated by an investment (into product development, research, marketing, etc.) should exceed this risk-free rate; if it doesn’t, it is not a favorable choice because investing the money into risk-free government bonds would yield a larger eventual return.

The process of assessing a stream of cash flows in terms of their values today is known as discounting, and the rate by which the value of a future payment is reduced to a value in the present is known as the discount rate. Calculating the present value of a stream of future cash flows is relatively simple, involving only basic algebra. The formula for the calculation is presented in Figure 5.4. The output of the formula is an amount of money that a person should consider equivalent to the value of the payment expected in the future.

FIGURE 5.4 The equation to calculate the present value of a stream of future cash flows.
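For reference, the standard form of this calculation discounts each future cash flow by the discount rate, compounded per period:

$$\mathrm{PV} = \sum_{t=1}^{N} \frac{CF_t}{(1+r)^t}$$

where $CF_t$ is the cash flow expected in period $t$, $r$ is the discount rate, and $N$ is the number of periods over which cash flows are received.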

Like many other aspects of the LTV calculation, some controversy exists around the idea of discounting future revenues from users in order to produce a present-value LTV metric. Most academic surveys of the LTV calculation advocate for discounting, especially as they relate to traditional businesses such as subscription services, in which users pay the same amount on a regular basis. Likewise, many businesses of this nature can take advantage of discounting through project finance methodology: choosing the most profitable projects out of a broad portfolio of opportunities. But these characteristics don't generally apply to freemium businesses. For one, given the typical freemium product catalogue, revenues are anything but regular and consistent, thus representing a challenge in estimating the per-period value of any user's revenue contributions. Also, most freemium products are their firms' sole focus.

The decision to fund an aspect of freemium product development based on a comparison with simply leaving the money in an interest-bearing security or money market account isn’t relevant, as most firms developing freemium products have a mandate, either from shareholders or investors, to fully utilize the firm’s specific expertise in developing freemium products. Massive industrial conglomerates generally have capital markets groups that generate return through investment activities with the firm’s capital reserves; most freemium product developers have no such facility and are instead dedicated to building freemium products.

Discounting the LTV metric presents a needless complication in making product and user acquisition decisions. When LTV is discussed henceforth in this book, the calculation undertaken is not discounted and represents a simple sum of user purchase values.

Calculating lifetime customer value

A number of quantitative methodologies exist through which an informative and actionable LTV metric can be calculated. The complexity present in an LTV calculation generally corresponds to its purposes; when used as an estimate of return on a proposed product feature to inform a development decision, LTV has to be exact only with respect to magnitude. When used to set a budget for user acquisition, however, the LTV metric should be dependable enough to produce positive returns on marketing investments.

Any LTV calculation is constructed by multiplying the expected lifetime of a user, in periods (where a period can be any length of time: day, week, month, etc.), by that user's expected revenue contributions per period. This formulation is fairly intuitive, and its two components can be easily explained (even if the calculations producing the components are harder to interpret), alleviating the need to engage in black box argumentation when explaining why and how the LTV metric should be used in making decisions.
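Expressed as a simple formula, this is

$$\mathrm{LTV} = \mathbb{E}[\text{lifetime in periods}] \times \mathbb{E}[\text{revenue per period}]$$

where the first factor is driven by retention and the second by monetization.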

The component parts of LTV, like any other relevant metric, must be computable around various dimensions for the purposes of user segmentation. A global LTV metric may be useful in making financial forecasts, but it can’t substantially contribute to marketing operations. The dimensions used to aggregate the LTV metric must reconcile with those generally used when acquiring users: location, age, sex, platform, interests, etc. The level of granularity required of an LTV metric obviously depends on what can be accomplished through that granularity, but the dimensions used to calculate group LTV metrics should be driven by business need and practicality.

In this book, I discuss LTV calculation as a possible implementation of two different frameworks, each corresponding to a different level of complexity and company tolerance for ambiguity (i.e., black box argumentation). As mentioned earlier, there exists no perfect LTV calculation; any estimation of what a user might spend on a product in the future is a prediction and is thus susceptible to bias, influence from unforeseeable externalities, and user error.

But formalizing a quantitative approach to LTV estimation is important because it serves to bookend the user experience through the lens of data-driven optimization: the user is acquired into a product with an LTV determination and creates an LTV artifact upon leaving, which is then used to optimize the system. A focus on LTV from the product standpoint places an impetus on the entire development process to quantify and evaluate the user experience on the basis of financial return.

This book presents two methodological approaches to estimating LTV: the spreadsheet method, whereby the calculation and its required parts are manipulated entirely within desktop spreadsheet software, and the analytics method, whereby the calculation and all related data are housed in an analytics infrastructure and cannot be updated or altered without the assistance of an engineer. Both methods are valid for different reasons, and the appropriate choice for a given product depends on the product team’s resources and the organization’s tolerance for abstraction and inaccuracy.

In either case, this chapter does not identify a specific formula to use to calculate LTV; the construction of a formula is heavily related to the needs of the product being estimated for and cannot be prescribed. But each method discussed presents a framework for identifying the critical inputs to LTV and a broad, abstracted formula for deriving its value.

The spreadsheet approach

The spreadsheet approach to estimating LTV operates from the position that LTV is the average revenue per period multiplied by the average number of periods in a user lifetime. This approach precludes the possibility of comprehensive revenue or engagement segmentation, given that spreadsheets can accommodate only a few million rows of data at most; instead, predetermined aggregates are used to calculate LTV metrics for very large, very broad user segments, such as those based on country.

The advantages to this approach relate to communication and flexibility. Spreadsheets are corporate poetry; when constructed elegantly enough, they can be used to communicate sophisticated ideas to audiences who wouldn’t otherwise be receptive to details. The spreadsheet approach sidesteps the black box argumentation minefield completely, since it is entirely self-contained and transparent: calculations are auditable and verifiable, and the impact of changes can be seen instantaneously.

And while the data present in a spreadsheet model may itself be aggregated output from an analytics system, it is generally not manipulated beyond simple counts or averages. The agreeability of a method’s medium may seem like a trivial benefit, especially when compared to a medium’s ability to produce granular results, but it shouldn’t be understated; information can’t be used to influence decisions unless it can be parsed and interpreted by decision-making parties. For better or worse, spreadsheets are a fundamental pillar of the modern corporate structure, and ignoring their existence doesn’t change anyone’s expectations about how data will be presented.

Another benefit of using a spreadsheet to calculate LTV is the spreadsheet's experimental nature; adjusting a calculation results in instant feedback and allows the relationship between variables to be observed quickly and without much development effort. This flexibility is valuable; understanding the dynamics of customer value allows decisions about product strategy and priority to be made with return in mind. The spreadsheet approach allows the product team to model alternate scenarios or adjust assumptions without the assistance of an engineer. This means that while the spreadsheet approach's LTV calculation is potentially less precise than what could be produced programmatically, it has greater conceptual gravity within the organization because people have a better understanding of its dynamics.

The downside to the spreadsheet approach is a loss of granularity and accuracy. Given that a spreadsheet can't match the data capacity of a database, an LTV metric derived from a spreadsheet can't possibly match the breadth of one calculated programmatically from within an analytics infrastructure. This downside relates directly to the confidence with which an LTV calculation can be used to make decisions, especially those around setting budgets for customer acquisition. The spreadsheet approach thus demotes the LTV metric from a decision point to a decision influencer; a precise LTV metric is itself actionable, but a loosely estimated LTV metric is simply one element of a portfolio of data points that must be considered, especially when making a decision about user acquisition.

This trade-off between communicability and reliability appears frequently as a result of the freemium model’s data-reliant nature, in which metrics are the basis of decisions but black box argumentation isn’t persuasive. The spreadsheet approach is valuable in its ability to serve as a staging ground for product decisions and a way to observe the relationship between a product’s mechanics and revenue, but the spreadsheet model itself cannot represent the sole basis for decisions.

Constructing the retention profile in a spreadsheet

The retention profile, as discussed in Chapter 4, presents a template for use patterns following a fairly standard decay pattern: 50 percent of the users who use the product one day after first engaging with it (day 1 retention) can be expected to return seven days after first engaging with it (day 7 retention), and 50 percent of those users can be expected to return twenty-eight days after first engaging (day 28 retention). This decay pattern isn’t universally applicable, and in most cases, it probably doesn’t describe the exact retention values for users one, seven, and twenty-eight days after first using a product.

But the retention profile gives the product team a framework for estimating freemium product use, which is valuable. This framework can be used to drive assumptions about use patterns before the product is launched; once the product is producing data, it is no longer necessarily needed, as user behavior can speak for itself through measurement. Thus, once a product is live, its retention profile becomes the actual observed measure of retention following first interaction, from day 1 through the end of the product’s life (although, in practice, retention is often only valuable through day 365). When plotted, these measurements take the shape of a curve, and the area under the curve can be used to calculate the lifetime engagement of the average user (given that the retention metrics comprising the points on the curve are average values over a sample of the population). The curve can be constructed as measurements from any user segment; the analytics system should capture retention metrics for each user as binary count values (1 or 0, corresponding to whether or not a user interacts with the product on a given day). These counts can be summed across any number of aggregated dimensions; dividing the sums by the number of new users (DNU) for that set of aggregates provides the retention percentage for that user segment.
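As a sketch of this bookkeeping, assuming a hypothetical mapping of users to binary retention-day flags, a segment's retention percentages fall out of simple sums divided by DNU:

# Deriving day-N retention percentages from binary flags, as described
# above: per-user daily flags are summed across the segment and divided
# by the number of new users (DNU). The user records are hypothetical.

retention_flags = {
    "user_a": {1: 1, 2: 1, 7: 1},  # returned on days 1, 2, and 7
    "user_b": {1: 1},              # returned on day 1 only
    "user_c": {},                  # never returned after first interaction
    "user_d": {1: 1, 2: 1},        # returned on days 1 and 2
}

dnu = len(retention_flags)  # new users in the cohort being measured

for day in (1, 2, 7):
    returned = sum(flags.get(day, 0) for flags in retention_flags.values())
    print(f"Day {day} retention: {returned / dnu:.0%}")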

For the LTV calculation, the retention profile should be composed of retention values over the range of days from 1 to 365 (keeping in mind that day 365 represents the retention profile’s terminal value) because, in order to be actionable, the LTV model should encompass a user’s entire lifetime; in order to be predictive, it must describe any new user’s end state.

In the spreadsheet model, the retention profile is curve-fitted onto the standard 50 percent decay pattern and projected through day 365 until enough data has been collected to supersede these assumptions. This allows for a one-year retention profile (and thus a user lifetime estimate) to be constructed from day 1 retention data only.

The first step in constructing the retention profile in a spreadsheet is building a template that can be updated in the future, when more data becomes available. The retention profile consists of only two inputs for a user segment: a timeline of days since first interaction (called retention days) and the percentage of the user base that interacted with the product on those given retention days (called retention percentage). These inputs relate to all users in the segment being considered, not just users who joined the product on a specific day; in other words, the values are summed across the entire user segment being measured, regardless of the first day of interaction.

A good way of building this template is to use the columns of the spreadsheet to represent the retention percentages and the rows of the spreadsheet to represent retention days, as shown in Figure 5.5. The benefit of this format is that additional retention profiles can easily be added to the template by copying an entire column, pasting it into a blank column, and adjusting the relevant retention values.

FIGURE 5.5 A spreadsheet template for retention data.

Given the limited amount of data available, at least one of the day 1, day 7, and day 28 retention cells should be input cells, meaning the cell's value shouldn't be determined by a formula but rather input directly. When the product is new and has not yet collected much user data, only day 1 retention will receive input directly; days 7 and 28 are calculated from day 1 using the 50 percent decay pattern (day 7 retention = day 1 retention × 0.5, and day 28 retention = day 7 retention × 0.5). As more data becomes available, the values of all retention days for which data exists should be input directly.

The initial estimated retention profile relies on the values for days 1, 7, and 28; once these cells have values, a terminal value should be determined for day 365. There is no general decay pattern that provides a rough estimate for the percentage of users who will retain in a product through day 365 and beyond; estimating the terminal value relies solely on intuition.

A value of 1 percent for day 365 retention would reflect a reasonably committed but not fanatical user base. The estimate of day 365 retention represents the retention profile’s right-most endpoint; once this has been input, the boundaries of a discrete curve have been put in place, and the curve can be constructed.

Keep in mind that day 365 retention represents the percentage of users who retain with a product for at least a year; as a terminal value, day 365 retention describes the percentage of users who interact with the product indefinitely after day 365, not specifically on day 365. In other words, the retention profile can be thought of as two separate curves. The first is the retention curve through day 365, composed of large values near the origin and descending toward zero on the x-axis as “retention days” approaches 365, and the second is the retention curve from day 365 through infinity, starting from a very small number and steadily shrinking as “retention days” approaches infinity.

The terminal value at day 365 represents the area under the second curve, and it can be shown as a discrete value at the endpoint of the first curve in a process that would be equivalent to modeling retention through infinity. This is the first of a number of computational limitations of the spreadsheet approach.

With its endpoints set, the retention profile’s curve is built through a process called linear interpolation, whereby the known points on the curve are connected via a linear equation. Three linear equations are required: one for the line drawn between the day 1 and day 7 retention points, one for the line drawn between the day 7 and day 28 retention points, and one for the line drawn between the day 28 and day 365 retention points.

Calculating the values for each retention day between days 1, 7, 28, and 365 requires the calculation of step-down values, or the amounts by which the retention percentage decreases with each retention day over the limited continuums between points. A step-down value is calculated once for each linear equation; it is simply the difference between the endpoint values divided by the distance between the endpoints (in this example, the number of days between the relevant retention days). The step-down value can be thought of as the slope of the line connecting the two endpoints, expressed as a per-day decrement. The equation in Figure 5.6 is used to calculate the step-down value for the linear equation between retention days 1 and 7, where day 1 retention is 80 percent and day 7 retention is 40 percent.

FIGURE 5.6 The equation to calculate the step-down value between day 1 and day 7 retention values.
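Reconstructed from the values given in the text, the calculation is

$$\text{step-down} = \frac{80\% - 40\%}{7 - 1} \approx 6.7\%$$

that is, the 40-percentage-point drop between day 1 and day 7 is spread evenly across the six intervening daily steps.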

This result reveals that the retention percentage decreases by 6.7 percentage points for each retention day between day 1 and day 7. To apply this step-down value to the cells between the two days, the formula must be adapted to spreadsheet format and inserted into each cell in column C between rows 4 and 8. The spreadsheet formula must indicate that the step-down value is being subtracted from the cell above it; for example, it should show that day 2 retention is equal to day 1 retention minus the step-down value. Given that the step-down value should be dynamic—it should update automatically if day 1 retention is changed—the formula for the step-down value must be included in each cell. Therefore, the formula in Figure 5.7 is input into cell C4.

FIGURE 5.7 The step-down equation input into cell C4.
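A plausible form of this formula, assuming the layout described above (day 1 retention in cell C3 and day 7 retention in cell C9), is =C3-((C3-C9)/6), which subtracts the step-down value from the cell immediately above.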

Using the same retention values from the example above, the new value of cell C4 (day two) is 73 percent. For this formula to retain its references, the values in the step-down formula must be made absolute; they must reference exact cells, not cells relative to the current cell’s position. In most spreadsheet software, this is done with the dollar sign symbol ($). Rewriting the formula in Figure 5.7 with absolute references yields the formula in Figure 5.8.

FIGURE 5.8 The step-down equation, edited to include absolute cell references.
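Under the same assumed layout, the absolute-reference version would read =C3-(($C$3-$C$9)/6); the reference to the cell above (C3) deliberately remains relative so that it shifts down as the formula is copied, while the endpoint references stay anchored.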

This formula can now be copied from cell C4 to C8. The absolute references ensure that the same values are used in each cell to calculate the step-down value (which is constant across the linear equation).

The process of creating the step-down function between the two end points and using it to fill in the intermediary cells should be repeated between day 7 and day 28 retention and again between day 28 and day 365 retention to create the complete retention profile.

Calculating user lifetime from the retention profile curve

Because the retention profile curve is discrete (consider the difference between a discrete and a continuous variable), taking the area under the curve is easily accomplished. A definite integral can be thought of as the sum of the areas of infinitely many narrow rectangles under the curve of an equation; when the curve is discrete, that sum becomes finite.

In the case of the retention profile, a discrete number of rectangles are observable, the width of each representing one retention day and the height of each representing the proportion of users returning to the product on that day. The area of each rectangle is simply the proportion of returning users multiplied by the width of the rectangle (which is always a unit of one day, meaning the area is equivalent to the proportion of returning users), and the area of the retention profile is the sum of those values. The expected lifetime is therefore the sum of the values along the curve. Since each point represents a proportion of returning users per day, the resultant sum is expressed in days.

Another way of thinking about this approach is to consider a product that can only be used for two days: the day after first interaction and the day after that. If the day 1 and day 2 retention rates are both 50 percent, then 50 percent represents the probability of a future user returning on each of days 1 and 2 after first interacting with the product. The expected lifetime, therefore, is one day: a new user has a 50 percent probability of returning on day 1 and a 50 percent probability of returning on day 2, and these expectations sum to one full expected day of activity.
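The entire construction (anchor values, decay assumptions, linear interpolation, and summation into a duration) can be sketched in Python; the day 1 retention and day 365 terminal value below are hypothetical inputs.

# A sketch of the spreadsheet construction in code: anchor retention
# values, 50 percent decay assumptions, linear interpolation between
# anchors, and duration as the sum of daily retention values.

day1 = 0.40            # observed day 1 retention (hypothetical)
day7 = day1 * 0.5      # 50 percent decay assumption
day28 = day7 * 0.5     # 50 percent decay assumption
day365 = 0.01          # intuited terminal value (hypothetical)

anchors = [(1, day1), (7, day7), (28, day28), (365, day365)]

profile = {}
for (start_day, start_r), (end_day, end_r) in zip(anchors, anchors[1:]):
    step_down = (start_r - end_r) / (end_day - start_day)  # per-day decrement
    for day in range(start_day, end_day):
        profile[day] = start_r - step_down * (day - start_day)
profile[365] = day365  # terminal value as the right-most endpoint

duration = sum(profile.values())  # expected active days per user
print(f"Expected user duration: {duration:.1f} days")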

The expected lifetime determined by this technique produces a value that requires special interpretation, since it doesn’t represent calendar days but rather any combination of days within the retention profile. It must be thought of as the number of days that the user interacts with the product, as opposed to the total length of time that the user interacts with the product, which is a substantial distinction. This lifetime metric can be considered as the user duration. In finance, duration describes the weighted average lifetime over which a financial instrument delivering regular cash flows (a bond, for instance) will mature, or return its total present value.

Duration as used in the freemium context is not completely analogous to duration in finance, but as a concept it is similar; duration describes the total number of days on which a user will deliver all cash flows. Duration is much more useful in forecasting revenues—and, consequently, LTV—than a calendar days projection because it can easily be combined with the ARPDAU metric, since both are expressed in units of one day. The duration metric, therefore, comprises the lifetime component of the LTV calculation.

The retention profile should be segmented by user behavior to the greatest extent possible in order to provide a range of duration metrics. Before constructing the retention profile, thorough consideration should be given as to which segmentation approach to take, as user segments must reconcile between both user lifetime and revenue to produce valid LTV metrics. If user lifetime and revenue are segmented differently, the LTV metric won’t be representative of empirical behavior; it will be a combination of varied user actions and not indicative of any probable user profile.

Since LTV values should be calculable early in the product’s life, when little data exists, the best way of segmenting retention immediately after a product launch is through behavioral triggers—interactions with the product in the first session that correlate with day 1 retention.

These triggers are product-specific but should relate to spending behavior; a good example is a binary trigger event that simply captures whether or not a user makes a purchase within the product in the first session. This trigger can be used to segment users into two groups for which retention profiles can be constructed: those who made purchases in the first session and those who didn’t. Whatever user segments are chosen, they must be applied uniformly to the revenue metrics in order to establish congruency between the LTV components.

Arranging the retention profile worksheet in the spreadsheet model to accommodate user segments is typically done by creating separate worksheets for each user segment, as depicted in Figure 5.5. Calculating user duration for each user segment is accomplished by summing column C from cell C3 (day 1 retention) to the very bottom of the column (whichever retention day is given the terminal value).

Calculating revenue with trailing ARPDAU

In order to fully form the LTV metric, revenue must be estimated as ARPDAU (average revenue per daily active user) and applied to the duration metric. As stated earlier, ARPDAU is the appropriate metric to use in calculating LTV in terms of duration, as it is calculated in units of one day, meaning it is compatible with the duration metric (which is a number of days) in terms of its unit.

Initially, ARPDAU appears simple to calculate within the context of LTV, as it merely represents a daily average. But ARPDAU requires smoothing before its use in calculating the LTV metric is valid, even in a spreadsheet model (which isn’t assumed to make sophisticated accommodations for trends or aberrant behavior). ARPDAU over the lifetime of a product, across all users, is not an instructive metric; it describes broad averages applied to various product update cycles.

A more relevant metric is trailing ARPDAU, which only accounts for a set number of days prior to the current day. Trailing ARPDAU should be calculated over a long enough timeline to capture a broad spectrum of behaviors, but recent enough to discard spending patterns associated with old versions of the product or from demographics that have since shifted away from the product. A trailing period of four weeks—that is, 28 days prior to the current day—is a reasonable standard for calculating trailing ARPDAU, although the appropriateness of that timeline is obviously dependent on the specific product being analyzed.

Since ARPDAU represents average spending per user, it must be calculated for a count of unique users when a trailing time horizon is used. The uniqueness requirement adds a layer of complexity to the process of deriving the metric because counting unique users for a period requires some forethought when designing the analytics system; simple sums of DAU cannot be used to calculate trailing ARPDAU because users may have interacted with the product on multiple days for the timeline being considered. The equation to calculate the trailing ARPDAU metric is shown in Figure 5.9.

FIGURE 5.9 The equation to calculate trailing ARPDAU.
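Per the description that follows, the calculation takes the form

$$\text{trailing ARPDAU} = \frac{\text{period revenue}\ /\ \text{period unique users}}{\text{period length in days}}$$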

The numerator of the equation, in which the total revenue for the period is divided by the unique user count, calculates a period average revenue per user. Dividing that value by the length of the period (in days) normalizes the value on the basis of one day, rendering it compatible with the duration metric.

ARPDAU must be segmented over the same behavioral conditions as duration in order to be used with duration to calculate LTV. If a behavioral trigger was used to segment retention profiles, that same trigger should be applied to the calculation of ARPDAU.

The structure of the ARPDAU worksheet is illustrated in Figure 5.10. The days of the trailing time horizon are laid out down the first column of the worksheet from most recent (one day trailing) to most distant (28 days trailing, or whatever is chosen as the maximum of the time horizon). To the right of that are columns labeled “revenue” and “unique users,” representing the component parts of trailing ARPDAU. The last column is the calculated trailing ARPDAU metric. This table template should be used for each user segment over which user duration is calculated to produce that segment’s specific trailing ARPDAU value.

FIGURE 5.10 A spreadsheet template for calculating trailing ARPDAU.

Note that each row represents a complete time period over which ARPDAU is calculated. For instance, the value in cell D3 would be described as two-day trailing ARPDAU, or the average revenue produced per user over the previous two days combined. The value in cell D29 would be described as 28-day trailing ARPDAU, or the average revenue produced per unique user who is active in the product over the entire period of 28 days leading up to when the values are calculated.

The revenue column value is an input for the sum of revenue generated by that user group for the period being calculated (e.g., for two-day trailing ARPDAU, it is the sum of all revenue generated in the previous two days). The unique users column contains a count input of the unique users who interacted with the product over the timeline selected.

The ARPDAU column is likewise calculated for the entire time period and contains a formula output; the formula in each cell divides the sum of the revenue generated in the given user segment by the unique users from that segment who interacted with the product over the time period. Figure 5.11 shows the formula for the D2 cell in Figure 5.10.

FIGURE 5.11 The formula to calculate trailing period ARPDAU.
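Given the layout in Figure 5.10 (trailing days in column A, revenue in column B, unique users in column C, and ARPDAU in column D), a plausible form of the formula in cell D2 is =(B2/C2)/A2.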

Since the time period over which ARPDAU is calculated is trailing, this worksheet can be updated without any changes to its template; when the values for daily revenue and unique users are pasted in, the trailing ARPDAU values automatically update.

Structuring the LTV worksheet and deriving LTV

Once both the retention curve and trailing ARPDAU worksheets are formatted and completed, calculating LTV is straightforward: the LTV metric is simply the duration for a given user segment multiplied by the trailing ARPDAU.
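A minimal sketch of this final step, using hypothetical duration and trailing ARPDAU values for the two example segments defined below:

# LTV per segment is duration multiplied by trailing ARPDAU.
# All values here are hypothetical.

segments = {
    # segment: (duration in days, trailing ARPDAU in dollars)
    "Group 1 (purchased in first session)": (42.0, 0.45),
    "Group 2 (no purchase in first session)": (11.0, 0.02),
}

for name, (duration, arpdau) in segments.items():
    ltv = duration * arpdau
    print(f"{name}: LTV = {duration} days x ${arpdau}/day = ${ltv:.2f}")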

Since one of the principal benefits of the spreadsheet approach to LTV calculation is transparency, an illustrative model of LTV can be created in a dedicated worksheet that explains the calculation to people not on the product team, for whom the concept may be foreign.

The first and most important details to provide in the LTV worksheet are the standards by which the user groups are defined. This can be done with a simple definition of a group described in a particular worksheet. Continuing with the behavioral trigger example, the description of the user groups could be as simple as:

Group 1: Users who made a purchase in their first session

Group 2: Users who did not make a purchase in their first session

These definitions should be included at the top of each worksheet in a legend that also explains, in fairly rigorous detail, exactly how each component—trailing ARPDAU and duration—is calculated. The time period chosen for trailing ARPDAU should also be justified here, as this time period has a significant impact on the value of LTV. These clarifications may bear more weight in the decisions made as a result of LTV than the value of the LTV metric itself.

The LTV metrics are calculated by multiplying the duration value by the trailing ARPDAU value for each segment. A good graphic for LTV metrics is a grouped bar chart, where the LTV metric for each user segment is displayed as a columnar bar. These data points can be arranged in a table, as in Figure 5.12, that also includes the corresponding proportion made by each user segment. Such data provides rich context to the LTV metrics and can alleviate the need for follow-up analysis before product decisions are made on the basis of LTV.

FIGURE 5.12 An example of a summary table for LTV values by user segment group.

Constructing the LTV spreadsheet model presents a valuable exercise for the product team in the earliest stages of the development process. Even when values can’t be entered into the model to produce a credible LTV metric, the process of building the template and considering the user segments can help shape the direction and focus of the product’s development cycle.

It should be noted that the spreadsheet approach isn’t immune to suspicions of black box argumentation; even when all the data used in a calculation is made readily available, the calculation can appear cryptic. Carefully explaining the calculation keeps the focus of any LTV analysis on the value of the metric and not on the method.

An LTV-first approach to product development ensures that features will be built and prioritized with a discerning eye for optimized revenue and that, upon product launch, marketing efforts will be informed by data. The LTV spreadsheet model is the first tangible artifact a product team can produce in the early stages of development that articulates a reasonable estimate of segmented user revenues, even when the LTV is based entirely on assumptions. It is therefore worth a significant amount of effort and intellectual consideration.

ARPDAU versus projected individual revenue

The method described here for modeling LTV in a spreadsheet uses ARPDAU to describe an entire group of users composed of the segments being considered. In most cases, the group of users to be analyzed will be a subset of the entire user base, selected and segregated by some set of characteristics (very often demographic characteristics, such as country). But even when the LTV spreadsheet model as described here is broken down into distinct models of grouped segments, the individual models still represent groups of users—not individual users—and thus suffer from a clumping around averages that prevents predicting aberrant behavior. As discussed earlier, in the freemium model, the behavior of outliers is often the most revenue-relevant.

Another valid approach to measuring LTV in a spreadsheet model is simply to track cumulative spending by days since registration and then project that progression forward using a technique called curve fitting. An example of this technique is outlined in Figure 5.13.

FIGURE 5.13 A fitted LTV curve.

This curve describes an individual user and graphs the user’s cumulative in-product spending over the first few days of product interaction. When a curve is roughly fitted to the user’s existing behavior, the cumulative spending appears to approach an asymptote, or a line that the curve continually approaches but will never cross. As more data is collected about the user’s spending habits, the curve will shift, but after just a few days, a projection can be made.
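A minimal sketch of this projection, assuming (as one illustrative choice, not a form prescribed here) that cumulative spending follows a saturating exponential whose asymptote is read off as the projected LTV:

# Fit a saturating curve to a user's cumulative spending and read the
# asymptote as the projected LTV. The functional form and data points
# are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

def saturating(t, ltv, rate):
    """Cumulative spending that approaches the asymptote `ltv`."""
    return ltv * (1.0 - np.exp(-rate * t))

days = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
cumulative_spend = np.array([0.99, 1.99, 2.49, 2.99, 3.25, 3.49, 3.59])

params, _covariance = curve_fit(saturating, days, cumulative_spend, p0=(5.0, 0.5))
projected_ltv, _rate = params
print(f"Projected LTV (asymptote): ${projected_ltv:.2f}")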

A curve-fitting approach has two distinct advantages over the ARPDAU approach. The first is that the graph is intuitive and easy to explain; a brief glance at the chart in Figure 5.13 relays that the dotted line approached by the graph represents lifetime value. The second advantage is that the curve is more easily calculated for an individual user because it can be built using only spending data.

The principal disadvantage of this approach, however, is that, given the 5% rule, any spending in the freemium model is deviant from the norm and, thus, almost impossible to predict. With respect to users who will eventually spend in a product, a fitted curve for cumulative spending at an individual level will almost always be incorrect, often dramatically so. Likewise, choosing an accurate shape for an individual’s spending curve is impossible; some users may make purchases late in their use of the product, some may purchase only once at the beginning of their tenures, and so on.

The principal disadvantage of the ARPDAU method is that it tracks a group of users and not individuals. This is mitigated when the user base is large and stable. ARPDAU captures a cross-section of user spending behavior on any given day; it includes users making their first purchase on that day and users making their hundredth. Assuming that the same number of people make their first purchases on any given day as did the day before, the ARPDAU approach doesn't need to describe the behavior of an individual user, because it describes the behavior of a group of individuals in unchanging roles.

In other words, by measuring an entire large user base in a steady state, the ARPDAU method does capture data about individual users, in that they are enacting patterns that are present on any given day. A user base is not always in a stable state, of course, but this can be managed by adjusting the trailing period over which ARPDAU is calculated; when the product’s user base is in flux, the period should be increased to provide more data over which to calculate an average. As the user base stabilizes, the length of the period can be decreased, as almost all behavior can be considered pattern-recognizable, from a general perspective.

The analytics method

The shortcomings of the spreadsheet method are obvious: spreadsheets cannot process extremely large volumes of data, the model must be updated manually, and spreadsheet models generally do not accommodate variability in values. The analytics method circumvents these shortcomings and is implemented programmatically alongside the product’s instrumentation; as users create data artifacts, the artifacts are cycled back into the LTV estimation mechanism to calibrate it to new data.

The analytics method presents a new set of shortcomings, however, which are mostly related to an organization’s realities. The first shortcoming is that an automated, programmatic LTV estimation methodology forces black box argumentation when used in making decisions; the analytics method calculates LTV metrics automatically by a completely opaque process that is difficult to explain and audit in nontechnical terms. Such opacity may not be acceptable to a firm, especially when large marketing and product development budgets are allocated on the basis of estimated LTV.

The second shortcoming is the expense of building a system of such breadth and technical intricacy. The spreadsheet method requires only spreadsheet software and a standard business intelligence stack to implement, and a spreadsheet template for modeling LTV can be completed in a few hours. The analytics method requires that a system be designed and built, usually with enough flexibility that it can serve multiple products and be expanded in the future. Such an undertaking is not trivial, could be very expensive, and could easily span months.

That said, at massive scale and when combined with a sophisticated analytics stack, the analytics method can produce very large returns by automating the management of the entire customer life cycle, from acquisition through churn. It is also capable of far more accurate predictions than the spreadsheet method, given the sophisticated statistical techniques the analytics method can employ and the vast amounts of data it can parse.

The analytics method also affords far more immediacy in analysis; as user trends change or market dynamics shift (especially in paid acquisition markets), a near-real-time LTV metric is available to capitalize on opportunities or limit expenses. When the scale of a product’s user base is large enough, these optimizations can represent appreciable amounts of money.

Given the fragmented and constantly evolving nature of the system of analytics technology platforms and the almost endless intricacy with which an LTV estimation mechanism can be implemented, this book avoids any specific prescriptions for building an LTV estimation mechanism with the analytics method. Instead, the concepts forming the foundation of the method are introduced in the vernacular of the field, providing a basis for further exploration.

The truth is that the specifics of an entire predictive analytics platform exceed the scope of a discussion of LTV, given the mastery needed across a wide range of disciplines—computer science, probabilistic statistics, data structures and data management, and machine learning—in order to construct such a platform. Rather than broach these topics, the analytics method is described in conceptual terms as an alternative to the spreadsheet method.

The Pareto/NBD method

The Pareto/NBD (negative binomial distribution) model was introduced in a 1987 article by Schmittlein, Morrison, and Colombo entitled Counting Your Customers: Who Are They and What Will They Do Next? (Schmittlein et al. 1987). The Pareto/NBD model is perhaps the most popular model used in predicting LTV within the context of irregular revenue streams (i.e., non-subscription and non-contract business units) and has spawned a vast amount of academic literature on the subject.

Given a set of user data, the Pareto/NBD model attempts to determine two things: (1) whether or not the user is still “alive” (that is, whether or not the user has churned out of the product), and (2) the number of purchases the user is likely to make in the future, called discounted expected transactions, or DET (Fader et al. 2004).

The utilization of the Pareto/NBD model is contingent on a set of assumptions holding true for the underlying user base (a simulation sketch following the list illustrates them):

• At any given point in time, a customer can exist in only one of two possible states with respect to the product: “alive” or “dead.” Customers who are alive may make a purchase in the future; customers who are dead have permanently churned out of the product.

• The number of purchases made by a customer who is alive in a given time period varies randomly around that customer’s average purchase rate as a Poisson process. The Poisson distribution describes the probability of a given number of events taking place within a fixed period of time; it arises as the limiting form of the binomial distribution.

• The transaction rate across users is heterogeneous, following a Gamma distribution. The Gamma distribution generalizes the exponential distribution, has a long right tail, and is often used to describe times between events and lifetimes.

• User lifetimes follow the exponential distribution.

• The point at which a customer churns out of the product is not observed; rather, it is deduced as an estimate based on a period of inactivity. Dropout rates are heterogeneous across customers and follow a Gamma distribution.

• The reasons behind customer churn are varied and exist as a component of randomness in the alive-or-dead estimation. That is, users may churn out of the product for unknowable reasons, independently of other users.
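The following sketch simulates these generative assumptions with arbitrary, purely illustrative parameter values; it is not a fitting procedure but a demonstration of how the model conceives of user behavior:

import numpy as np

rng = np.random.default_rng(0)
n_users, T = 10_000, 90  # hypothetical 90-day observation window

# Purchase rates are heterogeneous: lambda ~ Gamma
purchase_rate = rng.gamma(shape=0.5, scale=0.1, size=n_users)  # per day
# Dropout rates are heterogeneous (mu ~ Gamma); lifetimes ~ Exponential(mu)
dropout_rate = rng.gamma(shape=0.8, scale=0.025, size=n_users)
lifetime = rng.exponential(1.0 / dropout_rate)

# While "alive," purchase counts are Poisson over the active part of the window
active_days = np.minimum(lifetime, T)
purchases = rng.poisson(purchase_rate * active_days)

print("Share still alive at T:", (lifetime > T).mean())
print("Mean purchases per user:", purchases.mean())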

Many marketing models rely on a methodology for gauging customer value called recency, frequency, and monetary value (RFM). This approach values a customer’s future contributions to the product as a function of how recently the last purchase was made (recency), the frequency with which the customer has made purchases in the past (frequency), and the monetary value of past purchases (monetary value).

The Pareto/NBD model uses only recency and frequency as model determinants; it assumes that the size of a purchase is independent of a user’s propensity to make that purchase, given the user’s history of purchases. This simplification reduces the amount of data needed in employing the model; for any given user, the Pareto/NBD model requires only the number of purchases made by the user over the time period being considered, the time of the last purchase, and the time of the first interaction with the product.
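As a sketch, assuming the open-source lifetimes Python package (one widely used implementation of the Pareto/NBD model) and a toy transaction log, the three required inputs can be condensed from raw purchase data and the model fitted as follows; a meaningful fit would, of course, require purchase histories from thousands of users:

import pandas as pd
from lifetimes import ParetoNBDFitter
from lifetimes.utils import summary_data_from_transaction_data

# Hypothetical transaction log: one row per purchase per user
transactions = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3],
    "date": pd.to_datetime(["2014-01-02", "2014-01-20", "2014-01-05",
                            "2014-01-03", "2014-01-10", "2014-02-01"]),
})

# Condense to frequency, recency, and T -- the three inputs named above
summary = summary_data_from_transaction_data(transactions, "user_id", "date")

pnbd = ParetoNBDFitter(penalizer_coef=0.0)
pnbd.fit(summary["frequency"], summary["recency"], summary["T"])

# Probability each user is still "alive," and expected purchases over the
# next 90 days (the DET input to the LTV calculation described below)
p_alive = pnbd.conditional_probability_alive(
    summary["frequency"], summary["recency"], summary["T"])
expected_purchases = pnbd.conditional_expected_number_of_purchases_up_to_time(
    90, summary["frequency"], summary["recency"], summary["T"])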

Like the LTV approach taken in the spreadsheet method, users should be segmented by type before being analyzed with the Pareto/NBD model. The Pareto/NBD model produces a universal description of whatever sample it is executed over; segments should therefore be defined and identified before being measured by the model, so as not to aggregate the DET value across the entire user base.

The segmentation process can be undertaken in the same way it is undertaken in the spreadsheet model, except that users must somehow be coded with their designated segment in the analytics system before their data is fed to the Pareto/NBD model. This can be accomplished without much effort, usually through an SQL command that adds a column to the users table with a segment code corresponding to a set of each user’s characteristics.
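A minimal sketch of such a command, issued here through Python’s sqlite3 module against a hypothetical users table with country and acquisition_channel columns (the segmentation rule itself is purely illustrative):

import sqlite3

conn = sqlite3.connect("analytics.db")  # stand-in for the analytics database

# Add a segment code column and populate it from user characteristics
conn.execute("ALTER TABLE users ADD COLUMN segment_code TEXT")
conn.execute("""
    UPDATE users
       SET segment_code = country || '-' || acquisition_channel
""")
conn.commit()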

To implement the Pareto/NBD model, the only data needed to produce an estimate of DET (total future purchases) is basic information about the user and the user’s purchasing history, which is within the scope of even the most basic analytics systems. The Pareto/NBD model, however, produces an estimate for a point in time given a historical data set; that is, the model can make a best guess as to a user’s future DET given a set of historical purchases but not based on demographic data or non-purchase behavioral triggers. This limitation means that the Pareto/NBD model is adept at estimating customer equity but not an appropriate tool for estimating the LTV metric for a new user, about which little behavioral data is available.

Because the Pareto/NBD model, as described here, produces an estimate of DET, a revenue component must be included in the model to produce an LTV metric. The DET estimate is analogous to the duration metric derived in the spreadsheet approach; it describes a “present value” of a unit relating to future activity (in the case of DET, the unit is purchases; in duration, it is days). Therefore, as in the spreadsheet approach, LTV can be constructed by multiplying the determinant of future activity by the empirically observed average value of that activity.

In the case of the Pareto/NBD model, that activity is the purchase, and the average value of the activity can be calculated by averaging the purchase sizes of the user segment being considered. As in the spreadsheet approach, the segment over which average purchase size is calculated should match the segment being considered for DET. When this is done, the final calculation to produce LTV is simply the segment DET multiplied by the average purchase size for the segment. While isolating purchase size as independent of recency and frequency may seem counterintuitive (and certainly contrary to the RFM model), Fader, Hardie, and Lee provide extensive analysis supporting this notion (Fader et al. 2004).
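Continuing the earlier Pareto/NBD sketch, the final calculation reduces to a single multiplication; the average purchase size below is a hypothetical segment-level value:

# expected_purchases (DET) comes from the Pareto/NBD sketch above
avg_purchase_size = 2.99  # hypothetical average purchase size for the segment
segment_ltv = expected_purchases * avg_purchase_size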

The regression method

Another tool commonly used to estimate LTV is regression. Regression LTV models generally extend the recency, frequency, and monetary value framework and also incorporate demographic data and early behavioral and descriptive characteristics such as acquisition source and weekday of acquisition.

A multitude of regression frameworks exist that could be reasonably applied to LTV estimation; this treatment, building upon the foundation laid in Chapter 3, focuses on the use of linear and logistic regressions. As in the Pareto/NBD method, regression models should be run against user segments and not the entire population of users.

The simplest form of an LTV estimation regression model is a linear regression using recency of previous purchase, frequency of past purchases, and average purchase price as the primary independent variables, with LTV as the dependent variable. As stated previously, additional independent variables relating to a user’s demographic characteristics or early behavior can also be coded as dummy variables to provide depth to the model.

While models of this nature are simple, they can be surprisingly accurate at producing numerical estimates of LTV metrics, although estimates tend to skew high when very large lifetime values are achieved (even with low frequency). Linear models should be tempered by a maximum possible value, which is usually the total theoretical value of the product catalogue.
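A sketch of such a model using scikit-learn, with hypothetical RFM training data for a single segment and the catalogue-value ceiling applied after prediction:

import numpy as np
from sklearn.linear_model import LinearRegression

# Columns: recency (days since last purchase), frequency (purchase count),
# average purchase size -- hypothetical training rows for one segment
X = np.array([[2, 5, 3.0], [30, 1, 1.0], [7, 3, 2.0], [1, 8, 4.0]])
y = np.array([25.0, 1.5, 9.0, 40.0])  # observed lifetime spend

model = LinearRegression().fit(X, y)
predicted_ltv = model.predict(X)

# Temper the linear estimate with a ceiling: the total theoretical
# value of the product catalogue (hypothetical figure)
CATALOGUE_VALUE = 150.0
predicted_ltv = np.clip(predicted_ltv, 0.0, CATALOGUE_VALUE)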

Linear models are incapable of capturing a user’s state (alive or dead), given that the time of state change is unknowable and the linear model cannot accommodate a binary variable. The LTV regression model can be expanded, then, into two sub-models: one to predict a user’s state using a logistic regression and another to predict the value of future purchases. The logistic regression sub-model can utilize the RFM data points to produce a probability of a binary alive/dead state designation. State is important to consider when modeling LTV because it is inherently random and especially susceptible to over-regression in a linear model.

The output of a logistic regression is usually stated as an odds ratio: the multiplicative change in the odds that the binary dependent variable equals 1, given a one-unit increase in an independent variable. But most statistical packages can also state logistic regression results in probabilistic terms; they can state the probability that the dependent variable is 1, given the values of the independent variables. To predict LTV, the results of the logistic regression sub-model should be stated in probabilistic terms representing the probability that the user is alive, given the RFM independent variables.

The linear regression sub-model should, as in the simple linear model described earlier, produce a predicted value of future purchases based on the RFM data points and whatever demographic and behavioral characteristics are deemed relevant. This value, when multiplied by the probability of making future purchases produced by the logistic regression model, produces an expected value of future purchases. The sum of this value and a user’s previous purchase total produces that user’s LTV. When aggregated over the segment and normalized on a per-user basis, as in the Pareto/NBD method, this value can be used to benchmark the value of users belonging to the various predefined user segments within the product.
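A sketch of the two sub-models combined, again using scikit-learn with hypothetical data; note that predict_proba states the logistic output directly in probabilistic terms:

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical RFM rows: recency, frequency, average purchase size
X = np.array([[2, 5, 3.0], [45, 1, 1.0], [7, 3, 2.0],
              [1, 8, 4.0], [60, 2, 2.5], [3, 6, 3.5]])
alive = np.array([1, 0, 1, 1, 0, 1])              # labeled alive/dead state
future_spend = np.array([20.0, 0.0, 8.0, 35.0, 0.0, 25.0])
past_spend = np.array([15.0, 1.0, 6.0, 32.0, 5.0, 21.0])

# Sub-model 1: probability that each user is still alive
p_alive = LogisticRegression().fit(X, alive).predict_proba(X)[:, 1]

# Sub-model 2: predicted value of future purchases, fit on alive users
value_model = LinearRegression().fit(X[alive == 1], future_spend[alive == 1])

# Expected future purchases, plus purchases already made, yields LTV
ltv = p_alive * value_model.predict(X) + past_spend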

The regression method represents a middle ground between the spreadsheet approach and the Pareto/NBD method: regression is a fairly common statistical technique, and regression coefficients are easily interpreted without extensive knowledge of the model being considered, so communicating the results of a regression alleviates some of the tensions of black box argumentation. And while linear methods require adjustment to conform to maximum thresholds and normal empirical patterns, they can produce accurate results. More sophisticated models can be susceptible to overfitting, especially when, as noted already, some independent variables are influenced by elements of unpredictability.

Implementing an analytics model

An analytics model, defined here as a model that is executed as a process within the analytics stack and not a model that is merely built on analytics output, is rolled out in two phases using a combination of statistical software and programmatic design. The first phase is called parameter estimation and is initially undertaken in a statistical package on a sample of historical data in order to train the model and determine the appropriate model coefficients and constants, or parameters. In this process, the proposed formula is run against a sample of test data to build a model approximating the relationship between the variables in the sample.

Within the statistical package, the model takes on a different form than what is usually proposed in academic literature, since at this point the model is crossing the chasm between theory and practice. Most statistical packages are semi-programmatic, meaning that standard programming nomenclature, such as loops and “if” statements, are used in many places, along with various functions for performing statistical techniques.

Some programming languages, such as Ruby and Python, have had enough statistical modules written for them that they serve as better test environments than do statistical packages, especially if the back-end of the analytics stack is written in one of these languages. The benefit of using a statistical package to define the model is the access to advanced statistical functionality; when that functionality is available within a pure programming environment (such as with Python), then use of the statistical package isn’t advantageous.

Defining a model means producing a formula that can be replicated in a programming environment. Academic formulas are theoretical and mostly cryptic; they are meant to convey sets of complex processes (such as sums and integrations) in as little space as possible to provide a general understanding of an entire concept in one or two lines.

Programmatic models need not be succinct; they may span dozens or even hundreds of lines of code. In programmatic statistics, brevity often comes at the expense of readability: concise, convoluted code is often inscrutable to third parties, which can slow down future updates if the author of the original code leaves the product group or company. Code should be eminently clear, readable, and explanatory, especially when performing complex statistics to produce a prediction, and ideally it should be reviewed by another data scientist before being deployed.

Once parameter estimation is completed, the model can be fed data from outside the sample and evaluated based on performance (which is measured by how well the model’s output fits historical data). If the model performs to expectations, it is translated into code (if it wasn’t developed in a programming environment in the first place) and deployed to the back-end, where it is inserted into the back-end’s “data loop” (the T in ETL). This means that the model is run against raw data and used to create a new aggregation, LTV. This is then represented as a new set of table columns on a reporting dashboard.
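Conceptually, the deployed model becomes a scoring step inside the transform stage; a hypothetical sketch, in which the function and column names are illustrative rather than prescriptive:

import pandas as pd

def transform(raw_users: pd.DataFrame, model, feature_columns) -> pd.DataFrame:
    # Score each user with the deployed LTV model during the ETL "T" step
    scored = raw_users.copy()
    scored["ltv"] = model.predict(scored[feature_columns])
    return scored  # loaded downstream as new table columns for dashboards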

One of the most profound benefits of the analytics approach is that the model’s output is made available throughout the entire analytics stack: in-product processing, reporting, aggregation, and any other processes that directly interface with analytics in the execution of business logic. This means that LTV can now be used to trigger actions such as retention campaigns or in-product promotions.

The spreadsheet approach doesn’t provide this kind of interactivity; it sacrifices direct utility for simplicity and flexibility. The downside of the analytics method’s interoperability, however, is that an inaccurate model can corrupt an entire system; models must therefore be thoroughly vetted before being pushed onto the analytics stack and continually audited while in deployment.

Auditing an analytics model

A statistical model is built to define the relationship between variables in an existing product dynamic. But product dynamics change, and rare behavioral swings that were not captured in a model’s training set can emerge after deployment. These behavioral shifts must be accommodated in the model; deployed models always require ongoing attention for tuning and updating in accordance with the shifting realities of a large, complex system of user interaction.

Auditing an analytics model is the process of maintaining its level of accuracy with respect to product data. Auditing may be done on a formal basis, wherein a model’s historical output is queried and compared against a proxy metric, or it might be done on an ad-hoc or automatic basis (say, a wild swing in a model output raises a red flag and invites investigation). A model audit could result in a new process of parameter estimation against a newer or larger sample of training data or a fundamental reengineering of the model on a different conceptual basis.

The best standard against which LTV can be compared, for auditing purposes, is the final, actual LTV of users. But auditing by this standard requires more waiting than most product teams are comfortable with: a user’s predicted LTV can be compared to the user’s actual LTV only once the user has churned out of the product, which could take months.

Early summed revenue—for instance, the amount of revenue generated by a user in the user’s first week—can be used as a proxy if LTV can be adjusted to a first-week value, but this could lead to interpretation problems, especially in products where purchases are generally made later in a user’s lifetime.

The best model audit technique incorporates evaluation by both proxy and actual values; wild differences in proxy metrics and the normalized predicted values can be spotted early, and the model can be reevaluated before the first group of users churns out. With respect to LTV, this is a comparison of final, actual LTV to the predicted LTV as users churn out of the product. But as users join and remain with the product, their predicted LTV should be evaluated as a function of their running total actual LTV; this can be represented in one-week increments (e.g., week 1 LTV, week 2 LTV, etc.) that are gauged against their predicted LTV, normalized for weekly periods.

The difference between a model’s performance and its expected accuracy shouldn’t be evaluated on a per-measurement basis; one erroneous prediction, even when extremely inaccurate, shouldn’t constitute a basis for removing a model. Predictions aren’t generally used on an individual basis to make decisions; rather, they are utilized as broad trends. The output of a model should be curve-fitted against actual data in the same process used in parameter estimation and the measure of standard error should be considered. If the measure of standard error exceeds the threshold of acceptability, the model should be reengineered, either with a new training set or with a completely different conceptual approach.
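A sketch of such an audit check, treating the root mean square of the residuals as the measure of standard error; the predicted and final actual LTV values below are hypothetical, and the acceptability threshold is a product-team judgment call:

import numpy as np

# Users who have fully churned out, so final actual LTV is known
predicted = np.array([12.0, 0.0, 5.5, 30.0, 2.0])
actual = np.array([10.5, 0.5, 7.0, 22.0, 1.5])

standard_error = np.sqrt(np.mean((actual - predicted) ** 2))

THRESHOLD = 5.0  # hypothetical acceptability threshold
if standard_error > THRESHOLD:
    print("Re-estimate parameters or reengineer the model")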

Parameter re-estimation is by far the most straightforward means of updating a model: the model is run against a new training set, new parameters are produced, and the new model is redeployed. This process is often automated to the extent that it occurs without any input from the product team. Optimizing parameters due to slight shifts in behavior is appropriate, however, only when the product isn’t undergoing rapid or significant updates that change the way users interact with it. As the product’s core use case changes, so too can the way in which LTV, at a conceptual level, is conceived. These types of changes require a model overhaul; the model should be removed from the analytics stack and reimagined to fit the new functionality or purpose of the product.

Making decisions with LTV

Like most other quantitative aspects of the freemium model, LTV is primarily a decision-making tool: it should contribute to revenue optimization, reduce uncertainty, and clarify choices. The purpose of calculating LTV, and the value in approximating as accurate an LTV metric as possible, is to drive paid user acquisition with a positive return on investment and prioritize feature development so as to maximize revenue. LTV facilitates both of these initiatives by assigning distinct monetary value to the user. On this basis, LTV is often the yardstick by which product changes are evaluated. Freemium products often exist as complex systems, and adjustments to their feature sets can result in conflicting metric results that impede product progress. The LTV metric allows the product team to make decisions by focusing on a singular performance signal: profit net of marketing costs.

LTV also serves as a bridge between the product development and management spheres through an element of shared vocabulary. Speaking in terms of the revenue benefits of a specific product feature or the projected profitability of a marketing campaign helps reduce communication barriers between the layers of an organization that often interface on uneasy terms.

By unifying the objectives of both the product team and the management team, the LTV metric removes obstructions that can otherwise produce significant delays in product planning and release. Specifically, LTV attaches a marketing constraint to the product development process, meaning that marketing considerations are made early, reducing the risk of a mismatch between product vision and marketing strategy at the point of launch.

The impact of LTV is strongest in three specific areas of operational choice. In marketing, for which the LTV metric holds primary significance, LTV provides a maximum threshold for acquisition spending in the pursuit of positive ROI acquisition campaigns. In product development, LTV informs the direction of a specific feature based on a concrete goal. And in product portfolio management, LTV provides a unit of measurement by which proposed new projects and existing projects can be allocated resources, given a strategic revenue mix. Each of these uses of LTV accomplishes different goals, but all are anchored in one core precept: that an explicit, quantifiable measure of revenue should determine strategic decisions.

LTV and marketing

The role LTV plays in making marketing decisions is fairly clear: it sets a concrete limit on the price that should be paid for a given user. This use case—appropriating a budget for user acquisition—places substantial emphasis on the need for LTV to be calculated by user segment. User acquisition, whether undertaken through highly auditable digital ads or through traditional media that aren’t easily tracked, such as physical billboards or television ads, is conducted with targets, usually demographic, in mind. A maximum acquisition price based on a universal product LTV cannot be used to optimize marketing spend; rather, it merely describes a theoretical limit that likely isn’t applicable to many users and can’t inform the targeting specifications around which all marketing campaigns are structured.

The characteristics used to define a user segment and upon which the LTV is based, then, must translate into realistic marketing targets. For instance, an LTV metric for a segment consisting of users owning a specific device is useful only in a marketing context in which a campaign can be targeted on that basis; if the device in question is a specific television model, or a specific brand of computer, then those users aren’t likely explicitly targetable. Thus, user segmentation, as it applies to the calculation of LTV, may need to be applied in reverse order at the organizational level: the marketing group defines the characteristics of potential users who can be filtered, and the product team segments those users within the analytics system accordingly.

This isn’t to say that marketing completely dominates the segmentation process and determines how users are grouped for product development; the marketing and product development segments can exist independently of each other. The segments can also exist within a hierarchy; because marketing segments are traditionally broad, product teams can easily subdivide them into more specific segments to optimize the user experience or ignore them altogether for product development purposes.

Because marketing segments are determined beforehand, by definition they can be made using only demographic, not behavioral, characteristics. The most common attributes used to define marketing segments are age and gender. They are also almost always based on geography, if for no other reason than to be adjusted for translation and cultural sensitivity. Given these constraints, an LTV used for marketing might be defined around three dimensions—location, age, and gender—meaning that a maximum acquisition price has been set for each combination of the three factors. Typically, a marketing segment takes the form of something like “men in the United States, aged 18–35.” An example of an acquisition price segment table is shown in Figure 5.14.

image

FIGURE 5.14 A sample acquisition price segment table.
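In code, such a table can be represented as a simple lookup keyed on the three dimensions; the prices below are purely illustrative:

# Hypothetical maximum acquisition prices (USD) per marketing segment
max_acquisition_price = {
    ("US", "18-35", "male"): 4.50,
    ("US", "18-35", "female"): 3.80,
    ("DE", "18-35", "male"): 2.90,
}

def max_bid(location, age_band, gender):
    # Bidding above this price makes a campaign unprofitable on an LTV basis
    return max_acquisition_price.get((location, age_band, gender), 0.0)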

Using LTV as a maximum acquisition price isn’t meant to constrain growth; it is meant to maintain a strictly positive return on marketing spending. There are cases where it may make sense for a firm to spend more to acquire a user than it expects that user to spend on the product. Such cases include a product that requires an initial seed of users, a firm that is attempting to reach a more visible position in a platform rankings chart (which will initiate increased organic growth), or an intangible benefit that can be obtained from conducting marketing at a loss. But marketing activity undertaken blindly or without regard to top-line return isn’t sustainable, and long-term marketing expenditure must be justified by the profitability of any given campaign on an LTV basis: users are acquired for less than they are expected to be worth.

It is tempting to think about the role of LTV as a proxy for overall business profit, where the difference between a user’s expected LTV and acquisition cost represents a per-user profit (or loss) margin. But this is misguided; LTV is, at its heart, a marketing metric, and so it merely describes the profitability of a given marketing campaign. LTV is therefore a marketing mandate, not a business strategy, and because it represents expected revenues and not immediately realized revenues, it cannot be used as a proxy for firm-level profits.

The firm-level evaluation of operational profit is expressed as revenue minus expenses; LTV contributes to this only in the sense that it constrains marketing expenses to within the boundaries established by the firm’s estimation of total lifetime user revenue contributions. But those revenue contributions are not immediate; in fact, they can materialize over very long lifetimes. Because of the mismatch in chronology—revenues accumulate over a period of indeterminable length and acquisition expenses are shouldered at the start of a user’s lifetime—LTV can’t be used in an accounting context to describe profit.

Similarly, the user acquisition spending that LTV constrains constitutes only one component of a firm’s overall operating expenses. Other expense groups, such as salaries, facilities, product development, and so on, must obviously be considered when evaluating whether or not a firm is profitable. Constraining the consideration of a firm’s profitability to its acquisition campaigns, in most cases, severely underestimates its total expenses.

For these reasons, LTV exceeding per-user acquisition cost is a prerequisite for overall firm profitability but does not exclusively determine it. In freemium products, the degree to which LTV exceeds user acquisition costs must be considered with respect to the LTV minus CPA margin, the volume of users acquired, and the size of the total user base. The fact that the expected total revenue contributions from purchased users exceed those users’ purchase prices may be undermined by low margins and low volumes that don’t make up for the firm’s overhead expenses.

LTV, then, should determine only marketing strategy, not company strategy; the degree to which LTV must exceed per-user acquisition prices is set at the company level and is based on growth targets and the expense structure of the entire organization—not just marketing—that user revenues must compensate for.

LTV and product development

Within the scope of a single product, the LTV metric, in conjunction with the minimum viable metrics, can be used as a diagnostic tool to identify shortcomings that wouldn’t otherwise be visible without a derived measure of performance. LTV is synthetic in the sense that it is composed of multiple metric groups—retention and monetization—and can highlight poor performance by the standard of one set of metrics even if another set is performing to specification. For this reason, LTV is useful when prioritizing new feature development based on the metrics that features are expected to boost. For instance, if LTV is low but conversion and ARPDAU are reasonable, then features contributing to retention should be prioritized.

A product team is concerned with maintaining or growing product revenues, given a target range of daily user activity. In this sense, the product can be thought of as a hose: it receives an input, water (users), and produces an output, force (revenue). In this metaphor, LTV serves as a choke point that exposes leaks in the hose, given stable input (new and returning users); the leaks would be identifiable on their own but are much more visible when the system is contextualized and stress-tested, which is what LTV does. The magnitude of each leak in the hose is measured by the speed of the water leaving it; in the product development process, magnitude is measured by the difference between a metric’s current value and its historical high points. The most consequential leaks are patched first; this is the prioritization that LTV imposes on the iteration process.

Addressing metrics shortcomings using LTV as an indicator requires establishing a standard for the metrics in the first place, which should be done as early as the product feature planning stage. For a feature that exists only on paper, forecasting its impact on product metrics is fraught with uncertainty, but estimated LTVs play a vital role in the feature evaluation process; each new feature developed should be considered under the guidance of the role that feature plays in meeting overall product revenue targets.

This is why the LTV model is so important, even when it is only directionally accurate. When introducing a specific feature, comparable historical feature development can be used to estimate, with reasonable accuracy, the impact on product metrics. Even if an LTV model is useful only in predicting the magnitude of LTV change at a relative level, not an absolute level, it can still be used to prioritize feature development.

LTV and organizational priority

High LTV products (products exhibiting high and non-stratified LTV metrics across user segments) tend to be niche; users are most willing to spend money on products they feel passionate about, and user passion negatively correlates with general appeal. This dynamic introduces a strategic element to the composition of a firm’s product portfolio: a diversified collection of freemium products, all exhibiting points across a wide spectrum of LTV values, may present more opportunities than a number of niche freemium products or one very large, broadly appealing freemium product.

This is because some minimal overhead is required to build and maintain a product, regardless of the size of its user base: a portfolio composed of a large number of small, niche products incurs greater fixed costs in maintenance than does a relatively smaller portfolio. A portfolio composed of a small number of broad products with large user bases, though, faces the risk that any one product falling into user base decay could disastrously disrupt revenues. The freemium product portfolio can be optimized to reduce expenses, maximize revenue, and allow for the most beneficial aspect of the web and mobile platforms: cross-promotion.

Any product’s user base is transitory: at some point, it will fall into decline due to the changing nature of the market in which it competes. As discussed earlier in the chapter, all else being equal, a user in the product system is worth more than a user outside of it because of the inherent value of behavioral data. This data is useful not only when determining if a user should be cross-promoted to an existing product, but also when determining which products should be developed and launched, given a user base that can be cross-promoted into them.

A low-LTV product—a casual product with broad appeal—serves as an excellent “bank” in which to retain users before cross-promoting them into higher-LTV products with more niche use cases. As old products decline and new products launch, the retention of users within the overall product portfolio system is paramount; the need for reacquisition essentially doubles a firm’s marketing costs over time. This need for low-LTV, mass-appeal products must factor into decisions made about the product catalogue and the priority of the product pipeline: a firm should structure its product portfolio to maintain the fungibility, or substitution properties, of its users.

User fungibility describes the extent to which users can be transitioned from one product to another without friction, or churn. Cross-promoting a user from one high-LTV product to another is often difficult; if the fundamental use cases of the products align, then they represent redundant efforts. It is therefore optimal to maintain an active low-LTV product in the catalogue to serve as an intermediary “holding product” for users before they are more appropriately cross-promoted to another product in the system (or sold to a third party through advertising). This transition should take place ideally before the initial product experiences significant decline; the more precisely this can be timed, the greater the number of users that can be retained within the system.

All of this activity requires a deep understanding of lifetime customer value—not only the LTV metrics for the active products, but also the projected values for new products. Applying this strategic thought process to the composition of the product catalogue can significantly reduce the need for marketing expenditure; as has been established, a user is an asset, and for an early-stage technology firm producing freemium products, customer equity can represent nearly all of the enterprise’s value. One aspect of preserving that value is ensuring that users always have a viable destination within the product catalogue.

The politics of LTV

Given the multiple moving parts comprising its calculation and the extent of its organizational impact, lifetime customer value can stir intense emotions. In a highly fragmented and dysfunctional company, the LTV metric might create conflicting incentives across groups: the product team wants an LTV to be projected as high as possible because it results in acquiring a large user base (which is how success may be evaluated), while the marketing team wants the opposite in order to reduce expenditure (which is also how success may be evaluated). And finance may want to decouple LTV from marketing altogether, given that a high LTV allows large revenue projections to be produced without incurring commensurate acquisition expenses.

Under ideal circumstances, and when a firm is operating with a singular focus, LTV shouldn’t be a point of contention. But because it is a factor in so many decisions, LTV can easily become the subject of an interdepartmental tug-of-war, with each stakeholder attempting to influence LTV’s calculation or its use. This situation highlights deep, intrinsic weaknesses in the firm’s operational foundation; a nuanced calculation of LTV or a multi-group committee approach to defining it can help to alleviate these weaknesses. Taking proactive steps can reduce the possibility that LTV becomes a battleground over influence or resource allocation in the first place.

The first step is for the firm to align each business unit behind the wisdom and prudence of collective performance marketing. When the entire firm subscribes to the notion that marketing spend should always produce a positive return, convincing a product team that metrics improvement is needed before its product can be marketed profitably isn’t a deviation from precedent and therefore doesn’t inspire feelings of animosity. Firm-wide metrics transparency, including for marketing spend, contributes to company-wide alignment.

The second step is to define the LTV calculation universally and charge the analytics group, not individual product groups, with maintaining it. Analytics groups generally sit outside the boundaries of the potential incentive crossfire that can materialize around LTV; furthermore, as a firm-wide platform, the analytics group should be impartial to any particular product’s performance.

The third step is to gauge marketing success not by volume or budget size but by verifiable profitability, which requires the marketing group to actively manage campaigns as a function of LTV. By coupling marketing spend with LTV and measuring only quantifiable profit, the marketing group is not motivated to appropriate any more budget than can be put to effective use. This step may require creativity in evaluating non-concrete marketing spend—such as on brand awareness initiatives or industry events—but it nonetheless aligns the interests of the marketing group with those of the product group.

When LTV has become a political minefield, other, more fundamental problems with the firm’s operations are at play. But because of its strategic importance, as much effort as possible should be put into preventing LTV from becoming collateral damage in an interdepartmental conflict; when LTV is held hostage by company politics, the performance of the product catalogue is put in jeopardy. With as much influence and power as LTV commands, companies should take care to place the integrity of the metric beyond dispute.