Reach More Customers with Better Data—and Products - Big Data Bootcamp: What Managers Need to Know to Profit from the Big Data Revolution (2014)

Chapter 9. Reach More Customers with Better Data—and Products

How Big Data Creates a Conversation Between Company and Customer

No book on Big Data would be complete without a few words on the Big Data conversation. One of the biggest challenges to widespread adoption of Big Data is the nature of the Big Data conversation itself.

Historically, the discussion around Big Data has been a highly technical one. If the discussion remains highly technical, the benefits of Big Data will remain restricted to those with deep technical expertise. Good technology is critical. But companies must focus on communicating the business value they deliver in order for customers to buy their products and for business leaders to embrace a culture of being data-driven.

In developing the Big Data Landscape, I talked with and evaluated more than 100 vendors, from seed-stage startups to Fortune 500 companies. I spoke with numerous Big Data customers as well.

Many Big Data vendors lead with the technical advantages of their products to the exclusion of talking about business value, or vice versa. A technically savvy company will often highlight the amount of information its database can store or how many transactions its software can handle per second.

A vision-savvy company, meanwhile, will talk about how it plays in the Big Data space but will lack the concrete technical data to show why its solutions perform better or the specifics about what use cases its product supports and the problems it solves.

Effective communication about Big Data requires both vision and execution. Vision involves telling the story and getting people excited about the possibilities. Execution means delivering on specific business value and having the proof to back it up.

Big Data cannot solve—at least not yet—a lack of clarity about what a product does, who should buy it, or the value a product delivers. Companies that lack clarity on these fronts struggle to sell their products no matter how hard they try.

Thus, there are three key components when it comes to a successful Big Data conversation: vision, value, and execution. “Earth’s biggest bookstore,” “The ultimate driving machine,” and “A developer’s best friend,” all communicate vision clearly.1

But clarity of vision alone is not enough. It must go hand in hand with clear articulation of the value a product provides, what it does, and who, specifically, should buy it.

Based on vision and business value, companies can develop individual stories that will appeal to the customers they’re trying so hard to reach as well as to reporters, bloggers, and other members of industry. They can create insightful blog posts, infographics, webinars, case studies, feature comparisons, and all the other marketing materials that go into successful communication—both to get the word out and to support sales teams in explaining their products to customers.

Content, like other forms of marketing, needs to be highly targeted. The same person who cares about teraflops and gigabits may not care as much about which companies in the Fortune 500 use your solution. Both pieces of information are important. They simply matter to different audiences.

Even then, companies can generate a lot of awareness about their products but fail to convert prospects when they land on their web sites. All too often, companies work incredibly hard to get visitors to their sites, only to stumble when it comes to converting those prospects into customers.

Web site designers place buttons in non-optimal locations, give prospects too many choices of possible actions to take, or build sites that lack the information that customers want. It’s all too easy to put a lot of friction in between a company and a customer who wants to download or buy.

When it comes to Big Data marketing, it’s much less about traditional marketing and much more about creating a conversation that is accessible. By opening up the Big Data conversation, we can all bring the benefits of Big Data to a much broader group of individuals.

Better Marketing with Big Data

Big Data itself can help improve the conversation, especially as more ad spend moves online. In 2013, marketers in the United States spent some $171.7 billion on advertising.2 As spending on offline channels such as magazines, newspapers, and the yellow pages continues to decline, new ways to communicate with customers online and via mobile keep springing up. Marketers spent $42.8 billion on online advertising in the United States in 2013, and invested some $7.1 billion in mobile advertising.3

Google remains the gorilla of online advertising, accounting for some 49.3% of total digital advertising revenue in 2013.4 Meanwhile, social media such as Facebook, Twitter, and LinkedIn represent not only new marketing channels but new sources of data as well. From a Big Data perspective, the opportunity doesn’t stop there.

Marketers have analytics data from visitors to their web sites, customer data from trouble ticketing systems, and actual product usage data. That data can help them close the loop in understanding how their marketing investments translate into customer action.

Marketing today doesn’t just mean spending money on ads. It means that every company has to think and act like a media company. It means not just running advertising campaigns and optimizing search engine listings, but developing content, distributing it, and measuring the results. Big Data Applications can pull the data from all the disparate channels together, analyze it, and make predictions about what to do next—either to help marketers make better decisions or to take action automatically.

Note: Every company now has to think and act like a data-driven media company. Big Data Applications are your new best friend when it comes to understanding all the data streaming into your company and making the wisest long-term decisions possible.

Big Data and the CMO

By 2017, chief marketing officers (CMOs) will spend more on information technology than chief information officers (CIOs), according to industry research firm Gartner.5 Marketing organizations are making more of their own technology decisions, with less involvement from IT. More and more, marketers are turning to cloud-based offerings to serve their needs. That’s because they can try out an offering and discard it if it doesn’t perform, without significant up-front cost or time investment.

Historically, marketing expenses have come in three forms: people to run marketing; the costs of creating, running, and measuring marketing campaigns; and the infrastructure required to deliver such campaigns and manage the results.

At companies that make physical products, marketers spend money to create brand awareness and encourage purchasing. Consumers purchase at retail stores, such as car dealerships, movie theaters, and other physical locations, or at online destinations.

Marketers at companies that sell technology products often try to drive potential customers directly to their web sites. A technology startup, for example, might buy Google AdWords—the text ads that appear on Google’s web site and across Google’s network of publishing partners—in the hopes that people will click on those ads and come to their web site. From there, the potential customer might try out the company’s offering or enter their contact information in order to download a whitepaper or watch a video, activities which may later result in the customer buying the company’s product.

All of this activity leaves an immense digital trail of information—a trail that is multiplied ten times over, of course, because Google AdWords aren’t the only form of advertising companies invest in to drive customers to their web sites. Marketers buy many different kinds of ads across different ad networks and media types. Using the right Big Data tools, they can collect data and analyze the many ways that customers reach them. These range from online chat sessions to phone calls, from web site visits to the product features customers actually use. They can even analyze which segments of individual videos are most popular.

Historically, the systems required to create and manage marketing campaigns, track leads, bill customers, and provide helpdesk capabilities came in the form of expensive and difficult-to-implement installed enterprise software solutions. IT organizations would embark on the time-consuming purchase of hardware, software, and consulting services to get a full suite of systems up and running to support marketing, billing, and customer service operations.

Cloud-based offerings have made it possible to run all of these activities via the Software as a Service (SaaS) model. Instead of having to buy hardware, install software, and then maintain such installations, companies can get the latest and greatest marketing, customer management, billing, and customer service solutions over the web.

Today, a significant amount of the data many companies have on their customers is in the cloud, including corporate web sites, site analytics, online advertising expenditures, trouble ticketing, and the like. A lot of the content related to company marketing efforts, such as press releases, news articles, webinars, and slide shows, is now online. Marketers at companies that deliver products such as online collaboration tools or web-based payment systems over the web can now know which content a customer or prospect has viewed, along with demographic and industry information.

The challenge and opportunity for today’s marketer is to put the data from all that activity together and make sense of it. For example, a marketer might have their list of customers stored in one system, leads from their lead-generation activities stored in Marketo or Eloqua, and analytics that tell them about company web site activity in Adobe Omniture or Google Analytics, a web site analytics product from Google.

Certainly, a marketer could try to pull all that data into a spreadsheet and attempt to run some analysis to determine what’s working well and what isn’t. But actually understanding the data takes significant analysis. Is a certain press release correlated with more web site visits? Did a particular news article generate more leads? Do visitors to a web site fit into certain industry segments? What kinds of content appeal to which visitors? Did moving a button to a new location on a web site result in more conversions?

These are all questions that consumer packaged goods (CPG) marketers like Procter & Gamble (P&G) have focused on for years. In 2007, P&G spent $2.62 billion on advertising and in 2010 the company spent $350 million on customer surveys and studies.6 With the advent of Big Data, the answers are available not just to CPG companies who spend billions on advertising and hundreds of millions on market research each year but also to big and small vendors alike across a range of industries. The promise of Big Data is that today’s tech startup can have as much information about its customers and prospects as a big CPG company like P&G.

Another issue for marketers is understanding the value of customers—in particular, how profitable they are. For example, a customer who spends a small amount of money but has lots of support requests is probably unprofitable. Yet correlating trouble ticket data, product usage data, and information about how much revenue a particular customer generated with how much it cost to acquire that customer remains very hard to do.
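To make the profitability point concrete, here is a minimal sketch in Python. It assumes the revenue, acquisition cost, and ticket count for each customer have already been pulled into one place, which is the hard part described above. The `Customer` fields, the customer names, and the `COST_PER_TICKET` figure are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    revenue: float           # total revenue from this customer
    acquisition_cost: float  # marketing and sales spend to win the customer
    support_tickets: int     # number of helpdesk tickets filed


# Hypothetical average cost of handling one support ticket.
COST_PER_TICKET = 25.0


def profit(customer: Customer) -> float:
    """Revenue minus acquisition cost minus estimated support cost."""
    support_cost = customer.support_tickets * COST_PER_TICKET
    return customer.revenue - customer.acquisition_cost - support_cost


# Made-up customers for illustration.
customers = [
    Customer("small-but-needy", revenue=500.0, acquisition_cost=200.0, support_tickets=20),
    Customer("large-and-quiet", revenue=5000.0, acquisition_cost=800.0, support_tickets=2),
]

# List customers from least to most profitable.
for c in sorted(customers, key=profit):
    print(f"{c.name}: {profit(c):.2f}")
```

In this made-up data, the customer who spends little but files many tickets comes out unprofitable once support cost is counted, which is exactly the pattern the paragraph above describes.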

Big Data Marketing in Action

Although few companies can analyze such vast amounts of data in a cohesive manner today, one thought leader who has been able to perform such analysis is Patrick Moran, vice president of marketing at New Relic. New Relic is an application performance monitoring company. The company makes tools that help developers figure out what’s causing web sites to run slowly and make them faster.

Moran has been able to pull together data from systems like the demand-generation system Marketo, along with data from Zendesk, a helpdesk ticketing system, and from Twitter campaigns, on which New Relic spends some $150,000 per month. In conjunction with a data scientist, Moran’s team is able to analyze all that data and figure out which Twitter campaigns have the most impact, down to the individual tweets. That helps Moran’s team determine which campaigns to spend more on in the future.

The first step in Moran’s ability to gather and analyze all that data is having it in the cloud. Just as New Relic itself is a SaaS company, virtually all of the systems from which Moran’s team gathers marketing data are cloud-based.

The next step in the process is running a series of marketing campaigns by investing in ads across Google, Twitter, and other online platforms.

Third, the marketing team gathers all the data from Marketo, Twitter campaigns, product usage, and other sources in one place. In New Relic’s case, they store the data in Hadoop.

Fourth, using the open source statistics package R, the team analyzes the data to determine the key factors that drive the most revenue. For example, they can evaluate the impact on revenue of the customer’s geographic location, the number of helpdesk tickets a customer submitted, the path the customer took on New Relic’s web site, the tweets a customer saw, the number of contacts a customer has had with a sales rep, and the kind of performance data the customer monitors within the New Relic application. By analyzing all of this data, Moran’s team even knows what time of day to run future campaigns. Finally, the team runs a new set of campaigns based on what they’ve learned.
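The analysis step can be sketched in a few lines. Moran’s team uses R; the example below uses plain Python instead and a much simpler technique, the Pearson correlation between each factor and revenue, so it should be read as an illustration of the idea rather than as New Relic’s actual method. The factor names and all of the numbers are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up data: one entry per customer.
revenue = [100.0, 200.0, 300.0, 400.0, 500.0]
factors = {
    "helpdesk_tickets": [5, 4, 3, 2, 1],
    "sales_contacts": [1, 2, 3, 4, 5],
    "tweets_seen": [2, 1, 4, 3, 5],
}

# Rank factors by the strength of their relationship with revenue.
ranked = sorted(
    ((name, pearson(values, revenue)) for name, values in factors.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name}: r = {r:+.2f}")
```

A strong correlation only flags a factor as worth a closer look. It does not show that the factor causes higher revenue, which is one reason a data scientist is part of the team.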

New Big Data Applications are emerging specifically to make the process that teams like Moran’s follow easier. MixPanel, for example, is a web-based application that allows marketers to run segmentation analysis, understand their conversion funnels (from landing page to product purchase), and perform other kinds of marketing analysis.
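A conversion funnel is simple to compute once the counts are in hand. The sketch below is not MixPanel’s interface; it is a plain Python illustration with hypothetical stage names and visitor counts, showing the step-by-step and overall conversion rates such tools report.

```python
# Hypothetical visitor counts at each funnel stage, in order.
funnel = [
    ("landing_page", 10000),
    ("signup_form", 2500),
    ("trial_started", 1000),
    ("purchase", 200),
]


def step_conversion(stages):
    """Return (from_stage, to_stage, rate) for each consecutive pair of stages."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        rate = count / prev_count if prev_count else 0.0
        rates.append((prev_name, name, rate))
    return rates


for prev_name, name, rate in step_conversion(funnel):
    print(f"{prev_name} -> {name}: {rate:.1%}")

# Share of landing-page visitors who end up purchasing.
overall = funnel[-1][1] / funnel[0][1]
print(f"overall: {overall:.1%}")
```

The step with the lowest rate is usually the first place to look for the kind of friction described earlier in the chapter.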

By aggregating all of this information about customer activity, from ad campaign to trouble ticket to product purchase, it is possible for Big Data marketers to correlate these activities and not only reach more potential customers but to reach them more efficiently.

Marketing Meets the Machine: Automated Marketing

The next logical step in Big Data marketing is not just to bring disparate sources of data together to provide better dashboards and insights for marketers, but to use Big Data to automate marketing. This is tricky, however, because there are two distinct components of marketing: creative and delivery.

The creative component of marketing comes in the form of design and content creation. A computer, for example, can’t design the now famous “For everything else there’s MasterCard” campaign. But it can determine whether showing users a red button or a green button, a 12-point font or a 14-point font, results in more conversions. It can figure out, given a set of potential advertisements to run, which ones are most effective.
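Deciding whether the red or the green button really converts better is a statistics question. One common approach, assumed here for illustration, is a two-proportion z-test: it asks whether the gap between two conversion rates is larger than chance alone would explain. The function name and the visitor counts below are hypothetical.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two_sided_p) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value


# Made-up results: variant A is the red button, variant B the green one.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be noise. An automated system would run many such comparisons at once and would need to account for that when deciding which variant to keep.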

Given the right data, a computer can even optimize specific elements of a text or graphical ad for a particular person. For example, an ad optimization system could personalize a travel ad to include the name of the viewer’s city: “Find the lowest fares between San Francisco and New York” instead of just “Find the lowest fares.”7 It can then determine whether including such information increases conversion rates.
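The personalization itself can be as simple as filling in a template when the data is available and falling back to generic copy when it is not. The function below is a hypothetical sketch, not the API of any particular ad system.

```python
def personalize_ad(viewer_city=None, destination_city=None):
    """Return city-specific ad copy when both cities are known, else generic copy."""
    if viewer_city and destination_city:
        return f"Find the lowest fares between {viewer_city} and {destination_city}"
    return "Find the lowest fares"


print(personalize_ad("San Francisco", "New York"))
print(personalize_ad())
```

Serving the two versions to comparable groups of viewers and comparing their conversion rates then shows whether the personalization is worth keeping.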

Note: Increasingly, computers will make many minor marketing decisions: which ad to show, what color the background should be, where to place the “Tell Me More” buttons, and so forth. They will have to, given the scale of today’s Big Data marketing machine. No human, or team of humans, can make as many effective decisions as fast.

In theory, human beings could perform such customizations manually, and in the past, they did. Graphic artists used to (and some still do) customize each ad individually. Web developers would set up a few different versions of a web page and see which one did the best. The problem with these approaches is two-fold. They’re very limited in the number of different layouts, colors, and structures a marketer can try. There’s also no easy way to customize what is shown to each individual. A different button location, for example, might work better for one group of potential customers but not for another. Without personalization, what results in higher conversion rates for one group of customers could result in lower conversion rates for another.

What’s more, it’s virtually impossible to perform such customizations for thousands, millions, or billions of people by hand. And that is the scale at which online marketing operates. Google, for example, serves an average of nearly 30 billion ad impressions per day.8 That’s where Big Data systems excel: when there is a huge volume of data to deal with and such data must be processed and acted upon quickly.

Some solutions are emerging that perform automated modeling of customer behavior to deliver personalized ads. TellApart and AdRoll offer retargeting applications. They combine automated analysis of customer data with the ability to display relevant advertisements based on that data. TellApart, which recently hit a $100 million annual revenue run rate, identifies shoppers who have left a retailer’s web site and delivers personalized ads to them when they visit other web sites, based on the interests that a given shopper showed while browsing the retailer’s site.9 This kind of personalized advertising brings shoppers back to the retailer’s site, often resulting in a purchase. By analyzing shopper behavior, TellApart is able to target high-quality customer prospects while avoiding those who aren’t ultimately likely to make a purchase.

When it comes to marketing, automated systems are primarily involved in large-scale ad serving and in lead-scoring, that is, rating a potential customer lead based on a variety of pre-determined factors such as the source of the lead. These activities lend themselves well to data mining and automation. They are well-defined processes with specific decisions that need to be made, such as determining whether a lead is good, and actions that can be fully automated, such as choosing which ad to serve.
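Lead scoring based on pre-determined factors can be sketched as a simple point system. Everything specific in the example below is hypothetical: the lead sources, the point values, the extra factors, and the qualification threshold. A real system would tune these against historical outcomes.

```python
# Hypothetical points per lead source; a real system would tune these.
SOURCE_POINTS = {"referral": 30, "webinar": 20, "search_ad": 10, "purchased_list": 0}
QUALIFIED_THRESHOLD = 50


def score_lead(lead):
    """Sum points for each pre-determined factor present on the lead (a dict)."""
    score = SOURCE_POINTS.get(lead.get("source"), 0)
    if lead.get("downloaded_whitepaper"):
        score += 15
    if lead.get("started_trial"):
        score += 25
    if lead.get("company_size", 0) >= 100:
        score += 10
    return score


def is_qualified(lead):
    """Decide automatically whether the lead is worth passing to sales."""
    return score_lead(lead) >= QUALIFIED_THRESHOLD


lead = {"source": "webinar", "started_trial": True, "company_size": 250}
print(score_lead(lead), is_qualified(lead))
```

Because the rules and the decision are both explicit, this is the kind of process that can run without a person in the loop.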

Plenty of data is available to help marketers and marketing systems optimize content creation and delivery. The challenge is putting it to work.

Social media scientist Dan Zarrella has studied millions of tweets, likes, and shares, and has produced quantitative research on what words are associated with the most re-tweets, the optimal time of day to blog, and the relative importance of photos, text, video, and links.10 The next step in Big Data meets the machine will be Big Data Applications that combine research like Zarrella’s with automated content campaign management.

In the years ahead, you’ll see intelligent systems continue to take on more and more aspects of marketing. These systems won’t just score leads, they’ll also determine which campaigns to run and when to run them. They’ll customize web sites so that the ideal site is displayed to each individual visitor. Marketing software won’t just be about dashboards that help humans make better decisions—useful as that is. With Big Data, marketing software will be able to run campaigns and optimize the results automatically.

The Big Data Content Engine

When it comes to creating content for marketing, there are really two distinct kinds of content most companies need to create: high-volume and high-value. Amazon, for example, has some 248 million pages stored in Google’s search index.11 Such pages are known as the long tail. People don’t come across any individual page all that often, but when someone is searching for a particular item, the corresponding web page is there in Google’s index to be found. Consumers searching for products are highly likely to come across an Amazon page while performing their search.

Human beings can’t create each of those pages. Instead, Amazon automatically generates its pages from its millions of product listings. The company creates pages that describe individual products as well as pages that are amalgamations of multiple products: there’s a headphones page, for example, that lists all the different kinds of headphones along with individual headphones and text about headphones in general. Each page, of course, can be tested and optimized.

Amazon has the advantage not only of having a huge inventory of products—its own and those listed by merchants that partner with Amazon—but a rich repository of user-generated content, in the form of product reviews, as well. Amazon combines a huge Big Data source, its product catalog, with a large quantity of user-generated content.

This makes Amazon not only a leading product seller but also a leading source of great content. In addition to reviews, Amazon has product videos, photos (both Amazon and user-supplied), and other forms of content. Amazon reaps the rewards of this in two ways: it is likely to be found in search engine results and users come to think of Amazon as having great editorial content. Instead of just being a destination where consumers go to buy, Amazon becomes a place where consumers go to do product research, making them more likely to make a purchase on the site.

Other companies, particularly e-commerce companies with large existing online product catalogs, have turned to solutions like BloomReach. BloomReach works with web sites to generate pages for the search terms that shoppers are looking for. For example, while an e-tailer might identify a product as a kettle, a shopper might search for the term “hot pot.” The BloomReach solution ensures that sites display relevant results to shoppers, regardless of the exact term the shopper searches for.
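The kettle example boils down to mapping the shopper’s vocabulary onto the catalog’s vocabulary. The sketch below shows the simplest possible version, a hand-built synonym table. It is an illustration only and does not describe how BloomReach works; the table, catalog, and product names are hypothetical.

```python
# Hypothetical synonym table mapping shopper vocabulary to catalog vocabulary.
SYNONYMS = {"hot pot": "kettle", "sneakers": "athletic shoes"}

# Hypothetical catalog keyed by the retailer's own product terms.
CATALOG = {
    "kettle": ["1.7 L electric kettle", "Stovetop kettle"],
    "athletic shoes": ["Trail running shoe"],
}


def search(query):
    """Look up products by the shopper's term, falling back to known synonyms."""
    term = query.strip().lower()
    term = SYNONYMS.get(term, term)
    return CATALOG.get(term, [])


print(search("Hot Pot"))
print(search("kettle"))
```

At scale, the synonym table would be learned from search and purchase data rather than written by hand, which is where the Big Data part comes in.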

Amazon isn’t the only company outside traditional media that has turned itself into a media company. Business networking site LinkedIn has too. In a very short time, LinkedIn Today has become a powerful new marketing channel. It has transformed the business social networking site into an authoritative source of content and delivered a valuable service to the site’s users in the process.

LinkedIn used to be a site that users would visit only occasionally, when they wanted to connect with someone or were starting a new job search. LinkedIn Today has made the site relevant on a daily basis by curating relevant news from around the web and updates from users of the site itself.

LinkedIn goes a step further than most traditional media sites by showing users content that is relevant to them based on their interests and their network. The site brings users back via a daily email that contains previews of the latest news. LinkedIn has created a Big Data content engine that drives new traffic, keeps existing users coming back, and maintains high levels of engagement on the site.

How can a company that doesn’t have millions of users or product listings create content at the Big Data scale? I’ll answer that question in a moment. But first, a few words on marketing and buying Big Data products.

The New PR: Big Data and Content Marketing

When it comes to driving demand for your products and keeping prospects engaged, it’s all about content creation: blog posts, infographics, videos, podcasts, slide decks, webinars, case studies, emails, newsletters, and other materials are the fuel that keep the content engine running.

Since 1980, the number of journalists “has fallen drastically while public relations people have multiplied at an even faster rate.”12 In 1980, there were .45 public relations (PR) workers per 100,000 people compared to .36 journalists. In 2008 there were twice as many PR workers, .90 for every 100,000 people, compared with .25 journalists. That means there are more than three PR people for every journalist, which makes getting your story covered by a reporter harder than ever before. Companies, Big Data and otherwise, have to create useful and relevant content themselves to compete at the Big Data scale.

In many ways, content marketing is the new advertising. As of 2011, according to NM Incite, a Nielsen/McKinsey company, there were some 181 million blogs worldwide compared to only 36 million in 2006.13 But the good news for companies trying to get the word out about their products is that many of these blogs are consumer-oriented with small audiences, and creating a steady stream of high-quality content is difficult and time-consuming. A lot more people consume content than create it. A study by Yahoo Research14 showed that about 20,000 Twitter users (just .05% of the user base) generated 50% of all tweets.15

Content marketing means putting as much effort into marketing your product as you put into marketing the content you create about your product. Building great content no longer means simply developing case studies or product brochures specifically about your product but delivering news stories, educational materials, and entertainment.

In terms of education, IBM, for example, has an entire portfolio of online courses. Vacation rental site Airbnb created Airbnb TV to showcase its properties in cities around the world, which in the process showcased Airbnb itself. You can no longer just market your product; you have to market your content too, and that content has to be compelling in its own right.

Note: Content marketing will be a big part of your future. Gaining market adoption isn’t just about developing great products; it’s about ensuring people understand the value of those products. Fortunately, that’s where content marketing comes to the rescue.

Crowdsource Your Way to Big Data Scale

Producing all that content might seem like a daunting and expensive task. It needn’t be. Crowdsourcing, which involves outsourcing tasks to a distributed group of people, is the easy way to generate that form of unstructured data that is so critical for marketing: content.16

Many companies already use crowdsourcing to generate articles for search engine optimization (SEO), articles that help them get listed and ranked more highly in sites like Google. Many people associate such content crowdsourcing with high-volume, low-value forms of content. But today it is possible to crowdsource high-value, high-volume content as well.

Crowdsourcing does not replace in-house content development. But it can augment it. A wide variety of sites now provide crowdsourcing services. Amazon Mechanical Turk (AMT) is frequently used for tasks like content categorization and content filtering, which are difficult for computers but easy for humans. Amazon itself uses AMT to determine if product descriptions match their images. Other companies build on top of the programming interfaces that AMT supports to deliver vertical-specific services such as audio and video transcription.17

Some sites are frequently used to find software engineers or to create large volumes of low-cost articles for SEO purposes, while sites like 99designs and Behance make it possible for creative professionals, such as graphic designers, to showcase their work and for content buyers to line up designers to deliver creative work. Meanwhile, TaskRabbit is applying crowdsourcing to offline tasks such as food delivery, shopping, house cleaning, and pet sitting.

One of the primary differences between relatively low-value content created exclusively for SEO purposes—which despite (or perhaps because of) its goal is having progressively less impact on search results—and high-value content is the authoritative nature of the latter. Low-value content tends to provide short-term fodder for search engines in the form of an article written to catch a particular keyword search.

High-value content, in contrast, tends to read or display more like professional news, education, or entertainment content. Blog posts, case studies, thought leadership pieces, technical writeups, infographics, video interviews, and the like fall into this category. This kind of content is also the kind that people want to share. Moreover, if your audience knows that you have interesting and fresh content, that gives them more reason to come back to your site on a frequent basis and a higher likelihood of staying engaged with you and your products.

The key to such content is that it must be newsworthy, educational, entertaining, or better yet, a combination of all three. The good news for companies struggling to deliver this kind of content is that crowdsourcing now makes it easier than ever.

Crowdsourcing can come in the form of using a web site like 99designs, but it doesn’t have to. As long as you provide a framework for content delivery, you can plug crowdsourcing in to generate the content. For example, if you create a blog for your web site, you can author your own blog posts but also publish those authored by contributors, such as customers and industry experts.

If you create a TV section of your site, you can post videos that are a mix of videos you create yourself, videos embedded from other sites, such as YouTube, and videos produced through crowdsourcing. Those producers can be your own employees, contractors, or industry experts conducting their own interviews. You can crowdsource webinars and webcasts in much the same way. Simply look for people who have contributed content to other sites and contact them to see if they’re interested in participating on your site.

Using crowdsourcing is an efficient way to keep your high-value content production machine humming. It simply requires a content curator or a content manager to manage the process. Of course, even that can be crowdsourced. Most importantly, as it relates to Big Data, as you create your content, you can use analytics to determine which content is most appealing, interesting, and engaging for your users. By making Big Data an integral part of your content marketing strategy, you can bring together the best of both worlds—rich content with leading edge analytics that determine which content is a hit and which isn’t.

Every Company Is Now a Media Company

In addition to creating content that’s useful in the context of your own web site, it’s also critical to create content that others will want to share and that bloggers and news outlets will want to write about. That means putting together complete content packages. Just as you would include an image or video along with a blog post on your own site, you should do the same when creating content you intend to pitch to others.

Some online writers are now measured and compensated based on the number of times their posts are viewed. As a result, the easier you make it for them to publish your content and the more compelling content you can offer them, the better. For example, a press release that comes with links to graphics that could potentially be used alongside an article is easier for a writer to publish than one that doesn’t.

A post that is ready to go, in the form of a guest post, for example, is easier for an editor or producer to work with than a press release. An infographic that comes with some text describing what its key conclusions are is easier to digest than a graphic by itself.

Once your content is published, generating visibility for it is key. Simply announcing a product update is no longer sufficient. High-volume, high-quality content production requires a media company-like mindset. Crowdsourcing is still in its infancy but you can expect the market for it to continue to grow in the coming years.

Measure Your Results

On the other end of the spectrum from content creation is analyzing all that unstructured content to understand it. Computers use natural language processing and machine learning algorithms to understand unstructured text, such as the half billion tweets that Twitter processes every day. This kind of Big Data analysis is referred to as sentiment analysis or opinion mining.

By evaluating posts on Internet forums, tweets, and other forms of text that people post online, computers can determine whether consumers view brands positively or negatively. Companies like Radian6, which Salesforce acquired for $326 million in 2011, and Collective Intellect, which Oracle acquired in 2012, perform this kind of analysis. Marketers can now measure overall performance of their brand and individual campaign performance.
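To give a feel for what sentiment analysis does, here is a deliberately simple sketch. It swaps the natural language processing and machine learning described above for a much cruder technique, counting words from small positive and negative lists. The word lists and sample posts are hypothetical; commercial systems are far more sophisticated.

```python
import re

# Tiny hypothetical word lists; real systems learn from large labeled datasets.
POSITIVE = {"love", "great", "fast", "excellent", "recommend"}
NEGATIVE = {"hate", "slow", "broken", "terrible", "refund"}


def sentiment_score(text):
    """Count positive minus negative words; above 0 is positive, below 0 negative."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


# Made-up posts about a brand.
posts = [
    "I love this product, support was fast and excellent",
    "Terrible update, the app is slow and broken",
    "Arrived on Tuesday",
]
for post in posts:
    print(sentiment_score(post), post)
```

Averaging such scores across thousands of posts, by brand or by campaign, gives the kind of measure of brand and campaign performance that marketers can track over time.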

Yet despite the rapid adoption of digital media for marketing purposes, measuring the return on investment (ROI) from marketing remains a surprisingly inexact science. According to a survey of 243 CMOs and other executives, 57% of marketers don’t base their budgets on ROI measures.18 Some 68% of respondents said they base their budgets on historical spending levels, 28% said they rely on gut instinct, and 7% said their marketing spending decisions weren’t based on any metrics.

The most advanced marketers will put the power of Big Data to work, removing more unmeasurable components from their marketing efforts and continuing to make their marketing efforts more data-driven, while others continue to rely on traditional metrics such as brand awareness or no measurement at all. This will mean a widening gap between the marketing haves and the marketing have-nots.

While marketing at its core will remain creative, the best marketers will use tools to optimize every email they send, every blog post they write, and every video they produce. Ultimately every part of marketing that can be done better by an algorithm—such as choosing the right subject line or time of day to send an email or publish a post—will be. Just as so much trading on Wall Street is now done by quants, large portions of marketing will be automated in the same way.19 Creative will pick the overall strategy. Quants will run the execution.

Of course, great marketing is no substitute for great product. Big Data can help you reach prospective customers more efficiently. It can help you better understand who your customers are and how much they’re spending. It can optimize your web site so those prospects are more likely to convert into customers once you’ve got their attention. It can get the conversation going. But in an era of millions of reviews and news that spreads like wildfire, great marketing alone isn’t enough. Delivering a great product is still job one.

____________________

1 Amazon, BMW, and New Relic, respectively.





6 and










16, which the author co-founded, is one such example.