Pentaho Analytics for MongoDB (2014)

Preface

MongoDB and Pentaho go together like yin and yang. They are emerging as a powerful combination for scalable data storage, processing, and analytics. Leading companies are pairing these complementary technologies in development labs and in production to deliver innovative analytics. These innovations are creating worldwide demand for developers with skills in both Pentaho and MongoDB.

You want to make an impact by creating innovative data storage capabilities or eye-catching data visualizations. Wouldn't it be great if you could quickly ramp up on both technologies to develop a turnkey solution for your organization? However, as with any new and emerging technology combination, organized knowledge on the combined topic is scarce.

Pentaho Analytics for MongoDB will show you how to develop an analytic solution that you can demonstrate to your colleagues. It is a practical guide to get you started with both Pentaho and MongoDB, beginning with basic MongoDB data modeling and querying and then advancing to data integration, analysis, and reporting with Pentaho. Each chapter guides you through using different components of the Pentaho platform to create analytic models and reports using a sample MongoDB database.

What this book covers

Chapter 1, Getting Started with Pentaho and MongoDB, introduces you to the powerful combination of MongoDB and Pentaho and provides step-by-step guidance on how to install and configure both technologies and restore the sample MongoDB data provided with this book.

Chapter 2, MongoDB Database Fundamentals, expands on the topic of data modeling and explains MongoDB database concepts essential to querying MongoDB data with Pentaho.

Chapter 3, Using Pentaho Instaview, shows you how to visualize data by connecting Pentaho to MongoDB. You use Instaview with the sample MongoDB database to analyze and visualize the website clickstream data.

Chapter 4, Modifying and Enhancing Instaview Transformations, introduces Pentaho Data Integration (PDI), the ETL tool used by Instaview to extract, transform, and load data from various data sources.

Chapter 5, Modifying and Enhancing Instaview Metadata, explores metadata by explaining dimensional modeling concepts and how to model metadata to better reflect business requirements.

Chapter 6, Pentaho Report Designer Fundamentals, teaches you the basics of Pentaho Report Designer (PRD) to build pixel-perfect reports sourced directly from MongoDB databases.

Chapter 7, Pentaho Report Designer Prompting and Charting, expands on the previous chapter by teaching you additional advanced PRD features. You can enhance your report with new queries, charts, and a prompt designed to make the report more interactive.

Chapter 8, Deploying Pentaho Analytics to the Web, is all about web-enabling your MongoDB data. It covers the Pentaho methods and web interfaces for connecting to, modeling, and analyzing the sample clickstream data in a web browser.

What you need for this book

You need the following software for this book:

· Pentaho Business Analytics v5.0.2 (64-bit for Windows)

· MongoDB v2.2.3 (64-bit for Windows)

This book provides two data sources for use throughout the book: a MongoDB database of sample web clickstream data and an associated comma-separated values (CSV) file containing geographic data. Both are available as a free download from http://www.packtpub.com/support.

Who this book is for

This book is intended for business analysts, data architects, and developers new to either Pentaho or MongoDB who want to deliver a complete solution for storing, processing, and visualizing data. It's assumed that you already have experience in defining the data requirements needed to support business processes, as well as exposure to database modeling, SQL querying, and reporting techniques.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. The following are some examples of these styles and an explanation of their meaning.

Code words in text are shown as follows: "$.event_data[0].event."

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

{ $match : {referring_url : "${ReferringURLParam}"}},

{ $unwind : "$event_data" },

{ $group : { _id : "$browser", event_count : { $sum : 1 } } },

{ $sort : { event_count : -1 } }

Any command-line input or output is written as follows:

cd \

move C:\mongodb-win32-* C:\mongodb

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes, for example, appear in the text like this: "Select and drag the CSV file input step onto the canvas."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.