Cloud Computing Bible (2011)

Part II: Using Platforms

Chapter 8: Using Google Web Services

IN THIS CHAPTER

Learning about Google's range of cloud-based services

Understanding Google's search model

Using Google's services in your own applications

Discovering the Google App Engine PaaS cloud service

Google is the prototypical cloud computing services company, and it supports some of the largest Web sites and services in the world. In this chapter, you learn about Google's applications and services for users and the various developer tools that Google makes available.

At the center of Google's core business is the company's search technology. Google uses automated technology to index the Web. It makes its search service available to users as a standard search engine and to developers as a collection of special search tools limited to various areas of content. The application of Google's searches to content aggregation has led to enormous societal changes and to a growing trend of disintermediation.

The most important commercial part of Google's activities is its targeted advertising business: AdWords and AdSense. Google has developed a range of services, including Google Analytics, that support its targeted advertising business.

Google applications are cloud-based applications. They span a variety of types: productivity applications, mobile applications, media delivery, social interactions, and many more. The different applications are listed in this chapter. Google has begun to commercialize some of these applications as cloud-based enterprise application suites that are being widely adopted.

Google has a very large program for developers that spans its entire range of applications and services. Among the services highlighted are Google's AJAX APIs, the Google Web Toolkit, and in particular Google's relatively new Google App Engine hosting service. Using Google App Engine, you can create Web applications in Java and Python that can be deployed on Google's infrastructure and scaled to a large size.

Exploring Google Applications

Few companies have had as much impact on their industries as Google has had on the computer industry and on the Internet in particular. Some companies may have more Internet users (Microsoft comes to mind) or have a stock valuation higher than Google (Apple currently fits that description), but Google remains both a technology and thought leader for all things Internet. For a company whose motto is “Don't be evil,” Google's consumer tracking and targeted advertising, its free sourcing of applications, and its relentless assault on one knowledge domain after another have had a profound impact on the lives of many people. I call it the Google Effect.

The bulk of Google's income comes from the sale of targeted advertising through its AdWords system, based on information that Google gathers from the activities associated with your Google account or through cookies placed on your system. In 2009, Google's revenue was $23.6 billion, and it controlled roughly 65 percent of the search market through its various sites and services. The company is highly profitable, and that has allowed Google to create a huge infrastructure as well as launch many free cloud-based applications and services that this chapter details. These applications are offered mostly on a free usage model that represents Google's Software as a Service portfolio. A business model that offers cloud-based services for free that are “good enough” is very compelling. While Google is slowly growing a subscription business selling these applications to enterprises, that subscription revenue represents only a small but growing part of Google's current income.

Google's cloud computing services fall under two umbrellas. The first and best-known offerings are an extensive set of very popular applications that Google offers to the general public. These applications include Google Docs, Google Health, Picasa, Google Mail, Google Earth, and many more. You can access a jump table of Google's cloud-based user applications by following the “More” and “Even More” links on Google's home page to the More Google Products page at http://www.google.com/intl/en/options/ shown in Figure 8.1; these features are described in Table 8.1.

Because I cover many of these products in other chapters in this book, the focus in this chapter is to survey the applications that Google offers, to understand why Google offers them as services, and to gain some insight into their potential future role. Google's cloud-based applications have put many other vendors' products—such as office suites, mapping applications, image-management programs, and many other categories of traditional shrink-wrapped software—under considerable pressure.

The second of Google's cloud offerings is its Platform as a Service developer tools. In April 2008, Google introduced a development platform for hosted Web applications using Google's infrastructure called the Google App Engine (GAE). The goal of GAE is to allow developers to create and deploy Web applications without worrying about managing the infrastructure necessary to have their applications run. GAE applications may be written using many high-level programming languages (most prominently Java and Python) and the Google App Engine Framework, which lowers the amount of development effort required to get an application up and running. Google also allows a certain free level of service, so that charges are assessed only after an application exceeds a certain level of processor load, storage usage, and network bandwidth (input/output).

FIGURE 8.1

More Google Products equals fewer commercial products.

Google App Engine applications must be written to comply with Google's infrastructure. This narrows the range of application types that can be run on GAE; it also makes it very hard to port applications to GAE. After an application is deployed on GAE, it is also difficult to port that application to another platform. Even with all these limitations, the Google App Engine provides developers a low-cost option on which to create an application that can run on a world-class cloud infrastructure—with all the attendant benefits that this type of deployment can bestow.

Surveying the Google Application Portfolio

It is fair to say that nearly all the products in Google's application and service portfolio are cloud computing services in that they all rely on systems staged worldwide on Google's one million plus servers in nearly 30 datacenters. Roughly 17 of the 48 services listed leverage Google's search engine in some specific way. Some of these search-related sites search through selected content such as Books, Images, Scholar, Trends, and more. Other sites such as Blog Search, Finance, News, and some others take the search results and format them into an Aggregation page. Figure 8.2 shows one of these aggregation pages: Google Finance.

FIGURE 8.2

Google's Finance page at http://www.google.com/finance/ is an example of an aggregation page provided by results from Google's search engine.

Indexed search

Google's search technology is based on automated page indexing and information retrieval by Web crawlers, also called spiders or robots. Content on pages is scanned up to a certain number of words and placed into an index. Google also caches copies of certain Web pages and stores copies of documents it finds such as DOC or PDF files in its cache.
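The indexing step described above can be illustrated with a toy inverted index. This is only a sketch of the general technique, not Google's implementation: the function names, the sample pages, and the `MAX_WORDS` cap are all hypothetical, and the cap simply stands in for the idea that a page is scanned "up to a certain number of words."

```python
import re
from collections import defaultdict

MAX_WORDS = 1000  # hypothetical per-page cap; the real limit is not stated in the text


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages, max_words=MAX_WORDS):
    """Map each word to the set of URLs whose first max_words words contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text)[:max_words]:
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawled pages, keyed by URL.
pages = {
    "http://example.com/a": "Cloud computing services from Google",
    "http://example.com/b": "Google search indexes the Web",
    "http://example.com/c": "Cooking with clouds of steam",
}
index = build_index(pages)
print(sorted(search(index, "google cloud")))
```

A query is answered by intersecting the URL sets for its words, which is why a lookup does not need to rescan the pages themselves.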

Google uses a patented algorithm to determine the importance of a particular page based on the number of quality links to that page from other sites, along with other factors such as the use of keywords, how long the site has been available, and traffic to the site or page. The link-based score is called PageRank. The basic PageRank method has been patented and published, but the exact way Google combines it with the other factors is a trade secret. Google is always tweaking the algorithm to prevent Search Engine Optimization (SEO) strategies from gaming the system. Based on this algorithm, Google returns what is called a Search Engine Results Page (SERP) for a query that is parsed for its keywords.
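To make the link-counting idea concrete, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's production ranking, which blends many other signals. The damping factor of 0.85 is a commonly cited default and is an assumption here, as are the function name and the tiny link graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping (0.85 is an assumed, commonly cited value) is the probability
    that a surfer follows a link rather than jumping to a random page.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: every page links to "home".
links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home"],
    "blog": ["home", "products"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this graph, "home" ends up with the highest score because every other page links to it, while "blog" scores lowest because nothing links to it.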

It is important to understand what Google (and other search engines) offers and what it doesn't offer. Google does not search all sites. If a site doesn't register with the search engine or isn't the target of a prominent link at another site, that site may remain undiscovered. Any site can place directions in its robots.txt file indicating whether the site can be searched or not, and if so what pages can be searched. Google developed something called the Sitemaps protocol, which lets a Web site list in an XML file information about how the Google robot can work with the site. Sitemaps can be useful in allowing content that isn't browsable to be crawled; they also can be useful as guides to finding media information that isn't normally considered, such as AJAX, Flash, or Silverlight media. The Sitemaps protocol has been widely adopted in the industry.
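As an illustration of both mechanisms, the sketch below uses Python's standard `urllib.robotparser` module to check a made-up robots.txt file, and `xml.etree.ElementTree` to emit a minimal Sitemaps file. The example.com URLs, the rules, and the `build_sitemap` helper are hypothetical; only the `urlset`, `url`, and `loc` element names and the namespace come from the Sitemaps protocol.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: crawlers may read everything except /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
Sitemap: http://www.example.com/sitemap.xml
"""

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def make_parser(robots_text):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def build_sitemap(urls):
    """Return a minimal Sitemaps XML document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")


parser = make_parser(ROBOTS_TXT)
for url in ("http://www.example.com/index.html",
            "http://www.example.com/private/report.html"):
    print(url, "crawlable:", parser.can_fetch("Googlebot", url))

print(build_sitemap(["http://www.example.com/index.html",
                     "http://www.example.com/products.html"]))
```

A well-behaved crawler performs a check like `can_fetch` before requesting a page. Nothing in robots.txt enforces the rules, so it is a request to crawlers rather than an access control.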

Note

While dynamic content presented in AJAX isn't normally indexed, Google now has a procedure that helps the Google engine crawl this information. You can read about it at: http://code.google.com/web/ajaxcrawling/.

The Deep Web

Online content that isn't indexed by search engines belongs to what has come to be called the “Deep Web”—that is, content on the World Wide Web that is hidden. Any site that suppresses Web crawlers from indexing it is part of the Deep Web. You need go no further than the world's number two Web site, Facebook, for a prominent example of a site that isn't indexed in search engines.

Entire networks exist that aren't searchable, particularly peer-to-peer networks. Ian Clarke's Freenet, which is a P2P network, supports both “darknet” and “opennet” connections. Freenet (http://freenetproject.org/) has been downloaded by millions of people.

The Deep Web includes:

• Database generated Web pages or dynamic content

• Pages without links

• Private or limited access Web pages and sites

• Information contained in sources available through executable code such as JavaScript

• Documents and files that aren't in a form that can be searched, which includes not only media files, but information in non-standard file formats

Although efforts are underway to enable information on the Deep Web to be searchable, the amount of stored information that is not accessible is many times larger than the amount of information that can currently be accessed. Some estimates of the size of the Deep Web suggest that it could be an order of magnitude larger than the content contained in the world's search engines.

It is always a good idea to keep these search engine limitations in mind when you work with this technology.

Aggregation and disintermediation

Aggregation pages are a great user service, but they are very controversial—as are a number of Google's search applications and services. It has long been argued that Google's display of information from various sites violates copyright laws and damages content providers. In several lawsuits, Google successfully defended its right to display capsule information under the Digital Millennium Copyright Act, while in other instances Google responds to requests from interested parties to remove information from its site.

The Authors Guild filed a class action suit in 2005 regarding unauthorized scanning and copying of books for the creation of the Google Books feature. Google reached a negotiated agreement with the Authors Guild that specified Google's obligations under the fair use exemption. Google argues that the publicity associated with searchable content adds value to that content, and it is clear that this is an argument that will continue into the future.

What is clear is that Google has been a major factor in a trend referred to as disintermediation. Disintermediation is the removal of intermediaries such as a distributor, agent, broker, or some similar functionary from a supply chain. This connects producers directly with consumers, which in many cases is a very good thing. However, disintermediation also has the unfortunate side effect of impacting organizations such as news collection agencies (newspapers, for example), publishers, many different types of retail outlets, and many other businesses, some of which played a positive role in the transactions they were involved in.

Productivity applications and services

Google began to introduce productivity applications in 2004, starting with Gmail. The expansion of these services has continued unabated ever since. Some of these applications are homegrown, but many of them came to Google through acquisition. An example of an acquired product is Writely, the online word processor that is now at the heart of Google Docs and is described in Chapter 14.

These products store your information online in a form that Google can use to build a profile of your activities, and it is unclear how the company uses the information it stores. Google states that your information is never viewed individually by humans, and the company lists its policies in the Privacy Center, which you can find at http://www.google.com/privacypolicy.html. Google has been vigilant in protecting its privacy reputation, but the collection of such a large amount of personal data must give any thoughtful person reason for pause.

Note

Space considerations preclude a more complete description of Google applications and services. Several books treat this topic in detail, including Google Apps For Dummies by Ryan Teeter and Karl Barksdale, Wiley, 2008.

Table 8.1 lists the current Google “products” that appear on its Even More page.

TABLE 8.1

Google Products

| Product Name | URL | Google Description |
| --- | --- | --- |
| Alerts | http://www.google.com/alerts?hl=en | Sends a periodic e-mail alert to you based on your search term. Search news, blogs, discussions, video, or everything. |
| Blog Search | http://www.google.com/blogsearch?hl=en | Displays an aggregation page from blogs. |
| Blogger | http://www.blogger.com/start?hl=en | A blogging site for personal blogs. See Chapter 18 for a description of blogging services. |
| Books | http://books.google.com/books?hl=en | A vast library of book content in the public domain and previews of copyrighted material. |
| Calendar | http://www.google.com/calendar/render?hl=en | Calendar service for managing schedules and events and sharing them with others. |
| Chrome | http://www.google.com/chrome?hl=en&brand=CHMI | Google's browser and operating system wannabe. |
| Checkout | http://checkout.google.com/ | A payment processing system. |
| Code | http://code.google.com/intl/en/ | Developer tools and resources. Described more fully later in this chapter. |
| Custom Search | http://www.google.com/coop/cse/?hl=en | Creates a custom search utility for a particular Web site. |
| Desktop | http://desktop.google.com/en/?ignua=1 | Indexes content on your local drive for fast searches. Adds a sidebar with gadgets. |
| Directory | http://www.google.com/dirhp?hl=en | Search the Web by topics, a la Yahoo! |
| Docs | http://docs.google.com/ | Online productivity applications. Described in Chapter 16. |
| Earth | http://earth.google.com/intl/en/ | An online atlas and mapping service with mashups. |
| Finance | http://www.google.com/finance | A financial news aggregation service and site. |
| GOOG-411 | http://www.google.com/goog-411/ | Mobile phone search. |
| Google Health | http://www.google.com/health/ | Health information management system. |
| Groups | http://www.google.com/grphp?hl=en | Discussion groups on specific topics. |
| iGoogle | http://www.google.com/ig?hl=en&source=mpes | AJAX customized home page. |
| Images | http://images.google.com/imghp?hl=en | Web image search. |
| Knol | http://knol.google.com/k?hl=en | Short articles submitted by users. |
| Labs | http://labs.google.com/ | A collection of applications and utilities under development and testing. |
| Orkut | https://www.orkut.com/ | Social media service with instant messaging. Described in Chapter 18. |
| Maps | http://maps.google.com/?hl=en | Mapping and direction service. |
| Maps for Mobile | http://www.google.com/mobile/default/maps.html | Mapping and direction service. Works with GPS on mobile devices. |
| Mobile | http://www.google.com/mobile/ | Mobile search using voice and location. |
| News | http://news.google.com/news?ned=en | News aggregation service and Web site. |
| Pack | http://pack.google.com/?hl=en | Free Windows-based software selected by Google, including Chrome, apps, Desktop, Earth, Picasa, Adobe Reader, Talk, RealPlayer, Skype, and others. |
| Patent Search | http://www.google.com/patents?hl=en | Patent and trademark search of the United States Patents and Trademark Office. |
| Picasa | http://picasa.google.com/intl/en/ | Photo-editing and management software. |
| Product Search | http://www.google.com/products | Shopping search function. |
| Reader | http://www.google.com/reader/view/?hl=en&source=mmm-en | An RSS reader. |
| Scholar | http://www.google.com/schhp?hl=en | Search site for research and scholarly work from many disciplines. |
| Search for Mobile | http://www.google.com/mobile/default/search.html | Google's search application optimized for mobile devices. |
| Sites | http://sites.google.com/ | Web site and wiki creation and staging tool. |
| SketchUp | http://sketchup.google.com/intl/en/ | Allows users to create 3D models and share them with others. |
| Talk | http://www.google.com/talk/ | Instant messaging and chat utility. Can be integrated in Gmail. |
| Toolbar | http://toolbar.google.com/intl/en/ | Provides search features inside different browsers. |
| Translate | http://translate.google.com/?hl=en | Language translation utility. |
| Trends | http://www.google.com/trends | Statistical information on different search terms. |
| Videos | http://video.google.com/?hl=en | Searches for videos on the Web. |
| Voice | http://voice.google.com/ | Free phone service, formerly called Grand Central. Described in Chapter 19. |
| Web Search | http://www.google.com/webhp?hl=en | Google's core Web search engine of indexed pages sorted with page rank. |
| Web Search Features | http://www.google.com/intl/en/help/features.html | A help page for special Web searches in Google. |
| YouTube | http://www.youtube.com/ | Flash video sharing site. Described in Chapter 19. |

Source: http://www.google.com/intl/en/options/.

Enterprise offerings

As Google has built out its portfolio, it has released special versions of its products for the enterprise. The following are among Google's products aimed at the enterprise market:

Google Commerce Search (http://www.google.com/commercesearch/): This is a search service for online retailers that markets their products in their site searches with a number of navigation, filtering, promotion, and analytical functions.

Google Site Search (http://www.google.com/sitesearch/): Google sells its search engine customized for enterprises under the Google Site Search service banner. The user enters a search string in the site's search, and Google returns the results from that site.

Google Search Appliance (http://www.google.com/enterprise/gsa): This server can be deployed within an organization to speed up both local (Intranet) and Internet searching. The three versions of the Google Search Appliance can store an index of up to 300,000 (GB-1001), 10 million (GB-5005), or 30 million (GB-8008) documents. Beyond indexing, these appliances have document management features, perform custom searches, cache content, and give local support to Google Analytics and Google Sitemaps.

Google Mini (http://www.google.com/enterprise/mini/): The Mini is the smaller version of the GSA that stores 300,000 indexed documents.

Google has also had some success in marketing its productivity applications as office suites to organizations. Google uses different names for the different bundles under a branded program called Google Apps for Business (http://www.google.com/apps/intl/en/business/index.html). Figure 8.3 shows the home page for Google's various office suite bundles. The company has packages for governments, schools, non-profits, and ISPs (a reseller program). Google claims that some 8 million students now use Google Apps, and Google Apps has had some large government purchases, such as the City of Los Angeles.

For business and other organizations such as governmental agencies, the company has a branded Google Apps Premier Edition, which is a paid service. The different versions offer Gmail, Docs, and Calendar as core applications. The Premier Edition adds 25GB of Gmail storage, e-mail server synchronization, Groups, Sites, Talk, Video, enhanced security, directory services, authentication and authorization services, and the customer's own supported domain, all hosted in the cloud. Premier Edition also adds access to Google APIs and a 24/7 support service with a Service Level Agreement that guarantees 99.9 percent uptime. The cost is $50 per user account per year.

FIGURE 8.3

Google Apps for Business comprises the commercial versions of the company's productivity suites.

To support Google's Premier and Education Editions' Gmail, Google purchased the Postini archiving and discovery service. Google Postini Services (http://www.google.com/postini/) provides security services such as threat assessment, proactive link blocking and Web policy enforcement, e-mail message encryption, message archiving, and message discovery services. These are paid services that add from $12 to $45 per user/per year, based on the options chosen. Postini allows e-mail to be retained for up to 10 years and can be used to demonstrate regulatory compliance.

Many of Google's productivity applications are quite capable, but none is a state-of-the-art client you might expect to find in a locally installed office suite. When compared one-on-one to Microsoft Office applications, Google's online offerings give users the essential features for a fraction of the Microsoft Office price.

Most sophisticated users prefer Microsoft Office, but for the average user (that is, most people) Google Apps bundles are good enough. When that low price is coupled with the collaborative tools and features Google offers, the value of Google Apps will become increasingly appealing. We can reasonably expect that cloud-based productivity apps will put their shrink-wrapped competitors under great pressure. Microsoft's current strategy of putting crippled Office applications on the Web in Windows Live isn't going to be competitive.

AdWords

AdWords (http://www.google.com/AdWords) is a targeted ad service based on matching advertisers and their keywords to users and their search profiles. This service transformed Google from a competent search engine into an industry giant and is responsible for the majority of Google's revenue stream. AdWords' two largest competitors are Microsoft adCenter (http://adcenter.microsoft.com/) and Yahoo! Search Marketing (http://searchmarketing.yahoo.com/).

Ads are displayed as text, banners, or media and can be tailored based on geographical location, frequency, IP addresses, and other factors. AdWords ads can appear not only on Google.com, but on AOL search, Ask.com, and Netscape, along with other partners. Other partners belonging to the Google Display Network can also display AdSense ads. In all these cases, the AdWords system determines which ads to match to the user searches.

Here's how the system works: Advertisers bid on keywords that are used to match a user to their product or service. If a user searches for a term such as “develop abdominal muscles,” Google returns products based on those terms. You might see an ad with Chuck Norris selling a modern-day version of a torture rack that, if it doesn't give you a six-pack, at least makes your wallet lighter. Up to 12 ads per search can be returned.

Google gets paid for the ad whenever a user clicks it. The system is referred to as pay-per-click advertising, and the success of the ad is measured by what is called the click-through rate (CTR). Google calculates a quality score for ads based on the CTR, the strength of the connection between the ad and the keywords, and the advertiser's history with Google. This quality score is a Google trade secret and is used to price the minimum bid of a keyword.
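The two measurable quantities in this model are simple arithmetic, as the following sketch shows. The click-through rate is taken here to be clicks divided by impressions (the number of times the ad was shown), which is the usual definition. The campaign figures and the 40-cent price per click are invented for illustration, and the quality score is deliberately left out because its formula is not public.

```python
def click_through_rate(clicks, impressions):
    """Return clicks divided by impressions, or 0.0 if the ad was never shown."""
    if impressions == 0:
        return 0.0
    return clicks / impressions


def campaign_cost_cents(clicks, cost_per_click_cents):
    """Under pay-per-click, the advertiser pays only for clicks, not impressions."""
    return clicks * cost_per_click_cents


# Hypothetical campaign figures.
impressions = 10_000
clicks = 250
cost_per_click_cents = 40  # assumed price; real prices are set by keyword bidding

ctr = click_through_rate(clicks, impressions)
cost = campaign_cost_cents(clicks, cost_per_click_cents)
print(f"CTR: {ctr:.2%}")
print(f"Cost: ${cost / 100:.2f}")
```

Amounts are kept in whole cents so that the cost calculation avoids floating-point rounding.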

In 2007, Google purchased DoubleClick, an Internet advertising services company. DoubleClick helps clients create ads, provides hosting services, and tracks results for analysis. DoubleClick ads leave browser cookies on users' systems; these cookies collect information that determines the number of times a user has been exposed to a particular ad, as well as various system characteristics. Some spyware trackers flag DoubleClick cookies as spyware. Both AdWords and DoubleClick are sold as packages to large clients.

Google Analytics

Google Analytics (GA; http://google.com/analytics) is a statistical tool that measures the number and types of visitors to a Web site and how the Web site is used. It is offered as a free service and has been adopted by many Web sites. GA is built on the Urchin 5 analytical package that Google acquired in 2005. Figure 8.4 shows the Google Analytics home page.

According to Builtwith.com (http://trends.builtwith.com/analytics/Google-Analytics), Google Analytics was in use on 54 percent of both the top 10,000 and the top 100,000 of the world's Web sites, and on 35 percent of the top one million. Builtwith.com speculates that the Google Analytics JavaScript tag is the most widely used URL in the world today. The service BackendBattles.com (http://www.backendbattles.com/backend/Google_Analytics) sets GA's market share at 57 percent for the top 10,000 sites.

FIGURE 8.4

Google Analytics is the most widely used Web traffic analysis tool on the Internet.

Analytics works by using a JavaScript snippet called the Google Analytics Tracking Code (GATC) on individual pages to implement a page tag. When the page loads, the JavaScript runs and creates a first-party browser cookie that can be used to manage return visitors, perform tracking, test browser characteristics, and request tracking code that identifies the location of the visitor. GATC requests and stores information from the user's account. The code stored on the user's system acts like a beacon and collects visitor data that it sends back to GA servers for processing.
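The page-tag pattern described above can be sketched in a few lines. This is a generic illustration written in Python for consistency with the other examples in this chapter; the real GATC is JavaScript that runs in the browser. The endpoint URL, the parameter names, and the cookie name are all hypothetical and are not Google Analytics' actual protocol.

```python
import uuid
from urllib.parse import parse_qs, urlencode, urlparse

BEACON_ENDPOINT = "http://analytics.example.com/collect"  # hypothetical collector


def get_or_create_visitor_id(cookies):
    """Reuse the first-party cookie if present so return visits can be recognized."""
    if "visitor_id" not in cookies:
        cookies["visitor_id"] = uuid.uuid4().hex
    return cookies["visitor_id"]


def build_beacon_url(cookies, page, referrer):
    """Encode the visit data as query parameters on a request to the collector."""
    params = {
        "vid": get_or_create_visitor_id(cookies),
        "page": page,
        "ref": referrer,
    }
    return BEACON_ENDPOINT + "?" + urlencode(params)


def parse_beacon(url):
    """What the collecting server would recover from the beacon request."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    return {key: values[0] for key, values in query.items()}


cookies = {}  # stands in for the browser's cookie jar for this site
first = build_beacon_url(cookies, "/pricing.html", "http://www.google.com/search?q=widgets")
second = build_beacon_url(cookies, "/signup.html", "")
print(parse_beacon(first))
print(parse_beacon(second))
```

Because both requests carry the same visitor ID, the collector can stitch the two page views into one visit. Blocking the cookie or the script, as discussed later in this section, breaks that link.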

Among the visitors that can be tracked are those that land from search engines; referral links in e-mail, documents, and Web pages; display ads; PPC networks; and some other sources. GA aggregates the data and presents the information in a visual form. GA also is connected to the AdWords system so it can track the performance of particular ads in different contexts. You can view referral location statistics and time spent on a page, and you can filter by visitor site. GA lets you save and store up to 50 individual site profiles, provided the site has fewer than 5 million pageviews per month. This restriction is lifted for AdWords subscribers.

GA cookies are blocked by a number of technologies, such as Firefox Adblock and NoScript or by turning off JavaScript execution in other browsers. You also can delete GA cookies manually or block them, which also defeats the system.

Google Translate

Of all the Google applications, the one that might have significant immediate impact is Google Translate. Computer technology is very close to having the necessary hardware and software to realize the dream of a “universal translator” that the TV show Star Trek proposed some 45 years ago. The current version of Google Translate performs machine translation as a cloud service between any two languages you choose from the 35 it supports. That's not truly universal, but until aliens appear, it will do for most people.

Google Translate was introduced in 2007 and replaced the SYSTRAN system that many other computer services utilize. The translation method uses a statistical approach that was first developed by Franz Josef Och in 2003. Och now heads the Translate effort at Google.

Translate uses what is referred to as a corpus linguistics approach to translation. You start off building a translation system for a language pair by collecting a database of words and then matching that database to two bilingual text corpora. A text corpus or parallel collection is a database of word and phrase usage taken from the language in everyday use, obtained by subjecting documents translated by professionals to software analysis. Among the documents that are analyzed are the translations of the United Nations and European Parliament, among others.
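A toy version of the statistical idea is sketched below. It learns word pairings from a tiny, invented English-Spanish parallel corpus by scoring how often two words appear in matching sentences (a Dice coefficient). This is a drastic simplification chosen for illustration and should not be read as Google Translate's actual method; the corpus, function names, and scoring choice are all assumptions.

```python
from collections import Counter
from itertools import product


def train(parallel_corpus):
    """Learn a word-to-word table from (source sentence, target sentence) pairs."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_sentence, tgt_sentence in parallel_corpus:
        src_words = set(src_sentence.lower().split())
        tgt_words = set(tgt_sentence.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Count every source/target word pair seen in the same sentence pair.
        pair_count.update(product(src_words, tgt_words))

    table = {}
    for s in src_count:
        candidates = [t for t in tgt_count if pair_count[(s, t)] > 0]
        # Dice coefficient: high when the two words almost always occur together.
        table[s] = max(
            candidates,
            key=lambda t: 2 * pair_count[(s, t)] / (src_count[s] + tgt_count[t]),
        )
    return table


def translate(table, text):
    """Replace each word with its best-scoring counterpart; unknown words pass through."""
    return " ".join(table.get(word, word) for word in text.lower().split())


# A tiny invented parallel corpus standing in for professionally translated documents.
corpus = [
    ("the house", "la casa"),
    ("the green house", "la casa verde"),
    ("the book", "el libro"),
    ("green book", "libro verde"),
    ("the table", "la mesa"),
]
table = train(corpus)
print(translate(table, "the green book"))
```

Word-by-word substitution ignores word order and grammatical agreement, so the output for a phrase such as "the green book" is not correct Spanish. Production systems address this with phrase-level statistics and much larger corpora, which is why sources such as the United Nations and European Parliament translations matter.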

Google Translate can be accessed directly at http://translate.google.com/translate_t?hl=en#, where you can select the language pair to be translated. You can do the following:

• Enter text directly into the text box, and click the Translate button to have the text translated.

If you select the Detect Language option, Translate tries to determine the language automatically and translate it into English.

• Enter a URL for a Web page to have Google display a copy of the translated Web page.

• Enter a phonetic equivalent for script languages.

• Upload a document to the page to have it translated.

Translate parses the document into words and phrases and applies its statistical algorithm to make the translation. As the service ages, the translations are getting more accurate, and the engine is being added to browsers such as Google Chrome and, through an extension, Mozilla Firefox. The Google Toolbar offers page translation as one of its options, selectable in the Tools settings.

The Google Translator Toolkit (http://translate.google.com/toolkit) shown in Figure 8.5 provides a means for using Translate to perform translations that you can edit. Shown in the figure is the translation of an article from the English version of Wikipedia into Spanish. The toolkit provides access to tools to aid you in editing the translation.

Translation services have been in development for many years. IBM has had a large effort in this area, and the Microsoft Bing search engine also has a translation engine. There are many other translation engines, and some of them are even cloud-based like Google Translate. What makes Google's efforts potentially unique is the company's work in language transcription, that is, the conversion of voice to text. As part of Google Voice and its work with Android-based cell phones, Google is sampling and converting millions and millions of conversations. Combining these two Web services could create a translation device based on a cloud service that would have great utility.

FIGURE 8.5

The Google Translator Toolkit lets you translate documents, Web pages, and other material from one language to another and provides tools to improve on the translation.

Exploring the Google Toolkit

Google has an extensive program that supports developers who want to leverage Google's cloud-based applications and services. These APIs reach into every corner of Google's business. Google's Code Home page for developers may be found at http://code.google.com and is shown in Figure 8.6. From this site, you can access developer tools, information on how to use its various APIs to include Google services in your own work, and technical resources.

FIGURE 8.6

Google's Code page at http://code.google.com/intl/en/

Google has a number of areas in which it offers development services, including the following:

AJAX APIs (http://code.google.com/intl/en/apis/ajax/) are used to build widgets and other applets commonly found in places like iGoogle. AJAX provides access to dynamic information using JavaScript and HTML.

Android (http://developer.android.com/index.html) is a development platform for Google's phone operating system.

Google App Engine (http://appengine.google.com/) is Google's Platform as a Service (PaaS) development and deployment system for cloud computing applications.

Google Apps Marketplace (http://code.google.com/intl/en/googleapps/marketplace/) offers application development tools and a distribution channel for cloud-based applications.

Google Gears (http://gears.google.com/) is a service that provides offline access to online data.

Google Gears includes a database engine installed on the client that caches data and synchronizes it. Gears allows cloud-based applications to be available to a client even when a network connection to the Internet isn't available. Using Gears, you could work on your mail in Gmail offline, for example.

Google Web Toolkit (GWT; http://code.google.com/webtoolkit) is a set of development tools for browser-based applications.

GWT is an open-source platform that has been used to create Google Wave and Google AdWords. GWT allows developers to write AJAX applications in Java; the GWT compiler then translates the Java source into JavaScript that runs in the browser.

Project Hosting (http://code.google.com/intl/en/projecthosting/) is a project management tool for managing source code.

The Google APIs

Most Google services are exposed by an API, which is why you find a version of Google's search engine, Google Maps, YouTube videos, Google Earth, AdWords, AdSense, and even elements of Google Apps exposed in many other Web sites. You can get to the listing of the Google APIs by clicking the More Products link on the Code page (refer to Figure 8.6). The page you see is http://code.google.com/intl/en/more/, which is shown in Figure 8.7.

Google's APIs fall into the following categories:

Ads and AdSense: These APIs allow Google's advertising services to be integrated into Web applications. The most commonly used services in this category are AdWords, AdSense, and Google Analytics.

AJAX: The Google AJAX APIs provide a means to add content such as RSS feeds, maps, search boxes, and other information sources by including a snippet of JavaScript into your code.

Browser: Google has several APIs related to building browser-based applications, including four for the Chrome browser. This category includes the Google Cloud Print API, the Installable Web Apps API for creating installation packages, the Google Web Toolkit for building AJAX applications using Java, and V8, which is a high-performance JavaScript engine.

Data: The Data APIs are those that exchange data with a variety of Google services. The list of Google Data APIs includes Google Apps, Google Analytics, Blogger, Base, Book, Calendar, Code Search, Google Earth, Google Spreadsheets, Google Notebook, and Picasa Web Albums.

Geo: A number of APIs provide location-specific information by hooking into maps and geo-specific databases. Some of the more popular APIs in this category include Google Earth, Directions, JavaScript Maps, Maps API for Flash, and Static Maps.

Search: The search APIs leverage Google's core competency and its central service. APIs such as Google AJAX Search, Book Search, Code Search, Custom Search, and Webmaster Tools Data APIs allow developers to include Google searches in their applications and Web sites.

Social: Many Google APIs are used for information exchange and communication tools. They support applications such as Gmail, Calendar, and others, and they provide a set of foundation services. The popular social APIs are Blogger Data, Calendar, Contacts, OpenSocial, Picasa, and YouTube.
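To give a feel for how lightweight some of these APIs are, consider the Static Maps service from the Geo category, which is driven entirely by a URL. The following Python sketch builds such a URL. The endpoint and the center, zoom, size, and sensor parameter names are assumptions based on how the service was documented around the time of writing; they are not given in this chapter and may have changed, so check Google's documentation before relying on them. The function name static_map_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Static Maps service (not stated in this chapter).
STATIC_MAPS_ENDPOINT = "http://maps.google.com/maps/api/staticmap"


def static_map_url(center, zoom=12, width=400, height=300):
    """Return a URL requesting a map image centered on `center`.

    `center` is a free-form address or a "lat,lng" string.
    """
    params = [
        ("center", center),
        ("zoom", zoom),
        ("size", "%dx%d" % (width, height)),  # image size in pixels
        ("sensor", "false"),                  # assumed required flag
    ]
    # urlencode escapes spaces, commas, and other reserved characters.
    return STATIC_MAPS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(static_map_url("Mountain View, CA"))
```

The resulting string can be used directly as the source of an image tag, which is why this style of API needs no JavaScript on the page.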

FIGURE 8.7

Google's More Code page exposes the extensive set of APIs offered by Google for its various products.

Table 8.2 summarizes the many different Google APIs.

TABLE 8.2

Google APIs

| API Name | URL | Category | Google Description |
| --- | --- | --- | --- |
| Google Accounts Authentication | http://code.google.com/apis/accounts/ | Infrastructure | Get access into desktop or mobile applications. |
| Google AdWords API | http://code.google.com/apis/adwords/ | Ads | Automate and streamline your campaign management activities. |
| AdSense for AJAX | http://code.google.com/apis/afa/ | Ads, AJAX | Target ads to dynamic page content. |
| AdSense for Search Ads Only | http://code.google.com/apis/afs-ads-only/ | Ads | Target ads to search results. |
| Google AJAX APIs | http://code.google.com/apis/ajax/ | AJAX | Implement rich, dynamic Web sites entirely in JavaScript and HTML. |
| Google AJAX Feed API | http://code.google.com/apis/ajaxfeeds/ | AJAX | Easily mash up public feeds using JavaScript. |
| Google AJAX Language API | http://code.google.com/apis/ajaxlanguage/ | AJAX | Easily translate and detect multiple languages using JavaScript. |
| Google AJAX Search API | http://code.google.com/apis/ajaxsearch/ | AJAX, Search | Put a Google Search box and results on your own site. |
| Google Analytics | http://code.google.com/apis/analytics/ | Ads | Track your site traffic, and write your own client applications that use Analytics data in the form of Google Data API feeds. |
| Android | http://code.google.com/android/ | Infrastructure | Build mobile apps for Android, a software stack for mobile devices. |
| Google App Engine | http://code.google.com/appengine/ | Infrastructure | Run your Web applications on Google's infrastructure. |
| Google Apps Script | http://code.google.com/googleapps/appsscript/ | Productivity | Automate tasks across Google products. |
| BigQuery (Labs) | http://code.google.com/apis/bigquery/ | Labs | Interactively analyze large datasets. |
| Google Apps | http://code.google.com/googleapps/ | Productivity | Extend Google Apps, integrate with other systems, or build new apps. |
| Google Apps Marketplace | http://code.google.com/googleapps/marketplace/ | Productivity | Sell integrated applications to millions of Google Apps users. |
| Gmail APIs and Tools | http://code.google.com/apis/gmail/ | Labs | Create gadgets for Gmail, and interact with the inbox. |
| Google Base Data API (Labs) | http://code.google.com/apis/base/ | Labs | Manage Google Base content programmatically. |
| Blogger Data API (Labs) | http://code.google.com/apis/blogger/ | Labs, Social | Enable your apps to view and update Blogger content. |
| Google Books Search APIs (Labs) | http://code.google.com/apis/books/ | Labs, Search | Search the complete index of Book Search, and integrate with its social features. |
| Google Buzz (Labs) | http://code.google.com/apis/buzz/ | Labs, Social | Share updates, photos, videos, and more, and start conversations about the things you find interesting. |
| Google Calendar APIs and Tools | http://code.google.com/apis/calendar/ | Social | Create and manage events, calendars, and gadgets for Google Calendar. |
| Chart Tools | http://code.google.com/apis/charttools/ | Productivity | Add charts and graphs to your Web page. |
| Google Checkout | http://code.google.com/apis/checkout/ | Infrastructure | Start selling on your Web site. |
| Chromium | http://code.google.com/chromium/ | Browser | Contribute to the open-source project behind Google Chrome. |
| Google Chrome Frame | http://code.google.com/chrome/chromeframe/ | Browser | Enable open Web technologies and Google Chrome's fast JavaScript implementation within Internet Explorer. |
| Google Chrome Extensions (Labs) | http://code.google.com/chrome/extensions/ | Browser, Labs | Modify and enhance the functionality of Google Chrome. |
| Installable Web Apps (Labs) | http://code.google.com/chrome/apps/ | Browser, Labs | Package your Web apps for installation in Google Chrome. |
| Closure Tools | http://code.google.com/closure/ | Labs | Create powerful and efficient JavaScript. |
| Google Cloud Print (Labs) | http://code.google.com/apis/cloudprint/ | Browser, Labs | Enable any app (Web, desktop, mobile) on any device to print to any printer. |
| Google Code Search Data API (Labs) | http://code.google.com/apis/codesearch/ | Labs, Search | Enable your apps to view data from Code Search. |
| Google Contacts API | http://code.google.com/apis/contacts/ | Social | Allow your apps to view and update user contacts. |
| Google Coupon Feeds (Labs) | http://code.google.com/apis/coupons/ | Labs | Provide coupon listings that are included in Google search results. |
| Google Custom Search API | http://code.google.com/apis/customsearch/ | Ads, Search | Create a custom search engine for your Web site or a collection of Web sites. |
| Google DoubleClick for Publishers (Labs) | http://code.google.com/apis/dfp/ | Ads, Labs | Build applications that interact directly with Google's next-generation display advertising platform. |
| Google Data Protocol | http://code.google.com/apis/gdata/ | Infrastructure | A simple, standard protocol for reading and writing data on the Web. |
| Google Desktop APIs (Labs) | http://code.google.com/apis/desktop/ | Labs, Search | Create gadgets and indexing plugins for Google Desktop. |
| Google Documents List Data API | http://code.google.com/apis/documents/ | Infrastructure | Enable your apps to view and update your list of Google Documents. |
| Google Interactive Media Ads (Labs) | http://code.google.com/apis/ima/ | Ads, Labs | Google Interactive Media Ads enable publishers to request and display ads in video, audio, and game content. |
| Google Earth API | http://code.google.com/apis/earth/ | AJAX, Geo | Embed Google Earth into your Web page. |
| Google Plugin for Eclipse | http://code.google.com/eclipse/ | Infrastructure | Enjoy simplified development of GWT and App Engine projects in the Eclipse IDE. |
| FeedBurner API (Labs) | http://code.google.com/apis/feedburner/ | Labs | Interact with FeedBurner's feed management and awareness-generating capabilities. |
| Google Finance Data API (Labs) | http://code.google.com/apis/finance/ | Labs | View and update Finance content in the form of Google Data API feeds. |
| Google Friend Connect APIs (Labs) | http://code.google.com/apis/friendconnect/ | Labs, Social | JS and REST/RPC APIs to Google Friend Connect. |
| Google Fusion Tables API (Labs) | http://code.google.com/apis/fusiontables/ | Labs | Manage Google Fusion Tables content programmatically. |
| Gadgets API | http://code.google.com/apis/gadgets/ | Social | Build mini-apps that run on multiple sites, including iGoogle, Google Desktop, or any Web page. |
| Gears (Labs) | http://code.google.com/apis/gears/ | AJAX, Labs | Enable Web applications to work offline, from your desktop PC, or your mobile device. |
| Google Health API | http://code.google.com/apis/health/ | Productivity | Manage your personal health information with Google. |
| iGoogle Developer Home (Labs) | http://code.google.com/apis/igoogle/ | Labs, Social | Build and test gadgets for iGoogle. |
| iGoogle Themes API (Labs) | http://code.google.com/apis/themes/ | Labs | Design a dynamic theme for the iGoogle home page. |
| KML | http://code.google.com/apis/kml/ | Geo | Create and share content with Google Earth, Maps, and Maps for mobile. |
| Google Latitude API (Labs) | http://code.google.com/apis/latitude/ | Geo, Labs | Build applications that read and update user locations and location histories. |
| Google Libraries API | http://code.google.com/apis/libraries/ | AJAX | Load open-source JavaScript libraries. |
| Google Moderator API (Labs) | http://code.google.com/apis/moderator/ | Labs | Collect ideas, questions, and recommendations from audiences of any size. |
| Google Geocoding API | http://code.google.com/apis/maps/documentation/geocoding/ | AJAX, Geo | Convert addresses to geographic coordinates. |
| Google Directions API | http://code.google.com/apis/maps/documentation/directions/ | AJAX, Geo | Plot directions using a variety of transportation options. |
| Google JavaScript Maps API | http://code.google.com/apis/maps/documentation/javascript/ | AJAX, Geo | Integrate Google's interactive maps with data on your site. |
| Google Maps API for Flash | http://code.google.com/apis/maps/documentation/flash/ | Geo | Integrate Google Maps in Flash applications. |
| OpenSocial | http://code.google.com/apis/opensocial/ | AJAX, Social | Build social applications that work across many Web sites. |
| Orkut Developer Home | http://code.google.com/apis/orkut/ | Social | Create social applications for the millions of global Orkut users. |
| Google Project Hosting | http://code.google.com/projecthosting/ | Infrastructure | Host your open-source project on Google Code. |
| Picasa APIs (Labs) | http://code.google.com/apis/picasa/ | Labs, Social | Create custom buttons and upload files to third-party services. |
| Picasa Web Albums Data API | http://code.google.com/apis/picasaweb/ | Social | Include Picasa Web Albums in your application or Web site. |
| Google PowerMeter API (Labs) | http://code.google.com/apis/powermeter/ | Labs | Integrate with Google PowerMeter. |
| Google Prediction API (Labs) | http://code.google.com/apis/predict/ | Labs | Add predictions to your applications. |
| PubSubHubbub | http://code.google.com/apis/pubsubhubbub/ | Labs, Social | Turn your Atom and RSS feeds into real-time streams. |
| reCAPTCHA (Labs) | http://code.google.com/apis/recaptcha/ | AJAX, Labs | Digitize books with this anti-bot service. |
| Google Safe Browsing APIs (Labs) | http://code.google.com/apis/safebrowsing/ | Labs | Download lists of suspected phishing and malware URLs. |
| Google Secure Data Connector | http://code.google.com/securedataconnector/ | Infrastructure | Connect data from behind the firewall to Google Apps. |
| Google Sidewiki API | http://code.google.com/apis/sidewiki/ | Labs, Social | Enable your apps to view data from Google Sidewiki. |
| Google Sites Data API | http://code.google.com/apis/sites/ | Labs | Enable your apps to modify content within a Google Site. |
| Google SketchUp Ruby API | http://code.google.com/apis/sketchup/ | Geo | Extend Google SketchUp with Ruby. |
| Social Graph API (Labs) | http://code.google.com/apis/socialgraph/ | Labs, Social | Enable users to quickly add their public social connections to your site. |
| Google Static Maps API | http://code.google.com/apis/maps/documentation/staticmaps/ | Geo | Embed a Google Maps image on your Web site without requiring JavaScript or any dynamic page loading. |
| Google Storage for Developers (Labs) | http://code.google.com/apis/storage/ | Labs | Store and share your data in the Google cloud. |
| Google Talk for Developers (Labs) | http://code.google.com/apis/talk/ | Labs, Social | Connect your client or network to the Google Talk network, add chatback, or customize the Google Talk gadget. |
| Google Transit Feed Specification | http://code.google.com/transit/spec/transit_feed_specification.html | Geo | Provide public transit route and schedule information for Google Maps and more. |
| Google Translator Toolkit Data API | http://code.google.com/apis/gtt/ | Labs | Build applications that can access and update translation-related data. |
| V8 | http://code.google.com/apis/v8/ | Browser | Google's high-performance, open-source, JavaScript engine. |
| Google Wave API | http://code.google.com/apis/wave | Labs, Social | Build extensions for Google Wave or embed Google Waves in your site. |
| Google Web Elements | http://www.google.com/webelements/ | Infrastructure | Add your favorite Google products to your own Web site. |
| Google Web Toolkit | http://code.google.com/webtoolkit/ | AJAX, Browser | Build AJAX apps in the Java language. |
| Google Webmaster Tools Data API (Labs) | http://code.google.com/apis/webmastertools/ | Labs, Search | View and update site information and Sitemaps in the form of feeds. |
| YouTube API | http://code.google.com/apis/youtube/ | Social | Integrate YouTube videos into your Web site or application. |

Source: http://code.google.com/intl/en/more/.

Working with the Google App Engine

Google App Engine (GAE) is a Platform as a Service (PaaS) cloud-based Web hosting service on Google's infrastructure. Figure 8.8 shows the GAE home page at http://code.google.com/intl/en/appengine/. This service allows developers to build and deploy Web applications and have Google manage all the infrastructure needs, such as monitoring, failover, clustering, machine instance management, and so forth. For an application to run on GAE, it must comply with Google's platform standards, which narrows the range of applications that can be run and severely limits those applications' portability.

GAE supports the following major features:

• Dynamic Web services based on common standards

• Automatic scaling and load balancing

• Authentication using Google's Accounts API

• Persistent storage, with queries, sorting, and transaction management

• Task queues and task scheduling

• A client-side development environment for simulating GAE on your local system

• Either of two runtime environments: Java or Python

When you deploy an application on GAE, the application can be accessed using your own domain name or using the Google Apps for Business URL.

FIGURE 8.8

The Google App Engine page at http://code.google.com/intl/en/appengine/

Google App Engine currently supports applications written in Java and in Python, although there are plans to extend support to more languages in the future. The service is meant to be language-agnostic. A number of Java Virtual Machine languages are compatible with GAE, as are several Python Web frameworks that support the Web Server Gateway Interface (WSGI) and CGI. Google has its own webapp framework designed for use with GAE. The open-source AppScale framework (http://appscale.cs.ucsb.edu/) implements the GAE APIs and may be used to run GAE applications on infrastructure other than Google's.
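Because the Python runtime accepts frameworks that speak WSGI, the smallest possible Python Web application is simply a WSGI callable. The sketch below uses only the wsgiref module from the Python standard library so that you can try it locally. The names application, call_app, and serve are illustrative, and nothing here is specific to Google: deploying to GAE would also require Google's SDK and an application configuration file, neither of which is shown.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI callable: receives the request environ, returns body chunks."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path):
    """Invoke the app in-process (no socket) and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


def serve(port=8080):
    """Serve the app on localhost until interrupted (for manual testing)."""
    make_server("localhost", port, application).serve_forever()


if __name__ == "__main__":
    print(call_app("/hello"))
```

Any framework that exposes a callable with this signature can, in principle, sit behind the same interface, which is what makes the WSGI requirement a fairly low bar for Python frameworks.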

To encourage developers to write applications using GAE, Google allows for free application development and deployment up to a certain level of resource consumption. Resource limits are described on Google's quota page at http://code.google.com/appengine/docs/quotas.html, and the quota changes from time to time.

Google uses the following pricing scheme:

• CPU time measured in CPU hours is $0.10 per hour.

• Stored data measured in GB per month is $0.15 per GB/month.

• Incoming bandwidth measured in GB is $0.10 per GB.

• Outgoing bandwidth measured in GB is $0.12 per GB.

• E-mail measured in recipients is $0.0001 per recipient.

The pricing page for Google App Engine may be found at http://code.google.com/appengine/docs/billing.html. The current resource limits are shown in Table 8.3. Consumption of resources beyond the free limit is generally on a pay-as-you-go basis, although in certain circumstances, Google may allow for additional free usage. When you enable billing for an application deployed to GAE, you pay for consumption of CPU, network I/O, and other usage above the level of the free quotas that GAE allows.

TABLE 8.3

Apps Quota Limits

| Resource | Free Default Quota | Billing Enabled Default Quota |
| --- | --- | --- |
| Applications per developer | 10 | No fixed limit |
| Application size | 150MB | No fixed limit |
| Bandwidth limit (in and out) | 1GB (each), up to 56MB/minute | 1GB free and 1,046GB max, up to 10GB/min rate |
| CPU usage | 6.5 CPU-hours/day, up to 15 CPU-minutes/minute | 6.5 CPU-hours/day free to 1,729 CPU-hours/day maximum, up to 72 CPU-minutes/minute maximum rate |
| Datastore API calls | 10 million/day, up to 57,000 queries/min | 200 million queries/day, up to 129,000 queries/min |
| Data received from API | 115GB, up to 659MB/min | 695GB, up to 1,484MB/min |
| Data sent to API | 12GB, up to 68MB/min | 72GB, up to 153MB/min |
| Data storage | 1GB | 1GB free, no maximum |
| Datastore CPU time | 60 CPU-hours, up to 20 CPU-min/min | 1,200 CPU-hours, up to 50 CPU-min/min |
| E-mails | 2,000/day, up to 8 recipients/min | 2,000 free to 7.4 million recipients max, up to 5,100 recipients/min |
| HTTP requests | 1,300,000/day, up to 7,400 requests/minute | 43,000,000 requests, up to 30,000 requests/min rate |
| Indexes | 100 | 200 |
| Storage per application (Blobstore) | 1GB | 1GB free, no limit |
| Storage API calls (Blobstore) | No free quota | 140 million calls/day, up to 72,000 calls/min |
| Storage item limit | 1GB | 1GB free, no maximum |
| Time per request allowed | 30 sec | 30 sec |
| URLFetch API calls | 657,000/day, up to 3,000 calls/min | 46 million calls/day, up to 32,000 calls/min |

Source: http://code.google.com/appengine/docs/quotas.html.
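The pricing scheme and the free quotas combine into simple arithmetic: you pay the listed rate only on the portion of usage that exceeds the free allowance. The following Python sketch works through that calculation for a single day. The rates are the ones quoted in this chapter; treating the bandwidth allowance as a daily figure is an assumption, stored data is left out because it is billed per month, and the function name daily_charge is hypothetical.

```python
from decimal import Decimal

# Rates quoted in the text, in US dollars per unit.
RATES = {
    "cpu_hours": Decimal("0.10"),
    "incoming_gb": Decimal("0.10"),
    "outgoing_gb": Decimal("0.12"),
    "email_recipients": Decimal("0.0001"),
}

# Free allowances taken from Table 8.3 (daily period assumed for bandwidth).
FREE_DAILY = {
    "cpu_hours": Decimal("6.5"),
    "incoming_gb": Decimal("1"),
    "outgoing_gb": Decimal("1"),
    "email_recipients": Decimal("2000"),
}


def daily_charge(usage, free=FREE_DAILY, rates=RATES):
    """Return the charge for one day's usage above the free allowances.

    `usage` maps resource names (the keys of RATES) to amounts consumed.
    """
    total = Decimal("0")
    for resource, used in usage.items():
        allowance = free.get(resource, Decimal("0"))
        billable = max(Decimal("0"), Decimal(str(used)) - allowance)
        total += billable * rates[resource]
    return total


if __name__ == "__main__":
    usage = {"cpu_hours": 10, "incoming_gb": 3,
             "outgoing_gb": 5, "email_recipients": 12000}
    print(daily_charge(usage))
```

For the sample usage, the billable amounts are 3.5 CPU-hours, 2GB in, 4GB out, and 10,000 recipients, which comes to $0.35 + $0.20 + $0.48 + $1.00 = $2.03 for the day. Usage that stays within the free allowances costs nothing.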

Applications running in GAE are isolated from the underlying operating system; Google describes this as running in a sandbox. The isolation allows GAE to optimize the system so Web requests can be matched to the current traffic load. It also makes applications more secure, because an application can connect to other computers only through the URL fetch and e-mail services, and URL fetch requests must use HTTP or HTTPS over the standard well-known ports. URL fetch uses the same infrastructure that retrieves Web pages on Google. The mail service also supports Gmail's messaging system.
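The outbound connection rule described above can be pictured as a simple check on each URL. The Python sketch below is only an illustration of that rule as stated in the text, namely HTTP or HTTPS on the well-known ports 80 and 443; it is not Google's enforcement logic, and the function name is_fetch_allowed is hypothetical.

```python
from urllib.parse import urlsplit

# Well-known port for each permitted scheme.
ALLOWED_PORTS = {"http": 80, "https": 443}


def is_fetch_allowed(url):
    """Return True if `url` uses HTTP or HTTPS on its standard port."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in ALLOWED_PORTS:
        return False          # e.g. ftp:, file:, or a missing scheme
    if not parts.hostname:
        return False          # no host to connect to
    try:
        port = parts.port     # None when the URL omits the port
    except ValueError:
        return False          # malformed or out-of-range port
    return port is None or port == ALLOWED_PORTS[scheme]


if __name__ == "__main__":
    for candidate in ("http://example.com/", "http://example.com:8080/",
                      "ftp://example.com/file"):
        print(candidate, is_fetch_allowed(candidate))
```

A check of this kind narrows what a compromised or careless application can reach, which is the security benefit the sandbox is meant to provide.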

Applications also are limited in that they can only read files; they cannot write to the file system directly. To store and retrieve data, an application must use the memcache (memory cache), the datastore, or some other persistent service. Memcache is a fast in-memory key-value cache that can be shared between application instances. For persistent storage of transactional data, the datastore is used. Additionally, an application responds only to a specific HTTP request (in real time, as part of a queue, or scheduled), and any request is terminated if the response takes more than 30 seconds to complete.
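A common way to combine the two storage services is to look in the cache first and fall back to the datastore only on a miss. The Python sketch below illustrates that pattern with stand-ins: FakeMemcache is a small in-process class written for this example, and an ordinary dictionary plays the part of the datastore. Neither is Google's API, and the names get_profile and ttl_seconds are hypothetical.

```python
import time


class FakeMemcache:
    """In-process stand-in for a memcache-style key-value cache."""

    def __init__(self, clock=time.monotonic):
        self._items = {}      # key -> (value, expiry time or None)
        self._clock = clock

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]   # entry is stale; treat it as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds=None):
        expires_at = None
        if ttl_seconds is not None:
            expires_at = self._clock() + ttl_seconds
        self._items[key] = (value, expires_at)


def get_profile(user_id, cache, datastore, ttl_seconds=60):
    """Return (record, source), reading the cache before the datastore."""
    key = "profile:%s" % user_id
    cached = cache.get(key)
    if cached is not None:
        return cached, "cache"
    record = datastore[user_id]    # stands in for a slower persistent read
    cache.set(key, record, ttl_seconds)
    return record, "datastore"


if __name__ == "__main__":
    cache = FakeMemcache()
    store = {"u1": {"name": "Ada"}}
    print(get_profile("u1", cache, store))
    print(get_profile("u1", cache, store))
```

Because a cache of this kind can drop entries at any time, the datastore remains the authoritative copy; the cache only shortens the path for repeated reads, which helps keep responses well inside the 30-second limit.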

GAE has a distributed datastore system that supports queries and transactions. This datastore is non-relational or “schema-less,” but it does store data objects, called entities, that are assigned properties. Queries can retrieve entities of a given kind, filtered and sorted by their property values. You can find a list of the various property types at http://code.google.com/appengine/docs/python/datastore/typesandpropertyclasses.html; the list includes string, boolean, float, datetime, blob, text, and other property types. Each application can structure its own sets of data entities. The datastore uses optimistic concurrency control and maintains strong consistency. An application can execute transactions with multiple operations, and they either all succeed or all fail as a unit. To support the distributed nature of the datastore, the concept of an entity group is employed. A transaction manages the entities it touches as a single group, and the entities in a group are stored together in the system so operations can be performed faster.
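Optimistic concurrency control and all-or-nothing transactions are easier to grasp with a toy model. The Python sketch below keeps one version counter per entity group: a transaction reads the group, changes a private copy, and commits only if the version has not moved in the meantime; otherwise it retries. This is a simplified illustration of the general technique, not Google's implementation, and TinyStore, ConflictError, and run_in_transaction are hypothetical names that do not come from the GAE SDK.

```python
import copy


class ConflictError(Exception):
    """Raised when another writer committed to the group first."""


class TinyStore:
    """Toy store holding one version counter per entity group."""

    def __init__(self):
        self._groups = {}   # group name -> (version, {key: entity dict})

    def read(self, group):
        version, entities = self._groups.get(group, (0, {}))
        return version, copy.deepcopy(entities)   # private working copy

    def commit(self, group, expected_version, entities):
        current_version, _ = self._groups.get(group, (0, {}))
        if current_version != expected_version:
            raise ConflictError(group)
        self._groups[group] = (current_version + 1, copy.deepcopy(entities))


def run_in_transaction(store, group, update, retries=3):
    """Apply `update` to a copy of the group and commit it atomically.

    If `update` raises, nothing is written. If a concurrent commit is
    detected, the whole read-update-commit cycle is retried.
    """
    for _ in range(retries):
        version, entities = store.read(group)
        update(entities)
        try:
            store.commit(group, version, entities)
            return entities
        except ConflictError:
            continue
    raise ConflictError("gave up after %d attempts" % retries)


if __name__ == "__main__":
    store = TinyStore()
    store.commit("bank", 0, {"alice": {"balance": 100}, "bob": {"balance": 0}})

    def transfer(entities):
        entities["alice"]["balance"] -= 30
        entities["bob"]["balance"] += 30

    print(run_in_transaction(store, "bank", transfer))
```

Both balance changes in the transfer are written together or not at all, and no locks are held while the update runs. Keeping the entities of one transaction in a single group is what lets a single version check cover all of them.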

The App Engine relies on the Google Accounts API for user authentication, the same system used when you log into a Google account. This provides access to e-mail and display names within your app, and it eliminates the need for an application to develop its own authentication system. Applications can use the User API to determine whether a user belongs to a specific group and even whether that person is an administrator for your application.

Many applications have been built and are running on Google App Engine. To get some idea of the range of applications that have been developed, you may want to visit the Google App Engine Gallery. This gallery is found at http://appgallery.appspot.com/ and is shown in Figure 8.9. It is searchable by keyword and category.

FIGURE 8.9

The Google App Engine gallery page at http://appgallery.appspot.com/

Summary

In this chapter, you learned about all things Google. The range of applications and services that Google offers is truly impressive; the company is essentially a self-contained ecosystem. Google's empire is built on its highly regarded search engine. The company monetized search technology by attaching targeted advertising to the searches that its users perform. This revenue has allowed Google to create a range of applications and services on the Web that are having a real impact on society.

In this chapter, the applications and services were listed, as were the APIs that are built on these applications and services. Google makes nearly all of its products accessible through its APIs. That is why you find Google's services on so many of the world's Web sites.

This chapter ended by describing Google App Engine, a Platform as a Service Web-hosting offering that allows you to create Web applications and deploy them on Google's own infrastructure. Development and deployment of these applications are free, as is some basic usage of the application. You can scale your applications on a pay-per-use basis to whatever size you need.

In Chapter 9, I examine the approach of Amazon Web Services in cloud computing. AWS offers a very different service model, operating as an Infrastructure as a Service (IaaS) provider.