Introducing Google App Engine - Programming Google App Engine (2012)

Chapter 1. Introducing Google App Engine

Google App Engine is a web application hosting service. By “web application,” we mean an application or service accessed over the Web, usually with a web browser: storefronts with shopping carts, social networking sites, multiplayer games, mobile applications, survey applications, project management, collaboration, publishing, and all the other things we’re discovering are good uses for the Web. App Engine can serve traditional website content too, such as documents and images, but the environment is especially designed for real-time dynamic applications.

In particular, Google App Engine is designed to host applications with many simultaneous users. When an application can serve many simultaneous users without degrading performance, we say it scales. Applications written for App Engine scale automatically. As more people use the application, App Engine allocates more resources for the application and manages the use of those resources. The application itself does not need to know anything about the resources it is using.

Unlike traditional web hosting or self-managed servers, with Google App Engine, you only pay for the resources you use. These resources are measured down to the gigabyte. Billed resources include CPU usage, storage per month, incoming and outgoing bandwidth, and several resources specific to App Engine services. To help you get started, every developer gets a certain amount of resources for free, enough for small applications with low traffic.

App Engine can be described as three parts: application instances, scalable data storage, and scalable services. In this chapter, we look at each of these parts at a high level. We also discuss features of App Engine for deploying and managing web applications, and for building websites integrated with other Google offerings such as Google Apps, Google Accounts, and Google Cloud Storage.

The Runtime Environment

An App Engine application responds to web requests. A web request begins when a client, typically a user’s web browser, contacts the application with an HTTP request, such as to fetch a web page at a URL. When App Engine receives the request, it identifies the application from the domain name of the address, either an .appspot.com subdomain (provided for free with every app) or a subdomain of a custom domain name you have registered and set up with Google Apps. App Engine selects a server from many possible servers to handle the request, making its selection based on which server is most likely to provide a fast response. It then calls the application with the content of the HTTP request, receives the response data from the application, and returns the response to the client.

From the application’s perspective, the runtime environment springs into existence when the request handler begins, and disappears when it ends. App Engine provides several methods for storing data that persists between requests, but these mechanisms live outside of the runtime environment. By not retaining state in the runtime environment between requests—or at least, by not expecting that state will be retained between requests—App Engine can distribute traffic among as many servers as it needs to give every request the same treatment, regardless of how much traffic it is handling at one time.

In the complete picture, App Engine allows runtime environments to outlive request handlers, and will reuse environments as much as possible to avoid unnecessary initialization. Each instance of your application has local memory for caching imported code and initialized data structures. App Engine creates and destroys instances as needed to accommodate your app’s traffic. If you enable the multithreading feature, a single instance can handle multiple requests concurrently, further utilizing its resources.

Application code cannot access the server on which it is running in the traditional sense. An application can read its own files from the filesystem, but it cannot write to files, and it cannot read files that belong to other applications. An application can see environment variables set by App Engine, but manipulations of these variables do not necessarily persist between requests. An application cannot access the networking facilities of the server hardware, although it can perform networking operations by using services.

In short, each request lives in its own “sandbox.” This allows App Engine to handle a request with the server that would, in its estimation, provide the fastest response. For web requests to the app, there is no way to guarantee that the same app instance will handle two requests, even if the requests come from the same client and arrive relatively quickly.

Sandboxing also allows App Engine to run multiple applications on the same server without the behavior of one application affecting another. In addition to limiting access to the operating system, the runtime environment also limits the amount of clock time and memory a single request can take. App Engine keeps these limits flexible, and applies stricter limits to applications that use up more resources to protect shared resources from “runaway” applications.

A request handler has up to 60 seconds to return a response to the client. While that may seem like a comfortably large amount for a web app, App Engine is optimized for applications that respond in less than a second. Also, if an application uses many CPU cycles, App Engine may slow it down so the app isn’t hogging the processor on a machine serving multiple apps. A CPU-intensive request handler may take more clock time to complete than it would if it had exclusive use of the processor, and clock time may vary as App Engine detects patterns in CPU usage and allocates accordingly.

Google App Engine provides three possible runtime environments for applications: a Java environment, a Python environment, and an environment based on the Go language (a new systems language developed at Google). The environment you choose depends on the language and related technologies you want to use for developing the application.

The Java environment runs applications built for the Java 6 Virtual Machine (JVM). An app can be developed using the Java programming language, or most other languages that compile to or otherwise run in the JVM, such as PHP (using Quercus), Ruby (using JRuby), JavaScript (using the Rhino interpreter), Scala, Groovy, and Clojure. The app accesses the environment and services by using interfaces based on web industry standards, including Java servlets and the Java Persistence API (JPA). Any Java technology that functions within the sandbox restrictions can run on App Engine, making it suitable for many existing frameworks and libraries. Notably, App Engine fully supports Google Web Toolkit (GWT), a framework for rich web applications that lets you write all the app’s code—including the user interface that runs in the browser—in the Java language, and have your rich graphical app work with all major browsers without plug-ins.

The Python environment runs apps written in the Python 2.7 programming language, using a custom version of CPython, the official Python interpreter. App Engine invokes a Python app using WSGI, a widely supported application interface standard. An application can use most of Python’s large and excellent standard library, as well as rich APIs and libraries for accessing services and modeling data. Many open source Python web application frameworks work with App Engine, such as Django, web2py, Pyramid, and Flask. App Engine even includes a lightweight framework of its own, called webapp.
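As a rough illustration of the WSGI interface mentioned above, the following sketch defines a minimal WSGI callable and drives it once without opening a network socket. The names `application` and `call_once` are hypothetical, and the sketch uses only the standard library's `wsgiref` module; it does not show how App Engine itself loads or configures an app.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable: invoked once per HTTP request."""
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from " + path + "\n").encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_once(app, path):
    """Call the app the way a server would, using a synthetic request."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_once(application, "/demo")
```

Because the callable keeps no state between calls, any instance can answer any request, which is the property the runtime environment relies on.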

All three runtime environments use the same application server model: a request is routed to an app server, an application instance is initialized (if necessary), application code is invoked to handle the request and produce a response, and the response is returned to the client. Each environment runs application code within sandbox restrictions, such that any attempt to use a feature of the language or a library that would require access outside of the sandbox returns an error.

You can configure many aspects of how instances are created, destroyed, and initialized. How you configure your app depends on your need to balance monetary cost against performance. If you prefer performance to cost, you can configure your app to run many instances and start new ones aggressively to handle demand. If you have a limited budget, you can adjust the limits that control how requests queue up to use a minimum number of instances.

I haven’t said anything about which operating system or hardware configuration App Engine uses. There are ways to figure out what operating system or hardware a server is using, but in the end it doesn’t matter: the runtime environment is an abstraction above the operating system that allows App Engine to manage resource allocation, computation, request handling, scaling, and load distribution without the application’s involvement. Features that typically require knowledge of the operating system are either provided by services outside of the runtime environment, provided or emulated using standard library calls, or restricted in sensible ways within the definition of the sandbox.

Everything stated above describes how App Engine allocates application instances dynamically to scale with your application’s traffic. You can also run code on specialized instances that you allocate and deallocate manually, known as “backends” (or simply, “servers”). These specialized instances are well-suited to background jobs and custom services, and have their own parameters for how they execute code. They do not, however, scale automatically: once you reach the capacity of a server, it’s up to your code to decide what happens next. Backends are a relatively new feature of App Engine, and this architecture is still evolving. We do not cover this feature in detail in this edition of this book.

The Static File Servers

Most websites have resources they deliver to browsers that do not change during the regular operation of the site. The images and CSS files that describe the appearance of the site, the JavaScript code that runs in the browser, and HTML files for pages without dynamic components are examples of these resources, collectively known as static files. Since the delivery of these files doesn’t involve application code, it’s unnecessary and inefficient to serve them from the application servers.

Instead, App Engine provides a separate set of servers dedicated to delivering static files. These servers are optimized for both internal architecture and network topology to handle requests for static resources. To the client, static files look like any other resource served by your app.

You upload the static files of your application right alongside the application code. You can configure several aspects of how static files are served, including the URLs for static files, content types, and instructions for browsers to keep copies of the files in a cache for a given amount of time to reduce traffic and speed up rendering of the page.

The Datastore

Most useful web applications need to store information during the handling of a request for retrieval during a later request. A typical arrangement for a small website involves a single database server for the entire site, and one or more web servers that connect to the database to store or retrieve data. Using a single central database server makes it easy to have one canonical representation of the data, so multiple users accessing multiple web servers all see the same and most recent information. But a central server is difficult to scale once it reaches its capacity for simultaneous connections.

By far the most popular kind of data storage system for web applications in the past two decades has been the relational database, with tables of rows and columns arranged for space efficiency and concision, and with indexes and raw computing power for performing queries, especially “join” queries that can treat multiple related records as a queryable unit. Other kinds of data storage systems include hierarchical datastores (filesystems, XML databases) and object databases. Each kind of database has pros and cons, and which type is best suited for an application depends on the nature of the application’s data and how it is accessed. And each kind of database has its own techniques for growing past the first server.

Google App Engine’s database system most closely resembles an object database. It is not a join-query relational database, and if you come from the world of relational-database-backed web applications (as I did), this will probably require changing the way you think about your application’s data. As with the runtime environment, the design of the App Engine datastore is an abstraction that allows App Engine to handle the details of distributing and scaling the application, so your code can focus on other things.

Entities and Properties

An App Engine application stores its data as one or more datastore entities. An entity has one or more properties, each of which has a name, and a value that is of one of several primitive value types. Each entity is of a named kind, which categorizes the entity for the purpose of queries.

At first glance, this seems similar to a relational database: entities of a kind are like rows in a table, and properties are like columns (fields). However, there are two major differences between entities and rows. First, an entity of a given kind is not required to have the same properties as other entities of the same kind. Second, an entity can have a property of the same name as another entity has, but with a different type of value. In this way, datastore entities are “schemaless.” As you’ll soon see, this design provides both powerful flexibility and some maintenance challenges.

Another difference between an entity and a table row is that an entity can have multiple values for a single property. This feature is a bit quirky, but can be quite useful once understood.

Every datastore entity has a unique key that is either provided by the application or generated by App Engine (your choice). Unlike a relational database, the key is not a “field” or property, but an independent aspect of the entity. You can fetch an entity quickly if you know its key, and you can perform queries on key values.

An entity’s key cannot be changed after the entity has been created. Neither can its kind. App Engine uses the entity’s kind and key to help determine where the entity is stored in a large collection of servers, although neither the key nor the kind ensures that two entities are stored on the same server.
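The entity model described in this section can be pictured with a small toy class. This is only a sketch in plain Python: the `Entity` class, the `Player` kind, and the property names are invented for illustration and are not the App Engine datastore API.

```python
class Entity(object):
    """Toy stand-in for a datastore entity: a kind, a key, and free-form properties."""

    def __init__(self, kind, key_name, **properties):
        self.kind = kind
        # In this sketch the key is a (kind, name) pair, separate from the properties.
        self.key = (kind, key_name)
        self.properties = dict(properties)


# Two entities of the same kind with different property sets and value types.
alice = Entity("Player", "alice", level=7)
bob = Entity("Player", "bob", level="novice", guilds=["red", "blue"])
```

Both objects share the kind `Player`, yet `level` is an integer on one and a string on the other, and only `bob` carries the multivalued `guilds` property. Nothing in the storage layer enforces a schema, so any consistency rules belong in application code.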

Queries and Indexes

A datastore query returns zero or more entities of a single kind. It can also return just the keys of entities that would be returned for a query. A query can filter based on conditions that must be met by the values of an entity’s properties, and can return entities ordered by property values. A query can also filter and sort using keys.

In a typical relational database, queries are planned and executed in real time against the data tables, which are stored just as they were designed by the developer. The developer can also tell the database to produce and maintain indexes on certain columns to speed up certain queries.

App Engine does something dramatically different. With App Engine, every query has a corresponding index maintained by the datastore. When the application performs a query, the datastore finds the index for that query, scans down to the first row that matches the query, then returns the entity for each consecutive row in the index until the first row that doesn’t match the query.

Of course, this requires that App Engine know ahead of time which queries the application is going to perform. It doesn’t need to know the values of the filters in advance, but it does need to know the kind of entity to query, the properties being filtered or sorted, and the operators of the filters and the orders of the sorts.

App Engine provides a set of indexes for simple queries by default, based on which properties exist on entities of a kind. For more complex queries, an app must include index specifications in its configuration. The App Engine SDK helps produce this configuration file by watching which queries are performed as you test your application with the provided development web server on your computer. When you upload your app, the datastore knows to make indexes for every query the app performed during testing. You can also edit the index configuration manually.

When your application creates new entities and updates existing ones, the datastore updates every corresponding index. This makes queries very fast (each query is a simple table scan) at the expense of entity updates (many tables may need updating for a single change). In fact, the performance of an index-backed query is not affected by the number of entities in the datastore, only the size of the result set.

It’s worth paying attention to indexes, as they take up space and increase the time it takes to update entities. We discuss indexes in detail in Chapter 6.
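The index scan and the write-time cost described in this section can be sketched with a sorted list. This is an in-memory analogy under stated assumptions, not the datastore's actual implementation: `level_index`, `query_level_between`, and `index_entity` are hypothetical names, and a real index is stored and maintained by the service.

```python
import bisect

# Hypothetical index for "Player entities ordered by level": sorted (level, key) rows.
level_index = [(3, "carol"), (5, "alice"), (5, "dave"), (8, "bob"), (12, "erin")]


def query_level_between(index, low, high):
    """Find the first matching row, then read consecutive rows until one fails to match."""
    start = bisect.bisect_left(index, (low,))
    keys = []
    for position in range(start, len(index)):
        level, key = index[position]
        if level > high:
            break  # first non-matching row ends the scan
        keys.append(key)
    return keys


def index_entity(index, level, key):
    """Write-time cost: a saved entity adds a row to each index that covers it."""
    bisect.insort(index, (level, key))


matches = query_level_between(level_index, 5, 8)
```

The scan touches only the matching rows plus one, so its cost follows the size of the result set rather than the size of the index. The price is paid in `index_entity`, which must run once per index on every write.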

Transactions

When an application has many clients attempting to read or write the same data simultaneously, it is imperative that the data always be in a consistent state. One user should never see half-written data or data that doesn’t make sense because another user’s action hasn’t completed.

When an application updates the properties of a single entity, App Engine ensures that either every update to the entity succeeds all at once, or the entire update fails and the entity remains the way it was prior to the beginning of the update. Other users do not see any effects of the change until the change succeeds.

In other words, an update of a single entity occurs in a transaction. Each transaction is atomic: the transaction either succeeds completely or fails completely, and cannot succeed or fail in smaller pieces.

An application can read or update multiple entities in a single transaction, but it must tell App Engine which entities will be updated together when it creates the entities. The application does this by creating entities in entity groups. App Engine uses entity groups to control how entities are distributed across servers, so it can guarantee a transaction on a group succeeds or fails completely. In database terms, the App Engine datastore natively supports local transactions.

When an application calls the datastore API to update an entity, the call returns only after the transaction succeeds or fails, and it returns with knowledge of success or failure. For updates, this means the service waits for all entities to be updated before returning a result. The application can call the datastore asynchronously, such that the app code can continue executing while the datastore is preparing a result. But the update itself does not return until it has confirmed the change.

If a user tries to update an entity while another user’s update of the entity is in progress, the datastore returns immediately with a contention failure exception. Imagine the two users “contending” for a single piece of data; the first user to commit an update wins. The other user must try her operation again, possibly rereading values and calculating the update from fresh data. Contention is expected, so retries are common. In database terms, App Engine uses optimistic concurrency control: each user is “optimistic” that her commit will succeed, so she does so without placing a lock on the data.

Reading the entity never fails due to contention. The application just sees the entity in its most recent stable state. You can also read multiple entities from the same entity group by using a transaction to ensure that all the data in the group is current and consistent with itself.

In most cases, retrying a transaction on a contested entity will succeed. But if an application is designed such that many users might update a single entity, the more popular the application gets, the more likely users will get contention failures. It is important to design entity groups to avoid a high rate of contention failures even with a large number of users.

It is often important to read and write data in the same transaction. For example, the application can start a transaction, read an entity, update a property value based on the last read value, save the entity, and then commit the transaction. In this case, the save action does not occur unless the entire transaction succeeds without conflict with another transaction. If there is a conflict and the app wants to try again, the app should retry the entire transaction: read the (possibly updated) entity again, use the new value for the calculation, and attempt the update again. By including the read operation in the transaction, the datastore can ensure that related writes and reads from multiple simultaneous requests do not interleave and produce inconsistent results.
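The read-modify-write retry loop described above can be sketched as follows. The version-number check is an assumption chosen to make the example small; the text does not say how the datastore detects conflicts. `VersionedStore`, `Conflict`, and `increment_with_retry` are hypothetical names, not datastore APIs.

```python
class Conflict(Exception):
    """Stand-in for a contention failure: another writer committed first."""


class VersionedStore(object):
    """Toy single-entity store that checks a version number at commit time."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def commit(self, new_value, read_version):
        if read_version != self.version:
            raise Conflict("entity changed since it was read")
        self.value = new_value
        self.version += 1


def increment_with_retry(store, attempts=3, before_commit=None):
    """Read, compute, commit; on conflict, redo the whole read-modify-write."""
    for _ in range(attempts):
        value, version = store.read()
        if before_commit is not None:
            before_commit(store)  # hook used here to simulate a competing writer
        try:
            store.commit(value + 1, version)
            return store.value
        except Conflict:
            continue  # reread fresh data and try again
    raise Conflict("gave up after %d attempts" % attempts)


store = VersionedStore(10)
interfered = []


def rival(s):
    """Competing writer that commits once, between our read and our commit."""
    if not interfered:
        interfered.append(True)
        value, version = s.read()
        s.commit(value + 100, version)


result = increment_with_retry(store, before_commit=rival)
```

The first attempt loses to the rival and is discarded. The second attempt rereads the rival's value and commits on top of it, so neither update is lost.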

With indexes and optimistic concurrency control, the App Engine datastore is designed for applications that need to read data quickly, ensure that the data it sees is in a consistent form, and scale the number of users and the size of the data automatically. While these goals are somewhat different from those of a relational database, they are especially well suited to web applications.

The Services

The datastore’s relationship with the runtime environment is that of a service: the application uses an API to access a separate system that manages all its own scaling needs separately from application instances. Google App Engine includes several other self-scaling services useful for web applications.

The memory cache (or memcache) service is a short-term key-value storage service. Its main advantage over the datastore is that it is fast, much faster than the datastore for simple storage and retrieval. The memcache stores values in memory instead of on disk for faster access. It is distributed like the datastore, so every request sees the same set of keys and values. However, it is not persistent like the datastore: if a server goes down, such as during a power failure, memory is erased. It also has a more limited sense of atomicity and transactionality than the datastore. As the name implies, the memcache service is best used as a cache for the results of frequently performed queries or calculations. The application checks for a cached value, and if the value isn’t there, it performs the query or calculation and stores the value in the cache for future use.
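The check-then-compute pattern described above might look like the sketch below. A plain dictionary stands in for the memcache service, and `expensive_query` and `get_profile` are hypothetical names. The real service can drop values at any time, so the code treats every lookup as a possible miss.

```python
cache = {}   # stand-in for the distributed memory cache
calls = []   # records how often the slow path runs


def expensive_query(user_id):
    """Placeholder for a slow datastore query or calculation."""
    calls.append(user_id)
    return {"user": user_id, "score": len(user_id) * 10}


def get_profile(user_id):
    """Return the cached value if present; otherwise compute it and cache it."""
    key = "profile:" + user_id
    value = cache.get(key)
    if value is None:
        value = expensive_query(user_id)
        cache[key] = value
    return value


first = get_profile("ann")
second = get_profile("ann")
```

The second call is served from the cache, so the slow path runs once. If the cached entry disappears, the next call recomputes it and nothing else in the app needs to change.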

App Engine provides a storage system for large values called the Blobstore. Your app can use the Blobstore to store, manage, and serve large files, such as images, videos, or file downloads. The Blobstore can also accept large files uploaded by users and offline processes. This service is distinct from the datastore to work around infrastructure limits on request and response sizes between users, application servers, and services. Application code can read values from the Blobstore in chunks that fit within these limits. Code can also query for metadata about Blobstore values.
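Reading a large value in pieces that each fit within a size limit can be sketched with a generator. Here an in-memory `io.BytesIO` object stands in for a stored blob, and `read_in_chunks` and the 1024-byte chunk size are illustrative assumptions, not Blobstore API names or actual service limits.

```python
import io


def read_in_chunks(stream, chunk_size):
    """Yield successive pieces of at most chunk_size bytes until the stream is empty."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


blob = io.BytesIO(b"x" * 2500)  # stand-in for a large stored value
sizes = [len(chunk) for chunk in read_in_chunks(blob, 1024)]
```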

App Engine applications can access other web resources using the URL Fetch service. The service makes HTTP requests to other servers on the Internet, such as to retrieve pages or interact with web services. Since remote servers can be slow to respond, the URL Fetch API supports fetching URLs in the background while a request handler does other things, but in all cases the fetch must start and finish within the request handler’s lifetime. The application can also set a deadline, after which the call is canceled if the remote host hasn’t responded.

App Engine applications can send messages using the Mail service. Messages can be sent on behalf of the application or on behalf of the user who made the request that is sending the email (if the message is from the user). Many web applications use email to notify users, confirm user actions, and validate contact information.

An application can also receive email messages. If an app is configured to receive email, a message sent to the app’s address is routed to the Mail service, which delivers the message to the app in the form of an HTTP request to a request handler.

App Engine applications can send and receive instant messages to and from chat services that support the XMPP protocol, including Google Talk. An app sends an XMPP chat message by calling the XMPP service. As with incoming email, when someone sends a message to the app’s address, the XMPP service delivers it to the app by calling a request handler.

You can accomplish real-time two-way communication directly with a web browser using the Channel service, a clever implementation of the Comet model of browser app communication. Channels allow browsers to keep a network connection open with a remote host to receive real-time messages long after a web page has finished loading. App Engine fits this into its request-based processing model by using a service: browsers do not connect directly to application servers, but instead connect to “channels” via a service. When an application decides to send a message to a client (or set of clients) during its normal processing, it calls the Channel service with the message. The service handles broadcasting the message to clients, and manages open connections. Paired with web requests for messages from clients to apps, the Channel service provides real-time browser messaging without expensive polling. App Engine includes a JavaScript client so your code in the browser can connect to channels.

The image processing service can do lightweight transformations of image data, such as to make thumbnail images of uploaded photos. The image processing tasks are performed using the same infrastructure Google uses to process images with some of its other products, so the results come back quickly. This service includes special support for interacting with large data objects stored in the Blobstore, so it can operate on large image files uploaded by users.

TIP

Neither the Channel service nor the Images service is discussed in this book. See the official App Engine website for more information about these services.

As of the printing of this edition, App Engine has several compelling new services under development, some available for public beta testing. The Search service in particular may prove to be a major part of document-oriented websites and apps in the near future. Because these services are still being developed and may change, they too have been omitted from this edition. Again, see the official site for the latest.

Namespaces

The datastore, Blobstore, and memcache together store data for an app. It’s often useful to partition an app’s data on a global scale. For example, an app may be serving multiple companies, where each company is to see its own isolated instance of the application, and no company should see any data that belongs to any other company. You could implement this partitioning in the application code, using a company ID as the prefix to every key. But this is prone to error: a bug in the code may expose or modify data from another partition.

To better serve this case, App Engine provides this partitioning feature at the infrastructure level. An app can declare it is acting in a namespace by calling an API. All subsequent uses of any of the data services restrict themselves to the namespace automatically. The app does not need to keep track of which namespace it is in after the initial declaration.

The default namespace has a name equal to the empty string. This namespace is distinct from other namespaces. (There is no “global” namespace.) All data belongs to a namespace.
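The effect of declaring a namespace once and letting the storage layer apply it can be sketched with a toy key-value store. `NamespacedStore` and its methods are hypothetical and exist only to show the idea; they are not the App Engine namespace API.

```python
class NamespacedStore(object):
    """Toy key-value store that scopes every operation to the current namespace."""

    def __init__(self):
        self._data = {}
        self._namespace = ""  # the default namespace is the empty string

    def set_namespace(self, namespace):
        self._namespace = namespace

    def put(self, key, value):
        self._data[(self._namespace, key)] = value

    def get(self, key):
        return self._data.get((self._namespace, key))


store = NamespacedStore()
store.set_namespace("acme")
store.put("logo", "acme.png")

store.set_namespace("globex")
globex_view = store.get("logo")  # the same key is invisible from another namespace
store.put("logo", "globex.png")

store.set_namespace("")
default_view = store.get("logo")  # the default namespace is separate as well
```

Application code calls `put` and `get` with ordinary keys. The partitioning happens below it, so a forgotten prefix cannot leak one tenant's data to another.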

See the official documentation for more information on the namespace feature.

Google Accounts, OpenID, and OAuth

App Engine features integration with Google Accounts, the user account system used by Google applications such as Google Mail, Google Docs, and Google Calendar. You can use Google Accounts as your app’s account system, so you don’t have to build your own. And if your users already have Google accounts, they can sign in to your app using their existing accounts, with no need to create new accounts just for your app.

Google Accounts is especially useful for developing applications for your company or organization using Google Apps. With Google Apps, your organization’s members can use the same account to access your custom applications as well as their email, calendar, and documents.

Of course, there is no obligation to use Google Accounts. You can always build your own account system, or use an OpenID provider. App Engine includes special support for using OpenID providers in some of the same ways you can use Google Accounts. This is useful when building applications for the Google Apps Marketplace, which uses OpenID to integrate with enterprise single sign-on services.

App Engine includes built-in support for OAuth, a protocol that makes it possible for users to grant permission to third-party applications to access personal data in another service, without having to share their account credentials with the third party. For instance, a user might grant a mobile phone application access to her Google Calendar account, to read appointment data and create new appointments on her behalf. App Engine’s OAuth support makes it straightforward to implement an OAuth service for other apps to use. Note that the built-in OAuth feature only works when using Google Accounts, not OpenID or a proprietary identity mechanism.

There is no custom support for implementing an OAuth client in an App Engine app, but there are OAuth client libraries for Python and Java that work fine with App Engine.

Task Queues and Cron Jobs

A web application has to respond to web requests very quickly, usually in less than a second and preferably in just a few dozen milliseconds, to provide a smooth experience to the user sitting in front of the browser. This doesn’t give the application much time to do work. Sometimes, there is more work to do than there is time to do it. In such cases it’s usually OK if the work gets done within a few seconds, minutes, or hours, instead of right away, as the user is waiting for a response from the server. But the user needs a guarantee that the work will get done.

For this kind of work, an App Engine app uses task queues. Task queues let you describe work to be done at a later time, outside the scope of the web request. Queues ensure that every task gets done eventually. If a task fails, the queue retries the task until it succeeds.

There are two kinds of task queues: push queues and pull queues. With push queues, each task record represents an HTTP request to a request handler. App Engine issues these requests itself as it processes a push queue. You can configure the rate at which push queues are processed to spread the workload throughout the day. With pull queues, you provide the mechanism, such as a custom computational engine, that takes task records off the queue and does the work. App Engine manages the queuing aspect of pull queues.

A push queue performs a task by calling a request handler. It can include a data payload provided by the code that created the task, delivered to the task’s handler as an HTTP request. The task’s handler is subject to the same limits as other request handlers, with one important exception: a single task handler can take as long as 10 minutes to perform a task, instead of the 60-second limit applied to user requests. It’s still useful to divide work into small tasks to take advantage of parallelization and queue throughput, but the higher time limit makes tasks easier to write in straightforward cases.
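The retry behavior of a push queue can be approximated with a small in-process loop. This is a simulation only: `run_push_queue` and `flaky_handler` are hypothetical, the handler is a plain function rather than an HTTP request handler, and the sketch caps the number of attempts so that the example always terminates.

```python
from collections import deque


def run_push_queue(payloads, handler, max_attempts=5):
    """Call handler(payload) for each task, re-enqueueing tasks whose handler fails."""
    queue = deque((payload, 1) for payload in payloads)
    completed = []
    while queue:
        payload, attempt = queue.popleft()
        try:
            handler(payload)
            completed.append(payload)
        except Exception:
            if attempt < max_attempts:
                queue.append((payload, attempt + 1))  # retry later
    return completed


seen = {}


def flaky_handler(payload):
    """Fails the first two times it sees 'resize-b', then succeeds."""
    seen[payload] = seen.get(payload, 0) + 1
    if payload == "resize-b" and seen[payload] < 3:
        raise RuntimeError("transient failure")


done = run_push_queue(["resize-a", "resize-b", "resize-c"], flaky_handler)
```

A failed task goes to the back of the queue, so other tasks proceed while it waits for its next attempt. Since a handler may run more than once for the same task, it is safest to write handlers whose repeated execution is harmless.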

An especially powerful feature of task queues is the ability to enqueue a task within a datastore transaction. This ensures that the task will be enqueued only if the rest of the datastore transaction succeeds. You can use transactional tasks to perform additional datastore operations that must be consistent with the transaction eventually, but that do not need the strong consistency guarantees of the datastore’s local transactions.
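To see why this guarantee matters, consider a toy model of a transaction that buffers both its writes and its tasks, and releases them together only on commit. This is a sketch under stated assumptions, not the datastore or task queue API: the Transaction class, run_in_transaction, and the order keys are all hypothetical, and a dict and a list stand in for the datastore and the queue.

```python
class Transaction:
    """Toy transaction that buffers writes and tasks until commit."""

    def __init__(self, store, queue):
        self._store = store
        self._queue = queue
        self._writes = {}
        self._tasks = []

    def put(self, key, value):
        self._writes[key] = value

    def enqueue(self, task):
        self._tasks.append(task)

    def commit(self):
        # Writes and tasks become visible together.
        self._store.update(self._writes)
        self._queue.extend(self._tasks)


def run_in_transaction(store, queue, work):
    """Call work(txn); commit writes and tasks together, or neither."""
    txn = Transaction(store, queue)
    try:
        work(txn)
    except Exception:
        return False  # nothing was written and nothing was enqueued
    txn.commit()
    return True


def place_order(txn):
    txn.put("order:1", "paid")
    txn.enqueue("send_receipt:order:1")


def failing_order(txn):
    txn.put("order:2", "paid")
    txn.enqueue("send_receipt:order:2")
    raise ValueError("payment declined")


store, queue = {}, []
ok = run_in_transaction(store, queue, place_order)
failed = run_in_transaction(store, queue, failing_order)
```

After both calls, the store holds only the first order and the queue holds only its receipt task. The failed transaction leaves behind neither a stray write nor a task that refers to data that was never saved.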

App Engine has another service for executing tasks at specific times of the day, called the scheduled tasks service. Scheduled tasks are also known as “cron jobs,” a name borrowed from a similar feature of the Unix operating system. The scheduled tasks service can invoke a request handler at a specified time of the day, week, or month, based on a schedule you provide when you upload your application. Scheduled tasks are useful for doing regular maintenance or sending periodic notification messages.

We’ll look at task queues and scheduling and some powerful uses for them in Chapter 16.

Developer Tools

Google provides free tools for developing App Engine applications in Java or Python. You can download the software development kit (SDK) for your chosen language and your computer’s operating system from Google’s website. Java users can get the Java SDK in the form of a plug-in for the Eclipse integrated development environment. Python developers using Windows or Mac OS X can get the Python SDK in the form of a GUI application. Both SDKs are also available as ZIP archives of command-line tools, for using directly or integrating into your development environment or build system.

Each SDK includes a development web server that runs your application on your local computer and simulates the runtime environment, the datastore, the services, and task queues. The development server automatically detects changes in your source files and reloads them as needed, so you can keep the server running while you develop the application.

If you’re using Eclipse, you can run the Java development server in the interactive debugger, and can set breakpoints in your application code. You can also use Eclipse for Python app development by using PyDev, an Eclipse extension that includes an interactive Python debugger. (Using PyDev is not covered in this book, but there are instructions on Google’s site. Also check out my webcast of June 14, 2012, entitled “Python for Google App Engine,” linked from the book’s website.)

The development version of the datastore can automatically generate configuration for query indexes as the application performs queries, which App Engine will use to prebuild indexes for those queries. You can turn this feature off for testing whether queries have appropriate indexes in the configuration.

The development web server includes a built-in web application for inspecting the contents of the (simulated) datastore. You can also create new datastore entities using this interface for testing purposes.

Each SDK also includes a tool for interacting with the application running on App Engine. Primarily, you use this tool to upload your application code to App Engine. You can also use this tool to download log data from your live application, or manage the live application’s datastore indexes and service configuration.

The Python and Java SDKs include a feature you can install in your app for secure remote programmatic access to your live application. The Python SDK includes tools that use this feature for bulk data operations, such as uploading new data from a text file and downloading large amounts of data for backup or migration purposes. The SDK also includes a Python interactive command-line shell for testing, debugging, and manually manipulating live data. These tools are in the Python SDK, but also work with Java apps by using the Java version of the remote access feature. You can write your own scripts and programs that use the remote access feature for large-scale data transformations or other maintenance.

But wait, there’s more! The SDKs also include libraries for automated testing and for gathering reports on application performance. We’ll cover one such tool, AppStats, in Chapter 17. (For Python unit testing, see again the aforementioned “Python for Google App Engine” webcast.)

The Administration Console

When your application is ready for its public debut, you create an administrator account and set up the application on App Engine. You use your administrator account to create and manage the application, view its resource usage statistics and message logs, and more, all with a web-based interface called the Administration Console.

You sign in to the Administration Console by using your Google account. You can use your current Google account if you have one. You may also want to create a Google account just for your application, which you might use as the “from” address on email messages. Once you have created an application by using the Administration Console, you can add other Google accounts as administrators. Any administrator can access the Console and upload new versions of the application.

The Console gives you access to real-time performance data about how your application is being used, as well as access to log data emitted by your application. You can also query the datastore for the live application by using a web interface, and check on the status of datastore indexes. (Newly created indexes with large data sets take time to build.)

When you upload new code for your application, the uploaded version is assigned a version identifier, which you specify in the application’s configuration file. The version used for the live application is whichever major version is selected as the “default.” You control which version is the “default” by using the Administration Console. You can access nondefault versions by using a special URL containing the version identifier. This allows you to test a new version of an app running on App Engine before making it official.

You use the Console to set up and manage the billing account for your application. When you’re ready for your application to consume more resources beyond the free amounts, you set up a billing account using a credit card and Google Accounts. The owner of the billing account sets a budget, a maximum amount of money that can be charged per calendar day. Your application can consume resources until your budget is exhausted, and you are only charged for what the application actually uses beyond the free amounts.
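The relationship between usage, free amounts, and the budget can be written as a small calculation. This is a simplified sketch for a single resource. The function daily_charge and every number in it, including the unit price, are hypothetical. Actual billing covers several resources, each with its own free amount and rate.

```python
def daily_charge(usage, free_quota, unit_price, daily_budget):
    """Return one day's charge for a single resource.

    Only usage beyond the free quota is billable, and the charge never
    exceeds the daily budget. In this simplified model the cap stands in
    for the app no longer consuming billable resources once the budget
    is exhausted.
    """
    billable_units = max(0.0, usage - free_quota)
    cost = billable_units * unit_price
    return min(cost, daily_budget)


# Hypothetical numbers: 1.0 unit free per day, 0.12 per unit, budget of 5.00.
under_free = daily_charge(usage=0.8, free_quota=1.0, unit_price=0.12, daily_budget=5.00)
typical = daily_charge(usage=3.0, free_quota=1.0, unit_price=0.12, daily_budget=5.00)
capped = daily_charge(usage=100.0, free_quota=1.0, unit_price=0.12, daily_budget=5.00)
```

In the first case the app stays within the free amount and the charge is zero. In the second, only the two units beyond the free amount are billed. In the third, the budget limits the charge for the day.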

Things App Engine Doesn’t Do...Yet

When people first start using App Engine, there are several things they ask about that App Engine doesn’t do. Some of these are things Google may implement in the near future, and others run against the grain of the App Engine design and aren’t likely to be added. Listing such features in a book is difficult, because by the time you read this, Google may have already implemented them. (Indeed, this list has gotten substantially shorter since the first edition of this book.) But it’s worth noting these features here, especially to note workaround techniques.

An app can receive incoming email and XMPP chat messages at several addresses. As of this writing, none of these addresses can use a custom domain name. See Chapter 14 and Chapter 15 for more information on incoming email and XMPP addresses.

An app can accept web requests on a custom domain using Google Apps. Google Apps associates a subdomain of your custom domain with an app, and this subdomain can be www if you choose (http://www.example.com/). Requests for this domain, and all subdomains (http://foo.www.example.com/), are routed to your application. Google Apps does not yet support requests for “naked” domains, such as http://example.com/.

App Engine does not support streaming or long-term connections directly to application servers. Apps can use the Channel service to push messages to browsers in real time. XMPP is also an option for messaging in some cases, using an XMPP service (such as Google Talk). These mechanisms are preferred to a polling technique, where the client asks the application for updates on a regular basis. Polling is difficult to scale (5,000 simultaneous users polling every 5 seconds = 1,000 queries per second), and is not appropriate for all applications. Also note that request handlers cannot communicate with the client while performing other calculations. The server sends a response to the client’s request only after the handler has returned control to the server.
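The polling arithmetic is easy to reproduce for other numbers. The helper below is a hypothetical illustration, and it assumes polls are spread evenly over time, so the result is an average rate and not a peak.

```python
def polling_qps(users, interval_seconds):
    """Average queries per second when every user polls once per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return users / float(interval_seconds)


book_example = polling_qps(5000, 5)     # the figures used in the text
slower_polling = polling_qps(5000, 30)  # same users, polling every 30 seconds
```

The first call reproduces the 1,000 queries per second from the text. Stretching the interval to 30 seconds still leaves the app answering well over a hundred requests every second, most of which would report that nothing has changed.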

App Engine only supports web requests via HTTP or HTTPS, and email and XMPP messages via the services. It does not support other kinds of network connections. For instance, a client cannot connect to an App Engine application via FTP.

The App Engine datastore does not support full-text search queries, such as for implementing a search engine for a content management system. The Search service, which as of this writing is in beta testing, will provide powerful document-based search functionality with good datastore integration. But full-text search is not an inherent ability of the datastore’s query engine.

Getting Started

You can start developing applications for Google App Engine without creating an account. All you need to get started is the App Engine SDK appropriate for your choice of language, which is a free download from the App Engine website:

http://developers.google.com/appengine/

While you’re there, check out the official “Getting Started Guide” for your language, which demonstrates how to create an application and use several of App Engine’s features.

In the next chapter, we’ll describe how to create a new project from start to finish, including how to create an account, upload the application, and run it on App Engine.