Programming Google App Engine
Chapter 18. The Django Web Application Framework
This chapter discusses how to use a major web application framework with the Python runtime environment. Java developers may be interested in the general discussion of frameworks that follows, but the rest of this chapter is specific to Python. Several frameworks for Java are known to work well with App Engine, and you can find information on these on the Web.
As with all major categories of software, web applications have a common set of problems that need to be solved in code. Most web apps need software to interface with the server’s networking layer, communicate using the HTTP protocol, define the resources and actions of the application, describe and implement the persistent data objects, enforce site-wide policies such as access control, and describe the browser interface in a way that makes it easily built and modified by designers. Many of these components involve complex and detailed best practices for interoperating with remote clients and protecting against a variety of security vulnerabilities.
A web application framework is a collection of solutions and best practices that you assemble and extend to make an app. A framework provides the structure for an app, and most frameworks can be run without changes to demonstrate that the initial skeleton is functional. You use the toolkit provided by the framework to build the data model, business logic, and user interface for your app, and the framework takes care of the details. Frameworks are so useful that selecting one is often the first step when starting a new web app project.
Notice that App Engine isn’t a web application framework, exactly. App Engine provides scaling infrastructure, services, and interfaces that solve many common problems, but these operate at a level of abstraction just below most web app frameworks. A better example of a framework is webapp2, a framework for Python included with the App Engine Python SDK that we’ve been using in examples throughout the book so far. webapp2 lets you implement request handlers as Python classes, and it takes care of the details of interfacing with the Python runtime environment and routing requests to handler classes.
Several major frameworks for Python work well with App Engine, including Django, Pyramid, web2py, and Flask, and some have explicit support for App Engine. These frameworks are mature, robust, and widely used, and have large, thriving support communities and substantial online documentation. You can buy books about some of these frameworks.
Not every feature of every framework works with App Engine. Most notably, many frameworks include a mechanism for defining data models, but these are usually implemented for relational databases, and don’t work with the App Engine datastore. In some cases, you can just replace the framework’s data modeling library with App Engine’s ext.db library (or ext.ndb). Some features of frameworks also have issues running within App Engine’s sandbox restrictions, such as by depending upon unsupported libraries. Developers have written adapter components that work around many of these issues.
In general, to use a framework, you add the framework’s libraries to your application directory and then map all dynamic URLs (all URLs except those for static files) to a script that invokes the framework. Because the interface between the runtime environment and the app is WSGI, you can associate the framework’s WSGI adapter with the URL pattern in app.yaml, just as we did with webapp2. Most frameworks have their own mechanism for associating URL paths with request handlers, and it’s often easiest to send all dynamic requests to the framework and let it route them. You may still want to use app.yaml to institute Google Accounts–based access control for some URLs.
Django is a popular web application framework for Python, with a rich stack of features and pluggable components. It’s also large, consisting of thousands of files. To make it easier to use Django on App Engine, the Python runtime environment includes the Django libraries, so you do not have to upload all of Django with your application files. The Python SDK bundles several versions of Django as well.
App Engine 1.7.0 provides Django 1.3. This version of Django does not include explicit support for App Engine, but many features work as documented with minimal setup. In this chapter, we discuss how to use Django via the provided libraries, and discuss which of Django’s features work and which ones don’t when using Django this way.
We also look at a third-party open source project called django-nonrel, a fork of Django 1.3 that enables most of Django’s features for App Engine. In particular, it connects most of Django’s own data modeling features to the App Engine datastore, so components and features that need a database can work with App Engine, without knowing it’s App Engine behind the scenes. The data modeling layer also provides a layer of portability for your app, so your app code can run in environments that provide either a relational database or a nonrelational datastore (hence “nonrel”).
The official documentation for Django is famously good, although it relies heavily on the data modeling features for examples. For more information about Django, see the Django project website:
http://www.djangoproject.com/
TIP
As of this writing, Google is testing a new service for hosting SQL databases, called Google Cloud SQL. The service provides a fully functional MySQL instance running on Google servers, which you can use with software like the Django application framework. However, a hosted SQL database does not scale automatically like the App Engine datastore. Check the Google Cloud website for updates on this feature. In this chapter, we’ll assume your app will use the App Engine datastore for storage.
Using the Bundled Django Library
The App Engine 1.7.0 Python SDK provides Django 1.3 in its lib/django_1_3/ subdirectory. With Django, you use a command-line tool to set up a new web application project. This tool expects lib/django_1_3/ to be in the Python library load path, so it can load modules from the django package.
One way to set this up is to add it to the PYTHONPATH environment variable on your platform. For example, on the Mac OS X or Linux command line, using a bash-compatible shell, run this command to change the environment variable for the current session to load Django 1.3 from the Python SDK located at ~/google_appengine/:
export PYTHONPATH=$PYTHONPATH:~/google_appengine/lib/django_1_3
The commands that follow will assume the SDK is in ~/google_appengine/, and this PYTHONPATH is set.
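Entries in PYTHONPATH end up in Python's module search path, sys.path. As a rough illustration, a one-off script could add the same directory itself. The helper name ensure_on_path is hypothetical, and the SDK location is the one assumed in the text:

```python
import os
import sys

# Assumption: the SDK is unpacked at ~/google_appengine/, as in the text.
DJANGO_DIR = os.path.expanduser('~/google_appengine/lib/django_1_3')


def ensure_on_path(directory, path=None):
    """Append directory to a module search path if it is not already there."""
    path = sys.path if path is None else path
    if directory not in path:
        path.append(directory)
    return path


# Makes "import django" look in the SDK's bundled copy (if it exists there).
ensure_on_path(DJANGO_DIR)
```

Unlike the export command, this only affects the Python process that runs it, and it places the directory at the very end of the search path.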
The Django library is available in the runtime environment by using a libraries: directive in app.yaml, just like other libraries. We’ll see an example of this in a moment.
TIP
Django 1.3 is the most recent version included with the Python runtime environment as of App Engine version 1.7.0. Later versions of Django are likely to be added to the runtime environment in future releases.
Instructions should be similar for later versions, although it isn’t clear whether all future versions of Django will be added to the Python SDK. All previously included versions must remain in the SDK for compatibility, and the SDK might get a little large if it bundled every version. You may need to install Django on your local computer separately from the Python SDK in future versions. Django will likely be included in the runtime environment itself, similar to other third-party Python libraries. If you install Django yourself, you do not need to adjust the PYTHONPATH, and can run the Django commands without the library path.
Check the App Engine website for updates on the inclusion of future versions of Django in the runtime environment.
Creating a Django Project
For this tutorial, we will create an App Engine application that contains a Django project. In Django’s terminology, a project is a collection of code, configuration, and static files. A project consists of one or more subcomponents called apps. The Django model encourages designing apps to be reusable, with behavior controlled by the project’s configuration. The appearance of the overall website is also typically kept separate from apps by using a project-wide template directory.
To start, create the App Engine application root directory. From the Mac OS X or Linux command line, you would typically use these commands to make the directory and set it to be the current working directory:
mkdir myapp
cd myapp
You create a new Django project by running a command called django-admin.py startproject. Run this command to create a project named myproject in a subdirectory called myproject/:
python ~/google_appengine/lib/django_1_3/django/bin/django-admin.py \
    startproject myproject
This command creates the myproject/ subdirectory with several starter files:
__init__.py
A file that tells Python that code files in this directory can be imported as modules (this directory is a Python package).
manage.py
A command-line utility you will use to build and manage this project, with many features.
settings.py
Configuration for this project, in the form of a Python source file.
urls.py
Mappings of URL paths to Python code, as a Python source file.
The django-admin.py tool has many features, but most of them are specific to managing SQL databases. This is the last time we’ll use it here.
If you’re following along with a Django tutorial or book, the next step is usually to start the Django development server by using the manage.py command. If you did so now, you would be running the Django server, but it would know nothing of App Engine. We want to run this application in the App Engine development server. To do that, we need a couple of additional pieces.
Hooking It Up to App Engine
To connect our Django project to App Engine, we need a short script that instantiates the Django WSGI adapter, and an app.yaml configuration file that maps all (nonstatic) URLs to the Django project.
Create a file named main.py in the application root directory with the following contents:
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
The first two lines tell Django where to find the project’s settings module, which in this case is at the module path myproject.settings (the myproject/settings.py file). This must be set before importing any Django modules. The remaining two lines import the WSGI adapter, instantiate it, and store it in a global variable.
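It may help to see what kind of object the runtime expects in that global variable. A WSGI application is any callable that accepts a request environment and a start_response function, and returns the response body as an iterable of byte strings. The following is a minimal standalone sketch using only the standard library. It is not Django's handler, and the call_app() helper is a hypothetical name used here to invoke the application without a server:

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI application: one callable that handles every request."""
    body = b'Hello from a WSGI app'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


def call_app(app, path='/'):
    """Invoke a WSGI app once with a synthetic request, without a server."""
    environ = {'PATH_INFO': path}
    setup_testing_defaults(environ)  # fills in the other required WSGI keys
    captured = {}

    def start_response(status, headers):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], body


status, body = call_app(application, '/books/')
```

Django's WSGIHandler instance plays the role of application here: it receives every request and routes it within the Django project.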
Next, create app.yaml in the application root directory, like so:
application: myapp
version: 1
runtime: python27
api_version: 1
threadsafe: yes
handlers:
- url: .*
  script: main.application
libraries:
- name: django
  version: "1.3"
This should be familiar by now, but to review, this tells App Engine this is an application with ID myapp and version ID 1 running in the Python 2.7 runtime environment, with multithreading enabled. All URLs are routed to the Django project we just created, via the WSGI adapter instantiated in main.py. The libraries: declaration selects Django 1.3 as the version to use when importing django modules.
Our directory structure so far looks like this:
myapp/
  app.yaml
  main.py
  myproject/
    __init__.py
    manage.py
    settings.py
    urls.py
We can now start this application in a development server. You can add this to the Python Launcher (File menu, Add Existing Application) and then start the server, or just start the development server from the command line, using the current working directory as the application root directory:
dev_appserver.py .
Load the development server URL (such as http://localhost:8080/) in a browser, and enjoy the welcome screen (Figure 18-1).
Figure 18-1. The Django welcome screen
Creating a Django App
The next step is to add a Django “app” to the Django project. You create new Django apps by using the project’s manage.py script, which was generated when you created the project. (You can also use django-admin.py to create apps.)
With the current working directory still set to the application root, create a new app named bookstore for this project:
python myproject/manage.py startapp bookstore
This creates the subdirectory myproject/bookstore, with four new files:
__init__.py
A file that tells Python that code files in this directory can be imported as modules (this directory is a Python package).
models.py
A source file for data models common to this app.
tests.py
A starter file illustrating how to set up automated tests in Django.
views.py
A source file for Django views (request handlers).
This layout is Django’s way of encouraging a design philosophy that separates data models and request handlers into separate, testable, reusable units of code. This philosophy comes naturally when using the App Engine datastore and ext.db. Django does not depend on this file layout directly, and you can change it as needed.
In addition to creating the app, it’s useful to tell Django explicitly that this app will be used by this project. Edit settings.py in the project root directory, and find the INSTALLED_APPS value. Set it to a tuple containing the Python path to the app’s package:
INSTALLED_APPS = (
    'myproject.bookstore',
)
(Be sure to include the trailing comma, which tells Python this is a one-element tuple.)
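The effect of the trailing comma is easy to check in plain Python. In this short illustration, the variable names are chosen for the example only:

```python
# With the trailing comma, the parentheses build a one-element tuple.
with_comma = ('myproject.bookstore',)

# Without it, the parentheses are just grouping, and the value is a string.
without_comma = ('myproject.bookstore')

is_tuple = isinstance(with_comma, tuple)    # True
is_string = isinstance(without_comma, str)  # True
```

Code that loops over the setting would visit one app name in the first case, but the individual characters of the string in the second.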
The settings.py file is a Python source file. It contains settings for the Django framework and related components, in the form of variables. The INSTALLED_APPS setting lists all apps in active use by the project, which enables some automatic features such as template loading. (It’s primarily used by Django’s data modeling tools, which we won’t be using, in favor of App Engine’s ext.db.)
Let’s define our first custom response. Edit bookstore/views.py, and replace its contents with the following:
from django import http

def home(request):
    return http.HttpResponse('Welcome to the Book Store!')
In Django, a view is a function that is called with a django.http.HttpRequest object and returns a django.http.HttpResponse object. This view creates a simple response with some text. As you would expect, the HttpRequest provides access to request parameters and headers. You can set headers and other aspects of the HttpResponse. Django provides several useful ways to make HttpResponse objects, as well as specialized response classes for redirects and other return codes.
We still need to connect this view to a URL. URLs for views are set by the project, in the urls.py file in the project directory. This allows the project to control the URLs for all of the apps it uses. You can organize URLs such that each app specifies its own set of mappings of URL subpaths to views, and the project provides the path prefix for each app. For now, we’ll keep it simple, and just refer to the app’s view directly in the project’s URL configuration.
Edit urls.py to contain the following:
from django.conf.urls.defaults import patterns, url

urlpatterns = patterns('myproject',
    url(r'^books/$', 'bookstore.views.home'),
)
Like settings.py, urls.py is a Python module. It defines a global variable named urlpatterns whose value is returned by the patterns() function from the django.conf.urls.defaults module. Its first argument is a Python path prefix that applies to all subsequent view paths; here, we set it to 'myproject', which is the first part of the Python path to every app in this project. Subsequent arguments are URL patterns, each returned by the url() function. In this case, we associate a regular expression matching a URL to the Python path for the view function we added.
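To make the role of the prefix concrete, here is a rough sketch of how a prefix and a dotted view path could be combined and turned into a callable. This is not Django's actual resolver, and resolve_view() is a hypothetical helper. The example resolves a standard library function so that it runs without Django installed:

```python
import importlib
import os.path


def resolve_view(prefix, view_path):
    """Join prefix and view path, import the module, and fetch the attribute."""
    dotted = prefix + '.' + view_path if prefix else view_path
    # Everything before the last dot is the module, the rest is the function.
    module_name, _, attr = dotted.rpartition('.')
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# By analogy: 'myproject' + 'bookstore.views.home' names the function home
# in the module myproject.bookstore.views.
join = resolve_view('os', 'path.join')
```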
With the development server still running, load the /books/ URL in your browser. The app calls the view to display some text.
Using Django Templates
Django includes a templating system for building web pages and other displayable text. The Jinja2 templating library we’ve used throughout the book so far is based on the Django template system. Their syntax is mostly similar, but you’ll notice minor differences between the two systems. As just one example, Jinja2 lets you call methods on template values with arguments, and so requires that you use parentheses after the method even when not using arguments: {{ someval.method() }}. In Django, templates can call methods of values but cannot pass arguments, and so the parentheses are omitted: {{ someval.method }}.
Django templates are baked into the Django framework, so they’re easy to use. It’s possible to use Jinja2 templates with a Django application, but Jinja2 does not automatically support some of the organizational features of Django.
Let’s update the example to use a Django template. First, we need to set up a template directory. In the project directory, create a directory named templates, and a subdirectory in there named bookstore:
mkdir -p templates/bookstore
(The -p option to mkdir tells it to create the entire path of subdirectories, if any subdirectory does not exist.)
Edit settings.py. We’re going to make two changes: adding an import statement at the top, and replacing the TEMPLATE_DIRS value. It should look something like this, with the rest of the file not shown:
import os

# ...

TEMPLATE_DIRS = (
    os.path.join(os.path.dirname(__file__), 'templates'),
)

# ...
The TEMPLATE_DIRS setting is a list (actually a tuple) of directory paths to check when a particular template file is requested. These paths must be absolute paths in the operating system, so we use os.path.dirname(__file__) to get the full path to the project directory (the directory containing the settings.py file), then os.path.join(...) to refer to the templates/ subdirectory.
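The path arithmetic can be tried on its own. In this sketch, template_dir_for() is a hypothetical helper, and the settings file location is made up for the example:

```python
import os


def template_dir_for(settings_path):
    """Return the templates/ directory that sits next to a settings file."""
    project_dir = os.path.dirname(settings_path)
    return os.path.join(project_dir, 'templates')


# A made-up absolute location for settings.py, built portably.
settings_path = os.path.join(os.sep, 'srv', 'myapp', 'myproject', 'settings.py')
template_dir = template_dir_for(settings_path)
```

In settings.py itself, __file__ takes the place of settings_path, so the result follows the project wherever it is installed.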
Inside the templates/bookstore/ subdirectory, create the file index.html with the following template text:
<html>
  <body>
    <p>Welcome to The Book Store! {{ clock }}</p>
  </body>
</html>
Finally, edit views.py to look like this:
from django.shortcuts import render_to_response
import datetime

def home(request):
    return render_to_response(
        'bookstore/index.html',
        { 'clock': datetime.datetime.now() },
    )
Reload the page to see the template displayed by the new view.
The render_to_response() shortcut function takes as its first argument the path to the template file. This path is relative to one of the directories on the TEMPLATE_DIRS template lookup path. The second argument is a Python mapping that defines variables to be used within the template. In this example, we set a template variable named clock to be the current datetime.datetime. Within the template, {{ clock }} interpolates this value as a string.
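As a mental model only, variable interpolation can be imitated with a regular expression substitution. This toy render() function is a stand-in written for this illustration, not Django's template engine: it handles simple names only, with no dotted lookups, filters, tags, escaping, or locale-aware formatting.

```python
import re

# Matches {{ name }} with optional whitespace around a simple variable name.
VARIABLE = re.compile(r'\{\{\s*(\w+)\s*\}\}')


def render(template_text, context):
    """Replace each {{ name }} with str(context[name]); unknown names become ''."""
    def substitute(match):
        return str(context.get(match.group(1), ''))
    return VARIABLE.sub(substitute, template_text)


page = render('<p>Welcome to The Book Store! {{ clock }}</p>',
              {'clock': '12:00'})
```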
The behavior of the template engine can be extended in many ways. You can define custom tags and filters to use within templates. You can also change how templates are loaded, with template loader classes and the TEMPLATE_LOADERS setting variable. The filesystem-based behavior we’re using now is provided by the django.template.loaders.filesystem.Loader class, which appears in the default settings file created by Django.
Using Django Forms
Django includes a powerful feature for building web forms based on data model definitions. The Django forms library can generate HTML for forms, validate that submitted data meets the requirements of the model, and redisplay the form to the user with error messages. The default appearance is useful, and you can customize the appearance extensively.
Django’s data modeling library is similar to (and was the inspiration for) ext.db, but it was designed with SQL databases in mind. Django does not (yet) include its own adapter layer for using its modeling library with the App Engine datastore. (As we’ll see later in this chapter, this is a primary goal of the django-nonrel project.) When using the Django library directly as we’re doing now, we can’t use its data modeling library, nor can we use any Django features that rely on it.
But there’s good news for web forms: App Engine includes an adapter library so you can use the Django forms library with ext.db data models! This library is provided by the google.appengine.ext.db.djangoforms module.
We won’t go into the details of how Django forms work—see the Django documentation for a complete explanation—but let’s walk through a quick example to see how the pieces fit together. Our example will use the following behavior for creating and editing Book entities:
• An HTTP GET request to /books/book/ displays an empty form for creating a new Book.
• An HTTP POST request to /books/book/ processes the book creation form, and either creates the book and redirects to /books/ (the book listing page) or redisplays the form with errors, if any.
• An HTTP GET request to /books/book/1234 displays the form to edit the Book entity, with the fields filled out with the current values.
• An HTTP POST request to /books/book/1234 updates the book with that ID, with the same error-handling behavior as the book creation form.
Edit urls.py to use a new view function named book_form() to handle these URLs:
from django.conf.urls.defaults import patterns, url

urlpatterns = patterns('myproject',
    url(r'^books/book/(\d*)', 'bookstore.views.book_form'),
    url(r'^books/', 'bookstore.views.home'),
)
The regular expression '^books/book/(\d*)' captures the book ID in the URL, if any, and passes it to the view function as an argument.
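The matching behavior can be explored with the standard re module. The sketch below is a simplified model of URL dispatch, not Django's implementation, and the route() helper and view names as strings are hypothetical. It tries the two patterns in order against the request path with its leading slash removed, and returns the first match:

```python
import re

# The same regular expressions as in urls.py, in the same order.
PATTERNS = [
    (re.compile(r'^books/book/(\d*)'), 'book_form'),
    (re.compile(r'^books/'), 'home'),
]


def route(path):
    """Return (view_name, captured_groups) for the first matching pattern."""
    for regex, view_name in PATTERNS:
        match = regex.search(path)
        if match:
            return view_name, match.groups()
    return None, ()


edit = route('books/book/1234')   # the ID is captured as the string '1234'
create = route('books/book/')     # \d* matches nothing, so '' is captured
listing = route('books/')         # falls through to the second pattern
```

Two details are worth noticing. Order matters, because the second pattern would also match the book form URLs if it were listed first. And when no ID is present, the captured value is an empty string, which a test such as if book_id: treats as false.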
Edit models.py in the bookstore/ app directory, and replace its contents with the following ext.db model definitions:
from google.appengine.ext import db

class Book(db.Model):
    title = db.StringProperty()
    author = db.StringProperty()
    copyright_year = db.IntegerProperty()
    author_birthdate = db.DateProperty()

class BookReview(db.Model):
    book = db.ReferenceProperty(Book, collection_name='reviews')
    review_author = db.UserProperty()
    review_text = db.TextProperty()
    rating = db.StringProperty(choices=['Poor', 'OK', 'Good', 'Very Good', 'Great'],
                               default='Great')
    create_date = db.DateTimeProperty(auto_now_add=True)
No surprises here; these are just ext.db models for datastore entities of the kinds Book and BookReview.
Now for the views. Edit views.py, and replace its contents with the following:
from django import template
from django.http import HttpResponseRedirect
from django.shortcuts import render_to_response

from google.appengine.ext import db
from google.appengine.ext.db import djangoforms

from bookstore import models

def home(request):
    q = models.Book.all().order('title')
    return render_to_response('bookstore/index.html',
                              { 'books': q })

class BookForm(djangoforms.ModelForm):
    class Meta:
        model = models.Book

def book_form(request, book_id=None):
    if request.method == 'POST':
        # The form was submitted.
        if book_id:
            # Fetch the existing Book and update it from the form.
            book = models.Book.get_by_id(int(book_id))
            form = BookForm(request.POST, instance=book)
        else:
            # Create a new Book based on the form.
            form = BookForm(request.POST)

        if form.is_valid():
            book = form.save(commit=False)
            book.put()
            return HttpResponseRedirect('/books/')
        # else fall through to redisplay the form with error messages

    else:
        # The user wants to see the form.
        if book_id:
            # Show the form to edit an existing Book.
            book = models.Book.get_by_id(int(book_id))
            form = BookForm(instance=book)
        else:
            # Show the form to create a new Book.
            form = BookForm()

    return render_to_response('bookstore/bookform.html', {
        'book_id': book_id,
        'form': form,
    }, template.RequestContext(request))
We’ve updated the home() view to set up a query for Book entities, and pass that query object to the template. Edit templates/bookstore/index.html to display this information:
<html>
  <body>
    <p>Welcome to The Book Store!</p>
    <p>Books in our catalog:</p>
    <ul>
    {% for book in books %}
      <li>{{ book.title }}, by {{ book.author }} ({{ book.copyright_year }})
        [<a href="/books/book/{{ book.key.id }}">edit</a>]</li>
    {% endfor %}
    </ul>
    <p>[<a href="/books/book/">add a book</a>]</p>
  </body>
</html>
Finally, create the template for the form used by the new book_form() view, named templates/bookstore/bookform.html:
<html>
  <body>
    {% if book_id %}
      <p>Edit book {{ book_id }}:</p>
      <form action="/books/book/{{ book_id }}" method="POST">
    {% else %}
      <p>Create book:</p>
      <form action="/books/book/" method="POST">
    {% endif %}
      {% csrf_token %}
      {{ form.as_p }}
      <input type="submit" />
    </form>
  </body>
</html>
The BookForm class is a subclass of google.appengine.ext.db.djangoforms.ModelForm. It can examine an App Engine model class (a subclass of db.Model) and can render and process forms with fields based on the model’s property declarations. It knows which model class to use from the Meta inner class, whose model class attribute is set to our Book class. The ModelForm has useful default rendering and processing behavior for each of the default property declaration types, and you can customize this extensively. For now, we’ll use the defaults.
The book_form() view function takes the HTTP request object and the book_id captured by the regular expression in urls.py as arguments. If the request method is 'POST', then it processes the submitted form; otherwise it assumes the method is 'GET' and just displays the form. In either case, the form is represented by an instance of the BookForm class.
If constructed without arguments, the BookForm represents an empty form for creating a new Book entity. If constructed with the instance argument set to a Book object, the form’s fields are prepopulated with the object’s property values.
To process a submitted form, you pass the dictionary of POST parameters (request.POST) to the BookForm constructor as its first positional argument. If you also provide the instance argument, the instance sets the initial values—including the entity key—and the form data overwrites everything else, as provided.
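One way to picture this precedence is as a dictionary merge, with the existing entity's values as the base and the submitted fields laid on top. The sketch below uses plain dictionaries as stand-ins for the entity and for request.POST. The values_for_save() helper is hypothetical and says nothing about how the djangoforms library is implemented internally:

```python
def values_for_save(instance_values, form_data):
    """Start from the existing values, then let submitted fields win."""
    merged = dict(instance_values)  # copy, so the original is left untouched
    merged.update(form_data)        # submitted fields overwrite existing ones
    return merged


existing = {'key': 1234, 'title': 'Old Title', 'author': 'A. Writer'}
submitted = {'title': 'New Title'}
result = values_for_save(existing, submitted)
```

The key and the author survive from the existing entity, while the title takes the submitted value.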
The BookForm object knows how to render the form based on the model class and the provided model instance (if any). It also knows how to validate data submitted by the user, and render the form with the user’s input and any appropriate error messages included. The is_valid() method tells you if the submitted data is acceptable for saving to the datastore. If it isn’t, you send the BookForm to the template just as you would when displaying the form for the first time.
If the data submitted by the user is valid, the BookForm knows how to produce the final entity object. The save() method saves the entity and returns it; if you set the commit=False argument, it just returns the entity and does not save it, so you can make further changes and save it yourself. In this example, a successful create or update redirects the user to /books/ (which we’ve hardcoded in the view for simplicity) instead of rendering a template.
To display the form, we simply pass the BookForm object to a template. There are several methods on the object for rendering it in different ways; we’ll use the as_p() method to display the form fields in <p> elements. The template is responsible for outputting the <form> tag and the Submit button. The BookForm does the rest.
Restart the development server, then load the book list URL (/books/) in the browser. Click “add a book” to show the book creation form. Enter some data for the book, then submit it to create it.
NOTE
The default form widget for a date field is just a text field, and it’s finicky about the format. In this case, the “author birthdate” field expects input in the form YYYY-MM-DD, such as 1902-02-27.
Continue the test by clicking the “edit” link next to one of the books listed. The form displays with that book’s data. Edit some of the data and then submit the form to update the entity.
Also try entering invalid data for a field, such as nonnumeric data for the “copyright year” field, or a date that doesn’t use the expected format. Notice that the form redisplays with your original input, and with error messages.
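The kind of checking the form performs on these two fields can be approximated in a few lines of plain Python. This is a simplified sketch, not the forms library's validation code: clean_book_fields() and its error messages are hypothetical, and unlike the real form it treats an empty value as an error.

```python
import datetime


def clean_book_fields(raw):
    """Convert submitted strings to typed values, collecting errors by field."""
    cleaned, errors = {}, {}

    try:
        cleaned['copyright_year'] = int(raw.get('copyright_year', ''))
    except ValueError:
        errors['copyright_year'] = 'Enter a whole number.'

    try:
        # The date must use the YYYY-MM-DD format described in the note.
        cleaned['author_birthdate'] = datetime.datetime.strptime(
            raw.get('author_birthdate', ''), '%Y-%m-%d').date()
    except ValueError:
        errors['author_birthdate'] = 'Enter a valid date.'

    return cleaned, errors


good = clean_book_fields({'copyright_year': '1937',
                          'author_birthdate': '1902-02-27'})
bad = clean_book_fields({'copyright_year': 'nineteen',
                         'author_birthdate': '02/27/1902'})
```

The first call produces typed values and no errors. The second produces an error for each field, which is the information a form needs to redisplay itself with messages.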
The main thing to notice about this example is that the data model class itself (in this case Book) completely describes the default form, including the display of its form fields and the validation logic. The default field names are based on the names of the properties. You can change a field’s name by specifying a verbose_name argument to the property declaration on the model class:
class Book(db.Model):
    title = db.StringProperty(verbose_name="Book title")
    # ...
See the Django documentation for more information about customizing the display and handling of forms, and other best practices regarding form handling.
CROSS-SITE REQUEST FORGERY
Cross-site request forgery (CSRF) is a class of security issues with web forms where the attacker lures a victim into submitting a web form whose action is your web application, but the form is under the control of the attacker. The malicious form may intercept the victim’s form values, or inject some of its own, and cause the form to be submitted to your app on behalf of the victim.
Django has a built-in feature for protecting against CSRF attacks, and it is enabled by default. The protection works by generating a token that is added to forms displayed by your app, and is submitted with the user’s form fields. If the form is submitted without a valid token, Django rejects the request before it reaches the view code. The token is a digital signature, and is difficult to forge.
This requires the cooperation of our example code in two places. The templates/bookstore/bookform.html template includes the {% csrf_token %} template tag somewhere inside the <form> element. Also, the render_to_response() function needs to pass the request to the template when rendering the form, with a third argument, template.RequestContext(request).
The blocking magic happens in a component known as middleware. This architectural feature of Django lets you compose behaviors that act on some or all requests and responses, independently of an app’s views. The MIDDLEWARE_CLASSES setting in settings.py activates middleware, and django.middleware.csrf.CsrfViewMiddleware is enabled by default. If you have a view that accepts POST requests and doesn’t need CSRF protection (such as a web service endpoint), you can give the view the @csrf_exempt decorator, from the django.views.decorators.csrf module.
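The general shape of this arrangement can be sketched without Django. In the toy code below, every name is hypothetical, including the local csrf_exempt() function, which only imitates the idea of the real decorator. A wrapper checks the token on POST requests before the view runs, and skips the check for views that have been marked exempt. Views are simplified to functions that take only the request method:

```python
import binascii
import hmac
import os


def new_token():
    """Return a random, hard-to-guess token as a hex string."""
    return binascii.hexlify(os.urandom(16)).decode('ascii')


def csrf_exempt(view):
    """Mark a view so the wrapper below skips the token check for it."""
    view.csrf_exempt = True
    return view


def protect(view):
    """Wrap a view so POSTs without a matching token never reach it."""
    def wrapped(method, expected_token, submitted_token):
        exempt = getattr(view, 'csrf_exempt', False)
        if method == 'POST' and not exempt:
            submitted = (submitted_token or '').encode('utf-8')
            expected = expected_token.encode('utf-8')
            # compare_digest avoids leaking information through timing.
            if not hmac.compare_digest(expected, submitted):
                return '403 Forbidden'
        return view(method)
    return wrapped


def book_form(method):
    return '200 OK'


@csrf_exempt
def api_endpoint(method):
    return '200 OK'


token = new_token()
guarded_form = protect(book_form)
guarded_api = protect(api_endpoint)
```

Here guarded_form('POST', token, token) reaches the view, while a POST with a missing or wrong token is rejected before the view runs. The exempt endpoint accepts POST requests regardless of the token.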
This feature of Django illustrates the power of a full-stack web application framework. Not only is it possible to implement a security feature like CSRF protection across an entire site with a single component, but this feature can be provided by a library of such components. (You could argue that this is a poor example, because it imposes requirements on views and templates that render forms. But the feature is useful enough to be worth it.)
See Django’s CSRF documentation for more information.
The django-nonrel Project
What we’ve seen so far are just some of the features of the Django framework that are fully functional on App Engine. Django’s component composition, URL mapping, templating, and middleware let you build, organize, and collaborate on large, complex apps. With a little help from an App Engine library, you can use App Engine datastore data models with Django’s sophisticated form rendering features. And we’ve only scratched the surface.
The Achilles’ heel of this setup is the datastore. Django was built first and foremost on top of a data management library. Its earliest claim to fame was its web-based data administration panel, from which you could manage a website like a content management system, or debug database issues without a SQL command line. Any Django component that needs to persist global data uses the data library. Just one example is Django’s sessions feature, an important aspect of most modern websites.
This data library is built to work with any of several possible database backends, but as of Django 1.3, the App Engine datastore is not yet one of them. The django-nonrel project was founded to develop backends for the Django data library that work with several popular nonrelational datastores, including App Engine. The sister project djangoappengine also implements backend adapters for several other App Engine services, so Django library components can do things like send email via the email service. djangoappengine also includes improved support for features of the Django app management tools.
django-nonrel is a fork of Django 1.3. To use it, you download and unpack several components, which together make up the entirety of Django 1.3, with the necessary modifications. These files become your application directory. When you upload your application, all of these files go with it; it ignores the version of Django included in the runtime environment.
As of this writing, management of the django-nonrel project is in transition. Even in its current form, it’s a useful addition to your toolkit. To download django-nonrel for App Engine, and to read more about its features and limitations, visit the djangoappengine project website:
http://www.allbuttonspressed.com/projects/djangoappengine