Datastore Entities - Programming Google App Engine with Python (2015)

Chapter 6. Datastore Entities

Most scalable web applications use separate systems for handling web requests and for storing data. The request handling system routes each request to one of many servers, and the server handles the request without knowledge of other requests going to other servers. Each request handler behaves as if it is stateless, acting solely on the content of the request to produce the response. But most web applications need to maintain state, whether it’s remembering that a customer ordered a product, or just remembering that the user who made the current request is the same user who made an earlier request handled by another server. For this, request handlers must interact with a central database to fetch and update the latest information about the state of the application.

Just as the request handling system distributes web requests across many machines for scaling and robustness, so does the database. But unlike the request handlers, databases are by definition stateful, and this poses a variety of questions. Which server remembers which piece of data? How does the system route a data query to the server or servers that can answer the query? When a client updates data, how long does it take for all servers that know that data to get the latest version, and what does the system return for queries about that data in the meantime? What happens when two clients try to update the same data at the same time? What happens when a server goes down?

Google Cloud Platform offers several data storage services, and each service answers these questions differently. The most important service for scalable applications is Google Cloud Datastore, or as it is known to App Engine veterans, simply “the datastore.” When App Engine was first launched in 2008, it included the datastore as its primary means of scalable data storage. The datastore has since gone through major revisions, and is now a prominent service in the Cloud Platform suite, accessible from App Engine via the original API or from Compute Engine or elsewhere via a REST API.

As with App Engine’s request handling, Cloud Datastore manages the scaling and maintenance of data storage automatically. Your application interacts with an abstract model that hides the details of managing and growing a pool of data servers. This model and the service behind it provide answers to the questions of scalable data storage specifically designed for web applications.

Cloud Datastore’s abstraction for data is easy to understand, but it is not obvious how to best take advantage of its features. In particular, it is surprisingly different from the kind of database with which most of us are most familiar, the relational database (such as the one provided by Google Cloud SQL). It’s different enough that we call it a “datastore” instead of a “database.” (We’re mincing words, but the distinction is important.)

Cloud Datastore is a robust, scalable data storage solution. Your app’s data is stored in several locations by using a best-of-breed consensus protocol (similar to the “Paxos” protocol), making your app’s access to this data resilient to most service failures and all planned downtime. When we discuss queries and transactions, we’ll see how this affects how data is updated. For now, just know that it’s a good thing.

We dedicate the next several chapters to this important subject.1

Entities, Keys, and Properties

Cloud Datastore is best understood as an object database. An object in the datastore is known as an entity.

An entity has a key that uniquely identifies the object across the entire system. If you have a key, you can fetch the entity for the key quickly. Keys can be stored as data in entities, such as to create a reference from one entity to another. A key has several parts, some of which we’ll discuss here and some of which we’ll cover later.

One part of the key is the project ID, which ensures that nothing else about the key can collide with the entities of any other project. It also ensures that no other app can access your app’s data, and that your app cannot access data for other apps. This feature of keys is automatic, and doesn’t appear in the API (or in any examples shown here).

An important part of the key is the kind. An entity’s kind categorizes the entity for the purposes of queries, and for ensuring the uniqueness of the rest of the key. For example, a shopping cart application might represent each customer order with an entity of the kind “Order.” The application specifies the kind when it creates the entity.

The key also contains an entity ID. This can be an arbitrary string specified by the app, or it can be an integer generated automatically by the datastore.2 An entity has either a string ID or a numeric ID, but not both.

System-assigned numeric IDs are generally increasing, although they are not guaranteed to be monotonically increasing. If you want a strictly increasing ID, you must maintain this yourself in a transaction. (For more information on transactions, see Chapter 8.) If you purposefully do not want an increasing ID, such as to avoid exposing data sizes to users, you can either generate your own string ID, or allow the system to generate a numeric ID, then encrypt it before exposing it to users.

Consider a simple example where we store information about books in an online book catalog. We might represent each book with an entity in the datastore. The key for such an entity might use a kind of Book, and a system-assigned numeric ID, like so:

Book, 13579

Alternatively, we could use an externally defined identifier for each book, such as the ISBN, stored as a string ID on the key:

Book, "978-0-24680-321-0"

Once an entity has been created, its key cannot be changed. This applies to all parts of its key, including the kind and the ID.
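
A key’s shape can be sketched in plain Python. This is a toy model for illustration, not the datastore API: ToyKey and its fields are hypothetical names, and the project ID and ancestor parts of a real key are left out.

```python
from collections import namedtuple

# Hypothetical stand-in for a datastore key: a kind plus one ID.
# A namedtuple is immutable, mirroring the rule that a key cannot change.
ToyKey = namedtuple('ToyKey', ['kind', 'id'])

numeric_key = ToyKey('Book', 13579)                # system-assigned style ID
string_key = ToyKey('Book', '978-0-24680-321-0')   # app-assigned string ID

# The same ID under a different kind is a different key.
other_key = ToyKey('Order', 13579)

# Attempting to change part of an existing key fails.
try:
    numeric_key.id = 24680
    changed = True
except AttributeError:
    changed = False
```

Because the kind is part of the key, the ID 13579 can be reused under another kind without colliding with the Book entity.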

The data for the entity is stored in one or more properties. Each property has a name and at least one value. Each value is of one of several supported data types, such as a string, an integer, a date-time, or a null value. We’ll look at property value types in detail later in this chapter.

A property can have multiple values, and each value can be of a different type. As you will see in “Multivalued Properties”, multivalued properties have unusual behavior, but are quite useful for modeling some kinds of data, and surprisingly efficient.

NOTE

It’s tempting to compare these concepts with similar concepts in relational databases: kinds are tables, entities are rows, and properties are fields or columns. That’s a useful comparison, but watch out for differences.

Unlike a table in a relational database, there is no relationship between an entity’s kind and its properties. Two entities of the same kind can have different properties set or not set, and can each have a property of the same name but with values of different types. You can (and often will) enforce a data schema in your own code, and App Engine includes libraries to make this easy, but this is not required by the datastore.

Also, unlike relational databases, keys are not properties. You can perform queries on IDs just like properties, but you cannot change a string ID after the entity has been created.

A relational database cannot store multiple values in a single cell, while an App Engine property can have multiple values.
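
The lack of a schema at the datastore level can be illustrated with a toy model in plain Python. Nothing below is the App Engine API: the entities list and the property_types() helper are hypothetical, and a dict stands in for an entity’s properties.

```python
# Toy model: each entity is a (kind, properties) pair.
entities = [
    ('Book', {'title': 'The Grapes of Wrath', 'copyright_year': 1939}),
    # Same kind, a different set of properties, and a different value type
    # for copyright_year. Nothing at the storage level objects to this.
    ('Book', {'title': 'East of Eden', 'copyright_year': '1952',
              'tags': ['fiction', 'california']}),  # one property, two values
]


def property_types(kind, name):
    """Collect the Python type names used for one property across a kind."""
    return sorted({type(props[name]).__name__
                   for k, props in entities
                   if k == kind and name in props})


year_types = property_types('Book', 'copyright_year')
```

Here year_types holds both int and str, which is the situation a relational column type would rule out.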

Introducing the Python Datastore API

In the Python API for the App Engine datastore, Python objects represent datastore entities. The class of the object corresponds to the entity’s kind, where the name of the class is the name of the kind. You define kinds by creating classes that extend one of the provided base classes.

Each attribute of the object corresponds with a property of the entity. To create a new entity in the datastore, you call the class constructor, set attributes on the object, then call a method to save it. To update an existing entity, you call a method that returns the object for the entity (such as via a query), modify its attributes, and then save it.

Example 6-1 defines a class named Book to represent entities of the kind Book. It creates an object of this class by calling the class constructor, and then sets several property values. Finally, it calls the put() method to save the new entity to the datastore. The entity does not exist in the datastore until it is put() for the first time.

Example 6-1. Python code to create an entity of the kind Book

from google.appengine.ext import ndb

import datetime

class Book(ndb.Expando):

    pass

# ...

obj = Book()

obj.title = 'The Grapes of Wrath'

obj.author = 'John Steinbeck'

obj.copyright_year = 1939

obj.author_birthdate = datetime.datetime(1902, 2, 27)

obj.put()

The Book class inherits from the class Expando in App Engine’s ndb package. The Expando base class says Book objects can have any of their properties assigned any value. The entity “expands” to accommodate new properties as they are assigned to attributes of the object. Python does not require that an object’s member variables be declared in a class definition, and this example takes advantage of this by using an empty class definition—the pass keyword indicates the empty definition—and assigning values to attributes of the object after it is created. The Expando base class knows to use the object’s attributes as the values of the corresponding entity’s properties.

The Expando class has a funny name because this isn’t the way the API’s designers expect us to create new classes in most cases. Instead, you’re more likely to use the Model base class with a class definition that ensures each instance conforms to a structure, so a mistake in the code doesn’t accidentally create entities with malformed properties. Here is how we might implement the Book class using Model:

class Book(ndb.Model):

    title = ndb.StringProperty()

    author = ndb.StringProperty()

    copyright_year = ndb.IntegerProperty()

    author_birthdate = ndb.DateTimeProperty()

The Model version of Book specifies a structure for Book objects that is enforced while the object is being manipulated. It ensures that values assigned to an object’s properties are of appropriate types, such as string values for title and author properties, and raises a runtime error if the app attempts to assign a value of the wrong type to a property. With Model as the base class, the object does not “expand” to accommodate other properties: an attempt to assign a value to a property not mentioned in the class definition raises a runtime error. Model and the various Property definitions also provide other features for managing the structure of your data, such as automatic values, required values, and the ability to add your own validation and serialization logic.

It’s important to notice that these validation features are provided by the Model class and your application code, not the datastore. Even if part of your app uses a Model class to ensure a property’s value meets certain conditions, another part of your app can still retrieve the entity without using the class and do whatever it likes to that value. The bad value won’t raise an error until the app tries to load the changed entity into a new instance of the Model class. This is both a feature and a burden: your app can manage entities flexibly and enforce structure where needed, but it must also be careful when those structures need to change. Data modeling and the Model class are discussed in detail in Chapter 9.
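
The division of labor described above can be sketched without the App Engine libraries. This is a toy model under stated assumptions: ToyBook, its _schema table, and raw_store are hypothetical names, and the dict stands in for a datastore that accepts whatever it is given.

```python
raw_store = {}  # hypothetical stand-in for the datastore: accepts anything


class ToyBook(object):
    """Enforces property names and types in application code only."""

    _schema = {'title': str, 'copyright_year': int}

    def __init__(self, **props):
        for name, value in props.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        if name not in self._schema:
            raise AttributeError('unknown property: %s' % name)
        if not isinstance(value, self._schema[name]):
            raise TypeError('%s must be %s'
                            % (name, self._schema[name].__name__))
        object.__setattr__(self, name, value)


book = ToyBook(title='The Grapes of Wrath', copyright_year=1939)
raw_store['Book:1'] = dict(vars(book))

# Code that bypasses the class can store a bad value without complaint.
raw_store['Book:1']['copyright_year'] = 'nineteen thirty-nine'

# The problem only surfaces when the data is loaded through the class again.
try:
    ToyBook(**raw_store['Book:1'])
    load_error = None
except TypeError as e:
    load_error = str(e)
```

The write to raw_store succeeds silently; only the later load through ToyBook reports the bad value.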

The Book constructor accepts initial values for the object’s properties as keyword arguments. The constructor code earlier could also be written like this:

obj = Book(title='The Grapes of Wrath',

           author='John Steinbeck',

           copyright_year=1939,

           author_birthdate=datetime.datetime(1902, 2, 27))

As written, this code does not set an ID for the new entity. Without an ID, the datastore generates a unique numeric ID when the object is saved for the first time. If you prefer to use an ID generated by the app, you call the constructor with the id parameter, as follows:

obj = Book(id='0143039431',

           title='The Grapes of Wrath',

           author='John Steinbeck',

           copyright_year=1939,

           author_birthdate=datetime.datetime(1902, 2, 27))

WARNING

Because the Python API uses keyword arguments, object attributes, and object methods for purposes besides entity properties, there are several property names that are off-limits. For instance, you cannot use the Python API to set a property named id, because this could get confused with the id parameter for the object constructor. Names reserved by the Python API are enforced in the API, but not in the datastore itself.

The property names reserved by ndb are:

§ id

§ key

§ namespace

§ parent

The datastore itself reserves all property names beginning and ending with two underscores (such as __internal__). This particular rule also applies to kind names.

The Python API ignores all object attributes whose names begin with a single underscore (such as _counter). You can use such attributes to attach data and functionality to an object that should not be saved as properties for the entity.

The complete key of an entity, including the ID and kind, must be unique. (We’ll discuss another part to keys that contributes to a key’s uniqueness, called ancestors, in Chapter 8.) If you build a new object with a key that is already in use, and then try to save it, the save will replace the existing object. When you don’t want to overwrite existing data, you can use a system-assigned ID in the key, or you can use a transaction to test for the existence of an entity with a given key and create it if it doesn’t exist.

The API provides a shortcut for creating entities with app-assigned string IDs. The get_or_insert() class method takes a string ID and either returns an existing entity with that ID, or creates a new entity with that ID and returns it. The method also takes initial property values to use with the newly created entity. Property value arguments are ignored if an entity with the given ID already exists. Either way, the method is guaranteed to return an object that represents an entity in the datastore:

obj = Book.get_or_insert(

    '0143039431',  # the string ID, passed as the first positional argument

    title='The Grapes of Wrath',

    author='John Steinbeck',

    copyright_year=1939,

    author_birthdate=datetime.datetime(1902, 2, 27))

# obj is a stored entity, either the previous entity with the

# key Book:'0143039431' (no changes to properties) or a new

# entity with that key and the provided property values.
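
The get-or-insert behavior can be sketched with an in-memory dict. This is an illustration only, not how the API is implemented: toy_store and toy_get_or_insert() are hypothetical names, and the sketch is single-threaded, so it skips the transaction that the text above says is needed to make the existence test safe.

```python
toy_store = {}  # maps (kind, id) to a dict of properties


def toy_get_or_insert(kind, entity_id, **props):
    """Return the stored properties for the key, creating them if absent."""
    key = (kind, entity_id)
    if key not in toy_store:
        toy_store[key] = dict(props)
    return toy_store[key]


first = toy_get_or_insert('Book', '0143039431',
                          title='The Grapes of Wrath')

# The second call finds the existing entity, so its property values
# are ignored and the stored title is unchanged.
second = toy_get_or_insert('Book', '0143039431',
                           title='Some Other Title')
```

Both calls return the same stored entity, and only the first call’s property values are kept.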

Property Values

Each value data type supported by the datastore is represented by a primitive type in the language for the runtime or a class provided by the API. The data types and their language-specific equivalents are listed in Table 6-1. In this table, ndb is the Python package google.appengine.ext.ndb, and users is google.appengine.api.users.

Data type                                            Python type
---------------------------------------------------  --------------------------------------
Unicode text string (up to 500 characters, indexed)  unicode
Long Unicode text string (not indexed)               unicode or str
Byte string (up to 500 bytes, indexed)               str
Long byte string (not indexed)                       str
Boolean                                              bool
Integer (64-bit)                                     int or long (converted to 64-bit long)
Float (double precision)                             float
Date-time                                            datetime.datetime
Null value                                           None
Entity key                                           ndb.Key
A Google account                                     users.User
A geographical point                                 ndb.GeoPt

Table 6-1. Datastore property value types and equivalent Python types

Example 6-2 demonstrates the use of several of these data types.

Example 6-2. Code to set property values of various types

import datetime

import webapp2

from google.appengine.ext import ndb

from google.appengine.api import users

class Comment(ndb.Expando):

    pass

class CommentHandler(webapp2.RequestHandler):

    def post(self):

        c = Comment()

        c.commenter = users.get_current_user()  # returns a users.User object

        c.message = self.request.get('message')

        c.date = datetime.datetime.now()

        c.put()

        # Redirect to a result page...

TIP

When you use Python’s ndb.Expando, values that are converted to native datastore types when stored come back as the datastore types when you retrieve the entity. For example, an int value is stored as a long, and so appears as a long on the retrieved object. If you use ndb.Expando in your app, it’s best to use the native datastore types, so the value types stay consistent.

The data modeling interfaces offer a way to store values in these alternative types and convert them back automatically when retrieving the entity. (For more information on data modeling, see Chapter 9.)

Strings, Text, and Bytes

The datastore has two distinct data types for storing strings of text: short strings and long strings. Short strings are indexed; that is, they can be the subject of queries, such as a search for every Person entity with a given value for a last_name property. Short string values must be less than 500 bytes in length. Long strings can be longer than 500 bytes, but are not indexed.

Text strings, short and long, are strings of characters from the Unicode character set. Internally, the datastore stores Unicode strings by using the UTF-8 encoding, which represents some characters using multiple bytes. This means that the 500-byte limit for short strings is not necessarily the same as 500 Unicode characters. The actual limit on the number of characters depends on which characters are in the string.
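
The gap between character count and byte count is easy to check in plain Python. The helper name fits_short_string_limit() is hypothetical, and treating exactly 500 bytes as acceptable is an assumption made for the sketch, not a statement about the datastore’s boundary.

```python
# ASCII text, text with one accented letter, and Japanese text.
samples = [u'hello', u'caf\u00e9', u'\u65e5\u672c\u8a9e']

for text in samples:
    encoded = text.encode('utf-8')
    # Print only the counts, so the output is safe on any terminal.
    print('%d characters -> %d bytes' % (len(text), len(encoded)))


def fits_short_string_limit(text, limit=500):
    """Report whether the UTF-8 encoding of text fits within limit bytes."""
    return len(text.encode('utf-8')) <= limit


ascii_ok = fits_short_string_limit(u'a' * 500)       # 500 bytes
cjk_ok = fits_short_string_limit(u'\u65e5' * 200)    # 200 chars, 600 bytes
```

A 200-character string of three-byte characters exceeds 500 bytes, while a 500-character ASCII string does not.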

When using ndb.Expando and no Property declarations, fields are considered indexable by default. Setting a property to a str value longer than 500 bytes or a unicode value longer than 500 characters stores the property as nonindexed automatically. For more control over this process, see Chapter 9.

Unset Versus the Null Value

One possible value of a property is the null value. In Python, the null value is represented by None.

A property with the null value is not the same as an unset property. Consider the following code:

class Entity(ndb.Expando):

    pass

# ...

a = Entity()

a.prop1 = 'abc'

a.prop2 = None

a.put()

b = Entity()

b.prop1 = 'def'

b.put()

This creates two entities of the kind Entity. Both entities have a property named prop1. The first entity has a property named prop2; the second does not.

Of course, an unset property can be set later:

b.prop2 = 123

b.put()

# b now has a property named "prop2."

Similarly, a set property can be made unset. In the API, you delete the property by deleting the attribute from the object, using the del keyword:

del b.prop2

b.put()

# b no longer has a property named "prop2."
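
The same distinction exists in an ordinary Python dict, which makes for a quick mental model. This sketch does not use the datastore API; the describe() helper is a hypothetical name, and a dict stands in for an entity’s properties.

```python
# A key that is present with the value None is "null";
# a key that is absent is "unset".
a = {'prop1': 'abc', 'prop2': None}
b = {'prop1': 'def'}


def describe(entity, name):
    """Classify a property as 'unset', 'null', or 'set'."""
    if name not in entity:
        return 'unset'
    if entity[name] is None:
        return 'null'
    return 'set'


b['prop2'] = 123                    # set a previously unset property
state_after_set = describe(b, 'prop2')

del b['prop2']                      # make it unset again
state_after_del = describe(b, 'prop2')
```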

Multivalued Properties

As we mentioned earlier, a property can have multiple values. We’ll discuss the more substantial aspects of multivalued properties when we talk about queries and data modeling. But for now, it’s worth a brief mention.

A property can have one or more values. A property cannot have zero values; a property without a value is simply unset. Each value for a property can be of a different type, and can be the null value.

The datastore preserves the order of values as they are assigned. The API returns the values in the same order as they were set.

In Python, a property with multiple values is represented as a single Python list value:

e.prop = [1, 2, 'a', None, 'b']

NOTE

Because a property must have at least one value, it is an error to assign an empty list ([] in Python) to a property on an entity whose class is derived from the ndb.Expando class:

class Entity(ndb.Expando):

    pass

# ...

e = Entity()

e.prop = [] # ERROR

To be able to represent the empty list, you must declare the property type and specify that the property is repeated. In the following example, the property prop is declared as a GenericProperty, which does not enforce a type for the values in the list:

class Entity(ndb.Expando):

    prop = ndb.GenericProperty(repeated=True)

# ...

e = Entity()

e.prop = [] # OK

The Property declaration takes care of translating between “no property set” in the datastore and the empty list value in the code. Again, we’ll see more about property declarations in Chapter 9.
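
The translation between an empty list in code and an unset property in storage can be sketched as a pair of functions. This is a guess at the idea, not the library’s implementation: to_stored(), from_stored(), and repeated_names are hypothetical names.

```python
def to_stored(props, repeated_names):
    """Drop empty lists for repeated properties before storing."""
    return {name: value for name, value in props.items()
            if not (name in repeated_names and value == [])}


def from_stored(stored, repeated_names):
    """Restore an empty list for each repeated property that is unset."""
    loaded = dict(stored)
    for name in repeated_names:
        loaded.setdefault(name, [])
    return loaded


# 'tags' is empty in code, so it is simply absent from the stored form.
stored = to_stored({'title': 'x', 'tags': []}, repeated_names=['tags'])

# On the way back, the declaration supplies the empty list again.
loaded = from_stored(stored, repeated_names=['tags'])
```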

Keys and Key Objects

The key for an entity is a value that can be retrieved, passed around, and stored like any other value. If you have the key for an entity, you can retrieve the entity from the datastore quickly, much more quickly than with a datastore query. Keys can be stored as property values, as an easy way for one entity to refer to another.

The API represents an entity key value as an instance of the Key class, in the ndb package. To get the key for an entity, you access the entity object’s key attribute.3 The Key instance provides access to its several parts by using accessor methods, including the kind, string ID (if any), and system-assigned ID (if the entity does not have a string ID).

When you construct a new entity object and do not provide a string ID, the entity object has a key, but the key does not yet have an ID. The ID is populated when the entity object is saved to the datastore for the first time. You can get the key object prior to saving the object, but it will be incomplete:

e = Entity()

e.prop = 123

k = e.key # key is incomplete, does not have an ID

kind = k.kind() # 'Entity'

e.put() # ID is assigned

k = e.key # key is complete, has ID

id = k.id() # the system-assigned ID

If the entity object was constructed with a string ID, the key is complete before the object is saved—although, if the entity has not been saved, the string ID is not guaranteed to be unique. (The entity class method get_or_insert(), mentioned earlier, always returns a saved entity, either one that was saved previously or a new one created by the call.)

If the key is incomplete, the id() method returns None. If the key is complete, id() returns either the string ID (as a str) or the numeric ID (long), whichever one it has. You can request the string ID or integer ID directly using string_id() or integer_id(), respectively. These methods return None if the key is incomplete or if the ID is not of the requested type.

Once you have a complete key, you can assign it as a property value on another entity to create a reference:

e2 = Entity()

e2.ref = k

e2.put()

If you know the kind and ID of an entity in the datastore, you can construct the key for that entity without its object. The ndb.Key() constructor can take a kind name (str) and an ID (str or long). A complete explanation of this feature involves another feature we haven’t mentioned yet (ancestor paths), but the following suffices for the examples you’ve seen so far:

e = Entity(id='alphabeta')

e.prop = 123

e.put()

# ...

k = ndb.Key('Entity', 'alphabeta')

Ancestor paths are related to how the datastore does transactions. (We’ll get to them in Chapter 8.) For the entities we have created so far, the path is just the kind followed by the ID or name.

Keys can be converted to string representations for the purposes of passing around as textual data, such as in a web form or cookie. The string representation avoids characters considered special in HTML or URLs, so it is safe to use without escaping characters. The encoding of the value to a string is simple and easily reversed, so if you expose the string value to users, be sure to encrypt it, or make sure all key parts (such as kind names) are not secret. When accepting an encoded key string from a client, always validate the key before using it.

To get the URL-safe string representation of a key, call the urlsafe() method. To reconstruct a Key from such a string, pass it to the constructor’s urlsafe named argument:

k_str = k.urlsafe()

# ...

k = ndb.Key(urlsafe=k_str)
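
To see why an encoded key should not be treated as a secret, consider a toy encoding built on URL-safe base64. This is not the format ndb uses, and toy_encode_key() and toy_decode_key() are hypothetical names; the point is only that an encoding of this style is reversible by anyone who holds the string.

```python
import base64


def toy_encode_key(kind, entity_id):
    """Encode a kind and string ID with URL-safe base64 (not the ndb format)."""
    raw = ('%s\x00%s' % (kind, entity_id)).encode('utf-8')
    return base64.urlsafe_b64encode(raw).decode('ascii')


def toy_decode_key(encoded):
    """Reverse toy_encode_key(); no secret is needed to do so."""
    raw = base64.urlsafe_b64decode(encoded.encode('ascii')).decode('utf-8')
    kind, entity_id = raw.split('\x00', 1)
    return kind, entity_id


token = toy_encode_key('Entity', 'alphabeta')
recovered = toy_decode_key(token)   # the kind and ID are fully recoverable
```

Anyone who receives the token can recover the kind name and ID, so the encoding offers convenience, not confidentiality.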

Using Entities

Let’s look briefly at how to retrieve entities from the datastore by using keys, how to inspect the contents of entities, and how to update and delete entities. The API methods for these features are straightforward.

Getting Entities Using Keys

Given a complete key for an entity, you can retrieve the entity from the datastore. Construct the ndb.Key object if you don’t already have one, then call its get() method:

from google.appengine.ext import ndb

# ...

k = ndb.Key('Entity', 'alphabeta')

e = k.get()

If you know the kind and ID of the entity you are fetching, you can also use the get_by_id() class method on the appropriate entity class:

class Entity(ndb.Expando):

    pass

# ...

e = Entity.get_by_id('alphabeta')

To fetch multiple entities in a batch, you can pass a list of Key objects to the ndb.get_multi() function. The function returns a list of entity objects in the same order as the keys, with None values for keys that do not have a corresponding entity in the datastore:

entities = ndb.get_multi([k1, k2, k3])

Getting a batch of entities in this way performs a single service call to the datastore for the entire batch. This is faster than getting each entity in a separate call. ndb knows how to batch calls to the datastore service automatically, so you only need to use ndb.get_multi() when it is convenient for your code.
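
The shape of a batch result can be sketched with a dict lookup. This is a toy model, not the API: toy_store and toy_get_multi() are hypothetical names, and tuples of (kind, ID) stand in for Key objects.

```python
toy_store = {
    ('Entity', 'alpha'): {'prop': 1},
    ('Entity', 'gamma'): {'prop': 3},
}


def toy_get_multi(keys):
    """Return one result per key, in order, with None for missing keys."""
    return [toy_store.get(key) for key in keys]


results = toy_get_multi([('Entity', 'alpha'),
                         ('Entity', 'beta'),    # not in the store
                         ('Entity', 'gamma')])

# Pair each result with its position and skip the ones that were not found.
found = [(i, entity) for i, entity in enumerate(results)
         if entity is not None]
```

Because the result list lines up with the key list, code that consumes a batch should check each element for None before using it.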

Of course, you won’t always have the keys for the entities you want to fetch from the datastore. To retrieve entities that meet other criteria, you use datastore queries. (We’ll discuss queries in Chapter 7.)

Inspecting Entity Objects

The API has several features for inspecting entities that are worth mentioning here. You’ve already seen the key attribute of an entity object, which returns the ndb.Key.

Entity properties can be accessed and modified just like object attributes:

e.prop1 = 1

e.prop2 = 'two'

self.response.write('prop2 has the value ' + e.prop2)

You can use Python built-in functions for accessing object attributes to access entity properties. For instance, to test that an entity has a property with a given name, use the hasattr() built-in:

if hasattr(e, 'prop1'):

    # ...

To get or set a property whose name is defined in a string, use getattr() and setattr(), respectively:

# Set prop1, prop2, ..., prop9.

for n in range(1, 10):

    value = n * n

    setattr(e, 'prop' + str(n), value)

value = getattr(e, 'prop' + str(7))

We’ve seen that property values can be initialized with arguments passed to the entity class’s constructor. As with all named arguments in Python, you can pass a mapping (dict) of keys and values to this constructor using the ** syntax:

props = {'name1': 'value1', 'name2': 'value2'}

e = Entity(**props)

The populate() method provides another way to set properties after the entity has been constructed. This method takes named arguments just like the constructor, and can also take a mapping using the ** syntax:

e.populate(name1='value1', name2='value2')

e.populate(**props)

You can get all of an entity’s properties as a dict by calling the to_dict() method. By default, all properties are included. You can limit the mapping to a specific set of properties by passing a list of names as the include argument. Alternatively, you can exclude specific properties (and return all others) with the exclude argument. Here is an example using the defaults:

e = Entity(name1='value1', name2='value2')

prop_dict = e.to_dict()

for name in prop_dict:

    # ... prop_dict[name] ...

Note that some Property declared types, such as JsonProperty, use mutable objects (such as dict values) to represent their values. to_dict() will return these values as direct references to those objects, not copies.
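
The consequence of receiving references instead of copies can be shown with a toy class. ToyEntity is a hypothetical stand-in whose to_dict() makes a shallow copy; it is meant to mirror the behavior described above, not to reproduce the library’s code.

```python
import copy


class ToyEntity(object):
    def __init__(self, **props):
        self._props = props

    def to_dict(self):
        # A shallow copy: the dict is new, the values inside it are shared.
        return dict(self._props)


e = ToyEntity(name='value', settings={'theme': 'dark'})

shallow = e.to_dict()
shallow['settings']['theme'] = 'light'      # also changes the entity's value

independent = copy.deepcopy(e.to_dict())
independent['settings']['theme'] = 'blue'   # the entity is unaffected

theme_on_entity = e.to_dict()['settings']['theme']
```

If the dict is going to be modified, taking a deep copy first keeps those changes away from the entity object.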

Saving Entities

Calling the put() method on an entity object saves the entity to the datastore. If the entity does not yet exist in the datastore, put() creates the entity. If the entity exists, put() updates the entity so that it matches the object:

e = Entity()

e.prop = 123

e.put()

When you update an entity, the app sends the complete contents of the entity to the datastore. The update is all or nothing: there is no way to send just the properties that have changed to the datastore. There is also no way to update a property on an entity without retrieving the complete entity, making the change, and then sending the new entity back.

You use the same API to create an entity as you do to update an entity. The datastore does not make a distinction between creates and updates. If you save an entity with a complete key (such as a key with a kind and a string ID) and an entity already exists with that key, the datastore replaces the existing entity with the new one.
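
Whole-entity replacement can be sketched with a dict. This is a toy model with hypothetical names (toy_store, toy_put()), showing only that a save under an existing key is a replacement and not a merge.

```python
toy_store = {}  # maps a (kind, id) tuple to a dict of properties


def toy_put(key, props):
    """Store a complete copy of the entity, replacing whatever was there."""
    toy_store[key] = dict(props)


key = ('Entity', 'alphabeta')
toy_put(key, {'prop1': 'abc', 'prop2': 123})   # creates the entity

# Saving under the same complete key replaces the entity outright:
# prop2 is not carried over from the earlier version.
toy_put(key, {'prop1': 'xyz'})

final_props = toy_store[key]
```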

TIP

If you want to test that an entity with a given key does not exist before you create it, you can do so using a transaction. You must use a transaction to ensure that another process doesn’t create an entity with that key after you test for it and before you create it. For more information on transactions, see Chapter 8.

If you have several entity objects to save, you can save them all in one call using the put_multi() function in the ndb package:

ndb.put_multi([e1, e2, e3])

When the call to put() returns, the datastore entity records are up-to-date, and all future fetches of these entities in the current request handler and other handlers will see the new data. The specifics of how the datastore gets updated are discussed in detail in Chapter 8.

Deleting Entities

To delete an entity, you acquire or construct its ndb.Key, then call the delete() method:

e = ndb.Key('Entity', 'alphabeta').get()

# ...

e.key.delete()

# Deleting without first fetching the entity:

k = ndb.Key('Entity', 'alphabeta')

k.delete()

As with gets and puts, you can delete multiple keys in a single batch call with ndb.delete_multi():

ndb.delete_multi([k1, k2, k3])  # takes a list of Key objects, not entities

Allocating System IDs

When you create a new entity without specifying an explicit string ID, the datastore assigns a numeric system ID to the entity. Your code can read this system ID from the entity’s key after the entity has been created.

Sometimes you want the system to assign the ID, but you need to know what ID will be assigned before the entity is created. For example, say you are creating two entities, and the property of one entity must be set to the key of the other entity. One option is to save the first entity to the datastore, then read the key of the entity, set the property on the second entity, and then save the second entity:

class Entity(ndb.Expando):

    pass

# ...

e1 = Entity()

e1.put()

e2 = Entity()

e2.reference = e1.key  # key is an attribute in ndb, not a method

e2.put()

e2.put()

This requires two separate calls to the datastore in sequence, which takes valuable clock time. It also requires a period of time where the first entity is in the datastore but the second entity isn’t.

We can’t read the key of the first entity before we save it, because it is incomplete: reading e1.key before calling e1.put() would return an unusable value. We could use a string ID instead of a system ID, giving us a complete key, but it’s often the case that we can’t easily calculate a unique string ID, which is why we’d rather have a system-assigned ID.

To solve this problem, the datastore provides a method to allocate system IDs ahead of creating entities. You call the datastore to allocate an ID (or a range of IDs for multiple entities), then create the entity with an explicit ID. Notice that this is not the same as using a string ID: you give the entity the allocated numeric ID, and it knows the ID came from the system.

To allocate one or more system IDs, call the allocate_ids() class method of the entity class. The method takes a number of IDs to allocate (size) or a maximum ID value to allocate (max):

# Allocate 1 system ID for entities of kind "Entity".

ids = Entity.allocate_ids(size=1)

e1 = Entity(id=ids[0])

e2 = Entity()

e2.reference = e1.key

ndb.put_multi([e1, e2])

The allocate_ids() method acquires unused unique IDs given the rest of the key. In simple cases, the rest of the key is just the kind name, which is derived from the class whose allocate_ids() method you are calling. For keys with ancestor paths, you must also provide the parent argument, whose value is an ndb.Key. See Chapter 8 for information on keys with ancestor paths.
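
The idea behind allocating IDs ahead of time can be sketched with a per-kind counter. This is a toy model, not the datastore’s allocator: ToyIdAllocator is a hypothetical name, and real system-assigned IDs are not guaranteed to be sequential as they are here.

```python
import itertools


class ToyIdAllocator(object):
    """Hands out numeric IDs per kind; each ID is returned at most once."""

    def __init__(self):
        self._counters = {}

    def allocate(self, kind, size=1):
        counter = self._counters.setdefault(kind, itertools.count(1))
        return [next(counter) for _ in range(size)]


allocator = ToyIdAllocator()
(first_id,) = allocator.allocate('Entity', size=1)

# Both keys are known before anything is stored, so the two entities
# can refer to each other and be written together in one batch.
e1 = {'key': ('Entity', first_id)}
e2 = {'key': ('Entity', allocator.allocate('Entity')[0]),
      'reference': e1['key']}
```

Because an allocated ID is never handed out twice, the reference in e2 is safe to set before e1 has been saved.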

WARNING

A batch put of two entities does not guarantee that both entities are saved together. If your app logic requires that either both entities are saved or neither is saved, you must use a transaction. For more information, see Chapter 8. (As you can probably tell by now, that’s an important chapter.)

The Development Server and the Datastore

The development server simulates the datastore service on your local machine while you’re testing your app. All datastore entities are saved to a local file. This file is associated with your app, and persists between runs of the development server, so your test data remains available until you delete it.

You can tell the development server to reset this data when it starts. From the command line, you pass the --clear_datastore argument to dev_appserver.py:

dev_appserver.py --clear_datastore appdir

1 In 2011–2012, App Engine transitioned from an older datastore infrastructure, known as the “master/slave” (M/S) datastore, to the current one, known as the “high replication” datastore (HR datastore, or HRD). The two architectures differ in how data is updated, but the biggest difference is that the M/S datastore requires scheduled maintenance periods during which data cannot be updated, and is prone to unexpected failures. The HR datastore stays available during scheduled maintenance, and is far more resistant to system failure. All new App Engine applications use the HR datastore, and the M/S datastore is no longer an option. I only mention it because you’ll read about it in older articles, and may see occasional announcements about maintenance of the M/S datastore. You may also see mentions of a datastore migration tool, which old apps still using the M/S datastore can use to switch to the new HR datastore. In this book, “the datastore” always refers to the HR datastore.

2 An entity ID specified by the app is sometimes known as the “key name” in older documentation, to distinguish it from the numeric ID. The newer terminology is simpler: every entity has an ID, and it’s either a string provided by the app or a number provided by the datastore.

3 If you are upgrading to ndb from the ext.db library, notice that the former key() method is now a key attribute. In general, keys are handled rather differently in ndb than in ext.db.