Google Data Store

  • June 2020
  • PDF

1. Datastore Python API Overview

The Google App Engine datastore provides robust, scalable data storage for your web application. The datastore is designed with web applications in mind, with an emphasis on read and query performance. It stores data entities with properties, organized by application-defined kinds. It can perform queries over entities of the same kind, with filters and sort orders on property values and keys. All queries are pre-indexed for fast results over very large data sets. The datastore supports transactional updates, using entity groupings defined by the application as the unit of transactionality in the distributed data network.

  • Introducing the Datastore
  • Data Modeling With Python
  • Entities and Properties
  • Queries and Indexes
  • Transactions and Entity Groups
  • Datastore Statistics
  • Quotas and Limits

Introducing the Datastore

The App Engine datastore stores and performs queries over data objects, known as entities. An entity has one or more properties, named values of one of several supported data types. A property can be a reference to another entity.

The datastore can execute multiple operations in a single transaction, and roll back the entire transaction if any of the operations fail. This is especially useful for distributed web applications, where multiple users may be accessing or manipulating the same data object at the same time.

Unlike traditional databases, the datastore uses a distributed architecture to manage scaling to very large data sets. An App Engine application can optimize how data is distributed by describing relationships between data objects, and by defining indexes for queries.

The App Engine datastore is strongly consistent, but it's not a relational database. While the datastore interface has many of the same features of traditional databases, the datastore's unique characteristics imply a different way of designing and managing data to take advantage of the ability to scale automatically.

Data Modeling With Python

Datastore entities are schemaless: Two entities of the same kind are not obligated to have the same properties, or use the same value types for the same properties. The application is responsible for ensuring that entities conform to a schema when needed. For this purpose, the Python SDK includes a rich library of data modeling features that make enforcing a schema easy.

In the Python API, a model describes a kind of entity, including the types and configuration for its properties. An application defines a model using Python classes, with class attributes describing the properties. Entities of a kind are represented by instances of the corresponding model class, with instance attributes representing the property values. An entity can be created by calling the constructor of the class, then stored by calling the put() method.

    import datetime
    from google.appengine.ext import db
    from google.appengine.api import users

    class Employee(db.Model):
        name = db.StringProperty(required=True)
        role = db.StringProperty(required=True,
                                 choices=set(["executive", "manager", "producer"]))
        hire_date = db.DateProperty()
        new_hire_training_completed = db.BooleanProperty()
        account = db.UserProperty()

    e = Employee(name="",
                 role="manager",
                 account=users.get_current_user())
    # DateProperty takes a datetime.date value, not a datetime.datetime.
    e.hire_date = datetime.datetime.now().date()
    e.put()

The datastore API provides two interfaces for queries: a query object interface, and a SQL-like query language called GQL. A query returns entities in the form of instances of the model classes that can be modified and put back into the datastore.

    training_registration_list = [users.User("[email protected]"),
                                  users.User("[email protected]"),
                                  users.User("[email protected]")]
    employees_trained = db.GqlQuery("SELECT * FROM Employee WHERE account IN :1",
                                    training_registration_list)
    for e in employees_trained:
        e.new_hire_training_completed = True
        db.put(e)

Entities and Properties

A data object in the App Engine datastore is known as an entity. An entity has one or more properties, named values of one of several data types, including integers, floating point values, strings, dates, binary data, and more.

Each entity also has a key that uniquely identifies the entity. The simplest key has a kind and a unique numeric ID provided by the datastore. The ID can also be a string provided by the application.

An application can fetch an entity from the datastore by using its key, or by performing a query that matches the entity's properties. A query can return zero or more entities, and can return the results sorted by property values. A query can also limit the number of results returned by the datastore to conserve memory and run time.

Unlike relational databases, the App Engine datastore does not require that all entities of a given kind have the same properties. The application can specify and enforce its data model using libraries included with the SDK, or its own code. A property can have one or more values. A property with multiple values can have values of mixed types. A query on a property with multiple values tests whether any of the values meets the query criteria. This makes such properties useful for testing for membership.

Queries and Indexes

An App Engine datastore query operates on every entity of a given kind (a data class). It specifies zero or more filters on entity property values and keys, and zero or more sort orders. An entity is returned as a result for a query if the entity has at least one value (possibly null) for every property mentioned in the query's filters and sort orders, and all of the filter criteria are met by the property values.

Every datastore query uses an index, a table that contains the results for the query in the desired order. An App Engine application defines its indexes in a configuration file. The development web server automatically adds suggestions to this file as it encounters queries that do not yet have indexes configured. You can tune indexes manually by editing the file before uploading the application.

As the application makes changes to datastore entities, the datastore updates the indexes with the correct results. When the application executes a query, the datastore fetches the results directly from the corresponding index.

This mechanism supports a wide range of queries and is suitable for most applications. However, it does not support some kinds of queries you may be used to from other database technologies.
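As a rough illustration of the index mechanism described above, the following plain-Python sketch keeps a sorted table of (value, key) rows up to date on every write and answers an inequality query by reading a contiguous range of that table. It does not use the App Engine SDK, and SortedIndex, update and scan_less_than are hypothetical names chosen for this example; the real datastore indexes are considerably more involved.

```python
import bisect


class SortedIndex(object):
    """Toy single-property index: a sorted list of (value, key) rows."""

    def __init__(self):
        self.rows = []        # always kept in sorted order
        self.current = {}     # key -> value currently in the index

    def update(self, key, value):
        """Called on every write so the index always holds correct results."""
        if key in self.current:
            self.rows.remove((self.current[key], key))
        bisect.insort(self.rows, (value, key))
        self.current[key] = value

    def scan_less_than(self, bound):
        """Return keys whose value is < bound, in index order."""
        end = bisect.bisect_left(self.rows, (bound,))
        return [key for _, key in self.rows[:end]]


index = SortedIndex()
index.update("pet1", 24)
index.update("pet2", 7)
index.update("pet3", 15)
index.update("pet1", 9)      # updating an entity moves its index row

light_pets = index.scan_less_than(15)
```

The query never inspects entities that cannot match: it locates the end of the matching range and returns everything before it, which is why results come back already sorted.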

Transactions and Entity Groups

With the App Engine datastore, every attempt to create, update or delete an entity happens in a transaction. A transaction ensures that every change made to the entity is saved to the datastore, or, in the case of failure, none of the changes are made. This ensures consistency of data within an entity.

You can perform multiple actions on an entity within a single transaction using the transaction API. For example, say you want to increment a counter field in an object. To do so, you need to read the value of the counter, calculate the new value, then store it. Without a transaction, it is possible for another process to increment the counter between the time you read the value and the time you update the value, causing your app to overwrite the updated value. Doing the read, calculation and write in a single transaction ensures that no other process interferes with the increment.

You can make changes to multiple entities within a single transaction. To support this, App Engine needs to know in advance which entities will be updated together, so it knows to store them in a way that supports transactions. You must declare that an entity belongs to the same entity group as another entity when you create the entity. All entities fetched, created, updated or deleted in a transaction must be in the same entity group.

Entity groups are defined by a hierarchy of relationships between entities. To create an entity in a group, you declare that the entity is a child of another entity already in the group. The other entity is the parent. An entity created without a parent is a root entity. A root entity without any children exists in an entity group by itself. Each entity has a path of parent-child relationships from a root entity to itself (the shortest path being no parent). This path is an essential part of the entity's complete key. A complete key can be represented by the kind and ID or key name of each entity in the path.

The datastore uses optimistic concurrency to manage transactions. While one app instance is applying changes to entities in an entity group, all other attempts to update any entity in the group fail instantly. The app can try the transaction again to apply it to the updated data.
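The counter example and the retry behaviour described above can be sketched in plain Python. This is only an illustration of optimistic concurrency in general, not the datastore's implementation: EntityGroup, ConcurrentUpdateError, increment and other_process are hypothetical names, and the conflict is detected here with a simple version number checked at commit time.

```python
class ConcurrentUpdateError(Exception):
    """Raised when the group changed after the transaction read it."""


class EntityGroup(object):
    """Toy entity group: a dict of values plus a version number."""

    def __init__(self):
        self.data = {"counter": 0}
        self.version = 0

    def read(self):
        return dict(self.data), self.version

    def commit(self, new_data, read_version):
        # The commit succeeds only if nobody else committed in between.
        if read_version != self.version:
            raise ConcurrentUpdateError()
        self.data = new_data
        self.version += 1


def increment(group, retries=3, interfere=None):
    """Read, calculate and write as one unit, retrying on conflict."""
    for attempt in range(retries):
        snapshot, version = group.read()
        if interfere is not None and attempt == 0:
            interfere(group)      # simulate another process committing first
        snapshot["counter"] += 1
        try:
            group.commit(snapshot, version)
            return attempt + 1    # number of attempts used
        except ConcurrentUpdateError:
            continue              # try again against the updated data
    raise ConcurrentUpdateError("gave up after %d attempts" % retries)


def other_process(group):
    data, version = group.read()
    data["counter"] += 10
    group.commit(data, version)


group = EntityGroup()
attempts_used = increment(group, interfere=other_process)
```

The first attempt fails because the other process committed after the read; the second attempt re-reads the updated value, so neither increment is lost.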

Datastore Statistics

The datastore maintains statistics about the data stored for an application, such as how many entities there are of a given kind, or how much space is used by property values of a given type. You can view these statistics in the Administration Console, under Datastore > Statistics.

You can also access these values programmatically within the application by querying for specially named entities using the datastore API. For more information, see Datastore Statistics.

Quotas and Limits

Each call to the datastore API counts toward the Datastore API Calls quota. Note that some library calls result in multiple calls to the underlying datastore API.

Data sent to the datastore by the app counts toward the Data Sent to (Datastore) API quota. Data received by the app from the datastore counts toward the Data Received from (Datastore) API quota.

The total amount of data currently stored in the datastore for the app cannot exceed the Stored Data (billable) quota. This includes entity properties and keys, but does not include indexes.

The amount of CPU time consumed by datastore operations applies to the following quotas:

  • CPU Time (billable)
  • Datastore CPU Time

For more information on quotas, see Quotas, and the "Quota Details" section of the Admin Console.

In addition to quotas, the following limits apply to the use of the datastore:

  Limit                                                            Amount
  maximum entity size                                              1 megabyte
  maximum number of values in an index for an entity (1)           1,000 values
  maximum number of entities in a batch put or batch delete (2)    500 entities
  maximum number of entities in a batch get (2)                    1,000 entities
  maximum results offset for a query                               1,000

  1. An entity uses one value in an index for every column × every row that refers to the entity, in all indexes. The number of index values for an entity can grow large if an indexed property has multiple values, requiring multiple rows with repeated values in the table.
  2. The total size of a datastore call must not exceed 1 megabyte. Also, the total size of the response to a datastore call must not exceed 1 megabyte.
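One practical consequence of the batch limits above is that a large list of entities has to be split before it is stored. The following plain-Python sketch shows one way to do the splitting; chunked and MAX_BATCH_PUT are hypothetical names, the integers stand in for model instances, and the sketch deliberately ignores the separate 1 megabyte limit on the size of each call.

```python
MAX_BATCH_PUT = 500   # batch put limit taken from the limits listed above


def chunked(items, size=MAX_BATCH_PUT):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


entities = list(range(1234))          # stand-ins for model instances
batches = list(chunked(entities))
batch_sizes = [len(batch) for batch in batches]
```

In an application, each batch would then be passed to a single batch put call, assuming the combined size of its entities also stays under the call size limit.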

2. Entities and Models

A datastore entity has a key and a set of properties. An application uses the datastore API to define data models, and create instances of those models to be stored as entities. Models provide a common structure to the entities created by the API, and can define rules for validating property values.

  • The Model Interface
  • Expando Models
  • Polymorphic Models
  • Properties and Types
      o Strings and Blobs
      o Lists
      o References
  • Property Names

The Model Interface

An application describes the kinds of data it uses with models. A model is a Python class that inherits from the Model class. The model class defines a new kind of datastore entity and the properties the kind is expected to take.

Model properties are defined using class attributes on the model class. Each class attribute is an instance of a subclass of the Property class, usually one of the provided property classes. A property instance holds configuration for the property, such as whether or not the property is required for the instance to be valid, or a default value to use for the instance if none is provided.

    from google.appengine.ext import db

    class Pet(db.Model):
        name = db.StringProperty(required=True)
        type = db.StringProperty(required=True,
                                 choices=set(["cat", "dog", "bird"]))
        birthdate = db.DateProperty()
        weight_in_pounds = db.IntegerProperty()
        spayed_or_neutered = db.BooleanProperty()
        owner = db.UserProperty(required=True)

An entity of one of the defined entity kinds is represented in the API by an instance of the corresponding model class. The application can create a new entity by calling the constructor of the class. The application accesses and manipulates properties of the entity using attributes of the instance. The model instance constructor accepts initial values for properties as keyword arguments.

    from google.appengine.api import users

    pet = Pet(name="Fluffy",
              type="cat",
              owner=users.get_current_user())
    pet.weight_in_pounds = 24

Note: The attributes of the model class are configuration for the model properties, whose values are Property instances. The attributes of the model instance are the actual property values, whose values are of the type accepted by the Property class.

The Model class uses the Property instances to validate values assigned to the model instance attributes. Property value validation occurs when a model instance is first constructed, and when an instance attribute is assigned a new value. This ensures that a property can never have an invalid value.

Because validation occurs when the instance is constructed, any property that is configured to be required must be initialized in the constructor. In this example, name, type and owner are all required values, so their initial values are specified in the constructor. weight_in_pounds is not required by the model, so it starts out unassigned, then is assigned a value later.

An instance of a model created using the constructor does not exist in the datastore until it is "put" for the first time. See Creating, Getting and Deleting Data.

Note: As with all Python class attributes, model property configuration is initialized when the script or module is first imported. Because App Engine caches imported modules between requests, module configuration may be initialized during a request for one user, and re-used during a request for another. Do not initialize model property configuration, such as default values, with data specific to the request or the current user. See App Caching for more information.
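The split between class attributes (property configuration) and instance attributes (property values), and the two points at which validation runs, can be imitated in plain Python with descriptors. The sketch below is not the SDK's implementation: StringProp and ToyModel are hypothetical stand-ins, only string properties are handled, and the error class merely borrows the name BadValueError.

```python
class BadValueError(Exception):
    pass


class StringProp(object):
    """Stand-in for a Property: holds configuration and validates values."""

    def __init__(self, required=False, choices=None):
        self.required = required
        self.choices = choices
        self.name = None

    def validate(self, value):
        if value is None:
            if self.required:
                raise BadValueError("Property %s is required" % self.name)
            return value
        if not isinstance(value, str):
            raise BadValueError("Property %s must be a string" % self.name)
        if self.choices is not None and value not in self.choices:
            raise BadValueError("Property %s is %r; must be one of %r"
                                % (self.name, value, sorted(self.choices)))
        return value

    def __get__(self, instance, owner):
        if instance is None:
            return self                       # class access: configuration
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        # Runs on every assignment, including those made by the constructor.
        instance.__dict__[self.name] = self.validate(value)


class ToyModel(object):
    def __init__(self, **kwargs):
        for name, prop in vars(type(self)).items():
            if isinstance(prop, StringProp):
                prop.name = name
                setattr(self, name, kwargs.get(name))


class Pet(ToyModel):
    name = StringProp(required=True)
    type = StringProp(required=True, choices=set(["cat", "dog", "bird"]))
    nickname = StringProp()


def is_rejected(action):
    try:
        action()
    except BadValueError:
        return True
    return False


pet = Pet(name="Fluffy", type="cat")    # required values given up front
pet.nickname = "Fluff"                  # optional value assigned later

missing_required = is_rejected(lambda: Pet(type="cat"))
bad_choice = is_rejected(lambda: setattr(pet, "type", "lizard"))
```

Accessing Pet.name on the class yields the configuration object, while pet.name on an instance yields the stored value, which mirrors the distinction drawn in the first note.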

Expando Models

A model defined using the Model class establishes a fixed set of properties that every instance of the class must have (perhaps with default values). This is a useful way to model data objects, but the datastore does not require that every entity of a given kind have the same set of properties.

Sometimes it is useful for an entity to have properties that aren't necessarily like the properties of other entities of the same kind. Such an entity is represented in the datastore API by an "expando" model. An expando model class subclasses the Expando superclass. Any value assigned to an attribute of an instance of an expando model becomes a property of the datastore entity, using the name of the attribute. These properties are known as dynamic properties. Properties defined using Property class instances in class attributes are fixed properties.

An expando model can have both fixed and dynamic properties. The model class simply sets class attributes with Property configuration objects for the fixed properties. The application creates dynamic properties when it assigns them values.

    class Person(db.Expando):
        first_name = db.StringProperty()
        last_name = db.StringProperty()
        hobbies = db.StringListProperty()

    p = Person(first_name="Albert", last_name="Johnson")
    p.hobbies = ["chess", "travel"]

    p.chess_elo_rating = 1350
    p.travel_countries_visited = ["Spain", "Italy", "USA", "Brazil"]
    p.travel_trip_count = 13

Because dynamic properties do not have model property definitions, dynamic properties are not validated. Any dynamic property can have a value of any of the datastore base types, including None. Two entities of the same kind can have different types of values for the same dynamic property, and one can leave a property unset that the other sets.

Unlike fixed properties, dynamic properties need not exist. A dynamic property with a value of None is different from a non-existent dynamic property. If an expando model instance does not have an attribute for a property, the corresponding data entity does not have that property. You can delete a dynamic property by deleting the attribute.

    del p.chess_elo_rating

A query that uses a dynamic property in a filter will only return entities whose value for the property is of the same type as the value used in the query. Similarly, the query will only return entities with that property set.

    p1 = Person()
    p1.favorite = 42
    p1.put()

    p2 = Person()
    p2.favorite = "blue"
    p2.put()

    p3 = Person()
    p3.put()

    people = db.GqlQuery("SELECT * FROM Person WHERE favorite < :1", 50)
    # people has p1, but not p2 or p3

    people = db.GqlQuery("SELECT * FROM Person WHERE favorite > :1", 50)
    # people has no results
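The two rules in the example above (the property must be set, and its value must have the same type as the query value) can be mimicked with a small plain-Python filter. This only reproduces the observable behaviour of the example using dictionaries as stand-in entities; filter_less_than is a hypothetical helper and says nothing about how the datastore evaluates filters internally.

```python
def filter_less_than(entities, prop, bound):
    """Keep entities whose `prop` is set, has bound's type, and is < bound."""
    matches = []
    for entity in entities:
        if prop not in entity:
            continue                    # property not set: never a result
        value = entity[prop]
        if type(value) is not type(bound):
            continue                    # different value type: never a result
        if value < bound:
            matches.append(entity)
    return matches


p1 = {"name": "p1", "favorite": 42}
p2 = {"name": "p2", "favorite": "blue"}
p3 = {"name": "p3"}

people = filter_less_than([p1, p2, p3], "favorite", 50)
names = [person["name"] for person in people]
```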

The Expando class is a subclass of the Model class, and inherits all of its methods.

Polymorphic Models

The Python API includes another class for data modeling that allows you to define hierarchies of classes, and perform queries that can return entities of a given class or any of its subclasses. Such models and queries are called "polymorphic," because they allow instances of one class to be results for a query of a parent class.

The following example defines a Contact class, and Person and Company classes that are subclasses of Contact:

    from google.appengine.ext import db
    from google.appengine.ext.db import polymodel

    class Contact(polymodel.PolyModel):
        phone_number = db.PhoneNumberProperty()
        address = db.PostalAddressProperty()

    class Person(Contact):
        first_name = db.StringProperty()
        last_name = db.StringProperty()
        mobile_number = db.PhoneNumberProperty()

    class Company(Contact):
        name = db.StringProperty()
        fax_number = db.PhoneNumberProperty()

This model ensures that all Person entities and all Company entities have phone_number and address properties, and queries for Contact entities can return either Person or Company entities. Only Person entities have mobile_number properties.

The subclasses can be instantiated just like any other model class:

    p = Person(phone_number='1-206-555-9234',
               address='123 First Ave., Seattle, WA, 98101',
               first_name='Alfred',
               last_name='Smith',
               mobile_number='1-206-555-0117')
    p.put()

    c = Company(phone_number='1-503-555-9123',
                address='P.O. Box 98765, Salem, OR, 97301',
                name='Data Solutions, LLC',
                fax_number='1-503-555-6622')
    c.put()

A query for Contact entities can return instances of either Contact, Person or Company. The following code prints information for both entities created above:

    for contact in Contact.all():
        print 'Phone: %s\nAddress: %s\n\n' % (contact.phone_number,
                                              contact.address)

A query for Company entities returns only instances of Company:

    for company in Company.all():
        # ...
        pass

For now, polymorphic models should not be passed to the Query class constructor directly. Instead, use the all() method, as in the example above.

For more information on how to use polymorphic models, and how they are implemented, see The PolyModel Class.
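One way to picture how a query on a parent class can return instances of its subclasses is to store, with each entity, the list of class names from the root of the hierarchy down to the entity's own class, and to match a query for a class against that list. The plain-Python sketch below follows that idea as an assumption for illustration only; STORE, put, all_of and class_path are hypothetical names, and the real behaviour is described in The PolyModel Class.

```python
class Contact(object):
    @classmethod
    def class_path(cls):
        """Class names from the root of the hierarchy down to cls."""
        return [c.__name__ for c in reversed(cls.__mro__) if c is not object]


class Person(Contact):
    pass


class Company(Contact):
    pass


STORE = []   # toy datastore: a list of (class_path, instance) rows


def put(instance):
    STORE.append((type(instance).class_path(), instance))


def all_of(cls):
    """Return stored instances of cls or of any of its subclasses."""
    return [inst for path, inst in STORE if cls.__name__ in path]


put(Person())
put(Company())
put(Contact())

contact_kinds = [type(c).__name__ for c in all_of(Contact)]
company_kinds = [type(c).__name__ for c in all_of(Company)]
```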

Properties and Types

The datastore supports a fixed set of value types for entity properties, including Unicode strings, integers, floating point numbers, dates, entity keys, byte strings (blobs), and various GData types. Each of the datastore value types has a corresponding Property class provided by the google.appengine.ext.db module.

Types and Property Classes describes all of the supported value types and their corresponding Property classes. Several special value types are described below.

Strings and Blobs

The datastore supports two value types for storing text: short text strings up to 500 bytes in length, and long text strings up to 1 megabyte in length. Short strings are indexed and can be used in query filter conditions and sort orders. Long strings are not indexed and cannot be used in filter conditions or sort orders.

A short string value can be either a unicode value or a str value. If the value is a str, an encoding of 'ascii' is assumed. To specify a different encoding for a str value, you can convert it to a unicode value with the unicode() type constructor, which takes the str and the name of the encoding as arguments. Short strings can be modeled using the StringProperty class.

    class MyModel(db.Model):
        string = db.StringProperty()

    obj = MyModel()

    # Python Unicode literal syntax fully describes characters in a text string.
    obj.string = u"kittens"

    # unicode() converts a byte string to a Unicode value using the named codec.
    obj.string = unicode("kittens", "latin-1")

    # A byte string is assumed to be text encoded as ASCII (the 'ascii' codec).
    obj.string = "kittens"

    # Short string properties can be used in query filters.
    results = db.GqlQuery("SELECT * FROM MyModel WHERE string = :1", u"kittens")

A long string value is represented by a db.Text instance. Its constructor takes either a unicode value, or a str value and optionally the name of the encoding used in the str. Long strings can be modeled using the TextProperty class.

    class MyModel(db.Model):
        text = db.TextProperty()

    obj = MyModel()

    # Text() can take a Unicode value.
    obj.text = db.Text(u"lots of kittens")

    # Text() can take a byte string and the name of an encoding.
    obj.text = db.Text("lots of kittens", "latin-1")

    # If no encoding is specified, a byte string is assumed to be ASCII text.
    obj.text = db.Text("lots of kittens")

    # Text properties can store large values.
    obj.text = db.Text(open("a_tale_of_two_cities.txt").read(), "utf-8")

The datastore also supports two similar types for non-text byte strings: db.ByteString and db.Blob. These values are strings of raw bytes, and are not treated as encoded text (such as UTF-8).

Like str or unicode values, db.ByteString values are indexed, and limited to 500 characters. A ByteString instance represents a short string of bytes, and takes a str value as an argument to its constructor. Byte strings are modeled using the ByteStringProperty class.

Like db.Text, a db.Blob value can be as large as 1 megabyte, but is not indexed, and cannot be used in query filters or sort orders. The db.Blob class takes a str value as an argument to its constructor. Blobs are modeled using the BlobProperty class.

    class MyModel(db.Model):
        blob = db.BlobProperty()

    obj = MyModel()

    # Open the file in binary mode so the bytes are read unchanged.
    obj.blob = db.Blob(open("image.png", "rb").read())

Lists

A property can have multiple values, represented in the datastore API as a Python list. The list can contain values of any of the value types supported by the datastore. A single list property may even have values of different types.

Order is preserved, so when entities are returned by queries and get(), list properties will have values in the same order as when they were stored.

The ListProperty class models a list, and enforces that all values in the list are of a given type. For convenience, the library also provides StringListProperty, similar to ListProperty(basestring).

    class MyModel(db.Model):
        numbers = db.ListProperty(long)

    obj = MyModel()
    obj.numbers = [2, 4, 6, 8, 10]

    obj.numbers = ["hello"]  # ERROR: MyModel.numbers must be a list of longs.

A query filter on a list property tests the given value against members of the list. The condition is true if at least one member of the list meets the condition.

    # Get all entities where numbers contains a 6.
    results = db.GqlQuery("SELECT * FROM MyModel WHERE numbers = 6")

    # Get all entities where numbers contains at least one element less than 10.
    results = db.GqlQuery("SELECT * FROM MyModel WHERE numbers < 10")

Query filters only operate on the list members. There is no way to test two lists for similarity in a query filter.
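The "at least one member meets the condition" rule can be written down directly in plain Python. The sketch below is only a model of the rule, using a dictionary of lists as stand-in entities; matches and OPERATORS are hypothetical names.

```python
import operator

OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def matches(values, op, operand):
    """True if at least one member of the list meets the condition."""
    compare = OPERATORS[op]
    return any(compare(value, operand) for value in values)


entities = {
    "a": [2, 4, 6, 8, 10],
    "b": [12, 14],
    "c": [],                 # an empty list never matches any filter
    "d": [6, 20],
}

contains_six = sorted(k for k, v in entities.items() if matches(v, "=", 6))
has_under_ten = sorted(k for k, v in entities.items() if matches(v, "<", 10))
```

Note that entity "d" satisfies the less-than filter even though one of its members is 20: a single matching member is enough.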

Internally, the datastore represents a list property value as multiple values for the property. If a list property value is the empty list, then the property has no representation in the datastore. The datastore API treats this situation differently for static properties (with ListProperty) and dynamic properties:

  • A static ListProperty can be assigned the empty list as a value. The property does not exist in the datastore, but the model instance behaves as if the value is the empty list. A static ListProperty cannot have a value of None.
  • A dynamic property with a list value cannot be assigned an empty list value. However, it can have a value of None, and can be deleted (using del).

The ListProperty model tests that a value added to the list is of the correct type, and throws a BadValueError if it isn't. This test occurs (and potentially fails) even when a previously stored entity is retrieved and loaded into the model.

Because str values are converted to unicode values (as ASCII text) prior to storage, ListProperty(str) is treated as ListProperty(basestring), the Python data type which accepts both str and unicode values. You can also use StringListProperty() for this purpose.

For storing non-text byte strings, use db.Blob values. The bytes of a blob string are preserved when they are stored and retrieved. You can declare a property that is a list of blobs as ListProperty(db.Blob).

List properties interact in unusual ways with sort orders. See Queries and Indexes: Sort Orders and Properties With Multiple Values for details.

References

A property value can contain the key of another entity. The value is a Key instance.

The ReferenceProperty class models a key value, and enforces that all values refer to entities of a given kind. For convenience, the library also provides SelfReferenceProperty, equivalent to a ReferenceProperty that refers to the same kind as the entity with the property.

Assigning a model instance to a ReferenceProperty property automatically uses its key as the value.

    class FirstModel(db.Model):
        prop = db.IntegerProperty()

    class SecondModel(db.Model):
        reference = db.ReferenceProperty(FirstModel)

    obj1 = FirstModel()
    obj1.prop = 42
    obj1.put()

    obj2 = SecondModel()

    # A reference value is the key of another entity.
    obj2.reference = obj1.key()

    # Assigning a model instance to a property uses the entity's key as the value.
    obj2.reference = obj1
    obj2.put()

A ReferenceProperty property value can be used as if it were the model instance of the referenced entity. If the referenced entity is not in memory, using the property as an instance automatically fetches the entity from the datastore.

    obj2.reference.prop = 999
    obj2.reference.put()

    results = db.GqlQuery("SELECT * FROM SecondModel")
    another_obj = results.fetch(1)[0]
    v = another_obj.reference.prop

When an entity whose key is the value of a reference property is deleted, the reference property does not change. A reference property value can be a key that is no longer valid. If an application expects that a reference could be invalid, it can test for the existence of the object using an if statement:

    obj1 = obj2.reference
    if not obj1:
        # Referenced entity was deleted.
        pass

ReferenceProperty has another handy feature: back-references. When a model has a ReferenceProperty to another model, each referenced entity gets a property whose value is a Query that returns all of the entities of the first model that refer to it.

    # To fetch and iterate over every SecondModel entity that refers to the
    # FirstModel instance obj1:
    for obj in obj1.secondmodel_set:
        # ...
        pass

The name of the back-reference property defaults to modelname_set (with the name of the model class in lowercase letters, and "_set" added to the end), and can be adjusted using the collection_name argument to the ReferenceProperty constructor.

If you have multiple ReferenceProperty values that refer to the same model class, the default construction of the back-reference property will raise an error:

    class FirstModel(db.Model):
        prop = db.IntegerProperty()

    # This class raises a DuplicatePropertyError with the message
    # "Class Firstmodel already has property secondmodel_set"
    class SecondModel(db.Model):
        reference_one = db.ReferenceProperty(FirstModel)
        reference_two = db.ReferenceProperty(FirstModel)

To avoid this error, you must explicitly set the collection_name argument:

    class FirstModel(db.Model):
        prop = db.IntegerProperty()

    # This class runs fine
    class SecondModel(db.Model):
        reference_one = db.ReferenceProperty(FirstModel,
            collection_name="secondmodel_reference_one_set")
        reference_two = db.ReferenceProperty(FirstModel,
            collection_name="secondmodel_reference_two_set")

Automatic referencing and dereferencing of model instances, type checking and back-references are only available using the ReferenceProperty model property class. Keys stored as values of Expando dynamic properties or ListProperty values do not have these features.
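The back-reference naming rule can be illustrated with a plain-Python sketch that attaches a lookup property to the referenced class under the default name. This is an analogy only: the classes below are ordinary Python classes, back_reference is a hypothetical helper, instances are tracked in a simple list, and the lookup returns a list rather than a Query.

```python
class FirstModel(object):
    def __init__(self, prop):
        self.prop = prop


class SecondModel(object):
    instances = []       # toy stand-in for all stored SecondModel entities

    def __init__(self, reference):
        self.reference = reference
        SecondModel.instances.append(self)


def back_reference(model_class, attr):
    """Build a property listing every model_class instance referring to self."""
    def lookup(self):
        return [obj for obj in model_class.instances
                if getattr(obj, attr) is self]
    return property(lookup)


# Default collection name: the model class name in lowercase plus "_set".
collection_name = SecondModel.__name__.lower() + "_set"
setattr(FirstModel, collection_name, back_reference(SecondModel, "reference"))

obj1 = FirstModel(42)
other = FirstModel(7)
a = SecondModel(obj1)
b = SecondModel(obj1)
c = SecondModel(other)

referrers = obj1.secondmodel_set
```

With two reference attributes pointing at the same class, both would try to claim the same default name, which is the situation the collection_name argument resolves.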

Property Names

The datastore reserves all property names beginning and ending with two underscore characters (__*__). An application cannot create a property with such a name.

In the Python API, attributes of model instances whose names begin with an underscore (_) are ignored, and are not saved to the datastore entity. This allows you to store values on the model instance for temporary internal use without affecting the data saved with the entity.

Because the Python API uses attributes of model instances as property names by default, none of the attributes already in use by instance methods can be used directly as property attribute names. Similarly, none of the names used by the keyword arguments of model constructors can be used as property attribute names. See the list of reserved property names.

The datastore itself allows these names. If an application needs a datastore entity to have a property with a name similar to a word reserved in the Python API, the application can use a fixed property and pass the name argument to the Property class constructor. See the Property class constructor.

    class MyModel(db.Model):
        obj_key = db.StringProperty(name="key")
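The two underscore rules above can be summarized in a small plain-Python helper. classify_attribute is a hypothetical function written for this illustration; it covers only the underscore rules and does not check the separate list of names reserved by the Python API.

```python
import re

# Names that begin and end with two underscores are reserved by the datastore.
RESERVED = re.compile(r"^__.*__$")


def classify_attribute(name):
    """Return how an attribute name is treated under the underscore rules."""
    if RESERVED.match(name):
        return "reserved"     # cannot be used as a property name
    if name.startswith("_"):
        return "ignored"      # kept on the instance, not saved to the entity
    return "saved"


results = dict((name, classify_attribute(name))
               for name in ["title", "_cache", "__key__", "__x"])
```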

3. Creating, Getting and Deleting Data

The datastore API represents entities as instances of model classes. Methods of a model instance create, update and delete the entity. Entities can be fetched from the datastore as model instances using queries or keys.

  • Creating and Updating an Entity
  • Getting Entities Using a Query
  • Getting an Entity Using a Key
  • Deleting an Entity

Creating and Updating an Entity

Instances of Model (and Expando) classes represent datastore entities. An application creates a new entity of a given kind by calling the constructor of the model class that corresponds with the kind.

    pet = Pet(name="Fluffy",
              type="cat",
              owner=users.get_current_user())

The new entity is not created in the datastore until the instance is "put" for the first time, either by calling the put() method on the instance, or by passing the instance to the db.put() function.

    pet.put()

    db.put(pet)

If an instance has been stored before, the put() method updates the existing entity.

Queries return results as model instances. These instances can be modified and put back into the datastore.

    if users.get_current_user():
        user_pets = db.GqlQuery("SELECT * FROM Pet WHERE owner = :1",
                                users.get_current_user())
        updated_pets = []
        for pet in user_pets:
            pet.spayed_or_neutered = True
            # Keep the modified instances; iterating the query again would
            # re-run it and return fresh, unmodified instances.
            updated_pets.append(pet)

        db.put(updated_pets)

Getting Entities Using a Query

The datastore can execute queries across entities of a given kind. A query can filter results using conditions that entity property values must meet, and can return the results sorted by property values. A query can also limit the scope to entities with a given ancestor; see Keys and Entity Groups.

For a complete description of how queries work, including several things queries cannot do, see Queries and Indexes.

The datastore API provides two interfaces for performing queries on entity properties: Query, an interface that prepares queries using methods on a query object, and GqlQuery, an interface that uses a SQL-like query language called GQL.

The Query Interface

The all() method on a Model (or Expando) class returns a Query object that represents a query for all entities of the corresponding kind. The application prepares the query by calling the filter(), order(), and ancestor() methods on the object.

    class Story(db.Model):
        title = db.StringProperty()
        date = db.DateTimeProperty()

    query = Story.all()

    query.filter('title =', 'Foo')
    query.order('-date')
    query.ancestor(key)

    # These methods can be chained together on one line.
    query.filter('title =', 'Foo').order('-date').ancestor(key)

The GqlQuery Interface

The GqlQuery class constructor takes a GQL query string and optional parameter bindings. The query string specifies the kind, and the filters, sort orders and ancestor conditions. The query string can also include a result limit and offset.

    # Parameters can be bound with positional arguments.
    query = db.GqlQuery("SELECT * FROM Story WHERE title = :1 "
                        "AND ANCESTOR IS :2 "
                        "ORDER BY date DESC",
                        'Foo', key)

    # Or, parameters can be bound with keyword arguments.
    query = db.GqlQuery("SELECT * FROM Story WHERE title = :title "
                        "AND ANCESTOR IS :parent "
                        "ORDER BY date DESC",
                        title='Foo', parent=key)

    # String, number and Boolean values can be literal values in the string.
    query = db.GqlQuery("SELECT * FROM Story WHERE title = 'Foo' "
                        "AND ANCESTOR IS :parent "
                        "ORDER BY date DESC",
                        parent=key)

The gql() class method of a Model class also prepares a GqlQuery object from a string. The string is the GQL query string with the SELECT ... FROM Model omitted, because this part is implied.

    query = Story.gql("WHERE title = :title "
                      "AND ANCESTOR IS :parent "
                      "ORDER BY date DESC",
                      title='Foo', parent=key)

Parameter bindings can be re-bound to new values using the bind() method. An application can re-use a GqlQuery object by re-binding parameters and re-executing the query.

Executing the Query and Accessing Results

Query and GqlQuery objects do not execute the query until the application tries to access the results. When the application accesses results, the query is executed, and results are loaded into memory as instances of the model class for the query. Both query classes provide two ways to execute the query and access results: the fetch() method, and the iterator interface.

The fetch() method takes a maximum number of results to fetch (the limit), and an optional number of results to skip (the offset). The method executes the query, then fetches results until it has fetched the limit or there are no more results. Once the results are loaded into memory, it skips to the offset if one was specified, then returns the requested results as a list of model instances. The full query is executed for each call to fetch().

Note: The offset does not affect the number of results fetched from the datastore. All results up to the limit are fetched and stored in memory. The offset only affects what is returned by the fetch() method.

    results = query.fetch(10)
    for result in results:
        print "Title: " + result.title

The limit and offset given to the fetch() method override any limit and offset specified in a GQL query string.

If the query object is used as an iterator, the query is executed with no limit or offset, the results are loaded into memory, and the value returned is an iterator over the results. The iterator yields instances of the model class.

    for result in query:
        print "Title: " + result.title

Note: The datastore returns a maximum of 1000 results in response to a query, regardless of the limit and offset used to fetch the results. The 1000 results includes any that are skipped using an offset, so a query with more than 1000 results using an offset of 100 will return 900 results.
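The arithmetic in the note above can be checked with a few lines of plain Python. This is only a sketch of the stated rule, not datastore code: max_results_returned and MAX_RESULTS are hypothetical names, and the sketch assumes the requested limit is large enough that the 1000-result cap is the binding constraint.

```python
MAX_RESULTS = 1000  # cap stated in the note above


def max_results_returned(total_matching, offset=0):
    """Upper bound on the results a query can return (illustrative model).

    Skipped results count against the cap, so the offset is subtracted
    from the number of rows the datastore is willing to scan.
    """
    scanned = min(total_matching, MAX_RESULTS)
    return max(0, scanned - offset)


# A query matching 5000 entities, read with an offset of 100.
print(max_results_returned(5000, offset=100))
```

Under this model an offset of 1000 or more can never return anything, which is one reason the key-based batching described under Queries on Keys scales better than offsets.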

Getting an Entity Using a Key

After an entity is stored in the datastore, the entity has a unique key. Key values are represented in the API as instances of the Key class. The put() method of a model instance and the db.put() function return the Key of the stored entity. After a model instance has been stored for the first time, the key() method of the model instance returns the Key for the instance.

    entity.put()
    key = entity.key()

    # ...

    entity = db.get(key)

A common use of a Key value is to store it as the value of a property on another entity. The ReferenceProperty model property class provides automatic referencing and dereferencing of model instances as keys: A model instance can be assigned to a ReferenceProperty directly, and its key will be used as the value.

    class PetOwner(db.Model):
        name = db.StringProperty()

    class Pet(db.Model):
        name = db.StringProperty()
        owner = db.ReferenceProperty(PetOwner)

    owner = PetOwner(name="Albert")
    pet = Pet(name="Fluffy", owner=owner)

    # This is equivalent:
    pet = Pet(name="Fluffy", owner=owner.key())

Similarly, a ReferenceProperty value accessed via a property acts like its instance. The data entity is fetched automatically, and is not fetched until it is used.

    pets = db.GqlQuery("SELECT * FROM Pet WHERE name = :1", "Fluffy")
    pet = pets.get()

    owner_name = pet.owner.name

Key values stored without the ReferenceProperty model, such as with an Expando dynamic property or a ListProperty element, do not have the automatic dereferencing behavior.

The db.get() function fetches an entity from the datastore for a Key (or list of Keys).

Keys can be encoded as strings for passing around outside of the application. To convert a string-encoded key back to a Key object, the application passes the string to the Key constructor.

    obj = MyModel(name="Foo")
    obj.put()  # The entity must be stored before it has a complete key.
    # name is a property value, not a method, so it is not called.
    self.response.write('%s %s' % (str(obj.key()), obj.name))

    # ...

    key_name = self.request.get('key')
    obj = db.get(db.Key(key_name))

Note: The string encoding of a Key is opaque, but not encrypted. If your application needs keys to not be guessable, you should further encrypt the string-encoded Key before sending it to the user.

Deleting an Entity

An application can delete an entity from the datastore using a model instance or a Key. The model instance's delete() method deletes the corresponding entity from the datastore. The db.delete() function takes a Key or list of Keys and deletes the entity (or entities) from the datastore.

    q = db.GqlQuery("SELECT * FROM Message WHERE create_date < :1", earliest_date)
    results = q.fetch(10)
    for result in results:
        result.delete()

    # or...

    q = db.GqlQuery("SELECT __key__ FROM Message WHERE create_date < :1", earliest_date)
    results = q.fetch(10)
    db.delete(results)

Deleting an entity does not change any Key values in the datastore that may have referred to the entity. If your application may attempt to de-reference a Key value for a deleted entity, the application should do so using db.get(), then test the return value before accessing properties. Deleting an entity that is an ancestor for other entities does not affect the other entities. As long as the application does not depend on the existence of the ancestor to build keys for the descendant entities, the application can still access the descendants.

4. Keys and Entity Groups

Every entity in the datastore has a key, an identifier unique to the entity across all entities for an application. A key has several components: a path describing a parent-child relationship between the entity and another entity, the kind of the entity, and either a name assigned to the entity by the application or a numeric ID assigned by the datastore.

• Kinds, Names and IDs
• Entity Groups, Ancestors and Paths
• Paths and Key Uniqueness

Kinds, Names and IDs

Every entity is of a particular kind, a group of entities that can be returned by a query. Unlike rows in a table, two entities of the same kind need not have the same properties, though an application can establish such a restriction in a data model.

The datastore API uses the name of the Model (or Expando) subclass as the name of the kind. For example, this class defines a model for a kind named "Story":

    class Story(db.Model):
        title = db.StringProperty()
        author = db.StringProperty()

Every entity has an identifier. An application can assign its own identifier for use in the key by giving the instance constructor a key_name argument (a str value): s = Story(key_name="xzy123")

A key_name is stored as a Unicode string (with str values converted as ASCII text). Key names of the form __*__ (start and end with two underscores) are reserved, and should not be used by the application.

If a key_name is not specified, the entity is assigned a numeric ID when it is first stored in the datastore.

    s2 = Story()  # s2 does not have a name or an ID.
    s2.put()      # s2 is given an ID by the datastore.

Once the entity has been created, its ID or name cannot be changed. Tip: Key names and IDs are not property values. However, you can perform a limited set of queries on keys by referring to the special property name __key__. If all entities have the same kind and parent, filters and sort orders can be applied to the key names or IDs. See Queries on Keys.

Entity Groups, Ancestors and Paths

Every entity belongs to an entity group, a set of one or more entities that can be manipulated in a single transaction. Entity group relationships tell App Engine to store several entities in the same part of the distributed network. A transaction sets up datastore operations for an entity group, and all of the operations are applied as a group, or not at all if the transaction fails.

When the application creates an entity, it can assign another entity as the parent of the new entity, using the parent argument in the Model constructor. Assigning a parent to a new entity puts the new entity in the same entity group as the parent entity.

An entity without a parent is a root entity. An entity that is a parent for another entity can also have a parent. A chain of parent entities from an entity up to the root is the path for the entity, and members of the path are the entity's ancestors. The parent of an entity is defined when the entity is created, and cannot be changed later.

Every entity with a given root entity as an ancestor is in the same entity group. All entities in a group are stored in the same datastore node. A single transaction can modify multiple entities in a single group, or add new entities to the group by making the new entity's parent an existing entity in the group. For more information about transactions, see Transactions.

If an entity that is an ancestor of another entity is deleted, the descendant entity is not deleted. The descendant entity is still accessible using its complete Key or path.

You can create an entity with an ancestor path without first creating the parent entity. To do so, you create a Key for the ancestor using a kind and key name, then use it as the parent of the new entity. All entities with the same root ancestor belong to the same entity group, whether or not the root of the path represents an actual entity.

Tips for using entity groups:

• Only use entity groups when they are needed for transactions. For other relationships between entities, use ReferenceProperty properties and Key values, which can be used in queries.
• The more entity groups your application has (that is, the more root entities there are), the more efficiently the datastore can distribute the entity groups across datastore nodes. Better distribution improves the performance of creating and updating data. Also, multiple users attempting to update entities in the same entity group at the same time will cause some users to retry their transactions, possibly causing some to fail to commit changes. Do not put all of the application's entities under one root.
• A good rule of thumb for entity groups is that they should be about the size of a single user's worth of data or smaller. Entity groups do not have a significant impact on the speed of queries.

Paths and Key Uniqueness

The complete key of an entity, including the path, the kind and the name or numeric ID, is unique and specific to that entity. The complete key is assigned when the entity is created in the datastore, and none of its parts can change. The keys of two different entities can have similar parts as long as at least one part is different. For instance, two entities can have the same kind and name if they have different parents. Similarly, two entities can have the same parent (or no parent) and name if they are of different kinds. An application should not rely on numeric IDs being assigned in increasing order with the order of entity creation. This is generally the case, but not guaranteed.
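The uniqueness rules above are easy to see if a key is modeled as a path of (kind, name-or-ID) pairs. The following is a plain-Python sketch rather than the db.Key class; make_key and entity_group_root are hypothetical helpers introduced only for illustration.

```python
def make_key(kind, id_or_name, parent=None):
    """Model a key as a tuple of (kind, id_or_name) pairs, root first."""
    return (parent or ()) + ((kind, id_or_name),)


def entity_group_root(key):
    """The first element of the path identifies the entity group."""
    return key[0]


alice = make_key("PetOwner", "alice")
bob = make_key("PetOwner", "bob")

# Same kind and name, different parents: two distinct keys.
pet_a = make_key("Pet", "fluffy", parent=alice)
pet_b = make_key("Pet", "fluffy", parent=bob)

# Same parent and name, different kinds: also distinct.
toy = make_key("Toy", "fluffy", parent=alice)

print(pet_a != pet_b, pet_a != toy)
print(entity_group_root(pet_a) == entity_group_root(toy))
```

In this model two keys are equal only when every path element matches, and every key that starts with the same root element falls in the same entity group.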

5. Queries and Indexes

Every datastore query uses an index, a table that contains the results for the query in the desired order. An App Engine application defines its indexes in a configuration file named index.yaml. The development web server automatically adds suggestions to this file as it encounters queries that do not yet have indexes configured. You can tune indexes manually by editing the file before uploading the application.

The index-based query mechanism supports most common kinds of queries, but it does not support some queries you may be used to from other database technologies. Restrictions on queries, and their explanations, are described below.

• Introducing Queries
• Introducing Indexes
• Defining Indexes With Configuration
• Queries on Keys
• Kindless Ancestor Queries
• Restrictions on Queries
• Big Entities and Exploding Indexes

Introducing Queries

A query retrieves entities from the datastore that meet a set of conditions. The query specifies an entity kind, zero or more conditions based on entity property values (sometimes called "filters"), and zero or more sort order descriptions. When the query is executed, it fetches all entities of the given kind that meet all of the given conditions, sorted in the order described. A query can also return just the keys of the result entities instead of the entities themselves.

The datastore Python API provides two interfaces for preparing and executing queries: the Query interface, which uses methods to prepare the query, and the GqlQuery interface, which uses a SQL-like query language called GQL to prepare the query from a query string. These interfaces are described in more detail in Creating, Getting and Deleting Data: Getting Entities Using a Query and the corresponding reference pages.

    class Person(db.Model):
        first_name = db.StringProperty()
        last_name = db.StringProperty()
        city = db.StringProperty()
        birth_year = db.IntegerProperty()
        height = db.IntegerProperty()

    # The Query interface prepares a query using instance methods.
    q = Person.all()
    q.filter("last_name =", "Smith")
    q.filter("height <", 72)
    q.order("-height")

    # The GqlQuery interface prepares a query using a GQL query string.
    q = db.GqlQuery("SELECT * FROM Person " +
                    "WHERE last_name = :1 AND height < :2 " +
                    "ORDER BY height DESC",
                    "Smith", 72)

    # The query is not executed until results are accessed.
    results = q.fetch(5)
    for p in results:
        print "%s %s, %d inches tall" % (p.first_name, p.last_name, p.height)

A filter includes a property name, a comparison operator, and a value. An entity passes the filter if it has a property of the given name and its value compares to the given value as described by the operator. The entity is a result for the query if it passes all of its filters.

The filter operator can be any of the following:

• < less than
• <= less than or equal to
• = equal to
• > greater than
• >= greater than or equal to
• != not equal to
• IN equal to any of the values in the provided list

The != operator actually performs two queries: one where all other filters are the same and the not-equal filter is replaced with a less-than filter, and one where the not-equal filter is replaced with a greater-than filter. The results are merged, in order. As described below in the discussion of inequality filters, a query can only have one not-equal filter, and such a query cannot have other inequality filters.

The IN operator also performs multiple queries, one for each item in the provided list value, where all other filters are the same and the IN filter is replaced with an equal-to filter. The results are merged, in the order of the items in the list. If a query has more than one IN filter, the query is performed as multiple queries, one for each combination of values in the IN filters.

A single query containing != or IN operators is limited to 30 sub-queries.
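To see how quickly the 30 sub-query limit is reached, the expansion can be counted in plain Python. This is a rough sketch: count_subqueries and MAX_SUBQUERIES are hypothetical names, and multiplying the != factor with the IN combinations is an assumption about how the two operators combine, which the text above does not spell out.

```python
MAX_SUBQUERIES = 30  # limit stated above


def count_subqueries(in_list_sizes, has_not_equal=False):
    """Count the sub-queries a query expands into (illustrative model).

    in_list_sizes: number of values in each IN filter's list.
    has_not_equal: True if the query has a != filter, which splits
        into a less-than query and a greater-than query.
    """
    count = 2 if has_not_equal else 1
    for size in in_list_sizes:
        count *= size  # one sub-query per combination of IN values
    return count


# Two IN filters with 3 and 4 values: 12 sub-queries, within the limit.
print(count_subqueries([3, 4]) <= MAX_SUBQUERIES)

# Two IN filters with 4 values each plus a != filter: over the limit.
print(count_subqueries([4, 4], has_not_equal=True) <= MAX_SUBQUERIES)
```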

Introducing Indexes The App Engine datastore maintains an index for every query an application intends to make. As the application makes changes to datastore entities, the datastore updates the indexes with the correct results. When the application executes a query, the datastore fetches the results directly from the corresponding index. An application has an index for each combination of kind, filter property and operator, and sort order used in a query. Consider the example query, stated in GQL: SELECT * FROM Person WHERE last_name = "Smith" AND height < 72 ORDER BY height DESC

The index for this query is a table of keys for entities of the kind Person, with columns for the values of the height and last_name properties. The index is sorted by height in descending order. Two queries of the same form but with different filter values use the same index. For example, the following query uses the same index as the query above: SELECT * FROM Person WHERE last_name = "Jones" AND height < 63 ORDER BY height DESC

The datastore executes a query using the following steps:

1. The datastore identifies the index that corresponds with the query's kind, filter properties, filter operators, and sort orders.
2. The datastore starts scanning the index at the first entity that meets all of the filter conditions using the query's filter values.
3. The datastore continues to scan the index, returning each entity, until it finds the next entity that does not meet the filter conditions, until it reaches the end of the index, or until it has collected the maximum number of results requested by the query.

An index table contains columns for every property used in a filter or sort order. The rows are sorted by the following aspects, in order:

• ancestors
• property values used in equality filters
• property values used in inequality filters
• property values used in sort orders
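The scan described in the steps above can be imitated with a sorted list and a binary search. The sketch below is a plain-Python model of the index for the example query (last_name equality, height inequality, height descending), not the datastore's implementation; build_index and run_query are hypothetical names, and heights are assumed to be integers as in the Person model.

```python
import bisect


def build_index(people):
    """Rows sorted by last_name, then height descending, then entity id."""
    return sorted((p["last_name"], -p["height"], p["id"]) for p in people)


def run_query(index, last_name, max_height, limit):
    """Model of: WHERE last_name = X AND height < Y ORDER BY height DESC."""
    # Step 2: jump to the first row that satisfies every filter. Heights
    # are integers, so height < max_height means height <= max_height - 1.
    start = bisect.bisect_left(index, (last_name, -(max_height - 1)))
    results = []
    # Step 3: read consecutive rows until one fails the filters, the
    # index ends, or the requested number of results is collected.
    for row_last_name, neg_height, entity_id in index[start:]:
        if row_last_name != last_name or len(results) == limit:
            break
        results.append((entity_id, -neg_height))
    return results


people = [
    {"id": 1, "last_name": "Smith", "height": 70},
    {"id": 2, "last_name": "Smith", "height": 74},
    {"id": 3, "last_name": "Jones", "height": 60},
    {"id": 4, "last_name": "Smith", "height": 65},
    {"id": 5, "last_name": "Young", "height": 68},
    {"id": 6, "last_name": "Smith", "height": 72},
]
index = build_index(people)
print(run_query(index, "Smith", 72, limit=5))
```

Because the rows are ordered by the equality property first and the inequality property second, every matching row is adjacent to the next one, so the scan never has to look past the first non-matching row.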

Note: For the purposes of indexes, IN filters are handled like = filters, and != filters are handled like the other inequality filters. This puts all results for every possible query that uses this index in consecutive rows in the table. Tip: Query filters do not have an explicit way to match just part of a string value, but you can fake a prefix match using inequality filters: db.GqlQuery("SELECT * FROM MyModel WHERE prop >= :1 AND prop < :2", "abc", u"abc" + u"\ufffd")

This matches every MyModel entity with a string property prop that begins with the characters abc. The unicode string u"\ufffd" represents the largest possible Unicode character. When the property values are sorted in an index, the values that fall in this range are all of the values that begin with the given prefix. This mechanism supports a wide range of queries and is suitable for most applications. However, it does not support some kinds of queries you may be used to from other database technologies.
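The range used by the prefix trick can be checked with ordinary string comparison. This sketch uses plain Python strings in place of indexed property values; prefix_range and matches_prefix are hypothetical helpers. Python compares strings by code point, so in this model a value containing a character above U+FFFD right after the prefix would fall outside the range.

```python
def prefix_range(prefix):
    """Return the (inclusive low, exclusive high) bounds for a prefix."""
    return prefix, prefix + u"\ufffd"


def matches_prefix(value, prefix):
    """True if value falls inside the range the two filters describe."""
    low, high = prefix_range(prefix)
    return low <= value < high


for candidate in [u"abc", u"abcdef", u"abd", u"ab"]:
    print(candidate, matches_prefix(candidate, u"abc"))
```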

Entities Without a Filtered Property Are Never Returned by a Query An index only contains entities that have every property referred to by the index. If an entity does not have a property referred to by an index, the entity will not appear in the index, and will never be a result for the query that uses the index. Note that the App Engine datastore makes a distinction between an entity that does not possess a property and an entity that possesses the property with a null value (None). If you want every entity of a kind to be a potential result for a query, you can use a data model that assigns a default value (such as None) to properties used by query filters.

Properties that Aren't Indexed

Property values that aren't indexed are not findable by queries. This includes properties that are marked as not indexed, as well as properties with values of the long text value type (Text) or the long binary value type (Blob).

To declare that a property be stored as non-indexed using the Model class, provide the indexed argument to the property model constructor with the value False, as follows:

    class MyModel(db.Model):
        unindexed_string = db.StringProperty(indexed=False)

A query with a filter or sort order on a property will never match an entity whose value for the property is a Text or Blob, or which was written with that property marked as not indexed. Properties with such values behave as if the property is not set with regard to query filters and sort orders.

Property Values of Mixed Types are Ordered By Type

When two entities have properties of the same name but of different value types, an index of the property sorts the entities first by value type, then by an order appropriate to the type. For example, if two entities each have a property named "age", one with an integer value and one with a string value, the entity with the integer value will always appear before the entity with the string value when sorted by the "age" property, regardless of the values themselves.

This is especially worth noting in the case of integers and floating point numbers, which are treated as separate types by the datastore. A property with the integer value 38 is sorted before a property with the floating point value 37.5, because all integers are sorted before floats.
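Sorting by type before value can be imitated with a two-part sort key. The sketch below is plain Python, not datastore code; TYPE_RANK and index_sort_key are hypothetical names. The text above only states that integers sort before strings and before floats, so the relative position of strings and floats in this table is an assumption made for illustration.

```python
# Rank of each value type in the model index. Only "int first" comes
# from the text above; the str/float order is an assumption.
TYPE_RANK = {int: 0, str: 1, float: 2}


def index_sort_key(value):
    """Sort by value type first, then by the value within its type."""
    return (TYPE_RANK[type(value)], value)


values = [37.5, "40", 38, 12.0, 7]
ordered = sorted(values, key=index_sort_key)
print(ordered)
```

Note that 38 lands before 37.5 and 12.0, even though it is numerically larger than both.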

Defining Indexes With Configuration App Engine builds indexes for several simple queries by default. For other queries, the application must specify the indexes it needs in a configuration file named index.yaml. If the application running under App Engine tries to perform a query for which there is no corresponding index (either provided by default or described in index.yaml), the query will fail.

App Engine provides automatic indexes for the following forms of queries:

• queries using only equality and ancestor filters
• queries using only inequality filters (which can only be of a single property)
• queries with no filters and only one sort order on a property, either ascending or descending
• queries using equality filters on properties and inequality or range filters on keys

Other forms of queries require their indexes to be specified in index.yaml, including:

• queries with multiple sort orders
• queries with a sort order on keys in descending order
• queries with one or more inequality filters on a property and one or more equality filters over other properties
• queries with inequality filters and ancestor filters

The development web server makes managing index configuration easy: Instead of failing to execute a query that does not have an index and requires it, the development web server can generate configuration for an index that would allow the query to succeed. If your local testing of your application calls every possible query the application will make (every combination of kind, ancestor, filter and sort order), the generated entries will represent a complete set of indexes. If your testing might not exercise every possible query form, you can review and adjust the index configuration before uploading the application.

index.yaml describes each index table, including the kind, the properties needed for the query filters and sort orders, and whether or not the query uses an ancestor clause (either Query.ancestor() or a GQL ANCESTOR IS clause). The properties are listed in the order they are to be sorted: properties used in equality or IN filters first, followed by the property used in inequality filters, then the query results sort orders and their directions.

Consider once again the following example query:

    SELECT * FROM Person WHERE last_name = "Smith" AND height < 72 ORDER BY height DESC

If the application executed only this query (and possibly other queries similar to this one but with different values for "Smith" and 72), the index.yaml file would look like this:

    indexes:
    - kind: Person
      properties:
      - name: last_name
      - name: height
        direction: desc

When an entity is created or updated, every appropriate index is updated as well. The number of indexes that apply to an entity affects the time it takes to create or update the entity. For more information on the syntax of index.yaml, see Configuring Indexes.

Queries on Keys

Entity keys can be the subject of a query filter or sort order, using the special name __key__ in place of the property name. The datastore considers the complete key value for such queries, including the entity's parent path, the kind, and the app-assigned key name string or system-assigned numeric ID.

A query can return entity keys instead of full entities. You can trigger this behavior by passing keys_only=True to Query, or by using SELECT __key__ in GQL.

Note: Queries that return keys are faster and cost less CPU than queries that return entities, since the keys themselves are already in the index, so the query doesn't need to fetch the actual entities. If you only need the keys from your query results (for example, if you're just going to delete the results), consider using a keys-only query.

Because an entity key is unique across all entities in the system, __key__ queries make it easy to retrieve entities of a given kind in batches, such as for a batch dump of the contents of the datastore. Unlike offset, this works efficiently for any number of entities. For example:

    class MainHandler(webapp.RequestHandler):
        def get(self):
            query = Entity.gql('ORDER BY __key__')

            # Use a query parameter to keep track of the last key of the last
            # batch, to know where to start the next batch.
            last_key_str = self.request.get('last')
            if last_key_str:
                last_key = db.Key(last_key_str)
                query = Entity.gql('WHERE __key__ > :1 ORDER BY __key__', last_key)

            # For batches of 20, fetch 21, then use result #20 as the "last"
            # if there is a 21st.
            entities = query.fetch(21)
            new_last_key_str = None
            if len(entities) == 21:
                new_last_key_str = str(entities[19].key())

            # Return the data and new_last_key_str. Client would use
            # http://...?last=new_last_key_str to fetch the next batch.
            # ...

Keys are ordered first by parent path, then by kind, then by key name or ID. Kinds and key names are strings and are ordered by byte value. IDs are integers and are ordered numerically. If entities of the same parent and kind use a mix of key name strings and numeric IDs, entities with numeric IDs are considered to be less than entities with key name strings. Elements of the parent path are compared similarly: by kind (string), then by key name (string) or ID (number).

Queries involving keys use indexes just like queries involving properties. Queries on keys require custom indexes in the same cases as with properties, with a couple of exceptions: inequality filters or an ascending sort order on __key__ do not require a custom index, but a descending sort order on __key__ does. As with all queries, the development web server creates appropriate configuration entries in index.yaml when a query that needs a custom index is tested.
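The ordering rule for keys that share a parent can be written as a small sort key. This is a plain-Python sketch, not the db.Key comparison; sibling_sort_key is a hypothetical helper, keys are modeled as (kind, id_or_name) pairs, and the example is limited to sibling keys so that the parent path plays no part.

```python
def sibling_sort_key(key):
    """Order sibling keys by kind, then numeric IDs before key names."""
    kind, id_or_name = key
    if isinstance(id_or_name, int):
        return (kind, 0, id_or_name, "")
    return (kind, 1, 0, id_or_name)


keys = [
    ("Pet", "fluffy"),
    ("Pet", 42),
    ("Owner", "zed"),
    ("Pet", 7),
    ("Pet", "Rex"),
]
# For ASCII strings, Python's code point order matches byte order.
ordered = sorted(keys, key=sibling_sort_key)
for key in ordered:
    print(key)
```

The uppercase name "Rex" sorts before "fluffy" because the comparison is by byte value, not alphabetical.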

Kindless Ancestor Queries

Using GQL or the Python query interface, you can perform queries for entities with a given ancestor regardless of kind. Such queries can also include filters, equality or inequality, on __key__. Kindless queries cannot include sort orders (other than key, ascending, which is the default) or filters on properties.

To perform a kindless ancestor query using the Query class, call the constructor without a kind class:

    q = db.Query()
    q.ancestor(ancestor_key)
    q.filter('__key__ >', last_seen_key)

To perform a kindless ancestor query using GQL (either in the Administrator Console or using the GqlQuery class), omit the FROM Kind clause: q = db.GqlQuery('SELECT * WHERE ANCESTOR IS :1 AND __key__ > :2', ancestor_key, last_seen_key)

Kindless ancestor queries do not require custom indexes.

Restrictions on Queries The nature of the index query mechanism imposes a few restrictions on what a query can do.

Filtering Or Sorting On a Property Requires That the Property Exists A query filter condition or sort order for a property also implies a condition that the entity have a value for the property. A datastore entity is not required to have a value for a property that other entities of the same kind have. A filter on a property can only match an entity with a value for the property. Entities without a value for a property used in a filter or sort order are omitted from the index built for the query.

No Filter That Matches Entities That Do Not Have a Property It is not possible to perform a query for entities that are missing a given property. One alternative is to create a fixed (modeled) property with a default value of None, then create a filter for entities with None as the property value.

Inequality Filters Are Allowed On One Property Only A query may only use inequality filters (<, <=, >=, >, !=) on one property across all of its filters. For example, this query is allowed: SELECT * FROM Person WHERE birth_year >= :min AND birth_year <= :max

However, this query is not allowed, because it uses inequality filters on two different properties in the same query: SELECT * FROM Person WHERE birth_year >= :min_year AND height >= :min_height # ERROR

Filters can combine equal (=) comparisons for different properties in the same query, including queries with one or more inequality conditions on a property. This is allowed: SELECT * FROM Person WHERE last_name = :last_name AND city = :city AND birth_year >= :min_year

The query mechanism relies on all results for a query to be adjacent to one another in the index table, to avoid having to scan the entire table for results. A single index table cannot represent multiple inequality filters on multiple properties while maintaining that all results are consecutive in the table.

Properties In Inequality Filters Must Be Sorted Before Other Sort Orders If a query has both a filter with an inequality comparison and one or more sort orders, the query must include a sort order for the property used in the inequality, and the sort order must appear before sort orders on other properties. This query is not valid, because it uses an inequality filter and does not order by the filtered property: SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY last_name # ERROR

Similarly, this query is not valid because it does not order by the filtered property before ordering by other properties: SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY last_name, birth_year # ERROR

This query is valid: SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY birth_year, last_name

To get all results that match an inequality filter, a query scans the index table for the first matching row, then returns all consecutive results until it finds a row that doesn't match. For the consecutive rows to represent the complete result set, the rows must be ordered by the inequality filter before other sort orders.
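The two inequality rules in this section can be expressed as a small validity check. The sketch below is a plain-Python model written for illustration, not the datastore's query planner; check_query and INEQUALITY_OPS are hypothetical names, filters are given as (property, operator) pairs, and sort orders as a list of property names.

```python
INEQUALITY_OPS = {"<", "<=", ">", ">=", "!="}


def check_query(filters, sort_orders):
    """Return a list of rule violations (empty if the query is allowed)."""
    problems = []
    inequality_props = []
    for prop, op in filters:
        if op in INEQUALITY_OPS and prop not in inequality_props:
            inequality_props.append(prop)

    # Rule 1: inequality filters are allowed on one property only.
    if len(inequality_props) > 1:
        problems.append("inequality filters on more than one property")

    # Rule 2: the inequality property must be the first sort order.
    if inequality_props and sort_orders:
        if sort_orders[0] != inequality_props[0]:
            problems.append("first sort order must be the inequality property")
    return problems


# Valid: both inequalities are on birth_year.
print(check_query([("birth_year", ">="), ("birth_year", "<=")], []))

# Invalid: sorted by last_name before birth_year.
print(check_query([("birth_year", ">=")], ["last_name", "birth_year"]))
```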

Sort Orders and Properties With Multiple Values

Due to the way properties with multiple values are indexed, the sort order for these properties is unusual:

• If the entities are sorted by a multi-valued property in ascending order, the value used for ordering is the smallest value.
• If the entities are sorted by a multi-valued property in descending order, the value used for ordering is the greatest value.
• Other values do not affect the sort order, nor does the number of values.
• In the case of a tie, the key of the entity is used as the tie-breaker.

This sort order has the unusual consequence that [1, 9] comes before [4, 5, 6, 7] in both ascending and descending order.
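The [1, 9] versus [4, 5, 6, 7] case can be reproduced with a short plain-Python model. This is a sketch only: sort_entities is a hypothetical helper, entities are modeled as dictionaries with a numeric key and a list of values, and breaking ties by ascending key in both directions is an assumption.

```python
def sort_entities(entities, descending=False):
    """Order entities by a multi-valued property (illustrative model)."""
    if descending:
        # Descending order compares the greatest value of each entity.
        return sorted(entities, key=lambda e: (-max(e["values"]), e["key"]))
    # Ascending order compares the smallest value of each entity.
    return sorted(entities, key=lambda e: (min(e["values"]), e["key"]))


a = {"key": 1, "values": [1, 9]}
b = {"key": 2, "values": [4, 5, 6, 7]}

ascending = [e["values"] for e in sort_entities([b, a])]
descending = [e["values"] for e in sort_entities([b, a], descending=True)]
print(ascending)
print(descending)
```

In ascending order the smallest values 1 and 4 are compared, and in descending order the greatest values 9 and 7 are compared, so [1, 9] comes first both times.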

Sort Orders Are Ignored on Properties With Equals Filters One important caveat is queries with both an equality filter and a sort order on a multivalued property. In those queries, the sort order is disregarded. For single-valued properties, this is a simple optimization. Every result would have the same value for the property, so the results do not need to be sorted further. However, multi-valued properties may have additional values. Since the sort order is disregarded, the query results may be returned in a different order than if the sort order were applied. (Restoring the dropped sort order would be expensive and require extra indices, and this use case is rare, so the query planner leaves it off.)

Only Ancestor Queries in Transactions Queries are only supported inside transactions if they include an ancestor filter. The query's ancestor must be in the same entity group as the other operations in the transaction. This preserves the restriction that a transaction can only operate on entities in a single entity group.

Big Entities and Exploding Indexes

As described above, every property (that doesn't have a Text or Blob value) of every entity is added to at least one index table, including a simple index provided by default, and any indexes described in the application's index.yaml file that refer to the property. For an entity that has one value for each property, App Engine stores a property value once in its simple index, and once for each time the property is referred to in a custom index. Each of these index entries must be updated every time the value of the property changes, so the more indexes that refer to the property, the more time it will take to update the property.

To prevent the update of an entity from taking too long, the datastore limits the number of index entries that a single entity can have. The limit is large, and most applications will not notice. However, there are some circumstances where you might encounter the limit. For example, an entity with very many single-value properties can exceed the index entry limit.

Properties with multiple values store each value as a separate entry in an index. An entity with a single property with very many values can exceed the index entry limit.

Custom indexes that refer to multiple properties with multiple values can get very large with only a few values. To completely record such properties, the index table must include a row for every permutation of the values of every property for the index.

For example, the following index (described in index.yaml syntax) includes the x and y properties for entities of the kind MyModel:

  indexes:
  - kind: MyModel
    properties:
    - name: x
    - name: y

The following code creates an entity with 2 values for the property x and 2 values for the property y:

  class MyModel(db.Expando):
    pass

  e2 = MyModel()
  e2.x = ['red', 'blue']
  e2.y = [1, 2]
  e2.put()
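As a rough check on how quickly such an index grows, the stored property values for an entity like this one can be counted in plain Python. This is only a sketch: count_index_values is an illustrative helper, and it assumes exactly one built-in index per property plus the single custom index on x and y shown above.

```python
from itertools import product

def count_index_values(x_values, y_values):
    """Count property values stored for one entity with list properties x and y."""
    # Built-in indexes: one entry per value of each property.
    builtin = len(x_values) + len(y_values)
    # Custom index on (x, y): one row per pairing of an x value with a y value.
    custom_rows = list(product(x_values, y_values))
    # Each custom row stores two property values, one for x and one for y.
    custom = 2 * len(custom_rows)
    return builtin + custom

small = count_index_values(['red', 'blue'], [1, 2])
large = count_index_values(list(range(10)), list(range(10)))
```

For the two-by-two entity this gives 12, the total the next paragraph arrives at. Ten values in each property already give 220, since the custom index grows with the product of the list lengths.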

To accurately represent these values, the index must store 12 property values: 2 each for the built-in indexes on x and y, and 2 for each of the 4 permutations of x and y in the custom index. With many values of multi-valued properties, this can mean an index must store very many index entries for a single entity. You could call an index that refers to multiple properties with multiple values an "exploding index," because it can get very large with just a few values.

If a put() would result in a number of index entries that exceeds the limit, the call will fail with an exception. If you create a new index that would contain a number of index entries that exceeds the limit for any entity when built, queries against the index will fail, and the index will appear in the "Error" state in the Admin Console.

To handle "Error" indexes, first remove them from your index.yaml file and run appcfg.py vacuum_indexes. Then, either reformulate the index definition and corresponding queries or remove the entities that are causing the index to "explode." Finally, add the index back to index.yaml and run appcfg.py update_indexes.

You can avoid exploding indexes by avoiding queries that would require a custom index using a list property. As described above, this includes queries with multiple

6. Transactions

The App Engine datastore supports transactions. A transaction is an operation or set of operations that either succeeds completely, or fails completely. An application can perform multiple operations and calculations in a single transaction.

• Using Transactions
• What Can Be Done In a Transaction
• Isolation and Consistency
• Uses For Transactions

Using Transactions

A transaction is a datastore operation or a set of datastore operations that either succeed completely, or fail completely. If the transaction succeeds, then all of its intended effects are applied to the datastore. If the transaction fails, then none of the effects are applied.

Every datastore write operation is atomic. An attempt to create, update or delete an entity either happens, or it doesn't. An operation may fail due to a high rate of contention, with too many users trying to modify an entity at the same time. Or an operation may fail due to the application reaching a quota limit. Or there may be an internal error with the datastore. In all cases, the operation's effects are not applied, and the datastore API raises an exception.

An application can execute a set of statements and datastore operations in a single transaction, such that if any statement or operation raises an exception, none of the datastore operations in the set are applied. The application defines the actions to perform in the transaction using a Python function, then calls db.run_in_transaction() with the function as an argument:

  from google.appengine.ext import db

  class Accumulator(db.Model):
    counter = db.IntegerProperty()

  def increment_counter(key, amount):
    obj = db.get(key)
    obj.counter += amount
    obj.put()

  q = db.GqlQuery("SELECT * FROM Accumulator")
  acc = q.get()

  db.run_in_transaction(increment_counter, acc.key(), 5)

db.run_in_transaction() takes the function object, and positional and keyword arguments to pass to the function. If the function returns a value, db.run_in_transaction() will return the value.

If the function returns, the transaction is committed, and all effects of datastore operations are applied. If the function raises an exception, the transaction is "rolled back," and the effects are not applied. If the function raises the Rollback exception, db.run_in_transaction() returns None. For any other exception, db.run_in_transaction() re-raises the exception.
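The commit and rollback behavior described above can be mimicked with a toy runner in plain Python. This is a sketch under stated assumptions, not the datastore implementation: Rollback and run_in_transaction here are local stand-ins for db.Rollback and db.run_in_transaction(), the committed list stands in for the datastore, and the pending list passed to the function is an invention of this example.

```python
class Rollback(Exception):
    """Local stand-in for db.Rollback."""

committed = []  # stand-in for data that has reached the datastore

def run_in_transaction(function, *args, **kwargs):
    """Toy runner: apply buffered writes only if the function returns."""
    pending = []  # writes are buffered here until commit
    try:
        result = function(pending, *args, **kwargs)
    except Rollback:
        return None  # rolled back quietly; pending writes are dropped
    # Any other exception propagates to the caller before this line,
    # so its pending writes are dropped as well.
    committed.extend(pending)
    return result

def save(pending, value):
    pending.append(value)
    return value

def save_then_rollback(pending, value):
    pending.append(value)
    raise Rollback()

def save_then_fail(pending, value):
    pending.append(value)
    raise ValueError("unexpected")

ok = run_in_transaction(save, "a")
rolled_back = run_in_transaction(save_then_rollback, "b")
try:
    run_in_transaction(save_then_fail, "c")
    reraised = False
except ValueError:
    reraised = True
```

Only the first call leaves anything in committed. The second returns None, and the third re-raises its exception to the caller.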

What Can Be Done In a Transaction The datastore imposes several restrictions on what can be done inside a single transaction. All datastore operations in a transaction must operate on entities in the same entity group. This includes querying for entities by ancestor, retrieving entities by key, updating entities, and deleting entities. Notice that each root entity belongs to a separate entity group, so a single transaction cannot create or operate on more than one root entity. For an explanation of entity groups, see Keys and Entity Groups. An app can perform a query during a transaction, but only if it includes an ancestor filter. An app can also get datastore entities by key during a transaction. You can prepare keys prior to the transaction, or you can build keys inside the transaction with key names or IDs. All other Python code is allowed inside a transaction function. The transaction function should not have side effects other than the datastore operations. The transaction function may be called multiple times if a datastore operation fails due to another user updating entities in the entity group at the same time. When this happens, the datastore API retries the transaction a fixed number of times. If they all fail, db.run_in_transaction() raises a TransactionFailedError. You can adjust the number of times the transaction is retried using db.run_in_transaction_custom_retries() instead of db.run_in_transaction(). Similarly, the transaction function should not have side effects that depend on the success of the transaction, unless the code that calls the transaction function knows to undo those effects. For example, if the transaction stores a new datastore entity, saves the created entity's ID for later use, then the transaction fails, the saved ID does not refer to the intended entity because the entity's creation was rolled back. The calling code would have to be careful not to use the saved ID in this case.
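The retry behavior is the reason side effects inside the transaction function are risky. The following plain-Python sketch shows the idea. ConcurrentUpdate, TransactionFailed and run_with_retries are local stand-ins invented for this example (the last loosely mirrors the argument order of db.run_in_transaction_custom_retries()), and the collision on the first two attempts is simulated.

```python
class ConcurrentUpdate(Exception):
    """Stand-in for a commit that lost a race with another writer."""

class TransactionFailed(Exception):
    """Stand-in for db.TransactionFailedError."""

def run_with_retries(retries, function, *args, **kwargs):
    """Call function until it commits or the retry budget is used up."""
    for _ in range(retries + 1):
        try:
            return function(*args, **kwargs)
        except ConcurrentUpdate:
            continue  # another writer got there first; try again
    raise TransactionFailed("gave up after %d retries" % retries)

calls = []

def flaky_increment(store, amount):
    calls.append(amount)          # side effect: happens on every attempt
    if len(calls) < 3:
        raise ConcurrentUpdate()  # pretend the first two commits collide
    store["counter"] += amount
    return store["counter"]

store = {"counter": 10}
result = run_with_retries(3, flaky_increment, store, 5)
```

The counter is incremented once, but the calls list records three attempts. Any side effect placed inside the function, such as sending an email, would be repeated the same way.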

Isolation and Consistency

The datastore's isolation level outside of transactions is closest to READ_COMMITTED. Inside transactions, on the other hand, the isolation level is SERIALIZABLE, specifically a form of snapshot isolation. See the Transaction Isolation article for more information on isolation levels.

Queries and gets inside a transaction are guaranteed to see a single, consistent snapshot of the datastore as of the beginning of the transaction. In particular, entities and index rows in the transaction's entity group are fully updated so that queries will return the complete, correct set of result entities, without the false positives or false negatives described in Transaction Isolation that can occur in queries outside transactions.

This consistent snapshot view also extends to reads after writes inside transactions. Unlike with most databases, queries and gets inside a datastore transaction do not see the results of previous writes inside that transaction. Specifically, if an entity is modified or deleted within a transaction, a query or get will return the original version of the entity as of the beginning of the transaction, or nothing if the entity did not exist then.
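The read-after-write behavior can be pictured with a toy model in plain Python. SnapshotTransaction is a hypothetical class written for this illustration only. It copies a dictionary at the start of the transaction and never consults its own buffered writes when reading, which is the behavior the paragraph above describes.

```python
import copy

class SnapshotTransaction(object):
    """Toy transaction whose reads come from a snapshot taken at the start."""

    def __init__(self, store):
        self._store = store
        self._snapshot = copy.deepcopy(store)  # view as of transaction start
        self._writes = {}                      # buffered until commit

    def get(self, key):
        # Reads never look at self._writes, only at the snapshot.
        return self._snapshot.get(key)

    def put(self, key, value):
        self._writes[key] = value

    def commit(self):
        self._store.update(self._writes)

store = {"acc": 1}
txn = SnapshotTransaction(store)
txn.put("acc", 2)
txn.put("new", 7)
seen_during = (txn.get("acc"), txn.get("new"))
txn.commit()
seen_after = (store["acc"], store["new"])
```

Inside the transaction the reads still return the original value of "acc" and nothing for "new". The writes become visible only after commit.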

Uses For Transactions

This example demonstrates one use of transactions: updating an entity with a new property value relative to its current value.

  def increment_counter(key, amount):
    obj = db.get(key)
    obj.counter += amount
    obj.put()

This requires a transaction because the value may be updated by another user after this code fetches the object, but before it saves the modified object. Without a transaction, the user's request will use the value of counter prior to the other user's update, and the save will overwrite the new value. With a transaction, the application is told about the other user's update. If the entity is updated during the transaction, then the transaction is retried until all steps are completed without interruption.

Another common use for transactions is to update an entity with a named key, or create it if it doesn't yet exist:

  class SalesAccount(db.Model):
    address = db.PostalAddressProperty()
    phone_number = db.PhoneNumberProperty()

  def create_or_update(parent_obj, account_id, address, phone_number):
    # Key.from_path() takes the parent as a Key, so pass parent_obj.key().
    obj = db.get(db.Key.from_path("SalesAccount", account_id,
                                  parent=parent_obj.key()))
    if not obj:
      obj = SalesAccount(key_name=account_id,
                         parent=parent_obj,
                         address=address,
                         phone_number=phone_number)
    else:
      obj.address = address
      obj.phone_number = phone_number

    obj.put()

As before, a transaction is necessary to handle the case where another user is attempting to create or update an entity with the same string ID. Without a transaction, if the entity does not exist and two users attempt to create it, the second will overwrite the first without knowing that it happened. With a transaction, the second attempt will retry, notice that the entity now exists, and update the entity instead.

Create-or-update is so useful that there is a built-in method for it: Model.get_or_insert() takes a key name, an optional parent, and arguments to pass to the model constructor if an entity of that name and path does not exist. The get attempt and the create happen in one transaction, so (if the transaction is successful) the method always returns a model instance that represents an actual entity.

Tip: A transaction should happen as quickly as possible to reduce the likelihood that the entities used by the transaction will change, requiring the transaction be retried. As much as possible, prepare data outside of the transaction, then execute the transaction to perform datastore operations that depend on a consistent state. The application should prepare keys for objects used inside the transaction, then fetch the entities inside the transaction.

Finally, a transaction can be used to read a consistent snapshot of the datastore. This can be useful when multiple gets are needed to render a page or export data that must be consistent. These kinds of transactions are often called read-only transactions, since they perform no writes. Committing and rolling back a read-only transaction are both no-ops.

  from google.appengine.api import users
  from google.appengine.ext import db

  class Customer(db.Model):
    user = db.UserProperty()

  class Account(db.Model):
    """An Account has a Customer as its parent."""
    address = db.PostalAddressProperty()
    balance = db.FloatProperty()

  def get_all_accounts():
    """Returns a consistent view of the current user's accounts."""
    accounts = []
    for customer in Customer.all().filter('user =', users.get_current_user()):
      accounts.extend(Account.all().ancestor(customer))
    return accounts

7. Types and Property Classes

The App Engine datastore supports a fixed set of value types for properties on data entities. Property classes can define new types that are converted to and from the underlying value types, and the value types can be used directly with Expando dynamic properties and ListProperty aggregate property models.

The following table describes the Property classes whose values correspond directly with the underlying data types. Any of these value types can be used in an Expando dynamic property or ListProperty aggregate type.

Property class                                  Value type                 Sort order
StringProperty                                  str, unicode               Unicode (str is treated as ASCII)
ByteStringProperty                              ByteString                 byte order
BooleanProperty                                 bool                       False < True
IntegerProperty                                 int, long (64 bits)        Numeric
FloatProperty                                   float                      Numeric
DateTimeProperty, DateProperty, TimeProperty    datetime.datetime          Chronological
ListProperty, StringListProperty                list of a supported type   If ascending, by least element; if descending, by greatest element
ReferenceProperty, SelfReferenceProperty        db.Key                     By path elements (kind, ID or name, kind, ID or name...)
UserProperty                                    users.User                 By email address (Unicode)
BlobProperty                                    db.Blob                    (not orderable)
TextProperty                                    db.Text                    (not orderable)
CategoryProperty                                db.Category                Unicode
LinkProperty                                    db.Link                    Unicode
EmailProperty                                   db.Email                   Unicode
GeoPtProperty                                   db.GeoPt                   By latitude, then longitude
IMProperty                                      db.IM                      Unicode
PhoneNumberProperty                             db.PhoneNumber             Unicode
PostalAddressProperty                           db.PostalAddress           Unicode
RatingProperty                                  db.Rating                  Numeric

Datastore Value Types Datastore entity property values can be of one of the following types. See above for a list of corresponding Property classes to use with Model definitions.

Other than the Python standard types and users.User, all classes described in this section are provided by the google.appengine.ext.db module.

str or unicode
A short string value, less than 500 bytes in length. A str value is assumed to be text encoded with the ascii codec, and is converted to a unicode value before being stored. The value is returned by the datastore as a unicode value. For short strings using other codecs, use a unicode value. Short strings are indexed by the datastore, and can be used in filters and sort orders. For text strings longer than 500 bytes (which are not indexed), use a Text instance. For unencoded byte strings longer than 500 bytes (also not indexed), use a Blob instance.

Model property: StringProperty

bool
A Boolean value, True or False.

Model property: BooleanProperty

int or long
An integer value, up to 64 bits. Python int values are converted to Python long values prior to storage. A value stored as an int will be returned as a long. If a long larger than 64 bits is assigned, only the least significant 64 bits are stored.

Model property: IntegerProperty

float
A floating point value.

Model property: FloatProperty

datetime.datetime
A date and time. See the datetime module documentation. If the datetime value has a tzinfo attribute, it will be converted to the UTC time zone for storage. Values come back from the datastore as UTC, with a tzinfo of None. An application that needs date and time values to be in a particular time zone must set tzinfo correctly when updating the value, and convert values to the timezone when accessing the value.

Some libraries use the TZ environment variable to control the time zone applied to date-time values. App Engine sets this environment variable to "UTC". Note that changing this variable in an application will not change the behavior of some datetime functions, because changes to environment variables are not visible outside of the Python code.

If you only convert values to and from a particular time zone, you can implement a custom datetime.tzinfo to convert values from the datastore:

  import datetime as datetime_module

  class Pacific_tzinfo(datetime_module.tzinfo):
    """Implementation of the Pacific timezone."""
    def utcoffset(self, dt):
      return datetime_module.timedelta(hours=-8) + self.dst(dt)

    def _FirstSunday(self, dt):
      """First Sunday on or after dt."""
      return dt + datetime_module.timedelta(days=(6-dt.weekday()))

    def dst(self, dt):
      # 2 am on the second Sunday in March
      dst_start = self._FirstSunday(datetime_module.datetime(dt.year, 3, 8, 2))
      # 1 am on the first Sunday in November
      dst_end = self._FirstSunday(datetime_module.datetime(dt.year, 11, 1, 1))

      if dst_start <= dt.replace(tzinfo=None) < dst_end:
        return datetime_module.timedelta(hours=1)
      else:
        return datetime_module.timedelta(hours=0)

    def tzname(self, dt):
      if self.dst(dt) == datetime_module.timedelta(hours=0):
        return "PST"
      else:
        return "PDT"

  # astimezone() needs utc_time to carry a UTC tzinfo. Values read from the
  # datastore have tzinfo=None, so attach a UTC tzinfo to them first.
  pacific_time = utc_time.astimezone(Pacific_tzinfo())

See the datetime module documentation (including datetime.tzinfo). See also the third-party module pytz, though note that the pytz distribution has many files. The DateTimeProperty model property class includes features such as the ability to automatically use the date and time a model instance is stored. These are features of the model, and are not available on the raw datastore value (such as in an Expando dynamic property).

Model properties: DateTimeProperty, DateProperty, TimeProperty

list
A list of values, each of which is of one of the supported data types. See Entities and Models: Lists. When a list is used as the value of an Expando dynamic property, it cannot be an empty list. This is due to how list values are stored: When a list property has no items, it has no representation in the datastore. You can use a static property and the ListProperty class to represent an empty list value for a property.

Model property: ListProperty

db.Key
The key for another datastore entity.

  m = Employee(name="Susan", key_name="susan5")
  m.put()
  e = Employee(name="Bob", manager=m.key())
  e.put()

  m_key = db.Key.from_path("Employee", "susan5")
  e = Employee(name="Jennifer", manager=m_key)

Model properties: ReferenceProperty, SelfReferenceProperty users.User A user with a Google account. A User value in the datastore does not get updated if the user changes her email address. This may be remedied in a future release. Until then, you can use the User value's user_id() as the user's stable unique identifier.

Model property: UserProperty

class Blob(arg=None)
Binary data, as a byte string. This is a subclass of the built-in str type. Blob properties are not indexed, and cannot be used in filters or sort orders. Blob is for binary data, such as images. It takes a str value, but this value is stored as a byte string and is not encoded as text. Use a Text instance for large text data.

Model property: BlobProperty

  class MyModel(db.Model):
    blob = db.BlobProperty()

  m = MyModel()
  # Open the file in binary mode so the bytes are read unchanged.
  m.blob = db.Blob(open("image.png", "rb").read())

In XML, blobs are base-64 encoded whether or not they contain binary data. class ByteString(arg) A short blob value (a "byte string"), less than 500 bytes in length. ByteString is a subclass of str, and takes an unencoded str value as an argument to its constructor.

ByteStrings are indexed by the datastore, and can be used in filters and sort orders. For byte strings longer than 500 bytes (which are not indexed), use a Blob instance. For encoded text data, use str (short, indexed) or Text (long, not indexed).

Model property: ByteStringProperty

class Text(arg=None, encoding=None)
A long string. This is a subclass of the built-in unicode type. arg is a unicode or str value. If arg is a str, then it is parsed with the encoding specified by encoding, or ascii if no encoding is specified. See the list of standard encodings for possible values for encoding. Unlike an entity property whose value is a simple str or unicode, a Text property can be more than 500 bytes long. However, Text properties are not indexed, and cannot be used in filters or sort orders.

Model property: TextProperty

  class MyModel(db.Model):
    text = db.TextProperty()

  m = MyModel()
  m.text = db.Text(u"kittens")

  m.text = db.Text("kittens", encoding="latin-1")

class Category(tag) A category or "tag". This is a subclass of the built-in unicode type.

Model property: CategoryProperty

  class MyModel(db.Model):
    category = db.CategoryProperty()

  m = MyModel()
  m.category = db.Category("kittens")

In XML, this is an Atom category element. See the Atom specification. class Email(email) An email address. This is a subclass of the built-in unicode type. Neither the property class nor the value class perform validation of email addresses, they just store the value.

Model property: EmailProperty

  class MyModel(db.Model):
    email_address = db.EmailProperty()

  m = MyModel()
  m.email_address = db.Email("[email protected]")

In XML, this is a gd:email element. See the GData API reference. class GeoPt(lat, lon=None) A geographical point represented by floating-point latitude and longitude coordinates.

Model property: GeoPtProperty

In XML, this is a georss:point element. See georss.org.

class IM(protocol, address=None)
An instant messaging handle. protocol is the canonical URL of the instant messaging service. Some possible values:

Protocol                       Description
sip                            SIP/SIMPLE
xmpp                           XMPP/Jabber
http://aim.com/                AIM
http://icq.com/                ICQ
http://talk.google.com/        Google Talk
http://messenger.msn.com/      MSN Messenger
http://messenger.yahoo.com/    Yahoo Messenger
http://sametime.com/           Lotus Sametime
http://gadu-gadu.pl/           Gadu-Gadu
unknown                        Unknown or unspecified

address is the handle's address.

Model property: IMProperty

  class MyModel(db.Model):
    im = db.IMProperty()

  m = MyModel()
  m.im = db.IM("http://example.com/", "Larry97")

In XML, this is a gd:im element. See the GData API reference. class Link(link) A fully qualified URL. This is a subclass of the built-in unicode type.

Model property: LinkProperty

  class MyModel(db.Model):
    link = db.LinkProperty()

  m = MyModel()
  m.link = db.Link("http://www.google.com/")

In XML, this is an Atom link element. See the Atom specification. class PhoneNumber(phone) A human-readable telephone number. This is a subclass of the built-in unicode type.

Model property: PhoneNumberProperty

  class MyModel(db.Model):
    phone = db.PhoneNumberProperty()

  m = MyModel()
  m.phone = db.PhoneNumber("1 (206) 555-1212")

In XML, this is a gd:phoneNumber element. See the GData API reference.

class PostalAddress(address)
A postal address. This is a subclass of the built-in unicode type.

Model property: PostalAddressProperty

  class MyModel(db.Model):
    address = db.PostalAddressProperty()

  m = MyModel()
  m.address = db.PostalAddress("1600 Amphitheatre Pkwy., Mountain View, CA")

In XML, this is a gd:postalAddress element. See the GData API reference. class Rating(rating) A user-provided rating for a piece of content, as an integer between 0 and 100. This is a subclass of the built-in long type. The class validates that the value is an integer between 0 and 100, and raises a BadValueError if the value is invalid.

Model property: RatingProperty

  class MyModel(db.Model):
    rating = db.RatingProperty()

  m = MyModel()
  m.rating = db.Rating(97)

In XML, this is a gd:rating element. See the GData API reference.

Property Classes All model property classes provided by google.appengine.ext.db are subclasses of the Property base class, and support all of the base constructor's arguments. See the base class documentation for information about those arguments. The google.appengine.ext.db package provides the following model property classes: class BlobProperty(...) A binary data property. Blob data is a byte string. For text data, which may involve encoding, use TextProperty.

Value type: Blob class BooleanProperty(...) A Boolean property.

Value type: bool class ByteStringProperty(verbose_name=None, ...) A short blob property (a "byte string"). Takes a ByteString value of 500 bytes or less. ByteStringProperty property values are indexed, and can be used in filters and sort orders. Like StringProperty, except the value is not encoded in any way. The bytes are stored literally. If the ByteStringProperty is required, the value cannot be an empty string.

Value type: ByteString class CategoryProperty(...) A category or "tag," a descriptive word or phrase.

Value type: Category

class DateProperty(verbose_name=None, auto_now=False, auto_now_add=False, ...)
A date property, without a time of day. See DateTimeProperty for more information.

Value type: datetime.date. This is converted to a datetime.datetime internally.

class DateTimeProperty(verbose_name=None, auto_now=False, auto_now_add=False, ...) A date and time property. If auto_now is True, the property value is set to the current time whenever the model instance is stored in the datastore, overwriting the property's previous value. This is useful for tracking a "last modified" date and time for a model instance. If auto_now_add is True, the property value is set to the current time the first time the model instance is stored in the datastore, unless the property has already been assigned a value. This is useful for storing a "created" date and time for a model instance. Date-time values are stored as and returned using the UTC time zone. See datetime.datetime for a discussion of how to manage time zones.
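The difference between auto_now and auto_now_add can be summarized in a short plain-Python sketch. value_for_put is a hypothetical helper, not part of the db API, and it approximates "the first time the instance is stored" by checking whether the property still has no value.

```python
import datetime

def value_for_put(current, now, auto_now=False, auto_now_add=False):
    """Return the value that would be stored for a date-time property."""
    if auto_now:
        # Always overwritten: useful for a "last modified" timestamp.
        return now
    if auto_now_add and current is None:
        # Filled in once, and only if nothing was assigned: a "created" timestamp.
        return now
    return current

created_at = datetime.datetime(2010, 1, 1, 12, 0)
later = datetime.datetime(2010, 6, 1, 9, 30)

modified = value_for_put(created_at, later, auto_now=True)
created = value_for_put(created_at, later, auto_now_add=True)
first_save = value_for_put(None, later, auto_now_add=True)
```

Here modified becomes the later time, created keeps its original value, and first_save is filled in because no value had been assigned.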

Value type: datetime.datetime class EmailProperty(...) An email address. Neither the property class nor the value class perform validation of email addresses, they just store the value.

Value type: Email class FloatProperty(...) A floating point number property.

Value type: float class GeoPtProperty(...) A geographical point represented by floating-point latitude and longitude coordinates.

Value type: GeoPt class IMProperty(...) An instant messaging handle.

Value type: IM class IntegerProperty(...) An integer property, up to 64 bits.

Python int values are converted to Python long values prior to storage. A value stored as an int will be returned as a long. If a long larger than 64 bits is assigned, only the least significant 64 bits are stored.
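The statement that only the least significant 64 bits are stored can be illustrated with a bit mask. This is a sketch in plain Python, not datastore code. low_64_bits and as_signed_64 are illustrative helpers, and reading the stored bits back as a signed two's-complement value is an assumption that the text above does not confirm.

```python
MASK_64 = (1 << 64) - 1

def low_64_bits(value):
    """Keep only the least significant 64 bits of an integer."""
    return value & MASK_64

def as_signed_64(bits):
    """Interpret 64 bits as a signed value (assumed two's complement)."""
    return bits - (1 << 64) if bits >= (1 << 63) else bits

too_big = (1 << 64) + 5
kept = low_64_bits(too_big)
wrapped = as_signed_64(low_64_bits(1 << 63))
```

The oversized value is reduced to 5, because the bit above position 63 is discarded. Under the signed assumption, 2 to the power 63 would read back as the most negative 64-bit value.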

Value type: int or long class LinkProperty(...) A fully qualified URL.

Value type: Link class ListProperty(item_type, verbose_name=None, default=None, ...) A list of values of the type given as item_type. In a query, comparing a list property to a value performs the test against the list members: list_property = value tests if the value appears anywhere in the list, list_property < value tests if any of the members of the list are less than the given value, and so forth. A query cannot compare two list values. There is no way to test two lists for equality without testing each element for membership separately. item_type is the type of the items in the list, as a Python type or class. All items in the list value must be of the given type. item_type must be one of the datastore value types, and cannot be list. See Datastore Value Types, above. The value of a ListProperty cannot be None. It can, however, be an empty list. When None is specified for the default argument (or when the default argument is not specified), the default value of the property is the empty list. Tip: Because ListProperty aggregate types do not use the Property classes, Property class features such as automatic values and validation are not applied automatically to members of the list value. If you want to validate a member value using a Property class, you can instantiate the class and call its validate() method on the value. default is the default value for the list property. If None, the default is an empty list. A list property can define a custom validator to disallow the empty list. See Entities and Models for more information on ListProperty and list values.
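The way a query filter is tested against list members can be sketched in plain Python. list_matches is a hypothetical helper that imitates the described semantics for a single filter. It does not run a datastore query.

```python
import operator

# Comparison operators a filter might use, mapped to Python functions.
OPS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

def list_matches(values, op, operand):
    """True if any member of the list satisfies the comparison."""
    compare = OPS[op]
    return any(compare(value, operand) for value in values)

has_five = list_matches([1, 5, 9], "=", 5)
any_below_two = list_matches([1, 5, 9], "<", 2)
none_below_two = list_matches([4, 5], "<", 2)
empty_never_matches = list_matches([], "=", 5)
```

Because each filter is satisfied by any single member, there is no way to express "the whole list equals this list" with one comparison, which matches the restriction described above.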

Value type: a Python list of zero or more values, where each value is of the configured type class PhoneNumberProperty(...) A human-readable telephone number.

Value type: PhoneNumber class PostalAddressProperty(...) A postal address.

Value type: PostalAddress class RatingProperty() A user-provided rating for a piece of content, as an integer between 0 and 100.

Value type: Rating

class ReferenceProperty(reference_class=None, verbose_name=None, collection_name=None, ...)
A reference to another model instance. For example, a reference may indicate a many-to-one relationship between the model with the property and the model referenced by the property.

reference_class is the model class of the model instance being referenced. If specified, only model instances of the class can be assigned to this property. If None, any model instance can be the value of this property.

collection_name is the name of the property to give to the referenced model class whose value is a Query for all entities that reference the entity. If no collection_name is set, then modelname_set (with the name of the model in lowercase letters and "_set" added) is used.

Note: collection_name must be set if there are multiple properties within the same model referencing the same model class. Otherwise, a DuplicatePropertyError will be raised when the default names are generated.

ReferenceProperty automatically references and dereferences model instances as property values: A model instance can be assigned to a ReferenceProperty directly, and its key will be used. The ReferenceProperty value can be used as if it were a model instance, and the datastore entity will be fetched and the model instance created when it is first used in this way. Untouched reference properties do not query for unneeded data.

  class Author(db.Model):
    name = db.StringProperty()

  class Story(db.Model):
    author = db.ReferenceProperty(Author)

  story = db.get(story_key)
  author_name = story.author.name

As with a Key value, it is possible for a reference property value to refer to a data entity that does not exist. If a referenced entity is deleted from the datastore, references to the entity are not updated. An application can explicitly db.get() the value of a ReferenceProperty (which is a Key) to test whether the referenced entity exists. Deleting an entity does not delete entities referred to by a ReferenceProperty. See also this introduction to reference properties.

Value type: db.Key (see above)

class SelfReferenceProperty(verbose_name=None, collection_name=None, ...)

A reference to another model instance of the same class. See ReferenceProperty.

Value type: db.Key (see above)

class StringListProperty(verbose_name=None, default=None, ...)

Similar to a ListProperty of Python str or unicode (basestring) values. See ListProperty.

Value type: a Python list of str or unicode values

class StringProperty(verbose_name=None, multiline=False, ...)

A short string property. Takes a Python str or unicode (basestring) value of 500 bytes or less. StringProperty property values are indexed, and can be used in filters and sort orders.

If multiline is False, then the value cannot include linefeed characters. The djangoforms library uses this to enforce a difference between text fields and textarea fields in the data model, and others can use it for a similar purpose.

If the StringProperty is required, the value cannot be an empty string.

Value type: str or unicode

class TextProperty()

A long string. Unlike StringProperty, a TextProperty value can be more than 500 bytes long. However, TextProperty values are not indexed, and cannot be used in filters or sort orders.

TextProperty values store text with a text encoding. For binary data, use BlobProperty.

If the TextProperty is required, the value cannot be an empty string.

Value type: Text

class TimeProperty(verbose_name=None, auto_now=False, auto_now_add=False, ...)

A time property, without a date. Takes a Python standard library datetime.time value. See DateTimeProperty for more information.

Value type: datetime.time. This is converted to a datetime.datetime internally.

class UserProperty(verbose_name=None, auto_current_user=False, auto_current_user_add=False, ...)

A user with a Google account.

If auto_current_user is True, the property value is set to the currently signed-in user whenever the model instance is stored in the datastore, overwriting the property's previous value. This is useful for tracking which user modifies a model instance.

If auto_current_user_add is True, the property value is set to the currently signed-in user the first time the model instance is stored in the datastore, unless the property has already been assigned a value. This is useful for tracking which user creates a model instance, which may not be the same user that modifies it later.

UserProperty does not accept a default value. Default values are set when the model class is first imported, and with import caching may not be the currently signed-in user.

Value type: users.User (see above)
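The difference between auto_current_user and auto_current_user_add is easiest to see side by side. The sketch below is a plain-Python model of the rule stated above, not the library's code: user_to_store is a hypothetical function, users are represented as plain strings, and "first time stored" is approximated as "no value assigned yet".

```python
def user_to_store(previous_value, signed_in_user,
                  auto_current_user=False, auto_current_user_add=False):
    """Return the value a toy UserProperty would write when saved.

    previous_value is the property's current value (None if unset);
    signed_in_user is whoever is signed in at save time.
    """
    if auto_current_user:
        # Always overwritten: tracks the most recent modifier.
        return signed_in_user
    if auto_current_user_add and previous_value is None:
        # Filled in only when nothing was assigned: tracks the creator.
        return signed_in_user
    return previous_value
```

A model that needs both a creator and a last modifier would typically declare two properties, one with each flag.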

8. GQL Reference

GQL is a SQL-like language for retrieving entities or keys from the App Engine scalable datastore. While GQL's features are different from those of a query language for a traditional relational database, the GQL syntax is similar to that of SQL.

The GQL syntax can be summarized as follows:

    SELECT [* | __key__] FROM <kind>
        [WHERE <condition> [AND <condition> ...]]
        [ORDER BY <property> [ASC | DESC] [, <property> [ASC | DESC] ...]]
        [LIMIT [<offset>,]<count>]
        [OFFSET <offset>]

      <condition> := <property> {< | <= | > | >= | = | != } <value>
      <condition> := <property> IN <list>
      <condition> := ANCESTOR IS <entity or key>

As with SQL, GQL keywords are case insensitive. Kind and property names are case sensitive.

A GQL query returns zero or more entities or Keys of the requested kind. Every GQL query always begins with either SELECT * FROM or SELECT __key__ FROM, followed by the name of the kind. (A GQL query cannot perform a SQL-like "join" query.)

Tip: SELECT __key__ queries are faster and cost less CPU than SELECT * queries.

The optional WHERE clause filters the result set to those entities that meet one or more conditions. Each condition compares a property of the entity with a value using a comparison operator. If multiple conditions are given with the AND keyword, then an entity must meet all of the conditions to be returned by the query. GQL does not have an OR operator. However, it does have an IN operator, which provides a limited form of OR.

The IN operator compares the value of a property to each item in a list. The IN operator is equivalent to many = queries, one for each value, that are ORed together. An entity whose value for the given property equals any of the values in the list can be returned for the query.

Note: The IN and != operators use multiple queries behind the scenes. For example, the IN operator executes a separate underlying datastore query for every item in the list. The entities returned are a result of the cross-product of all the underlying datastore queries and are de-duplicated. A maximum of 30 datastore queries are allowed for any single GQL query.

A condition can also test whether an entity has a given entity as an ancestor, using the ANCESTOR IS operator. The value is a model instance or Key for the ancestor entity. For more information on ancestors, see Keys and Entity Groups.

The left-hand side of a comparison is always a property name. The right-hand side can be one of the following (as appropriate for the property's data type):

• a str literal, as a single-quoted string. Single-quote characters in the string must be escaped as ''. For example: 'Joe''s Diner'
• an integer or floating point number literal. For example: 42.7
• a Boolean literal, as TRUE or FALSE.
• the NULL literal, which represents the null value (None in Python).
• a datetime, date, or time literal, with either numeric values or a string representation, in the following forms:
  o DATETIME(year, month, day, hour, minute, second)
  o DATETIME('YYYY-MM-DD HH:MM:SS')
  o DATE(year, month, day)
  o DATE('YYYY-MM-DD')
  o TIME(hour, minute, second)
  o TIME('HH:MM:SS')
• an entity key literal, with either a string-encoded key or a complete path of kinds and key names/IDs:
  o KEY('encoded key')
  o KEY('kind', 'name'/ID [, 'kind', 'name'/ID...])
• a User object literal, with the user's email address: USER('email-address')
• a GeoPt literal, with the latitude and longitude as floating point values: GEOPT(lat, long)
• a bound parameter value. In the query string, positional parameters are referenced by number: title = :1 Keyword parameters are referenced by name: title = :mytitle
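To make the literal forms above concrete, here is a small sketch that renders a few Python values in the corresponding GQL literal syntax. gql_literal is a hypothetical helper written for illustration and is not part of the App Engine SDK; it covers only strings, numbers, Booleans, None, and datetime values. In application code, bound parameters are usually the safer way to pass values into a query, so treat this only as a way to read the syntax.

```python
import datetime


def gql_literal(value):
    """Render a Python value as a GQL literal string (hypothetical helper)."""
    if value is None:
        return 'NULL'
    # bool must be tested before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return 'TRUE' if value else 'FALSE'
    if isinstance(value, (int, float)):
        return repr(value)
    # datetime must be tested before date: datetime is a subclass of date.
    if isinstance(value, datetime.datetime):
        return "DATETIME('%s')" % value.strftime('%Y-%m-%d %H:%M:%S')
    if isinstance(value, datetime.date):
        return "DATE('%s')" % value.strftime('%Y-%m-%d')
    if isinstance(value, datetime.time):
        return "TIME('%s')" % value.strftime('%H:%M:%S')
    if isinstance(value, str):
        # A single quote inside the string is escaped by doubling it.
        return "'%s'" % value.replace("'", "''")
    raise TypeError('no GQL literal form for %r' % (value,))
```

For example, gql_literal("Joe's Diner") yields 'Joe''s Diner' including the surrounding quotes, matching the escaping rule in the first bullet.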

Note: conditions of the form property = NULL check to see whether a null value is explicitly stored in the datastore for that property. This is not the same as checking to see if the entity lacks any value for the property! Datastore queries which refer to a property never return entities which don't have some value for that property.

Bound parameters can be bound as positional arguments or keyword arguments passed to the GqlQuery constructor or a Model class's gql() method. Property data types that do not have corresponding value literal syntax must be specified using parameter binding, including the list data type. Parameter bindings can be re-bound with new values during the lifetime of the GqlQuery instance (such as to efficiently reuse a query) using the bind() method.

The optional ORDER BY clause indicates that results should be returned sorted by the given properties, in either ascending (ASC) or descending (DESC) order. If the direction is not specified, it defaults to ASC. The ORDER BY clause can specify multiple sort orders as a comma-delimited list, evaluated from left to right.

An optional LIMIT clause causes the query to stop returning results after the first count entities. The LIMIT can also include an offset to skip that many results to find the first result to return. An optional OFFSET clause can specify an offset if no LIMIT clause is present.

Note: A LIMIT clause has a maximum of 1000. If a limit larger than the maximum is specified, the maximum is used. This same maximum applies to the fetch() method of the GqlQuery class.

Note: Like the offset parameter for the fetch() method, an OFFSET in a GQL query string does not reduce the number of entities fetched from the datastore. It only affects which results are returned by the fetch() method. The cost of a query with an offset grows linearly with the offset size.

For information on executing GQL queries, binding parameters, and accessing results, see the GqlQuery class, and the Model.gql() class method.
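The two notes on LIMIT and OFFSET can be illustrated with a toy model. The sketch below is not datastore code: toy_fetch and MAX_LIMIT are hypothetical names, and the query results are modeled as an already sorted Python list. It shows that skipped results are still read, so the work done grows with the offset even though fewer results are returned.

```python
MAX_LIMIT = 1000  # maximum stated in the note above


def toy_fetch(results, limit, offset=0):
    """Model fetch(limit, offset) over an already ordered list of results.

    Returns (returned, scanned): the items handed back to the caller and
    the number of items the toy datastore had to read to produce them.
    """
    # A limit above the maximum is silently reduced to the maximum.
    limit = min(limit, MAX_LIMIT)
    # Skipped items are still read; the offset only hides them afterwards.
    scanned = results[:offset + limit]
    returned = scanned[offset:]
    return returned, len(scanned)
```

With 50 results, a limit of 10 and an offset of 20, this model returns 10 items but reads 30.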

9. Datastore Statistics in Python

The datastore maintains statistics about the data stored for an application, such as how many entities there are of a given kind, or how much space is used by property values of a given type. You can view these statistics in the Administration Console, under Datastore > Statistics.

You can also access these values programmatically within the application by querying for specially named entities using the datastore API. Each statistic is accessible as an entity whose kind name begins and ends with two underscores. For example, each app has exactly one entity of the kind __Stat_Total__ that represents statistics about all of the entities in the datastore in total. Each statistic entity has the following properties:

• count, the number of items considered by the statistic (a long integer)
• bytes, the total size of the items for this statistic (a long integer)
• timestamp, the time of the most recent update to the statistic (a date-time value)

Some statistic kinds also have additional properties, listed below.

A Python application can use model classes provided by the package google.appengine.ext.db.stats to access statistic entities.

    from google.appengine.ext.db import stats

    global_stat = stats.GlobalStat.all().get()
    print 'Total bytes stored: %d' % global_stat.bytes
    print 'Total entities stored: %d' % global_stat.count

When the statistics system creates new statistic entities, it does not delete the old ones right away. The best way to get a consistent view of the statistics is to query for the GlobalStat entity with the most recent timestamp, then use that timestamp value as a filter when fetching other statistic entities.

The statistic entities are included in the calculated statistic values. Statistic entities take up space relative to the number of unique kinds and property names used by the application.

The complete list of available statistics is as follows:

Statistic: all entities
Stat Entity Kind: __Stat_Total__
Python class: GlobalStat
Description: All entities.

Statistic: entities of a kind
Stat Entity Kind: __Stat_Kind__
Python class: KindStat
Description: Entities of a kind; one stat entity for each kind of entity stored.
Additional properties:
• kind_name, the name of the kind represented (a string)

Statistic: root entities of a kind
Stat Entity Kind: __Stat_Kind_IsRootEntity__
Python class: KindRootEntityStat
Description: Entities of a kind that are entity group root entities (have no ancestor parent); one stat entity for each kind of entity stored.
Additional properties:
• kind_name, the name of the kind represented (a string)

Statistic: non-root entities of a kind
Stat Entity Kind: __Stat_Kind_NotRootEntity__
Python class: KindNotRootEntityStat
Description: Entities of a kind that are not entity group root entities (have an ancestor parent); one stat entity for each kind of entity stored.
Additional properties:
• kind_name, the name of the kind represented (a string)

Statistic: properties of a type
Stat Entity Kind: __Stat_PropertyType__
Python class: PropertyTypeStat
Description: Properties of a value type across all entities; one stat entity per value type.
Additional properties:
• property_type, the name of the value type (a string)

Statistic: properties of a type per kind
Stat Entity Kind: __Stat_PropertyType_Kind__
Python class: KindPropertyTypeStat
Description: Properties of a value type across entities of a given kind; one stat entity per combination of property type and kind.
Additional properties:
• property_type, the name of the value type (a string)
• kind_name, the name of the kind represented (a string)

Statistic: properties with a name
Stat Entity Kind: __Stat_PropertyName_Kind__
Python class: KindPropertyNameStat
Description: Properties with a given name across entities of a given kind; one stat entity per combination of unique property name and kind.
Additional properties:
• property_name, the name of the property (a string)
• kind_name, the name of the kind represented (a string)

Statistic: properties of a type and with a name
Stat Entity Kind: __Stat_PropertyType_PropertyName_Kind__
Python class: KindPropertyNamePropertyTypeStat
Description: Properties with a given name and of a given value type across entities of a given kind; one stat entity per combination of property name, value type and kind that exists in the datastore.
Additional properties:
• property_type, the name of the value type (a string)
• property_name, the name of the property (a string)
• kind_name, the name of the kind represented (a string)
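The advice earlier in this section, to take the most recent GlobalStat timestamp and use it to filter the other statistic entities, can be sketched in plain Python. This is only a model of the selection logic under stated assumptions: StatRecord and latest_consistent_stats are hypothetical names, and the input lists stand in for the results of querying the statistic kinds. With the real API the timestamp comparison would be expressed as a query filter rather than a list comprehension.

```python
import collections

# Hypothetical stand-in for a statistic entity; the real model classes are
# provided by google.appengine.ext.db.stats.
StatRecord = collections.namedtuple(
    'StatRecord', ['kind_name', 'count', 'bytes', 'timestamp'])


def latest_consistent_stats(global_stats, kind_stats):
    """Pick the newest global stat, then keep kind stats from that update.

    Old statistic entities may linger, so only records whose timestamp
    equals the newest global timestamp are treated as one consistent view.
    """
    if not global_stats:
        return None, []
    newest = max(global_stats, key=lambda stat: stat.timestamp)
    matching = [stat for stat in kind_stats
                if stat.timestamp == newest.timestamp]
    return newest, matching
```

Any comparable value works as the timestamp in this model; the real statistic entities carry a date-time value.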

Some statistics refer to datastore property value types by name, as strings. These names are as follows:

• "Blob"
• "Boolean"
• "ByteString"
• "Category"
• "Date/Time"
• "Email"
• "Float"
• "GeoPt"
• "Integer"
• "Key"
• "Link"
• "NULL"
• "PhoneNumber"
• "PostalAddress"
• "Rating"
• "String"
• "Text"
• "User"

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 3.0 License, and code samples are licensed under the Apache 2.0 License. ©2009 Google
