Getting Started - Essential SQLAlchemy, 2nd Edition (2010)

Essential SQLAlchemy, 2nd Edition (2010)

Chapter 2. Getting Started

This chapter guides you through installing version 0.4 of SQLAlchemy (the version documented by this book) via EasyInstall. It will also give you a quick tutorial on the basic features of SQLAlchemy to “get your hands dirty” as soon as possible.

Installing SQLAlchemy

In order to use SQLAlchemy, you need to install the SQLAlchemy package as well as a Python database driver for your database. This section will guide you through installing both.

Installing the SQLAlchemy Package

Installing the SQLAlchemy is a straightforward process involving the widely used SetupTools package.

Installing setup tools

SQLAlchemy is distributed as an EGG file via the Python package index (PyPI), also known as the CheeseShop. If you have installed EGGs before using easy_install, you can skip to the next section. Otherwise, you will need to install SetupTools, a package that enhances the Python standard library-provided distutils package.

Note

SetupTools includes a tool called easy_install, which can be used to install various Python modules from the CheeseShop. easy_install is particularly good at resolving dependencies between Python packages and installing a package’s dependencies along with the package itself. If you intend to take advantage of the rich library of free software available in the CheeseShop, or if you intend to take advantage of the benefits of distributing your own code through SetupTools, it is a good idea to become familiar with all its features. You can find more documentation on SetupTools at http://peak.telecommunity.com/DevCenter/EasyInstall.

To install SetupTools, first download the bootstrap script ez_setup.py from http://peak.telecommunity.com/dist/ez_setup.py. You will then need to run the script to download the rest of SetupTools.

Note

In Windows, you must make certain that you have administrator privileges before running easy_install or ez_setup.py, as both of these scripts modify your Python site-packages directory.

In Windows, it’s also a good idea to make sure that Python and your Python scripts directories are on your path. In the default Python installation, these directories are c:\python25 and c:\python25\scripts.

In Unix-like systems, including Linux, BSD, and OS X, you can install SetupTools as follows:

$ sudo python ez_setup.py

In Windows, you will need to open a command prompt and run the bootstrap script as follows:

c:\>python ez_setup.py

Once you have installed SetupTools using ez_setup, you are ready to install SQLAlchemy.

Installing SQLAlchemy with easy_install

To install SQLAlchemy using easy_install on a Unix-like system, simply type the following:

$ sudo easy_install -UZ SQLAlchemy

On Windows, the corresponding command is as follows (as long as your scripts directory, generally c:\python25\scripts, is on your path):

c:\>easy_install -UZ SQLAlchemy

This will download and install SQLAlchemy to your Python site-packages directory. If you wish to install a particular version of SQLAlchemy, add a version specifier to the easy_install command line. In Unix, this would be:

$ sudo easy_install -UZ SQLAlchemy==0.4.1

In Windows, the command is similar:

c:\>easy_install -UZ SQLAlchemy==0.4.1

Note

Python EGGs are typically distributed and installed as ZIP files. Although this is convenient for distribution, it is often nice to see the actual source code. easy_install includes an option to specify that the EGG should be unzipped. The -UZ options as shown specify thatSQLAlchemy should be Updated if already installed and should not be Zipped. If you are installing SQLAlchemy for the first time, you can leave off the -U, and if you don’t care to look at the source code, you can leave off the -Z.

Testing the install

To verify that your installation of SQLAlchemy has been successful, simply open up an interactive Python interpreter and try importing the module and verifying its version:

>>> import sqlalchemy

>>> sqlalchemy.__version__

'0.4.1'

This book covers the 0.4 release of SQLAlchemy, so confirm that the version installed on your system is at least 0.4.0.

SQLAlchemy also has an extensive unit test suite that can be downloaded separately (not via easy_install) from http://sqlalchemy.org if you wish to test the installation more extensively.

Installing Some Database Drivers

The next step is installing the appropriate DB-API database drivers for the database you wish to use. If you are using a version of Python greater than or equal to 2.5, you already have the SQLite driver installed, as it is included in the standard Python library. If you are using Python 2.3 or 2.4, you will need to install the SQLite driver separately.

Installing the SQLite driver on Python versions before 2.5

For many of the examples in this book, we use the SQLite database driver, mainly because it requires no separate database server installation, and you can use it to generate throwaway in-memory databases. Even if your production database is not SQLite, it can be advantageous to install the driver for prototyping code and running the examples in this book. The SQLite database driver became part of the Python standard library in version 2.5, so if you are running more recent versions of Python, you can skip this section.

Installing SQLite is different depending on whether you are using Windows or another operating system. If you are using Windows, you can download the pysqlite binary module from http://pysqlite.org/ and install it. If you are using another operating system, you will also need to install the SQLite library from http://sqlite.org/.

Other supported drivers

If you wish to connect to other databases, you must install the appropriate DB-API driver module. The complete list of supported databases and drivers follows:

PostgreSQL

psycopg2 at http://www.initd.org/pub/software/psycopg/

SQLite

pysqlite at http://initd.org/pub/software/pysqlite/ or sqlite3 (included with Python versions 2.5 and greater)

MySQL

MySQLdb athttp://sourceforge.net/projects/mysql-python

Oracle

cx_Oracle athttp://www.cxtools.net/

SQL Server

Support for Microsoft SQL server is provided by multiple drivers as follows:

§ pyodbc at http://pyodbc.sourceforge.net/ (recommended driver)

§ adodbapi at http://adodbapi.sourceforge.net/

§ pymssql at http://pymssql.sourceforge.net/

Firebird

kinterbasdb athttp://kinterbasdb.sourceforge.net/

Informix

informixdb athttp://informixdb.sourceforge.net/

SQLAlchemy Tutorial

Once you have installed SQLAlchemy and the SQLite driver (either pysqlite or sqlite3), you can start really exploring SQLAlchemy. This tutorial shows off some of the basic features of SQLAlchemy that you can use to become immediately productive. This tutorial is based on a stripped-down version of a user authentication module that might be used in a web application.

Connecting to the Database and Creating Some Tables

Before doing anything, we need to import the modules we will use. In this case, for simplicity’s sake, we will simply import everything from the sqlalchemy package. We will also import the datetime class from the datetime package for use in defining default values for our tables.

from sqlalchemy import *

from datetime import datetime

To connect to the database, we will create a MetaData object, which is used by SQLAlchemy to keep track of the tables we define:

metadata = MetaData('sqlite:///tutorial.sqlite')

The MetaData object we create is bound to a particular database Engine, in this case a SQLite engine connected to the database in the file tutorial.sqlite. If tutorial.sqlite does not already exist, it will be created automatically by SQLite.

Once we have created our MetaData, we can define our tables. The first table defined is the user table:

user_table = Table(

'tf_user', metadata,

Column('id', Integer, primary_key=True),

Column('user_name', Unicode(16),

unique=True, nullable=False),

Column('password', Unicode(40), nullable=False),

Column('display_name', Unicode(255), default=''),

Column('created', DateTime, default=datetime.now))

Notice how the Table constructor is given the SQL name of the table ('tf_user'), a reference to the metadata object, and a list of columns. The columns are similarly defined with their SQL names, data types, and various optional constraints. In this case, since we defined an 'id' column as a primary key, SQLAlchemy will automatically create the column with an auto-increment default value. Also note that we can specify uniqueness and nullability constraints on columns, provide literal defaults, or provide Python callables (e.g., datetime.now) as defaults.

Next, we define our group and permission tables:

group_table = Table(

'tf_group', metadata,

Column('id', Integer, primary_key=True),

Column('group_name', Unicode(16),

unique=True, nullable=False))

permission_table = Table(

'tf_permission', metadata,

Column('id', Integer, primary_key=True),

Column('permission_name', Unicode(16),

unique=True, nullable=False))

Each table is simply defined with an auto-increment primary key and a unique name.

Finally, we define the join tables that provide a many-to-many relationship between users and groups and groups and permissions:

user_group_table = Table(

'tf_user_group', metadata,

Column('user_id', None, ForeignKey('tf_user.id'),

primary_key=True),

Column('group_id', None, ForeignKey('tf_group.id'),

primary_key=True))

group_permission_table = Table(

'tf_group_permission', metadata,

Column('permission_id', None, ForeignKey('tf_permission.id'),

primary_key=True),

Column('group_id', None, ForeignKey('tf_group.id'),

primary_key=True))

Note in particular the use of compound primary keys (each table is keyed by two columns) and the use of foreign key constraints. We also specified the data type of the foreign key columns as None. When a foreign key column is specified with this datatype, SQLAlchemy will examine the column on the related table (e.g., 'tf_user.id') to determine the data type for the foreign key column.

Once the tables have been defined, we can create them in the database using the following code:

metadata.create_all()

If you were not creating the database, but rather connecting to an existing database, you could, of course, leave out the call to metadata.create_all(). SQLAlchemy will in any case create tables using the IF NOT EXISTS syntax, so a metadata.create_all() is a safe operation.

Performing Queries and Updates

Once we have defined the tables in our schema, we can insert some data. To create a new user, we use SQLAlchemy to construct an INSERT statement using the following syntax:

stmt = user_table.insert()

Once the insert statement has been created, it can be executed multiple times with different values:

stmt.execute(user_name='rick', password='secret',

display_name='Rick Copeland')

stmt.execute(user_name='rick1', password='secret',

display_name='Rick Copeland Clone')

If we wish to see the actual SQL generated, we can instruct SQLAlchemy to log the queries using the metadata.bind.echo property:

>>> metadata.bind.echo = True

>>> stmt.execute(user_name='rick2', password='secret',

... display_name='Rick Copeland Clone 2')

2007-09-06 10:19:52,317 INFO sqlalchemy.engine.base.Engine.0x..50

... INSERT INTO tf_user (user_name, password, display_name, created)

...

... VALUES (?, ?, ?, ?)

2007-09-06 10:19:52,318 INFO sqlalchemy.engine.base.Engine.0x..50

... ['rick2', 'secret', 'Rick Copeland Clone 2', '2007-09-06

... 10:19:52.317540']

2007-09-06 10:19:52,319 INFO sqlalchemy.engine.base.Engine.0x..50

... COMMIT

<sqlalchemy.engine.base.ResultProxy object at 0x2b7ee8ffb610>

>>> metadata.bind.echo = False

Note again that SQLAlchemy uses bind parameters for the values to be inserted, and that SQLAlchemy automatically generates the created column value based on the result of calling datetime.now() when the insert was executed.

To select data back out of the table, we can use the table’s select() method as follows:

>>> stmt = user_table.select()

>>> result = stmt.execute()

>>> for row in result:

... print row

...

(1, u'rick', u'secret1', u'Rick Copeland',

... datetime.datetime(2007, 9, 7, 10, 6, 4, 415754))

(2, u'rick1', u'secret', u'Rick Copeland Clone',

... datetime.datetime(2007, 9, 7, 10, 6, 4, 476505))

(3, u'rick2', u'secret', u'Rick Copeland Clone 2',

... datetime.datetime(2007, 9, 7, 10, 6, 4, 543650))

We can also retrieve values from each row of the result using dict-like indexing or simple attribute lookup as follows:

>>> result = stmt.execute()

>>> row =result.fetchone()

>>> row['user_name']

u'rick'

>>> row.password

u'secret1'

>>> row.created

datetime.datetime(2007, 9, 7, 10, 6, 4, 415754)

>>> row.items()

[(u'id', 1), (u'user_name', u'rick'), (u'password', u'secret1'),

... (u'display_name', u'Rick Copeland'),

... (u'created', datetime.datetime(2007, 9, 7, 10, 6, 4, 415754))]

To restrict the rows that are returned from the select() method, we can supply a where clause. SQLAlchemy provides a powerful SQL expression language to assist in the construction of where clauses, as shown in the following example:

>>> stmt = user_table.select(user_table.c.user_name=='rick')

>>> print stmt.execute().fetchall()

[(1, u'rick', u'secret1', u'Rick Copeland',

... datetime.datetime(2007, 9, 7, 10, 6, 4, 415754))]

The SQL expression language is covered in more detail in Chapter 5.

We can also use the SQL expression language to generate updates and deletes by passing clauses to the update() and delete() methods on Table objects:

>>> # Create an update constrained by user name

... stmt = user_table.update(user_table.c.user_name=='rick')

>>> # Execute the update, setting the password column to secret123

... stmt.execute(password='secret123')

<sqlalchemy.engine.base.ResultProxy object at 0xa20c50>

>>>

>>> # Create a delete statement that deletes all users

... # except for 'rick'

... stmt = user_table.delete(user_table.c.user_name != 'rick')

>>> stmt.execute()

<sqlalchemy.engine.base.ResultProxy object at 0x2b12bf430210>

>>> # Select the users back from the database

... user_table.select().execute().fetchall()

[(1, u'rick', u'secret123', u'Rick Copeland',

... datetime.datetime(2007, 9, 7, 18, 35, 35, 529412))]

>>> # Add the users back

... stmt = user_table.insert()

>>> stmt.execute(user_name='rick1', password='secret',

... display_name='Rick Copeland Clone')

<sqlalchemy.engine.base.ResultProxy object at 0xa20c90>

>>> stmt.execute(user_name='rick2', password='secret',

... display_name='Rick Copeland Clone 2')

<sqlalchemy.engine.base.ResultProxy object at 0xa20cd0>

>>>

SQLAlchemy also provides for more generalized queries via the insert(), select(), update(), and delete() functions (rather than the methods on Table objects) to allow you to specify more complex SQL queries. Again, this is covered in more detail in Chapter 5.

Mapping Objects to Tables

In addition to the SQL-level features covered thus far, SQLAlchemy also provides a powerful object-relational mapper (ORM) that allows you to map tables (and other “selectable” objects, such as SELECT statements) to objects, making those objects automatically “SQL-persistable.” In order to use the ORM, we need to import the appropriate names:

from sqlalchemy.orm import *

The simplest case of mapping is to just declare empty classes for our application objects and declare an empty mapper:

class User(object): pass

class Group(object): pass

class Permission(object): pass

mapper(User, user_table)

mapper(Group, group_table)

mapper(Permission, permission_table)

Now that we have declared the mapping between our classes and tables, we can start doing queries. First off, though, we need to understand the unit of work (UOW) pattern. In UOW as implemented by SQLAlchemy, there is an object known as a Session that tracks changes to mapped objects and can flush() them out en masse to the database in a single “unit of work.” This can lead to substantial performance improvement when compared to executing multiple separate updates. In SQLAlchemy, the Session class is created using the sessionmaker() function, and theSession object is created by instantiating the class returned from sessionmaker(). The intent is that sessionmaker() should be called once (at the module level), with its return value used to create individual sessions:

Session = sessionmaker()

session = Session()

Once we have the session object, we use it to obtain access to a Query object for our class:

query = session.query(User)

The simplest way to use the Query object is as an iterator for all the objects in the database. Since we have already inserted a row in the user_table, we can retrieve that row as a User object:

>>> list(query)

[<__main__.User object at 0xb688d0>,

... <__main__.User object at 0xb68910>,

... <__main__.User object at 0xb68c10>]

>>> for user in query:

... print user.user_name

...

rick

rick1

rick2

We can also retrieve an object from the database by using its primary key with the get() method on the Query object:

>>> query.get(1)

<__main__.User object at 0xb688d0>

If we want to filter the results retrieved by the Query object, we can use the filter() and filter_by() methods:

>>> for user in query.filter_by(display_name='Rick Copeland'):

... print user.id, user.user_name, user.password

...

1 rick secret123

>>> for user in query.filter(User.c.user_name.like('rick%')):

... print user.id, user.user_name, user.password

...

1 rick secret123

2 rick1 secret

3 rick2 secret

Note the use of the .c attribute of the User object. It was added by the mapper as a convenience to access the names of mapped columns. If we wanted to, we could freely substitute user_table.c.user_name for User.c.user_name, and vice versa.

To insert objects into the database, we simply create an object in Python and then use the save() method to notify the session about the object:

>>> newuser = User()

>>> newuser.user_name = 'mike'

>>> newuser.password = 'password'

>>> session.save(newuser)

Due to the UOW pattern, the new user has not yet been saved to the database. If we try to count the users using the user_table, we still get 3:

>>> len(list(user_table.select().execute()))

3

If, however, we try to use the Query object, the ORM recognizes the need to perform a flush() on the Session, inserts the new user, and we get a count of 4:

>>> metadata.bind.echo = True

>>> query.count()

2007-09-09 21:33:09,482 INFO sqlalchemy.engine.base.Engine.0x..50

... INSERT INTO tf_user (user_name, password, display_name, created)

...

... VALUES (?, ?, ?, ?)

2007-09-09 21:33:09,482 INFO sqlalchemy.engine.base.Engine.0x..50

... ['mike', 'password', '', '2007-09-09 21:33:09.481854']

2007-09-09 21:33:09,485 INFO sqlalchemy.engine.base.Engine.0x..50

... SELECT count(tf_user.id)

FROM tf_user

2007-09-09 21:33:09,486 INFO sqlalchemy.engine.base.Engine.0x..50 []

4

You can disable the auto-flushing behavior of SQLAlchemy by specifying autoflush=False in the call to sessionmaker().

To update objects in the database, we simply make changes to the object in Python and allow the SQLAlchemy Session to track our changes and eventually flush everything out to the database:

>>> newuser.password = 'password1'

>>> newuser.display_name = 'Michael'

>>>

>>> rick = query.get(1)

>>> rick.display_name = 'Richard'

>>>

>>> session.flush()

2007-09-09 21:40:21,854 INFO sqlalchemy.engine.base.Engine.0x..50

... UPDATE tf_user SET display_name=? WHERE tf_user.id = ?

2007-09-09 21:40:21,854 INFO sqlalchemy.engine.base.Engine.0x..50

... ['Richard', 1]

2007-09-09 21:40:21,856 INFO sqlalchemy.engine.base.Engine.0x..50

... UPDATE tf_user SET password=?, display_name=? WHERE tf_user.id =

... ?

2007-09-09 21:40:21,857 INFO sqlalchemy.engine.base.Engine.0x..50

['password1', 'Michael', 4]

To delete an object, simply call the session’s delete() method with the object to be deleted. To flush the session and commit the transaction, we call session.commit():

>>> session.delete(newuser)

>>>

>>> session.commit()

2007-09-09 21:42:56,327 INFO sqlalchemy.engine.base.Engine.0x..50

... DELETE FROM tf_user WHERE tf_user.id = ?

2007-09-09 21:42:56,328 INFO sqlalchemy.engine.base.Engine.0x..50

... [4]

2007-09-09 21:42:56,328 INFO sqlalchemy.engine.base.Engine.0x..50

... COMMIT

The SQLAlchemy ORM also includes support for managing relationships between classes, as well as flexible overrides of its column-mapping conventions. The ORM is covered in more detail in Chapters 6, 7, and 8.