The Rails 4 Way (2014)
Chapter 5. Working with Active Record
An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data.
—Martin Fowler, Patterns of Enterprise Architecture
The Active Record pattern, identified by Martin Fowler in his seminal work, Patterns of Enterprise Architecture, maps one domain class to one database table, and one instance of that class to each row of that database. It is a simple approach that, while not perfectly applicable in all cases, provides a powerful framework for database access and object persistence in your application.
The Rails Active Record framework includes mechanisms for representing models and their relationships, CRUD (Create, Read, Update and Delete) operations, complex searches, validation, callbacks, and many more features. It relies heavily on convention over configuration, so it’s easy to use when you’re creating a new database schema that can follow those conventions. However, Active Record also provides configuration settings that let you adapt it to work well with legacy database schemas that don’t necessarily conform to Rails conventions.
According to Martin Fowler, delivering the keynote address at the inaugural Rails conference in 2006, Ruby on Rails has successfully taken the Active Record pattern much further than anyone imagined it could go. It shows you what you can achieve when you have a single-minded focus on a set of ideals, which in the case of Rails is simplicity.
5.1 The Basics
For the sake of completeness, let’s briefly review the basics of how Active Record works. In order to create a new model class, the first thing you do is to declare it as a subclass of ActiveRecord::Base, using Ruby’s class extension syntax:
classClient < ActiveRecord::Base
end
By convention, an Active Record class named Client will be mapped to the clients table. Rails understands pluralization, as covered in the section “Pluralization” in this chapter. Also by convention, Active Record will expect an id column to use as primary key. It should be an integer and incrementing of the key should be managed automatically by the database server when creating new records. Note how the class itself makes no mention of the table name, columns, or their datatypes.
Each instance of an Active Record class provides access to the data from one row of the backing database table, in an object-oriented manner. The columns of that row are represented as attributes of the object, using straightforward type conversions (i.e. Ruby strings for varchars, Ruby dates for dates, and so on), and with no default data validation. Attributes are inferred from the column definition pertaining to the tables with which they’re linked. Adding, removing, and changing attributes and their types are done by changing the columns of the table in the database.
When you’re running a Rails server in development mode, changes to the database schema are reflected in the Active Record objects immediately, via the web browser. However, if you make changes to the schema while you have your Rails console running, the changes will not be reflected automatically, although it is possible to pick up changes manually by typing reload! at the console.
Courtenay says… Active Record is a great example of the Rails “Golden Path.” If you keep within its limitations, you can go far, fast. Stray from the path, and you might get stuck in the mud. This Golden Path involves many conventions, like naming your tables in the plural form (“users”). It’s common for new developers to Rails and rival web-framework evangelists to complain about how tables must be named in a particular manner, how there are no constraints in the database layer, that foreign keys are handled all wrong, enterprise systems must have composite primary keys, and more. Get the complaining out of your system now, because all these defaults are simply defaults, and in most cases can be overridden with a single line of code or a plugin. |
5.2 Macro-Style Methods
Most of the important classes you write while coding a Rails application are configured using what I call macro-style method invocations (also known in some circles as a domain-specific language or DSL). Basically, the idea is to have a highly readable block of code at the top of your class that makes it immediately clear how it is configured.
Macro-style invocations are usually placed at the top of the file, and for good reason. Those methods declaratively tell Rails how to manage instances, perform data validation and callbacks, and relate with other models. Many of them do some amount of metaprogramming, meaning that they participate in adding behavior to your class at runtime, in the form of additional instance variables and methods.
5.2.1 Relationship Declarations
For example, look at the Client class with some relationships declared. We’ll talk about associations extensively in Chapter 7, “Active Record Associations”, but all I want to do right now is to illustrate what I’m talking about when I say macro-style:
1 classClient < ActiveRecord::Base
2 has_many :billing_codes
3 has_many :billable_weeks
4 has_many :timesheets, through: :billable_weeks
5 end
As a result of those three has_many declarations, the Client class gains at least three new attributes, proxy objects that let you manipulate the associated collections interactively.
I still remember the first time I sat with an experienced Java programmer friend of mine to teach him some Ruby and Rails. After minutes of profound confusion, an almost visible light bulb appeared over his head as he proclaimed, “Oh! They’re methods!”
Indeed, they’re regular old method calls, in the context of the class object. We leave the parentheses off to emphasize the declarative intention. That’s a style issue, but it just doesn’t feel right to me with the parentheses in place, as in the following code snippet:
1 classClient < ActiveRecord::Base
2 has_many(:billing_codes)
3 has_many(:billable_weeks)
4 has_many(:timesheets, through: :billable_weeks)
5 end
When the Ruby interpreter loads client.rb, it executes those has_many methods, which, again, are defined as class methods of Active Record’s Base class. They are executed in the context of the Client class, adding attributes that are subsequently available to Client instances. It’s a programming model that is potentially strange to newcomers, but quickly becomes second nature to the Rails programmer.
5.2.2 Convention over Configuration
Convention over configuration is one of the guiding principles of Ruby on Rails. If we follow Rails conventions, very little explicit configuration is needed, which stands in stark contrast to the reams of configuration that are required to get even a simple application running in other technologies.
It’s not that a newly bootstrapped Rails application comes with default configuration in place already, reflecting the conventions that will be used. It’s that the conventions are baked into the framework, actually hard-coded into its behavior, and you need to override the default behavior with explicit configuration when applicable.
It’s also worth mentioning that most configuration happens in close proximity to what you’re configuring. You will see associations, validations, and callback declarations at the top of most Active Record models.
I suspect that the first explicit configuration (over convention) that many of us deal with in Active Record is the mapping between class name and database table, since by default Rails assumes that our database name is simply the pluralized form of our class name.
5.2.3 Setting Names Manually
The table_name and primary_key setter methods let you use any table and primary names you’d like, but you’ll have to specify them explicitly in your model class.
1 classClient < ActiveRecord::Base
2 self.table_name = "CLIENT"
3 self.primary_key = "CLIENT_ID"
4 end
It’s only a couple of extra lines per model, but on a large application it adds unnecessary complexity, so don’t do it if you don’t absolutely have to.
When you’re not at liberty to dictate the naming guidelines for your database schema, such as when a separate DBA group controls all database schemas, then you probably don’t have a choice. But if you have flexibility, you should really just follow Rails conventions. They might not be what you’re used to, but following them will save you time and unnecessary headaches.
5.2.4 Legacy Naming Schemes
If you are working with legacy schemas, you may be tempted to automatically set table_name everywhere, whether you need it or not. Before you get accustomed to doing that, learn the additional options available that might just be more DRY and make your life easier.
Let’s assume you need to turn off table pluralization altogether; you would set the following attribute to your config/application.rb:
config.active_record.pluralize_table_names = false
There are various other useful attributes of ActiveRecord::Base, provided for configuring Rails to work with legacy naming schemes.
5.2.4.1 primary_key_prefix_type
Accessor for the prefix type that will be prepended to every primary key column name. If :table_name is specified, Active Record will look for tableid instead of id as the primary column. If :table_name_with_underscore is specified, Active Record will look for table_id instead of id.
5.2.4.2 table_name_prefix
Some departments prefix table names with the name of the database. Set this attribute accordingly to avoid having to include the prefix in all of your model class names.
5.2.4.3 table_name_suffix
Similar to prefix, but adds a common ending to all table names.
5.3 Defining Attributes
The list of attributes associated with an Active Record model class is not coded explicitly. At runtime, the Active Record model examines the database schema directly from the server. Adding, removing, and changing attributes and their type is done by manipulating the database itself via Active Record migrations.
The practical implication of the Active Record pattern is that you have to define your database table structure and make sure it exists in the database prior to working with your persistent models. Some people may have issues with that design philosophy, especially if they’re coming from a background in top-down design.
The Rails way is undoubtedly to have model classes that map closely to your database schema. On the other hand, remember you can have models that are simple Ruby classes and do not inherit from ActiveRecord::Base. Among other things, it is common to use non-Active Record model classes to encapsulate data and logic for the view layer.
5.3.1 Default Attribute Values
Migrations let you define default attribute values by passing a :default option to the column method, but most of the time you’ll want to set default attribute values at the model layer, not the database layer. Default values are part of your domain logic and should be kept together with the rest of the domain logic of your application, in the model layer.
A common example is the case when your model should return the string “n/a” instead of a nil (or empty) string for an attribute that has not been populated yet. Seems simple enough and it’s a good way to learn how attributes exist at runtime.
To begin, let’s whip up a quick spec describing the desired behavior.
1 describe TimesheetEntry do
2 it "has a category of 'n/a' if not available" do
3 entry = TimesheetEntry.new
4 expect(entry.category).to eq('n/a')
5 end
6 end
We run that spec and it fails, as expected. Active Record doesn’t provide us with any class-level methods to define default values for models declaratively. So it seems we’ll have to create an explicit attribute accessor that provides a default value.
Normally, attribute accessors are handled magically by Active Record’s internals, but in this case we’re overriding the magic with an explicit getter. All we need to do is to define a method with the same name as the attribute and use Ruby’s || operator, which will short-circuit if @category is not nil.
1 classTimesheetEntry < ActiveRecord::Base
2 def category
3 @category || 'n/a'
4 end
5 end
Now we run the spec and it passes. Great. Are we done? Not quite. We should test a case when the real category value should be returned. I’ll insert an example with a not-nil category.
1 describe TimesheetEntry do
2 it "returns category when available" do
3 entry = TimesheetEntry.new(category: "TR4W")
4 expect(entry.category).to eq("TR4W")
5 end
6
7 it "has a category of 'n/a' if not available" do
8 entry = TimesheetEntry.new
9 expect(entry.category).to eq('n/a')
10 end
11 end
Uh-oh. The first spec fails. Seems our default ‘n/a’ string is being returned no matter what. That means that @category must not be getting set. Should we even know that it is getting set or not? It is an implementation detail of Active Record, is it not?
The fact that Rails does not use instance variables like @category to store the model attributes is in fact an implementation detail. But model instances have a couple of methods, write_attribute and read_attribute, conveniently provided by Active Record for the purposes of overriding default accessors, which is exactly what we’re trying to do. Let’s fix our TimesheetEntry class.
1 classTimesheetEntry < ActiveRecord::Base
2 def category
3 read_attribute(:category) || 'n/a'
4 end
5 end
Now the spec passes. How about a simple example of using write_attribute?
1 classSillyFortuneCookie < ActiveRecord::Base
2 def message=(txt)
3 write_attribute(:message, txt + ' in bed')
4 end
5 end
Alternatively, both of these examples could have been written with the shorter forms of reading and writing attributes, using square brackets.
1 classSpecification < ActiveRecord::Base
2 def tolerance
3 self[:tolerance] || 'n/a'
4 end
5 end
6
7 classSillyFortuneCookie < ActiveRecord::Base
8 def message=(txt)
9 self[:message] = txt + ' in bed'
10 end
11 end
5.3.2 Serialized Attributes
One of Active Record’s coolest features is the ability to mark a column of type text as being serialized. Whatever object (more accurately, graph of objects) you assign to that attribute will be stored in the database as YAML, Ruby’s native serialization format.
Sebastian says… TEXT columns usually have a maximum size of 64K and if your serialized attributes exceeds the size constraints, you’ll run into a lot of errors. On the other hand, if your serialized attributes are that big, you might want to rethink what you’re doing. At least move them into a separate table and use a larger column type if your server allows it. |
One of the first things that new Rails developers do when they discover the serialize declaration is to use it to store a hash of arbitrary objects related to user preferences. Why bother with the complexity of a separate preferences table if you can denormalize that data into the users table instead?
1 classUser < ActiveRecord::Base
2 serialize :preferences, Hash
3 end
The optional second parameter (used in the example) takes a class that limits the type of object that can be stored. The serialized object must be of that class on retrieval or SerializationTypeMismatch will be raised.
The API does not give us an easy way to set a default value. That’s unfortunate, because it would be nice to be able to assume that our preferences attribute is already initialized when we want to use it.
1 user = User.new
2 # the following line will raise NoMethodError
3 # unless preferences has a default
4 user.preferences[:inline_help] = false
Unless a value has already been set for the attribute, it’s going to be nil. You might be tempted to set a default YAML string for the serialized attribute at the database level, so that it’s not nil when you’re using a newly created object:
add_column :users, :preferences, :text, default: "--- {}"
However, that approach won’t work with MySQL 5.x, which ignores default values for binary and text columns. One possible solution is to overload the attribute’s reader method with logic that sets the default value if it’s nil.
def preferences
read_attribute(:preferences) || write_attribute(:preferences, {})
end
I prefer this method over the alternative, using an after_initialize callback, because it incurs a small performance hit only when the preferences attribute is actually used and not at instantiation time of every single User object in your system.
5.3.3 ActiveRecord::Store
With version 3.2, Rails introduced the store declaration, which uses serialize behind the scenes to declare a single-column key/value store.
classUser < ActiveRecord::Base
store :preferences
end
An added benefit of using store is that its assigned serialized attribute is set to an empty Hash by default. Therefore removing the need to set a default by overriding the attribute’s reader method or setting one at the database level, as was done in the previous section.
It’s possible to add an :accessors option to the store declaration which declares read/write accessors in your Active Record model.
store :preferences, accessors: [:inline_help]
Writing to a store accessor method will create a key/value pair within the serialized hash attribute, as shown in the following example:
>> user = User.new
=> #<User id: nil, preferences: {}, ...>
>> user.inline_help = false
=> false
>> user.preferences
=> {"inline_help"=>false}
Alternatively, you can use the store_accessor declaration to declare read/write accessors for a serialized attribute.
store_accessor :inline_help
5.4 CRUD: Creating, Reading, Updating, Deleting
The four standard operations of a database system combine to form a popular acronym: CRUD. It sounds somewhat negative, because as a synonym for garbage or unwanted accumulation the word crud in English has a rather bad connotation. However, in Rails circles, use of the word CRUD is benign. In fact, as in earlier chapters, designing your app to function primarily as RESTful CRUD operations is considered a best practice!
5.4.1 Creating New Active Record Instances
The most straightforward way to create a new instance of an Active Record model is by using a regular Ruby constructor, the class method new. New objects can be instantiated as either empty (by omitting parameters) or pre-set with attributes, but not yet saved. Just pass a hash with key names matching the associated table column names. In both instances, valid attribute keys are determined by the column names of the associated table—hence you can’t have attributes that aren’t part of the table columns.
You can find out if an Active Record object is saved by looking at the value of its id, or programmatically, by using the methods new_record? and persisted?:
>> c = Client.new
=> #<Client id: nil, name: nil, code: nil>
>> c.new_record?
=> true
>> c.persisted?
=> false
Active Record constructors take an optional block, which can be used to do additional initialization. The block is executed after any passed-in attributes are set on the instance:
>> c = Client.new do |client|
?> client.name = "Nile River Co."
>> client.code = "NRC"
>> end
=> #<Client id: 1, name: "Nile River Co.", code: "NRC">
Active Record has a handy-dandy create class method that creates a new instance, persists it to the database, and returns it in one operation:
>> c = Client.create(name: "Nile River, Co.", code: "NRC")
=> #<Client id: 1, name: "Nile River, Co.", code: "NRC" ...>
The create method takes an optional block, just like new.
5.4.2 Reading Active Record Objects
Finding an existing object by its primary key is very simple, and is probably one of the first things we all learn about Rails when we first pick up the framework. Just invoke find with the key of the specific instance you want to retrieve. Remember that if an instance is not found, aRecordNotFound exception is raised.
>> first_project = Project.find(1)
=> #<Project id: 1 ...>
>> boom_client = Client.find(99)
ActiveRecord::RecordNotFound: Couldn't find Client with ID=99
>> all_clients = Client.all
=> #<ActiveRecord::Relation [#<Client id: 1, name: "Paper Jam Printers",
code: "PJP" ...>, #<Client id: 2, name: "Goodness Steaks",
code: "GOOD_STEAKS" ...>]>
>> first_client = Client.first
=> #<Client id: 1, name: "Paper Jam Printers", code: "PJP" ...>
By the way, it is entirely common for methods in Ruby to return different types depending on the parameters used, as illustrated in the example. Depending on how find is invoked, you will get either a single Active Record object or an array of them.
For convenience, first, last and all also exist as syntactic sugar wrappers around the find method.
>> Product.last
=> #<Product id: 1, name: "leaf", sku: nil,
created_at: "2010-01-12 03:34:41", updated_at: "2010-01-12 03:34:41">
Finally, the find method also understands arrays of ids, and raises a RecordNotFound exception if it can’t find all of the ids specified:
>> Product.find([1, 2])
ActiveRecord::RecordNotFound: Couldn't find all Products with IDs (1,
2) (found 1 results, but was looking for 2)
5.4.3 Reading and Writing Attributes
After you have retrieved a model instance from the database, you can access each of its columns in several ways. The easiest (and clearest to read) is simply with dot notation:
>> first_client.name
=> "Paper Jam Printers"
>> first_client.code
=> "PJP"
The private read_attribute method of Active Record, covered briefly in an earlier section, is useful to know about, and comes in handy when you want to override a default attribute accessor. To illustrate, while still in the Rails console, I’ll go ahead and reopen the Client class on the fly and override the name accessor to return the value from the database, but reversed:
>> classClient < ActiveRecord::Base
>> def name
>> read_attribute(:name).reverse
>> end
>> end
=> nil
>> first_client.name
=> "sretnirP maJ repaP"
Hopefully it’s not too painfully obvious for me to demonstrate why you need read_attribute in that scenario. Recursion is a bitch, if it’s unexpected:
>> classClient < ActiveRecord::Base
>> def name
>> self.name.reverse
>> end
>> end
=> nil
>> first_client.name
SystemStackError: stack level too deep
from (irb):21:in 'name'
from (irb):21:in 'name'
from (irb):24
As can be expected by the existence of a read_attribute method (and as we covered earlier in the chapter), there is also a write_attribute method that lets you change attribute values. Just as with attribute getter methods, you can override the setter methods and provide your own 'line-height:normal'>1 classProject < ActiveRecord::Base
2 # The description for a project cannot be changed to a blank string
3 def description=(new_value)
4 write_attribute(:description, new_value) unless new_value.blank?
5 end
6end
The preceding example illustrates a way to do basic validation, since it checks to make sure that a value is not blank before allowing assignment. However, as we’ll see in Chapter 8, “Validations”, there are better ways to do this.
5.4.3.1 Hash Notation
Yet another way to access attributes is using the [attribute_name] operator, which lets you access the attribute as if it were a regular hash.
>> first_client['name']
=> "Paper Jam Printers"
>> first_client[:name]
=> "Paper Jam Printers"
String Versus Symbol
Many Rails methods accept symbol and string parameters interchangeably, and that is potentially very confusing. Which is more correct? The general rule is to use symbols when the string is a name for something, and a string when it’s a value. You should probably be using symbols when it comes to keys of options hashes and the like.
5.4.3.2 The attributes Method
There is also an attributes method that returns a hash with each attribute and its corresponding value as returned by read_attribute. If you use your own custom attribute reader and writer methods, it’s important to remember that attributes will not use custom attribute readers when accessing its values, but attributes= (which lets you do mass assignment) does invoke custom attribute writers.
>> first_client.attributes
=> {"name"=>"Paper Jam Printers", "code"=>"PJP", "id"=>1}
Being able to grab a hash of all attributes at once is useful when you want to iterate over all of them or pass them in bulk to another function. Note that the hash returned from attributes is not a reference to an internal structure of the Active Record object. It is copy, which means that changing its values will have no effect on the object it came from.
>> atts = first_client.attributes
=> {"name"=>"Paper Jam Printers", "code"=>"PJP", "id"=>1}
>> atts["name"] = "Def Jam Printers"
=> "Def Jam Printers"
>> first_client.attributes
=> {"name"=>"Paper Jam Printers", "code"=>"PJP", "id"=>1}
To make changes to an Active Record object’s attributes in bulk, it is possible to pass a hash to the attributes writer.
5.4.4 Accessing and Manipulating Attributes Before They Are Typecast
The Active Record connection adapters, classes that implement behavior specific to databases, fetch results as strings and Rails takes care of converting them to other datatypes if necessary, based on the type of the database column. For instance, integer types are cast to instances of Ruby’sFixnum class, and so on.
Even if you’re working with a new instance of an Active Record object, and have passed in constructor values as strings, they will be typecast to their proper type when you try to access those values as attributes.
Sometimes you want to be able to read (or manipulate) the raw attribute data without having the column-determined typecast run its course first, and that can be done by using the attribute_before_type_cast accessors that are automatically created in your model.
For example, consider the need to deal with currency strings typed in by your end users. Unless you are encapsulating currency values in a currency class (highly recommended, by the way) you need to deal with those pesky dollar signs and commas. Assuming that our Timesheet model had a rate attribute defined as a :decimal type, the following code would strip out the extraneous characters before typecasting for the save operation:
1 classTimesheet < ActiveRecord::Base
2 before_validation :fix_rate
3
4 def fix_rate
5 self[:rate] = rate_before_type_cast.tr('$,','')
6 end
7 end
5.4.5 Reloading
The reload method does a query to the database and resets the attributes of an Active Record object. The optional options argument is passed to find when reloading so you may do, for example, record.reload(lock: true) to reload the same record with an exclusive row lock. (See the section “Database Locking” later in this chapter.)
5.4.6 Cloning
Producing a copy of an Active Record object is done simply by calling clone, which produces a shallow copy of that object. It is important to note that no associations will get copied, even though they are stored internally as instance variables.
5.4.7 Custom SQL Queries
The find_by_sql class method takes a SQL select query and returns an array of Active Record objects based on the results. Here’s a barebones example, which you would never actually need to do in a real application:
>> Client.find_by_sql("select * from clients")
=> [#<Client id: 1, name: "Paper Jam Printers",
code: "PJP" ...>, #<Client id: 2, name: "Goodness Steaks",
code: "GOOD_STEAKS" ...>]
I can’t stress this enough: You should take care to use find_by_sql only when you really need it! For one, it reduces database portability. When you use Active Record’s normal find operations, Rails takes care of handling differences between the underlying databases for you.
Note that Active Record already has a ton of built-in functionality abstracting SELECT statements. Functionality that it would be very unwise to reinvent. There are lots of cases where at first glance it might seem that you might need to use find_by_sql, but you actually don’t. A common case is when doing a LIKE query:
>> Client.find_by_sql("select * from clients where code like 'A%'")
=> [#<Client id: 1, name: "Amazon, Inc" ...>]
Turns out that you can easily put that LIKE clause into a conditions option:
>> param = "A"
>> Client.where("code like ?", "#{param}%")
=> [#<Client id: 1, name: "Amazon, Inc" ...>]
Preventing SQL Injection attacks
Under the covers, Rails sanitizes17 your SQL code, provided that you parameterize your query. Active Record executes your SQL using the connection.select_all method, iterating over the resulting array of hashes, and invoking your Active Record’s initialize method for each row in the result set.
What would this section’s example look like un-parameterized?
>> Client.where("code like '#{params[:code]}%'")
=> [#<Client id: 1, name: "Amazon, Inc" ...>] # NOOOOO!
Notice the missing question mark as a variable placeholder. Always remember that interpolating user-supplied values into a SQL fragment of any type is very unsafe! Just imagine what would happen to your project if a malicious user called that unsafe find with params[:code] set to
1 "Amazon'; DELETE FROM users;'
This particular example might fail in your own experiments. The outcome is very specific to the type of database/driver that you’re using. Some popular databases drivers may even have features that help to prevent SQL injection. I still think it’s better to be safe than sorry.
The count_by_sql method works in a manner similar to find_by_sql.
>> Client.count_by_sql("select count(*) from clients")
=> 132
Again, you should have a special reason to be using it instead of the more concise alternatives provided by Active Record.
5.4.8 The Query Cache
By default, Rails attempts to optimize performance by turning on a simple query cache. It is a hash stored on the current thread, one for every active database connection. (Most Rails processes will have just one.)
Whenever a find (or any other type of select operation) happens and the query cache is active, the corresponding result set is stored in a hash with the SQL that was used to query for them as the key. If the same SQL statement is used again in another operation, the cached result set is used to generate a new set of model objects instead of hitting the database again.
You can enable the query cache manually by wrapping operations in a cache block, as in the following example:
1 User.cache do
2 puts User.first
3 puts User.first
4 puts User.first
5 end
Check your development.log and you should see the following entries:
User Load (0.1ms) SELECT "users".* FROM "users" ORDER BY "users"."id"
ASC LIMIT 1
CACHE (0.0ms) SELECT "users".* FROM "users" ORDER BY "users"."id"
ASC LIMIT 1 LIMIT 1
CACHE (0.0ms) SELECT "users".* FROM "users" ORDER BY "users"."id"
ASC LIMIT 1
The database was queried only once. Try a similar experiment in your own console without the cache block, and you’ll see that three separate User Load events are logged.
Save and delete operations result in the cache being cleared, to prevent propagation of instances with invalid states. If you find it necessary to do so for whatever reason, call the clear_query_cache class method to clear out the query cache manually.
5.4.8.1 Logging
The log file indicates when data is being read from the query cache instead of the database. Just look for lines starting with CACHE instead of a Model Load.
Place Load (0.1ms) SELECT * FROM places WHERE (places.id = 15749)
CACHE (0.0ms) SELECT * FROM places WHERE (places.id = 15749)
CACHE (0.0ms) SELECT * FROM places WHERE (places.id = 15749)
5.4.8.2 Default Query Caching in Controllers
For performance reasons, Active Record’s query cache is turned on by default for the processing of controller actions.
5.4.8.3 Limitations
The Active Record query cache was purposely kept very simple. Since it literally keys cached model instances on the SQL that was used to pull them out of the database, it can’t connect multiple find invocations that are phrased differently but have the same semantic meaning and results.
For example, “select foo from bar where id = 1” and “select foo from bar where id = 1 limit 1”” are considered different queries and will result in two distinct cache entries.
5.4.9 Updating
The simplest way to manipulate attribute values is simply to treat your Active Record object as a plain old Ruby object, meaning via direct assignment using myprop=(some_value)
There are a number of other different ways to update Active Record objects, as illustrated in this section. First, let’s look at how to use the update class method of ActiveRecord::Base
1 classProjectController < ApplicationController
2 def update
3 Project.update(params[:id], params[:project])
4 redirect_to projects_path
5 end
6
7 def mass_update
8 Project.update(params[:projects].keys, params[:projects].values])
9 redirect_to projects_path
10 end
11 end
The first form of update takes a single numeric id and a hash of attribute values, while the second form takes a list of ids and a list of values and is useful in scenarios where a form submission from a web page with multiple updateable rows is being processed.
The update class method does invoke validation first and will not save a record that fails validation. However, it returns the object whether or not the validation passes. That means that if you want to know whether or not the validation passed, you need to follow up the call to update with a call to valid?
1 classProjectController < ApplicationController
2 def update
3 project = Project.update(params[:id], params[:project])
4 if project.valid? # uh-oh, do we want to run validate again?
5 redirect_to project
6 else
7 render 'edit'
8 end
9 end
10 end
A problem is that now we are calling valid? twice, since the update call also called it. Perhaps a better option is to use the update instance method once as part of an if statement:
1 classProjectController < ApplicationController
2 def update
3 project = Project.find(params[:id])
4 if project.update(params[:project])
5 redirect_to project
6 else
7 render 'edit'
8 end
9 end
10 end
And of course, if you’ve done some basic Rails programming, you’ll recognize that pattern, since it is used in the generated scaffolding code. The update method takes a hash of attribute values, and returns true or false depending on whether the save was successful or not, which is dependent on validation passing.
5.4.10 Updating by Condition
Active Record has another class method useful for updating multiple records at once: update_all. It maps closely to the way that you would think of using a SQL update...where statement. The update_all method takes two parameters, the set part of the SQL statement and the conditions, expressed as part of a where clause. The method returns the number of records updated.
I think this is one of those methods that is generally more useful in a scripting context than in a controller method, but you might feel differently. Here is a quick example of how I might go about reassigning all the Rails projects in the system to a new project manager.
Project.update_all({manager: 'Ron Campbell'}, technology: 'Rails')
The update_all method also accepts string parameters, which allows you to leverage the power of SQL!
Project.update_all("cost = cost * 3", "lower(technology) LIKE '%microsoft%'")
5.4.11 Updating a Particular Instance
The most basic way to update an Active Record object is to manipulate its attributes directly and then call save. It’s worth noting that save will insert a record in the database if necessary or update an existing record with the same primary key.
>> project = Project.find(1)
>> project.manager = 'Brett M.'
>> project.save
=> true
The save method will return true if it was successful or false if it failed for any reason. There is another method, save!, that will use exceptions instead. Which one to use depends on whether you plan to deal with errors right away or delegate the problem to another method further up the chain.
It’s mostly a matter of style, although the non-bang save and update methods that return a boolean value are often used in controller actions, as the clause for an if condition:
1 classStoryController < ApplicationController
2 def points
3 story = Story.find(params[:id])
4 if story.update_attribute(:points, params[:value])
5 render text: "#{story.name} updated"
6 else
7 render text: "Error updating story points"
8 end
9 end
10 end
5.4.12 Updating Specific Attributes
The instance methods update_attribute and update take one key/value pair or hash of attributes, respectively, to be updated on your model and saved to the database in one operation.
The update_attribute method updates a single attribute and saves the record, but updates made with this method are not subjected to validation checks! In other words, this method allows you to persist an Active Record model to the database even if the full object isn’t valid. Model callbacks are executed, but the updated_at is still bumped.
Lark says… I feel dirty whenever I use update_attribute. |
On the other hand, update is subject to validation checks and is often used on update actions and passed the params hash containing updated values.
Active Record also provides an instance method update_column, which accepts a single key/value pair. Although similar to update_attribute, the update_column method not only skips validations checks, but also does not run callbacks and skips the bumping of the updated_at timestamp.
Being introduced in Rails 4, the update_columns method works exactly the same as update_column, except that instead of accepting a single key/value pair as a parameter, it accepts a hash of attributes.
Courtenay says… If you have associations on a model, Active Record automatically creates convenience methods for mass assignment. In other words, a Project model that has_many :users will expose a user_ids attribute writer, which gets used by its update method.This is an advantage if you’re updating associations with checkboxes, because you just name the checkboxes project[user_ids][] and Rails will handle the magic. In some cases, allowing the user to set associations this way would be a security risk. |
5.4.13 Convenience Updaters
Rails provides a number of convenience update methods in the form of increment, decrement, and toggle, which do exactly what their names suggest with numeric and boolean attributes. Each has a bang variant (such as toggle!) that additionally invokes update_attribute after modifying the attribute.
5.4.14 Touching Records
There may be certain cases where updating a time field to indicate a record was viewed is all you require, and Active Record provides a convenience method for doing so in the form of touch. This is especially useful for cache auto-expiration, which is covered in Chapter 17, “Caching and Performance”.
Using this method on a model with no arguments updates the updated_at timestamp field to the current time without firing any callbacks or validation. If a timestamp attribute is provided it will update that attribute to the current time along with updated_at.
>> user = User.first
>> user.touch # => sets updated_at to now.
>> user.touch(:viewed_at) # sets viewed_at and updated_at to now.
If a :touch option is provided to a belongs to relation, it will touch the parent record when the child is touched.
classUser < ActiveRecord::Base
belongs_to :client, touch: true
end
>> user.touch # => also calls user.client.touch
5.4.15 Readonly Attributes
Sometimes you want to designate certain attributes as readonly, which prevents them from being updated after the parent object is created. The feature is primarily for use in conjunction with calculated attributes. In fact, Active Record uses this method internally for counter_cache attributes, since they are maintained with their own special SQL update statements.
The only time that readonly attributes may be set are when the object is not saved yet. The following example code illustrates usage of attr_readonly. Note the potential gotcha when trying to update a readonly attribute.
classCustomer < ActiveRecord::Base
attr_readonly :social_security_number
end
>> customer = Customer.new(social_security_number: "130803020")
=> #<Customer id: 1, social_security_number: "130803020", ...>
>> customer.social_security_number
=> "130803020"
>> customer.save
>> customer.social_security_number = "000000000" # Note, no error raised!
>> customer.social_security_number
=> "000000000"
>> customer.save
>> customer.reload
>> customer.social_security_number
=> "130803020" # the original readonly value is preserved
The fact that trying to set a new value for a readonly attribute doesn’t raise an error bothers my sensibilities, but I understand how it can make using this feature a little bit less code-intensive.
You can get a list of all readonly attributes via the class method readonly_attributes.
>> Customer.readonly_attributes
=> #<Set: {"social_security_number"}>
5.4.16 Deleting and Destroying
Finally, if you want to remove a record from your database, you have two choices. If you already have a model instance, you can destroy it:
>> bad_timesheet = Timesheet.find(1)
>> bad_timesheet.destroy
=> #<Timesheet id: 1, user_id: "1", submitted: nil,
created_at: "2006-11-21 05:40:27", updated_at: "2006-11-21 05:40:27">
The destroy method will both remove the object from the database and prevent you from modifying it again:
>> bad_timesheet.user_id = 2
RuntimeError: can't modify frozen Hash
Note that calling save on an object that has been destroyed will fail silently. If you need to check whether an object has been destroyed, you can use the destroyed? method.
The destroy method also has a complimentary bang method, destroy!. Calling destroy! on an object that cannot be destroyed will result in an ActiveRecord::RecordNotDestroyed exception being raised.
You can also call destroy and delete as class methods, passing the id(s) to delete. Both variants accept a single parameter or array of ids:
Timesheet.delete(1)
Timesheet.destroy([2, 3])
The naming might seem inconsistent, but it isn’t. The delete method uses SQL directly and does not load any instances (hence it is faster). The destroy method does load the instance of the Active Record object and then calls destroy on it as an instance method. The semantic differences are subtle, but come into play when you have assigned before_destroy callbacks or have dependent associations—child objects that should be deleted automatically along with their parent object.
5.5 Database Locking
Locking is a term for techniques that prevent concurrent users of an application from overwriting each other’s work. Active Record doesn’t normally use any type of database locking when loading rows of model data from the database. If a given Rails application will only ever have one user updating data at the same time, then you don’t have to worry about it.
However, when more than one user may be accessing and updating the exact same data simultaneously, then it is vitally important for you as the developer to think about concurrency. Ask yourself, what types of collisions or race conditions could happen if two users were to try to update a given model at the same time?
There are a number of approaches to dealing with concurrency in database-backed applications, two of which are natively supported by Active Record: optimistic and pessimistic locking. Other approaches exist, such as locking entire database tables. Every approach has strengths and weaknesses, so it is likely that a given application will use a combination of approaches for maximum reliability.
5.5.1 Optimistic Locking
Optimistic locking describes the strategy of detecting and resolving collisions if they occur, and is commonly recommended in multi-user situations where collisions should be infrequent. Database records are never actually locked in optimistic locking, making it a bit of a misnomer.
Optimistic locking is a fairly common strategy, because so many applications are designed such that a particular user will mostly be updating with data that conceptually belongs to him and not other users, making it rare that two users would compete for updating the same record. The idea behind optimistic locking is that since collisions should occur infrequently, we’ll simply deal with them only if they happen.
5.5.1.1 Implementation
If you control your database schema, optimistic locking is really simple to implement. Just add an integer column named lock_version to a given table, with a default value of zero.
1 classAddLockVersionToTimesheets < ActiveRecord::Migration
2
3 def change
4 add_column :timesheets, :lock_version, :integer, default: 0
5 end
6
7 end
Simply adding that lock_version column changes Active Record’s behavior. Now if the same record is loaded as two different model instances and saved differently, the first instance will win the update, and the second one will cause an ActiveRecord::StaleObjectError to be raised.
We can illustrate optimistic locking behavior with a simple spec:
1 describe Timesheet do
2 it "locks optimistically" do
3 t1 = Timesheet.create
4 t2 = Timesheet.find(t1.id)
5
6 t1.rate = 250
7 t2.rate = 175
8
9 expect(t1.save).to be_true
10 expect { t2.save }.to raise_error(ActiveRecord::StaleObjectError)
11 end
12 end
The spec passes, because calling save on the second instance raises the expected ActiveRecord::StaleObjectError exception. Note that the save method (without the bang) returns false and does not raise exceptions if the save fails due to validation, but other problems such as locking in this case, can indeed cause it to raise exceptions.
To use a database column named something other than lock_version change the setting using locking_column. To make the change globally, add the following line to your config/application.rb:
config.active_record.locking_column = :alternate_lock_version
Like other Active Record settings, you can also change it on a per-model basis with a declaration in your model class:
classTimesheet < ActiveRecord::Base
self.locking_column = :alternate_lock_version
end
5.5.1.2 Handling StaleObjectError
Now of course, after adding optimistic locking, you don’t want to just leave it at that, or the end user who is on the losing end of the collision would simply see an application error screen. You should try to handle the StaleObjectError as gracefully as possible.
Depending on the criticality of the data being updated, you might want to invest time into crafting a user-friendly solution that somehow preserves the changes that the loser was trying to make. At minimum, if the data for the update is easily re-creatable, let the user know why their update failed with controller code that looks something like the following:
1 def update
2 timesheet = Timesheet.find(params[:id])
3 timesheet.update(params[:timesheet])
4 # redirect somewhere
5 rescue ActiveRecord::StaleObjectError
6 flash[:error] = "Timesheet was modified while you were editing it."
7 redirect_to [:edit, timesheet]
8 end
There are some advantages to optimistic locking. It doesn’t require any special feature in the database, and it is fairly easy to implement. As you saw in the example, very little code is required to handle the StaleObjectError.
The main disadvantages to optimistic locking are that update operations are a bit slower because the lock version must be checked, and the potential for bad user experience, since they don’t find out about the failure until after they’ve potentially lost data.
5.5.2 Pessimistic Locking
Pessimistic locking requires special database support (built into the major databases) and locks down specific database rows during an update operation. It prevents another user from reading data that is about to be updated, in order to prevent them from working with stale data.
Pessimistic locking works in conjunction with transactions as in the following example:
1 Timesheet.transaction do
2 t = Timesheet.lock.first
3 t.approved = true
4 t.save!
5 end
It’s also possible to call lock! on an existing model instance, which simply calls reload(lock: true) under the covers. You wouldn’t want to do that on an instance with attribute changes since it would cause them to be discarded by the reload. If you decide you don’t want the lock anymore, you can pass false to the lock! method.
Pessimistic locking takes place at the database level. The SELECT statement generated by Active Record will have a FOR UPDATE (or similar) clause added to it, causing all other connections to be blocked from access to the rows returned by the select statement. The lock is released once the transaction is committed. There are theoretically situations (Rails process goes boom mid-transaction?!) where the lock would not be released until the connection is terminated or times out.
5.5.3 Considerations
Web applications scale best with optimistic locking, which as we’ve discussed doesn’t really use any database-level locking at all. However, you have to add application logic to handle failure cases. Pessimistic locking is a bit easier to implement, but can lead to situations where one Rails process is waiting on another to release a database lock, that is, waiting and not serving any other incoming requests. Remember that Rails processes are typically single-threaded.
In my opinion, pessimistic locking should not be super dangerous as it is on other platforms, since in Rails we don’t ever persist database transactions across more than a single HTTP request. In fact, it would be impossible to do that in a shared-nothing architecture. (If you’re running Rails with JRuby and doing crazy things like storing Active Record object instances in a shared session space, all bets are off.)
A situation to be wary of would be one where you have many users competing for access to a particular record that takes a long time to update. For best results, keep your pessimistic-locking transactions small and make sure that they execute quickly.
5.6 Where Clauses
In mentioning Active Record’s find method earlier in the chapter, we didn’t look at the wealth of options available in addition to finding by primary key and the first, last and all methods. Each method discussed here returns an ActiveRecord::Relation - a chainable object that is lazy evaluated against the database only when the actual records are needed.
5.6.1 where(*conditions)
It’s very common to need to filter the result set of a find operation (just a SQL SELECT under the covers) by adding conditions (to the WHERE clause). Active Record gives you a number of ways to do just that with the where method.
The conditions parameter can be specified as a string or a hash. Parameters are automatically sanitized to prevent SQL-injection attacks.
Passing a hash of conditions will construct a where clause containing a union of all the key/value pairs. If all you need is equality, versus, say LIKE criteria, I advise you to use the hash notation, since it’s arguably the most readable of the styles.
Product.where(sku: params[:sku])
The hash notation is smart enough to create an IN clause if you associate an array of values with a particular key.
Product.where(sku: [9400,9500,9900])
The simple string form can be used for statements that don’t involve data originating outside of your app. It’s most useful for doing LIKE comparisons, as well as greater-than/less-than and the use of SQL functions not already built into Active Record. If you do choose to use the string style, additional arguments to the where method will be treated as query variables to insert into the where clause.
Product.where('description like ? and color = ?', "%#{terms}%", color)
Product.where('sku in (?)', selected_skus)
where.not
The Active Record query interface for the most part abstracts SQL from the developer. However, there is a condition that always requires using pure string conditions in a where clause, specifying a NOT condition with <> or !=, depending on the database. Starting in Rails 4, query method not has been added to rectify this.
To use the new query method, it must be chained to a where clause with no arguments:
Article.where.not(title: 'Rails 3')
# >> SELECT "articles".* FROM "articles"
# WHERE ("articles"."title" != 'Rails 3')
The not query method can also accept an array to ensure multiple values are not in a field:
Article.where.not(title: ['Rails 3', 'Rails 5'])
# >> SELECT "articles".* FROM "articles"
# WHERE ("articles"."title" NOT IN ('Rails 3', 'Rails 5'))
5.6.1.1 Bind Variables
When using multiple parameters in the conditions, it can easily become hard to read exactly what the fourth or fifth question mark is supposed to represent. In those cases, you can resort to named bind variables instead. That’s done by replacing the question marks with symbols and supplying a hash with values for the matching symbol keys as a second parameter.
Product.where("name = :name AND sku = :sku AND created_at > :date",
name: "Space Toilet", sku: 80800, date: '2009-01-01')
During a quick discussion on IRC about this final form, Robby Russell gave me the following clever snippet:
Message.where("subject LIKE :foo OR body LIKE :foo", foo: '%woah%')
In other words, when you’re using named placeholders (versus question mark characters) you can use the same bind variable more than once. Like, whoa!
Simple hash conditions like this are very common and useful, but they will only generate conditions based on equality with SQL’s AND operator.
User.where(login: login, password: password).first
If you want logic other than AND, you’ll have to use one of the other forms available.
5.6.1.2 Boolean Conditions
It’s particularly important to take care in specifying conditions that include boolean values. Databases have various different ways of representing boolean values in columns. Some have native boolean datatypes, and others use a single character, often 1 and 0 or T and F (or even Y and N). Rails will transparently handle the data conversion issues for you if you pass a Ruby boolean object as your parameter:
Timesheet.where('submitted = ?', true)
5.6.1.3 Nil Conditions
Rails expert Xavier Noria reminds us to take care in specifying conditions that might be nil. Using a question mark doesn’t let Rails figure out that a nil supplied as the value of a condition should probably be translated into IS NULL in the resulting SQL query.
Compare the following two find examples and their corresponding SQL queries to understand this common gotcha. The first example does not work as intended, but the second one does work:
>> User.where('email = ?', nil)
User Load (151.4ms) SELECT * FROM users WHERE (email = NULL)
>> User.where(:email => nil)
User Load (15.2ms) SELECT * FROM users WHERE (users.email IS NULL)
5.6.2 order(*clauses)
The order method takes one or more symbols (representing column names) or a fragment of SQL, specifying the desired ordering of a result set:
Timesheet.order('created_at desc')
The SQL spec defaults to ascending order if the ascending/descending option is omitted, which is exactly what happens if you use symbols.
Timesheet.order(:created_at)
As of Rails 4, order can also accept hash arguments, eliminating the need of writing SQL for descending order clauses.
Timesheet.order(created_at: :desc)
Wilson says… The SQL spec doesn’t prescribe any particular ordering if no ‘order by’ clause is specified in the query. That seems to trip people up, since the common belief is that ‘ORDER BY id ASC’ is the default. |
5.6.2.1 Random Ordering
The value of the :order option is not validated by Rails, which means you can pass any code that is understood by the underlying database, not just column/direction tuples. An example of why that is useful is when wanting to fetch a random record:
1 # MySQL
2 Timesheet.order('RAND()')
3
4 # Postgres
5 Timesheet.order('RANDOM()')
6
7 # Microsoft SQL Server
8 Timesheet.order('NEWID()') # uses random uuids to sort
9
10 # Oracle
11 Timesheet.order('dbms_random.value').first
Remember that ordering large datasets randomly is known to perform terribly on most databases, particularly MySQL.
Tim says… A clever, performant and portable way to get a random record is to generate a random offset in Ruby. Timsheet.limit(1).offset(rand(Timesheet.count)).first |
5.6.3 limit(number) and offset(number)
The limit method takes an integer value establishing a limit on the number of rows to return from the query. The offset method, which must be chained to limit, specifies the number of rows to skip in the result set and is 0-indexed. (At least it is in MySQL. Other databases may be 1-indexed.) Together these options are used for paging results.
For example, a call to find for the second page of 10 results in a list of timesheets is:
Timesheet.limit(10).offset(10)
Depending on the particulars of your application’s data model, it may make sense to always put some limit on the maximum amount of Active Record objects fetched in any one specific query. Letting the user trigger unbounded queries pulling thousands of Active Record objects into Rails at one time is a recipe for disaster.
5.6.4 select(*clauses)
By default, Active Record generates SELECT * FROM queries, but it can be changed if, for example, you want to do a join, but not include the joined columns. Or if you want to add calculated columns to your result set, like this:
>> b = BillableWeek.select("mon_hrs + tues_hrs as two_day_total").first
=> #<BillableWeek ...>
>> b.two_day_total
=> 16
Now, if you actually want to fully use objects with additional attributes that you’ve added via the select method, don’t forget the * clause:
>> b = BillableWeek.select(:*, "mon_hrs + tues_hrs as two_day_total").first
=> #<BillableWeek id: 1...>
Keep in mind that columns not specified in the query, whether by * or explicitly, will not be populated in the resulting objects! So, for instance, continuing the first example, trying to access created_at on b has unexpected results:
ActiveModel::MissingAttributeError: missing attribute: created_at
5.6.5 from(*tables)
The from method allows you to modify the table name(s) portion of the SQL statements generated by Active Record. You can provide a custom value if you need to include extra tables for joins, or to reference a database view.
Here’s an example of usage from an application that features tagging:
1 defself.find_tagged_with(list)
2 select("#{table_name}.*").
3 from("#{table_name}, tags, taggings").
4 where("#{table_name}.#{primary_key} = taggings.taggable_id
5 and taggings.tag_id = tags.id
6 and tags.name IN (?)",
7 Tag.parse(list))
8 end
(If you’re wondering why table_name is used instead of a an explicit value, it’s because this code is mixed into a target class using Ruby modules. That subject is covered in Chapter 9, “Advanced Active Record”.)
5.6.6 exists?
A convenience method for checking the existence of records in the database is included in ActiveRecord as the aptly named exists? method. It takes similar arguments to find and instead of returning records returns a boolean for whether or not the query has results.
>> User.create(login: "mack")
=> #<User id: 1, login: "mack">
>> User.exists?(1)
=> true
>> User.exists?(login: "mack")
=> true
>> User.exists?(id: [1, 3, 5])
=> true
>> User.where(login: "mack").exists?
=> true
5.6.7 extending(*modules, &block)
Specifies one or many modules with methods that will extend the scope with additional methods.
1 modulePagination
2 def page(number)
3 # pagination code
4 end
5 end
6
7 scope = Model.all.extending(Pagination)
8 scope.page(params[:page])
5.6.8 group(*args)
Specifies a GROUP BY SQL-clause to add to the query generated by Active Record. Generally you’ll want to combine :group with the :select option, since valid SQL requires that all selected columns in a grouped SELECT be either aggregate functions or columns.
>> users = Account.select('name, SUM(cash) as money').group('name').to_a
=> [#<User name: "Joe", money: "3500">, #<User name: "Jane", money: "9245">]
Keep in mind that those extra columns you bring back might sometimes be strings if Active Record doesn’t try to typecast them. In those cases, you’ll have to use to_i and to_f to explicitly convert the string to numeric types.
>> users.first.money > 1_000_000
ArgumentError: comparison of String with 1000000 failed
from (irb):8:in '>'
5.6.9 having(*clauses)
If you need to perform a group query with a SQL HAVING clause, you use the having method
>> User.group("created_at").having(["created_at > ?", 2.days.ago])
=> [#<User name: "Joe", created_at: "2013-03-05 19:30:11">]
5.6.10 includes(*associations)
Active Record has the ability to eliminate “N+1” queries by letting you specify what associations to eager load using the includes method or option in your finders. Active Record will load those relationships with the minimum number of queries possible.
To eager load first degree associations, provide includes with an array of association names. When accessing these a database hit to load each one will no longer occur.
>> users = User.where(login: "mack").includes(:billable_weeks)
=> [#<User login: "mack">]
>> users.first.billable_weeks.each { |week| puts week }
=> #<Week start_date: "2008-05-01 00:00:00">
For second degree associations, provide a hash with the array as the value for the hash key.
>> clients = Client.includes(users: [:avatar])
=> [#<Client id: 1, name: "Hashrocket">]
You may add more inclusions following the same pattern.
>> Client.includes(
users: [:avatar, { timesheets: :billable_weeks }]
)
=> [#<Client id: 1, name: "Hashrocket">]
Similarly to includes, you may use eager_load or preload with the same syntax.
>> Client.eager_load(
users: [:avatar, { timesheets: :billable_weeks }]
)
=> [#<Client id: 1, name: "Hashrocket">]
>> Client.preload(
users: [:avatar, { timesheets: :billable_weeks }]
)
=> [#<Client id: 1, name: "Hashrocket">]
5.6.11 joins
The joins method can be useful when you’re grouping and aggregating data from other tables, but you don’t want to load the associated objects.
Buyer.select('buyers.id, count(carts.id) as cart_count').
joins('left join carts on carts.buyer_id = buyers.id').
group('buyers.id')
However, the most common usage of the join method is to allow you to eager-fetch additional objects in a single SELECT statement, a technique that is discussed at length in Chapter 7, “Active Record Associations”.
5.6.12 none
Being introduced in Rails 4 is ActiveRecord::QueryMethods.none, a chainable relation causes a query to return zero records. The query method returns ActiveRecord::NullRelation which is an implementation of the Null Object patten. It is to be used in instances where you have a method which returns a relation, but there is a condition in which you do not want the database to be queried. All subsequent chained conditions will work without issue, eliminating the need to continuously check if the object your are working with is a relation.
1 def visible
2 case role
3 when :reviewer
4 Post.published
5 when :bad_user
6 Post.none
7 end
8 end
9
10 # If chained, the following code will not break for users
11 # with a :bad_user role
12 posts = current_user.visible.where(name: params[:name])
5.6.13 readonly
Chaining the readonly method marks returned objects as read-only. You can change their attributes, but you won’t be able to save them back to the database.
>> c = Comment.readonly.first
=> #<Comment id: 1, body: "Hey beeyotch!">
>> c.body = "Keep it clean!"
=> "Keep it clean!"
>> c.save
ActiveRecord::ReadOnlyRecord: ActiveRecord::ReadOnlyRecord
5.6.14 references
The query method references is used to indicate that an association is referenced by a SQL string and therefore be joined over being loaded separately. As of Rails 4.1, adding a string condition of an included reference will result in an exception being raised.
Here is an example that selects all Teams which have a member named Tyrion:
>> Team.includes(:members).where('members.name = ?', 'Tyrion')
SQLite3::SQLException: no such column: members.name: SELECT "teams".*
FROM "teams" WHERE (members.name = 'Tyrion')
ActiveRecord::StatementInvalid: SQLite3::SQLException: no such column:
members.name: SELECT "teams".* FROM "teams" WHERE (members.name =
'Tyrion')
...
To get the above example to work in Rails 4.1, we must include query method references with the name of the association to join.
Team.includes(:members).
where("members.name = ?", 'Tyrion').references(:members)
However, if you were using the hash syntax with association conditions, it would still perform a LEFT OUTER JOIN without any exception being raised:
Team.includes(:members).where(members: { name: 'Tyrion' })
Note that ordering string SQL snippets on included associations will still work the same way without the need of references:
Team.includes(:members).order('members.name')
5.6.15 reorder
Using reorder, you can replace any existing defined order on a given relation.
>> Member.order('name DESC').reorder(:id)
Member Load (0.6ms) SELECT "members".* FROM "members" ORDER BY
"members"."id" ASC
Any subsequent calls to order will be appended to the query.
>> Member.order('name DESC').reorder(:id).order(:name)
Member Load (0.6ms) SELECT "members".* FROM "members" ORDER BY
"members".name ASC, "members"."id" ASC
5.6.16 reverse_order
A convenience method to reverse an existing order clause on a relation.
>> Member.order(:name).reverse_order
Member Load (0.4ms) SELECT "members".* FROM "members" ORDER BY
"members".name DESC
5.6.17 uniq / distinct
If you need to perform a query with a DISTINCT SQL-clause, you can use the uniq method.
>> User.select(:login).uniq
User Load (0.2ms) SELECT DISTINCT login FROM "users"
5.6.18 unscope(*args)
The unscope query method is useful when you want to remove an unwanted relation without reconstructing the entire relation chain. For example, to remove an order clause from a relation, add unscope(:order):
>> Member.order('name DESC').unscope(:order)
SELECT "members".* FROM "members"
Additionally, one can pass a hash as an argument to unscope specific :where values. This will cause only the value specified to not be included in the where clause.
Member.where(name: "Tyrion", active: true).unscope(where: :name)
is equivalent to
Member.where(active: true)
The following is a listing of the query methods unscope accepts:
· :from
· :group
· :having
· :includes
· :joins
· :limit
· :lock
· :offset
· :order
· :readonly
· :select
· :where
5.6.19 arel_table
For cases in which you want to generate custom SQL yourself through Arel, you may use the arel_table method to gain access to the Table for the class.
>> users = User.arel_table
>> users.where(users[:login].eq("mack")).to_sql
=> "SELECT `users`.`id`, `users`.`login` FROM `users` WHERE `users`.`login` = 'mack'"
You can consult the Arel documentation directly on how to construct custom queries using its DSL.18
5.7 Connections to Multiple Databases in Different Models
Connections are created via ActiveRecord::Base.establish_connection and retrieved by ActiveRecord::Base.connection. All classes inheriting from ActiveRecord::Base will use this connection. What if you want some of your models to use a different connection? You can add class-specific connections.
For example, let’s say need to access data residing in a legacy database apart from the database used by the rest of your Rails application. We’ll create a new base class that can be used by models that access legacy data. Begin by adding details for the additional database under its own key indatabase.yml. Then call establish_connection to make LegacyProjectBase and all its subclasses use the alternate connection instead.
1 classLegacyProjectBase < ActiveRecord::Base
2 establish_connection :legacy_database
3 self.abstract_class = true
4 ...
5 end
Incidentally, to make this example work with subclasses, you must specify self.abstract_class = true in the class context. Otherwise, Rails considers the subclasses of LegacyProject to be using single-table inheritance (STI), which we discuss at length in Chapter 9, “Advanced Active Record”.
Xavier says… You can easily point your base class to different databases depending on the Rails environment like this: 1 classLegacyProjectBase < ActiveRecord::Base 2 establish_connection "legacy_#{Rails.env}" 3 self.abstract_class = true 4 ... 5 end Then just add multiple entries to database.yml to match the resulting connection names. In the case of our example, legacy_development, legacy_test, etc. |
The establish_connection method takes a string (or symbol) key pointing to a configuration already defined in database.yml. Alternatively, you can pass it a literal hash of options, although it’s messy to put this sort of configuration data right into your model file instead of database.yml
classTempProject < ActiveRecord::Base
establish_connection adapter: 'sqlite3', database: ':memory:'
...
end
Rails keeps database connections in a connection pool inside the ActiveRecord::Base class instance. The connection pool is simply a Hash object indexed by Active Record class. During execution, when a connection is needed, the retrieve_connection method walks up the class-hierarchy until a matching connection is found.
5.8 Using the Database Connection Directly
It is possible to use Active Record’s underlying database connections directly, and sometimes it is useful to do so from custom scripts and for one-off or ad-hoc testing. Access the connection via the connection attribute of any Active Record class. If all your models use the same connection, then use the connection attribute of ActiveRecord::Base.
ActiveRecord::Base.connection.execute("show tables").values
The most basic operation that you can do with a connection is simply to execute a SQL statement from the DatabaseStatements module. For example, Listing 5.1 shows a method that executes a SQL file statement by statement.
Listing 5.1: Execute a SQL file line by line using active record’s connection
1 def execute_sql_file(path)
2 File.read(path).split(';').each do |sql|
3 begin
4 ActiveRecord::Base.connection.execute(#{sql}\n") unless sql.blank?
5 rescue ActiveRecord::StatementInvalid
6 $stderr.puts "warning: #{$!}"
7 end
8 end
9 end
5.8.1 The DatabaseStatements Module
The ActiveRecord::ConnectionAdapters::DatabaseStatements module mixes a number of useful methods into the connection object that make it possible to work with the database directly instead of using Active Record models. I’ve purposely left out some of the methods of this module because they are used internally by Rails to construct SQL statements dynamically and I don’t think they’re of much use to application developers.
For the sake of readability in the select_ examples below, assume that the connection object has been assigned to conn, like this:
conn = ActiveRecord::Base.connection
5.8.1.1 begin_db_transaction()
Begins a database transaction manually (and turns off Active Record’s default autocommitting behavior).
5.8.1.2 commit_db_transaction()
Commits the transaction (and turns on Active Record’s default autocommitting behavior again).
5.8.1.3 delete(sql_statement)
Executes a SQL DELETE statement provided and returns the number of rows affected.
5.8.1.4 execute(sql_statement)
Executes the SQL statement provided in the context of this connection. This method is abstract in the DatabaseStatements module and is overridden by specific database adapter implementations. As such, the return type is a result set object corresponding to the adapter in use.
5.8.1.5 insert(sql_statement)
Executes an SQL INSERT statement and returns the last autogenerated ID from the affected table.
5.8.1.6 reset_sequence!(table, column, sequence = nil)
Used in Oracle and Postgres; updates the named sequence to the maximum value of the specified table’s column.
5.8.1.7 rollback_db_transaction()
Rolls back the currently active transaction (and turns on auto-committing). Called automatically when a transaction block raises an exception or returns false.
5.8.1.8 select_all(sql_statement)
Returns an array of record hashes with the column names as keys and column values as values.
conn.select_all("select name from businesses limit 5")
=> [{"name"=>"Hopkins Painting"}, {"name"=>"Whelan & Scherr"},
{"name"=>"American Top Security Svc"}, {"name"=>"Life Style Homes"},
{"name"=>"378 Liquor Wine & Beer"}]
5.8.1.9 select_one(sql_statement)
Works similarly to select_all, but returns only the first row of the result set, as a single Hash with the column names as keys and column values as values. Note that this method does not add a limit clause to your SQL statement automatically, so consider adding one to queries on large datasets.
>> conn.select_one("select name from businesses")
=> {"name"=>"New York New York Salon"}
5.8.1.10 select_value(sql_statement)
Works just like select_one, except that it returns a single value: the first column value of the first row of the result set.
>> conn.select_value("select * from businesses limit 1")
=> "Cimino's Pizza"
5.8.1.11 select_values(sql_statement)
Works just like select_value, except that it returns an array of the values of the first column in all the rows of the result set.
>> conn.select_values("select * from businesses limit 5")
=> ["Ottersberg Christine E Dds", "Bally Total Fitness", "Behboodikah,
Mahnaz Md", "Preferred Personnel Solutions", "Thoroughbred Carpets"]
5.8.1.12 update(sql_statement)
Executes the update statement provided and returns the number of rows affected. Works exactly like delete.
5.8.2 Other Connection Methods
The full list of methods available on connection, which returns an instance of the underlying database adapter, is fairly long. Most of the Rails adapter implementations define their own custom versions of these methods. That makes sense, since all databases have slight variations in how they handle SQL and very large variations in how they handle extended commands, such as for fetching metadata.
A peek at abstract_adapter.rb shows us the default method implementations:
1 ...
2
3 # Returns the human-readable name of the adapter. Use mixed case - one
4 # can always use downcase if needed.
5 def adapter_name
6 'Abstract'
7 end
8
9 # Does this adapter support migrations? Backend specific, as the
10 # abstract adapter always returns +false+.
11 def supports_migrations?
12 false
13 end
14
15 # Can this adapter determine the primary key for tables not attached
16 # to an Active Record class, such as join tables? Backend specific, as
17 # the abstract adapter always returns +false+.
18 def supports_primary_key?
19 false
20 end
21
22 ...
In the following list of method descriptions and code samples, I’m accessing the connection of our sample time_and_expenses application in the Rails console, and again I’ve assigned connection to a local variable named conn, for convenience.
5.8.2.1 active?
Indicates whether the connection is active and ready to perform queries.
5.8.2.2 adapter_name
Returns the human-readable name of the adapter, as in the following example:
>> conn.adapter_name
=> "SQLite"
5.8.2.3 disconnect! and reconnect!
Closes the active connection or closes and opens a new one in its place, respectively.
5.8.2.4 raw_connection
Provides access to the underlying database connection. Useful for when you need to execute a proprietary statement or you’re using features of the Ruby database driver that aren’t necessarily exposed in Active Record. (In trying to come up with a code sample for this method, I was able to crash the Rails console with ease. There isn’t much in the way of error checking for exceptions that you might raise while mucking around with raw_connection.)
5.8.2.5 supports_count_distinct?
Indicates whether the adapter supports using DISTINCT within COUNT in SQL statements. This is true for all adapters except SQLite, which therefore requires a workaround when doing operations such as calculations.
5.8.2.6 supports_migrations?
Indicates whether the adapter supports migrations.
5.8.2.7 tables
Produces a list of tables in the underlying database schema. It includes tables that aren’t usually exposed as Active Record models, such as schema_info and sessions.
>> conn.tables
=> ["schema_migrations", "users", "timesheets", "expense_reports",
"billable_weeks", "clients", "billing_codes", "sessions"]
5.8.2.8 verify!(timeout)
Lazily verify this connection, calling active? only if it hasn’t been called for timeout seconds.
5.9 Other Configuration Options
In addition to the configuration options used to instruct Active Record on how to handle naming of tables and primary keys, there are a number of other settings that govern miscellaneous functions. Set them in an initializer.
5.9.0.1 ActiveRecord::Base.default_timezone
Tells Rails whether to use Time.local (using :local) or Time.utc (using :utc) when pulling dates and times from the database. Defaults to :local
5.9.0.2 ActiveRecord::Base.schema_format
Specifies the format to use when dumping the database schema with certain default rake tasks. Use the :sql option to have the schema dumped as potentially database-specific SQL statements. Just beware of incompatibilities if you’re trying to use the :sql option with different databases for development and testing. The default option is :ruby, which dumps the schema as an ActiveRecord::Schema file that can be loaded into any database that supports migrations.
5.9.0.3 ActiveRecord::Base.store_full_sti_class
Specifies whether Active Record should store the full constant name including namespace when using Single-Table Inheritance (STI), covered in Chapter 9, “Advanced Active Record”.
5.10 Conclusion
This chapter covered the fundamentals of Active Record, the framework included with Ruby on Rails for creating database-bound model classes. We’ve learned how Active Record expresses the convention over configuration philosophy that is such an important part of the Rails way, and how to make settings manually, which override the conventions in place.
We’ve also looked at the methods provided by ActiveRecord::Base, the parent class of all persistent models in Rails, which include everything you need to do basic CRUD operations: Create, Read, Update, and Delete. Finally, we reviewed how to drill through Active Record to use the database connection whenever you need to do so.
In the following chapter, we continue our coverage of Active Record by learning how migrations help evolve an application’s database schema.