iOS 7 Programming Fundamentals: Objective-C, Xcode, and Cocoa Basics (2014)
Part I. Language
Chapter 2. Object-Based Programming
My object all sublime.
—W. S. Gilbert, The Mikado
Objective-C, the native language for programming the Cocoa API, is an object-oriented language; to use it, the programmer must have an appreciation of the nature of objects and object-based programming. There’s little point in learning the syntax of Objective-C message sending or instantiation without a clear understanding of what a message or an instance is. That is what this chapter is about.
An object, in programming, is based on the concept of an object in the real world. It’s an independent, self-contained thing. These objects, unlike purely inert objects in the real world, have abilities. So an object in programming is more like a clock than a rock; it doesn’t just sit there, but actually does something. Perhaps one could compare an object in programming more to the animate objects of the real world, as opposed to the inanimate objects, except that — unlike real-world animate things — a programming object is supposed to be predictable: in particular, it does what you tell it. In the real world, you tell a dog to sit and anything can happen; in the programming world, you tell a dog to sit and it sits. (This is why so many of us prefer programming to dealing with the real world.)
In object-based programming, a program is organized into many discrete objects. This organization can make life much easier for the programmer. Each object has abilities that are specialized for that object. You can think of this as being a little like how an automobile assembly line works. Each worker or station along the line does one thing (screw on the bumpers, or paint the door, or whatever) and does it well. You can see immediately how this organization helps the programmer. If the car is coming off the assembly line with the door badly painted, it is very likely that the blame lies with the door-painting object, so we know where to look for the bug in our code. Or, if we decide to change the color that the door is to be painted, we have but to make a small change in the door-painting object. Meanwhile, other objects just go on doing what they do. They neither know nor care what the door-painting object does or how it works. Objects are concerned with other objects only to the extent that they interact with them. Those interactions take the form of messages.
Messages and Methods
Nothing in a computer program happens unless it is instructed to happen. In a C program, all code belongs to a function and doesn’t run unless that function is called. In an object-based program, all code belongs to an object, and doesn’t run unless that object is told to run that code. All the action in an object-based program happens because an object was told to act. What does it mean to tell an object something?
An object, in object-based programming, has a well-defined set of abilities — things it knows how to do. For example, imagine an object that is to represent a dog. We can design a highly simplified, schematic dog that knows how to do an extremely limited range of things: eat, come for a walk, bark, sit, lie down, sleep. The purpose of these abilities is so that the object can be told, as appropriate, to exercise them. So, again, we can imagine our schematic dog, rather like some child’s toy robot, responding to simple commands: Eat! Come for a walk! Bark!
In object-based programming, a command directed to an object is called a message. To make the dog object eat, we send the eat message to the dog object. This mechanism of message sending is the basis of all activity in the program. The program consists entirely of objects, so its activity consists entirely of objects sending messages to one another. Messages are so crucial to the activity of an object-based program that it is tempting to propose a (somewhat circular) definition of the notion object in terms of messages: an object is a thing to which a message can be sent.
A moment ago, I said that in a C program, all code belongs to a function. The object-based analogue to a function is called a method. So, for example, a dog object might have an eat method. When the dog object is sent the eat message, it responds by calling the eat method.
It may sound as if I’m not drawing any clear distinction between a message and a method. But there is a difference. A message is what one object says to another. A method is a bundle of code that gets called. The connection between the two is not perfectly direct. You might send a message to an object that corresponds to no method of that object. For example, you might tell the dog to recite the soliloquy from Hamlet. I’m not sure what will happen if you do that; the details are implementation-dependent. (The dog might just sit there silently. Or it might get annoyed and bite you. Or, I suppose, it might nip off, read Hamlet, memorize the soliloquy, and recite it.) But that implementation-dependence is exactly the point of the distinction between message and method.
Nevertheless, in general the distinction between sending a message and calling a method won’t usually be important in real life. Most of the time, when you’re using Objective-C, your reason for sending a message to an object will be that that object implements the corresponding method and you are expecting to call that method. So sending a message to an object and calling a method of an object will appear to be the same act.
Objective-C is object-based; C is not. That is why we speak of C functions but of Objective-C methods.
Classes and Instances
We come now to an extremely characteristic and profound feature of object-based programming. Just like in the real world, every object in the object-based programming world is of some type. This type, called a class, is the object-based analogy to the data type in C. Just as a simple variable in C might be an int or a float, an object in the object-based programming world might be a Dog or an NSString. In the object-based programming world, the idea of this arrangement is to ensure that more than one individual object can be relied upon to act the same way.
There can, for example, be more than one dog. You might have a dog called Fido and I might have a dog called Rover. But both dogs know how to eat, come for a walk, and bark. In object-based programming, they know this because they both belong to the Dog class. The knowledge of how to eat, come for a walk, and bark is part of the Dog class. Your dog Fido and my dog Rover possess this knowledge solely by virtue of being Dog objects.
From the programmer’s point of view, what this means is simple: all the code you write is put into a class. All the methods you write will be part of some class or other. You don’t program an individual dog object: you program the Dog class.
But I just got through saying that an object-based program works through the sending of messages to individual objects. So even though the programmer does not write the code for an individual dog object, there still needs to be an individual dog object in order for there to be something to send a message to. It is the Dog class that knows how to bark, but it is an individual dog object that is told to bark, and that actually does bark. So the question is: if all Dog code lives in a Dog class, where do individual dogs come from?
The answer is that they have to be created in the course of the program as it runs. When the program starts out, it contains code for a Dog class, but no individual dog objects. If any barking by any dogs is to be done, the program must first create an individual dog object. This object will belong to the Dog class, so it can be sent the bark message. An individual object belonging to the Dog class (or any class) is an instance of that class. To manufacture, from a class, an actual individual object that is an instance of that class, is to instantiate that class.
Classes, then, exist from the get-go, as part of the fact that the program exists in the first place; they are where the code is. Instances, on the other hand, are manufactured, deliberately and individually, as the program runs. Each instance is manufactured from a class, it is an instance of that class, and it has methods by virtue of the fact that the class has those methods. The instance can then be sent a message; what it will do in response depends on what code the class contains in its methods. The instance is the individual thing that can be sent messages; the class, with its methods, is the locus of the thing’s ability to respond to messages (Figure 2-1).
Figure 2-1. Class and instance
Because every individual object is an instance of a class, to know what messages you can officially send to that object, you need to know at least what methods its class has endowed it with. The public knowledge of this information is that class’s API. (A class may also have methods that you’re not really supposed to call from outside that object; these would not be public and other objects couldn’t officially send those messages to an instance of that class.) That’s why Apple’s own Cocoa documentation consists largely of pages listing and describing the methods supplied by some class. For example, to know what messages you can send to an NSString object (instance), you’d start by studying the NSString class documentation. That page is really just a big list of methods, so it tells you what an NSString object can do. That isn’t everything in the world there is to know about an NSString, but it’s a big percentage of it.
I’ve said that nothing happens in a program unless a message is sent to an object. But I’ve also said that there are no instances until they are created as the program runs; at the outset, there are only classes. How, then, are these instances to be created? To make an instance as the program runs, it must be possible to send a message to something; otherwise, nothing will happen. But if there are no instances, what is the “make an instance” message to be sent to?
The answer is that classes are themselves objects and can be sent messages. And indeed, one of the most important things you can ask a class to do by sending it a message is to instantiate itself. Thus there is a step missing from our earlier Figure 2-1; it shows a class and an instance, with the code in the class and a message being sent to the instance, but it fails to show how the instance was generated from the class in the first place. A more complete picture would look like Figure 2-2.
Figure 2-2. Class and instance, take two
It thus begins to look as if there must be two kinds of message: messages that you are allowed to send to a class (such as telling the Dog class to instantiate itself) and messages that you are allowed to send to an instance (such as telling an individual dog to bark). That is exactly true. More precisely, all code lives as a method in a class, but methods are of two kinds: class methods and instance methods. If a method is a class method, you can send that message to the class. If a method is an instance method, you can send that message to an instance of the class.
In Objective-C syntax, class methods and instance methods are distinguished by the use of a plus sign or a minus sign. For example, Apple’s NSString class documentation page listing the methods of the NSString class starts out like this:
The string method is a class method. The init method is an instance method.
In general, though not exclusively, class methods tend to be factory methods — that is, methods for generating and vending an instance. This makes sense, because making an instance of itself is one of the main things you’re likely to want to ask a class to do. You might think that a class really needs only one class method for generating an instance of itself, and that is rigorously true, but classes tend to provide multiple factory methods purely as a convenience to the programmer. For example, here are three NSString class methods:
They all make instances. The first class method, string, generates an empty NSString instance (a string with no text). The second class method, stringWithFormat:, generates an NSString instance based on text that you provide, which can include transforming other values into text; for example, you might use it to start with an integer 9 and generate an NSString instance @"9". The third class method reads the contents of a file and generates an NSString instance from those contents. When you come to write your own classes, you too might well create multiple class methods that act as instance factories for your own future programming convenience.
Now that I’ve revealed that classes are objects and can be sent messages, you might be wondering why there need to be instances at all. Why doesn’t the mere existence of classes as objects suffice for object-based programming? Why would you ever bother to instantiate any of the classes? Why wouldn’t you write all your code as class methods, have the program send messages from one class object to another, and be done with it?
The answer is that instances have a feature that classes do not: instance variables. An instance variable is just what the name suggests: it’s a variable belonging to an instance. Like instance methods, instance variables are defined as part of the class. But the value of an instance variable is set as the program runs and belongs to one instance alone. In other words, different instances can have different values for the same instance variable.
For example, suppose we have a Dog class and we decide that it might be a good idea for every dog to have a name. Just as you can learn a real-world dog’s name by reading the tag on its collar, we want to be able to assign every dog instance a name and, subsequently, to learn what that name is. So, in designing the Dog class, we declare that this class has an instance variable called name, whose value is a string (probably an NSString, as we’re using Objective-C). Now when our program runs we can instantiate Dog and assign the resulting dog instance a name (that is, we can assign its name instance variable a value). We can also instantiate Dog again and assign that resulting dog instance a name. Let’s say these are two different names: one is @"Rover" and one is @"Fido". Then we’ve got two instances of Dog, and they are significantly different; they differ in the value of their name instance variables (Figure 2-3).
Figure 2-3. Instance variables
So an instance is a reflection of the instance methods of its class, but that isn’t all it is; it’s also a collection of instance variables. The class is responsible for what instance variables the instance has, but not for the values of those variables. The values can change as the program runs and apply only to a particular instance. An instance is a cluster of particular instance variable values.
In short, an instance is both code and data. The code it gets from its class and in a sense is shared with all other instances of that class, but the data belong to it alone. The data can persist as long as the instance persists. The instance has, at every moment, a state — the complete collection of its own personal instance variable values. An instance is a device for maintaining state. It’s a box for storage of data.
The Object-Based Philosophy
We may summarize the nature of objects in two phrases: encapsulation of functionality, and maintenance of state. (I used this same summary many years ago in my book REALbasic: The Definitive Guide.)
Encapsulation of functionality
Each object does its own job, and presents to the rest of the world — to other objects, and indeed in a sense to the programmer — an opaque wall whose only entrances are the methods to which it promises to respond and the actions it promises to perform when the corresponding messages are sent to it. The details of how, behind the scenes, it actually implements those actions are secreted within itself; no other object needs to know them.
Maintenance of state
Each individual instance is a bundle of data that it maintains. Typically that data is private, which means that it’s encapsulated as well; no other object knows what that data is or in what form it is kept. The only way to discover from outside what data an object is maintaining is if there’s a method that reveals it.
As an example, imagine an object whose job is to implement a stack — it might be an instance of a Stack class. A stack is a data structure that maintains a set of data in LIFO order (last in, first out). It responds to just two messages: push and pop. Push means to add a given piece of data to the set. Pop means to remove from the set the piece of data that was most recently pushed and hand it out. It’s like a stack of plates: plates are placed onto the top of the stack or removed from the top of the stack one by one, so the first plate to go onto the stack can’t be retrieved until all other subsequently added plates have been removed (Figure 2-4).
Figure 2-4. A stack
The stack object illustrates encapsulation of functionality because the outside world knows nothing of how the stack is actually implemented. It might be an array, it might be a linked list, it might be any of a number of other implementations. But a client object — an object that actually sends a push or pop message to the stack object — knows nothing of this and cares less, provided the stack object adheres to its contract of behaving like a stack. This is also good for the programmer, who can, as the program develops, safely substitute one implementation for another without harming the vast machinery of the program as a whole. And just the other way round, the stack object knows nothing and cares less about who is telling it to push or to pop, and why. It just hums along and does its job in its reliable little way.
The stack object illustrates maintenance of state because it isn’t just the gateway to the stack data — it is the stack data. Other objects can get access to that data, but only by virtue of having access to the stack object itself, and only in the manner that the stack object permits. The stack data is effectively inside the stack object; no one else can see it. All that another object can do is push or pop. If a certain object is at the top of our stack object’s stack right now, then whatever object sends the pop message to this stack object will receive that object in return. If no object sends thepop message to this stack object, then the object at the top of the stack will just sit there, waiting.
As a second example of the philosophy and nature of object-based programming at work, I’ll revert to another imaginary scenario I used in my REALbasic book. Pretend we’re writing an arcade game where the user is to “shoot” at moving “targets,” and the score increases every time a target is hit. We immediately have a sense of how we might organize our code using object-based programming and can see how object-based programming will fulfill its nature and purpose:
§ There will be a Target class. Every target object will be an instance of this class. This decision makes sense because we want every target to behave the same way. A target will need to know how to draw itself; that knowledge will be part of the Target class, which makes sense because all targets will draw themselves in the same way. Thus we have the relationship between class and instance.
§ Targets may draw themselves the same way, but they may also differ in appearance. Perhaps some targets are blue, others are red, and so on. This difference between individual targets can be expressed as an instance variable. Call it color. Every time we instantiate a target, we’ll assign it a color. The Target class’s code for drawing an individual target will look at that target’s color instance variable and use it when filling in the target’s shape. Clearly, we could extend this individualization as much as we like: targets could have different sizes, different shapes, and so on, and all of these parametric distinctions could be made on an individual basis through the use of instance variables. Thus we have both encapsulation of functionality and maintenance of state. A target has a state, the parameters that describe how it should look, and also has the ability to draw itself, expressing that state visually.
§ When a target is hit by the user, it will explode. So perhaps the Target class will have an explode instance method; thus, every target knows how to explode. One thing that should happen whenever a target explodes is that the user’s score should increase. So let’s imagine a score object — an instance of the Score class. When a target explodes, one of the things its explode instance method will do is send an increase message to the score object. Thus we have both encapsulation of functionality and maintenance of state. The score object responds indifferently to any object that sends it the increase message; it doesn’t need to know why it’s being sent that message. Nor does the score object even need to know that targets exist, or indeed that it’s part of a game. It just sits there maintaining the score, and when it receives the increase message, it increases it.
As we imagine constructing an object-based program for performing a particular task, we bear in mind the nature of objects, as sketched in this chapter. There are classes and instances. A class is a set of functionality (methods) describing what all instances of that class can do (encapsulation of functionality). Instances of the same class differ only in the value of their instance variables (maintenance of state). We plan accordingly. Objects are an organizational tool, a set of boxes for encapsulating the code that accomplishes a particular task. They are also a conceptual tool. The programmer, being forced to think in terms of discrete objects, must divide the goals and behaviors of the program into discrete tasks, each task being assigned to an appropriate object.
But no object is an island. Objects can cooperate with one another, namely by communicating with one another — that is, by sending messages to one another. The ways in which appropriate lines of communication can be arranged are innumerable. The assembly-line analogy that I used at the start of this chapter illustrates one such arrangement — first, object 1 operates upon the end-product; then it hands it off to object 2, and object 2 operates upon the end-product, and so on. But that arrangement won’t be appropriate to most tasks. Coming up with an appropriate arrangement — an architecture — for the cooperative and orderly relationship between objects is one of the most challenging aspects of object-based programming.
Using object-based programming effectively to make a program do what you want it to do while keeping it clear and maintainable is something of an art; your abilities will improve with experience. Eventually, you may want to do some further reading on effective planning and construction of the architecture of an object-based program. I recommend in particular two classic, favorite books. Refactoring, by Martin Fowler (Addison-Wesley, 1999), describes how you can get a sense that you might need to rearrange what methods belong to what classes (and how to conquer your fear of doing so). Design Patterns, by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (also known as “the Gang of Four”), is the bible on architecting object-based programs, listing all the ways you can arrange objects with the right powers and the right knowledge of one another (Addison-Wesley, 1994).