RESTful Java with JAX-RS 2.0 (2013)

Part I. REST and the JAX-RS Standard

Chapter 2. Designing RESTful Services

In Chapter 1, I gave you a brief overview of REST and how it relates to HTTP. Although it is good to obtain a solid foundation in theory, nothing can take the place of seeing theory put into practice. So, let’s define a RESTful interface for a simple order entry system of a hypothetical ecommerce web store. Remote distributed clients will use this web service to purchase goods, modify existing orders in the system, and view information about customers and products.

In this chapter, we will start off by examining the simple underlying object model of our service. After walking through the model, we will add a distributed interface to our system using HTTP and the architectural guidelines of REST. To satisfy the addressability requirements of REST, we will first have to define a set of URIs that represent the entry points into our system. Since RESTful systems are representation-oriented, we will next define the data format that we will use to exchange information between our services and clients. Finally, we will decide which HTTP methods are allowed by each exposed URI and what those methods do. We will make sure to conform to the uniform, constrained interface of HTTP when doing this step.

The Object Model

The object model of our order entry system is very simple. Each order in the system represents a single transaction or purchase and is associated with a particular customer. Orders are made up of one or more line items. Line items represent the type and number of each product purchased.

Based on this description of our system, we can deduce that the objects in our model are Order, Customer, LineItem, and Product. Each data object in our model has a unique identifier, which is the integer id property. Figure 2-1 shows a UML diagram of our object model.

Figure 2-1. Order entry system object model

We will want to browse all orders as well as each individual order in our system. We will also want to submit new orders and update existing ones. Finally, we will want to have the ability to cancel and delete existing orders. The OrderEntryService object represents the operations we want to perform on our Order, Customer, LineItem, and Product objects.

Model the URIs

The first thing we are going to do to create our distributed interface is define and name each of the distributed endpoints in our system. In a RESTful system, endpoints are usually referred to as resources and are identified using a URI. URIs satisfy the addressability requirements of a RESTful service.

In our object model, we will be interacting with Orders, Customers, and Products. These will be our main, top-level resources. We want to be able to obtain lists of each of these top-level items and to interact with individual items. LineItems are aggregated within Order objects so they will not be a top-level resource. We could expose them as a subresource under one particular Order, but for now, let’s assume they are hidden by the data format. Given this, here is a list of URIs that will be exposed in our system:

/orders

/orders/{id}

/products

/products/{id}

/customers

/customers/{id}

NOTE

You’ll notice that the nouns in our object model have been represented as URIs. URIs shouldn’t be used as mini-RPC mechanisms and should not identify operations. Instead, you should use a combination of HTTP methods and the data format to model unique operations in your distributed RESTful system.
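To make the URI list a little more concrete, here is a minimal sketch of how a server might recognize those six URIs. The class name ResourcePath and its fields are hypothetical and exist only for illustration; they are not part of JAX-RS, which handles this kind of matching for you.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: maps a request path onto the URI templates listed above. */
final class ResourcePath {
    // Matches /orders, /products, or /customers, optionally followed by a numeric id.
    private static final Pattern TEMPLATE =
            Pattern.compile("^/(orders|products|customers)(?:/(\\d{1,9}))?$");

    final String collection; // "orders", "products", or "customers"
    final Integer id;        // null when the path names the whole collection

    private ResourcePath(String collection, Integer id) {
        this.collection = collection;
        this.id = id;
    }

    /** Returns the parsed path, or null if the path matches none of the templates. */
    static ResourcePath parse(String path) {
        Matcher m = TEMPLATE.matcher(path);
        if (!m.matches()) {
            return null;
        }
        String idText = m.group(2);
        return new ResourcePath(m.group(1), idText == null ? null : Integer.valueOf(idText));
    }
}
```

Notice that a path such as /orders/cancel is rejected: the templates only name things, never operations.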

Defining the Data Format

One of the most important things we have to do when defining a RESTful interface is determine how our resources will be represented over the wire to our clients. XML is perhaps one of the most popular formats on the Web and can be processed by most modern languages, so let’s choose that. JSON is also a popular format, as it is more condensed and JavaScript can interpret it directly (great for Ajax applications), but let’s stick to XML for now.

Generally, you would define an XML schema for each representation you want to send across the wire. An XML schema defines the grammar of a data format. It defines the rules about how a document can be put together. I do find, though, that when explaining things within an article (or a book), providing examples rather than schema makes things much easier to read and understand.

Read and Update Format

The XML format of our representations will look a tiny bit different when we read or update resources from the server as compared to when we create resources on the server. Let’s look at our read and update format first.

Common link element

Each format for Order, Customer, and Product will have a common XML element called link:

<link rel="self" href="http://example.com/..."/>

The link[2] element tells any client that obtains an XML document describing one of the objects in our ecommerce system where on the network the client can interact with that particular resource. The rel attribute describes the relationship between the current resource and the resource that the URI in the href attribute points to. The self value just means the link points back to the resource itself. While not that interesting on its own, link becomes very useful when we aggregate or compose information into one larger XML document.

The details

So, with the common elements described, let’s start diving into the details by first looking at our Customer representation format:

<customer id="117">

<link rel="self" href="http://example.com/customers/117"/>

<first-name>Bill</first-name>

<last-name>Burke</last-name>

<street>555 Beacon St.</street>

<city>Boston</city>

<state>MA</state>

<zip>02115</zip>

</customer>

Pretty straightforward. We just take the object model of Customer from Figure 2-1 and expand its attributes as XML elements. Product looks much the same in terms of simplicity:

<product id="543">

<link rel="self" href="http://example.com/products/543"/>

<name>iPhone</name>

<cost>$199.99</cost>

</product>

In a real system, we would, of course, have a lot more attributes for Customer and Product, but let’s keep our example simple so that it’s easier to illustrate these RESTful concepts. The Order format looks like this:

<order id="233">

<link rel="self" href="http://example.com/orders/233"/>

<total>$199.02</total>

<date>December 22, 2008 06:56</date>

<customer id="117">

<link rel="self" href="http://example.com/customers/117"/>

<first-name>Bill</first-name>

<last-name>Burke</last-name>

<street>555 Beacon St.</street>

<city>Boston</city>

<state>MA</state>

<zip>02115</zip>

</customer>

<line-items>

<line-item id="144">

<product id="543">

<link rel="self" href="http://example.com/products/543"/>

<name>iPhone</name>

<cost>$199.99</cost>

</product>

<quantity>1</quantity>

</line-item>

</line-items>

</order>

The Order data format has the top-level elements of total and date that specify the total cost of the order and the date the Order was made. Order is a great example of data composition, as it includes Customer and Product information. This is where the link element becomes particularly useful. If the client is interested in interacting with a Customer or Product that makes up the Order, it has the URI needed to interact with one of these resources.
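As a rough illustration of how a client could take advantage of those embedded links, the following sketch pulls every href out of an Order document using the XML parser that ships with the JDK. The class name LinkExtractor is hypothetical, and the code assumes the document is well-formed and uses the link element exactly as shown above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical client-side helper for the Order format shown above. */
final class LinkExtractor {
    /** Returns the href of every link element in the document, in document order. */
    static List<String> hrefs(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Ask the parser to apply its built-in limits for untrusted input.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList links = doc.getElementsByTagName("link");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < links.getLength(); i++) {
                result.add(((Element) links.item(i)).getAttribute("href"));
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse order document", e);
        }
    }
}
```

Given the Order above, the client would get back the URIs of the Order itself, its Customer, and its Product, and could issue further requests against any of them without building URIs by hand.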

Create Format

When we are creating new Orders, Customers, or Products, it doesn’t make a lot of sense to include an id attribute and link element with our XML document. The server will generate IDs when it inserts our new object into a database. We also don’t know the URI of a new object because the server also generates this. So, the XML for creating a new Product would look something like this:

<product>

<name>iPhone</name>

<cost>$199.99</cost>

</product>

Orders and Customers would follow the same pattern and leave out the id attribute and link element.
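The difference between the two formats can be sketched in a few lines of Java. The class ProductXml and its method names are hypothetical, and building XML through string concatenation is only for brevity here; a real service would more likely use an XML binding library.

```java
/** Hypothetical helper illustrating the two Product formats described above. */
final class ProductXml {
    /** Create format: no id attribute and no link element; the server assigns both. */
    static String createFormat(String name, String cost) {
        return "<product>" + fields(name, cost) + "</product>";
    }

    /** Read and update format: carries the server-assigned id and a self link. */
    static String readFormat(int id, String baseUri, String name, String cost) {
        // baseUri is assumed to be a trusted, server-configured value such as http://example.com
        return "<product id=\"" + id + "\">"
                + "<link rel=\"self\" href=\"" + baseUri + "/products/" + id + "\"/>"
                + fields(name, cost)
                + "</product>";
    }

    private static String fields(String name, String cost) {
        return "<name>" + escape(name) + "</name><cost>" + escape(cost) + "</cost>";
    }

    // Minimal escaping for element text only.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```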

Assigning HTTP Methods

The final thing we have to do is decide which HTTP methods will be exposed for each of our resources and what these methods will do. It is crucial that we do not assign functionality to an HTTP method that exceeds the specification-defined boundaries of that method. For example, an HTTP GET on a particular resource should be read-only. It should not change the state of the resource it is invoked on. Intermediate services like a proxy-cache, a CDN (Akamai), or your browser rely on you to follow the semantics of HTTP strictly so that they can perform built-in tasks like caching effectively. If you do not follow the definition of each HTTP method strictly, clients and administration tools cannot make assumptions about your services, and your system becomes more complex.
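One way to keep these boundaries in view is to write them down. The enum below is a small sketch that records, for the four methods used in this chapter, whether HTTP defines the method as safe (read-only) and whether it is idempotent. The type name HttpMethodSemantics and the retryable helper are my own illustration, not an API from any library.

```java
/** Properties the HTTP specification assigns to the methods used in this chapter. */
enum HttpMethodSemantics {
    GET(true, true),      // read-only: safe to repeat and a candidate for caching
    PUT(false, true),     // changes state, but repeating it leaves the same result
    DELETE(false, true),  // changes state, but repeating it leaves the same result
    POST(false, false);   // may change state again each time it is sent

    final boolean safe;
    final boolean idempotent;

    HttpMethodSemantics(boolean safe, boolean idempotent) {
        this.safe = safe;
        this.idempotent = idempotent;
    }

    /** A client may blindly retransmit a request only when the method is idempotent. */
    boolean retryable() {
        return idempotent;
    }
}
```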

Let’s walk through each method of our object model to determine which URIs and HTTP methods are used to represent them.

Browsing All Orders, Customers, or Products

The Order, Customer, and Product objects in our object model are all very similar in how they are accessed and manipulated. One thing our remote clients will want to do is to browse all the Orders, Customers, or Products in the system. These URIs represent these objects as a group:

/orders

/products

/customers

To get a list of Orders, Products, or Customers, the remote client will call an HTTP GET on the URI of the object group it is interested in. An example request would look like the following:

GET /products HTTP/1.1

Our service will respond with a data format that represents all Orders, Products, or Customers within our system. Here’s what a response would look like:

HTTP/1.1 200 OK

Content-Type: application/xml

<products>

<product id="111">

<link rel="self" href="http://example.com/products/111"/>

<name>iPhone</name>

<cost>$199.99</cost>

</product>

<product id="222">

<link rel="self" href="http://example.com/products/222"/>

<name>Macbook</name>

<cost>$1599.99</cost>

</product>

...

</products>

One problem with this bulk operation is that we may have thousands of Orders, Customers, or Products in our system and we may overload our client and hurt our response times. To mitigate this problem, we will allow the client to specify query parameters on the URI to limit the size of the dataset returned:

GET /orders?startIndex=0&size=5 HTTP/1.1

GET /products?startIndex=0&size=5 HTTP/1.1

GET /customers?startIndex=0&size=5 HTTP/1.1

Here we have defined two query parameters: startIndex and size. The startIndex parameter represents where in our large list of Orders, Products, or Customers we want to start sending objects from. It is a numeric index into the object group being queried. The size parameter specifies how many of those objects in the list we want to return. These parameters will be optional. The client does not have to specify them in its URI when crafting its request to the server.
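On the server, applying these two parameters amounts to slicing a list. Here is a minimal sketch; the class name Paging is hypothetical, and the defaults chosen when a parameter is missing (start at index 0, return everything that remains) are assumptions, since the text only says the parameters are optional.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper that applies the startIndex and size query parameters. */
final class Paging {
    /**
     * Returns the requested slice of items.
     * Either argument may be null because both query parameters are optional.
     */
    static <T> List<T> page(List<T> items, Integer startIndex, Integer size) {
        int start = startIndex == null ? 0 : startIndex;
        if (start < 0 || (size != null && size < 0)) {
            throw new IllegalArgumentException("startIndex and size must not be negative");
        }
        if (start >= items.size()) {
            return Collections.emptyList(); // asked for a page past the end
        }
        // Compute the end index in long arithmetic so start + size cannot overflow.
        int end = size == null
                ? items.size()
                : (int) Math.min((long) start + size, items.size());
        return new ArrayList<>(items.subList(start, end));
    }
}
```

A request for /products?startIndex=5&size=5 against seven products would return only the last two, because the slice is clipped to the end of the list.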

Obtaining Individual Orders, Customers, or Products

I mentioned earlier in this chapter that we would use a URI pattern to obtain individual Orders, Customers, or Products:

/orders/{id}

/products/{id}

/customers/{id}

We will use the HTTP GET method to retrieve individual objects in our system. Each GET invocation will return a data format that represents the object being obtained:

GET /orders/233 HTTP/1.1

For this request, the client is interested in getting a representation of the Order with an order id of 233. GET requests for Products and Customers would work the same. The HTTP response message would look something like this:

HTTP/1.1 200 OK

Content-Type: application/xml

<order id="233">...</order>

The response code is 200, “OK,” indicating that the request was successful. The Content-Type header specifies the format of our message body as XML, and finally we have the actual representation of the Order.

Creating an Order, Customer, or Product

There are two possible ways in which a client could create an Order, Customer, or Product within our order entry system: by using either the HTTP PUT or POST method. Let’s look at both ways.

Creating with PUT

The HTTP definition of PUT states that it can be used to create or update a resource on the server. To create an Order, Customer, or Product with PUT, the client simply sends a representation of the new object it is creating to the exact URI location that represents the object:

PUT /orders/233 HTTP/1.1

PUT /customers/112 HTTP/1.1

PUT /products/664 HTTP/1.1

The specification requires the server to respond with a code of 201, “Created,” if a new resource was created on the server as a result of the PUT request.

The HTTP specification also states that PUT is idempotent. Our PUT is idempotent, because no matter how many times we tell the server to “create” our Order, the same bits are stored at the /orders/233 location. Sometimes a PUT request will fail and the client won’t know if the request was delivered and processed at the server. Idempotency guarantees that it’s OK for the client to retransmit the PUT operation and not worry about any adverse side effects.
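To see what idempotency means in practice, consider this small in-memory stand-in for the server. The class PutStore is hypothetical, and returning 204 when an existing resource is replaced is just one of the responses HTTP allows for that case.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical in-memory stand-in for the server side of PUT /orders/{id}. */
final class PutStore {
    private final Map<Integer, String> orders = new HashMap<>();

    /** Stores the representation under the client-chosen id and returns the HTTP status code. */
    int put(int id, String representation) {
        Objects.requireNonNull(representation, "representation");
        boolean created = orders.put(id, representation) == null;
        // 201 Created for a new resource; 204 No Content when an existing one is replaced.
        return created ? 201 : 204;
    }

    String get(int id) {
        return orders.get(id);
    }

    int size() {
        return orders.size();
    }
}
```

Sending the same PUT twice yields 201 and then 204, but the store still holds exactly one order with exactly the same content. The status code may differ between attempts; what idempotency promises is that the state on the server does not.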

The disadvantage of using PUT to create resources is that the client has to provide the unique ID that represents the object it is creating. While it is usually possible for the client to generate this unique ID, most application designers prefer that their servers (usually through their databases) create this ID. In our hypothetical order entry system, we want our server to control the generation of resource IDs. So what do we do? We can switch to using POST instead of PUT.

Creating with POST

Creating an Order, Customer, or Product using the POST method is a little more complex than using PUT. To create an Order, Customer, or Product with POST, the client sends a representation of the new object it is creating to the parent URI of its representation, leaving out the numeric target ID. For example:

POST /orders HTTP/1.1

Content-Type: application/xml

<order>

<total>$199.02</total>

<date>December 22, 2008 06:56</date>

...

</order>

The service receives the POST message, processes the XML, and creates a new order in the database using a database-generated unique ID. While this approach works perfectly fine, we’ve left our client in a quandary. What if the client wants to edit, update, or cancel the order it just posted? What is the ID of the new order? What URI can we use to interact with the new resource? To resolve this issue, we will add a bit of information to the HTTP response message. The client would receive a message something like this:

HTTP/1.1 201 Created

Content-Type: application/xml

Location: http://example.com/orders/233

<order id="233">

<link rel="self" href="http://example.com/orders/233"/>

<total>$199.02</total>

<date>December 22, 2008 06:56</date>

...

</order>

HTTP specifies that if POST creates a new resource, the server should respond with a code of 201, “Created” (just like PUT). The Location header in the response message provides a URI to the client so it knows where to further interact with the Order that was created (e.g., if the client wanted to update the Order). It is optional whether the server sends the representation of the newly created Order with the response. Here, we send back an XML representation of the Order that was just created with the id attribute set to the one generated by our database as well as a link element.

NOTE

I didn’t pull the Location header out of thin air. The beauty of this approach is that it is defined within the HTTP specification. That’s an important part of REST—to follow the predefined behavior within the specification of the protocol you are using. Because of this, most systems are self-documenting, as the distributed interactions are already mostly defined by the HTTP specification.
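The POST interaction can be sketched with another in-memory stand-in. Everything named here (OrderCollection, Created, the counter that plays the role of a database sequence) is hypothetical and only meant to show where the id and the Location value come from.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for the server side of POST /orders. */
final class OrderCollection {
    /** Minimal response holder: the status code plus the Location header value. */
    static final class Created {
        final int status;
        final String location;

        Created(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    private final String baseUri;
    private final AtomicInteger nextId; // stands in for a database-generated id
    private final Map<Integer, String> orders = new LinkedHashMap<>();

    OrderCollection(String baseUri, int firstId) {
        this.baseUri = baseUri;
        this.nextId = new AtomicInteger(firstId);
    }

    /** Stores a new order; the server, not the client, picks the id. */
    Created post(String representation) {
        int id = nextId.getAndIncrement();
        orders.put(id, representation);
        return new Created(201, baseUri + "/orders/" + id);
    }

    int size() {
        return orders.size();
    }
}
```

Posting the same document twice creates two orders with two different Location values. That is the practical meaning of POST being nonidempotent, and it is why a client cannot blindly retransmit a POST.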

Updating an Order, Customer, or Product

We will model updating an Order, Customer, or Product using the HTTP PUT method. The client PUTs a new representation of the object it is updating to the exact URI location that represents the object. For example, let’s say we wanted to change the price of a product from $199.99 to $149.99. Here’s what the request would look like:

PUT /products/111 HTTP/1.1

Content-Type: application/xml

<product id="111">

<name>iPhone</name>

<cost>$149.99</cost>

</product>

As I stated earlier in this chapter, PUT is great because it is idempotent. No matter how many times we transmit this PUT request, the underlying Product will still have the same final state.

When a resource is updated with PUT, the HTTP specification requires that you send a response code of 200, “OK,” and a response message body or a response code of 204, “No Content,” without any response body. In our system, we will send a status of 204 and no response message.

NOTE

We could use POST to update an individual Order, but then the client would have to assume the update was nonidempotent and we would have to take duplicate message processing into account.

Removing an Order, Customer, or Product

We will model deleting an Order, Customer, or Product using the HTTP DELETE method. The client simply invokes the DELETE method on the exact URI that represents the object we want to remove. Removing an object will wipe its existence from the system.

When a resource is removed with DELETE, the HTTP specification requires that you send a response code of 200, “OK,” and a response message body or a response code of 204, “No Content,” without any response body. In our application, we will send a status of 204 and no response message.
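A removal can be sketched the same way. The class DeleteStore is hypothetical, and answering 404, “Not Found,” for an id that does not exist is my assumption; the text above only covers the successful case.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the server side of DELETE /orders/{id}. */
final class DeleteStore {
    private final Map<Integer, String> orders = new HashMap<>();

    void add(int id, String representation) {
        orders.put(id, representation);
    }

    /** Removes the order and returns the HTTP status code. */
    int delete(int id) {
        // 204 No Content follows the choice made in the text; 404 for an unknown id is an assumption.
        return orders.remove(id) != null ? 204 : 404;
    }

    boolean exists(int id) {
        return orders.containsKey(id);
    }
}
```

Like PUT, DELETE is idempotent: a repeated DELETE may get a different status code, but the order is gone after the first request and stays gone after the second.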

Cancelling an Order

So far, the operations of our object model have fit quite nicely into corresponding HTTP methods. We’re using GET for reading, PUT for updating, POST for creating, and DELETE for removing. We do have an operation in our object model that doesn’t fit so nicely. In our system, Orders can be cancelled as well as removed. While removing an object wipes it clean from our databases, cancelling only changes the state of the Order and retains it within the system. How should we model such an operation?

Overloading the meaning of DELETE

Cancelling an Order is very similar to removing it. Since we are already modeling remove with the HTTP DELETE method, one thing we could do is add an extra query parameter to the request:

DELETE /orders/233?cancel=true

Here, the cancel query parameter would tell our service that we don’t really want to remove the Order, but cancel it. In other words, we are overloading the meaning of DELETE.

While I’m not going to tell you not to do this, I will tell you that you shouldn’t do it. It is not good RESTful design. In this case, you are changing the meaning of the uniform interface. Using a query parameter in this way is actually creating a mini-RPC mechanism. HTTP specifically states that DELETE is used to delete a resource from the server, not cancel it.

States versus operations

When modeling a RESTful interface for the operations of your object model, you should ask yourself a simple question: is the operation a state of the resource? If you answer yes to this question, the operation should be modeled within the data format.

Cancelling an Order is a perfect example of this. The key with cancelling is that it is a specific state of an Order. When a client follows a particular URI that links to a specific Order, the client will want to know whether the Order was cancelled or not. Information about the cancellation needs to be in the data format of the Order. So let’s add a cancelled element to our Order data format:

<order id="233">

<link rel="self" href="http://example.com/orders/233"/>

<total>$199.02</total>

<date>December 22, 2008 06:56</date>

<cancelled>false</cancelled>

...

</order>

Since the state of being cancelled is modeled in the data format, we can now use our already defined mechanism of updating an Order to model the cancel operation. For example, we could PUT this message to our service:

PUT /orders/233 HTTP/1.1

Content-Type: application/xml

<order id="233">

<total>$199.02</total>

<date>December 22, 2008 06:56</date>

<cancelled>true</cancelled>

...

</order>

In this example, we PUT a new representation of our order with the cancelled element set to true. By doing this, we’ve changed the state of our order from viable to cancelled.
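Seen from the server, this means there is no cancel operation at all, only a field that changes when a new representation arrives. The sketch below uses hypothetical names (CancellableOrders, Order) and keeps just enough of the Order to make the point.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in showing cancellation as plain resource state. */
final class CancellableOrders {
    /** Immutable snapshot of the parts of an Order that matter here. */
    static final class Order {
        final int id;
        final String total;
        final boolean cancelled;

        Order(int id, String total, boolean cancelled) {
            this.id = id;
            this.total = total;
            this.cancelled = cancelled;
        }
    }

    private final Map<Integer, Order> orders = new HashMap<>();

    /** Handles PUT /orders/{id}: the stored representation is replaced wholesale. */
    int put(Order order) {
        return orders.put(order.id, order) == null ? 201 : 204;
    }

    Order get(int id) {
        return orders.get(id);
    }
}
```

Cancelling order 233 is then nothing more than put(new Order(233, "$199.02", true)). Repeating that call leaves the order cancelled, so the operation inherits the idempotency of PUT for free.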

This pattern of modeling an operation as the state of the resource doesn’t always fit, though. What if we expanded on our cancel example by saying that we wanted a way to clean up all cancelled orders? In other words, we want to purge all cancelled orders from our database. We can’t really model purging the same way we did cancel. While purge does change the state of our application, it is not in and of itself a state of the application.

To solve this problem, we model this operation as a subresource of /orders and we trigger a purging by doing a POST on that resource. For example:

POST /orders/purge HTTP/1.1

An interesting side effect of this is that because purge is now a URI, we can evolve its interface over time. For example, maybe GET /orders/purge returns a document that states the last time a purge was executed and which orders were deleted. What if we wanted to add some criteria for purging as well? Form parameters could be passed stating that we only want to purge orders older than a certain date. In doing this, we’re giving ourselves a lot of flexibility as well as honoring the uniform interface contract of REST.
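Here is a minimal sketch of what could sit behind POST /orders/purge. The class PurgeableOrders is hypothetical, and returning the number of purged orders is an assumption made so the effect is easy to observe; the text does not say what the response should contain.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the purge subresource. */
final class PurgeableOrders {
    // Maps each order id to its cancelled flag; the rest of the Order is omitted.
    private final Map<Integer, Boolean> cancelledById = new LinkedHashMap<>();

    void put(int id, boolean cancelled) {
        cancelledById.put(id, cancelled);
    }

    /** Removes every cancelled order and reports how many were removed. */
    int purge() {
        int before = cancelledById.size();
        // Removing through the values() view removes the matching entries from the map.
        cancelledById.values().removeIf(cancelled -> cancelled);
        return before - cancelledById.size();
    }

    boolean exists(int id) {
        return cancelledById.containsKey(id);
    }
}
```

Because purge acts on whatever happens to be cancelled at the moment it runs, it describes an action on the collection rather than a state of any single Order, which is why it gets its own URI instead of a field in the data format.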

Wrapping Up

So, we’ve taken an existing object diagram and modeled it as a RESTful distributed service. We used URIs to represent the endpoints in our system. These endpoints are called resources. We defined the data format that our clients and services will use to exchange information. Finally, for each resource, we defined which HTTP methods it allows and how those methods behave. The next step is to actually implement these services in Java. This will be the main topic for the rest of this book.


[2] I actually borrowed the link element from the Atom format. Atom is a syndication format that is used to aggregate and publish blogs and news feeds. You can find out more about Atom at http://www.w3.org/2005/Atom.