PHP Web Services (2013)
Chapter 10. Making Service Design Decisions
This is the million-dollar question: what kind of a service do I need for my next project? REST is cool, but RPC is familiar. JSON is lighter, but the client already works with XML. The API will be used by mobile consumers, or web consumers, or a reporting engine, or all of these.
There’s rarely a clear-cut “one true way” when picking the best solution for a given API, but there are some key elements that can influence how to choose a solution that will be a good fit. API design is mostly engineering, with a generous dash of common sense required as well.
The big questions you need to ask at each step are these:
1. Who will be using this API?
2. What are they trying to achieve?
3. Which technologies do they use?
With these in mind, you can consider each of the following points.
1. It can be helpful to create some user stories to represent some of the expected users and tasks that the API will serve.
2. Building APIs is all about creating an interface point that makes sense when viewed from the outside, so the users’ perspective is always the lens needed to scrutinize any decision.
3. Not every piece of data or possible piece of functionality in an application will necessarily make sense exposed over an API, so don’t be tempted to build something huge immediately!
Service Type Decisions
The first decision to make when designing any API is one that is very hard to change later: decide what kind of a service you will offer. This depends on a combination of the audience and the type of service to be created.
For users who have larger systems using technology stacks such as Java, C++, or .NET, it may be easier for them to integrate with a SOAP service. This book covered SOAP in detail in Chapter 7, but basically it is an RPC-style service and is well-supported in PHP. Between platforms, problems can arise with mixed data types (such as when a data type exists in one of the languages but not in another), so do take care when picking data types.
For everyone else, a choice can be made that is based more on what kind of an API will be needed. If it will mostly be dealing with CRUD (Create, Read, Update, Delete) operations on data, then a REST API is a strong contender. It offers a simple way to work with data records and a well-designed RESTful API is very intuitive to pick up and use (read more about REST in Chapter 8).
The biggest pitfall with a RESTful API is the need to represent everything as a resource. This means that the URLs should be crafted to contain no verbs at all, and for an API that offers functional features, it can be quite tricky to reframe those ideas, both for the creators of the API and for those using it. These types of APIs have been quite popular in recent years, but they won’t be the right choice for every situation. The worst outcome here is to create a “RESTful” API that isn’t really RESTful.
Perhaps the simplest choice is one of the RPC formats. Developers on any platform are familiar with calling methods, passing parameters, and getting values returned. These services are easy to understand and will work well even for developers with limited API experience. RPC-style APIs are also very useful when using HTTP within existing applications, either to provide some modularity or to integrate two existing systems with a functional style, but they do have their downsides. Since RPC services are usually made up entirely of POST requests, their responses generally can’t be cached by an HTTP proxy.
Consider Data Formats
A SOAP service will always use XML, but for RESTful or RPC services, the data format that fits best can be chosen. The most common options are JSON and XML, but there are also services that handle incoming form-encoded data formats, outgoing HTML formats, serialized PHP formats, YAML, and even plain text.
We saw in Chapter 6 some examples of XML being used with an RPC service, and SOAP is XML underneath. However, XML has plenty more applications than just SOAP, and can be used as the data format (or a data format) in any one of a number of different styles of service. XML allows us to mark up elements with child elements, character data, and also attributes, but produces quite a large data size in return. Therefore, XML would do well when the bandwidth used for the transfers isn’t slow or expensive, and the devices consuming the data have enough memory and processing power to handle and parse the data.
HTML as a data format is an idea that isn’t found in many textbooks, but certainly shows up in the real world on a regular basis. In its simplest form, we might return HTML in response to an AJAX request from a webpage, perhaps showing some new content in HTML on the page (something that you may already feature in your applications). It doesn’t take a huge leap of faith from this to providing HTML as an optional output format for an API, if only for reading data. An example of this is found in the RESTful Joind.in API, where HTML is offered as an output format; if you request http://api.joind.in from your browser, the API reads your Accept headers and returns the data as HTML, with the hypermedia presented as clickable hyperlinks. This serves as excellent documentation for your service.
Accepting incoming requests from a web form, or in that format, can also be very web-friendly if the users of the API are mostly web developers and it is likely to be used mostly with or from a web page. This is a step away from the pure idea of exchanging data between machines, but can be a valuable option depending on the audience of the API.
If the user stories show that different consumers will want different data formats, then the API will need to return multiple formats such as XML, JSON, and perhaps HTML as well. This needs a bit of planning, but has major advantages because every consumer of your service will be able to ask for the data in the format that is right for their scenario. An application that takes care to make use of common templates or output handlers for each data format, used by every response sent, will be able to consistently return data in multiple formats.
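The idea of one output handler per data format can be sketched roughly as follows. This is only an illustration under some assumptions: the function names (`render`, `renderJson`, `renderXml`, `renderHtml`) and the `<response>` root element are invented for this example, the data is assumed to be a flat array whose keys are valid XML element names, and unknown formats fall back to JSON.

```php
<?php
// Hypothetical output handlers: every response goes through render(),
// so each format is produced consistently in one place.

function renderJson(array $data): string
{
    return json_encode($data);
}

function renderXml(array $data): string
{
    // XMLWriter escapes the element content for us.
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('response'); // root element name is an assumption
    foreach ($data as $key => $value) {
        $writer->writeElement((string) $key, (string) $value);
    }
    $writer->endElement();
    $writer->endDocument();
    return $writer->outputMemory();
}

function renderHtml(array $data): string
{
    $items = '';
    foreach ($data as $key => $value) {
        $items .= '<dt>' . htmlspecialchars((string) $key) . '</dt>'
                . '<dd>' . htmlspecialchars((string) $value) . '</dd>';
    }
    return '<dl>' . $items . '</dl>';
}

/**
 * Turn $data into a response body for the requested format name.
 * Returns the Content-Type to send alongside the body.
 */
function render(array $data, string $format): array
{
    $handlers = [
        'json' => ['application/json', 'renderJson'],
        'xml'  => ['application/xml', 'renderXml'],
        'html' => ['text/html', 'renderHtml'],
    ];

    // Unknown format names fall back to JSON (an assumption of this sketch).
    [$contentType, $handler] = $handlers[$format] ?? $handlers['json'];

    return ['content_type' => $contentType, 'body' => $handler($data)];
}
```

A controller would then call something like `render($article, 'xml')`, send the returned content type as a header, and echo the body. Adding a new format means adding one handler, not touching every endpoint.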
As well as choosing data formats, there are other variables for which the “right” choice to make will differ between the consumers of the API. An easy example is the number of entries you return. Returning all the data is fine…until the application becomes terribly popular, and suddenly the API is returning four thousand records instead of forty! To improve this experience for everyone, APIs often offer pagination of data. As well as giving a way to specify which range of results to return, it is good practice to allow the number of results returned to be customized. A reporting server on a fast network might want all the data, whereas the mobile device with a patchy signal might only want the newest five records.
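A minimal sketch of pagination might look like the following. The function name `paginate`, the parameter names `$start` and `$limit`, and the shape of the `meta` block are assumptions made for illustration; a real API would more likely apply the range in its database query than slice an array in memory.

```php
<?php
/**
 * Return one "page" of records plus some metadata about the full set.
 * $start is a zero-based offset; $limit is the number of results wanted.
 */
function paginate(array $records, int $start, int $limit): array
{
    // Guard against nonsensical values rather than failing the request.
    $start = max(0, $start);
    $limit = max(1, $limit);

    $page = array_slice($records, $start, $limit);

    return [
        'results' => $page,
        'meta' => [
            'start' => $start,
            'count' => count($page),
            'total' => count($records),
        ],
    ];
}
```

With records sorted newest first, the mobile consumer in the example above could ask for `paginate($records, 0, 5)`, while the reporting server could request a much larger `$limit`. Returning the total lets a consumer work out whether there are more results to fetch.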
Another big variable is how much information to return with each request, and this decision usually manifests in two forms. When returning information about a particular item, should all the information be returned? And the follow-up question: should any related data be returned as well? Including data means we’ll sometimes be returning more information than needed, a bit like doing SELECT * FROM … in SQL. But if you omit data, then some consumers will have to make a large number of requests to obtain what they need.
Consider the example of the classic blog application. Should the API return the body of every article? If you’re showing the user a list of articles, you probably don’t want to show the entire text of each post, and including all the text for all the articles would result in a huge response to send. When showing an individual article, however, the body is an important piece of data to have. Allowing the consumers of your API to specify whether they need headline data or detailed data, or offering different methods depending on whether a list of outline elements or a single, in-depth record is required, will help users to get the best out of your API.
Now for the follow-up question of whether to include related data. With the hypothetical blog post application, the post record itself will include, perhaps, the ID of the author. Your API will offer a way to fetch an author by his or her ID. But that means that when the consumer retrieves a list of articles, an additional call must be made for each of the items in the list to discover the name of the author so it can be displayed to the user, and these additional calls can be slow if many of them are needed. In a situation like this, it is quite clear-cut that we would return the name of the author with each article, to save lots of round trips to the server. In the real world, few situations are quite this clear-cut, and you will have to make some decisions about when data should be included and when it should be available separately. This is where the user stories I mentioned at the beginning of this chapter will help you to gain insight into what the “right” decision is, and in fact some resources should probably be made available in “brief” and “verbose” formats to allow consumers some choices.
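To make the “brief” and “verbose” idea concrete, here is one possible sketch for the blog example. The function name `formatArticle`, the field names, and the `$authorsById` lookup array are all hypothetical; the point is only that the author's name travels with every article, while the body is included on request.

```php
<?php
/**
 * Build the API representation of one article.
 * The brief form is suitable for lists; the verbose form adds the body.
 */
function formatArticle(array $article, array $authorsById, bool $verbose = false): array
{
    $authorId = $article['author_id'];

    $out = [
        'id'          => $article['id'],
        'title'       => $article['title'],
        'author_id'   => $authorId,
        // Embedding the name saves the consumer one request per article.
        'author_name' => $authorsById[$authorId]['name'] ?? null,
    ];

    if ($verbose) {
        // The full text is only sent when the consumer asks for it.
        $out['body'] = $article['body'];
    }

    return $out;
}
```

A list endpoint would call this with `$verbose` left as `false` for each article, and the single-article endpoint (or a consumer explicitly asking for detail) would pass `true`.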
Pick Your Defaults
It’s important to offer users some choice, but also to offer a simpler path so that people can jump straight in and use your API without having to set up too many options. Every customizable option should have a default value that is used if no preference is stated. Are you missing the Accept header? Send JSON. You don’t have any pagination settings? Send the first 25 results. This approach allows people to get the best of the API very quickly and easily, and they can delve deeper to change the defaults if their requirements don’t fit well with the defaults chosen.
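One way to keep defaults in a single place is a small function that turns whatever the consumer sent into a complete set of options. This is a sketch under assumptions: the names `resolveOptions` and `formatFromAccept`, the `start` and `limit` query parameters, and the format names are invented here, and the Accept handling is deliberately naive (it looks only at the first media type listed and ignores q-values).

```php
<?php
const DEFAULT_FORMAT = 'json';
const DEFAULT_START  = 0;
const DEFAULT_LIMIT  = 25;

/**
 * Pick an output format from the Accept header, or the default if the
 * header is missing or names something we don't offer.
 */
function formatFromAccept(?string $accept): string
{
    $known = [
        'application/json' => 'json',
        'application/xml'  => 'xml',
        'text/html'        => 'html',
    ];

    if ($accept === null || trim($accept) === '') {
        return DEFAULT_FORMAT;
    }

    // Simplification: take the first media type and drop any parameters.
    $first = explode(',', $accept)[0];
    $type  = strtolower(trim(explode(';', $first)[0]));

    return $known[$type] ?? DEFAULT_FORMAT;
}

/**
 * Combine the query string and Accept header into a full set of options,
 * filling in a default for anything the consumer did not specify.
 */
function resolveOptions(array $query, ?string $accept): array
{
    return [
        'format' => formatFromAccept($accept),
        'start'  => isset($query['start']) ? (int) $query['start'] : DEFAULT_START,
        'limit'  => isset($query['limit']) ? (int) $query['limit'] : DEFAULT_LIMIT,
    ];
}
```

In a real request this might be called as `resolveOptions($_GET, $_SERVER['HTTP_ACCEPT'] ?? null)`. A consumer who sends nothing at all gets JSON and the first 25 results, and each option can be overridden independently.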
Consider whether or not you will comply with all requests, though; if a consumer requests 1,000 results that might be expensive for your API to generate, you may still only send the first 200 (or whatever makes sense for your system). Similarly, some APIs will benefit from having rate limits. This means that each client can only make a certain number of requests in a given time period. Many APIs allow a very limited number of requests for unregistered users, and may allow differing levels of access to different customers, particularly for paid-for apps. Rate limiting is a way of making sure that you guarantee an expected level of service to all users by managing the load on your servers and allowing different users to have a level of access that suits them.
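Both ideas, capping the page size and limiting request rates, can be sketched in a few lines. Treat this as an illustration only: `capLimit`, `MAX_LIMIT`, and `FixedWindowRateLimiter` are hypothetical names, the fixed-window counting scheme is just one simple approach, and the counters live in an in-memory array. Since separate PHP requests don't share memory, a real deployment would need a shared store of some kind.

```php
<?php
const MAX_LIMIT = 200;

/** Never return more than MAX_LIMIT results, whatever was requested. */
function capLimit(int $requested): int
{
    return min($requested, MAX_LIMIT);
}

/**
 * Allow each client at most $maxRequests per window of $windowSeconds.
 * Counters are kept in memory here purely for illustration.
 */
final class FixedWindowRateLimiter
{
    private int $maxRequests;
    private int $windowSeconds;
    private array $windows = []; // client ID => ['start' => int, 'count' => int]

    public function __construct(int $maxRequests, int $windowSeconds)
    {
        $this->maxRequests   = $maxRequests;
        $this->windowSeconds = $windowSeconds;
    }

    /** Record a request at Unix time $now; return false if over the limit. */
    public function allow(string $clientId, int $now): bool
    {
        // Start of the window that $now falls into.
        $windowStart = $now - ($now % $this->windowSeconds);

        $entry = $this->windows[$clientId] ?? ['start' => $windowStart, 'count' => 0];
        if ($entry['start'] !== $windowStart) {
            // A new window has begun, so the count starts again.
            $entry = ['start' => $windowStart, 'count' => 0];
        }

        if ($entry['count'] >= $this->maxRequests) {
            $this->windows[$clientId] = $entry;
            return false;
        }

        $entry['count']++;
        $this->windows[$clientId] = $entry;
        return true;
    }
}
```

Differing levels of access could be handled by creating limiters with different `$maxRequests` values for unregistered, registered, and paying clients. When `allow()` returns false, an HTTP API would typically respond with status 429 (Too Many Requests).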
This philosophy of making things easy and useful to users, with minimal effort on their part, makes the barrier to entry much lower for your application and makes the experience of using a new API one of tolerance and welcome.