Build APIs You Won't Hate: Everyone and their dog wants an API, so you should probably learn how to build them (2014)
6. Outputting Data
6.1 Introduction
In Chapter 3: Input and Output Theory we looked at the theory of the output structure and the pros and cons for various different formats. The rest of this book assumes you have picked your favorite, and it assumes that favorite is my favorite. This doesn’t matter all that much but doing everything for everyone would be an exercise in futility and boredom.
The aim of this chapter is to help you build out your controller endpoints. Assuming you have written tests for these endpoints before they exist, we can now fill up a few of those tests with green lights instead of the omnishambles of errors and fails you are most likely facing.
The examples in the first section will show a list of places, and then one specific place:
{
    "data": [
        {
            "id": 2,
            "name": "Videology",
            "lat": 40.713857,
            "lon": -73.961936,
            "created_at": "2013-04-02"
        },
        {
            "id": 1,
            "name": "Barcade",
            "lat": 40.712017,
            "lon": -73.950995,
            "created_at": "2012-09-23"
        }
    ]
}
{
    "data": {
        "id": 2,
        "name": "Videology",
        "lat": 40.713857,
        "lon": -73.961936,
        "created_at": "2013-04-02"
    }
}
6.2 The Direct Approach
The first thing that every developer tries to do is take their favorite ORM, ODM, DataMapper or Query Builder, pull up a query and wang that result directly into the output.
Dangerously bad example of passing data from the database directly as output
<?php
class PlaceController extends ApiController
{
    public function show($id)
    {
        return json_encode([
            'data' => Place::find($id)->toArray(),
        ]);
    }

    public function index()
    {
        return json_encode([
            'data' => Place::all()->toArray(),
        ]);
    }
}
This is the absolute worst idea you could have for enough reasons for me to fill up a chapter on its own, but I will try to keep it to just a section.
ORMs in Controllers: Your controller should definitely not have this sort of ORM/Query Builder logic scattered around the methods. It is done here only to keep the example to one class.
Performance: If you return “all” items then that will be fine during development, but suck when you have a thousand records in that table… or a million.
Display: PHP’s popular SQL extensions all type-cast all data coming out of a query as a string, so a MySQL “boolean” field (generally a tinyint(1) field with a value of 0 or 1) will display in the JSON output as a string with a value of "0" or "1", which is lunacy. If you’re using PostgreSQL it is even worse: the value directly output by PHP’s PostgreSQL driver is "f" or "t". Your mobile developers won’t like it one bit, and anyone looking at your public API is going to immediately consider this an amateur API. You want true or false as an actual JSON boolean, not a numeric string or a char(1).
Security: Outputting all fields can let API clients (users of all sorts) view your users’ passwords, see sensitive information like email addresses for the businesses involved (venues, partners, events, etc.), and gain access to secret keys and tokens they are generally not allowed to see. If you leak your forgotten-password tokens, for example, then you’re going to have an EXTREMELY bad time; it’s as bad as leaking the password itself.
Some ORMs have a “hidden” option to hide specific fields from being output. If you can promise that you and every single other developer on your team (now, next year and for the entire lifetime of this application) will remember about that then congratulations, you could also achieve world peace with a team that focused.
Stability: If you change the name of a database field, or modify your MongoDB document, or change the statuses available for a field between v3 and v4 then your API will continue to behave perfectly, but all of your iPhone users are going to have busted crashing applications and it is your fault. You will promise yourself that you won’t change things, but you absolutely will. Change happens.
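To make the display problem concrete, here is a minimal sketch in plain PHP. The $row array is a hypothetical stand-in for what a string-returning SQL driver might hand back, and the is_open column is invented purely for illustration.

```php
<?php
// Hypothetical row: every value is a string, as a string-returning driver would give us.
$row = [
    'id'      => '2',
    'name'    => 'Videology',
    'lat'     => '40.713857',
    'is_open' => '1', // assumed tinyint(1) "boolean" column
];

// Passing the row straight through keeps every value a JSON string.
$raw = json_encode(['data' => $row]);

// Casting each field first gives clients real JSON numbers and booleans.
$typed = json_encode(['data' => [
    'id'      => (int) $row['id'],
    'name'    => $row['name'],
    'lat'     => (float) $row['lat'],
    'is_open' => (bool) $row['is_open'],
]]);

echo $raw, PHP_EOL;
echo $typed, PHP_EOL;
```

The (bool) cast only works here because PHP treats the string "0" as false. A PostgreSQL-style "f" is a non-empty string and would cast to true, so that case needs an explicit comparison such as $value === 't'.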
So, next our theoretical developer friend will try hard-coding the output.
Laborious example of type-casting and formatting data for output
<?php
class PlaceController extends ApiController
{
    public function show($id)
    {
        $place = Place::find($id);

        return json_encode([
            'data' => [
                'id' => (int) $place->id,
                'name' => $place->name,
                'lat' => (float) $place->lat,
                'lon' => (float) $place->lon,
                'created_at' => (string) $place->created_at,
            ],
        ]);
    }

    public function index()
    {
        $places = array();

        foreach (Place::all() as $place) {
            $places[] = [
                'id' => (int) $place->id,
                'name' => $place->name,
                'lat' => (float) $place->lat,
                'lon' => (float) $place->lon,
                'created_at' => (string) $place->created_at,
            ];
        }

        return json_encode([
            'data' => $places,
        ]);
    }
}
Thanks to specifying exactly which fields to return in the JSON array, the security issues are taken care of. The type-casting of various fields turns numeric strings into integers, coordinates into floats, and that pesky Carbon (DateTime) object from Laravel into a string, instead of letting the object turn itself into an array.
The only issue this has not taken care of from the above example is performance, but that is a job for pagination which will be covered in Chapter 10.
A new issue has however been created, which should be a fairly obvious one: This is icky. Our theoretical developer now tries something else.
Considerably better approach to formatting data for output
<?php
class PlaceController extends ApiController
{
    public function show($id)
    {
        $place = Place::find($id);

        return json_encode([
            'data' => $this->transformPlaceToJson($place),
        ]);
    }

    public function index()
    {
        $places = array();
        foreach (Place::all() as $place) {
            $places[] = $this->transformPlaceToJson($place);
        }

        return json_encode([
            'data' => $places,
        ]);
    }

    private function transformPlaceToJson(Place $place)
    {
        return [
            'id' => (int) $place->id,
            'name' => $place->name,
            'lat' => (float) $place->lat,
            'lon' => (float) $place->lon,
            'created_at' => (string) $place->created_at,
        ];
    }
}
Certainly much better, but what if a different controller wants to show a place at any point? You could theoretically move all of these transform methods to a new class or shove them in the ApiController, but that would just be odd.
Really you want to make what I have come to call “Transformers”, partly because the name is awesome and partly because that is what they are doing.
These are essentially just classes which have a transform method, which does the same as the transformPlaceToJson() above, but to avoid you having to learn how to make your own I have released a PHP package which takes care of it: Fractal.
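Before reaching for a package, it may help to see how small such a class can be. The following is a hand-rolled sketch in plain PHP, not Fractal's API: the Place class here is a hypothetical stand-in for an ORM model, and PlaceTransformer is just an ordinary class with a transform() method.

```php
<?php
// Hypothetical stand-in for an ORM model; values arrive as strings, as from a database.
final class Place
{
    public $id;
    public $name;
    public $lat;
    public $lon;
    public $created_at;

    public function __construct($id, $name, $lat, $lon, $created_at)
    {
        $this->id = $id;
        $this->name = $name;
        $this->lat = $lat;
        $this->lon = $lon;
        $this->created_at = $created_at;
    }
}

// A "Transformer": one class whose only job is to shape a Place for output.
final class PlaceTransformer
{
    public function transform(Place $place): array
    {
        return [
            'id' => (int) $place->id,
            'name' => $place->name,
            'lat' => (float) $place->lat,
            'lon' => (float) $place->lon,
            'created_at' => (string) $place->created_at,
        ];
    }
}

$transformer = new PlaceTransformer();
$places = [
    new Place('2', 'Videology', '40.713857', '-73.961936', '2013-04-02'),
    new Place('1', 'Barcade', '40.712017', '-73.950995', '2012-09-23'),
];

// One resource: "data" is a single object. Many resources: "data" is a list.
$single = ['data' => $transformer->transform($places[0])];
$list = ['data' => array_map([$transformer, 'transform'], $places)];

echo json_encode($list), PHP_EOL;
```

Because the class knows nothing about controllers, any controller (or a unit test) can reuse it, which is the gap the private transformPlaceToJson() method left open.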
6.3 Transformations with Fractal
With Fractal, Transformers are created as either a callback or an instance of a class extending League\Fractal\TransformerAbstract. They do exactly the job that our transformPlaceToJson() method did, but they live on their own, are easily unit-testable (if that floats your boat) and remove a lot of presentation clutter from the controller.
Fractal does a lot more than that, which will be explored later on, but it covers transformation perfectly and takes care of the security, stability and display concerns raised earlier.
While other languages have great solutions for this already, PHP seemed to be rather lacking for this exact purpose. Some call it “Data Marshalling” or “Nested Serialization”, but it is all achieving roughly the same goal: take potentially complicated data from a range of stores and turn it into a consistent output.
· Jbuilder looks fairly slick for the Ruby crowd
· Tweet other suggestions to @philsturgeon
That is the end of theory in this book. We will now be working with code. Open up the Sample Code ZIP file or head to the GitHub repo and extract it somewhere useful.
$ cd chapter6
$ php artisan serve
Laravel development server started on http://localhost:8000
Open your browser and go to http://localhost:8000/places, and there is a list of places looking like this:
Fractal default JSON structure using the JSONView extension for Chrome
This is a Laravel 4 application but only because it has migrations and seeding and I like it. This is made up of a few bits of PHP that would work in any framework, and the approach works in any language.
· composer.json - Added an autoloadable folder using PSR-0 to allow my own code to be loaded
· app/controllers/ApiController.php - Insanely simple base controller for wrapping responses
· app/controllers/PlaceController.php - Grab some data and pass it to the ApiController
Other than defining some basic GET routes in app/routes.php that is basically all that is being done.
The PlaceController looks like this:
Example of a controller using Fractal to output data
<?php
use App\Transformer\PlaceTransformer;

class PlaceController extends ApiController
{
    public function index()
    {
        $places = Place::take(10)->get();
        return $this->respondWithCollection($places, new PlaceTransformer);
    }

    public function show($id)
    {
        $place = Place::find($id);
        return $this->respondWithItem($place, new PlaceTransformer);
    }
}
The “raw data” (happens to be an ORM model but could be anything) is sent back with the appropriate convenience method and a transformer instance is provided too. These respondWithCollection() and respondWithItem() methods come from ApiController, and their job is just to create Fractal instances without exposing as many classes to interact with.
The PlaceTransformer looks like this:
<?php namespace App\Transformer;

use Place;
use League\Fractal\TransformerAbstract;

class PlaceTransformer extends TransformerAbstract
{
    /**
     * Turn this item object into a generic array
     *
     * @param Place $place
     * @return array
     */
    public function transform(Place $place)
    {
        return [
            'id' => (int) $place->id,
            'name' => $place->name,
            'lat' => (float) $place->lat,
            'lon' => (float) $place->lon,
            'address1' => $place->address1,
            'address2' => $place->address2,
            'city' => $place->city,
            'state' => $place->state,
            // Note: a float cast drops leading zeros ("07302" becomes 7302);
            // use a string cast if your ZIP codes can start with 0.
            'zip' => (float) $place->zip,
            'website' => $place->website,
            'phone' => $place->phone,
        ];
    }
}
Simple.
The ApiController is kept super simple at this point too:
Simple ApiController for basic responses using Fractal
<?php

use League\Fractal\Resource\Collection;
use League\Fractal\Resource\Item;
use League\Fractal\Manager;

class ApiController extends Controller
{
    /** @var int HTTP status code sent with the response */
    protected $statusCode = 200;

    /** @var Manager Fractal manager used to build the output */
    protected $fractal;

    public function __construct(Manager $fractal)
    {
        $this->fractal = $fractal;
    }

    public function getStatusCode()
    {
        return $this->statusCode;
    }

    public function setStatusCode($statusCode)
    {
        $this->statusCode = $statusCode;
        return $this;
    }

    // Wrap a single resource in a Fractal Item and respond with it.
    protected function respondWithItem($item, $callback)
    {
        $resource = new Item($item, $callback);

        $rootScope = $this->fractal->createData($resource);

        return $this->respondWithArray($rootScope->toArray());
    }

    // Wrap multiple resources in a Fractal Collection and respond with it.
    protected function respondWithCollection($collection, $callback)
    {
        $resource = new Collection($collection, $callback);

        $rootScope = $this->fractal->createData($resource);

        return $this->respondWithArray($rootScope->toArray());
    }

    // Convert any array to a JSON response using the current status code.
    protected function respondWithArray(array $array, array $headers = [])
    {
        return Response::json($array, $this->statusCode, $headers);
    }
}
The method respondWithArray() takes a general array to convert into JSON, which will prove useful with errors. Other than that everything you return will be a Fractal Item, or a Collection.
6.4 Hiding Schema Updates
Schema updates happen, and they can be hard to avoid. If the change in question is simply a renamed field then this is insanely easy to handle:
Before
'website' => $place->website,
After
'website' => $place->url,
By changing the right (our internal data structure) and keeping the left the same (the external field name) we maintain control over the stability for the client applications.
Sometimes it is a status change. A new status is added, or the change is fairly drastic and the statuses all change, but the old API version is still expecting the old ones. Maybe someone changed “available” to “active” to be consistent with the other tables, because the original developer was as consistent and logical as a rabid ferret. The database now stores “active”, so the transformer maps it back to the “available” that existing clients expect.
Before
'status' => $place->status,
After
'status' => $place->status === 'active' ? 'available' : $place->status,
Gross, but useful.
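Once more than one value needs translating, a small lookup table can keep this tidier than a chain of ternaries. The sketch below is plain PHP with hypothetical names (LEGACY_STATUS_MAP, legacyStatus(), transformPlace()); it assumes the table now stores "active" and a url column, while older clients still expect "available" and a website field.

```php
<?php
// Hypothetical map from current internal statuses to the values older clients expect.
const LEGACY_STATUS_MAP = [
    'active' => 'available',
];

// Translate an internal status; anything not in the map passes through unchanged.
function legacyStatus(string $status): string
{
    return LEGACY_STATUS_MAP[$status] ?? $status;
}

// Keys on the left are the public contract; values on the right follow the schema.
function transformPlace($place): array
{
    return [
        'website' => $place->url, // column renamed internally, public name unchanged
        'status' => legacyStatus($place->status),
    ];
}

// Stand-in for a model loaded after the schema change.
$place = (object) ['url' => 'http://example.com', 'status' => 'active'];

print_r(transformPlace($place));
```

The design choice is the same as in the one-liners above: the left-hand side never changes for a given API version, and every schema change is absorbed on the right.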
6.5 Outputting Errors
Exactly how to output errors is something I personally am still toying with. The current front-runner is adding convenience methods to the ApiController which handle global routes with a constant as the code and an HTTP status code set, with an optional message in case I want to override the default.
Simple error codes and responses added to ApiController
<?php

// ...

class ApiController extends Controller
{
    // ...

    const CODE_WRONG_ARGS = 'GEN-FUBARGS';
    const CODE_NOT_FOUND = 'GEN-LIKETHEWIND';
    const CODE_INTERNAL_ERROR = 'GEN-AAAGGH';
    const CODE_UNAUTHORIZED = 'GEN-MAYBGTFO';
    const CODE_FORBIDDEN = 'GEN-GTFO';

    // ...

    protected function respondWithError($message, $errorCode)
    {
        // An error response should never go out with a 200 status.
        if ($this->statusCode === 200) {
            trigger_error(
                "You better have a really good reason for erroring on a 200...",
                E_USER_WARNING
            );
        }

        return $this->respondWithArray([
            'error' => [
                'code' => $errorCode,
                'http_code' => $this->statusCode,
                'message' => $message,
            ]
        ]);
    }

    /**
     * Generates a Response with a 403 HTTP header and a given message.
     *
     * @return Response
     */
    public function errorForbidden($message = 'Forbidden')
    {
        return $this->setStatusCode(403)
            ->respondWithError($message, self::CODE_FORBIDDEN);
    }

    /**
     * Generates a Response with a 500 HTTP header and a given message.
     *
     * @return Response
     */
    public function errorInternalError($message = 'Internal Error')
    {
        return $this->setStatusCode(500)
            ->respondWithError($message, self::CODE_INTERNAL_ERROR);
    }

    /**
     * Generates a Response with a 404 HTTP header and a given message.
     *
     * @return Response
     */
    public function errorNotFound($message = 'Resource Not Found')
    {
        return $this->setStatusCode(404)
            ->respondWithError($message, self::CODE_NOT_FOUND);
    }

    /**
     * Generates a Response with a 401 HTTP header and a given message.
     *
     * @return Response
     */
    public function errorUnauthorized($message = 'Unauthorized')
    {
        return $this->setStatusCode(401)
            ->respondWithError($message, self::CODE_UNAUTHORIZED);
    }

    /**
     * Generates a Response with a 400 HTTP header and a given message.
     *
     * @return Response
     */
    public function errorWrongArgs($message = 'Wrong Arguments')
    {
        return $this->setStatusCode(400)
            ->respondWithError($message, self::CODE_WRONG_ARGS);
    }
}
This basically allows for generic error messages to be returned in your controller without having to think too much about the specifics.
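To see what a client actually receives, here is a small standalone sketch. errorBody() is a hypothetical helper that builds the same array respondWithError() passes to respondWithArray(); it is plain PHP and does not touch Laravel's Response class.

```php
<?php
// Hypothetical helper mirroring the array built inside respondWithError().
function errorBody(string $message, string $errorCode, int $httpCode): array
{
    return [
        'error' => [
            'code' => $errorCode,
            'http_code' => $httpCode,
            'message' => $message,
        ],
    ];
}

// The body a 404 from errorNotFound() would carry with its default message.
$body = errorBody('Resource Not Found', 'GEN-LIKETHEWIND', 404);

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```

The output is a JSON object with a single "error" key holding the code, the HTTP status as an integer, and the message, so clients can branch on "code" without parsing the human-readable text.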
Controller using Fractal, combined with a simple error response
<?php
use App\Transformer\PlaceTransformer;

class PlaceController extends ApiController
{
    public function index()
    {
        $places = Place::take(10)->get();
        return $this->respondWithCollection($places, new PlaceTransformer);
    }

    public function show($id)
    {
        $place = Place::find($id);

        if (! $place) {
            return $this->errorNotFound(
                'Did you just invent an ID and try loading a place? Muppet.'
            );
        }

        return $this->respondWithItem($place, new PlaceTransformer);
    }
}
Other “Place” specific errors could go directly into the PlaceController as methods just like these, with their own constants in the controller, picking a statusCode in the method or relying on one as an argument.
6.6 Testing this Output
You have already seen how to test your endpoints using the Gherkin syntax in Chapter 5: Endpoint Testing, so we can apply that testing logic to this output:
Feature: Places

Scenario: Listing places without search criteria is not possible
    When I request "GET /places"
    Then I get a "400" response

Scenario: Finding a specific place
    When I request "GET /places/1"
    Then I get a "200" response
    And scope into the "data" property
        And the properties exist:
            """
            id
            name
            lat
            lon
            address1
            address2
            city
            state
            zip
            website
            phone
            created_at
            """
        And the "id" property is an integer

Scenario: Searching non-existent place
    When I request "GET /places?q=c800e42c377881f8202e7dae509cf9a516d4eb59&lat=1&lon=1"
    Then I get a "200" response
    And the "data" property contains 0 items

Scenario: Searching places with filters
    When I request "GET /places?lat=40.76855&lon=-73.9945&q=cheese"
    Then I get a "200" response
    And the "data" property is an array
    And scope into the first "data" property
        And the properties exist:
            """
            id
            name
            lat
            lon
            address1
            address2
            city
            state
            zip
            website
            phone
            created_at
            """
    And reset scope
This is again using the FeatureContext.php provided in the sample code, which makes it really easy to test output. We are again assuming that all output is in a "data" element, which is either an object (when one resource has been requested) or an array of objects (multiple resources or a collection have been requested).
When you are searching for data you want to ensure that not finding any data doesn’t explode. This can be down to your controller processing on output and failing because what should be an array is null, or because some PHP collection class is missing methods, etc. This is why we perform the search with a hardcoded invalid search term, then check that it returns an empty collection:
{
    "data": []
}
The line And the "data" property contains 0 items will cover this. Then we can search for valid terms, knowing that our database seeder has made sure at least one Place has the keyword “cheese” in the name. Using the line And scope into the first "data" property the scope changes to be inside the first data item returned, and the properties can be checked for existence too. If no data, or required fields are missing, this test will fail.
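If it helps to picture what those steps check, the logic can be sketched in a few lines of plain PHP. This is not the FeatureContext.php from the sample code: missingProperties() is a hypothetical helper and both response bodies are made up for illustration.

```php
<?php
// Hypothetical helper: returns the names from $expected that are absent in $item.
// array_keys() is used so that a property holding null still counts as present.
function missingProperties(array $item, array $expected): array
{
    return array_values(array_diff($expected, array_keys($item)));
}

// Made-up response bodies standing in for real API output.
$emptyResponse = json_decode('{"data": []}', true);
$foundResponse = json_decode(
    '{"data": [{"id": 1, "name": "Cheese Shop", "lat": 40.7, "lon": -73.9}]}',
    true
);

// Roughly: And the "data" property contains 0 items
$isEmpty = count($emptyResponse['data']) === 0;

// Roughly: And scope into the first "data" property
$first = $foundResponse['data'][0];

// Roughly: And the properties exist
$missing = missingProperties($first, ['id', 'name', 'lat', 'lon']);
$missingZip = missingProperties($first, ['id', 'zip']);

var_dump($isEmpty, $missing, $missingZip);
```

An empty $missing array means the scenario would pass; any name left in it (here "zip" in the second check) is a field the test would report as absent.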
6.7 Homework
Your homework is to take apart the sample application, fit it into your API and try to build valid output for as many of your GET endpoints as possible. Check the data types and make sure the array structure is being output in the way you expect using the test example above.
With valid output covered and basic errors covered, what is next? The most complicated part of API generation, which at some point every developer has to try and work out: embedding/nesting resources, or making “relationships”.