Would you like to design a web API that is reliable and pleasant to use? I have some ideas for you.
This is the version that's all on one page, but the chapters are also available as separate pages. The content is exactly the same.
- Introduction
- What problems are we trying to solve?
- A shortcut
- Establishing contracts
- Requests and Responses
- Resources
- Side effects
- Many nouns, few verbs
- Designing Data
- Relationships
- Discoverability
- Conclusion
- Appendix: Versioning your API
- Appendix: Versioning your Resources
- Appendix: Authentication
- Appendix: REST
- Appendix: JSON:API
- Appendix: Domain-Driven Design
Introduction
I've designed a lot of web APIs in my career. The details—data, language, business—always change, but the work is largely the same. When it's really good, we spend approximately 0% of our time focusing on what the API looks like and 100% of our time focusing on what the API enables.
I've finally taken some time to write all this down, because with every new project I find myself repeating the same ideas. I hope this serves as inspiration for you, but also I hope it serves as a shorthand for me.
I hope that this guide will help you accomplish the same thing. It's a set of practical and pragmatic strategies based on real experience that prioritizes the goals of your user and mitigates development risk. Soon, you'll be implementing predictable and pleasant APIs very quickly, and everyone will be happy, and life will probably be just great.
If you're reading this right now because we're about to build an API together and I asked you to take a look, thank you very much for your time.
If you find errors, please email me; I'd really appreciate it. If you disagree with me on subjective points, you can also email me about those, but I'll appreciate it a little less.
What problems are we trying to solve?
Here are my priorities, in order of importance. I think these should be your priorities too.
- Our API fulfills our product requirements.
- It's sufficiently fast and stable for our expected traffic and load.
- It's easy to integrate into various downstream systems.
- It's easy to explore with simple tools like cURL.
- It's adaptable for future requirements that we don't know about yet.
- It's pleasant to develop with.
We may ship client libraries for various languages, but ideally we don't have to, because nothing about it is that complicated. The inputs and outputs should be simple and predictable enough that introducing a layer of abstraction just feels like a redundancy. (It's nice to avoid additional layers of abstraction, because then there are fewer components we have to maintain. The API is the abstraction.)
Here are some things I don't care about—problems we're not trying to solve—in no particular order. This list is not exhaustive.
- It's not trendy or modern.
- It's not a drop-in replacement for some other API.
- It's not perfectly efficient or faster than it needs to be.
- It's not 100% discoverable (in the hypermedia sense).
JSON over HTTP
It may not surprise you to know that what we're going to talk about is an API that serves JSON over HTTP using REST principles. It's about as simple as you can get.
There are other, newer options: a graph-based interface like GraphQL, or an RPC-based interface like gRPC. I do not recommend adopting these technologies. They certainly have their benefits, but for my priorities, they all pale in comparison to the most obvious and "boring" choice. If your API is unexceptional, but it helps your users do their jobs quickly and efficiently, you've succeeded.
A shortcut
I'm going to show you my conclusions first, and then we'll work backwards and talk about why each of these things is important. If you just read this list and go forth to build an API that checks all these boxes, I'm proud of you. If you're skeptical about why all these things, all these things, are essential, keep reading and we're going to get into it.
This is a process I’ve used countless times, for tiny projects and for huge projects. It leaves very little room for bikeshedding. By framing it like this, non-technical stakeholders can help, and often they have important insights that the engineers might not be thinking about.
- Serve JSON over HTTP.
- Every single possible valid request is defined by an OpenAPI spec that is freely available to every user of your API.
- Every request to your API can be described as retrieving, creating, updating, or deleting a resource.
- You use GET for retrieving, POST for creating, PATCH for updating, and DELETE for deleting.
- Every path fits one of the following patterns:
  - /resource_type represents a collection of resources.
  - /resource_type/resource_id represents a single resource, identified by ID.
  - /resource_type/resource_id/relationship_name represents the resource or resources that are related to the identified resource with the given relationship name.
- Your responses are always objects. Successful responses have a data key that is an object or an array of objects; erroneous responses have an error key that is an array of strings.
- A request to retrieve a collection can be paginated, ordered, and filtered using query parameters.
- If your API supports authenticated access, requests are authenticated with an Authorization header.
- Resource types (both in your paths and in your JSON objects) are plural names (like "posts"). Relationships are singular or plural based on if they are to-one or to-many, respectively.
We're going to get into all the details, but just keep all of this in mind.
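As a rough sketch of how strict the three path patterns are, here is one way to classify a path in Python. The function name classify_path, the labels it returns, and the set of characters allowed in a segment are all assumptions made for illustration; none of them come from a spec.

```python
import re

# Assumed: a path segment is letters, digits, underscores, or hyphens.
_SEGMENT = r"[A-Za-z0-9_-]+"

# One regular expression per path pattern listed above.
_PATTERNS = [
    ("collection", re.compile(rf"/(?P<type>{_SEGMENT})")),
    ("resource", re.compile(rf"/(?P<type>{_SEGMENT})/(?P<id>{_SEGMENT})")),
    (
        "relationship",
        re.compile(
            rf"/(?P<type>{_SEGMENT})/(?P<id>{_SEGMENT})/(?P<relationship>{_SEGMENT})"
        ),
    ),
]


def classify_path(path):
    """Return (kind, parts) if the path fits one of the three patterns, else None."""
    for kind, pattern in _PATTERNS:
        match = pattern.fullmatch(path)
        if match:
            return kind, match.groupdict()
    return None
```

Anything that returns None here is a path this style of API would simply not have.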
Establishing contracts
Devise a development pipeline where code must follow the flow outlined in the guide. With guards in place, it's technically impossible to wing it, and also it's easier to just do it this one way than to try to do it some other way. Your team is going to take shortcuts when they find them, so just start with the shortcuts.
To me, this means starting with an OpenAPI definition, using code generation to implement your endpoints based on the spec, and using strict linters to ensure your code always conforms to the spec.
This also means that you can make consistency an important rule, but break that rule when you need to. If some particular endpoint really needs an extra field or two that break your established patterns, you have a safe way to introduce the discrepancy. Define it in your spec, make sure your server satisfies the spec, and move on. This leaves room for bikeshedding, and that's okay. The most effective tools to resist bikeshedding are shared values and mature conversations.
If you're validating input based on what's permissible by your strictly-defined API, you can do less in your route handlers to guard against ingesting bad data. If you're validating output based on that same definition, you can do less still in your route handlers to guard against transmitting bad data. In many cases, your code can be fully covered by writing one functional test for each response code. If your validation works, you're done!
In Node.js land, a really good piece of middleware for facilitating this process is called express-openapi-validator. (I'm not involved with this project, but I've used it in production and I had a great time.) If you're not using Express.js, take a look at what this middleware does and try to find a similar project, or just emulate it in your own codebase. The important behaviors are that you can immediately return a 400 error to the client if they've tried sending a request that doesn't match the spec, and you can return a 500 error to the client if your server tries sending a response that doesn't match the spec. Unexpected input never reaches your route handlers, and unexpected output never reaches your client. This makes it really easy to catch bugs that might have been too subtle to detect otherwise.
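To make those two behaviors concrete, here is a small, hand-rolled sketch in Python. It is not how express-openapi-validator works internally; the CREATE_POST_SPEC dict, the matches helper, and the handle wrapper are hypothetical stand-ins for checks that a real project would derive from its OpenAPI document.

```python
# Hypothetical, hard-coded stand-in for one operation in an OpenAPI document.
CREATE_POST_SPEC = {
    "request": {"title": str, "body": str},
    "response": {"type": str, "id": str, "title": str, "body": str},
}


def matches(fields, payload):
    """True when payload is a dict with exactly the expected keys and value types."""
    return (
        isinstance(payload, dict)
        and payload.keys() == fields.keys()
        and all(isinstance(payload[key], kind) for key, kind in fields.items())
    )


def handle(spec, handler, request_body):
    """Run the handler only for valid input, and only release valid output."""
    if not matches(spec["request"], request_body):
        # The client sent something the spec does not allow.
        return 400, {"error": ["Request does not match the API spec."]}
    response_body = handler(request_body)
    if not matches(spec["response"], response_body):
        # The server produced something the spec does not allow.
        return 500, {"error": ["Response does not match the API spec."]}
    return 201, {"data": response_body}


def create_post(body):
    """A well-behaved route handler that returns the saved resource."""
    return {"type": "posts", "id": "1", **body}


def buggy_create_post(body):
    """A handler with a subtle bug: the ID is a number instead of a string."""
    return {"type": "posts", "id": 1, **body}
```

The buggy handler would look fine in a casual review, but the wrapper turns it into a loud 500 the first time it runs.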
Even if you're just writing "a few endpoints," this practice will pay off immediately.
Requests and Responses
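You're using a RESTful HTTP architecture. The core tenet of this architecture is that incoming requests and outgoing responses are stateless and resource-oriented. Stateless means that every request carries everything the server needs to handle it, so the server never depends on remembering an earlier request. Resource-oriented means that every request targets a well-defined resource rather than invoking an arbitrary procedure.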
Take advantage of the HTTP verbs. Use them consistently and with exclusive intent. We'll get deep into what this means in practice in a future chapter.
Your verbs are GET for retrieving one or more resources; POST for creating one new resource; PATCH for updating one existing resource; and DELETE for deleting one existing resource. Don't use any other verbs, and don't use any of these verbs for anything other than what I've described here.
As I mentioned in the "A shortcut" chapter, you have exactly three patterns of paths at your disposal.
A collection of resources of a given type is available at a path that looks like /resource_type. For example, fetching Posts looks like GET /posts. You can GET this or POST to it. It supports pagination, ordering, and filtering.
A single resource is identified with /resource_type/resource_id. You can GET, PATCH, and DELETE this.
The resource or resources linked to a given resource with a named relationship can be retrieved from /resource_type/resource_id/relationship_name. You can GET this.
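The pairing of verbs and path patterns described above is small enough to write down as a lookup table. This is only a sketch; ALLOWED_METHODS and is_allowed are hypothetical names, and the check assumes a path's pattern can be recognized by counting its segments.

```python
# Verbs permitted for each path pattern, keyed by the number of path segments.
ALLOWED_METHODS = {
    1: {"GET", "POST"},             # /resource_type
    2: {"GET", "PATCH", "DELETE"},  # /resource_type/resource_id
    3: {"GET"},                     # /resource_type/resource_id/relationship_name
}


def is_allowed(method, path):
    """True when the verb is one this guide permits for the shape of the path."""
    segments = [segment for segment in path.split("/") if segment]
    return method.upper() in ALLOWED_METHODS.get(len(segments), set())
```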
Your responses should be JSON. They should be unsurprising and very flat.
You might be tempted to return graphs of data, either using something like GraphQL or simply offering deep webs of interconnected JSON data. Don't do it. The flatter your structure is, the fewer options your clients have when trying to decide how to retrieve and manipulate your data. Small responses with few dependencies are also very easy to cache.
Use a consistent structure at the root of your responses. Your response bodies are always JSON objects. The object has a data key or an error key, but not both. If data is present, its value is a resource or an array of resources. If error is present, it's an array of strings describing what went wrong in human-friendly terms.
Responses with data have status codes in the 200 range. Responses with error have status codes in the 400-500 range.
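The root-object rules above are simple enough to enforce with a few lines of code. The sketch below is one possible shape for that; the helper names success, failure, and is_valid_envelope are illustrative assumptions.

```python
def success(data, status=200):
    """Wrap a resource (dict) or a list of resources in the root object."""
    if not 200 <= status < 300:
        raise ValueError("successful responses use a status code in the 200 range")
    return status, {"data": data}


def failure(messages, status=400):
    """Wrap human-friendly error strings in the root object."""
    if not 400 <= status < 600:
        raise ValueError("error responses use a status code in the 400-500 range")
    return status, {"error": list(messages)}


def is_valid_envelope(status, body):
    """Check for exactly one of data/error, consistent with the status code."""
    if not isinstance(body, dict) or ("data" in body) == ("error" in body):
        return False
    if "data" in body:
        return 200 <= status < 300 and isinstance(body["data"], (dict, list))
    errors = body["error"]
    return (
        400 <= status < 600
        and isinstance(errors, list)
        and all(isinstance(message, str) for message in errors)
    )
```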
Your GETs support a filter query parameter, which works for every attribute of the resource being retrieved.
Order of returned resources is well-defined and consistent, with documented default values. Order can be changed with an order query param, which is the name of an attribute on the resources, potentially with a leading hyphen. order=created_at returns your resources by creation date, oldest first. order=-updated_at returns your resources by update date, most recent first.
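Interpreting the order parameter takes very little code. Here is a minimal sketch, assuming the resources are plain dicts held in memory and that apply_order is a hypothetical helper; a real server would more likely translate the parameter into a database query.

```python
def apply_order(resources, order):
    """Sort by the attribute named in `order`; a leading hyphen reverses the sort."""
    descending = order.startswith("-")
    attribute = order[1:] if descending else order
    return sorted(resources, key=lambda resource: resource[attribute], reverse=descending)


posts = [
    {"id": "1", "created_at": "2024-01-02T00:00:00Z", "updated_at": "2024-03-01T00:00:00Z"},
    {"id": "2", "created_at": "2024-01-01T00:00:00Z", "updated_at": "2024-02-01T00:00:00Z"},
    {"id": "3", "created_at": "2024-01-03T00:00:00Z", "updated_at": "2024-04-01T00:00:00Z"},
]
```

Sorting the timestamp strings directly works here because UTC timestamps in a single fixed format sort the same way alphabetically as they do chronologically.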
Pagination is cursor-based, or offset-based, or page-based, but it’s one style for everything. It uses one or two query params: cursor; offset and limit; or page and per_page. If you use cursor, your paginated response documents include a cursor key whose value can be used to retrieve the next page of results.
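For the cursor style, a sketch might look like the following. Several things here are assumptions made for the example: the page size is fixed, resources are ordered by numeric string IDs, and the cursor is just the last-seen ID wrapped in base64 so that clients treat it as opaque.

```python
import base64

PAGE_SIZE = 2  # assumed fixed page size, kept tiny for the example


def encode_cursor(last_id):
    """Wrap the last-seen ID so clients treat the cursor as an opaque string."""
    return base64.urlsafe_b64encode(str(last_id).encode()).decode()


def decode_cursor(cursor):
    """Recover the last-seen ID from a cursor produced by encode_cursor."""
    return int(base64.urlsafe_b64decode(cursor.encode()).decode())


def paginate(resources, cursor=None):
    """Return one page of ID-ordered resources, plus a cursor if more remain."""
    last_seen = decode_cursor(cursor) if cursor else None
    remaining = [
        resource
        for resource in resources
        if last_seen is None or int(resource["id"]) > last_seen
    ]
    page = remaining[:PAGE_SIZE]
    body = {"data": page}
    if len(remaining) > PAGE_SIZE:
        body["cursor"] = encode_cursor(page[-1]["id"])
    return body
```

A client keeps passing back whatever cursor it received until a response arrives without one.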
Your non-GET requests operate on one resource at a time.
Validation
TK
Resources
A resource is identified by its type and ID. Types should be plural, and they should be identical to the resource type in the path. IDs should be strings, but they can be numeric strings if that's what you like. Don't try to encode data in your IDs; your clients should not be able to infer anything about your resources from the IDs alone, except for the fact that they identify a specific resource.
GET /posts/1
{
  "data": {
    "type": "posts",
    "id": "1"
  }
}
The goal here, as always, is to be boring and explicit.
Your POSTs should work the same way, but in reverse. You send this same kind of object (without an ID), and you get back the saved version of that object, which is usually identical to what you sent but now includes an ID.
Besides your root object, the only JSON objects in your responses should be resources, meaning an object should never appear without type and id attributes. If I'm a client and I have an object saved somewhere, I should be able to confidently fetch a new copy of it by taking its type and id, building a path, and making a GET request to it.
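That promise is easy to express in client code. A minimal sketch, where resource_path is a hypothetical helper name:

```python
from urllib.parse import quote


def resource_path(resource):
    """Build the path for any object that carries type and id."""
    # Percent-encode both parts so unusual IDs still produce a valid path.
    return f"/{quote(resource['type'], safe='')}/{quote(resource['id'], safe='')}"
```

Because every object in a response is a resource, this one function is enough to refresh anything a client has stored.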
In addition to type and id, a resource has two kinds of data: attributes and relationships. Most of the data in a resource will be attributes, which are flat and primitive; no arrays or objects. Things like names, creation timestamps, and whatever else your clients may need are available here.
Dates and times should be strings that conform to RFC 3339, by the way. All this means is that dates are in the format "YYYY-MM-DD" and times are in UTC in the format "YYYY-MM-DDTHH:MM:SSZ".
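Here is a short sketch of producing both formats with Python's standard library. The helper name to_rfc3339 is an assumption, and the sketch drops fractional seconds to match the format shown above.

```python
from datetime import date, datetime, timedelta, timezone


def to_rfc3339(moment):
    """Format a timezone-aware datetime as a UTC timestamp with whole seconds."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a timezone first")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A time recorded two hours ahead of UTC is converted before formatting.
published = datetime(2024, 5, 1, 12, 30, tzinfo=timezone(timedelta(hours=2)))
print(to_rfc3339(published))         # 2024-05-01T10:30:00Z
print(date(2024, 5, 1).isoformat())  # 2024-05-01
```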
Relationships are a little more complicated, but no less predictable. A to-one relationship is defined just like an attribute, and it ends with _id or Id (depending on if your keys are snake_case or camelCase). The value is just a string, and it's the ID of the related resource. If you're dealing with a lot of polymorphic relationships, you may want to represent this as a minimal resource instead of a string (i.e. an object with "type" and "id" keys, but nothing else). Your client will know how to fetch those resources if they need to. Avoid the temptation of embedding an entire related resource inside your primary resource.
To-many relationships will usually not be defined directly on the resource. Instead, they're retrievable with that third path pattern we specified.
If my Article resource has a relationship called categories, which is a to-many relationship to a collection of Category resources, I can retrieve them like so:
GET /articles/1/categories
{
  "data": [
    { "type": "categories", "id": "1", ... },
    { "type": "categories", "id": "2", ... },
    { "type": "categories", "id": "3", ... }
  ]
}
The format of this response is identical to the response for GET /categories. However, /categories returns all Categories, while /articles/1/categories returns only the Categories that relate to Article 1.
You might want to expose to-one relationships like this as well. If an Article can only relate to one Category, and Article 1 relates to Category 2, then GET /articles/1/category should return an identical response to GET /categories/2. Note that the first one uses category, singular, while the second one uses categories, plural; that's because category is used here not as a resource type but as a relationship name. The type attribute on the data returned will still be "categories".
If you really need to, you can embed the IDs of a to-many relationship on the resource as an attribute whose value is an array of strings. You should only do this if the client will have to specify these related resources at the time of creating the primary resource, like so:
POST /articles
{
  "data": {
    "title": "How to write a cool program",
    "category_ids": ["1", "2", "3"]
  }
}
Side effects
Manipulating some resources might result in changes being applied to other resources. Your Cart resource might have a subtotal attribute, and creating a new Cart Item resource that relates to the Cart may update that attribute.
This kind of calculated attribute can be very powerful and useful. It's important to express to your client that these fields are read-only. It's also helpful to document how they are calculated, not only on the resource that owns them but also on the resources that affect the value.
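A sketch of both halves of that idea, refusing client writes to a calculated field and recalculating it as a side effect, could look like this. The READ_ONLY registry and the attribute names unit_price_cents, quantity, and cart_id are hypothetical; only the Cart, Cart Item, and subtotal concepts come from the text above.

```python
# Hypothetical registry of calculated attributes, keyed by resource type.
READ_ONLY = {"carts": {"subtotal"}}


def reject_read_only(resource_type, attributes):
    """Return error strings for any read-only attribute the client tried to set."""
    blocked = sorted(attributes.keys() & READ_ONLY.get(resource_type, set()))
    return [f"{name} is read-only and cannot be set by the client." for name in blocked]


def recalculate_subtotal(cart, cart_items):
    """Side effect of creating a Cart Item: refresh the owning Cart's subtotal."""
    cart["subtotal"] = sum(
        item["unit_price_cents"] * item["quantity"]
        for item in cart_items
        if item["cart_id"] == cart["id"]
    )
    return cart
```

The error strings returned by reject_read_only are meant to drop straight into the error array of a 400-range response.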
Avoid creating dependencies between resources that don't have an explicit relationship. In fact, it's quite alright to create a read-only relationship! This can be useful for splitting up a resource that would otherwise be enormous, especially if your clients are unlikely to use all of that resource's attributes at the same time. For example, a User resource may have many fields for an Instagram username, GitHub username, Facebook username, and so on. Creating a "User Social" or "User External" resource, which is automatically created when a "User" resource is created, can be helpful here.
Also, remember that your resources don't have to (and in many cases shouldn't) mirror your data store. You might have a users table in your relational database with 50 columns, but maybe only 10 of those actually belong to the User resource. You can even use the same primary key for the related resources, so User 1 relates to User Social 1. Just make sure your clients are not expecting these to be identical, and that they always follow the resource relationships. That way, if it someday makes sense to use a second table with its own primary key, nothing breaks.
Many nouns, few verbs
An important part of our system of resources is the idea of limiting the kinds of ways we interact with them. These interactions are retrieval (GET), creation (POST), updating (PATCH), and deletion (DELETE).
This limitation is very important in enabling precise conversations and a clear mental model of our system, but at times it might feel unnatural. You're going to have to take every single user action and express it with one of these four verbs, which may not be how you'd express this same idea in English. Here are some examples:
- Signing up: creating a User
- Logging in: creating a Session
- Logging out: deleting a Session
- Adding to cart: creating a Cart Item
- Checking out: creating an Order
My advice is to just try to get used to it. It's okay to speak colloquially about "signing in" as long as there's a shared understanding of what that means. In fact, this shorthand can be useful—for example, what you and your team call "checking out" may include creating an Order resource as well as creating Shipping Address, Billing Address, and Gift Message resources. This is one of those situations where you just can't—or shouldn't—try to represent everything in code. (But do try to write everything down.)
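Writing the translation down can be as simple as a lookup table. In this sketch the resource type names in the paths (users, sessions, cart_items, orders) are assumptions; only the pairing of each action with a verb and a noun comes from the list above.

```python
ALLOWED_VERBS = {"GET", "POST", "PATCH", "DELETE"}

# Each colloquial action maps to one verb applied to one resource path.
USER_ACTIONS = {
    "sign up": ("POST", "/users"),
    "log in": ("POST", "/sessions"),
    "log out": ("DELETE", "/sessions/{id}"),
    "add to cart": ("POST", "/cart_items"),
    "check out": ("POST", "/orders"),
}


def request_for(action, **params):
    """Translate a colloquial action into the verb and path that implement it."""
    method, template = USER_ACTIONS[action]
    return method, template.format(**params)
```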
Implementing a new bit of functionality will almost always mean supporting a verb on a resource that didn't support it before, or adding a resource that didn't exist before.
What about PUT?
One HTTP verb that is notably absent from my advice above is PUT. Historically, PUT has been a more popular choice for signaling to the server that the request is meant to update an existing record. However, PUT is more appropriate for replacing the record. There are some instances where the most appropriate thing to do is actually a full replacement, and in those cases you might want to implement a PUT. But it's not what our collective understanding of "updating" a resource actually means, and it's probably safe to completely ignore in almost every API you'll ever build.
Designing Data
Some of the choices I've listed so far may seem arbitrary, and they are! You and your team should make aesthetic changes if you want to, but I urge you to define them on a system level. If you can't agree on something that is subjective, I recommend that you just do what I said, because it doesn't matter and this way of doing things is already written down.
Once you've gotten the formats out of the way, you're left with the hard part, which is also the fun part. It's time to design your resources!
Resource design should be done carefully, deliberately, and with lots of input from everyone involved. This means the engineers writing the routes and the ones calling them, but it also means the visual designers who will create interfaces that are filled with these resources, and the product owners who will craft (or already have crafted) the business requirements that will be satisfied by these resources.
Remember that all of your business logic, 100% of the things a user of your system can do, must be defined in terms of the retrieval, creation, update, and deletion of resources.
Ideally you already have some rough user journeys or UI flows to reference here. If not, this is the time to create them. They don't have to be final or polished; you and your team just need a shared understanding of what a user is going to do.
At every step of the journey, there is probably some data to show the user, and some data the user might send to you. Map every last shred of data to one attribute on one resource. If you're building a web app, there ought to be one identifier that maps to the ID of one resource that can then be used to fetch everything else. For a blog post, your URL might be /articles/{id}. On the page might be a title, body, publish date, author name, author photo, and comments; comments might consist of a commenter name, commenter photo, body, and publish date.
A predictable choice here is that the ID in the URL corresponds to the ID of an Article resource. An Article has attributes for title, body, and publish date. An Article has relationships called author, which is one User resource; and comments, which is a collection of many Comment resources. A User has attributes for name and photo; a Comment has attributes for body and publish date, and an author relationship to a User.
Now we know that rendering this page requires a GET for an Article by ID, a GET for the Article's comments relationship, and a series of GETs for Users, based on the author relationship of the Article and the author relationships of the Comments.
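The request plan for this page can be sketched in a few lines. The function names are hypothetical, and the author_id key assumes the snake_case to-one relationship convention described in the Resources chapter.

```python
def initial_paths(article_id):
    """The two GETs that can be made knowing only the ID from the URL."""
    return [f"/articles/{article_id}", f"/articles/{article_id}/comments"]


def author_paths(article, comments):
    """The follow-up GETs for Users, fetching each author only once."""
    author_ids = [article["author_id"]] + [comment["author_id"] for comment in comments]
    unique_ids = list(dict.fromkeys(author_ids))  # de-duplicate while keeping order
    return [f"/users/{user_id}" for user_id in unique_ids]
```

Listing the requests like this, before any optimization, makes the cost of a page visible while the resource definitions are still cheap to change.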
You may start to bristle at the number of HTTP requests you're making here. During implementation, you may choose to introduce some mechanism for wrapping all of this up in fewer requests, perhaps by embedding the Users in the resources they've authored, or adding a filter option for GET /users that accepts a series of IDs, or some other technique. I advise you to avoid doing this kind of thinking in the design phase. Get your resource definitions right first.
Relationships
A resource that doesn't relate to anything else is probably a sign of a missing definition.
Relationships can be explicit or implicit. Explicit relationships are defined by the client—they create Post 123, and they tell the server that Post 123 relates to Category 456.
Implicit relationships are calculated based on other inputs. The client tells you a Billing Address's postal code is 11217, and the related Order gets related to Shipping Options accordingly.
Avoid optional relationships.
TK Move some of what's in the Resource chapter into this chapter
Included/Expanded Resources
TK
Updating to-many resources
TK
Discoverability
If you've built an API before, you probably have some strong opinions about some of the things I've laid out here.
One thing that makes me feel a little heated, myself, is that this API is not particularly discoverable. For example, just looking at an API response, I can't tell what type of resource a relationship is. I can't even tell that a to-many relationship exists!
In the hypothetical, academically-perfect RESTful API, we'd have full HATEOAS. This annoying acronym means "Hypermedia As The Engine Of Application State." It means that each resource you receive gives you not only its own information, but also instructions about what else you can do and where else you can go from here. If you've ever learned how to navigate a website by clicking its links, you've benefited from HATEOAS.
It's a beautiful concept, and many hypermedia nerds (like myself) dream of fully assumption-free APIs, where a client merely needs a single URL to serve as an entrypoint and can then discover everything else from there, without documentation. Entire user interfaces could be generated without writing a single line of application-specific code! The dream!
In reality, this is basically never possible. More importantly, it's basically never useful. Being able to write a couple of cURL commands to get some data from an API without having to look at its documentation is great, and it's something to strive for, not just for easy curls, but because everything else gets easier too. But also, documentation is great, and it's necessary. If you specify 100% of your routes and their possible interactions in your docs, there's really no need for your API to be fully discoverable. Instead, focus on the ergonomics of reading and writing the data that you do choose to expose.
Conclusion
TK
If you've spotted any errors, logical or typographical, I'd love for you to email me about them. And again, thank you for your time! I hope these ideas help you on your next API project.
Appendix: Versioning your API
Finishing your API is just the first step. Someday there will be new requirements and things will have to change.
You should avoid breaking changes at all costs. A breaking change is when something that worked yesterday doesn't work today. That might mean that an endpoint that used to return some data now returns nothing; it might mean that it still returns data, but the data is different in a way that is incompatible with what was returned before. Changing one attribute called full_name into two attributes called first_name and last_name is a breaking change. Preserving full_name while adding first_name and last_name, such that I can ignore those new fields if I don't care about them, is a safe, non-breaking change.
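Here is a sketch of the non-breaking version of that change. It assumes, purely for illustration, that the data store now holds the first and last names separately and that serialize_user is the function that builds the User resource.

```python
def serialize_user(record):
    """Add first_name and last_name while continuing to serve full_name."""
    return {
        "type": "users",
        "id": str(record["id"]),
        # Kept so that clients written against the old shape keep working.
        "full_name": f"{record['first_name']} {record['last_name']}",
        # New attributes that older clients can safely ignore.
        "first_name": record["first_name"],
        "last_name": record["last_name"],
    }
```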
If you're still in development and you absolutely know that you (or your team) are the only ones using the API, don't worry about breaking changes. But make sure that everyone is clear on exactly when you're crossing the threshold at which point things must always continue to work. In development, when a QA tester encounters a breaking change, they know exactly who to bother about reverting the change or describing how to work around it. In production, when your users encounter a breaking change, they're stuck.
In general, and whenever possible: only add, never remove. If you need to make a change, and that change can be satisfied by adding a new attribute or relationship to an existing resource, do it that way. If you definitely need to change the whole implementation for a resource that already exists, leave that resource alone and create a new one instead. Deprecate the creation of that old resource if you need to, but try to leave the existing data readable if you can. If you can leave the old way functional, don't even deprecate. Make it clear in your docs that you recommend using the new version, and explain what the benefits are. If the client insists on using the old way, update the docs to explain what they're missing out on by doing so.
If you're creating a replacement resource, give it a descriptive name that is not dependent on the reader knowing about the resource it's replacing. "TaxableOrder" is a much better name than "NewOrder" or "Order2."
In your design phase, ask your stakeholders and domain experts about what they think is missing from this API (and the product it supports). Many things are probably missing for good reason, and this doesn't mean you're going to include them. But knowing what might be added down the line—in a year, or in fifteen years—can help you to name the things you have and leave space for the things you don't.
Appendix: Versioning your Resources
If you're implementing something that needs to keep track of drafts or a version history, consider creating two separate resources: one that represents the "latest" version and another that represents each revision. All of the required fields on the latest version should be optional on the revisions.
Consider blocking deletes and updates to revisions
The latest version is fully calculated based on revisions
Relationships between revisions; linked list
Ordered by timestamp
Using slugs as IDs and handling changes
TKTKTK
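One possible reading of the notes above, where the latest version is calculated entirely from revisions ordered by timestamp, can be sketched like this. The revision shape, the created_at attribute, and the article_revisions type name are assumptions made for the example.

```python
# Keys that describe the revision itself rather than the content it changes.
REVISION_META = {"type", "id", "created_at"}


def latest_version(revisions):
    """Fold revisions, oldest first, so later values overwrite earlier ones."""
    latest = {}
    for revision in sorted(revisions, key=lambda r: r["created_at"]):
        # Every content field is optional on a revision, so copy only what is present.
        latest.update(
            {key: value for key, value in revision.items() if key not in REVISION_META}
        )
    return latest
```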
Appendix: Authentication
If you can get away with it, just give your client an access token and expect it to be returned as a Bearer token in the Authorization header of every request. If you have very few users—like if you're building an API server for your colleagues to use on your first-party frontend—just find a reasonably straightforward way to generate tokens on the command line or in your database, and then hand them over. Don't build an automated means of generating new access tokens unless your API needs to be accessed by many people who don't already have a relationship with you and a means of communicating with you directly.
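A minimal sketch of the simple approach, using only Python's standard library. The function names are hypothetical, the headers are assumed to be a plain dict keyed by "Authorization", and known_tokens stands in for wherever the issued tokens are stored.

```python
import hmac
import secrets


def generate_token():
    """Create an unguessable token to hand to a client directly."""
    return secrets.token_urlsafe(32)


def bearer_token(headers):
    """Extract the token from an Authorization header, or None if absent or malformed."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def is_authenticated(headers, known_tokens):
    """True when the request carries one of the tokens that were handed out."""
    token = bearer_token(headers)
    if token is None:
        return False
    # compare_digest avoids leaking how much of a token matched through timing.
    return any(hmac.compare_digest(token.encode(), known.encode()) for known in known_tokens)
```

Running generate_token once on the command line and storing the result is about as much token infrastructure as a small first-party API needs.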
If you are in the second camp, like if you're building a public API and/or your API needs to make authenticated calls on behalf of a user, again: keep it simple and boring. It's probably a good idea to use JWT, and your language/framework of choice probably has a popular library for generating them already. So just use that!
Appendix: REST
REST is good and you should adopt it whole-heartedly.
When I say "REST," I don't mean when to use GET vs. POST (although that's also important). What I mean is that you should read Roy Fielding's dissertation, Architectural Styles and the Design of Network-based Software Architectures, where Representational State Transfer was coined. Not once in its definition does he mention HTTP verbs!
Here are—in my opinion—the important characteristics of the REST architectural style:
- It's stateless.
- It's very cacheable.
- It's uniform.
- It's based on retrieving and manipulating well-defined resources.
TK
Appendix: JSON:API
You might notice that an API in this style looks quite a bit like an API that conforms to the JSON:API spec. This is not a coincidence! But if you've worked with JSON:API, you can probably figure out why my preferred format differs in the ways that it does.
I think the intention behind JSON:API is excellent. As you can probably tell if you've read through all of this, I hold many of those intentions very close to my heart. Perhaps the most important thing is that your whole team agrees on the problems it's trying to solve, and in a lot of cases it's a huge achievement to just get everyone to convincingly declare, "Let's use JSON:API."
I was so excited when the first draft of the JSON:API spec was published. I was building an API with a big team at the time and I pushed really hard for us to use it. We built some generic client and server libraries in the languages we were using in-house and we started modeling.
The modeling was a really great process. What used to be blobs of data invented and owned by different teams were becoming discrete objects we could discuss and reason about.
Where things started to fall down was in the finer details of how that data gets turned into JSON. Different members of the team latched on to different details of the spec. Things like…
- When a relationship ought to include an href,
- Or whether an attribute that was a big object (technically legal) was better served as a relationship,
- And if so, how that relationship's data should be persisted,
- And what that meant for the other metadata around that new resource,
- And what POSTs and PUTs should look like for this new network of data,
- And how to generically support the include parameter for everything,
- And so on.
The JSON:API spec tries to be a perfectly abstract representation of any arbitrary system that may be represented by JSON delivered over HTTP, and as a result I found that it is over-defined in some areas and under-defined in others. This leads to actual implementations making really curious choices. As is probably obvious by now, I think a more pragmatic approach is better.
Appendix: Domain-Driven Design
TK This is a related field that is probably worth exploring