Yaogang Lian

Partial Response in RESTful API Design

The idea of using partial response in RESTful API design has been around for quite some time. Google introduced partial response and partial update support in Google Data APIs in 2010. LinkedIn implemented similar ideas in their domain-model based API design back in 2009.

In the last few days I have been implementing this feature in muffin.io apps. I think this feature is so important that it should be supported out of the box in all muffin apps. In this post I will explain what partial response is, how to design a clear and intuitive syntax for it, and how it’s implemented in muffin apps.

What is partial response?

The idea of partial response is quite simple: instead of returning full objects with all their data fields in API responses, the server returns only a subset of the fields. The benefit is obvious: less data transferred over the network means less bandwidth usage, faster server responses, less CPU time spent on the server and client, and less memory usage on the client. It’s a rare situation where everybody wins.

The Google post mentions that using a partial Google Calendar feed reduced the total data transfer by 95% compared to a full Google Calendar feed. That’s a huge saving! I bet many people would burst into excitement if stores offered such deep discounts during Thanksgiving sales. More importantly, a faster and more efficient web saves natural resources and helps to reduce the enormous sins we have committed against Mother Nature.

The design of muffin is all about efficiency, so it’s no surprise that I want to implement this as a standard feature in muffin apps. However, efficiency is not the only reason I am excited about using partial response in RESTful API design.

In my opinion, a more important but often overlooked aspect of using partial response in API design is its enormous expressive power. Many of us have struggled to find the right balance between designing a flexible API and keeping the API scheme consistent. For example, a client app might only be interested in a few data fields in an object but tailoring the API for a specific client breaks API uniformity. That’s a dilemma that can be nicely resolved by partial response.

It’s best to illustrate this with an example. Suppose we have an object graph involving dogs and their owners (the code uses Google App Engine’s data modeling):

from google.appengine.ext import db


class Owner(db.Model):
    name = db.StringProperty()
    age = db.IntegerProperty()


class Dog(db.Model):
    name = db.StringProperty()
    breed = db.StringProperty()
    hair_color = db.StringProperty()
    eye_color = db.StringProperty()
    age = db.IntegerProperty()
    # collection_name gives every Owner a back-reference called "dogs".
    owner = db.ReferenceProperty(Owner, collection_name='dogs')

So a full API response for GET /dogs would return something like this:

[
  {
    "id": "32233",
    "name": "Buddy",
    "breed": "German Shepherd",
    "hair_color": "yellow",
    "eye_color": "brown",
    "age": 5,
    "owner": {
      "id": "32"
    }
  },
  {
    "id": "343444",
    "name": "Dolly",
    "breed": "Boxer",
    "hair_color": "brown",
    "eye_color": "black",
    "age": 3,
    "owner": {
      "id": "32"
    }
  }
]

If a client app A needs to know a dog’s name, breed, age, but not hair_color or eye_color, we can return a partial response with only those three fields. However, a client app B might require all the fields above. So should we design two APIs or sacrifice the efficiency and always return the full objects?

A better solution is to let the client specify which fields should be returned in an API response. Pause you who read this, and think for a moment of this powerful idea. It passes the control from the server to the client — clients with different requirements can share exactly the same API but the responses are optimized just for the requesting client. On the server side, the API scheme stays uniform and consistent.

Let’s go back to the example. For client A which only needs a dog’s name, breed and age, we can make an API request like this:

GET /dogs?fields=name,breed,age
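For flat field lists like this one, the server-side filtering can be sketched in a few lines. This is only an illustration on plain dictionaries, not the muffin implementation: `select_fields` and its `always` parameter are hypothetical names, and keeping `id` in every response is an assumption based on the sample responses in this post.

```python
def select_fields(record, fields_param, always=("id",)):
    """Return a copy of `record` limited to the requested top-level fields."""
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    if not wanted:
        # No filter given: fall back to the full object.
        return dict(record)
    keep = list(always) + wanted
    # Unknown field names are silently skipped in this sketch.
    return {key: record[key] for key in keep if key in record}


buddy = {
    "id": "32233",
    "name": "Buddy",
    "breed": "German Shepherd",
    "hair_color": "yellow",
    "eye_color": "brown",
    "age": 5,
    "owner": {"id": "32"},
}

partial = select_fields(buddy, "name,breed,age")
print(partial)
# {'id': '32233', 'name': 'Buddy', 'breed': 'German Shepherd', 'age': 5}
```

Client B simply omits the `fields` parameter and gets the full object back from the same endpoint.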

However, the idea of partial response goes well beyond this. You really start to see its enormous expressive power when dealing with nested data structures. For example, if another client C wants to retrieve information about each dog’s owner as well, the traditional way would be a two-step process:

  1. Make the API call to get a list of dogs — GET /dogs.
  2. For each dog returned, make another API call with the owner id and retrieve information about the owner — GET /owners/{owner_id}.

But with partial response we can do it all in one step:

GET /dogs?fields=name,breed,age,owner(name,age)

And this is an example of the API response:

[
  {
    "id": "32233",
    "name": "Buddy",
    "breed": "German Shepherd",
    "age": 5,
    "owner": {
      "id": "32",
      "name": "Evan Wrigley",
      "age": 33
    }
  },
  {
    "id": "343444",
    "name": "Dolly",
    "breed": "Boxer",
    "age": 3,
    "owner": {
      "id": "32",
      "name": "Evan Wrigley",
      "age": 33
    }
  }
]

Now that is cool. You can generalize this idea even further and retrieve data in deeper levels. But we will stop here.
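To make the saving in round trips concrete, here is a small sketch that counts requests for the two-step approach and for the one-step approach. `FakeApi` is a hypothetical in-memory stand-in for the server, and the `expand_owner` flag is an assumption that plays the role of `owner(name,age)` in the `fields` parameter.

```python
OWNERS = {"32": {"id": "32", "name": "Evan Wrigley", "age": 33}}
DOGS = [
    {"id": "32233", "name": "Buddy", "owner": {"id": "32"}},
    {"id": "343444", "name": "Dolly", "owner": {"id": "32"}},
]


class FakeApi:
    """Hypothetical in-memory server that counts the requests it receives."""

    def __init__(self):
        self.requests = 0

    def get_dogs(self, expand_owner=False):
        self.requests += 1
        dogs = []
        for stored in DOGS:
            dog = dict(stored)  # copy so the stored data is never mutated
            if expand_owner:
                dog["owner"] = dict(OWNERS[dog["owner"]["id"]])
            dogs.append(dog)
        return dogs

    def get_owner(self, owner_id):
        self.requests += 1
        return dict(OWNERS[owner_id])


# Two-step approach: one request for the list, then one per dog.
two_step_api = FakeApi()
two_step_dogs = two_step_api.get_dogs()
for dog in two_step_dogs:
    dog["owner"] = two_step_api.get_owner(dog["owner"]["id"])
two_step_requests = two_step_api.requests

# Partial response with a nested owner: a single request.
one_step_api = FakeApi()
one_step_dogs = one_step_api.get_dogs(expand_owner=True)
one_step_requests = one_step_api.requests

print(two_step_requests, one_step_requests)  # 3 1
```

Both approaches end up with the same data, but the two-step version needs one extra request per dog, so the gap grows with the size of the list.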

Design an intuitive syntax

Above you have briefly seen the syntax I used for partial response. This syntax is a bit different from what Google and LinkedIn use. Before delving into why I chose a different syntax, let’s examine the ones used by Google and LinkedIn.

Google’s partial response syntax is inspired by the XPath syntax and goes like this:

GET .../calendar/feeds/private/full?fields=entry(title,gd:when)

This syntax is extremely powerful and goes well beyond what I cover in this post. If you want to see it in action, check out the YouTube API page. In short, it does far more than simple filtering: it lets you specify existence conditions, equality comparisons, logical operators, numerical comparisons, and more.

LinkedIn’s partial response syntax looks like this:

GET .../v2/people/456/friends:(name,photo,best-friend:(name,photo))

I admire Google’s syntax, but it’s overkill for the simple services I build, and there is definitely a learning curve for anyone who tries to integrate with the API. LinkedIn’s syntax is closer to what I want, but I am not a fan of those colons. A standard ?fields= syntax is easier to read and already familiar to every web developer. Thus the syntax I use looks like this:

GET /dogs?fields=name,breed,age,owner(name,age)

I think it’s very intuitive and there is literally no learning curve. Just the way it should be.
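Because the syntax rides on an ordinary query parameter, standard URL tooling handles it on both ends. The sketch below uses Python 3's `urllib.parse`; the helper name `dogs_url` is hypothetical, and leaving commas and parentheses unescaped is a readability choice I am assuming here, not something the syntax requires.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def dogs_url(fields):
    """Build the request URL for a given fields expression."""
    # safe=",()" keeps the syntax characters unescaped so the URL stays readable.
    return "/dogs?" + urlencode({"fields": fields}, safe=",()")


url = dogs_url("name,breed,age,owner(name,age)")
print(url)  # /dogs?fields=name,breed,age,owner(name,age)

# On the server, the expression comes back out as a plain string.
received = parse_qs(urlsplit(url).query)["fields"][0]

# A client that percent-encodes everything yields the same string once decoded.
encoded_query = urlencode({"fields": "name,breed,age,owner(name,age)"})
decoded = parse_qs(encoded_query)["fields"][0]
```

Either way, the server only ever sees the decoded expression, so the parser does not need to care how the client encoded it.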

How to implement

Hopefully I have got you excited about using partial response in API design, but now comes the technical question: how do you implement partial response on your server stack?

In the following I will use Google App Engine as an example, but you can easily adapt these techniques to other server stacks.

First we need a simple parser to parse the fields in the URL. This is the one I wrote:

def parseFields(fields):
    """
    example: "name,color,owner(name,age),gender"
    """
    field = ''
    subfields = ''
    level = 0  # current parenthesis nesting depth
    results = {}
    fields = fields.lower()
    for c in fields:
        if c == '(':
            level += 1
            if level == 1:
                # Start collecting the sub-fields of the current field.
                subfields = ''
                continue
        elif c == ')':
            level -= 1
        if level == 0:
            if c in 'abcdefghijklmnopqrstuvwxyz0123456789_-':
                field += c
            else:
                # Any other character ends the current top-level field.
                if field != '':
                    results[field] = subfields
                    field = ''
                    subfields = ''
        else:
            # Inside parentheses: keep the raw text for a later recursive parse.
            subfields += c
    # Handle the field at the end
    if level == 0 and field != '':
        results[field] = subfields
    return results

Not fancy, but gets the job done.
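As an alternative, here is a sketch of a recursive-descent variant. It is not the parser used in muffin: instead of keeping sub-fields as raw strings to be re-parsed later, it builds the whole field tree up front and rejects malformed input such as unbalanced parentheses. The names `parse_field_tree` and `_parse_list` are hypothetical, and unlike the parser above it does not tolerate spaces.

```python
NAME_CHARS = set("abcdefghijklmnopqrstuvwxyz0123456789_-")


def parse_field_tree(spec):
    """Parse "a,b(c,d(e))" into {"a": {}, "b": {"c": {}, "d": {"e": {}}}}."""
    spec = spec.lower()
    tree, pos = _parse_list(spec, 0)
    if pos != len(spec):
        raise ValueError("unexpected %r at position %d" % (spec[pos], pos))
    return tree


def _parse_list(spec, pos):
    """Parse a comma-separated list of fields starting at `pos`."""
    tree = {}
    while pos < len(spec):
        start = pos
        while pos < len(spec) and spec[pos] in NAME_CHARS:
            pos += 1
        name = spec[start:pos]
        if not name:
            raise ValueError("expected a field name at position %d" % start)
        children = {}
        if pos < len(spec) and spec[pos] == "(":
            # Recurse into the parenthesized sub-field list.
            children, pos = _parse_list(spec, pos + 1)
            if pos >= len(spec) or spec[pos] != ")":
                raise ValueError("missing ')' for field %r" % name)
            pos += 1
        tree[name] = children
        if pos < len(spec) and spec[pos] == ",":
            pos += 1
        else:
            break
    return tree, pos


tree = parse_field_tree("name,color,owner(name,age),gender")
deep = parse_field_tree("name,owner(name,dogs(name))")
```

Here `tree` maps `owner` to `{"name": {}, "age": {}}`, and `deep` goes one level further by following the `dogs` back-reference on the owner.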

Now we can implement filter support on all the model classes. Every model class extends a BaseModel class, which defines a toJSON method as follows:

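def toJSON(self, fields=None):
    if fields is None or fields == '':
        # No filter requested: use the model's default field list, which
        # must be a string in the same comma-separated syntax.
        fields = self.fields()
    parsed = parseFields(fields)
    json = {}
    for field, subfields in parsed.items():
        v = getattr(self, field)
        if v is not None:
            if subfields == '':
                # Leaf field: return the value as-is.
                json[field] = v
            elif isinstance(v, list):
                # Collection of related objects: filter each one.
                json[field] = [item.toJSON(subfields) for item in v]
            else:
                # Single related object: recurse with its sub-fields.
                json[field] = v.toJSON(subfields)
    return json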

As you can see, I still need to deal with some security concerns, such as throwing an exception when a client requests a prohibited field, but you get the basic idea.
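One way to handle that check is to validate the requested fields against a whitelist before touching the models. The following is a sketch under assumptions, not code from muffin: `ForbiddenFieldError`, `ALLOWED_DOG_FIELDS` and `check_fields` are hypothetical names, and the requested fields are assumed to already be parsed into a nested dictionary (field name mapped to its sub-fields) rather than the sub-field strings that parseFields returns.

```python
class ForbiddenFieldError(Exception):
    """Raised when a client asks for a field it is not allowed to read."""


# Hypothetical whitelist for /dogs: each key maps to its allowed sub-fields.
ALLOWED_DOG_FIELDS = {
    "name": {},
    "breed": {},
    "age": {},
    "owner": {"name": {}, "age": {}},
}


def check_fields(requested, allowed, path=""):
    """Walk the requested field tree and reject anything not whitelisted."""
    for name, children in requested.items():
        full_name = path + name
        if name not in allowed:
            raise ForbiddenFieldError("field not allowed: " + full_name)
        # Sub-fields are checked against the whitelist of the parent field.
        check_fields(children, allowed[name], full_name + ".")


# Passes silently: every requested field is whitelisted.
check_fields({"name": {}, "owner": {"name": {}}}, ALLOWED_DOG_FIELDS)

# Fails: "email" is not an allowed sub-field of "owner".
try:
    check_fields({"name": {}, "owner": {"email": {}}}, ALLOWED_DOG_FIELDS)
except ForbiddenFieldError as error:
    message = str(error)
    print(message)  # field not allowed: owner.email
```

Running the check before serialization means a bad request fails fast, and the handler can turn the exception into a 4xx response of its choosing.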

With all this in place, you now have a nice way to retrieve exactly the data you want, and potentially save many API requests as well.

Yaogang Lian

An iOS, Mac and web developer. Focusing on building productivity and educational apps.