
Building Lob's API - dzhao
http://blog.lob.com/post/61597473247/building-lobs-api
======
7Figures2Commas
> Pretty print by default is one of those small things but makes your API that
> much easier to use. It makes it easy for customers who are debugging to
> easily read data and pleasant to use.

A pretty print option should be available, but the extra whitespace adds to
the content size, which is obviously a bad thing when you have any real
volume. As the primary consumer of a production API is not a person, pretty
print should _not_ be the default.
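
A minimal sketch of an opt-in pretty print, assuming a Python service. `render_json` and its `pretty` flag are hypothetical names for illustration, not part of Lob's API.

```python
import json


def render_json(payload, pretty=False):
    """Serialize payload compactly unless the caller asks for readable output."""
    if pretty:
        # Human-readable: one key per line, two-space indent.
        return json.dumps(payload, indent=2, sort_keys=True)
    # Machine-oriented default: no whitespace between tokens.
    return json.dumps(payload, separators=(",", ":"), sort_keys=True)


sample = {"id": "obj_123", "tags": ["a", "b"]}
compact = render_json(sample)
readable = render_json(sample, pretty=True)
```

Both strings decode to the same object; only the whitespace differs.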

> Do not return total counts or loop through entire databases to get the large
> count.

The data that is available to consumers of an API should be based on actual
use cases, and there is often a reasonable need for a consumer to have a total
count. If there are performance implications associated with generating this,
which won't always be the case, it's better to address those than leave the
consumers of the API without the data they need.

~~~
gfodor
gzip compression kind of minimizes this issue.

~~~
7Figures2Commas
There's still no reason to add additional whitespace to a response that isn't
going to be consumed by a human.

Also, it's worth pointing out that not all of your API consumers will
necessarily support compression. The best practice is to use an
Accept-Encoding header to allow the consumer to explicitly request a
compressed response. This ensures that you can still serve uncompressed
responses where necessary.
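
A rough sketch of that negotiation, assuming the server sees the raw header value as a string. `accepts_gzip` and `encode_body` are hypothetical helpers, and the parsing is deliberately simplified (it ignores the `*` wildcard, for instance).

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value lists gzip with a nonzero q-value."""
    for item in accept_encoding.split(","):
        coding, _, params = item.strip().partition(";")
        if coding.strip().lower() != "gzip":
            continue
        params = params.replace(" ", "")
        # "gzip;q=0" means the client explicitly refuses gzip.
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def encode_body(body: bytes, accept_encoding: str = ""):
    """Compress only when the client asked for it; otherwise send the body as-is."""
    headers = {"Vary": "Accept-Encoding"}
    if accepts_gzip(accept_encoding):
        headers["Content-Encoding"] = "gzip"
        return gzip.compress(body), headers
    return body, headers
```

A client that sends no Accept-Encoding header gets the uncompressed body back unchanged.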

~~~
gfodor
> There's still no reason to add additional whitespace to a response that isn't
> going to be consumed by a human.

The point is that if you don't know which requests are going to be consumed by
an application, and which by the eyeballs of developers testing your API, it's
a legitimate design tradeoff to consider printing your responses in a
human-readable way. (And yes, "?pretty_print=1" is another option, but of
course this has its own set of obvious tradeoffs.)

You seem to think that the benefits of such a design choice don't outweigh a
<1% hit on (uncompressed) response size. But the problem is that you're
expressing this as a fact rather than as an opinion. There are plenty of
scenarios (I'd argue the majority) where a tiny increase in the % of bytes you
transfer is worth it so that developers can easily debug your API,
particularly when you are an API-providing startup with a few people and
minimizing support-side touch points is crucial to scaling.

~~~
7Figures2Commas
What I wrote is based on a) best practice, b) the understanding that the vast
majority of APIs are, in production, consumed by applications, not humans, and
c) the knowledge that, in high-volume production environments, small
inefficiencies can add up to larger inefficiencies.

Finally, I don't know where you came up with your 1% figure but I just ran a
test on an API I use and the uncompressed pretty printed response was 24%
larger than the uncompressed non-pretty printed response because of all the
whitespace. I'd challenge you to find a scenario under which a pretty printed
response of reasonable size is only 1% larger than its non-pretty printed
counterpart.

~~~
callahad
To add more data to this discussion, I've been playing with the GitHub Issues
API recently, which pretty-prints by default. I grabbed the last hundred open
issues from the Mozilla Persona repo, and found:

    
    
            == Uncompressed ==
      Pretty Printed: 312,556 bytes
            Minified: 272,604 bytes
                      -------------
               Delta:  39,952 bytes (+14.66%)
    

Not as bad as a 24% hit, but that's all moot because gfodor very explicitly
discussed using gzip when pretty-printing. Let's see what happens there:

    
    
               == Gzipped ==
      Pretty Printed:  41,748 bytes
            Minified:  40,648 bytes
                      -------------
               Delta:    1,100 bytes (+2.71%)
    

Not as good as the original claim of <1%, but still pretty darn negligible.

Data derived from this API endpoint:
[https://api.github.com/repos/mozilla/browserid/issues?per_pa...](https://api.github.com/repos/mozilla/browserid/issues?per_page=100)
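
A sketch of how a comparison like this can be reproduced with Python's standard library. `size_report` is a hypothetical helper, and the `issues` list is synthetic stand-in data, so its numbers will not match the GitHub figures above.

```python
import gzip
import json


def size_report(payload):
    """Compare pretty-printed and minified JSON sizes, raw and gzipped."""
    pretty = json.dumps(payload, indent=2).encode("utf-8")
    minified = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    report = {}
    for label, encode in (("uncompressed", lambda b: b), ("gzipped", gzip.compress)):
        p, m = len(encode(pretty)), len(encode(minified))
        report[label] = {
            "pretty": p,
            "minified": m,
            # Overhead of pretty printing relative to the minified size.
            "overhead_pct": round(100 * (p - m) / m, 2),
        }
    return report


# Synthetic records standing in for a real API response.
issues = [
    {"id": i, "title": f"Issue {i}", "labels": ["bug", "ui"], "user": {"login": f"user{i}"}}
    for i in range(100)
]
report = size_report(issues)
for label, row in report.items():
    print(label, row)
```

Swapping `issues` for a decoded response from a real endpoint gives the figures for that API.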

~~~
gfodor
Thanks. My <1% claim was ill-founded. I was assuming the use of tabs and
trailing braces to minimize single-character lines, so a little arithmetic in
my head pointed to a small relative cost. But out-of-the-box pretty printing
does not try to minimize whitespace characters like this. Seems like it could
be a worthy hack.
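
A small sketch of the tab half of that idea, assuming Python's `json` module, which accepts a string as the `indent` argument. The `payload` is made-up sample data, and the trailing-brace layout is not covered here because the standard encoder has no option for it.

```python
import json

# Made-up nested sample data.
payload = {"data": [{"id": i, "address": {"city": "SF", "zip": "94107"}} for i in range(20)]}

minified = json.dumps(payload, separators=(",", ":"))
spaced = json.dumps(payload, indent=4)     # four spaces per nesting level
tabbed = json.dumps(payload, indent="\t")  # one tab character per nesting level

for label, text in (("minified", minified), ("4 spaces", spaced), ("tabs", tabbed)):
    print(f"{label:>9}: {len(text.encode('utf-8'))} bytes")
```

Tabs cost one byte per nesting level instead of several, so the tabbed output lands between the minified and space-indented sizes.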

------
eterm
"Do not return total counts or loop through entire databases to get the large
count"

How do you achieve this on a practical level? I'm pretty new to this kind of
thing and I've been coming up against this in my code; getting the totals has
been a headache.

------
ollyculverhouse
I wish they had elaborated on why not to return the total counts.
Does anyone have any ideas why this is a bad idea? I thought it would be
useful for the consumer so that they can account for pagination?

~~~
bavidar
Founder here. Returning a total count is a very expensive operation and most
users don't use the result. Therefore, by returning paginated results, you can
let the users who need that data loop through all the pages and compute the
total count themselves. It may add an extra step, but for 98% of users the API
will run faster.
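
A sketch of what that extra step might look like on the client side. `count_all`, `fetch_page`, and the `limit` parameter name are hypothetical, and `fake_fetch` is an in-memory stand-in for a real HTTP call.

```python
def count_all(fetch_page, page_size=100):
    """Count every record by walking pages until a short (or empty) page comes back."""
    total = 0
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=page_size)
        total += len(page)
        if len(page) < page_size:
            # A short page means there is nothing left to fetch.
            return total
        offset += page_size


# In-memory stand-in for the remote collection.
RECORDS = list(range(250))


def fake_fetch(offset, limit):
    return RECORDS[offset:offset + limit]


total = count_all(fake_fetch)
```

Note that when the collection size is an exact multiple of the page size, this costs one additional request that returns an empty page.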

~~~
7Figures2Commas
Assume that I am building an interface that displays data I have retrieved
from your API.

Are you saying that your API would require my interface to mimic your
pagination, as opposed to being able to retrieve data in such a way that I
could paginate as I saw fit?

~~~
bavidar
You can paginate however you want. The default is to return 10 results. You
can request anywhere from 1 to 100 and use the offset parameter to get the
next X results.

~~~
7Figures2Commas
Maybe I'm missing something, but how can I display pagination _of my own
choosing_ if I can't calculate the total number of pages that exist?

More importantly, let's say that I am retrieving the default 10 results per
request. Let's say there are 15 results. If I request the second "page", what
will the next_url value in the response be?

In other words, how can you provide an accurate next_url in your responses if
you're not calculating the total number of results? At some point, aren't you
providing a next_url that will return 0 results?
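
One common way to avoid a dangling next_url without computing a total is to fetch one row more than the page size and check whether it exists. The sketch below illustrates that general technique; it is not confirmed to be what Lob does, and `get_page`, the `/v1/items` path, and the `limit` parameter name are hypothetical.

```python
def get_page(rows, offset=0, limit=10):
    """Return one page plus a next_url that is None on the last page."""
    # Fetch one row beyond the page size; its presence proves another page exists.
    window = rows[offset:offset + limit + 1]
    has_more = len(window) > limit
    next_url = f"/v1/items?offset={offset + limit}&limit={limit}" if has_more else None
    return {"data": window[:limit], "next_url": next_url}


# The 15-result scenario from the question above.
results = list(range(15))
first = get_page(results, offset=0, limit=10)
second = get_page(results, offset=10, limit=10)
```

Here `first` carries a next_url pointing at offset 10, while `second` holds the remaining 5 results and a next_url of None, so the consumer is never sent to an empty page.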

------
crymer11
Where's the HATEOAS?

------
rkv
Bob Loblaw Lob Blog

