
Building a TCP API: How we built our realtime app, Filmgrain, part 2 - Guzba
http://blog.filmgrainapp.com/2013/07/02/how-we-built-filmgrain-part-2-of-2/
======
jallmann
Props for trying a new approach, Just Because. Here's some feedback:

> newlines make sense as indicators for the end of a message

Until someone decides to inject a newline into your JSON, then you've got a
problem [1]. Just use a length-prefixed encoding, there are tons: bencode,
tnetstrings, etc. Also, JSON objects delimit themselves, although you could use
a proper stream parser to handle fragmented messages better. A secondary
transport like websockets (as others have mentioned) would also give you
framing for free.
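
To make the length-prefix idea concrete, here is a minimal Python sketch. The 4-byte big-endian header and the names `encode_frame` and `FrameDecoder` are assumptions for illustration, not anything from the Filmgrain protocol; bencode and tnetstrings each define their own formats.

```python
import json
import struct

# Assumed wire format: 4-byte big-endian unsigned length, then a UTF-8 JSON body.
HEADER = struct.Struct(">I")


def encode_frame(message: dict) -> bytes:
    """Serialize one message as <length><body>."""
    body = json.dumps(message).encode("utf-8")
    return HEADER.pack(len(body)) + body


class FrameDecoder:
    """Reassembles complete messages from arbitrary chunks of a TCP stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes and return every message that is now complete."""
        self._buffer.extend(data)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet; wait for more bytes
            body = bytes(self._buffer[HEADER.size:end])
            del self._buffer[:end]
            messages.append(json.loads(body.decode("utf-8")))
        return messages
```

Because the reader counts bytes rather than scanning for a delimiter, a newline inside the payload is just data, and a message split across several TCP reads is reassembled once enough bytes arrive. This sketch does not cap the declared length, which a real server would need to do.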

> Load balancing

Just use DNS. You're already making a DNS request for S3, so here you have a
redundant HTTP request, and leak network-level operational details in
application logic.

[1] Ignoring HTTP-style newlines here for a moment; common implementations are
battle-hardened, but it doesn't mean newline-terminated protocols are really a
good idea...

~~~
treeform
You know how to give feedback! Thanks! +1

Would length-prefixed encoding also cause issues if a malicious client sends
lengths that are too big, just like malicious clients can now send JSON with
newlines?

I've been using websockets in my editor for a while, but for a mobile app they
seemed like they just had extra stuff we did not need. But I am warming up to
them now.

Honestly we thought about it, but did not use DNS because it's scary. The
errors and propagation slowness can cost you a ton of uptime. I feel safer with
a static DNS.

~~~
jallmann
Set an upper bound on the length of the message, and everything within it --
constrain string lengths, numeric ranges, nesting depth, etc. This should be
built into the parser. Discard messages that are truncated, malformed or don't
conform to the schema (don't forget to check types...).
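
A rough Python sketch of those checks, under stated assumptions: the limits, the `RejectedMessage` exception, and the `type`/`id`/`text` schema are all made up for illustration. It also validates after `json.loads` rather than inside the parser, which is simpler but weaker than what is described above.

```python
import json
import struct

# All limits below are assumed values; tune them to the real protocol.
MAX_FRAME_BYTES = 64 * 1024
MAX_TEXT_CHARS = 500
MAX_ID = 2**63 - 1
MAX_DEPTH = 4


class RejectedMessage(ValueError):
    """Raised for any message the server should discard."""


def read_length(header: bytes) -> int:
    """Validate a 4-byte big-endian length prefix before reading the body."""
    if len(header) != 4:
        raise RejectedMessage("truncated length header")
    (length,) = struct.unpack(">I", header)
    if length > MAX_FRAME_BYTES:
        raise RejectedMessage("declared length exceeds limit")
    return length


def too_deep(value, limit: int) -> bool:
    """Return True if containers nest more than `limit` levels (iterative walk)."""
    stack = [(value, 1)]
    while stack:
        node, level = stack.pop()
        if isinstance(node, dict):
            children = list(node.values())
        elif isinstance(node, list):
            children = node
        else:
            continue
        if level > limit:
            return True
        stack.extend((child, level + 1) for child in children)
    return False


def parse_message(body: bytes) -> dict:
    """Parse one message body and enforce the hypothetical schema."""
    if len(body) > MAX_FRAME_BYTES:
        raise RejectedMessage("body exceeds limit")
    try:
        message = json.loads(body.decode("utf-8"))
    except (ValueError, RecursionError) as exc:
        raise RejectedMessage("malformed or truncated JSON") from exc
    if not isinstance(message, dict) or too_deep(message, MAX_DEPTH):
        raise RejectedMessage("unexpected shape")
    kind, ident, text = message.get("type"), message.get("id"), message.get("text")
    if not isinstance(kind, str) or not isinstance(text, str):
        raise RejectedMessage("type and text must be strings")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(ident, bool) or not isinstance(ident, int):
        raise RejectedMessage("id must be an integer")
    if not 0 <= ident <= MAX_ID or len(text) > MAX_TEXT_CHARS:
        raise RejectedMessage("value out of range")
    return message
```

Checking the declared length before allocating or reading the body is what stops a client from claiming a 4 GB message; the remaining checks run only on data that already fit within the cap.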

Load balancing is ultimately your call, and HTTP might be the simplest
solution if very quick turnaround is critical. But don't be afraid of the
Internet's plumbing, understanding it will make you a better engineer. For
example, you can still have fail-over to any of the DNS A records (completely
transparent with any decent HTTP/websocket client), or if you're _seriously_
concerned about availability, anycast/multihoming. But I'd put money on you
having more downtime due to errors on your end or with your provider, than
with DNS.

Basically, these problems have been studied for a long time, so it's valuable
to be comfortable with the various approaches. FWIW, adding extra A records is
just a couple clicks on Linode, and they don't make you pay for each request
to boot, unlike S3.
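
For what fail-over across A records looks like from the client side, here is a small Python sketch. The function name `connect_with_failover` and the timeout are assumptions; Python's own `socket.create_connection` already loops over resolved addresses in much the same way, so this only spells the loop out.

```python
import socket


def connect_with_failover(host: str, port: int, timeout: float = 3.0) -> socket.socket:
    """Try each address the resolver returns until one accepts the connection."""
    last_error = None
    # getaddrinfo returns one entry per resolved address (A and AAAA records).
    for family, socktype, proto, _canonname, address in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(address)
            return sock
        except OSError as exc:
            # This server is down or unreachable; move on to the next record.
            last_error = exc
            sock.close()
    raise last_error or OSError("no addresses resolved for " + host)


# Usage with a hypothetical hostname that has several A records:
#   sock = connect_with_failover("api.example.com", 4000)
```

The client needs no list of servers baked in: adding or removing a record changes which addresses the loop sees on the next lookup.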

------
johnrob
HTTP lets you inherit a large and diverse body of infrastructure - load
balancing, monitoring, testing, logging, etc etc.

Also, some combination of websockets and/or keep-alive would yield most of the
benefits of using just TCP.

~~~
pbsdp
Most of the complex HTTP infrastructure exists to support the complex HTTP
protocol.

Get rid of HTTP, and pretty much all of the complexity of the HTTP stack
evaporates.

~~~
treeform
Exactly! You said what I could not say, but that's exactly how I feel.

------
nitrogen
After ages fighting slowness and lag on an HTTP-based API for some hardware
(Hue lights) that could save 98% of its bandwidth using a TCP interface, I was
thinking just last week that HTTP obsession is getting out of hand, and that
I'd like to see more developers using TCP directly. Good on you.

~~~
bradleyland
HTTP _is_ TCP. Or rather, HTTP occurs over TCP. There's a lot of confusion of
terms flying around in this thread. TCP is a transport layer technology. HTTP
provides signaling and data transport via TCP (actually, HTTP doesn't care
what the transport layer is; e.g., you can do HTTP over IPX/SPX if you'd
like). What's really being said is that HTTP is stateless, and stateless HTTP
isn't always the best solution.

When you say "an HTTP-based API for some hardware (Hue lights) that could save
98% of its bandwidth using a TCP interface", I really raised an eyebrow.
You're already using a TCP interface. I think what you meant to say was that
it could save bandwidth by using a stateful protocol instead of HTTP, or
even a custom binary protocol.

~~~
nitrogen
Yes, I know HTTP runs over TCP. I expected the context to be sufficient to
infer that I meant a non-HTTP, single-connection, non-JSON protocol over
TCP/IP. Being able to send one TCP packet with a payload of "SET
1,hsb,280,255,0" is faster than a complete TCP handshake, followed by an HTTP
PUT to a lengthy URL of a JSON-formatted message. The CPU load on the puny
processor in the Hue bridge would be significantly reduced as well.
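
A quick Python sketch of the size difference at the application layer. Only the `SET` command comes from the comment above; the HTTP request is a hypothetical shape (path, host, headers, and JSON field names are assumptions), and the count ignores TCP/IP headers and the handshake.

```python
import json

# The compact command quoted above, sent on an already-open connection.
compact = b"SET 1,hsb,280,255,0"

# Hypothetical HTTP equivalent; every name in it is made up for illustration.
body = json.dumps({"hue": 280, "sat": 255, "bri": 0}).encode("ascii")
http_request = (
    b"PUT /api/example-user/lights/1/state HTTP/1.1\r\n"
    b"Host: bridge.example\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
    b"\r\n"
) + body


def payload_sizes() -> tuple:
    """Return (compact bytes, HTTP request bytes) for the two encodings."""
    return len(compact), len(http_request)


if __name__ == "__main__":
    small, large = payload_sizes()
    print(f"compact: {small} bytes, HTTP PUT: {large} bytes")
```

The exact ratio depends on which headers a real client sends, so treat the numbers as a rough comparison rather than a measurement of the Hue bridge.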

Has anybody done HTTP over IPX?

~~~
bradleyland
In an engineering context, it's necessary to be specific. Otherwise, someone
unfamiliar with the terms being used might come away with the wrong idea.

IPX is virtually dead, but it would be entirely possible to do HTTP over IPX
if you had a web server and client that could bind to an IPX interface.

~~~
nitrogen
I'll be sure to specify a specific protocol over TCP over IPv4/IPv6 over any
suitable layer 2 interface if I'm ever specifying such a system for
implementation.

On HN, I would hope that anyone confused by the terms knows they can look them
up on Wikipedia:

[https://en.wikipedia.org/wiki/HTTP](https://en.wikipedia.org/wiki/HTTP)

[https://en.wikipedia.org/wiki/Transmission_Control_Protocol](https://en.wikipedia.org/wiki/Transmission_Control_Protocol)

[https://en.wikipedia.org/wiki/IPX](https://en.wikipedia.org/wiki/IPX)

[https://en.wikipedia.org/wiki/Internet_protocol_suite](https://en.wikipedia.org/wiki/Internet_protocol_suite)

~~~
bradleyland
Sorry, I didn't mean to come across as being pedantic :) I really do think the
distinction is important. It's a foundational element of the language that
ties software and hardware together. It's the reason we have things like the
OSI model.

A couple of years ago I had a temporary run in telecom, working with a company
on a nationwide VoIP rollout. This was the first project where I was working
on technologies that were completely outside the realm of your typical web
application. I was surprised to find so many similarities. SIP, for example,
looks a lot like HTTP. The distinction between messaging and media gave me new
insights into web programming.

As I learned more and more about telephony, it became obvious that the common
language between web and telecom technologies was the separation of layers in
the various protocols that make up each stack.

Knowing, and respecting, the language that is used to communicate these
concepts helped me transition quickly between technologies. Because of that, I
try to make an effort to share this knowledge with others.

My intent is to inspire, not to chastise :) I didn't do a very good job of
that in my original comment.

------
gobengo
I get that there's more than request-response. Why not use Websockets instead?
What do you plan to do if your mobile users want a browser-based experience?

~~~
treeform
You get what many people don't! When we need a webapp, we'll write a
WebSocket protocol then. Why write stuff you don't need to? TCP with JSON was
just surprisingly simple and effective for our realtime twitter+movie app.

------
icedog
Sounds like you re-invented the wheel.

Google Cloud Messaging and (to a lesser extent) websockets would have been
better solutions.

------
bhouston
Wouldn't web sockets have worked here nicely?

~~~
Guzba
Web sockets are great for getting TCP in the browser, but since we're only
building mobile apps right now, we don't need the HTTP layer. Instead we're
using straight sockets.

