

Ask HN: Good technology stack for IP based Distributed Control Platform? - dojomouse

I have a problem of needing to build something and not having a clear idea of the best way to go about it. People who are interested (and especially experienced) in distributed control systems using web platforms - read on! Would love to hear your thoughts and have some guidance through the fog of too many ways to skin a metaphor.<p>I am the lead-and-presently-only developer for a new distributed system control platform consisting of:<p>------<p>1. Distributed units: Initially few (fewer than 100) but eventually thousands (in a couple of years) of embedded systems which are scattered all over the place doing 'stuff'. Each unit has a (probably cellular) internet connection (though eventually we may try to use a local area radio mesh to cluster them to share a single connection... or piggyback on a nearby wifi). They are fixed installations. For development I'm using the mBed platform (Cortex M3) as there's an existing community, libraries, and hardware... these few being of far higher value than custom hardware or advanced debug tools at this stage.
Each of these units has enough brains to look after itself in the short term when necessary/in event of unexpected outages (up to 1 min or so with no observable negative impact) but should generally be 'online' and able to respond to higher level commands (start; stop; rollover; reset; increase output to y; change parameters b, c, and d) with low latency (preferably below 100ms, but could live with 1s if this changes everything completely). They also need to be able to feed data up to the higher level controller, preferably also in soft real time (perhaps 20 variables of 16ish bit each, once per second... though in reality will only send when there is a delta).<p>2. Data connection: A connection/protocol able to reliably and securely support this sort of communication. I'm not talking banks/trains/nuclear-plants reliability and security... but should be at least 'I want to read my email online when I feel like it, and stop curious and somewhat sinister people reading it at the same time' level.<p>3. Controller/Server/Database/analytics: Central controller that can aggregate all this incoming data and send out new set points etc based on decisions made at central level. Data should be stored for future retrieval and continuous analysis, preferably in some form that lends itself to various (somewhat simplistic to start with) 'Big Data' type analytical and optimization processes.<p>4. Some sort of browser based front end so curious owners of the distributed systems can log in and see how their flock is going.<p>-------<p>For my background - I'm an electrical engineer; have experience developing in C/C++ (though mostly from a control implementation perspective, and supported by much more capable software engineers for the tricky OS stuff) for hard real time systems.
Have almost no experience at all with servers, clouds, interwebs, etc.<p>Since starting this project I've managed to hack together the framework of a rough demo with a LAMP stack server, reasonable facsimile of what we want for the front end (coded up from scratch in HTML/CSS, using Ajax for most of the server I/O - I'm so proud!) and as mentioned the mBed as the distributed target. Not everything works elegantly yet, but most of the parts do, and I can more or less see how to get from here to a 'working' demo.<p>The issue, and hence question (finally!): I'm sure I haven't built this in anything like an optimal way... and probably not even an adequate way for what I want the system to do in a few years. So where do I go from here?<p>There are tens (hundreds) of possible candidates for the server, for the database, for the connection.
Do I use MySQL? MongoDB? NoSQL? A big long text file?
Do I use HTTP? Modbus/TCPIP? Carrier pigeons with a highly refined bribe system?
Do I use Apache? Node.js? Tornado?<p>From playing around with the mBed I discovered Tornado and Websockets. Websockets at least seem to provide a lot of the functionality I want for the connection between 1 & 3 (distributed units and central controllers). So I'm beginning to dig into Tornado a bit and see if that's the right thing, but am painfully aware that I'm feeling my way in a dark room filled with things that I wouldn't recognize half of even with the lights ON.<p>So, if anyone remains after this too-long background: What would you suggest as a stack to build all this on? I guess my key question is about level 3 (the OS/DB/Server/Scripting/whatever), but also about whether secure websockets are a good solution for level 2 considering the data requirements (1000+ units, 100 bytes of data per unit per second during peak periods, some delays acceptable during peaks, bidirectional data flow).<p>For now I'll keep looking at Tornado/wss and trying to get them to work properly with my mBed (there are some reasonable walkthroughs on the mBed website I'm following, so I'm not completely in the dark) but would very much appreciate some input from a wiser and more experienced soul (or just wiser/smarter - I'm not proud!).<p>Thanks in advance...
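To make the telemetry in item 1 concrete (about 20 variables of roughly 16 bits each, sent only when there is a delta), here is a minimal Python sketch of one possible encoding. The wire layout (a count byte followed by index/value pairs) and the names `encode_delta` and `apply_delta` are assumptions made for illustration, not part of any existing protocol or of the current demo.

```python
import struct

NUM_VARS = 20  # assumption: 20 variables per unit, as estimated in the post


def encode_delta(previous, current):
    """Pack only the variables that changed since the last report.

    Layout (an assumption): one count byte, then for each changed
    variable one index byte and one unsigned 16-bit value, big-endian.
    """
    changed = [(i, v) for i, (p, v) in enumerate(zip(previous, current)) if p != v]
    payload = struct.pack(">B", len(changed))
    for index, value in changed:
        payload += struct.pack(">BH", index, value)
    return payload


def apply_delta(state, payload):
    """Return a copy of state with the changes in payload applied."""
    (count,) = struct.unpack_from(">B", payload, 0)
    new_state = list(state)
    offset = 1
    for _ in range(count):
        index, value = struct.unpack_from(">BH", payload, offset)
        new_state[index] = value
        offset += 3  # one index byte plus two value bytes
    return new_state


previous = [0] * NUM_VARS
current = list(previous)
current[3] = 1200
current[17] = 65535

message = encode_delta(previous, current)
print(len(message))  # 7: one count byte plus two 3-byte entries
```

With this layout a report in which all 20 variables change is 61 bytes, which fits inside the 100 bytes per unit per second mentioned above, before any transport overhead.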
======
dojomouse
The tl;dr version: Need a technology stack for the web/cloud side of a
distributed control platform able to serve 2000 connections simultaneously
with up to 100 bytes of bidirectional data per second per connection.
Reliability and security are important, but the system is highly tolerant of
short interruptions (up to 1 min) as long as they're not too frequent.
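
For a sense of scale, the figures above can be turned into an aggregate load estimate. This is back-of-envelope arithmetic on payload only; it assumes every connection runs at its peak rate all the time and ignores TCP/IP, TLS and websocket framing overhead, which for messages this small could be comparable to the payload itself.

```python
CONNECTIONS = 2000        # simultaneous connections, from the tl;dr
BYTES_PER_SECOND = 100    # peak payload per connection, in each direction

payload_bytes_per_second = CONNECTIONS * BYTES_PER_SECOND
payload_megabits_per_second = payload_bytes_per_second * 8 / 1_000_000
gigabytes_per_day = payload_bytes_per_second * 86_400 / 1_000_000_000

print(payload_bytes_per_second)     # 200000
print(payload_megabits_per_second)  # 1.6
print(gigabytes_per_day)            # 17.28
```

So the peak payload is about 200 kB/s (1.6 Mbit/s) in each direction, and storing every byte at that rate would accumulate roughly 17 GB per day.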

For now am entirely agnostic about everything including server, OS, DB,
language, connection... however leaning towards Tornado with Secure Websockets
as connection/server.
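
Whichever server ends up carrying the websocket traffic, the commands going down to the units need some message format. As a rough sketch only, here is one way to wrap the commands listed in the original post in JSON text messages. The field names (`unit`, `cmd`, `args`), the command spellings `set_output` and `set_params`, and the helper functions are all assumptions for illustration.

```python
import json

# Command names follow the post (start; stop; rollover; reset; increase
# output; change parameters). The exact spellings are assumptions.
ALLOWED_COMMANDS = {"start", "stop", "rollover", "reset", "set_output", "set_params"}


def encode_command(unit_id, command, **args):
    """Build the JSON text for one command addressed to one unit."""
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"unknown command: {command}")
    return json.dumps({"unit": unit_id, "cmd": command, "args": args}, sort_keys=True)


def decode_command(text):
    """Parse and validate a command message; return (unit, cmd, args)."""
    message = json.loads(text)
    if not isinstance(message, dict) or message.get("cmd") not in ALLOWED_COMMANDS:
        raise ValueError("unknown or missing command")
    return message["unit"], message["cmd"], message.get("args", {})


text = encode_command("unit-42", "set_output", level=75)
print(decode_command(text))
```

JSON is easy to debug from the browser side; on a Cortex M3 a compact binary format might be preferable if parsing cost or message size turns out to matter.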

Presently reading: <http://www.kegel.com/c10k.html> to try and address some of
my more gross areas of ignorance...

