Data.sparkfun.com: A place to push your data (sparkfun.com)
216 points by jamesbritt on July 10, 2014 | 68 comments



That's the most honest IoT effort I've seen yet. The only thing better would be a VM I could spin up that let me self-host an instance of this.

Edit: I know you can install phant yourself, but I'm not a Node guy, so I don't have an environment at the ready. I do use VMware Player all the time, though, to try stuff out. An AWS image wouldn't be terrible either. It doesn't sound like a long hike to make a nice polished "IoT" backend server.


I bet a Docker image will pop up in the next 48 hours, if there's not 10 already.


The one thing that would make it better would be if it posted the data to the public blockchain, so we didn't need to depend on SparkFun's servers for retrieval purposes.


> The only thing better would be a VM I could spin up that let me self host an instance of this.

Yeah, we'll get on that.


This is pretty cool -- it's like a mini-ElasticSearch (and of course I'm overstating):

    $ npm install -g phant
    $ phant
    $ telnet localhost 8081

    phant> create
    Enter a title> library
    Enter a description> a bunch of books
    Enter fields (comma separated)> title,author
    Enter tags (comma separated)> paper,cloth

    Stream created!
    PUBLIC KEY: yrDmwO3XZEF6vMajz9oVTwPYVNM
    PRIVATE KEY:  NYzxmKOdl8tEQ0Bn2d3WhXLbn2O
    DELETE KEY:  WxpPb29VEkuyAemL8za2INraDPb

    $ curl "http://localhost:8080/input/yrDmwO3XZEF6vMajz9oVTwPYVNM.json" -H 'Content-Type: application/json' -d '{"title": "The Book of Virtues", "author": "William J. Bennett"}' -H 'Phant-Private-Key: NYzxmKOdl8tEQ0Bn2d3WhXLbn2O'

    $ curl "http://localhost:8080/output/yrDmwO3XZEF6vMajz9oVTwPYVNM.json"
    [{"author":"William J. Bennett","timestamp":"2014-07-10T18:07:13.964Z","title":"The Book of Virtues"}]
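For anyone scripting this rather than using curl, here is a minimal Python sketch of the same two calls. It assumes the local instance and the URL and header layout shown in the session above; the function names are made up for illustration and are not part of phant.

```python
import json
import urllib.request

# Assumption: the local phant instance from the session above.
BASE_URL = "http://localhost:8080"


def build_input_request(public_key, private_key, record):
    """Build (but do not send) the POST that logs one record to a stream."""
    return urllib.request.Request(
        f"{BASE_URL}/input/{public_key}.json",
        data=json.dumps(record).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Phant-Private-Key": private_key,
        },
        method="POST",
    )


def output_url(public_key):
    """URL that returns the stream's logged records as JSON."""
    return f"{BASE_URL}/output/{public_key}.json"


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `send(build_input_request(...))` performs the write, and `send(urllib.request.Request(output_url(...)))` reads the stream back.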


You should edit out those public and private keys.


Why? I believe that those keys are unique to the stream (which I don't really care about).


Sorry, I didn't see it was just a local instance you were running, so it probably doesn't matter. In general I wouldn't let your public and private keys get out, because anyone could start writing, deleting, etc. with those keys.


Thanks for looking out for me, though! One can never be too careful.


Obviously.



Not obvious to everyone. HN is a place for people to learn, as well as discuss.


Here is a Python client library [1]. It's really amazing how fast I could put this together. Thanks Guido and Kenneth ;-)

[1] https://github.com/matze/python-phant


ThingSpeak ( https://thingspeak.com/ ) is another open source data store for the Internet of Things.


Looks great, I've been looking for a solution like this. What would be really cool is to be able to query public data streams and combine them in a way similar to Yahoo Pipes.

I suspect people will also want to run aggregations/rollups of their data, and that's something we enable in Streametry [1] in addition to other analytics. I might try to build a bridge to SparkFun.

[1] http://www.streametry.com


Does it support SSL? I ask because the examples just show normal non-SSL usage and I'd be very careful of sending the public and private key in a request as authentication without SSL. How easy is it to revoke your access keys and get new ones if someone does steal the keys?


SSL is supported, but not mandatory because a lot of the hardware we're targeting just isn't up to the task.

We're aware this isn't exactly optimal. Suggestions welcome.

Re: Key revocation, that's a good question and we should probably think hard about it. It's trivial to create a new stream, but obviously that could be a pain for various reasons.


You might check out stuff like Amazon's web services (DynamoDB, etc.) for inspiration. They sign requests with a hash of the request data and private access key (and date, etc.) so the private key doesn't need to be sent in the request. The data is still visible if it's over non-SSL but it keeps the key from getting out.

Yeah no SSL support on Arduino-capable hardware is a bummer. I have been able to do Amazon's signature generation on an Arduino without too much trouble though (just need to run SHA256 hash algorithm a few times). I would definitely think about key revocation and renewal since folks will need it if someone steals their keys.
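The general idea of signing a request so the private key never travels with it can be sketched in a few lines of Python. This is a simplified illustration, not Amazon's actual signing scheme and not anything phant supports; the canonical string layout and the function names are assumptions.

```python
import hashlib
import hmac


def sign_request(private_key: str, method: str, path: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the method, path, and body hash.

    The canonical string layout here is hypothetical; a real service would
    define its own.
    """
    body_hash = hashlib.sha256(body).hexdigest()
    canonical = f"{method}\n{path}\n{body_hash}".encode("utf-8")
    return hmac.new(private_key.encode("utf-8"), canonical, hashlib.sha256).hexdigest()


def verify_request(private_key: str, method: str, path: str, body: bytes,
                   signature: str) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    expected = sign_request(private_key, method, path, body)
    return hmac.compare_digest(expected, signature)
```

The client sends the public key, the body, and the signature; the server looks up the private key it already holds and recomputes. Anyone sniffing the traffic sees the data but cannot forge a new request.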


Signing might be doable like that, but it all depends on the size of the data that you're pushing. If it's a long string of readings, that might not be so easy, since you might not be able to fit everything in memory (these devices routinely have <4k of RAM). It might be doable to go byte by byte as you build the message and output it. It's certainly a challenging problem.


Yep the libraries I've used can generate a hash byte by byte luckily without loading everything in memory. Here's an example of how I call Amazon services and generate a signature on an Arduino: https://github.com/tdicola/CloudThermometer
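The byte-by-byte approach works because hash and HMAC state can be updated incrementally. Here is a small Python sketch of the idea (the Arduino libraries differ, and `streaming_hmac` is a name invented for this example):

```python
import hashlib
import hmac


def streaming_hmac(key: bytes, chunks) -> str:
    """Compute HMAC-SHA256 over an iterable of byte chunks.

    Only the running hash state is kept, so the full message never has to
    sit in memory at once.
    """
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for chunk in chunks:
        mac.update(chunk)
    return mac.hexdigest()


def one_byte_at_a_time(data: bytes):
    """Yield the message a single byte per chunk, as a tiny device might."""
    for value in data:
        yield bytes([value])
```

Feeding the message one byte at a time yields the same digest as hashing it in one call, which is what makes this usable on devices with very little RAM.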


You could always use some form of SRP[0]. It shouldn't be too much overhead if you only do it when your session times out. At the very least you're 1) not sharing secrets and 2) limiting replay attacks to a finite amount of time.

The other method might be, as a sibling said, to HMAC the message with a shared private key. If you can specify that clocks should be pretty close, your replay attack window becomes small.

[0]: http://en.wikipedia.org/wiki/Secure_Remote_Password_protocol
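To make the replay-window point concrete, here is a hedged Python sketch: the timestamp is included in the signed message, and the receiver rejects anything whose timestamp is too far from its own clock. The 300-second tolerance and the function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import time

# Hypothetical tolerance: how far apart the two clocks may be, in seconds.
MAX_SKEW_SECONDS = 300


def sign(shared_key: bytes, sent_at: int, body: bytes) -> str:
    """HMAC-SHA256 over the send time and the body, so neither can be altered."""
    message = str(sent_at).encode("ascii") + b"\n" + body
    return hmac.new(shared_key, message, hashlib.sha256).hexdigest()


def accept(shared_key: bytes, sent_at: int, body: bytes, signature: str,
           now=None) -> bool:
    """Reject stale or future-dated messages first, then check the signature."""
    now = int(time.time()) if now is None else now
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False
    return hmac.compare_digest(sign(shared_key, sent_at, body), signature)
```

A captured message can still be replayed inside the window, so this narrows the exposure rather than removing it; tracking recently seen signatures would close the remaining gap.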


We are having the SSL/security discussion in a lot of places, but it would be great to focus the effort here if possible: https://github.com/sparkfun/phant/issues/49

Security on 8-bit microcontrollers is one of the hardest problems we've encountered when building this service, and I think it's going to take a lot of community support to help us figure out a good solution that works on most devices. Thanks for all of the input so far!


Any plans to allow the data to push back? Specifically, would it be possible to add webhooks after data was pushed to a stream? This could be amazing for a decentralized notification service.


We currently push data via websockets when you are viewing a stream, and we will probably offer live TCP output soon. Feel free to file a bug here if you would like to see webhooks added: https://github.com/sparkfun/phant/issues


I built something similar a while ago: http://getontrack.herokuapp.com/about


This is cool, but it really needs a better metadata section than just tags. Weather station data becomes a lot more interesting with its location.


Thanks for the feedback. That's a planned feature: https://github.com/sparkfun/phant/issues/13

Right now the web UI is fairly simple, but we will start adding features soon.


There is also dweet[0]. I wrote a simple Python client[1] for dweet and it's been fun playing with it. I like the approach taken here. Giving it a readable web GUI is a nice touch. Any future plans with push data?

[0] http://dweet.io

[1] https://github.com/bliti/pydweet


Looks like a few people have already hit their quota for space on seemingly legitimate streams:

https://data.sparkfun.com/streams/NJKOopLR4VC6yN59jE5L

https://data.sparkfun.com/streams/pwG8Lw6jNDT5VwlnbE45


That's how much space is remaining: "100% (50.00 of 50 MB) remaining"

The bar along the top will turn red when the stream is getting close to full.


Good call, I misread it and was shocked. Makes sense now.


This is awesome. Can't wait to see a bunch of public data from random sources. Should open up some unique project opportunities.


Tried the example on their front page, but it just tells me "0 rate limit exceeded. try again in 131 seconds."


All feeds (including the example) are rate limited on data.sparkfun.com. Here are the details: http://phant.io/docs/input/limit


That's the one that anyone can just paste and use, right? It seems like it must be rate limited as well.


Interesting! It will be fun to see what people end up using it for. I made a simple Python client library for it: https://bitbucket.org/boomlinde/pyphant


I've been building a little project called guppy. It's at http://gpy.me; it requires less registration and is still being tested right now, but it's a fun idea.


Cool. BTW, the public views lack the ability to side-scroll: https://data.sparkfun.com/streams/dZ4EVmE8yGCRGx5XRX1W


Works fine on Chrome 35.0.1916.153 on OS X 10.9.4


No luck on Chrome 35.0.1916.153 on Windows 8.

edit: Actually, using the middle mouse button and arrows work once the focus is on the right element, but no scroll bar in sight.

edit edit: The scrollbar is actually at the bottom of the page, but it does behave a bit weird.


On Chrome 38.0.2085.0 on Windows, the bottom scroll bar appears briefly when the page loads, and then promptly disappears.


Great work SparkFun! Is there any more info available on which other libraries are in stages of completion? http://phant.io/libraries/


There's some simple example code in the core phant repo:

https://github.com/sparkfun/phant/tree/master/examples

And a tiny PHP wrapper here:

https://github.com/sparkfun/SparkLib/blob/master/lib/SparkLi...


Very cool, thanks!


I'll be installing this on my own server to tinker with and will probably find the answer to my questions that way, but has anybody thought to use this to transfer binary data (e.g. zip, exe, etc.)?


Out of curiosity, what database are you using to store the data?


By default it writes metadata about the stream (title, description, etc.) using a file-based db called nedb, and it appends the actual logged data to CSV files that are split into 500k chunks. When the user requests their logged data, all of the files are stitched back together, converted into the requested format (JSON, CSV, etc.), and streamed to the user's web client.

For the production server, we are currently using MongoDB for metadata storage and the same CSV module for logged data storage.
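A rough Python sketch of the chunked-CSV idea described above may help picture it. This is not phant's actual storage module (which is written for Node); the file naming, the byte-based reading of "500k", and the function names are all assumptions.

```python
import csv
import io
import json
from pathlib import Path

# Assumption: "500k chunks" is read here as roughly 500,000 bytes per file.
CHUNK_BYTES = 500_000


def append_row(stream_dir: Path, fields, row, chunk_bytes: int = CHUNK_BYTES):
    """Append one record to the newest chunk, starting a new chunk when full."""
    stream_dir.mkdir(parents=True, exist_ok=True)
    chunks = sorted(stream_dir.glob("chunk-*.csv"))
    buffer = io.StringIO()
    csv.writer(buffer).writerow([row[name] for name in fields])
    line = buffer.getvalue()
    size = len(line.encode("utf-8"))
    if chunks and chunks[-1].stat().st_size + size <= chunk_bytes:
        target = chunks[-1]
    else:
        # Zero-padded index keeps lexical order equal to creation order.
        target = stream_dir / f"chunk-{len(chunks):06d}.csv"
    with target.open("a", newline="", encoding="utf-8") as handle:
        handle.write(line)


def read_all(stream_dir: Path, fields):
    """Stitch the chunks back together in order, yielding one dict per row."""
    for chunk in sorted(stream_dir.glob("chunk-*.csv")):
        with chunk.open(newline="", encoding="utf-8") as handle:
            for values in csv.reader(handle):
                yield dict(zip(fields, values))


def to_json(stream_dir: Path, fields) -> str:
    """Convert the stitched records into the requested output format (JSON)."""
    return json.dumps(list(read_all(stream_dir, fields)))
```

Appending to flat files keeps writes cheap, and reading is a sequential pass over the chunks, which fits a log-style workload where data is written often and read in bulk.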


That's a pretty unusual setup :)

I'd be interested in a blog post about how you chose this architecture.


Sounds good. I'll work on one once the traffic stabilizes.


Is this the same Nedb? https://github.com/louischatriot/nedb/

Looks like a pretty nice project.


Any chance of JSONP or CORS for reading the data? It'd allow for statically hosted frontends that consume a feed.

Edit: nm. Added ?callback=foo and saw it's already supported!


Quickly developed a Ruby client... PhantRb. https://github.com/girishso/phant_rb


Xively (formerly Cosm) is a similar service: https://xively.com/dev/


Unfortunately, Xively is neither free, nor open source.


Xively was one of the services we used as inspiration for Phant / data.sparkfun.com. Before they were Xively they were Pachube and things were easier to work with and free. Then Xively gobbled up Pachube and things got all business-y. Thus Xively has been an example of how we didn't want to build it.


Glad to hear it! I've been looking for a viable Pachube alternative since Pachube got gobbled up.


Try ThingSpeak on GitHub... Lots of active development and growing community.


It makes me very happy to hear this. I've always been sad that the initial promise of Pachube being an open place to store and exchange data was subsumed by corporate overlords.

This is as close to the mindset of the original Pachube as I've seen in a long while.

Nice work!


Very nice and thanks! Writing a replacement for Xively/Cosm/Pachube was on my TODO list, but now I don't have to. I'll take this for a spin later, and hope to contribute if there is anything I have to offer.


Nitrogen is a good alternative in this space that is free and open source: http://nitrogen.io


Notice how nowhere on that page does it actually say what the thing is. “A place to push your data” describes nothing specific.


Maybe it was different half an hour ago, but the page has an example of how to use it and a list of projects that could use it (as well as answers to a few other questions). I'm not sure what else you'd want on a first page.


This is certainly true for values of "that page" excluding everything after the title. :)


I really don't know much about crypto, but should you be sending your private key over HTTP GET requests?


The page listing the public data sources makes clear no one has a thing connected to the internet yet.


Any plans for a REST api? Would be useful to query for updates since a timestamp parameter.
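Until something like that exists server-side, the same effect can be approximated on the client by filtering the JSON output on its `timestamp` field. A minimal Python sketch, assuming records shaped like the output example earlier in the thread (the function names are invented here):

```python
from datetime import datetime, timezone

# Matches timestamps such as "2014-07-10T18:07:13.964Z" from the JSON output.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value: str) -> datetime:
    """Parse a record timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def records_since(records, since: datetime):
    """Keep only the records logged strictly after the given moment."""
    return [r for r in records if parse_timestamp(r["timestamp"]) > since]
```

This still downloads the whole stream on every poll, so it is a stopgap rather than a substitute for a real query parameter.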


How does phant compare to using Redis or RRDtool with node.js?


Looks like some people are abusing it:

https://data.sparkfun.com/streams/Jxyjr7DmxwTD5dG1D1Kv



