Redis as a JSON store (redislabs.com)
158 points by dvirsky on Mar 22, 2017 | 113 comments

For the life of me I do not understand the JSON obsession. When I was young and naive, JSON indeed looked cool because it is both human and computer readable. Until you run into some of its limitations.

For example, AFAIK there is still no agreement on how to handle binary data, e.g. what do you do with:

  >>> json.dumps(["\xb8\xc3\xb6\xbb"])
  Traceback (most recent call last):
  UnicodeDecodeError: 'utf8' codec can't decode byte 0xb8 in position 0: invalid start byte
You can base64-encode it or do a number of other things, but then that's outside the JSON standard, and suddenly it isn't the flexible encoding standard it seemed at first...
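For what it's worth, the usual workaround looks something like this in Python. This is a sketch of the base64 convention mentioned above, not anything the JSON spec itself defines, so both sides have to agree on it out of band:

```python
import base64
import json

# Raw bytes that aren't valid UTF-8, so json.dumps() rejects them as a string
raw = b"\xb8\xc3\xb6\xbb"

# Encode to base64 text before serializing; this convention is NOT part of
# the JSON standard, it's just a common private agreement between peers.
payload = json.dumps({"data": base64.b64encode(raw).decode("ascii")})

# The receiver has to know to reverse the encoding.
decoded = base64.b64decode(json.loads(payload)["data"])
assert decoded == raw
```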

Worse is Better: http://www.dreamsongs.com/RiseOfWorseIsBetter.html

Note that I think many people skim that too quickly and miss the point. It is more an explanation of why the "worse" things so often seem to beat out the "better" things than a simply cynical statement.

He does actually say "The lesson to be learned from this is that it is often undesirable to go for the right thing first. It is better to get half of the right thing available so that it spreads like a virus. Once people are hooked on it, take the time to improve it to 90% of the right thing." He's explicitly blessing the worse approach up front to get something working that then gets accepted and improved. It's ultimately a bird-in-the-hand argument.

See also http://urbanhonking.com/ideasfordozens/2012/06/24/designing-... about how design's goal is to keep people from noticing big changes. Makes "bird in hand" a good strategy for people trying to make a dent in the universe.

The reason people like JSON is because it maps nicely to the basic data structures in most languages. And if the data doesn't already map to those structures, then it forces the API developer to come up with a serialization strategy.

Beats the hell out of XML. The reason JSON works so well is b/c it's dead simple and no one can implement their own stupid version.

I have seen so many takes on xml and soap over the years. Every one of them made me sneer in disgust.

Bingo. It's a way to arrange three dimensional data into a string, just like XML is. It's more pithy and saves bandwidth money.

Totally agree. XML is probably over engineered but JSON is under engineered. No date type, no binaries, no comments, no easy append to existing files. We jumped from one bad standard format to another bad standard format. Other than maybe being a little easier on the eye I don't see how JSON is an improvement over other formats.

Why would you need any of that? A date type is bloat, store an epoch number. Comments have no place in a data format. Append is exactly the same as with XML: You take the part you want and add it at the right place in a list of the other file. I'd argue the lack of namespaces makes it easier than with XML.

For the binary data: JSON is not a binary format, one should not try to mix those. To encode it and store it as a string is exactly right. Otherwise a database as blob store is the better place for that.

Comments are important in a format that's being used for configuration, as JSON is. Commenting out things makes for an easier revert if something doesn't work. Having a line of comment in front of each setting is also a good thing. This is just basic practicality.

You can't easily append something to an already existing JSON file. You have to decode the whole thing first and then write it back whole. XML has the same problem. That's why I said we replaced one bad format with another.
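The append problem is easy to demonstrate in Python. Appending one element means parsing and re-serializing the entire document; there is no in-place append:

```python
import json

# Stand-in for reading a whole JSON file that holds an array.
doc = json.loads('[1, 2, 3]')

doc.append(4)                   # modify in memory
serialized = json.dumps(doc)    # must re-serialize the ENTIRE structure
assert serialized == '[1, 2, 3, 4]'
```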

JSON is not an improvement over XML. It replaced one set of weaknesses with another.

I'd much rather deal with JSON than XML. XML is a document format, JSON is a data format. You extract data out of documents. JSON is 10 times easier to traverse than XML. XML is so difficult to traverse that you have to use a special language to do it cleanly, XPath. JSON, you just read it into whatever hash map data type your language has built-in and use your standard library.
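To illustrate the point about traversal, a nested JSON document lands directly in dicts and lists in Python; access is plain indexing, no XPath-style query language needed:

```python
import json

# A nested document deserializes straight into built-in containers.
doc = json.loads('{"user": {"name": "alice", "tags": ["admin", "dev"]}}')

name = doc["user"]["name"]          # plain dict indexing
first_tag = doc["user"]["tags"][0]  # plain list indexing
assert name == "alice" and first_tag == "admin"
```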

"XML is a document format"

And works very well as a document format, and JSON works very poorly as a document format! So if you have documents (a novel, an article, a manual, etc.), XML is still a very nice choice. But was a mistake to use XML as a data format.

(Just elaborating on your point.)

How do you know that you are dealing with a document or data? Isn't a document just data too?

I'm not an expert on the difference, but informally, I'd say that the difference is this: a "document" will have more and/or larger passages of text, and the fields are mainly meant as metadata to add meaning to it. A good example would be a real document like a book - when stored in an XML file, a lot of the content would be the text of the book, and you would have some XML elements like <book>, <chapter>, <section>, <TOC>, etc. But for "data", a good example would be an invoice or a purchase order or a part record - while they do have text, it will be mainly in the form of structured fields (typically not too long, unlike a chapter or section in a book), which can again be stored in XML elements. And this is where the siblings and others are arguing about - which (XML or JSON) is a better format for what type - document or data.

"(a novel, an article, a manual, etc.)"

So something meant to be read by humans. XML is nice for marking up the actual text with structure like chapters, sections, footnotes, references, formatting, etc. (thus, the ML in XML comes from "markup language")

With JSON, there isn't really a notion of prose text to "mark up". It is structured data meant to be processed by machines.

(Of course it is useful for humans to read and write JSON sometimes, but JSON is primarily meant to be input to or output from some computing process. You wouldn't sit down to read a JSON document unless you are some super weird programmer dude.)

Data is sharply shape-constrained while documents aren't. You can easily look at data and devise traversal schemes to get at the information you want, whereas with documents, this isn't nearly as easy. If you're ingesting XML into a database, it's often easier to just put the whole XML in as a text field, whereas with JSON data, you can just extract the fields you want and insert them in.

Maybe we shouldn't use JSON as a configuration format. There are numerous better alternatives such as INI files and YAML.

INI files are cool, IMO [1]. Though they may have originated in Windows or DOS, other tech uses them too nowadays (and has for a long time, of course). Many languages have support for INI files in their standard or 3rd-party libraries. I know Python does, in the ConfigParser module. In fact, I'm currently using it in a project I'm working on for a client.

[1] I think they are cool because of their simplicity - it is just a two-level hierarchy - sections and options within sections. And options are key-value pairs. For many applications, this is enough, and any format that supports more may be overkill (though would still work).

Edited to add the [1] footnote.
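For reference, the two-level hierarchy looks like this with Python's standard library (Python 3 renamed the module to configparser; same idea either way):

```python
import configparser

ini_text = """
[database]
host = localhost
port = 5432
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

# Two levels only: sections, then key-value options within them.
assert config["database"]["host"] == "localhost"
assert config.getint("database", "port") == 5432
```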

I personally use INI files quite a bit but the cool kids insist on JSON.

TOML seems like a nice middle ground, it's gaining some popularity in the Go world.

or HOCON which is often used in Scala ;)

The creator of JSON is right: without comments, people won't be tempted to encode directives in comments like we have with XML.

Want comments? Pipe through json5. Still vastly simpler than XML.

It's hard to take people's criticism seriously every time JSON comes up. There's a sibling comment that even suggests that it's a mistake to use JSON for config in the first place. C'mon.

The mistake here is using JSON for configuration.

JSON is not for people to write, despite what people try to do. It's for machines.

From [1]: "JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write."

[1] http://www.json.org/

It may say that, but its main use is as a serialization format for machines. The lack of comments makes it apparent that it is not for people. When you want comments, what you actually want to do is write code in a real programming language.

Or, maybe, our tools should evolve with an eye on how they are actually being used. JSON/YAML/TOML/INI are all intended to store arbitrary data structures, and are all required to be written and read by algorithms. It doesn't make sense to split that world into "these are usually created/consumed only by software" and "these are for humans".

This whole thing is just a variation of the "you aren't doing it right" syndrome. Be it Scrum, OO, FP, XML, JSON, whenever you criticize one of them people will tell you "this is not what it was intended for" even though a lot of people use it that way.

But the more valid form of this argument is "not the right tool for the job". Trying to be all encompassing is how we wound up with XML.

There are many, many situations when being able to write or edit JSON data by hand is extremely useful.

JSON comes from Javascript; it is totally for people to write, like they write Javascript object literals. If that weren't the case, JSON would have been a binary format, which is much more efficient for machines to read, write and transmit. That's also why a significant number of JSON libs support comments. They'll make their way into the spec, sooner or later; that's inevitable.

Nothing is inevitable. If that really were to happen people like me (though I actually still use a lot of XML) would move on to the next simple data exchange format.

That's not what I see happening. JSON is being used a lot for things that should be human readable. If we want formats purely for machines, we should go binary. Easier to parse and much more efficient.

Binary formats are hard to read. JSON can be pretty printed and read easily.

I thought it's for machines? If humans are supposed to read it then comments would be helpful.

Comments do have a place in a data format if the format is intended to be human readable.

To expand on what the other commenter said.

The creator of JSON was explicitly against comments for a few reasons. The biggest one was that comments are often used to "extend" a format. Look at doc-block, annotations, etc...

In order for the format to work, everyone had to use the exact same set of rules, and that meant that if people started adding their own "crap" to the format in a "backwards compatible" way you'd lose what makes it special.

So he didn't add comments. And I personally think it worked out.

This argument against the inclusion of comments is a good argument in favor of the inclusion of a date data type and a binary data type.

Right now, there are a number of different ways to encode both types of data, and you see them all in the wild. And extra custom serialization and deserialization steps are always required to handle them.

So now lots of JSON parsers have extensions to ignore comment extensions to the format...

Ah ok, I hadn't considered that, that's quite clever really.

I think JSON is intended to strike the right balance between readability by machines (primary goal) and readability by humans (secondary). Comments don't make sense to the machine and are just overhead. And they tend to be abused or extended in hacky ways. So I'm glad they're not there.

"Otherwise a database as blob store is the better place for that."

Actually, I was thinking something more like Protocol Buffers or Avro if you are looking for a binary format for transmitting structured data.

Coincidentally I was looking at MessagePack recently. It seems to have some interesting features. Not done a proper study of it vs. other competing or comparable formats, though.

Sure, both seem better for that. At that moment I was thinking only of storage, not transmitting.

> A date type is bloat, store an epoch number.

Unix timestamps are lossy.

1. No time zones: you can't recover the sender's time-zone from it. (Not the end of the world, obviously.)

2. No defined calendar: if you convert an arbitrary datetime into a timestamp, you don't actually have enough information to convert it back into a datetime, because the same instant of continuous time has different representations in different calendars.

3. The unix epoch is very recent, and negative timestamps (for times before 1970) are poorly supported in practice: you often can't reliably convert historical datetimes to timestamps. If you're trying to pass around e.g. medical records tagged with their dates, what do you do with the ones about things that happened before 1970?

(And without problem 3, problem 2 would be even worse: you'd have to care about calendars beyond the Julian and Gregorian ones, to represent those historical dates. Some calendars don't map monotonically to continuous time! Some instants have multiple representations in the same calendar! Some instants have no representation in a given calendar!)

But the worst problem of all, that affects you even if you don't care about encoding weird historical dates:

4. Timestamps don't "care about" leap-seconds [i.e. the standard doesn't force timestamps to either include or exclude them.] Therefore, every time there's a leap second, Unix time becomes less precise by one second (the integer representing a timestamp after a leap second could map to dt n, or dt n-1, depending on if [and exactly when!] the system that generated it adjusted its clock for the leap-second.) That means that right now, if an unknown system handed you a timestamp that purportedly represents the current time, you'd not actually be able to know that with less than a ±27 second error-bar. That number will only keep climbing.

> to encode it and store it as a string is exactly right.

That's not actually the problem. It'd be fine if JSON took binary data and then represented it in some human-readable encoding like Base64.

The problem is that JSON doesn't specify the representation of binary data, or the mapping between binary data and such a representation. So you can't take a binary buffer, drop it into a JSON serializer, and expect it to pop out the other side on any random JSON-speaking system as a binary buffer. Instead, both sides have to have known characteristics (an agreement to represent binaries in JSON a certain way.)

The whole point of a serialization format is to bundle up those guaranteed characteristics, so that once you know "this system speaks JSON", you don't have to ask any more questions. JSON fails at being a serialization format because you still have to ask the other side how it "expects to see" binaries represented, or dates represented.

I was actually stopping myself from editing my comment and extending that sentence :) You are right, epoch has some drawbacks. But if you care about any of them, there are a number of different string formats you can use instead. Just save the date as string in iso8601 format. That you can later parse in every language.
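A quick Python sketch of that ISO 8601 approach; unlike a bare epoch number, the string keeps the sender's UTC offset (fromisoformat requires Python 3.7+):

```python
from datetime import datetime, timezone, timedelta

# A datetime with an explicit UTC offset, which an epoch number would discard.
tz = timezone(timedelta(hours=-5))
dt = datetime(2017, 3, 22, 9, 30, 0, tzinfo=tz)

encoded = dt.isoformat()                   # '2017-03-22T09:30:00-05:00'
decoded = datetime.fromisoformat(encoded)  # round-trips on Python 3.7+
assert decoded == dt
assert decoded.utcoffset() == timedelta(hours=-5)
```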

> JSON fails at being a serialization format because you still have to ask the other side how it "expects to see" binaries represented, or dates represented.

That's not a failure, it is a feature. The moment you store binary stuff you always have to communicate with the other side about what exactly is represented and how. JSON does not make false promises. It stores a few main data types and a structure, and that's it.

This is the whole "Just use a string" vs "We need to have data types for everything". People who like Unix vs people who like PowerShell, dynamic vs static typing, hackers vs business types. It's a philosophy thing, not a "this is the right way" (apart from mine, obviously).

JSON is (heavily used by people as) an OSI layer 6 encoding—a "presentation layer."

Other examples of presentation-layer encodings: ASN.1's encodings [DER, XER]; YAML; the Erlang External Term Format.

These encodings take a set of native types, and represent them with canonical encoded/serialized forms. They don't have to allow for faithful bijective encoding of application or domain types (although YAML does), but they are expected to be able to represent any reasonably-common "part of the runtime" scalar or container-ADT type which you might want to use to build your domain types out of.

Presentation-layer protocols are the basis of RPC-like protocols like REST or SOAP: you define an application-level message encoding to give semantic meaning (i.e. an application-level type) to terms which have passed through an RPC channel, where those terms arrive in your runtime after RPC decoding with types guaranteed by the presentation-layer protocol you've chosen.

JSON is very limited as a presentation-layer protocol, because it guarantees representations for very few types, and many types that are needed to effectively implement RPC-like protocols are not in that small set. This is recognized by some, as there have been a good few attempts to define a presentation-layer encoding "on top of" JSON rather than just using JSON itself... but almost everybody ignores these [fine] layer-6 encodings in favor of continuing to create "JSON APIs" (i.e. layer-7 protocols that specify JSON itself as their layer-6 carrier.)

The creators of these JSON APIs then find themselves forced to specify how their particular API represents {binaries, datetimes, sets, strict maps, exceptions, UUIDs, ...}—in other words, to define their own one-off layer-6 encoding atop JSON. And the authors of clients for these APIs find themselves forced to write their own logic to parse and generate representations fitting these specifications—even though they're almost always exactly the same choices as every other API creator made.

Choosing a different, richer layer-6 encoding allows both the API creator, and the client authors, to just avoid all that work. Instead, the people who write tooling conforming to the standard for each runtime will do that work, once, and all the API designers and API-client implementors get to benefit by relying on that tooling. Which is rather the point of a standard.

> Presentation-layer protocols are the basis of RPC-like protocols like REST or SOAP

A nitpick, but, REST isn't RPC-like, and is an architectural style, not a protocol.

Right, I was trying to think how to phrase that, but I couldn't come up with anything better.

There's a sort of de-facto layer-6 protocol in the way people use REST in combination with JSON [usually using AJAX calls in browser SPAs] to do RPC, by PUTing or POSTing JSON documents to REST endpoints, and then reading back JSON response bodies. It's a "protocol" where HTTP is doing some of the presentation-layer type-encoding, the "application/x-www-form-urlencoded" encoder is doing some more (for e.g. GET query parameters), and JSON is doing the rest.

The "browser-like JSON over REST" RPC approach is so common that it's nearly synonymous with "REST." If you say "we've created a RESTful API", you usually mean that you've defined a layer-7 protocol where browsers use AJAX to make JSON-bodied requests to a hierarchy of REST-architected endpoints, expecting JSON-bodied responses.

(There's also JSON-RPC, which a clearer example of a standalone layer-6 protocol with the same problems, but it's not in wide-enough use to really matter.)

Upvoted for that last sentence :)

I'd argue that all of those are either problems in XML, or considered features. Datatypes and comments would give room for extensibility, which is against what JSON is intended to be. JSON should be instantly readable by any system, it shouldn't need a schema describing what all the datatypes and comment annotations mean first (because that's inevitably where comments would lead). Appending is a problem in any structural format, XML is not immune. Binary data is a problem, I will agree.

The real benefit to JSON is that it's so lean and easy to parse. I can get a hashmap or a plain object out of some JSON text in a few lines of code in almost any language, the same can't really be said of XML, especially not the kind of XML usually used as a data interchange. Additionally, that lean-ness means it has significantly less data to transfer, I've seen JSON versions of data be 1/10 the size of the XML equivalent, which can matter at the far end of the spectrum.

The real strength of JSON, though, is arrays. List structures have always been a major weak point of XML, as there's no way to do it without a million tags. God help you if your list has anything other than a primitive in it.

>We jumped from one bad standard format to another bad standard format.

So, what do or would [1] you consider a good format. Asking because I'm interested in data formats. They are inputs for my xtopdf toolkit, plus I'm generally interested in data munging and have done it a good amount.

[1] "do" for existing ones, "would" for non-existent ones.

Even something as simple as integers is problematic, because they are underspecified. Since javascript uses 64 bit floating point numbers for all numeric types, 64 bit integers cannot accurately be represented. If you need 64 bit integers, your best bet is to use strings...

How is that a JSON issue? Its inability to specify the type of a data element?

A better format would specify the precision of integers so you can depend on it. Now you need to know the limitations of all implementations if you want to use it as an interchange format. It doesn't even mention the issue in the spec, even going as far as saying, "A number is very much like a C or Java number"[1], which is misleading to say the least. Your API written in C might work with a Java client, but if someone writes a JavaScript client, it will break.

I'm sure there is a ton of code out there that doesn't take this precision issue into account. It seems like Twitter ran into that issue at some point, because in their APIs they have both id and id_str (see [2]).

[1] http://www.json.org/

[2] https://dev.twitter.com/overview/api/twitter-ids-json-and-sn...
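The precision cliff sits at 2^53, and you can see it from Python by simulating the float round-trip a JavaScript client performs:

```python
import json

# Twitter-scale IDs exceed 2**53, the largest integer a 64-bit float
# (JavaScript's only number type) can represent exactly.
big_id = 2**53 + 1  # 9007199254740993

# Python's json module keeps the integer intact...
assert json.loads(json.dumps(big_id)) == big_id

# ...but a JavaScript-style float round-trip silently changes the value,
# which is why Twitter ships both `id` and `id_str`.
assert float(big_id) == float(2**53)   # precision lost
assert int(float(big_id)) != big_id
```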

"I don't understand JSON, what do you do if JSON doesn't cover all your use cases?"

Uh, then don't use it. Nobody said JSON was the answer to everything.

Heh, people sure do use it like it's the answer to everything, and "don't use it" stops being an option if you're writing to a JSON-based interface.

JSON's fine if you don't have any requirements around data serialization and you want it to "just work" for your webapp, but there's a lot of tech debt inherent in it.

So you dump a report in JSON format and back it up to S3. S3 costs are growing faster than you thought, so you gzip deflate all of it. Everyone has to go patch their JSON deserialization to detect gzip extensions. Whatever, just growing pains.
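That gzip-detection patch everyone had to write tends to look like this sketch (the `load_report` helper is hypothetical; it sniffs the two-byte gzip magic number):

```python
import gzip
import json

def load_report(blob: bytes):
    # Hypothetical reader that copes with reports that may or may not
    # have been gzipped after the fact.
    if blob[:2] == b"\x1f\x8b":  # gzip magic bytes
        blob = gzip.decompress(blob)
    return json.loads(blob)

report = {"rows": 3}
assert load_report(json.dumps(report).encode()) == report
assert load_report(gzip.compress(json.dumps(report).encode())) == report
```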

Then another team tries to read the reports, and they're getting errors because your definition of an interface is "we'll just use JSON, the keys are human-readable".

You define a formal API for your report format and in doing so you realize the need for versioning attached to your report schema, so you wrap all your JSON objects with types and version annotations. You could define a central repository for these schemas, but it's easier to just bake them into the top-level response. Everyone agrees that this is "lightweight" and not "centralized".

Now you're storing reports where each sub-object has its own annotations, or you're defining an entire schema at the object level. Object deserialization is taking 200ms even for small payloads, because of all the validation callbacks you're firing, and developers are now "performance hacking" their components by disabling validation callbacks. Now you have all the space overhead of schematic annotations with none of the benefits.

In order to adhere to the API, either teams are writing separate serialization libraries, or you form a team to maintain them as infrastructure, which is a great idea except that the horse already left the barn 2 years ago.

Without even realizing it, you've reinvented XML and XSD. And I don't really like XML either but at least you have to be honest about what you're getting yourself into.

Often you're forced to use it, because the infrastructure you're working with only speaks JSON (e.g. databases whose wire protocol is JSON-based, like CouchDB.)

And even if you get to pick your own infrastructure, a lot of development effort is "stolen" by JSON-focused tooling and infrastructure, such that there's very little development effort given to protocols that aren't as limited.

Consider: how many languages that offer a "batteries-included" JSON parser+generator, also offer a "batteries-included" parser+generator for e.g. ASN.1?

There are a few things that must be parsed to/from strings, but that is usually safer as a string, since implementation details may vary by environment anyway... short of using a binary representation (there are a few out there), which may or may not be faster in any given language.

Also, much like shoving large blobs in most databases, maybe you shouldn't be shoving large blobs into JSON?

RethinkDB's protocol became some bastardized serialization of Protobuf definitions over JSON. The reason is language support and the higher performance of native JSON implementations in dynamically typed languages.

ASN.1 is an abomination, nobody would choose to use it if they could use JSON instead. Protobuf, Thrift, Cap’n Proto, Msgpack, XML, etc. have tooling for many languages.

Clearly almost no one is asking for alternatives.

Off the top of my head -- protobuf, thrift, avro, XML, lua tables, properties files, yaml.

I think a lot of companies would like you to believe that though, because the novelty of their product depends on it.


- it's a lot like XML but with less red tape. Looks nicer to humans. Also allows expressing the semantic difference between an (ordered) array and a map.

- it maps extremely well to Javascript, and a lot of APIs are consumed by web clients written in Javascript these days

I think the latter is the real explanation.

That being said, I agree that JSON is not a panacea. My personal pain points:

1) Missing an elegant way to express an unordered set. IMHO the three basic collection data types are array, set, and map. I think JSON would be stronger if it could express a set like you can do in ES these days: { "a", "b" }

2) Not really DRY if you encode a large array of objects and send them over the wire. So yeah, your API should support more compact formats as well.

3) Allowing comments would be really nice when we use JSON for e.g. package.json. I don't think it would make the parsers any slower so why not? XML has it.

For example, AFAIK there is still no agreement on how to handle binary data, e.g. what do you do with: >>> json.dumps(["\xb8\xc3\xb6\xbb"])

Strings in javascript/json are UTF-8, UTF-16, or UTF-32 encoded[0]. So providing a non-Unicode string is going to fail.

Instead, use an array of numbers limited to the byte range (as you would for any other binary data).

  console.log(JSON.stringify({buffer: [0xb8, 0xc3, 0xb6, 0xbb]}))
[0] - http://rfc7159.net/rfc7159#rfc.section.8.1

It seems like Base64 or beyond would be a vastly better way.

I would personally use Base64 or Hex encoded binary data too if I controlled both sides of the communication.

However, the original commenter was worried about stuff "that's outside the JSON standard,". Storing byte-sized numbers in an array is well within the JSON standard, and any system that could consume json would be able to consume it without issue, and without additional dependencies.

It seems to me that an array of ints depends on agreed convention (beyond the JSON standard) just as much as a base64 string does. In general, interpreting the meaning of a JSON primitive (int, string, boolean) is something that is being agreed upon by the users of the format.
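Right, both conventions sit on top of the standard; a quick Python comparison also shows the size cost of the int-array approach:

```python
import base64
import json

raw = bytes(range(256))  # arbitrary binary payload

# Convention 1: array of byte-sized integers (valid JSON, but verbose).
as_ints = json.dumps(list(raw))

# Convention 2: base64 string (also valid JSON, far more compact).
as_b64 = json.dumps(base64.b64encode(raw).decode("ascii"))

# Either way, the receiver must know which convention was used.
assert bytes(json.loads(as_ints)) == raw
assert base64.b64decode(json.loads(as_b64)) == raw
assert len(as_b64) < len(as_ints)
```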


I feel the same about JS in general, yet it rules the world.

All I want is a standard date format. I had it with .Net's `/Date(xyz)/` format.

What's wrong with Unix timestamps (ie: integer or decimal) and UTC offset (string)?

Nothing, pick either and make part of the standard so that when it gets deserialized it becomes a actual date instead of a string or number.

Wouldn't it be annoying if numbers and booleans didn't exist in JSON, and you just used strings? Same thing: JSON has a few data types, it's just missing a very common one.

That would severely complicate implementations, not to mention encourage bad programming practices, all for dubious convenience.

Somewhat surprisingly, human readable dates are an absolute mess of historical convention and complicated geophysics. To write a human readable date, at the most basic level, requires us to account for both calendar form and time zone.

But even that isn't enough. To get these display dates to match up with everything else in the world we also have to worry about a whole smorgasbord of national holidays, differing leap year conventions, leap seconds, historical mishaps, typographic conventions, and a host of even more obscure minutia.

All this complexity obscures from the programmer that human readable dates, for the above reasons, are almost useless at reliably determining time sequences.

Instead, we should think of dates as UI details on the same level as a color scheme. Usually what we want is better thought of as a point in a time sequence, which is best represented as a numeric timestamp, e.g. Unix time.

I think if you have control over your data you can take into consideration whether this is a likely scenario for you. If it's not, then just run with it. A lot of applications can get by with zero errors sticking to JSON. The ones that can't should just use something else.

>Since there does not exist a standard for path syntax, ReJSON implements its own.

Well there's this: JavaScript Object Notation (JSON) Pointer RFC https://tools.ietf.org/html/rfc6901

Looks like ReJSON just uses periods instead of slashes as the delimiter, except for also using a period for the root object.
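For illustration, the same element addressed both ways (the dotted path and the tiny resolver below are hypothetical sketches, not from either spec's tooling):

```python
# RFC 6901 JSON Pointer vs a ReJSON-style dotted path for the same element.
pointer = "/store/book/0/title"   # JSON Pointer (RFC 6901)
dotted = ".store.book[0].title"   # ReJSON-ish dotted path (illustrative)

def resolve_pointer(doc, pointer):
    # Naive pointer walk over plain Python structures; ignores the
    # ~0/~1 escaping rules from the RFC for brevity.
    node = doc
    for token in pointer.lstrip("/").split("/"):
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node

doc = {"store": {"book": [{"title": "Redis in Action"}]}}
assert resolve_pointer(doc, pointer) == "Redis in Action"
```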

We wanted to start with something simple and intuitive, and since we're not doing indexing yet, we figured complex multi-selectors are overkill at this stage. What we chose is basically the most rudimentary subset of JsonPath.

Once we add indexing I'm guessing the path expressions will need to become more complicated, and we'll try to choose a "standard" and stick with it.

I think JsonPath is a great start and I hope you continue to support it. I'm excited to see JSON in Redis and plan on kicking the tires very soon.

Thanks! What I'm looking forward to is adding secondary indexing to this module :)

Thank you, I didn't know about that one... good thing HN has comments :) Do you know if anybody is actually using RFC6901?

JSON Patch (http://jsonpatch.com/) also uses it.

JSON Schema uses the Pointer spec for references.

The graphic type used to show performance is inappropriate because line slopes have no significant meaning. A bar graph would be more appropriate.

What's the indexing story? Do it yourself? "Any JSON store implementation will contain a half-specified bug-ridden implementation of half of Postgres's GIN/GiST for jsonb"? :D

We're also developing a secondary indexing module (https://github.com/RedisLabsModules/secondary) and a search module (https://github.com/RedisLabsModules/RediSearch), the direction is to integrate one of them into the JSON module.

Wise words -- a good set of indices are the first thing you need for real problems with realistic data sizes.

Right now the problem that ReJSON tries to solve is very specific and typical for Redis users: You save JSON blobs on Redis (usually session data), now you want to get just a part of a JSON key, or manipulate just one element in a specific key.

Until now you couldn't do it without loading the entire object into memory. ReJSON just allows you to manipulate that object or retrieve just a part of it efficiently. For this use case we do not need and will never need indexing.

By that logic, redis in general would not be useful for "real problems", whatever that is.

It's Redis...

We use Redis Queue (RQ) as a listener cache at the Twitter firehose. Currently it is processing 300,000 tweets in JSON format per day and queuing them for further processing. Lovely piece of software (YTD experience ~ 2 months production environment usage)

Cool, could you show me your app? I'm just interested in how you handle the Twitter Streaming API.

Are you using ReJSON?

This is a welcome addition. I have expected JSON to become a first-class citizen in REDISland for quite some time. In fact, REDIS has always been my go-to store for JSON data. In 2013 I created a microservice[0] for accessing JSON data stored in native REDIS lists[1] via HTTP(S).



Wow, this looks rad. I'm just now working on a data storage module for my hobby project, and have been pushing strings through JSON.parse to store data in Redis. Being able to directly address values deep in the hash is great: when setting foo.bar to 'baz', a path selector beats reading, deserializing, modifying, reserializing, and writing the whole object.

Naive question: does anyone know enough about this or modules generally to tell me whether ReJSON values are manipulable from Redis Lua?

That is, I have some application support from Lua scripts manipulating native types today. But if I can read and write into JSON stored this way from Lua, that might be very handy!

You can use modules from Lua if they are loaded; there is no difference between them and native commands. You can certainly manipulate ReJSON objects from Lua, parse ReJSON-returned objects from Lua, etc. No limitations as long as the module is loaded.

Great, that is promising... thank you for the clarification!


> Windows

> Yeah, right :)

How about just "There are no plans to build a Windows version"?

How about not policing someone else's expression?

Shouldn't storing JSON as a plain string give better performance for Redis? All the extra work to find subkeys puts extra load on Redis, while pulling out the string and parsing it in the application would minimize the time spent in Redis. This gets more true as the number of applications hitting one Redis instance grows.

If you're serializing and saving the entire object always - you don't need this extension.

But if you're storing say, a 1K user profile, and most of the time you just want to update the access time, or get some auth token - this both allows atomicity and speeds up what you are doing.

The bigger the JSON object is relative to the piece you want to retrieve or manipulate, the more efficient this becomes.

Also, atomically manipulating a field in a JSON object is currently possible only with Lua scripts - and even then, Lua has to parse, manipulate, and serialize the entire object. Whereas in this case, if you are just updating something, there is zero parsing and zero serialization going on in the module.

I really could have used this for my last side project. Moving JSON blobs between Python and Redis is pretty error-prone, especially if those blobs contain nested structures.

The page renders as 100% black and totally blank with uBlock Origin for me. Not complaining as much as pointing it out in hopes the author might see this and adjust to prevent people from bouncing.

Same here. Fanboy's Annoyance List seems to be the one triggering it.

Instead of disabling all of uBlock, you can toggle cosmetic filtering when you click on the uBlock icon. http://i.imgur.com/Ne9mHBd.png

No problems here; 7 requests blocked, but page still rendered fine.

Same here. I'm using the default uBlock Origin configuration, so maybe it's specific filter rules, or not due to uBlock Origin at all.

For anyone else interested but having rendering problems:

A talk by the author on youtube: https://www.youtube.com/watch?v=NLRbq2FtcIk

The ReJSON website: https://redislabsmodules.github.io/rejson/

Running uBlock Origin here and it's loading fine in firefox and chrome. I have a default install, if that helps debug why it's rendering blank for you.

With uMatrix, it looks fine. You need to allow Redislab.

Sorry, can't reproduce.
