Show HN: Storex – A modular and portable database abstraction for JavaScript
52 points by Blahah 10 months ago | 21 comments
We'd like to show you Storex, a database abstraction layer that lets you move your database interaction code between the client and server side. By surrounding it with modular, recombinable packages for common problems such as schema migrations, you can re-use logic touching the database across a wide variety of databases. This lets you develop your code entirely in-browser in your daily development workflow, and then move your database to a PostgreSQL/MySQL back-end once you're ready. Storex was designed to be modular enough to easily adapt to an mBaaS like Firebase, support for which is coming soon. Right now it's being used in WorldBrain's Memex (worldbrain.io) to provide a client-side, full-text searchable database of every page you've seen online and your annotations on them (up to 5GB of data for some of our users). We'd very much appreciate your thoughts, and don't hesitate to get involved! More info: https://medium.com/worldbrain/storex-a-modular-and-portable-...
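To make the "same code, swappable database" idea concrete, here's a minimal sketch. The names (`StorageBackend`, `InMemoryBackend`, `createObject`, `findObjects`) are hypothetical stand-ins, not Storex's actual API; the point is only that application code talks to one interface while the backend behind it (IndexedDB in the browser, PostgreSQL/MySQL on the server) can be swapped out.

```typescript
// Hypothetical sketch, not the real Storex API: application code depends
// only on this interface, so the concrete backend is swappable.
type Obj = Record<string, any>

interface StorageBackend {
  createObject(collection: string, object: Obj): void
  findObjects(collection: string, query: Obj): Obj[]
}

// In-memory stand-in for a real backend (IndexedDB, SQL, ...).
class InMemoryBackend implements StorageBackend {
  private data: Record<string, Obj[]> = {}

  createObject(collection: string, object: Obj): void {
    const list = this.data[collection] || (this.data[collection] = [])
    list.push(object)
  }

  findObjects(collection: string, query: Obj): Obj[] {
    // Return objects whose fields all match the query's fields.
    return (this.data[collection] || []).filter(object =>
      Object.entries(query).every(([key, value]) => object[key] === value)
    )
  }
}

// The calling code never mentions which backend it's running against.
const backend: StorageBackend = new InMemoryBackend()
backend.createObject('pages', { url: 'https://example.com', visits: 3 })
const pages = backend.findObjects('pages', { url: 'https://example.com' })
```

In development you'd plug in a browser-side backend; once ready, you'd swap in a server-side SQL backend without touching the calling code.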

I think this can make it easier for hackers to understand your data model right from the beginning. Even with Firebase I don't manipulate data in the client, because anyone can just open the F12 dev tools and start changing the data as they wish. It's a good idea, but it's a hacking world, and we need to make sure that data manipulation stays in the backend only, with CORS, SSL, and protection of requests against injections, whether SQL or anything else (it can even be a header).

I understand where you're coming from; I used to think "why not" too. Well, security is still the barrier, and even back in the old days of Java EE, the beans stayed safeguarded in the backend, away from the web project.

If that's your concern, part of the plan for later is to make it easier to generate a REST/GraphQL API that you can easily consume on the client side, without having any schema data in the front-end. Say you have a TodoStorage class with high-level methods like markAsDone(); you could have that either write directly to IndexedDB in your dev workflow, or move that class to the backend and have the front-end version of TodoStorage talk to a GraphQL API instead. The point is to have a choice, and to allow the user of the library to understand the risks of the technologies involved and mix them according to their needs with minimal effort.
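The TodoStorage idea above can be sketched roughly like this. Everything here is illustrative (the interface, the class names, and the GraphQL mutation name `markTodoAsDone` are all made up for the example); the real networking and schema handling would of course be more involved.

```typescript
// Hypothetical sketch of the swap described above; names are illustrative.
interface TodoStorage {
  markAsDone(id: number): void
}

// Dev workflow: write straight to local storage (IndexedDB in a real
// browser; a plain Map here to keep the sketch self-contained).
class LocalTodoStorage implements TodoStorage {
  todos = new Map<number, { done: boolean }>()

  markAsDone(id: number): void {
    const todo = this.todos.get(id)
    if (todo) todo.done = true
  }
}

// Production: the front-end version only sends a GraphQL mutation; the
// schema and storage logic live on the server. We record the query here
// instead of making a network call.
class GraphQLTodoStorage implements TodoStorage {
  sentQueries: string[] = []

  markAsDone(id: number): void {
    this.sentQueries.push(`mutation { markTodoAsDone(id: ${id}) { id } }`)
  }
}

const local = new LocalTodoStorage()
local.todos.set(1, { done: false })
local.markAsDone(1)

const remote = new GraphQLTodoStorage()
remote.markAsDone(1)
```

Because both classes satisfy the same interface, the rest of the app doesn't change when you move from the local version to the API-backed one.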

I was starting to work on my own project just this week to deal with the exact same issues and use cases. Excited to check this out and see how it evolves!

This could absolutely be my own misunderstanding, but you mention that it's explicitly not an ORM, yet reading through the docs, it seems like the collections and querying aspects of this project are just that. It actually seems like most of this project's scope is to be used as an ORM (which I personally am fine with). Can you elaborate on that at all?

Hey, Vincent (author of article) here! Indeed it's a fine line, but ORMs give you objects with methods that allow you to modify or further query the database, like User.save() or User.objects.find(). This is for me the distinction, and in Storex you get data objects back, but all the manipulation happens through one object, the StorageManager.
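The distinction can be shown in a few lines. This is a generic sketch, not Storex's real API (the `StorageManagerSketch` class and its `updateObject` signature are invented for illustration): in the active-record style the returned object carries persistence methods, while in the data-mapper style you get plain data objects and route every change through one central manager.

```typescript
// Active record style: the object itself knows how to persist changes.
class ActiveRecordUser {
  constructor(public id: number, public name: string) {}

  save(): string {
    // Returns the statement it would run, to keep the sketch self-contained.
    return `UPDATE users SET name = '${this.name}' WHERE id = ${this.id}`
  }
}

// Data mapper style: plain data objects, one central manager does all
// manipulation (in Storex, the StorageManager plays this role).
interface User { id: number; name: string }

class StorageManagerSketch {
  updateObject(collection: string, id: number, updates: Record<string, string>): string {
    const sets = Object.entries(updates)
      .map(([field, value]) => `${field} = '${value}'`)
      .join(', ')
    return `UPDATE ${collection} SET ${sets} WHERE id = ${id}`
  }
}

const activeUser = new ActiveRecordUser(1, 'Vince')
const activeSql = activeUser.save()

const manager = new StorageManagerSketch()
const user: User = { id: 1, name: 'Vincent' } // plain data, no methods
const mappedSql = manager.updateObject('users', user.id, { name: 'Vince' })
```

Both produce the same update; the difference is purely where the persistence logic lives.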

So it's a data mapper instead of active record, but still an ORM.

Ah, thanks for pointing that out. Will read up on my terminology ;)

This is great. I definitely feel like the front-end could use a more ORM-like, feature-rich data layer. I generally use an ORM on the backend. When I get the data to the front-end, it's still relational in nature, and I still want similar tools for working with it (e.g. schema definition, validations, and an API that lets me query and manipulate related data). It seems like Ember Data is the closest thing. I wish it could more easily be used outside the context of Ember.

You're free to implement an ORM on top if you feel that style of access suits you more. My mail is in the article, will endorse your package and help you implement it in a clean way if you want :)

Right... sorry. I didn't make my point very well. What I was trying to say was that this is sort of an ORM, which makes sense to me. I see a need for that on the front-end. I see your other comments about how this isn't exactly an ORM, which I see as a semantic discussion. It's ORM enough for my purposes. I bring up Ember Data because it's an example of a battle-tested and feature-rich front-end ORM. I'm thinking it would be more popular if it weren't tied to Ember so closely.

Thank you for your feedback, I got a bit confused there :) Indeed I think it's nice to be able to mix and match pieces of existing stuff, which I think more software should do. The multi-device sync I'm designing right now will also not be coupled tightly to Storex, so you can use it easily inside existing applications.

How is this not an ORM? Just because it's technically separate modules which are being combined to form an ORM? How does this meaningfully differ from Sequelize, Waterline, Objection, or Loopback's ORM?

Sequelize is only for SQL databases. This is also for the front end and in the future also for Firebase and others like it. And there's more in the article about adjacent functionality planned, which is outside of the scope of an ORM. And yes, got my terminology a bit wrong: meant to say it's not ActiveRecord.

Cool! We've developed something very similar in the last few months and are thinking of releasing it as open source too. I'll keep an eye on your library! :-)

Very nice, curious to see how we could learn from each other :)


I like this strategy, are there similar efforts?

There is one DAL that doesn't take relations into account, nor has a plug-in based approach, but I don't remember its name. It also didn't have the goal of being a collection of packages that solve problems 'around' your data, like migrations and moving it back and forth between client and server.

This is an ORM.

See comments below.

I have mixed feelings about this.

Sharing database abstractions between the client and the server might make sense for some applications, but I worry it might often be a code smell. I used to believe in the dream of sharing models between the client and server, but that illusion was shattered on first contact with reality. With that said, I'm still open to having my mind changed. I think Meteor [0] set out with a similar, if not more ambitious, goal, and it doesn't appear to have done very well.

The limitations are pretty severe, since you're basically targeting the lowest common denominator. You cannot reduce the inherent complexity of the problem which these databases set out to solve without sacrificing functionality. My thoughts on this have been all over the place, but lately I've been leaning towards adopting fewer abstraction layers. At some point you'll be forced to actually understand each underlying system anyway, and by designing your data models with a specific database in mind you can leverage all of its features to maximum effect in order to achieve optimal performance.

It might be useful to review Django's ORM design for ideas and inspiration, as they also implement the data mapper pattern. You should consider copying Django's migration commands [1] as well.

The following refers mostly to server databases, but some of it can apply to client databases as well. You probably shouldn't be running migrations automatically, especially in production. We're all human and mistakes happen, but there are steps you can take to reduce their potential impact, if not eliminate them entirely within certain processes. Running migrations outside of peak hours reduces the chance you'll bump into performance problems and that they'll affect users. Always create a snapshot or backup before running migrations; data loss is never acceptable! When a user trusts you with their data, it's your responsibility to handle it with immense care.

This tool doesn't help with the more challenging problems related to schema management. Zero-downtime migrations are the industry standard. There's a large set of common migration operations which require a full table lock unless they're written with deliberate care to avoid those pitfalls. Server-side migrations must always be done in lock-step, with fully backwards-compatible code capable of handling both versions during a transition. Client-side applications are usually expected to remain compatible with both past and future schema changes.

[0] https://www.meteor.com/

[1] https://docs.djangoproject.com/en/2.1/topics/migrations/

You're absolutely right, and I'm coming from the same place. All the thought that has gone into this is a bit hard to condense into one article, but I've written this stuff with the same things in mind. (I've also been a Django and South user for years.)

The first idea is to implement the common denominator, and then be able to access or translate things into lower-level operations. Even though there may be very complex queries happening, a lot of queries are also quite simple, especially when you're starting out. The goal therefore is to minimize, not completely eliminate, database-specific code. And even though best practices for data modeling may differ between databases, in the end you want to express and work with relationships between data, whether in SQL, MongoDB, Firebase or Neo4j. If you express your intentions correctly, your database layer should provide you with options to translate them to each database's best practices (like the decision to embed objects in MongoDB, or to do manual cross-relationship queries, for example).
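The "express the intent once, translate per database" idea could look roughly like this. This is purely a sketch of the concept, not Storex code; the `Relationship` shape and the two planner functions are invented for the example, using Memex-like collection names.

```typescript
// Hypothetical sketch: one declarative relationship description...
interface Relationship {
  parent: string  // e.g. a page
  child: string   // e.g. an annotation on that page
  field: string   // where the children hang off the parent
}

// ...translated differently depending on the target database's idioms.
function mongoPlan(rel: Relationship): string {
  // Document store: embed the child objects inside the parent document.
  return `embed '${rel.child}' documents in '${rel.parent}.${rel.field}'`
}

function sqlPlan(rel: Relationship): string {
  // Relational store: foreign key on the child, then query with a join.
  return `add column '${rel.parent}_id' to '${rel.child}' and join on it`
}

const rel: Relationship = { parent: 'page', child: 'annotation', field: 'annotations' }
const mongo = mongoPlan(rel)
const sql = sqlPlan(rel)
```

The application only states that pages have annotations; each backend adapter decides whether that becomes embedding, a join, or a manual cross-collection query.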

About schema management: this is the first tool I've seen that actually begins to support you with the more complex issues of migrations (although it's definitely not finished). Most frameworks come with a package deal dictating how you should run your migrations, meaning from their codebase. Storex instead generates an intermediate structure describing what it's going to do during a migration, which you can then decide what to do with (generate an SQL script, run it directly from code, etc.).

This structure is divided into three phases for now: preparation, data migration and finalizing (which is something I've never seen before). Running the prepare phase adds new columns and tables, so the application servers can still access the database. The data migration stage defines what needs to be done, both forward and backward, which you can also execute in multiple ways (e.g. set up a process that migrates data while serving both old and new requests). Then, after upgrading your application servers to new code, you can execute the finalize stage to drop old columns, etc. (All of this should happen with monitoring tools and good operational procedures in place, so I'd recommend not doing anything fully automatically.)
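The three-phase structure described above could be sketched like this. To be clear, the `MigrationPlan` shape, the operation names, and the `prepareToSql` consumer are all hypothetical; the real storex-schema-migrations data structures will differ. The sketch shows the key property: the plan is inert data, so you choose how to execute each phase.

```typescript
// Hypothetical sketch of a three-phase migration plan, expressed as plain
// data that different consumers can interpret (run directly, emit SQL, ...).
type Operation = { type: string; [key: string]: string }

interface MigrationPlan {
  prepare: Operation[]        // additive only: old app servers keep working
  dataMigration: Operation[]  // forward and backward data transforms
  finalize: Operation[]       // destructive cleanup, run after code upgrade
}

// Example: rename user.name to user.displayName without downtime.
const plan: MigrationPlan = {
  prepare: [
    { type: 'addColumn', collection: 'user', field: 'displayName' },
  ],
  dataMigration: [
    { type: 'copyField', from: 'name', to: 'displayName', direction: 'forward' },
    { type: 'copyField', from: 'displayName', to: 'name', direction: 'backward' },
  ],
  finalize: [
    { type: 'removeColumn', collection: 'user', field: 'name' },
  ],
}

// One possible consumer: render the prepare phase to SQL statements.
function prepareToSql(plan: MigrationPlan): string[] {
  return plan.prepare
    .filter(op => op.type === 'addColumn')
    .map(op => `ALTER TABLE ${op.collection} ADD COLUMN ${op.field} TEXT;`)
}

const prepareSql = prepareToSql(plan)
```

Because the backward transform is part of the plan, a rollback during the transition window is just the data-migration phase run in the other direction; the destructive `finalize` step only happens once all servers run the new code.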

Feel free to contact me or open an issue here [1] if you can think of a better way to structure migrations.

You still need to do a lot of the work yourself, of course, but at least Storex doesn't fight you; instead it gives you the flexibility to implement your operations exactly the way you want.

Also, for the client-server interactions, additional packages will be written to help with this. You're more than welcome to contribute your experience in issues, or by contacting me directly, to design this in a way that accounts for the real-world challenges involved.

What you're saying in the end is also what I'm saying: creating a real-world application is more than just one mega-abstraction; it's a collection of problems for which we need tools that we can flexibly recombine to our highly individual needs.

[1] https://github.com/WorldBrain/storex-schema-migrations
