Craig here from Crunchy. Pretty excited to ship this. It started with one of our engineers showing up in slack 6 weeks ago "So I did something crazy over the weekend..." from there it evolved into much more.
The post explains a lot of the high level, but we're going to being doing some deeper dives as well including the build process, but also some of how the tutorials are powered by an internal notion doc which allows us to easily iterate and collaborate on the tutorials themselves.
Maybe I missed it, but is the licensing info available anywhere? The playground seems to be using v86.js I don't see the required "Redistributions in binary form must reproduce the above copyright notice" -- https://github.com/copy/v86/blob/master/LICENSE
Any chance you can give us access to the PSQL sandbox without it being attached to a tutorial. Would both be a great way to learn and also function as a productivity tool if one could fool around with PSQL "scratch pad"
Maybe give it another shot or two, it's definitely a slow process to import. We have some improvements coming there in time but it's ideal at moment for smaller schemas and datasets.
Would love to see this! I could see a wasm version of postgres being useful for all sorts of things, especially an easy dev instance of pg runing inside a node app.
This is great, I can't wait for someone to get python in the browser talking to postgres in the browser (and hopefully very soon afterward Django in the browser).
This is great. Have you done any benchmarking? I am currently looking to find/create good Wasm benchmarks, and it seems like this might be a good workload.
@Craig, I wonder if you've thought of making this available (would it work??) for test case construction. Today (working in Go) I start a postgres in a docker container for testing database code. Could I instead use a go wasm runtime, and start postgres inside of it? That would potentially free me from the docker dependency in these tests.
The communication in and out is a bit of extra work at the moment. We've got some thoughts on improving this in the future, but it's a bit unclear on timeline and roadmap. We're excited to hear what people want from it and how they use it and that'll inform a lot of how we move forward.
This is so awesome for two reasons, Postgress isn't exactly easy to setup, this makes it so much easer for tutorials, anything that makes teaching it quicker is good!
It's also an amazing demonstration of the power of WASM, I am utterly convinced that DBs running in browser via WASM on top of the "coming soon" filesystem/block store api [0] will be the future of all offline first apps. I think it's going to be SQLite that really sines here (it will be a smaller package), but if you can run Postgres in browser, you potentially have a close alignment between your server and browser implementations.
I love the idea of running Django in browser via PyScript with Postgres!
0: https://developer.mozilla.org/en-US/docs/Web/API/FileSystem - This API doesn't grant access to the users file system, but a local virtual one just for the website. It will operate at the block level so can be used to provide efficient persistent storage for WASM DBs.
—
Edit rather than replying individually:
It’s so important when teaching people a new thing, like sql, to hit the ground running. A lot of these people will have no experience with Docker, package managers or compiling from source. It’s so important to make technology accessible to all. This does that! Imagine high school kids having a lesson on SQL and running Postgres’s just by opening a webpage!
We shouldn’t assume that someone learning Postgres, or just SQL via it, is a developer with experience of other areas of development or system administration.
[I retract this comment entirely—see downthread where rewq4321 corrects me.]
> on top of the "coming soon" filesystem/block store api
I’m completely unimpressed by Origin Private File Systems, which I believe is what you’re talking about: it’s just a key-value store, just made to look a little like a file system, but is probably exactly equivalent to IndexedDB in capability and usefulness, quite easily perfectly polyfillable atop it. It is certainly completely unsuitable for building a database on top of as far as ACID transactions or such are concerned—you’ll get appalling write performance because you will have to close the file to commit each write.
I wrote more about this in the thread about Safari having implemented OPFS five months ago: https://news.ycombinator.com/item?id=30394737. (My use of “probably” above is explained in there too.)
Amusingly you where replying to me there, hello again!
I’m sure there are rough edges right now, but I’m complete convinced that even if the api isn’t there yet, this use case will win out and we will see it happen.
From memory the teams working on WASM SQLite are working with the File System API working group to ensure their use case is supported.
[I retract this comment entirely, apart from the pleasantries and the preference of being wrong—see downthread where rewq4321 corrects me.]
Heh, didn’t notice the username match there!
I don’t think “rough edges” is the right characterisation. What OPFS provides is just nothing in the direction required. The kind of file system you need to build a database on is a fundamentally completely different beast, with only unimportant surface-level similarities. It would generally require a complete replacement of the backend, with quite possibly literally no code in common.
I’d like something like this, because it’s certainly genuinely useful for cases like this, but I’d honestly be surprised if it ever happens, because it’s just… not webby. To be useful, it just about requires that the whole thing be backed by an actual file system and exposing that, which is something that has been assiduously avoided so far in the design, probably in significant part because it discloses quite a lot about the host system (fingerprinting; disk performance characteristics, quite possibly even file system identification by various nuances in behaviour; and surprisingly large side-channel attack possibilities, mildly similar to the fuss over high-resolution timers), but also because it tends to be a security hazard, just another moving part where things can go wrong more easily than you imagine.
I strongly suspect it will end up a bit like Web SQL: a nice idea that pretty much everyone agrees is a nice idea, but which is also a non-starter for other reasons.
But I wouldn’t mind being wrong. I do want to be able to deploy a robust, high-performing SQLite in the browser.
Are you sure you're up to date on the latest additions to this API? This wasn't in the MVP:
> The origin private file system provides optional access to a special kind of file that is highly optimized for performance, for example, by offering in-place and exclusive write access to a file's content.
Ah yes, can't wait to run my offline-first app on a kubernetes-on-docker-on-linux-on-qemu-on-wasm stack in the browser to have closer alignment between my server and browser implementations.
> Ah yes, can't wait to run my offline-first app on a kubernetes-on-docker-on-linux-on-qemu-on-wasm stack in the browser to have closer alignment between my server and browser implementations.
You can run vscode on docker on kubernetes on the browser with gitpod!
Speaking of making Postgres easy to setup and use, here are instructions for setting up per-project Postgres ("So I worked out how to set up a Postgres database that’s almost as convenient as SQLite."): https://jamey.thesharps.us/2019/05/29/per-project-postgres/
I think most people who use postgres in their personal projects have a snippet like one that they copy from project to project, I know I do. Curious to see others' scripts!
Mine logs in as `postgres`, creates the database and extensions I normally use, then creates a user and grants the appropriate privileges to that application user for the relevant db/schemas. Incidentally, it's the same script I use for "prod" deployments on my homelab.
> This is so awesome for two reasons, Postgress isn't exactly easy to setup
for tutorial level usage, "apt-get install postgresql-X" or installing Postgres.app has always worked perfectly. what sort of troubles do you run into?
I run into problems with the next steps: now you have to create a database, create a user, grant that user access to the database, setup credentials and start using them.
This isn't impossible - I can do it if I consult my notes - but it has enough steps where something might go wrong that it's a pretty high friction process for newcomers.
You can do all this pretty easily using `pgAdmin` without having to remember any commands. If you do this infrequently, then great use the GUI. If you do it frequently then learn the commands I guess :P `pgAdmin` can help with that too though as it usually shows you the queries it is running.
> I run into problems with the next steps: now you have to create a database, create a user, grant that user access to the database, setup credentials and start using them.
That's like complaining you need to break eggs when you want to make an omelette.
> That's like complaining you need to break eggs when you want to make an omelette.
I have a friend who we discovered didn't like cooking eggs because picking the shells out of the bowl was tedious. Turns out he was never taught how to crack an egg so he would just throw it in the bowl and have bits of shell everywhere. Point is that breaking eggs may not be so easy for everyone. (Of course, after describing to him how eggs are properly cracked, he was excited to try it out.)
Being necessary and being exhausting aren't mutually exclusive qualities.
But - why are the steps needed to create a database and setup credentials such a complex, manual process anyway? Why can't all those steps just be automated by a helper script that ships with postgres?
I reach for sqlite whenever I introduce SQL to people because its so much easier to get started with sqlite.
The reason that I reach for MySQL when introducing people to SQL is the formatted output, with nice tables and column names. Is there a simple way to get similar output with SQLite?
Honestly not my experience, if you have postgresql running locally on your laptop, I find upgrading versions is a pain, even with the automated scripts provided and I often run in socket issues: psql is already connected, not able to connect to this or that socket,...
So yeah, installing is easy and on a clean install it runs fine but it's not unusual to get pretty annoying issues over time in my experience.
They aren't but they probably won't appear user-side, the companies will most likely have to deal with their servers, fix those issues and the users can enjoy not thinking about that when they're trying PostgreSQL online.
1000%. Lots of people want to do data analysis, learn SQL, all that kind of stuff, without having to learn a load of other unrelated development crap. Think of all the people out there using Excel who want to take the next step.
After browsing this site for a couple of years, I made an account to reply to your comment and say, "This is exactly right" lol. I'm a teacher who uses Excel for basic stuff (power query, pivot tables), but I'm also using Excel as a "database", which is obviously not a good idea.
So here I am trying to take the next step and learn SQL along with good database design, but learning these things through Postgres is really not appropriate for someone like me. I think I have to swallow my pride and start with Access or something.
No shame in using Access! It's actually part of how I got my start in tech. Started with an Access database, started writing macros for it in VBA, then started writing VB.NET, then convinced a higher up to give me access to a SQL Server database.
I was in an industry with an absurd number of boilerplate forms that needed to be printed and the ability to create Access forms that automated all the manual filling out coworkers were doing felt like magic. Postgres doesn't really have an equivalent of that.
Yes. See this survey on Twitter from the other day - LOADS of "React" developers (aka front-end developers) are deterred from learning more about backend programming due to friction like this. https://twitter.com/rachelnabors/status/1558888478955421697
I know this much about frontend development too but I was sure that you still need a package manager to install your npm\yarn and other tools.
PS: Obviously you can live without one. Regardless of being backend\frontend developer.
PPS: and honestly, how can you be scared of postgresql and co. after webpack? If you can actually understand this crap postgres setup should be to easy for you.
Yeah I personally find Webpack a whole lot scarier than PostgreSQL / package managers, but that's down to my previous career experience.
Really the key thing here is that learning new things is hard, and anything that can be done to remove potential roadblocks is worthwhile. I've talked to so many people who were put off learning Python because they couldn't get to a working development environment on their own.
> If you can actually understand this crap postgres setup should be to easy for you.
You can figure almost anything out if you spend enough time reading documentation and noodling with it. I'd still rather spend that time solving my actual problem.
Just to be clear: I was not talking about the project at crunchydata.com. I merely say that in my opinion modern js development routine is much more complicated that postgres setup
Do you have a link pointing out the filesystem API actually being developed further? AFAIK that whole API is kind of stable and done but I might be wrong.
Since I use conda for everything anyway, I just end up doing a `conda install -p ./test postgres` when i want to install a throwaway postgres and then delete the environment later.
I think WASM is very interesting and running Postgres in the browser is interesting, but for learning one could readily use something like db-fiddle, which offers Postgres.
Here's my similar project, which uses Python and SQLite compiled to WebAssembly in the browser so that it can run my Datasette server-side application entirely client-side:
I've been looking at wasm some. its totally unclear to me how this would support Postgres - which requires a posix filesystem interface, and the ability to fork children that operate on a shared memory segment. Were these facilities added to a wasp runtime, or was there major surgery on the Postgres codebase. I feel like I must be missing a simple answer.
This is a fantastic idea, truly innovating what you can do with Postgres. Kudos!
Postgres, possibly surprising to many, is very "simple": it has essentially no dependencies other than a few OS system calls (open, read, write files) and some optional dependencies (e.g. libssl). Therefore, it is very portable and "easy" to compile on many environments. This includes new environments or ideas like compiling it to WASM.
But you need to come up with the idea. This is a great one and opens the door to other use cases. I hope this serves to push the mindset that Postgres can also be used in lighter-weight environments where SQLite (another fantastic database, don't get me wrong) is often considered as the only viable choice.
The only thing he missed is that javascript as a compilation target makes less sense than a byte code format designed for such a thing, which is what WASM is. WASM is more similar to JVM bytecode than it is to Javascript. It seems we have a few implementations of WASM that for historical reasons share a code base with a javascript interpreter. But technically that interpreter is not really needed.
You can actually use WASM to run your own javascript interpreter and people are already doing that to not be dependent on the interpreter that comes with the browser or wasm runtime (outside the browser). If you are going to run node.js in a wasm runtime, that's what you might want to do. Likewise, if you want to offer a browser IDE for a node.js project, you might want to run node.js in a browser and this is probably what you'd be doing rather than passing through the javascript to the browser javascript interpreter, which lacks most of the node.js API. Just easier that way.
Likewise if you want to run some old internet explorer 10 javascript, packaging up on old version of that as wasm might allow you to do that. The hard part of course would be getting your hands on the source code. But MS might help us out here or somebody might implement something compatible. Very much like is being done with flash.
There are so many things wrong with running an enterprise class database in a browser that I don't even know where to begin!
BUT IT'S FREAKING AWESOME!
Seriously, do you ever imagine Oracle or DB2 running in a browser? Crazy, right?
Congrats to the team. To me this is one of the great things about the times we are living in - tons of computing horsepower for cheap, open source software, new-ish technology (WASM), and one crazy idea.
Well, at least I know where all my spare time is going to be spent...
FYI, if you're on MacOS, someone has packaged Postgres into a standard "just works" self contained MacOS app. With a GUI & system tray menu to control it. It's so good that I use it instead of a Docker image for PG. All the psql and pg_restore commands are contained in the .app package and can be called from the terminal.
In the spirit of hacking, I hope it doesn't. Running A on B when B wasn't intended to run on A definitely piques my interest. I'm continuously impressed by such things, like Doom running on calculators.
I get that, but these are runtime platforms designed specifically to get general-purpose apps running in the browser. That is their sole reason for existence.
Yes, WASM is awesome like that. But we've been messing with stuff like that since enscripten, if not earlier.
Besides, postgres already runs on pretty much everything.
Could I brew install Postgres to play with some SQL and clean up after myself when I'm done? Yes. I could also just load this, perhaps in multiple tabs, and not think about it.
There are still a lot of things that can't be done in browsers but should be doable, so I'm ok with this. Like imagine that instead of having to make a native iPhone app for every little thing, which ironically is probably JS under React Native anyway, Apple supported most of those native features via the browser. Of course Apple has much more to lose than to gain from that. They deliberately don't even support push notifications in PWAs.
Would be nice to get Alpine or Debian compiled to WASM, that way it would just be another architecture to be run through Docker and we could get a lot of easy-to-run Docker images right in the browser. Is there any project working on this?
The problem is that it would be an actual new architecture (just as much as ARM or MIPS or whatever), which means you wouldn't have compatibility with any existing docker images.
I don't think thats really Postgres running in the browser, it's more a "compatible" reimplementation of Postgres in TypeScript. Obviously impressive in its own right!
Great achievement! Now please PostGIS in the browser would make this incredible. Turfjs is a bit limited for Ionic and electron apps, imagine PostGIS in the browser.
When I see 'in the browser' I read 'in an easy build and ship app'.
Can this become an embedded database? I noticed that the load time is quite high. That's prob due to the size of the binary. I think the size can be dramatically reduced if you remove stuff that is not practical in the browser.
The tutorials and interactive learning at SQL Tutorial are pretty good. Not Postgres (well, don't know anything about their implementation).
I've been having my kids go through it. They are learning a lot. I would recommend this site to anyone who needs to start with SQL from the ground-up and get a reasonable understanding of the language with practical hands-on usage.
Interesting news, I hope to see the WASM based postgres for dev machine integration tests definitely would make things easier then the current setup through Docker based containers.
Love this! I think it's time for the database to move closer to the browser.
We spend some time getting the node-postgres library working with websockets so we could go browser->websockify->postgres: https://github.com/bitdotioinc/node-postgres - this lets us use a full-featured postgres client (with, eg, cursor support) in the browser.
This is really cool, one nitpick on the basics tutorial though. It could be confusing for beginners to write `select distinct(event_type)` because it makes it look like `distinct` is a function, and makes it looks like distinct only applies to certain columns. I've seen postgres beginners confused by syntax like this writing queries like `select distinct(event_type), magnitude` and expecting the query to only do a distinct on `event_type`.
What we need is Kubernetes running in the browser to orchestrate
web (WASM) applications and infrastructure it takes to render
web 4.0 sites and all the different blockchains that it will require.
I seem to remember that Postgres uses SysV shared memory between the master and connection processes - how does this end up working with WebAssembly, is there a build option to use threads for shared memory instead or...?
> Due to browser sandboxing there is no way to connect directly to the Postgres instance beyond the embedded psql interface that we establish for you. The current configuration allocates 512MB of memory for your Postgres instance, we may make this more configurable in the future. It’s in the browser, hence if you refresh you’re going to get a fresh instance, we haven’t created any persistence layers (yet).
WASI is just a set of standardized interfaces that any WASM host can implement. So while there’s no WASI out of the box on browsers, there’s nothing preventing you as a developer from bringing your own implementation (in which, for example, writes to stdout could sent to console.log, and files could be backed by an in memory VFS).
Will this work with manifest v3? (Does it need a worker thread like absurd-sql and thus doesn't work with mv3?) If it does work, is there a repo or writeup on how to DIY this?
I'm creating tutorials where I try to let people learn as much as possible without the need to install anything and software in that spirit always makes me happy.
Both? So for the tutorials they use a SQL initialization file stored in a GIST (our solution, but can be stored at any public storage url) and it loads that data and any other SQL setup commands in that file on startup to the database that exists in your local browser session only.
MySQL was the M in the LAMP stack (I’m this old, crap) but it had issues with downright brain dead behaviors and defaults which were only fixed multiple major versions later. In the meantime if you wanted to store what you asked to be stored without silent truncation etc you chose Postgres.
From the user's perspective, it's the little things. The feature set is overall bigger and better than comparable RDBMSs. Not having to deal with Oracle or Microsoft is also nice. But there's not a huge difference in usage vs MySQL.
Those familiar with the internals might say a lot more about how it's faster or something. IDK, wouldn't surprise me if it were.
PostgreSQL's fame is kind of a paradox: it's so good because everyone uses PostgreSQL because it's so good. With the variety and scale of applications PostgreSQL is used for, it has organically evolved to be a swiss army knife of a database.
Basically PostgreSQL is to databases as Linux is to an OS.
Wasn't always this way...even not that long ago. Back when Heroku was getting serious about building up the Postgres service (I helped some...this is circa 2010) the common question was "that's weird, why not MySQL, isn't that how most Rails applications are run?" RDS only had a MySQL product for a very long time, and that followed from market demand. Tutorials for Rails were generally written with MySQL.
Hot standby (streaming replication) was a key feature to arrive around then. Also, I'd like to think a lot of the work Heroku did to engineer Heroku Postgres, and market it, contributed much better Postgres support in web frameworks and their affiliated ORMs in those critical years from 2010-2014, where encountering headwinds in defects in Postgres support for Rails, Node, etc was common.
When RDS came around with their Postgres offering, I'd say at that point, it could be said that Postgres entered a new stage: it was no longer reflexive for engineers to shrug their shoulders at shaky Postgres support in drivers/ORMs/etc like they did before that.
The post explains a lot of the high level, but we're going to being doing some deeper dives as well including the build process, but also some of how the tutorials are powered by an internal notion doc which allows us to easily iterate and collaborate on the tutorials themselves.
Perhaps our favorite easter egg is that you can bring your own SQL into it for example: https://www.crunchydata.com/developers/playground?sql=https:...