carderne · 2024-04-16T14:31:47

Hey OP (assuming you're the author), you might be interested in this similar experiment I did about four years ago, same dataset, same target, similar goal!

https://rdrn.me/optimising-sql/

Similar sequence of investigations, but using regular Postgres rather than Timescale. With my setup I got another ~3x speedup over COPY by copying binary data directly (assuming your data is already in memory).

PolarizedPoutin · 2024-04-16T14:57:17

Wish I saw this before I started haha! I left a footnote about why I didn't try binary copy (basically someone else found its performance disappointing) but it sounds like I should give it a try.

footnote: https://aliramadhan.me/2024/03/31/trillion-rows.html#fn:copy...

carderne · 2024-04-16T15:35:12

Yeah I imagine it depends where the data is coming from and what exactly it looks like (num fields, dtypes...?). What I did was source data -> Numpy Structured Array [0] -> Postgres binary [1]. Bit of a pain getting it into the required shape, but if you follow the links the code should get you going (sorry no type hints!).

[0] https://rdrn.me/optimising-sampling/#round-10-off-the-deep-e... [1] In the original blog I linked.
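A minimal sketch of the "structured array -> Postgres binary" step described above. This is not the code from the linked posts. It assumes every field is a fixed-width integer or float (matching `int2`/`int4`/`int8`/`float4`/`float8` columns), that there are no NULLs, and that no field name ends in `_len`. The function name `structured_array_to_pgcopy` is made up for illustration.

```python
import struct

import numpy as np

# Fixed 11-byte signature that opens every PostgreSQL binary COPY stream.
PGCOPY_SIGNATURE = b"PGCOPY\n\xff\r\n\x00"


def structured_array_to_pgcopy(rows: np.ndarray) -> bytes:
    """Encode a 1-D structured array as PostgreSQL binary COPY data.

    Assumption: only fixed-width int/float fields, no NULLs.
    """
    names = rows.dtype.names
    # Each tuple is: int16 field count, then (int32 byte length, payload)
    # per field, all big-endian.
    out_fields = [("n_fields", ">i2")]
    for name in names:
        base = rows.dtype[name]
        if base.kind not in "if":
            raise TypeError(f"unsupported dtype for field {name!r}: {base}")
        out_fields.append((f"{name}_len", ">i4"))
        out_fields.append((name, base.newbyteorder(">")))

    # A dtype built from a list of fields is packed (no padding) by default,
    # so the in-memory layout is exactly the on-wire tuple layout.
    out = np.empty(rows.shape[0], dtype=np.dtype(out_fields))
    out["n_fields"] = len(names)
    for name in names:
        out[f"{name}_len"] = rows.dtype[name].itemsize
        out[name] = rows[name]  # numpy converts to big-endian on assignment

    header = PGCOPY_SIGNATURE + struct.pack(">ii", 0, 0)  # flags, extension length
    trailer = struct.pack(">h", -1)  # end-of-data marker
    return header + out.tobytes() + trailer


rows = np.array([(1, 2.5), (2, -1.0)], dtype=[("id", "<i4"), ("val", "<f8")])
payload = structured_array_to_pgcopy(rows)
```

The resulting bytes can then be wrapped in `io.BytesIO` and streamed with something like psycopg2's `cursor.copy_expert("COPY my_table FROM STDIN WITH (FORMAT binary)", buf)`, where `my_table` is a placeholder whose column types must match the array's dtypes exactly.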

anentropic · 2024-04-16T16:14:29

I'd love to hear from anyone who's done the same in MySQL

PolarizedPoutin · 2024-04-17T20:20:41

Had a read through parts 1 and 2, thank you for the engaging reads! Love how you've formatted your posts with the margin notes too. Thank you for providing the function to write numpy structured arrays to Postgres binary, I couldn't figure this out before.

carderne · 2024-03-13T19:32:23

Same exact experience, a year on Fly but moved to GCP (GKE in our case for reasons) a month or two ago. Super slick when it worked, but that wasn’t often enough…

carderne · 2024-01-14T11:06:48

Would you mind explaining what is going on in the backend? You seem to be using both Rust and Bun/Typescript?

phildenhoff · 2024-01-14T17:12:22

The backend is straight Rust. Bun is used as a package management tool & as a script runner, to invoke Tauri commands that start up a Vite webserver to provide the UI, and start something that rebuilds the binary when Rust files change.

Typescript is used for the Svelte UI. Svelte talks to the Rust backend through Tauri (by default). Both support a headless/web mode where the Svelte frontend connects to the backend over HTTP.

carderne · 2024-01-14T19:47:20

Awesome, thanks for responding. Very keen to give Tauri a spin, seems like a happy middle way between native toolkits (which I'm just never going to spend the time to learn properly) and Electron (which I'm just never going to love).

carderne · on Jan 10, 2023

Here it is (haven’t used it but looks the part):

https://github.com/wfxr/minimap.vim

carderne · on Nov 1, 2022

I’m a heavy Gaia user, always offline.

I make sure to go on airplane mode and hard-reload the app, and it generally works great. If I have weak signal, or don’t reload it, or anything like that, it gets confused and constantly freezes.

The offline route creation is very iffy though.

carderne · on Aug 26, 2022

> without having to worry about shell access etc

Do you mean you _don't_ want to give the students shell access?

By default you can run shell commands from within a Jupyter notebook by prefixing them with `!`.

rovr138 · on Aug 26, 2022

They might mean without having to set up individual user accounts for the server.

https://jupyter.org/hub

and if it's just the professor's lab, just a jupyter lab instance with one password works too.

carderne · on July 31, 2022

The line you’re presumably referencing is for four hours of storage.

That’s nowhere near enough to sensibly compare with a nuclear plant.

carderne · on July 18, 2022

Starling gives you API access to your own account. I use it to do a semi-automatic ingest into beancount.
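A rough sketch of what the conversion half of such an ingest could look like. The comment doesn't describe the actual script, so everything here is an assumption: the shape of the transaction dict (`transactionTime`, `direction`, `amount.minorUnits`, `amount.currency`, `counterPartyName`), the account names, and the function name `feed_item_to_beancount` are all illustrative. It also assumes a currency with two decimal places. Entries are flagged `!` so they still get a manual review, which is the "semi-automatic" part.

```python
from datetime import date
from decimal import Decimal


def feed_item_to_beancount(
    item: dict,
    account: str = "Assets:Starling",           # hypothetical account name
    contra: str = "Expenses:Uncategorised",     # hypothetical account name
) -> str:
    """Render one transaction dict (assumed shape) as a beancount entry."""
    # Keep only the date part of the ISO timestamp and validate it.
    day = date.fromisoformat(item["transactionTime"][:10])
    # Amounts are assumed to arrive as integer minor units (e.g. pence).
    amount = Decimal(item["amount"]["minorUnits"]) / 100
    if item["direction"] == "OUT":
        amount = -amount
    currency = item["amount"]["currency"]
    payee = item.get("counterPartyName", "").replace('"', "'")
    return (
        f'{day.isoformat()} ! "{payee}" ""\n'
        f"  {account}  {amount:.2f} {currency}\n"
        f"  {contra}  {-amount:.2f} {currency}\n"
    )


example_item = {
    "transactionTime": "2022-07-18T09:30:00.000Z",
    "direction": "OUT",
    "amount": {"currency": "GBP", "minorUnits": 1234},
    "counterPartyName": "Coffee Shop",
}
entry = feed_item_to_beancount(example_item)
```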

carderne · on July 5, 2022

Surprised that no one has mentioned Placemark [0] yet. It's the other brand new collaborative online mapping thingy, but made by a single dev and more focused on data than cartography/print.

[0] https://www.placemark.io/

carderne · on June 26, 2022

Please edit your post, this is very wrong. It’s more like 50% to 40%.

https://ourworldindata.org/world-lost-one-third-forests