miked98's comments

miked98 · 2025-09-05T18:30:49 1757097049

In our experience, the primary driver of Snowflake costs is not the compute for aggregation, but the compute required for lots of reads/scans.

We recently built a Snowflake-to-ClickHouse pipeline for a customer where aggregates are built hourly in Snowflake, then pushed into a ClickHouse table to power their user-facing dashboards.

By offloading dashboard queries to ClickHouse, they slashed their Snowflake bill by ~25%, which was worth millions to them.

(Admittedly, running aggregations elsewhere—for example, in Spark—could further reduce costs, but you would then need Iceberg to make the tables queryable in Snowflake.)

nrjames · 2025-09-05T19:10:40 1757099440

I'm in an enterprise environment where a central IT platform team controls what size warehouses we can have in Snowflake. They are not receptive to arguments for larger warehouses, unfortunately. Our issue becomes long-running queries b/c Snowflake spills the data to disk during the joins. TBH, I could join the data more quickly on my laptop than in the warehouse I'm allowed to use. Anyhow, I do have an old build server that is beefy & has 512 GB of RAM, so I can set up my aggregation and/or OLAP services there, since it's an unencumbered playground.

miked98 · 2025-03-12T22:28:04 1741818484

Rill founder here. I have no comment on the UI similarity :) but I would emphasize that our vision is building DuckDB-powered metrics layers and exploratory dashboards -- which we presented at DuckCon #6 last month, PDF below [1] -- rather than notebook-style UIs like Hex and Jupyter.

Rill is fully open-source under the Apache license. [2]

[1] https://blobs.duckdb.org/events/duckcon6/mike-driscoll-rill-...

[2] https://github.com/rilldata/rill

archon810 · 2025-03-13T03:22:21 1741836141

I love HN. Random comments about some service out there and replies are like "I am the founder" or "I wrote that".

westurner · 2025-03-12T23:58:25 1741823905

WhatTheDuck does SQL with duckdb-wasm

Pygwalker does open-source descriptive statistics and charts from pandas dataframes: https://github.com/Kanaries/pygwalker

ydata-profiling does open-source Exploratory Data Analysis (EDA) with Pandas and Spark DataFrames and integrates with various apps: https://github.com/ydataai/ydata-profiling #integrations, #use-cases

westurner · 2025-03-13T04:49:15 1741841355

xeus-sqlite is a xeus kernel for jupyter and jupyterlite which has Vega visualizations for sql queries: https://github.com/jupyter-xeus/xeus-sqlite

jupyterlite-xeus installs packages specified in an environment.yml from emscripten-forge: https://jupyterlite-xeus.readthedocs.io/en/latest/environmen...

emscripten-forge has xeus-sqlite and pandas and numpy and so on; but not yet duckdb-wasm: https://repo.mamba.pm/emscripten-forge

westurner · 2025-03-13T05:42:13 1741844533

duckdb-wasm "Feature Request: emscripten-forge package" https://github.com/duckdb/duckdb-wasm/discussions/1978

loa_observer · 2025-03-17T11:07:22 1742209642

pygwalker also has a kernel computation mode, which lets you use duckdb to handle all queries from the pygwalker UI.

wodenokoto · 2025-03-13T04:56:18 1741841778

Is there a video of your talk?

miked98 · 2025-03-13T05:36:05 1741844165

Yes, thanks to the DuckCon team it’s here:

https://youtu.be/_IqvrFWY7ZM?si=1ux9SGUsh4kDs-ff

It sits alongside several great talks, including Rusty Conover presenting Airport (Arrow + DuckDB) and Christophe Blefari (Bl3f) introducing a new, lightweight orchestrator called yato.

wodenokoto · 2025-03-13T07:56:04 1741852564

Thank you for the additional recommendations!

miked98 · 2025-03-12T22:17:03 1741817823

https://news.ycombinator.com/item?id=43347834

miked98 · on May 3, 2021

Here were the initial responses from HackerNews in 2011 :)

“It’s always tempting to build it yourself.”

“They should have just used QlikView.”

“HANA has been doing this for at least 5 years now.”

via https://news.ycombinator.com/item?id=2501160

miked98 · on May 19, 2012

By DDG you mean the "Data Drinking Group", right? I've heard those guys are pretty active on HN.

gettygermany · on May 19, 2012

Cheers! <holdingupdatavodkawithredbull>

miked98 · on July 10, 2011

D3.js powers all of the visualizations at Metamarkets, thanks to Vadim Ogievetsky, who got an early look at the D3 code base. It's a powerful framework that extends beyond visualizations: it can be used to attach data to any part of the DOM, not just SVG elements.

Here's one example that weaves together both data and graphics: http://labs.metamx.com/.

miked98 · on Aug 3, 2009

Practical, extensive, and timely piece on the nuts and bolts of weaving Hadoop and EC2.