Show HN: Datapane – A new way to build reports, dashboards, and apps in Python
55 points by pea on March 23, 2023 | 14 comments
Hello HN! We’re Leo and Mandeep, founders of Datapane (https://datapane.com).

We're building a way to create reports, dashboards, and web apps from your existing data using Python. Think of it as a combination of React and htmx, specifically designed for the Python data stack.

Our GitHub is https://github.com/datapane/datapane and you can try building a report or app in ~2 minutes on Codespaces: https://try.datapane.com

We started building Datapane at our previous start-up, where we struggled to deliver ML model results to clients. Much to our surprise, the data science took less time than repeatedly creating reports by copying and pasting plots into PowerPoint decks.

It seemed absurd that we had to switch to PowerPoint or legacy BI tools like Tableau to share, and our initial goal was to programmatically generate reports using the datasets and plots we had in Python. To enable this, we started hacking on a Python-based UI framework for constructing HTML views from data-centric blocks – like plots, data tables, and layout components.

You can export these to standalone HTML files, or host them as a web app on somewhere like GitHub Pages or Fly.io. We recently also added the ability to connect Python functions to forms and front-end events so you can build web apps which run backend code. We handle the entire network and RPC layer, so you only need to write plain Python functions that take parameters and return other blocks.
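To make the "plain Python functions that take parameters and return other blocks" idea concrete, here is a minimal sketch using only the standard library. It is not Datapane's API: `Text`, `Form`, and `handle_submit` are hypothetical names invented for illustration, and the real network/RPC layer is reduced to a single function call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Text:
    """A hypothetical display block holding a string."""
    content: str


@dataclass
class Form:
    """A hypothetical form block bound to a plain Python function."""
    name: str
    on_submit: Callable[..., List[Text]]


def greet(name: str, times: int = 1) -> List[Text]:
    # An ordinary function: takes parameters, returns blocks.
    return [Text(f"Hello, {name}!") for _ in range(times)]


def handle_submit(forms: Dict[str, Form], form_name: str, params: dict) -> List[Text]:
    # One request in, one response out: look up the form and call its function.
    return forms[form_name].on_submit(**params)


forms = {"greet": Form(name="greet", on_submit=greet)}
blocks = handle_submit(forms, "greet", {"name": "HN", "times": 2})
print([b.content for b in blocks])
```

The point of the sketch is that `greet` knows nothing about forms or transport; everything network-shaped lives in the dispatcher.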

You can check out an example of the code to create a simple app: https://github.com/datapane/examples/blob/main/apps/iris-plo...

Datapane’s philosophy is pretty different from other products in the space.

We wanted to keep things simple, but avoid the footguns our users faced with frameworks like Streamlit, where the reactive/network-aware model was hard to move beyond an MVP or POC. For backend interactivity, we believe the original web got a lot right, and unlike reactive models which rely on websockets, Datapane is unashamedly request/response. This takes inspiration from HTTP and our own experiences with htmx, which offers an elegant way to add interactivity to HTML. Under the hood, we actually compile down to a (gasp!) XML-based hypermedia format, akin to HTML, but tailored specifically for constructing data UIs.
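As a rough illustration of what compiling a view down to an XML-based hypermedia document could look like, here is a sketch using Python's built-in `xml.etree.ElementTree`. The tag names (`View`, `Text`, `Table`, `Row`) and the `compile_view` helper are assumptions made up for this example and are not Datapane's actual format.

```python
import xml.etree.ElementTree as ET


def compile_view(title: str, rows: list) -> str:
    # Build a small XML tree describing the view; tag names are illustrative only.
    view = ET.Element("View")
    ET.SubElement(view, "Text").text = title
    table = ET.SubElement(view, "Table")
    for name, value in rows:
        ET.SubElement(table, "Row", name=name, value=str(value))
    return ET.tostring(view, encoding="unicode")


# Made-up data, purely for illustration.
doc = compile_view("Signups", [("mon", 12), ("tue", 17)])
print(doc)

# A client can parse and render the document without calling back to the server.
parsed = ET.fromstring(doc)
print([row.get("name") for row in parsed.iter("Row")])
```

Because the document is self-describing, it can be generated once and served statically; only actions that need fresh computation require a request.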

The result is that not every change in your app requires a server round trip, as much of it can be pre-rendered and most interactivity happens on the client side. In addition to improving performance, this also makes running in production 10x simpler.

This separation between the view and backend compute also makes Datapane modular. If our app server isn’t a good fit for your use case, serve Datapane views from the web framework of your choice (we’ve been hacking on serving views from Django). Want to compute blocks from inside Airflow or generate them on a schedule or from a webhook? Computation can happen out of band of the UI. You can even build and host apps from inside Jupyter, where you can preview blocks live and convert notebook cells to blocks in your view.

We currently offer a hosting platform on https://datapane.com for sharing reports publicly (free) or with your team (paid), and will be adding serverless app hosting support to it in the next few weeks.

Our ultimate goal is to create an open-source toolkit for building data products across the entire stack – from reports, to dashboards, to full-stack apps – all using 100% Python. You can see a few we’ve built already in our gallery: https://datapane.com/gallery

We’d love to hear your feedback.


I haven't looked into this in detail but I'm excited for this. A company I used to work with had a tool a lot like this and it was very useful as it allowed us to build dashboards that could pull data from anything or join/transform data through code. Very very handy.

One thing I found annoying about our internal tool is that I often found myself duplicating boilerplate code for simple things: initialize a MySQL connection, run a query, and render the result with a graphing library. I don't know what your tool does, but it would be nice if it had the ability to insert a "SQL query line graph" rather than making me write those 10 lines of Python.
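For illustration, the boilerplate in question could be wrapped in a helper along these lines. This is a hypothetical sketch, not something Datapane ships: `sql_line_series` is an invented name, an in-memory SQLite database stands in for MySQL, and the rows are made up. It returns the x and y values that a graphing library would then draw as a line.

```python
import sqlite3
from typing import List, Tuple


def sql_line_series(conn: sqlite3.Connection, query: str) -> Tuple[List, List]:
    """Run a two-column query and return (x values, y values) for a line graph."""
    rows = conn.execute(query).fetchall()
    xs = [row[0] for row in rows]
    ys = [row[1] for row in rows]
    return xs, ys


# In-memory SQLite stands in for the MySQL connection; the data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (day TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [("2023-03-21", 12), ("2023-03-22", 17), ("2023-03-23", 55)],
)

xs, ys = sql_line_series(conn, "SELECT day, total FROM signups ORDER BY day")
print(xs, ys)
```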

Thanks for the feedback. Regarding creating plots from SQL, we were just talking about that internally. It's a really interesting idea and something we have used ourselves in django-sql-explorer (https://django-sql-explorer.readthedocs.io/en/latest/feature...)

We have a set of components (https://github.com/datapane/components) and can look at something for autoplotting from a query. Similarly, we recently built a component to automatically create vega-lite plots from datasets using ChatGPT, which is another approach that works well (if interested, I can write up as a blogpost).

Looks cool and to me the most interesting part is that it uses htmx which I like!

However, how does it differ from all the other offerings in this space, which are increasingly difficult to keep up with?

I'm thinking of:
- Dash
- Streamlit
- Gradio
- Pynecone
- Shiny Python
- Voila
- Holoviews
- ...

Thanks for the feedback! We love htmx but don't actually use it under the hood, although we were inspired by their model when designing Datapane.

Most frameworks in the space are great for building small apps and demos, but struggle to progress beyond an MVP, and for any moderately complex case quickly get messy. Datapane is for people who are building data products vs. small demos, which often include static sites/reports, externally triggered updates, background processing, or integration with other frameworks (like running inside of Flask). As an example, one of our users is a large gaming company who pre-render Datapane dashboards from inside of Spark pipelines in order to build internal apps for a/b testing.

That said, we also wanted to create a model which is dead simple to use and doesn't fall into callback hell or require the user to write HTML or CSS. I would love any feedback on this!

Cool project! We mainly use Metabase at GitStart and use the standard sharing options (https://www.metabase.com/learn/administration/guide-to-shari...), so I'm wondering, would there still be a use case for us? And how long do you think it would take to migrate a Metabase library of 100+ queries?

Thanks! If you are doing basic drag-and-drop BI queries on data you already have in a centralised warehouse, Metabase may be the best option for you.

From what I can see from GitStart, Datapane will be useful if:

1) You want to go beyond BI and build applications that can actually run code. For instance, an internal tool that pulls down data from a new developer's GitHub repos and rates them on some heuristics.

2) You want to create custom reports or dashboards for each client. For instance, for each customer you have, programmatically generate a custom report with data and plots which demonstrate ROI, and share it with them / embed in your product. You could even create a Slack bot where they could generate these on-demand.
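The second case, one report per customer, can be sketched with nothing but the standard library. This is an illustration of the pattern rather than Datapane code: `render_report`, the customer names, and the metrics are all made up, and a real report would include plots and richer layout.

```python
import html
import tempfile
from pathlib import Path


def render_report(customer: str, metrics: dict) -> str:
    # Escape all values so customer data cannot inject markup into the page.
    rows = "".join(
        f"<tr><td>{html.escape(k)}</td><td>{html.escape(str(v))}</td></tr>"
        for k, v in metrics.items()
    )
    return (
        "<!DOCTYPE html><html><body>"
        f"<h1>Report for {html.escape(customer)}</h1>"
        f"<table>{rows}</table>"
        "</body></html>"
    )


# Made-up customers and metrics, purely for illustration.
customers = {
    "acme": {"tickets_closed": 42, "hours_saved": 10},
    "globex": {"tickets_closed": 7, "hours_saved": 3},
}

# Write one standalone HTML file per customer into a temporary directory.
out_dir = Path(tempfile.mkdtemp())
paths = []
for name, metrics in customers.items():
    path = out_dir / f"{name}.html"
    path.write_text(render_report(name, metrics), encoding="utf-8")
    paths.append(path)

print([p.name for p in paths])
```

Each output file is standalone, so it can be emailed, embedded, or uploaded to static hosting without a running server.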

In terms of effort, it would be pretty trivial to port SQL queries to Python and use them to generate reports. Would love to help out if we can; I'm leo@{ourwebsite}.com

This feels like it should be somehow paired with https://datasette.io/

Very cool. Why are you using a neon purple theme for the plots and your web page? I ask because that color scheme seems to be popping up everywhere.

Because it is way faster to just stick to the standard named TailwindCSS colors. I'd do the same if design were not my main selling point. TailwindCSS does make it super easy to define custom design tokens (your own brand color scheme) and use them. However, in that case, you have to define your own styles with TailwindCSS color utilities, which are not readily available if you want to copy-paste designs and their associated classes from TailwindCSS or from the many pre-designed modules all over the Internet.

I believe it is a good choice to start with the default (still beautiful), especially in the early stage of a startup, and then move to one's own branding later.

Thanks! We are big users of TailwindCSS across the product and those colours are from their palette: https://tailwindcss.com/docs/customizing-colors

The fact that there are so many different steps in https://github.com/datapane/datapane#analytics signals that you may want to adopt https://consoledonottrack.com/

Thanks so much for sharing this, I hadn't seen that before. We'll get on implementing it.

> Query CSVs using ChatGPT

sql? in particular, postgresql.

a lesser shiny....
