
Show HN: Dropbase 2.0 – Turn offline files into live databases - jimmyechan
https://www.dropbase.io/
======
ayazhan
Hey HN,

We're happy to introduce Dropbase 2.0! It's a tool that helps you bring
offline files, such as CSV, Excel, and JSON files, into a Postgres database. You
can also process your data before uploading it using a spreadsheet-like
interface or by writing a custom Python script. Once your data is in the
database, you can query it using any third-party tool (credentials will be
provided). You can also access your data via a REST API (powered by PostgREST)
or create custom endpoints to serve a more specific use case.

A bit about the tech:

Currently, we support .csv, .json, .xls, .xlsx files. For data processing, we
use Pandas, so if you are comfortable using Python, you can write your own
custom functions to process the data. We also give you a free shared Postgres
database to test the tool with (your data is isolated and hidden from others).
Each one of these databases comes with an instance of PostgREST preinstalled,
so you can query your database using a REST API
([http://postgrest.org/en/v7.0.0/](http://postgrest.org/en/v7.0.0/)).
You can also generate an access token with an expiry date to share your data
with others.
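
For anyone who has not used PostgREST before, a read request is mostly a URL
with filters in the query string. Here is a minimal sketch, assuming a
hypothetical base URL and a hypothetical `people` table (neither is a real
Dropbase endpoint). The `select`, `order`, and `limit` parameters and the
`column=operator.value` filter form are PostgREST's documented query syntax.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the real one would come with your database credentials.
BASE_URL = "https://example.invalid/rest"


def build_postgrest_url(table, select=None, filters=None, order=None, limit=None):
    """Build a PostgREST-style GET URL for reading rows from a table.

    filters maps a column name to an (operator, value) pair, e.g.
    {"age": ("gte", 18)} becomes the query parameter age=gte.18.
    """
    params = []
    if select:
        params.append(("select", ",".join(select)))
    for column, (operator, value) in (filters or {}).items():
        params.append((column, f"{operator}.{value}"))
    if order:
        params.append(("order", order))
    if limit is not None:
        params.append(("limit", str(limit)))
    # Keep commas readable instead of percent-encoding them.
    query = urlencode(params, safe=",.")
    return f"{BASE_URL}/{table}" + (f"?{query}" if query else "")


url = build_postgrest_url(
    "people",  # hypothetical table name
    select=["name", "age"],
    filters={"age": ("gte", 18)},
    order="age.desc",
    limit=10,
)
print(url)
```

The resulting URL can be fetched with any HTTP client; PostgREST returns the
matching rows as JSON.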

There are many more features that we baked into the product. Come check it
out; it's open to the HN community.

~~~
mtVessel
Question about the TOS:

"However, by posting Content using Service you grant us the right and license
to use, modify, publicly perform, publicly display, reproduce, and distribute
such Content on and through Service. You agree that this license includes the
right for us to make your Content available to other users of Service, who may
also use your Content subject to these Terms"

Does this mean I should have no expectation of privacy or control over
anything I upload?

~~~
jimmyechan
Your data is private and you own all of your data. We do not and will not
share your data with anybody else unless you share it yourself through the
sharing of projects, pipelines, endpoints, or exports. We do however store and
process your data. We also let you generate endpoints so we need some wording
to cover these cases. We'll double check our terms to make this point clearer,
but we added this because you can generate live endpoints and you can share
those.

~~~
rapnie
Unfortunately IANAL, and the wording of the ToS/PP in your service, and in
those of most other online service providers, always gives me that nagging
feeling that the legalese leaves so many loopholes, texts open to different
interpretation, that effectively - even though it may seem so - I have no
privacy guarantees whatsoever. That might be entirely unwarranted of me, but
the feeling is there. Unease.

~~~
gwd
> even though it may seem so - I have no privacy guarantees whatsoever.

I mean, fundamentally, _really consistent_ security is hard; and the best you
can reasonably expect from someone you're not paying is "best effort". For
them to make real promises about security opens them up to being sued if they
fail; it's not really reasonable to ask someone to do that unless you're
paying them a reasonable chunk of cash to offset that risk.

~~~
rapnie
Sorry, but this almost feels like a GPT-3 response to me.

I don't see what security, paid vs. free or best-effort has got to do with my
argument, which is that the loopholes in legalese are so hard to spot for
anyone but a lawyer, that effectively my data _might_ still be used in any way
and possibly against my wishes or expectations (but which becomes legal when I
consent to the PP and ToS).

------
TheUndead96
On a somewhat unrelated note, the design of this landing page is fantastic. It
is exactly how I like to have new tools presented to me. All the fundamental
competencies of the tool displayed on one page, with sufficient, but not
verbose, technical detail. Kudos.

~~~
contravariant
Yeah, although in my case it would have been nice to have a 'light' version,
as I have some trouble reading the dark grey text on a black background in
broad daylight.

~~~
jimmyechan
Working on this.

------
Scheris
Is there a light mode option for the site? Or is iOS just not selecting it for
some reason?

My astigmatism makes reading dark UIs migraine inducing, so as cool as this
sounds I unfortunately can’t read more about it without triggering a migraine.
x_x

(Maybe still default to the dark UI, but if the user has light mode enabled it
uses a light UI?)

~~~
Kaze404
Oh, so that's what makes dark themes so hard to read for me. Unfortunately
there's no easy way out for me, since my eyes are photosensitive due to a
separate complication. Between a rock and a hard place :p

~~~
jimmyechan
I'm with you in the sense that I just learned through these posts that
astigmatism and website dark modes don't go well together.

I wonder if lighter colors and soft grey palettes would work for your case.
Have you experimented with colors that are easier on your eyes, given your
complications?

~~~
Kaze404
I haven't actually. I always just used dark mode and assumed the additional
difficulty was a drawback everyone experienced and learned to live with it.
Now that I know that's not the case I'll see if I can find a color scheme that
works well, like you suggested :)

------
dvt
This is very cool. I think there's a lot of room to grow this space: local
"folders" that do some "magic" in the cloud.

Obviously, sync (Dropbox) is just the beginning, and Dropbase takes it a step
further. There's been times where I had a (big-ish) CSV and wanted to run a
few tests/queries on it. Auto-importing it into some database and being able
to run SQL/Python on the dataset (without bootstrapping that locally) would've
been a godsend.

Good luck with this!

~~~
orev
Perl’s DBD::CSV can do this. I would be surprised if Python didn’t have
something similar.
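
In Python the closest stdlib route is probably `csv` plus `sqlite3`: load the
file into an in-memory table and run SQL on it. A minimal sketch, with made-up
sample data and a hypothetical `load_csv` helper; every value is stored as
text here, so numeric comparisons need a CAST.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header row and insert every data row as text.

    Table and column names are interpolated into the SQL, so this is only
    suitable for files whose headers you trust.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)


# Made-up sample data standing in for a file on disk.
sample = io.StringIO("city,population\nOslo,700000\nBergen,290000\nTromso,77000\n")

conn = sqlite3.connect(":memory:")
load_csv(conn, "cities", sample)

rows = conn.execute(
    "SELECT city FROM cities "
    "WHERE CAST(population AS INTEGER) > 100000 ORDER BY city"
).fetchall()
print(rows)
```

For a real file, `open(path, newline="")` would replace the `io.StringIO`
object.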

~~~
dvt
This reminds me of @BrandonM's famous reply to Drew Houston[1] :) Of course
there are _ways of doing it_. But sticking something in a folder and stuff
just automagically "working" is a much more pleasant workflow -- and more
importantly, how you create value. Jimmy, I'd say you're in good company!

[1]
[https://news.ycombinator.com/item?id=8863](https://news.ycombinator.com/item?id=8863)

~~~
jimmyechan
Thank you. That comment is legendary!

~~~
phreack
Please don't forget about BrandonM's follow up comment though!

I've seen the thread linked as a 'tech people don't appreciate simplicity' but
he actually acknowledges Dropbox could be very useful and wishes success.

The other criticisms were also very valid at the time, and were acted upon.

------
iblaine
Gitlab has a similar project that will load google sheets as a dataframe.[1]

At my current company, we load google sheets into s3, then mount those files
as external tables. There has not been a commit in years, meaning it has
worked out well for us.

What seems to be missing in these solutions, and what Dropbase provides, is a
UI to guide users through the process.

[1] [https://gitlab.com/gitlab-data/analytics/tree/master/extract/sheetload](https://gitlab.com/gitlab-data/analytics/tree/master/extract/sheetload)

~~~
jimmyechan
Thanks. That's a useful project! And yes, we aim to make data processing easy
(through UI, low code) and easy to reuse/export (by converting UI steps 1:1 to
code)

------
rpdillon
My favorite incarnation of this idea is probably Datasette.

[https://github.com/simonw/datasette](https://github.com/simonw/datasette)

~~~
jimmyechan
I just had a friend mention this too. I just checked it out and it's really
cool! What's your favorite thing about it?

~~~
rpdillon
I tend to think of software in terms of composable units, so Unix-like
utilities are very attractive in my workflow, and Datasette just fits right
into that model. Datasette is easy to deploy and does one thing well. I can
use it on my little single-board computer I use for hobby projects and allow
other machines on my network to have an API to view a database a daemon is
populating there. But it works just as well to share larger, static data sets
on the internet. It's just a tool that fits right into its niche in the stack
and does its job really well (much like sqlite).

~~~
jimmyechan
Thanks, that's a good point.

As engineers, we also tend to think in terms of modularity and control - call
this "tool flexibility."

With Dropbase, we're balancing flexibility with the goal of creating an
experience that works for users who can't directly work with these composable
units.

How we balance experience vs flexibility is that we give users full control of
the database and the processing steps (we even allow you to export Python code
you can run anywhere else).

We found this is the right balance for the use cases we're targeting,
although we're still doing a lot of research to figure out the right balance
and that balance might also evolve over time.

------
jeremyw
Good job. Channeling patio11: increase your prices, e.g. $49/$250/CALL US.
Your customers are businesses, these numbers will be perfectly reasonable.

~~~
jimmyechan
Thank you. Good point, we actually just had a few agencies/consultancies ask
about self-hosted / enterprise options.

We are offering our current pricing to early adopters on HN. We'll likely
increase prices once we do a general public launch.

~~~
btown
If you do this, please offer nonprofit and student/academic plans! Many people
in social sciences and the nonprofit world don't have the engineering
resources to build data pipelines, nor the budgets for a $250/mo plan. But
they're spending every day slicing Excel files of surveys and risk assessments
and potential donors and the now-departed intern's messy list of average
flight speeds of unladen swallows.

In all seriousness, this product could bring leverage to those in society who
could have the most impact. Design is brilliant, the pipeline idea is
brilliant, I can see this really gaining traction.

~~~
ayazhan
Thank you so much!

Yes, we will offer non-profit and academic plans.

------
joeyjojo
I really like this. I could see myself using this in the future for some
personal projects or for prototyping.

What I would really love though is something a little more similar to Dropbox,
with tight integration to the user's filesystem, and keeping the spreadsheet
as the source of the data.

~~~
ayazhan
Spreadsheet view for all your data is on our roadmap. Integrating with the
user's file system is something we'll definitely explore; it sounds quite
interesting. These are great suggestions, thank you!

------
2pointsomone
This is just excellent. I do a lot of ETL work and need to build custom
workflows for it; this is exactly what I have been looking for. Good job,
team!

~~~
DouweM
Since you brought up ETL, you may also be interested in Meltano
([https://meltano.com](https://meltano.com)), an open source ETL tool we've
been working on at GitLab for a few years now!

I shared some thoughts on how it compares to DropBase in another comment:
[https://news.ycombinator.com/edit?id=24194916](https://news.ycombinator.com/edit?id=24194916)

If you end up giving it a try, I'd love to hear what you think :)

~~~
2pointsomone
Thank you! Will look into it.

------
ffpip
Congrats! Looks like it could be very useful!

Just a tiny thing I noticed. The free plan is usually mentioned on the left of
the page. ( [https://www.dropbase.io/pricing](https://www.dropbase.io/pricing)
)

Or was the page layout put intentionally that way?

~~~
jimmyechan
Updated the page. Free plan on the left! Thanks!

------
BlackLotus89
I would love to be able to import more than one json file or multiple URLs.
This would allow me to migrate legacy databases through apis.

~~~
ayazhan
It's on our roadmap, and we'll try to add support for multiple files and zip
in the upcoming release.

------
kfk
Some of the ideas are good but it’s more interesting if it processes files
like Redshift Spectrum does vs loading first to Postgres. I know you are targeting
smallish datasets but eventually data size will go up and loading everything
in PG could become a scaling problem.

~~~
jimmyechan
Thanks! This is a neat idea! It would allow users to upload unstructured data.
We'll explore this.

------
nurbl
This reminds me of [https://www.visidata.org/](https://www.visidata.org/)
which is a terminal based application with similar purpose - loading tabular
data from various sources, and exploring and processing it in a visual way.

~~~
Rainymood
Woah. I need this, but then with Vim bindings ... that would be my ultimate
data browser

------
chrisweekly
Looks useful, and the landing page is excellent.

Minor grammar fix: "See how your data looks like as you process it." -> "See
what your data looks like as you process it."

(could be "how it looks" or "what it looks like", not "how it looks like")

~~~
jimmyechan
Thank you for this! I appreciate it.

Updated it to: "Spreadsheet view. See how your data looks as you process it".
Went for this suggestion to keep everything to 2 lines.

------
interactivecode
I've been looking at a spreadsheet to API like GUI interface for quite a while
now. Mainly for small projects around the house NOT company sized work.

Does anyone know of a self-hosted interface like Dropbase or Airtable?

------
alanbernstein
This looks useful, I'd prefer self-hosted for some use cases though.

~~~
jimmyechan
Thanks. Yes, in some cases where you're working with regulated data you'd need
a self-hosted version. We're working on an enterprise version that allows
this.

Would you be able to describe your use case and the kind of data you're using?

~~~
alanbernstein
Sorry, what I have in mind is mostly just personal productivity stuff,
relatively small data, but private.

~~~
jimmyechan
Our BYODB plan lets you connect to 1 of your own databases. It's not quite
self-hosted but you are in full control of your DB.

------
jimmyechan
Thanks for all the support, comments and feedback HN! It's been a really long
day. I will respond to more comments in the morning!

------
config_yml
This is very cool! I was building something similar but with a different crowd
in mind, basically in between business and ops people as a "data sanity
gatekeeper".

I was debating going with a similar stack (postgres(t)) but am currently
playing around with sqlitebiter. Cool to see a similar product like this!

~~~
jimmyechan
Thanks! sqlitebiter looks interesting!

------
dorfsmay
How does it work for deep JSON? Does it import it as JSON and keep the depth
inside each row? Or is there an option to flatten the data, or spread across
related tables?
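
To be concrete about what I mean by "flatten", something like this sketch,
where nested keys become dotted column names. The `flatten` helper is
hypothetical and only illustrates the question; it is not a Dropbase feature.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`.

    Lists are left as-is; deciding how to spread them across related
    tables is a separate design question.
    """
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Made-up sample row with two levels of nesting and one list.
row = {
    "id": 1,
    "user": {"name": "Ada", "address": {"city": "London"}},
    "tags": ["a", "b"],
}
flat_row = flatten(row)
print(flat_row)
```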

~~~
kelvinzhang
Hey, I'm one of the engineers behind Dropbase. Currently all imported data
needs to be structured, which means that the data needs to be formatted in
either values, records, index, or column orientations (see
[https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html)). Right now we can
only auto-detect between those formats, but in the future we're looking to
accept unstructured data as well.
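
To make the four orientations concrete, here is the same two-row table written
each way, plus a rough sketch of telling them apart. The shapes follow the
`orient` options in the pandas `to_json` docs linked above. The `to_rows`
helper is hypothetical and is not how Dropbase's auto-detection works. Note
that `index` and `columns` are both a dict of dicts, so this sketch needs a
hint to separate them.

```python
import json

# The same table in the four orientations (shapes as in pandas' to_json docs).
RECORDS = '[{"name": "Ada", "age": 36}, {"name": "Bob", "age": 41}]'
VALUES = '[["Ada", 36], ["Bob", 41]]'
INDEX = '{"0": {"name": "Ada", "age": 36}, "1": {"name": "Bob", "age": 41}}'
COLUMNS = '{"name": {"0": "Ada", "1": "Bob"}, "age": {"0": 36, "1": 41}}'


def to_rows(text, dict_orient="columns"):
    """Parse JSON text and return a list of rows.

    Lists are told apart by their elements (dicts -> records, lists -> values).
    A dict of dicts is ambiguous, so `dict_orient` must say whether the outer
    keys are column names ("columns") or row labels ("index").
    """
    data = json.loads(text)
    if isinstance(data, list):
        if all(isinstance(item, dict) for item in data):
            return data  # records: already one dict per row
        if all(isinstance(item, list) for item in data):
            return data  # values: one list per row, no column names
        raise ValueError("list mixes row types")
    if isinstance(data, dict):
        if dict_orient == "index":
            return list(data.values())
        if dict_orient == "columns":
            # Row labels are taken from the first column.
            labels = list(next(iter(data.values()), {}))
            return [
                {col: cells[label] for col, cells in data.items()}
                for label in labels
            ]
    raise ValueError("unrecognized layout")


print(to_rows(COLUMNS))
```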

------
synunlimited
Reading the headline I clicked this thinking it might give you an API for your
file system and I had thoughts of managing / viewing my files via an API.

~~~
jimmyechan
We could work on the wording. Let me know if you have any suggestions.

We don't let you manage your file system through API but we offer access to
your database tables through REST APIs. It's not the same but you could hack
it to work that way.

~~~
synunlimited
Perhaps changing offline files to offline data (files) as just "offline files"
is pretty ambiguous of what is supported (and how I arrived at file system)

------
chromedev
Why not build a filesystem API without the need for importing first, so you
can do queries on the file system?

~~~
jimmyechan
That's a great suggestion. We were considering something like this for
business or enterprise versions. It would also allow you to connect Dropbase
pipelines to multiple files or entire folders in local or cloud storage.

------
matz1
tried to import csv from file, failed. there was CORS error shown in the dev
console.

~~~
jimmyechan
Are you still getting this issue? Could you check your file encoding and make
sure it's UTF-8?

~~~
matz1
yes, just tried again, same issue. stuck at "Importing..." forever. I looked
at the dev console.

Access to XMLHttpRequest at
'https://api.dropbase.io/v1/pipeline/HhNFQmZnnFkjrqK6Lptr3w/load_csv_dd/done'
from origin 'https://app.dropbase.io' has been blocked by CORS policy: No
'Access-Control-Allow-Origin' header is present on the requested resource.

The csv file is UTF-8 encoding.
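
For context on what that console message means: the browser only lets the page
read a cross-origin response if the response carries an
`Access-Control-Allow-Origin` header matching the page's origin. A minimal
server-side sketch, assuming a hypothetical allowlist and helper (this is not
Dropbase's API code):

```python
# Hypothetical allowlist: origins permitted to read responses from this API.
ALLOWED_ORIGINS = {"https://app.dropbase.io"}


def cors_headers(request_origin):
    """Return the CORS response headers for a request's Origin header value.

    If the origin is not on the allowlist, no Access-Control-Allow-Origin
    header is returned and the browser blocks the page from reading the
    response, which produces the kind of error shown in the console.
    """
    if request_origin in ALLOWED_ORIGINS:
        return {
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response varies with the Origin header.
            "Vary": "Origin",
        }
    return {}


print(cors_headers("https://app.dropbase.io"))
```

The header has to be present on error responses as well. One common cause,
which may or may not be what happened here, is a server error that skips the
code path adding the header, so the real failure shows up as a CORS message.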

~~~
ayazhan
Yes, there is a bug with error messages, we'll push an update with a fix later
today.

If you want, you can also share the csv with us and we'll troubleshoot it on
our end. just email us at hello@dropbase.io

------
xdevice
For Xlsx files with more than one sheet, only the first sheet gets imported?

~~~
jimmyechan
Yes, we'll add multi-sheet import soon.

------
_AzMoo
> Your password must be between 6 and 32 characters (inclusive) long.

Really?

~~~
jimmyechan
Good point. We just added code to allow for up to 256 characters. We're
testing this now and will push to prod end of day.

------
kyawzazaw
Is there a way to automate the data updating using JSON file?

~~~
jimmyechan
If you mean automating data ingestion on a schedule, then yes, it's something
we're building.

If not, could you clarify what you mean?

------
uke1
Would be amazing if you supported parquet :)

~~~
ayazhan
I agree, parquet cannot be ignored and we're planning to support it.

What would you be interested more in: loading data from parquet files or
converting and storing your data in parquet?

------
MuffinFlavored
Can it do joins/group by/etc.?

~~~
ayazhan
yes, you can run regular sql queries on your database where you can join/group
tables.

if you want to join/group static files (like csv/excel), then you'd need to
import your files into the dropbase DB first and then run a sql query on that DB.

let me know if this answered your questions.

~~~
MuffinFlavored
no, it didn't. you should add a feature where you have 2,3,4,5 tables
represented as CSV and you can do joins on them
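
To illustrate the kind of thing I mean, here is a minimal sketch of joining and
grouping two CSVs directly, with no database in between. The sample data and
the helper names are made up for illustration; this is not something Dropbase
offers. It is roughly the equivalent of
`SELECT name, SUM(amount) FROM orders JOIN customers USING (customer_id) GROUP BY name`.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data standing in for two CSV files.
ORDERS_CSV = "order_id,customer_id,amount\n1,10,25.0\n2,11,40.0\n3,10,15.5\n"
CUSTOMERS_CSV = "customer_id,name\n10,Ada\n11,Bob\n"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_and_total(orders, customers):
    """Inner-join orders to customers on customer_id, then sum amount per name."""
    # Build the lookup (hash) side of the join from the smaller table.
    names = {row["customer_id"]: row["name"] for row in customers}
    totals = defaultdict(float)
    for order in orders:
        name = names.get(order["customer_id"])
        if name is not None:  # inner join: skip orders with no matching customer
            totals[name] += float(order["amount"])
    return dict(totals)


totals = join_and_total(read_rows(ORDERS_CSV), read_rows(CUSTOMERS_CSV))
print(totals)
```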

------
ta17711771
Any plans for sqlite?

~~~
jimmyechan
Yes, we have more database types in our roadmap. At the moment we are starting
out with Postgres. Are you trying to get data to a new sqlite or an existing
one?

------
teejmya
[http://postgrest.org/en/v7.0.0/](http://postgrest.org/en/v7.0.0/)

