
Show HN: Generate test data for your database tables - vasco
http://www.databasetestdata.com/
======
ivanhoe
Creating rnd data on a single table is easy, I don't really need a tool for
that. However, it would be extremely useful to have a tool able to create test
data using actual relations between tables in the DB (e.g. generate fake
users, then use their User IDs to generate 1-10 random posts for each user.
Then for each post I want a couple of random comments attached. And so on and
on...)
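
A minimal sketch of that parent-then-child approach in Python, standard
library only. The users/posts/comments layout, the column names and the
`generate` helper are all made up for illustration; a real tool would have to
read tables and foreign keys from the schema.

```python
import random

def generate(num_users=5, seed=None):
    """Build users first, then posts per user, then comments per post.

    Hypothetical schema: posts.user_id -> users.id, comments.post_id -> posts.id.
    """
    rng = random.Random(seed)
    users = [{"id": i, "name": f"user{i}"} for i in range(1, num_users + 1)]
    posts, comments = [], []
    for user in users:
        # 1-10 posts for each user, keyed by the user's ID.
        for _ in range(rng.randint(1, 10)):
            post = {"id": len(posts) + 1, "user_id": user["id"]}
            posts.append(post)
            # A couple of comments per post, each written by some existing user.
            for _ in range(rng.randint(1, 3)):
                comments.append({
                    "id": len(comments) + 1,
                    "post_id": post["id"],
                    "user_id": rng.choice(users)["id"],
                })
    return users, posts, comments
```

Generating parents before children means every foreign key points at a row
that already exists, so the rows can be inserted in the same order.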

------
johnwatson11218
I think a tool like this should be able to inspect the real database and model
the columns statistically. If my salary info is distributed a certain way I
would like the fake data to have the same distribution. A tool could also
build language models of my text columns to generate random names that seem
like real names.

Ideally the tool could start to analyze relationships between tables and
columns and produce data that preserved those relationships in a statistical
way. For example if in production each customer has 5 orders on average then
so could the test data.
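
A rough sketch of both ideas in Python, standard library only. It assumes
salaries are close enough to a normal distribution (real salary data is
usually skewed, so this is a simplification) and that orders-per-customer can
simply be resampled from the observed counts. The function names are
hypothetical.

```python
import random
import statistics
from collections import Counter

def fake_salaries(real_salaries, n, rng):
    """Draw n fake salaries from a normal fit to the real column (assumed shape)."""
    mu = statistics.mean(real_salaries)
    sigma = statistics.stdev(real_salaries)
    # Clamp at zero so the fake column never contains negative salaries.
    return [max(0.0, rng.gauss(mu, sigma)) for _ in range(n)]

def fake_order_counts(real_order_customer_ids, n_customers, rng):
    """Pick an order count for each fake customer from the observed counts.

    Customers with zero orders never appear in the orders table, so they are
    not represented here.
    """
    per_customer = list(Counter(real_order_customer_ids).values())
    return [rng.choice(per_customer) for _ in range(n_customers)]
```

Only the fitted mean, spread and count histogram leave production here, not
the rows themselves.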

I'm not sure what you could do with clobs or xml data in columns but I guess
some stuff would remain opaque.

I could see this tool being useful to enable you to move your test code to the
cloud w/o revealing any private data to third parties. Studying a model like
this could also help new developers come up to speed on the system.

------
lutusp
This is easily done with a few lines of Python. I personally think that's a
more efficient way to go -- the Web interface doesn't really provide anything
essential to the process, and the Python approach generates more data more
quickly.

Also, Python knows how to talk directly to the user's MySQL database if it
exists.
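
Something along these lines, as a sketch. The name lists and the id/name/age
columns are placeholders, and it writes CSV rather than talking to a
database.

```python
import csv
import io
import random

# Placeholder name pools; swap in whatever lists you like.
FIRST = ["Ann", "Bob", "Cid", "Dee"]
LAST = ["Smith", "Jones", "Lee", "Khan"]

def fake_rows(n, seed=None):
    """Yield n fake rows as dicts with id, name and age."""
    rng = random.Random(seed)
    for i in range(1, n + 1):
        name = f"{rng.choice(FIRST)} {rng.choice(LAST)}"
        yield {"id": i, "name": name, "age": rng.randint(18, 90)}

def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "name", "age"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

For example, `print(to_csv(fake_rows(1000)))` dumps a thousand rows to
stdout.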

~~~
vasco
To be fair, you could say that about most tools on the web though.

~~~
lutusp
Yes, that's fair, and I could. It's true -- there are any number of
advertising-supported Web-based applications whose only reason for existence
is to recruit eyeballs. Not a comment on the merits of the present example,
just the pattern.

------
yogo
I think this needs a SQL export format. It is 'databasetestdata' after all :)

~~~
vasco
Will look into it, thanks for the tip. Also had an idea of accepting an SQL
schema dump to auto-generate the right columns. There's definitely stuff to
improve.

~~~
dikei
Generating SQL is really easy using Squel.js, especially when all you need is
an insert statement.
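
For comparison, a library-free sketch of the same thing in Python (this is
not Squel.js's API). `sql_literal` and `insert_statement` are made-up helper
names. It assumes table and column names are trusted, and it only does
standard SQL quote doubling, so dialect-specific escaping is not covered.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (standard quoting only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings: wrap in single quotes and double any embedded quote.
    return "'" + str(value).replace("'", "''") + "'"

def insert_statement(table, row):
    """Build one INSERT statement from a dict of column -> value.

    Assumes table and column names are trusted identifiers (not escaped here).
    """
    columns = ", ".join(row)
    values = ", ".join(sql_literal(v) for v in row.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"
```

This is fine for producing an export file. When inserting directly from
code, parameterized queries through the database driver are the safer route.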

------
rpedela
Pretty cool! I think a good addition would be the ability to specify the data
type for random numbers. So if I want to test 64-bit integers, I can have it
generate numbers in the 64-bit range that exceed the 32-bit range.
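
To illustrate what I mean, a small Python sketch that assumes signed two's
complement ranges. The function name is hypothetical.

```python
import random

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def random_int64_beyond_int32(rng=random):
    """Return a signed 64-bit value that does not fit in a signed 32-bit int."""
    # Pick a positive value just past the 32-bit range, up to the 64-bit max.
    magnitude = rng.randint(INT32_MAX + 1, INT64_MAX)
    # Half the time mirror it onto the negative side; -magnitude - 1 spans
    # INT32_MIN - 1 down to INT64_MIN.
    return magnitude if rng.random() < 0.5 else -magnitude - 1
```

A uniform draw over that range lands on very large values almost every time,
so a real option would probably want to mix in boundary values as well.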

~~~
vasco
Yeah, one of the things I had in mind was special option fields for each kind
of data. Like specifying range of numbers, number of words, formats for
date/time/zipcodes... But I wanted something out the door quickly and had
to leave that out for now. Thanks for the feedback :)

------
verelo
Interesting, some of the 'test' data emails it gives me seem awfully real. For
example: Lue_Schiller@shanie.net

The domain is real, the email probably not... but if they have a catch-all,
someone else using this could get pretty annoying.

~~~
joshka
use @example.com for emails. see <http://www.iana.org/domains/example> for
more info.

~~~
pests
I know it's the standard, but how will that help validating, creating
statistics, or otherwise interacting with the domain portion of an email
address?

~~~
joshka
You could use @prefix.example.com.
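
A quick Python sketch of that. The subdomain labels and the `fake_email`
name are invented for illustration; everything stays under the reserved
example.com domain, so nothing can reach a real mailbox.

```python
import random
import string

def fake_email(rng, subdomains=("mail", "corp", "shop")):
    """Return a fake address at a random subdomain of example.com."""
    # Eight random lowercase letters for the local part.
    user = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    return f"{user}@{rng.choice(subdomains)}.example.com"
```

Code that groups or counts by domain still has several distinct domains to
work with.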

------
andrewem
Clicking on "Load Recipe" gives "No LocalStorage support detected. Please
upgrade your browser to enable." on both Firefox 19.0.2 and Chrome
25.0.1364.172 on my Mac. I presume that's a bug because a web site that I
wrote which uses HTML5 local storage works fine in those same browsers.

Edit: I ought to have said, cool idea, and I like being able to save and load
recipes. It would be neat to be able to share recipes on this, like you can do
with web pages on JsFiddle.

------
rpedela
You have a bug. It returns null for a random number.

~~~
vasco
Ah, there was a typo. It's fixed, thanks.

