In the end I wrote django-devdata to do a similar thing: export anonymised, referentially correct relational data from a large Django site. Configuration is just code in Django settings, and data can be exported/imported pretty quickly. Happy to help anyone get set up with it if it’s useful to others!
I would be interested to know what impedance error you are referring to, and what problems you have had to deal with.
Also, it would be interesting to know when that was. There have been a lot of improvements, especially in the last year.
I am always trying to improve the tool. Feedback is very valuable for this.
Thanks in advance, Ralf
It's hard to describe exactly what I mean by the impedance mismatch, but generally this tool seemed (based on a small amount of research a while ago) to be a primarily GUI-based tool that requires Java, requires quite a lot of up-front knowledge to use or to edit its configuration, and has no understanding of our application.
On the other hand, the solution we ended up using (django-devdata) was code-based rather than GUI-based, with configuration checked in to source control, code reviewed, etc. It's a Python dependency, which helped as most of our software and tooling was in Python (no one had Java installed). The config format is pretty approachable when making small updates, with no need to learn much of a new tool, which mattered because we did very regular database updates on a schema with ~500 tables. And lastly, as the configuration was just Python code in our codebase, it was easy to integrate with the rest of our application, to re-use utils, validation, etc.
Obviously this tool wouldn't be suitable for projects that aren't Django sites, so it's far more limited, but that integration was handy, and I'd probably re-implement it for Rails or any other ORM or language I worked with if necessary, as it's only a few hundred lines of code.
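To give a rough idea of what "configuration as plain Python" can look like, here's a minimal sketch. To be clear, this is not django-devdata's actual API: the `ANONYMISERS` mapping and the `pseudonym`, `anonymise_row` and `export` helpers are names made up for illustration, and it works on plain dicts rather than Django models. The point is only that sensitive columns get deterministic fake values while primary and foreign keys are left alone, so relations still line up after export.

```python
import hashlib


def pseudonym(value, salt="dev-export"):
    """Deterministic stand-in for a value: the same input always maps to the same token."""
    digest = hashlib.sha256(f"{salt}:{value}".encode()).hexdigest()
    return digest[:12]


# Hypothetical config: table name -> {column name: anonymiser function}.
# Columns not listed (ids, foreign keys, non-sensitive data) pass through unchanged.
ANONYMISERS = {
    "users": {
        "email": lambda v: f"user-{pseudonym(v)}@example.com",
        "name": lambda v: f"User {pseudonym(v)[:6]}",
    },
}


def anonymise_row(table, row):
    """Return a copy of `row` with the configured columns replaced."""
    rules = ANONYMISERS.get(table, {})
    return {
        col: rules[col](val) if col in rules and val is not None else val
        for col, val in row.items()
    }


def export(tables):
    """Anonymise every row of every table; keys are untouched, so references stay valid."""
    return {
        table: [anonymise_row(table, row) for row in rows]
        for table, rows in tables.items()
    }
```

Because the config is ordinary Python, the lambdas above could just as easily be existing helpers or validators from the application, which is the integration benefit described above.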
It seems that you're targeting _R_DBMSs, but is there any chance that Elasticsearch is supported?