
> you have to compare them both completely. Is there an alternative?

When I was doing this stuff for a telco 10 years ago (comparing before/after output for CDR mediation changes), I found it was much faster to dump the two x00M-row tables as CSV, sort them, and then use a Perl script to compare them row by row. The join approach taken by the Oracle expert took many hours; my dump-sort-scan took under an hour.
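For anyone curious, the scan step of a dump-sort-scan can be sketched roughly like this. This is Python rather than the original Perl, and it is not the script I actually used. It assumes both dumps were sorted the same bytewise way (e.g. with `LC_ALL=C sort`); the function name `diff_sorted` is made up for illustration.

```python
def diff_sorted(lines_a, lines_b):
    """Merge-scan two identically sorted line iterables.

    Yields ("only_in_a", line) or ("only_in_b", line) for every line that
    has no counterpart on the other side. Duplicates pair up one-for-one.
    Assumption: both inputs are sorted in plain string order.
    """
    it_a, it_b = iter(lines_a), iter(lines_b)
    a, b = next(it_a, None), next(it_b, None)
    while a is not None or b is not None:
        if b is None or (a is not None and a < b):
            # a sorts before b (or b is exhausted): a has no match
            yield ("only_in_a", a)
            a = next(it_a, None)
        elif a is None or b < a:
            # b sorts before a (or a is exhausted): b has no match
            yield ("only_in_b", b)
            b = next(it_b, None)
        else:
            # identical lines: consume both
            a, b = next(it_a, None), next(it_b, None)
```

Passing two open file handles streams both dumps in a single pass, so memory use stays flat regardless of row count; all the heavy lifting is in the external sort.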




OMG. Something went very, very wrong if it's faster to do that outside the database than within.


Yeah, I've no idea why it was that much slower, but these were reasonably sized tables (x00M rows) with tens of columns, and it was basically a full join on every column. It would probably need an index on every column to be sensible?

[edit: And with CDR mediation, it's not as simple as "do all the columns match exactly?", because some columns are going to differ in certain allowed ways that need to be considered a match. You also need to be able to identify rows that differ in only one column, so you can say "these look the same but XYZ is different". Which is probably why the query was horrendous.]
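To make the "allowed differences" part concrete, here is a rough Python sketch of a per-column comparison. Everything in it is hypothetical: the column names, the one-second tolerance on `duration`, and the `compare_rows` helper are invented for illustration and are not the real mediation rules. It also assumes the two rows have already been paired up by key and share the same columns.

```python
def compare_rows(old, new, rules=None):
    """Return the names of columns that differ in a non-allowed way.

    old, new: dicts mapping column name -> string value (same columns).
    rules: optional dict mapping column name -> function(old, new) -> bool
           that returns True when the two values count as a match.
           Columns without a rule must match exactly.
    """
    rules = rules or {}
    mismatched = []
    for col, old_value in old.items():
        matches = rules.get(col, lambda x, y: x == y)
        if not matches(old_value, new[col]):
            mismatched.append(col)
    return mismatched


# Hypothetical rule: durations may differ by at most one second.
rules = {"duration": lambda x, y: abs(int(x) - int(y)) <= 1}

old = {"id": "1", "duration": "60", "cost": "0.10"}
new = {"id": "1", "duration": "61", "cost": "0.12"}
print(compare_rows(old, new, rules))  # lists only the "cost" column
```

Returning the list of offending columns, rather than a plain yes/no, is what lets you report "these look the same but XYZ is different".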



