
Questions about the Massive South African “Master Deeds” Data Breach Answered - robin_reala
https://www.troyhunt.com/questions-about-the-massive-south-african-master-deeds-data-breach-answered/
======
thrwway_x34
The mystery of there being more ID numbers in the breach than people in SA
could be explained if they were being used in the database as a generated
primary key, and since the remaining columns are nullable, they could be filled
in with data from various sources as it is sucked in. The fact that New_IDn
[1] is a bigint leads me to strongly suspect this, since most developers in
SA would treat ID numbers as strings, rather than numbers, unless they had a
very good reason to do otherwise.
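
One rough way to test the generated-key idea: real SA ID numbers are generally
understood to end in a Luhn check digit, so a column of genuine IDs should pass
that check almost every time, while arbitrary generated keys would pass only
about one time in ten. A minimal Python sketch, assuming the Luhn scheme applies
and that bigint values need zero-padding back to 13 digits; the function names
are made up for illustration:

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the standard Luhn check."""
    if not number.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_sa_id(value: int) -> bool:
    """Assumption: a plausible ID is 13 digits long and Luhn-valid."""
    # A bigint column drops leading zeros, so pad back to 13 digits first.
    text = str(value).zfill(13)
    return len(text) == 13 and luhn_valid(text)


def pass_rate(values) -> float:
    """Share of values that look like IDs; near 1.0 suggests real IDs."""
    values = list(values)
    if not values:
        return 0.0
    return sum(looks_like_sa_id(v) for v in values) / len(values)
```

A pass rate close to 1.0 over the whole column would count against the
generated-key theory; a rate near 0.1 would support it.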

It would be pretty easy to figure this out by stripping out the last 3 digits
from the ID numbers in the DB, and then ordering them by the first 6 (DOB)
digits. Assuming that digits 7-10 are allocated to males and females
sequentially in their respective ranges, there should be much sparser data for
higher numbers in digits 7-10 (say, above 5500 for males and 0500 for females)
[2], since those people wouldn't exist.
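
A rough Python sketch of that check. It assumes 13-digit IDs laid out as 6 DOB
digits, 4 sequence digits, then the 3 trailing digits being dropped, and it
takes the thresholds above at face value (female numbers counted up from 0000,
male numbers from 5000). The function names and default cut-offs are
illustrative, not anything from the actual data:

```python
from collections import Counter


def split_id(id_value):
    """Drop the last 3 digits and return (dob, sequence) for one ID."""
    # Restore leading zeros that a bigint column would have lost.
    text = str(id_value).zfill(13)
    return text[:6], int(text[6:10])


def sequence_histogram(ids, bucket=500):
    """Count IDs per bucket of the 4-digit sequence field (digits 7-10)."""
    counts = Counter()
    for id_value in ids:
        _, seq = split_id(id_value)
        counts[(seq // bucket) * bucket] += 1
    return dict(sorted(counts.items()))


def implausible_share(ids, female_cut=500, male_start=5000, male_cut=5500):
    """Share of IDs whose sequence number sits above the assumed cut-offs."""
    ids = list(ids)
    if not ids:
        return 0.0
    high = 0
    for id_value in ids:
        _, seq = split_id(id_value)
        if female_cut <= seq < male_start or seq >= male_cut:
            high += 1
    return high / len(ids)
```

If the IDs belong to real people, the histogram should be heavily weighted
toward the lowest bucket in each range and the implausible share should be
small; an even spread would point toward generated values.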

As for the source of the data, I have never owned property, but my email was
in the breach, so the Deeds Office isn't the (only?) source of the data. I
also doubt that the Deeds Office even records email addresses.

I would strongly suspect a credit bureau or two, since they track historic
address data, emails and employers. I have my credit reports that have
slightly incorrect address data from SA credit bureaus, and it would be
interesting to see if it matches the data in the DB. My wife, mother, and
children don't have credit records of any significance, and none of their
emails came up in the breach, and my father and I, who do, had our emails
exposed, according to haveibeenpwned.

[1]New_IDn is the column name because the original ID numbers tracked race in
Digit "A", and this fell away as apartheid was dismantled, so people born
before around 1990 had their original ID numbers reissued, and position A now
has an "8" for everyone.

[2]A good upper bound guess would be to look at the number of births in a
particular year/365/2 (for each sex)
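
As arithmetic, that bound looks like the sketch below. The function name is
made up, and the births figure in the usage note is a placeholder picked for
round numbers, not a real statistic:

```python
def max_daily_sequence(births_per_year: int) -> float:
    """Rough upper bound on sequence numbers used per day, per sex.

    Follows the footnote's formula: births in a year / 365 days / 2 sexes.
    Assumes births are spread evenly over the year and split evenly by sex.
    """
    return births_per_year / 365 / 2
```

For a placeholder of 730,000 births in a year this gives 1,000 per sex per
day, so sequence numbers far above that within one sex's range would be
suspicious under these assumptions.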

~~~
danielbarla
> The mystery of there being more ID numbers in the breach than people in SA

Do you have a reference for this? From what I've seen, Troy managed to restore
31.6 million records, and said previously that the full backup seems to have
45 million. The last national census (2011) put the (current, living)
population at 51.7 million, and rapidly growing. Granted, many of those are
not of home ownership age, but the ballpark is very close all of a sudden
(especially if you include the deceased).

There's also the question of whether those records are unique ID numbers; from
what I understand (based on discussions of colleagues who work with actual SA
deeds data), each record represents one status change in title deed of a
property. So, each property purchase you've made will be in there as a row,
with some people in there dozens of times.

Edit: I see he talks about 60 million rows; though I wonder if the statement
about uniqueness is correct. Since the table he refers to seems to have two
columns with the letters "ID" in it, I'd wager that the first one is a simple
autonumber, and the second one is the actual ID number. But who knows, he's
the one with the data.
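
For anyone who does have the data, the distinction is easy to check. A minimal
Python sketch, assuming the column values have been loaded into a list; the
function names and the "density" heuristic are my own illustration, not
anything taken from the actual schema:

```python
def summarize_column(values):
    """Compare the row count with the number of distinct values."""
    values = list(values)
    distinct = len(set(values))
    return {
        "rows": len(values),
        "distinct": distinct,
        "unique": distinct == len(values),
    }


def key_density(values):
    """Distinct values divided by the size of the range they span.

    An autonumber column fills its range almost completely (density near 1),
    while 13-digit ID numbers are scattered across a huge range (density
    near 0).
    """
    distinct = set(values)
    if not distinct:
        return 0.0
    span = max(distinct) - min(distinct) + 1
    return len(distinct) / span
```

A row count equal to the distinct count only shows the column is unique; the
density is what hints at whether it is an autonumber or a real ID number.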

~~~
SideburnsOfDoom
> Do you have a reference for this? From what I've seen, Troy managed to
> restore 31.6 million records, and said previously that the full backup seems
> to have 45 million

The reference for an update on those numbers is in the article linked above.
He later got all the data in.

From TFA: "My original import of the South African "Master Deeds" data didn't
complete. Just ran a complete one: 60,323,827 rows with unique gov IDs."

~~~
danielbarla
I noticed that just after posting; made an edit though. Since he's basing that
on a row count, I'm not convinced it's necessarily unique IDs. To be fair, I
don't have the data, and he does.

We do like our autonumbers here in SA, so I wouldn't be surprised, despite the
column name.

~~~
SideburnsOfDoom
The row count is puzzling, yes. There simply are not that many property-owning
South African citizens.

Some expats are known to be in that data. Then there are the rows marked
deceased.

But as others have pointed out, so many South African citizens (out of the
circa 55 million) are poor and are not property owners. The 60 million rows of
data haven't been comprehensively accounted for yet.

~~~
danielbarla
Yep. As far as I can tell, about 15 million of the population are in the 0 to
14 age group, so are highly unlikely to appear on a deed (not sure if it's
even technically possible). The total number of properties in the country has
got to be a fraction of that; the average number of people per home is
apparently at 2.2 [1].

Stats SA seems to report in the region of 400 to 500K deaths per year, so if
it was just down to deceased, it'd have to go back a fairly long way.
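
A back-of-the-envelope version of that in Python, using only figures quoted in
this thread. The function name is made up, and assuming every living person is
already in the data is deliberately generous:

```python
def years_of_deaths_needed(rows, living_people, deaths_per_year):
    """Years of deaths needed to explain the rows beyond the living population."""
    excess = max(rows - living_people, 0)
    return excess / deaths_per_year


ROWS = 60_323_827          # row count quoted from the article
POPULATION = 55_000_000    # "circa 55 million" from upthread
UNDER_15 = 15_000_000      # the 0 to 14 age group
DEATHS = 450_000           # midpoint of 400 to 500K per year

everyone = years_of_deaths_needed(ROWS, POPULATION, DEATHS)
adults_only = years_of_deaths_needed(ROWS, POPULATION - UNDER_15, DEATHS)
print(f"all living people in the data: {everyone:.1f} years")
print(f"only people 15 and older:      {adults_only:.1f} years")
```

That works out to roughly 12 years of deaths if literally everyone alive is in
the data, and roughly 45 years if only people aged 15 and up are.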

[1] - https://www.arcgis.com/home/item.html?id=582208ececa2424ab6e387d9cdcf01e3

------
coldcode
People's inability to understand basic security shouldn't amaze me anymore,
yet every time I see things like this I still wonder how many stupid things we
don't ever find out about.

~~~
Nomentatus
I suppose you mean that leaving an open-to-the-world webserver copy lying
about without even a password, much less encryption, was an idiot's blunder -
but that wasn't a legitimate site. It's way downstream of the actual
information heist, if I read the article correctly.

Nobody got phished. The information was originally gathered by some legitimate
institution, perhaps government, and then sold or stolen. I wouldn't be
surprised if Dracore was founded from the start by a leak from an insider who
figured out he could quit his govt job, take the database with him, and found
a nice business that would make money while he slept. Basic security by
individuals at home can't prevent such "inside jobs."

I once knew someone who was offered, over a few drinks, a million dollars for
a copy of a very small corner of a records pile that was part of their normal
work responsibilities, and said no; I'm less surprised that others might have
said yes.

What I call the "mass effect", that is, the fact that 60 million records could
fit onto anyone's phone (if it had an external chip smaller than the one in my
otherwise cheap phone) makes such inside jobs very hard to defend against; on
the "who will watch the watchers" principle. "Mass effect" so called because
information is losing mass extremely quickly; a whole lot of it now weighs
almost nothing. You can just saunter out the door with it.

------
jonboiza
I'm sure whoever started this got the main db set from DHA (probably
illegally). And that would quite likely be all citizens both alive and dead.

