
Two Australian University Students make a better Census in 54 hours - awoldes
http://eftm.com.au/2016/08/how-two-uni-students-built-a-better-census-site-in-just-54-hours-for-500-30752
======
iamshs
WooHoo!! They hosted their website on AWS. And totally ignored the
privacy-sensitive nature of the census. Also deployed Google Analytics. And as
with anything these days, added in a healthy dose of Trump. Neat work, guys.

BTW, this looks like a neat program name: "Bernd is studying Creative
Industries and Information Technology".

~~~
mistermann
AWS isn't capable of supporting confidential information?

I'd think I'd have read about that somewhere because it seems like a fairly
serious limitation.

~~~
shearnie
We've managed to get confidential data stored in Azure for a large financial
organisation, all compliances they required passed.

A server is a server right?

Problem could be cultural and generational. Some old school types like the
idea of a server without redundancies and fire prevention and fault tolerance
and high security clearance in a data center. They want it blinking its
lights at them in a cupboard on premises. They feel it is safe due to physical
proximity, even though their firewalls are like cottage cheese.

We're noticing that issue with health data for medical practices. Older people
are suspicious of cloud. The younger generation are more savvy.

~~~
manicdee
The younger generation are more gullible, lacking awareness of data security,
oblivious to data sovereignty, and easily distracted by squirrels.

An Australian census is never going to take place in hardware controlled by a
foreign national on foreign soil, especially not in a country where cloud
servers have been seized in the past for simply having the possibility of
containing data associated with a person associated with a crime.

One thing you will learn as you gain real world experience is that there is no
such thing as "too paranoid" when it comes to IT projects handling sensitive
data.

------
foxylad
Nice demonstration of smart system design, but the first item on the
Australian government's specification would be the use of a private Australia-
based system. The bureaucrats who wrote the spec will have limited
understanding of data security, and will believe building their own local
system costing an order of magnitude more than a cloud-based system must be an
order of magnitude more secure.

On a much smaller scale, we provide a school interview booking system used by
thousands of Australian schools, built on Appengine. Sadly Queensland's state
government has woken up to the fact that the data (names, email addresses and
booking times - not hugely sensitive information) is stored overseas, and
we're having to jump through all sorts of hoops to avoid a blanket ban.

Basically if your customers are not in the US, you'll be well aware of this
issue.

------
bootload
A bit of context for foreign readers. Every 5 years a mandatory national
census is undertaken by Australia's premier data gathering agency, the
Australian Bureau of Statistics (ABS). For over 100 years the census has been
completed on paper forms delivered and collected by government employees.

Political interference by prior governments cut funding and left the
department without a boss for 12 months. After a change of leadership and no
re-funding, the newly appointed ABS leader overpromises savings while
implementing a new online census without a backup plan. At the same time (this
is the real story) the same leadership decides to collect the names and
addresses of every person completing the census in order to link this
previously anonymous data.

New code, untested systems and an expected audience of 15 million people
logging on, most likely after dinner around 1930 EST. The system experiences
load, the site admins panic, make mistakes, then pull the server off-line. At
the same time politicians and other stakeholders deride ordinary Australians
who question the _name-gathering_ issue. Just before 1930 the PM reports on
Twitter that everything is good, yet as people try to enter the census the
site is down. The ABS continues to auto-tweet that the service is working. An
Australian gold-medal swimmer at the Olympics derides a Chinese swimmer
convicted of drug use. Australian politicians blame the Chinese for hacking.
[0]

72 hours later the service was still down. The census requires around 95%
completion to be useful. 10M AUD is spent and we have around a 50% completion.
The Australian security services investigated the alleged hack and found the
administration and contractor (IBM) to be most at fault.

An all round cluster-f*ck.

I saw this claim yesterday
([https://twitter.com/peterrenshaw/status/764991292828815360](https://twitter.com/peterrenshaw/status/764991292828815360))
and had a quick chat with Austin (SWE) and Adam (Organiser/AI & Neural
networks), asking them to write two summaries of what they did: a
non-technical one for the benefit of very non-technical press and political
wonks, and a technical article that explains the assumptions made and
addresses why the ABS may have technical reasons for not using third-party
computing services. As far as I can tell this article is the general
non-technical article by Adam.

[0] check my twitter feed for examples of articles explaining the buildup:
[https://twitter.com/peterrenshaw](https://twitter.com/peterrenshaw)

~~~
6nf
Good summary. The only thing I wanted to add is that the site was load tested
at 1 million submissions per hour, which is obviously not enough for the
evening peak time on census night.
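
For anyone who wants to poke at that number, here is a rough back-of-envelope sketch. Only the 15 million expected users and the 1 million submissions per hour come from this thread; the people-per-form figure and the share of forms arriving in the busiest hour are my own guesses, and the function names are made up for illustration.

```python
# Back-of-envelope check of the load-test figure quoted in this thread.
# From the thread: 15 million expected users, tested at 1 million submissions/hour.
# ASSUMPTIONS (not from the thread): people per form and peak-hour share.

TESTED_PER_HOUR = 1_000_000
EXPECTED_PEOPLE = 15_000_000


def peak_hour_submissions(people, people_per_form, peak_share):
    """Forms submitted in the busiest hour under the assumed parameters."""
    forms = people / people_per_form
    return forms * peak_share


def overload_factor(people_per_form=2.5, peak_share=0.4):
    """Ratio of assumed peak-hour demand to the tested capacity."""
    demand = peak_hour_submissions(EXPECTED_PEOPLE, people_per_form, peak_share)
    return demand / TESTED_PER_HOUR


if __name__ == "__main__":
    print(f"{overload_factor():.1f}x tested capacity")
```

Change the two guessed parameters to taste. The point is only that the tested rate has to be compared against the busiest hour, not the daily average.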

~~~
bootload
_"the site was load tested at 1 million submissions per hour, which is
obviously not enough for the evening peak time on census night."_

Yep. There were some other technical/spec insights, such as the router failure
[0] and, most importantly, the use of really poor techniques [1],[2] to ID
sensitive data and the potential sale of it. [3]

[0]
[https://twitter.com/riskybusiness/status/763583981749100545](https://twitter.com/riskybusiness/status/763583981749100545)

[1]
[https://twitter.com/peterrenshaw/status/763266008383578112](https://twitter.com/peterrenshaw/status/763266008383578112)

[2] "using publicly available data @TurnbullMalcolm yr #SLK581 is
URBCO241019542" ~
[https://twitter.com/peterrenshaw/status/763244867564601344](https://twitter.com/peterrenshaw/status/763244867564601344)

[3] [http://www.smh.com.au/federal-politics/political-opinion/census-meltdown-just-the-latest-bureau-of-statistics-bungle-20160809-gqow3e](http://www.smh.com.au/federal-politics/political-opinion/census-meltdown-just-the-latest-bureau-of-statistics-bungle-20160809-gqow3e)

------
rbobby
> How would it cope with a Denial of Service attack though? “Fine” – “it would
> have racked up a bill, but it would have survived”

Uhm... so an unbounded AWS bill... that doesn't sound "fine" to me.

------
hoodoof
Not much more was needed than an editable Adobe PDF form and a web server that
accepts POST data.
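
Something like the following is roughly what I have in mind for the server half, as a minimal sketch using only the Python standard library. It assumes the PDF form is set to submit its fields as a plain HTML-style (URL-encoded) POST; the names `FormHandler`, `parse_form` and `SUBMISSIONS` are made up, and it keeps everything in memory with no TLS, authentication or durable storage, so it is nowhere near census-grade.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

# ASSUMPTION: in-memory list for illustration only; real data would need
# durable, secured storage.
SUBMISSIONS = []


def parse_form(body: bytes) -> dict:
    """Decode an application/x-www-form-urlencoded body into a flat dict."""
    fields = parse_qs(body.decode("utf-8"), keep_blank_values=True)
    # If a field is repeated, keep the last value submitted.
    return {name: values[-1] for name, values in fields.items()}


class FormHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client says it sent.
        length = int(self.headers.get("Content-Length", 0))
        SUBMISSIONS.append(parse_form(self.rfile.read(length)))
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()


if __name__ == "__main__":
    # Hypothetical local address and port for trying it out.
    HTTPServer(("127.0.0.1", 8000), FormHandler).serve_forever()
```

Accepting the POST is the easy part; surviving census-night load and keeping the data private are the bits this sketch deliberately leaves out.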

