Teenager facing prison for downloading unsecured files from government website (cbc.ca)
793 points by eigenvector 8 days ago | 483 comments

This reads like the beginning of The Hacker Crackdown..

As a Canadian, reading this article made me angry. If the information is not supposed to be public, it should not be reachable without authorization or authentication.

Never mind a curious 19-year-old, there are tons of crawlers and indexers out there that attempt to enumerate URLs where they think there might be other content.

Shame on them for building a poorly secured site, but even more for trying to railroad a curious kid who made them look stupid.

Information accessible here was SUPPOSED to be all public, with sensitive information located elsewhere, however a handful of improperly redacted documents were also published to the site. Here's a more technical article: https://evandentremont.com/some-information-on-the-freedom-o...

So this is a classic case of a subcontractor doing substandard work, leaving a security hole big enough to drive a truck through, and then trying to sue someone to save face... Sigh.

If some of those documents weren't appropriate to be viewed by the general populace then the company was criminally negligent in their handling of the data, the "hacker" saw an open door with a sign reading "free information" on it and didn't know any better before running a crawler over the documents to grab them.

And the cries by old geezers in charge, yet clueless of what it is they’re “in charge of”, that he stole it are eye-rolling.

It’s ageism but at this point I’m pretty convinced old people should be term limited from office

The problems we seem to be facing are almost entirely due to their inability to move on

Youth shouldn’t spend their lives kowtowing to geezers that quit thinking and are simply peddling what’s become etched in their neurons and “the rules”

You should've stopped yourself at "it's ageism." Surely you wouldn't have had the audacity to be like "It's racism but at this point I think black people should be kept out of office."

The problems here are the usual problems within governments: venality, vanity, laziness, posturing and a certain bureaucratic indifference to human life. Which are traits well established among young and old alike.

Age isn't what's keeping anybody from understanding the technical issues here, either. A computer isn't some kind of magical oracle that only reveals its secrets to the young. Peel back the layers and I'll bet you $100 CAD it boils down to a failure of the institutional hierarchy and the communication therein. With uninformed-electorate-flavored sprinkles on top.

Young tech nerds rarely get voted into office because very few of them have done enough living to be able to assemble a coherent vision of reality and a way forward for an entire, let's say province. Don't get me wrong, a lot of successful older politicians do a shitty job of this too. But anyway, anyone who can do all that, or fake it well enough, quickly finds that messing around with computers is below his pay grade and not worth squandering his attention on. But they will still presumably have such people in their organization somewhere, and should be able to get from them a summary of WTF's going on and why.

EDIT: I guess what I'm even further saying is that this is yet another example of what's become a mantra of mine lately: The missing ingredient is almost never tech and almost always leadership.

> The problems here are the usual problems within governments:

The problem is with humans: you have those problems in courts, companies, charities, churches, clubs and small gatherings of people. The only protection we have against it is to recognize that this will always happen and strive to correct it before and when it has happened. I feel this is closely related to MeToo: you need to be able to ask for forgiveness, but often the people accused never really get what they have done wrong and belittle their actions.

It's not ageism. It's incompetence. If the content he was allowed to download sans-credentials was publicly available, the host/admin fucked up. If I do a wget for {random_url} and it returns a 200 range response code - that's not my fault - that's the admin. If I'm not supposed to see it, it should return a 403.

Not Canadian, but I expect this teen to be let off completely free.

I write this as I'm trying to recover my wife's website by scraping archive.org and respecting various 400 & 500 level errors.
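A rough sketch of the status-code convention described above, in Python. The function name `describe_response` and its four labels are made up for illustration; only the status codes themselves come from HTTP.

```python
from http import HTTPStatus


def describe_response(status_code: int) -> str:
    """Summarise what an HTTP status code tells the client about access.

    The labels returned here are illustrative, not part of any standard.
    """
    if 200 <= status_code < 300:
        # 2xx: the server processed the request and served the resource.
        return "served"
    if status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN):
        # 401/403: the server refused access to the resource.
        return "refused"
    if status_code == HTTPStatus.NOT_FOUND:
        # 404: the server has nothing to serve at this URL.
        return "missing"
    return "other"


if __name__ == "__main__":
    for code in (200, 403, 404, 500):
        print(code, describe_response(code))
```

A scraper that respects 4xx and 5xx responses would only keep documents for which this returns "served".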

> It’s ageism but at this point I’m pretty convinced old people should be term limited from office

Resource depletion from overpopulation aside, the lack of social progress we will see once we manage to extend life to the point of immortality is one of the more depressing outlooks I can imagine. Old ideas will never die.

Restoring neuroplasticity is a necessity for true immortality, so I'm not sure it will be much of an issue to be honest.

I'm not sure about age-limiting old people, but I'm starting to think a basic technological literacy test might not be out of place.

If there's two things people in tech seem to vastly underestimate it's how many old people are technologically literate and how many young people are technologically illiterate. I wouldn't be surprised if there are correlations but you're right, age itself isn't the issue.

I'm just surprised they didn't "redact" the documents by drawing black boxes over top of text in the Adobe Acrobat PDF editing tool... If their web presence is that clueless it seems like exactly the sort of thing that would happen.


Something similar happened in Australia last year.

Federal Government ministers all have their phone bill summaries released as part of public record with private information removed.


The contractor redacted the phone numbers in this instance by changing the font colour to white (same as the background) in the PDFs uploaded to the government disclosure website. So if you exported to text, or highlighted text, the phone numbers were clearly visible.

So most federal members of parliament, the former prime minister (Australia's equivalent of president) and opposition leader all had their numbers leaked.

A large percentage of people are still clueless these days.
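To see why white-on-white "redaction" fails, here is a toy model in Python. It is not a real PDF parser: `TextRun`, `visible_text` and `extracted_text` are hypothetical names, and the phone number is a placeholder. The only assumption is that a text run keeps its characters regardless of the colour it is painted in.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextRun:
    """A piece of text plus the colour it is painted in (toy model)."""
    text: str
    color: str


def visible_text(runs: List[TextRun], background: str = "white") -> str:
    """What a reader sees: runs painted in the background colour vanish."""
    return " ".join(run.text for run in runs if run.color != background)


def extracted_text(runs: List[TextRun]) -> str:
    """What text export or copy/paste yields: colour is ignored."""
    return " ".join(run.text for run in runs)


if __name__ == "__main__":
    page = [
        TextRun("Mobile:", "black"),
        TextRun("XXXX XXX XXX", "white"),  # "redacted" by recolouring only
    ]
    print(visible_text(page))
    print(extracted_text(page))
```

Recolouring changes what is drawn, not what is stored, so the second function still returns the hidden run.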

>Federal Government ministers all have their phone bill summaries released as part of public record with private information removed


Because they want a phone line that they can talk to their spouse and children without having to field thousands of calls from well-meaning people who want to wish them a good day?

Sorry, I should have written

>Federal Government ministers all have their phone bill summaries released as part of public record


So that the public knows they aren't making secret phone calls to lobbyists or what have you. The purpose of transparency is always to reduce corruption.

But they redact the phone numbers so how can you do that?

> who want to wish them a good day

I see what you did there.

Lmao. "Wishing to mean them a good day"? Really?

If anyone deserves to go to jail in this case, it's the contractor that took government money to implement a system that relies on security through obscurity when they knew that sensitive documents may be stored in it, and then screwed that up further by allowing documents to be accessed through nothing more than incremented IDs. It should be a basic premise of the law at this point in all civilized countries that if you can access a document by submitting a normal HTTP request with no authorization headers or cookies, then the server's owners intended that document to be public.

Or the low paid government employee who didn't bother to read security documentation and uploaded private documents to a system intended only for public documents.

"It’s very clear that the software is intended to serve as a public repository of documents. It’s also very clear that there at least 250 documents improperly stored there by the province. Documents that the province had a responsibility to protect, and failed."


I already play with urls that have possible id's in them out of habit. The only difference here is the poor kid was on a gov website and they did not want the information he found to be public. (Or, like in the USA, he might have to pay for each document accessed. But I feel that should not result in a raid on his house.) It is a shame the kid was their target and not a google bot like you said.

Googlebot has the resources to fight back against government attacks.

> I already play with urls that have possible id's in them out of habit

Do you jiggle door handles out of habit to see if they're unlocked? It's antisocial behavior. If you were supposed to have access to that document, it would be accessible from a link or search box on the main site.

Poor analogy. What he was doing (incrementing an ID to access a series of files) is more like leafing his way through a filing cabinet. A filing cabinet that was ostensibly filled with public-access files, and he was told he was allowed to be in the filing cabinet.

So while going to the filing cabinet to get the file he'd been directed to, he leafed through other files too. Why not? They're all public information, since they're sitting here in the unlocked filing cabinet with all of the other public information files.

Turns out some of them were mislabeled, and were private information in the public information filing cabinet.

Not so weird, not so antisocial, not his fault, shouldn't be his problem.

If someone told you that you could get a specific file from a filing cabinet, it would be antisocial of you to start flipping through the other files to see what was in there.

Here's another change-up in the analogy: it's like a public library, and you ask the librarian on where a certain book is located. They direct you to the book. You notice the other books on that bookshelf also may contain information relevant to your interests, so you check them all out.

Sure, it'd actually be like checking out every book in the library, but this is the age of the internet and it's an insanely useful skill to learn how to grep large amounts of text.

Also, the library was called the "Free Information" library.

IMO this looks like a company that did a poor job trying to cover their mistake by blaming "those hacker folks". I don't think it's inappropriate to confirm the kid was acting without malicious intent, but the subcontractor who set up the security for this site needs to be investigated thoroughly.

A link is not someone giving you permission, it's merely telling you where something else is. I can't think of how you even came to this conclusion. It's like you have this incredibly restricted view of the internet, limited to people clicking on a browser, and think that's enough for protecting files. It's not.

You don't seem to realize how bad of an idea this is. You're talking about making criminals of people. You know, I think I remember once reading about how chrome would try the address you type while you typed it (i.e. before pressing enter, it'd make a request for every character you typed). Users of chrome could become criminals because their software would do this.

I mean you are making a request using a uniform resource locator, and the web server is responding to that request.

Best analogy I can think up is an automated free vending machine, with a row covered up by a piece of cardboard. If you don't want someone drinking the cokes on the hidden bottom row, why did you put them in the machine in the first place?

I have a view of the Internet where “protecting files” has nothing to do with whether access to files is authorized or not. I shouldn’t have to lock my door, and I shouldn’t have to lock down my web server. (It may be prudent to do those things, but a trespasser shouldn’t escape penalty just because I didn’t do those things.)

There are understood conventions for when doors can/should be opened. There are also understood conventions for when it's OK to access a resource served over HTTP.

If the response code is 200, it's OK. The response code (not to mention the transmission of the file) is literally permission from the system to have the resource.

If you don't want someone to come in your door, don't put up a sign that says "come in."

If you don't want someone to see a resource at a URL, don't send them a 200 response code or serve the resource. That's the convention for the web.

And I shouldn't have to pay an attorney to write legal contracts, while we're on the subject of fictional, idealized, romanticized, and imaginary realities.

This conversation has devolved into arguing against the analogy. This is the internet: everything on it is public unless care is taken to make it not so.

You may choose to argue whether or not that should be, but that's the way it is.

Your view of the internet only applies to things that aren't the internet. There exists no real governance or real ownership on the internet. These things do exist in some capacity, for the most part, in the physical world within national boundaries. Even still, if this were the physical world and some house existed, with an open door and outside the generally agreed upon distinction of what private property is, then you'd bet your ass I'd walk in and snoop around. If the owner came by and said "Hey! This is private propertay. I'll have you arrested!" then he'd certainly have the right to do so. I could then argue that there was no reason to think that this was private property because the door was open and it looked like a public facility.

Trespassing is generally not a crime unless there is a clear indication that the person committing it should not go there, such as a fence or a sign or some form of explicit communication from the property owner. The fence or sign doesn't have to make it impossible for the person to access the property, but it does have to be there so that it is clear that they shouldn't enter the property.

The Internet should be treated the same. Anything put on the Internet should be presumed to be public unless there is some indication to the contrary.

In this case, most of the information he accessed was clearly intended to be public, so there was no reasonable way for him to know that there was some private information improperly co-mingled with the public information, so he can't be faulted for not realizing that he shouldn't have accessed some of the information.

Yeah, but if you staple a note to a telephone pole you shouldn't get angry if people copy its contents. To portray a simple web crawler as a "trespasser" is a poor analogy. Do you have any proof that the owner of www.zombo.com has given anyone permission to view its contents? If not, should people be persecuted if they visit the site?

Elsewhere in this thread, people have pointed out that Google has crawled (and cached) at least some of the pages that were supposedly criminally accessed.

Think of it like so: you have a robot that anybody can ask anything and that will answer any and all questions truthfully. Whose fault is it if you deliberately tell the robot non-public information?

Still not a perfect analogy. If the robot spews copyrighted content, the robot, its owner and the content receiver are now in trouble.

Why the content receiver? That would be like a musician suing viewers for listening to their copyrighted music playing in the background of Youtube videos. It's the responsibility of people disseminating content to ensure they have the right to do so. That's why file sharing cases focus on the sharing, not the downloading per se.

I was under the impression that possession is still punishable, even if it is not given the same severity as infringement

Edit: I mean not the act of listening, but the act of storing unlicensed material

I agree that you shouldn't have to lock your door, but a web server is nothing like a house. You should be expected to put a fence around your playground equipment in a grassy field if you don't intend for people to use it as a public park. Even more to the point, you should lock the utilities shed at the public park you run lest someone mistake it for the public loo.

The website had a "Freedom of Information" sign at the top....

It's not just a single file that you have been invited to access. This 'someone' has told the world that all of their public documents are in this filing cabinet. And here is how to find a few specific files. It's not a stretch to think that every other file in the (unlocked) cabinet is also public.

If the filing cabinet was labeled "public information" you bet I would.

I'm not saying that I agree with you, but I'm curious: does antisocial behavior mean it's illegal?

I completely agree with rayiner, and am a little concerned he’s being downvoted so heavily

But no, it shouldn’t be illegal. Yet what he said still completely applies to stuff like fiddling with ids on a site where you suspect it might lead to content you shouldn’t be able to access

Unless you’re whitehatting and plan to inform them of the security issue (probably anonymously because the world is fucked up and whitehatting can lead to jail time -_-)

Just because what you're doing is legal, doesn't mean you're not an asshole

If they're my door handles, then yes. If they're public doors, then yes. Why, just yesterday, I tried the door of my favorite coffee establishment where the Open light was on but, unbeknownst to me, they had closed early.

The kid got documents on a public facing server. He did nothing wrong.

A bit more complicated than that, but I am sympathetic to this person's plight. What complicates this is whether the website had a Terms of Use policy; if not, then outside of existing statutes I can't see how he is guilty. Even if there are terms of use, I think these are useless if I have not agreed before entering the site. All very confusing.

The blame truly lies on the government for allowing such porous security. They should be glad a seemingly benign teenager discovered their flaw and not some more nefarious actor.

Violating Terms of Use is not a crime [1].

1: https://www.eff.org/deeplinks/2010/07/court-violating-terms-...

...for now. It's pretty clear which direction the legal profession wants to go.

This example is not entirely equivalent. My understanding is that the opinion of the court was that authorisation (for the definition of "without permission") cannot be decided based on method of access. I.e. if you have granted a user access to data, you can't later say they accessed it without permission because they used a proxy or bot to access it (in violation of your TOS).

A terms of service can not define law, but it can make explicit what data a provider is authorising a user to access.

I agree, however it is worth noting that this link probably only falls within the jurisdiction of the U.S.

I don't see any terms of use anywhere else, and it looks like the site is down now, but the official link to this site describes it:

"The Access to Information website allows you to submit, pay and receive FOIPOP requests online. The Nova Scotia Government also posts responses to formal FOIPOP requests online on the Disclosure log. This is a free public repository of FOIPOP responses that have been approved for publication and have met a specific set of criteria (PDF file 800 KB)."

I also have a huge problem with the stance that violating the "Terms of Use" policy of a website can result in criminal charges when accessing publicly available information.

It's more like a bunch of books on a shelf at the library with one 'special' book, unmarked, in the middle where they charge you with theft if you grab it.

It... it is accessible from a link

That's what a url with an id in it is

a link

A "link" is a DOM element in a web page which references a URL, but a URL is not itself a link. To put a finer point on it: the fact that a URL is referenced in a link means that a user is supposed to see and access it.

Obscurity does not mean security. Just because the link wasn't referenced does not imply it is sensitive information.

Should you not have the freedom to type what you want into the address bar of a browser?

"Security" is irrelevant--there is no obligation to "secure" private property. Obscurity, on the other hand, implies that the property owner did not intend for people to access certain property. That is what matters.

Sequential numbers are not obscure. They're commonly used for pagination and collections, to the point where popular browsers have extensions to simplify navigating them.
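For what it's worth, stepping through a numbered URL takes a few lines of Python. This is a minimal sketch: `next_url` is a hypothetical helper and the example.com addresses are placeholders, not the site in question.

```python
import re


def next_url(url: str) -> str:
    """Return the URL with its trailing integer incremented by one.

    Zero padding is preserved, so ".../007" becomes ".../008".
    """
    match = re.search(r"(\d+)$", url)
    if match is None:
        raise ValueError("URL does not end in a number")
    number = match.group(1)
    incremented = str(int(number) + 1).zfill(len(number))
    return url[: match.start()] + incremented


if __name__ == "__main__":
    print(next_url("https://example.com/book/499"))
```

This is the same operation that the "next page" browser extensions mentioned above perform.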



> Obscurity, on the other hand, implies that the property owner did not intend for people to access certain property. That is what matters.

It implies the exact opposite. The owner may have intended it to be private, but making it publicly available, without security checks on a publicly accessible server, implies the property owner intended for people to access that property.

There is an obligation to secure sensitive information of people. Obscurity is not security, and so Nova Scotia was improperly storing the data.

The fact that Nova Scotia might have violated a separate obligation to secure sensitive information doesn't make accessing that information not trespass.

When accessing a document on the web, you ask the server if you can have it. The server then says "yes" or "no" based on a set of rules. In this case, he asked and the server said "yes".

This is like going to a library, asking the librarian if you can check out a book, being told yes, and then later being arrested because they meant to say "no".
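The "server says yes or no based on a set of rules" step can be sketched in Python. Everything here is hypothetical (the `DOCUMENTS` table, the `handle_request` name, the boolean `authenticated` flag); it is one way such a rule could look, not how the actual site worked.

```python
from http import HTTPStatus
from typing import Dict, Tuple

# Hypothetical document store: id -> (is_public, body).
DOCUMENTS: Dict[int, Tuple[bool, str]] = {
    1: (True, "approved FOI response"),
    2: (False, "unredacted draft"),
}


def handle_request(doc_id: int, authenticated: bool = False) -> Tuple[HTTPStatus, str]:
    """Decide whether to serve a document, returning (status, body)."""
    if doc_id not in DOCUMENTS:
        return HTTPStatus.NOT_FOUND, ""
    is_public, body = DOCUMENTS[doc_id]
    if not is_public and not authenticated:
        # The server says "no": the rule is enforced at request time.
        return HTTPStatus.FORBIDDEN, ""
    # The server says "yes" and hands over the document.
    return HTTPStatus.OK, body


if __name__ == "__main__":
    for doc_id in (1, 2, 3):
        status, _ = handle_request(doc_id)
        print(doc_id, int(status), status.phrase)
```

If a private document is stored with `is_public` set to true, or the check is missing entirely, the server answers 200 and the client has no way to tell the difference.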

Couldn’t the same analogy be used if I left my front door unlocked? The door would happily say: “yes, you may enter” to anyone trying the handle.

I think the real question here is: did the website provide enough information for the user to have been assumed to understand that what they were accessing wasn’t meant to be public (e.g. did the door look like a door to a private property)? And did the user cease to access the data once they understood it (e.g. did they close the door and leave)?

That analogy would be more accurate if you also operated a cafe out of your living room, with a big "open" sign on the front door, and someone accidentally used your personal bathroom because you failed to stick a "private" sign on it. In that case, it would be unreasonable to sue someone for trespass.

Web servers are not houses. They are implicitly public, whereas houses are implicitly private.

Totally agree with this. If the web resource is available to anyone, then how could you be mad that someone saw it?

This isn't even the case though. The site specifically says that the documents, all of them, are public. They just happened to have noticed they screwed up and published some they shouldn't have, and are now going after someone who downloaded the entire set. This was not prohibited in any way that was documented on the site, and the language that was on the site made it sound like it was allowed. Since the theme is analogies, how about this one: someone puts a box labeled "free stuff" in front of their house. You dig through it and find something of value. You take it and go home. A few days later your house is raided because the owner of the house put in something valuable by mistake and is now claiming they never intended for anyone to take anything that wasn't visible from the street. Since you dug through the box, they are charging you with theft.

I would agree if it took an actual exploit to obtain the information.

You're correct, but I strongly suspect this is a case of the government trying to deflect blame from their horrendous security to "those young hackers" and that this whole situation could probably be resolved peaceably by ensuring the data in question is deleted from the dude's computer.

Since when is "intention" legally enforceable?

A link is nothing of the sort, it's a colloquial term that connects one point to another. It is not unique to HTML or programming in the slightest. In those terms a URL is always a link because the only practical reason for inventing the concept of a uniform resource locator is to link one thing to another without them coming into conflict. There is no reason for a URL to exist except for it to be a link and it has nothing to do with the DOM.

Sounds like you might be thinking of an anchor element. People ask for the "link" when they mean the URI all the time.

Wikipedia disagrees:

> In computing, a hyperlink, or simply a link, is a reference to data that the reader can directly follow either by clicking, tapping, or hovering. A hyperlink points to a whole document or to a specific element within a document. Hypertext is text with hyperlinks. The text that is linked from is called anchor text. A software system that is used for viewing and creating hypertext is a hypertext system, and to create a hyperlink is to hyperlink (or simply to link). A user following hyperlinks is said to navigate or browse the hypertext.

The URL in the abstract is not a "link." A link is an element in hypertext.

You're just being stubborn. Link is used to mean a URL by the layperson. For example, I don't think I've ever heard my family say "URL" in their life.

a "link" is something that connects two entities together in some arbitrary way. a chain link for instance connects the constituent entities of a chain (other links) together. a link in a linked list connects different nodes together. and a hypertext link connects two hypertext resources together. what you're doing is picking one specific definition and applying it liberally to all potential interpretations and contexts of said word. this is why i tend to espouse the virtues of generality over specificity; when you get too specific, you start eliminating the actual utility and flexibility of language. a link in terms of computing is most certainly NOT just a hypertext link, that is only one very specific interpretation of the concept at hand that you're falling back on to try and further your argument. liberally ignoring all the other possible interpretations is quite intellectually dishonest imo. still, you're entitled to perceive things how you want and argue it whatever way you desire. just know that the majority of reasonable and educated individuals will disagree with you. you're up shit creek and you keep denying every paddle that's offered to ya m8

If it helps, the Oxford English dictionary has updated the definition of “literally” to literally include “used for emphasis while not being literally true.”

Whether we like it or not, language evolves. If “literally” can, then literally any other word can too.

"wikipedia disagrees" like it's a source?

You're right - a URL is an address, and a link is a sign saying 'go here'. The address still exists with or without the sign.

Let's say you are reading a free manual online and every page is stored as a different webpage with a common URL and only the last number changing (e.g. page 1 is book/1, page 2 is book/2, and so on). The only way to go to a certain page is with a "next page" link from the previous one. You read the index at page 1 and see that you can't find the information you need, but page 500 is missing, a printing error in the index perhaps. Instead of clicking 499 times on the page links to see if the information you need is on page 500, you simply change book/1 to book/500. OH NO, the link on page 499 brings you to page 501 and it's not in the index because page 500 contains the truth about aliens, so now you read it and are going to prison.

That's in a nutshell what happened

Imagine a government information bureau.

The employees who work there have access to all information. Why wouldn't they? They work at the information bureau.

They also answer all questions. Why wouldn't they? They work at the information bureau.

You walk up and ask questions. You get answers. Three days later, you're arrested for knowing state secrets.

Do you see a possible problem with this arrangement?

Door handles should be actively jiggled on the regs. If anything it's social behavior. Maybe you'll walk in on an orgy and get to join in!

That's what I'm going to tell the next inhabitant I meet while jiggling knobs!

That you were just hoping to find an orgy? I fully support this endeavor.

Instead of effectively designing and administrating their systems, "the powers that be" are taking the "rubber hose, lead pipe" approach. Creating in "law", what they are too distracted and incompetent and egotistical and tight-fisted (tech is a "cost center") to create in fact and systems.

They are, in essence, trying to redesign the nature of the Internet by fiat.

Historically, accessible resources are accessible. Public by their very nature. If you don't want them publicly accessible, implement (effective) authentication.

But this is far too much "trouble" for self-important, "do it my way" lawyers and executives.

So, they "define", or redefine -- in a pure fiction of language -- what is "public" and "private".

Aside from all else, these... "errors in the system" should be routed around. Denied use of the system.

Unfortunately, in this regard, tech has ended up in the position of working for them, rather than vice versa.

Personally, I won't work for them, anymore. Every thing I do for them, is against my own interests and, I've come to believe, the common good.

Don't forget to mention: they absolutely do not have a democratic mandate from the Canadian public to behave in this manner. Our government in many ways is getting to the point where it can legitimately be considered a rogue, adversarial actor in our country, and unlike in America, we don't have a second amendment as a last resort to protect us.

^^^^^^^ [edit] clarifying to point out that I fully agree and have nothing more to add

Is there a Canadian version of the EFF, or does the EFF have a Canadian branch?

A couple years ago, I asked Michael Geist[1] a similar question.

Me: I am a Canadian citizen and am rather concerned by what appears to be increasing online censorship and erosion of privacy rights in Australia and the UK. If it can happen there, it can happen here. I'd like to do my part to ensure that we can effectively oppose bad public policy when it's proposed in Canada. I deeply respect both the EFF and the ACLU for their work in the United States, but I'm not familiar with equivalent organizations for Canada. Do you have any suggestions as to where should I be sending my holiday donations?

Michael Geist: Thanks for your note. There are several groups in Canada that do great work on these issues:

1. CIPPIC - the Canadian Internet Policy and Public Interest Clinic (cippic.ca). I founded this tech law clinic at the University of Ottawa, the only one of its kind in Canada.

2. Open Media - based in Vancouver

3. CCLA - the Canadian Civil Liberties Association

4. CJFE - Canadian Journalists for Freedom of Expression

5. BCCLA - BC Civil Liberties Association

All do great work with limited resources.

[1]: http://www.michaelgeist.ca/

For ccla, cclet is their education arm eligible for a charitable contribution tax credit.

Not exactly, but the University of Toronto Citizen Lab is pretty close: https://citizenlab.ca/

It's not all domestic oriented, they do a lot of research on internet censorship internationally, and other things that fall into the category of "government interference with the internet".


"The Citizen Lab is an interdisciplinary laboratory based at the Munk School of Global Affairs, University of Toronto, focusing on research, development, and high-level strategic policy and legal engagement at the intersection of information and communication technologies, human rights, and global security.

We use a “mixed methods” approach to research combining practices from political science, law, computer science, and area studies. Our research includes: investigating digital espionage against civil society, documenting Internet filtering and other technologies and practices that impact freedom of expression online, analyzing privacy, security, and information controls of popular applications, and examining transparency and accountability mechanisms relevant to the relationship between corporations and state agencies regarding personal data and other surveillance activities."

Possibly https://cippic.ca/

They helped Geocoder.ca with their 4 year lawsuit: https://geocoder.ca/?sued=1

It really needs one. Cory Doctorow went to work for the EFF despite being Canadian probably because the Canadian analogues are pretty weak and disorganized by comparison.

Open Media is pretty good at social media and pressure campaigns.

The BCCLA is top notch if you actually need to win a court case. When they take on a case they don't fuck around, and they have high powered lawyers working pro-bono for them.

The great thing about the EFF is they're not as narrowly focused. Open Media is pretty good, but they pick their fights more carefully and don't tackle social or ethical issues the same way the EFF does.

i can see how the older generation is thinking though, they see it like leaving a window unlocked doesn’t mean you can

the laws are interpreted and applied by powerful people in a way that suits the way they think - that much i think could have been predicted (but not by a teenager)

did the weev ruling surprise anybody other than hackers?

It feels like the reduction of a nuanced problem into a simple one with a single victim and a single perpetrator, and only one acceptable narrative.

"It's not my fault I left my window open and you took advantage of it. I shouldn't have to keep my windows locked."

"If you see an unlocked window it's not an opportunity for you to take advantage of."

That's admittedly fairly obtuse, but you can see elements of this play out even in this thread, where it becomes a debate about the accuracy of the metaphor and not a discussion of the actual problem. It is so much easier to attack the language than it is to dig out the real concerns and talk about those, so you get a pro and anti situation or semantic nitpicking.

If you agree that the older generation can only think in the metaphorical sense and are practically stuck in the 50s with how they describe it, you also have to accept that it is so far out of whack with reality that it causes continual debate about what actually does reflect the situation and completely detracts from the problem at hand. It is beyond idiotic and it's symbolic of an unhealthy resistance to change.

As an aside, this happens in places like HN and Reddit all the time. A metaphor is introduced into a discussion and it totally derails it, and you're not talking about the source material any more, you're talking about the metaphor and how it can be more accurate. It's like the metaphor is more important than the problem itself sometimes and it's intensely sociopathic, because the linguistics take priority over the humanity.

It's not necessarily a problem that the older generation can only think about technology in a metaphorical sense, the problem is that the metaphors they are using are idiotic.

I think, by and large, people are constrained to thinking about things they can describe. To that extent, being able to accurately describe something is meaningful, and is therefore a linguistic issue.

Semantics are very important when you are dealing with minutiae, and the law hinges on comparisons and extremely complex semantic arguments.

To that extent, it makes sense that we argue about the metaphors.

No, OP has a point, we shouldn't talk in metaphors so much... It's not necessary.

The web is in many ways a huge collection of resources that reference each other. Some of these references are explicit in links, others in text, and some are available for programmatic access.

In fact many resources can be discovered by programmatic access, and there is no inherent reason to think this is wrong. Just because an API isn't documented doesn't make using it illegal.

For example, many URLs are actually permalinks, you can bookmark them, or send them to a friend. While most websites don't document this API, it's very common.

Lots of people configure search keywords in Firefox by injecting queries in bookmarked URLs. Few of these URL patterns are formally documented, but that doesn't make their usage illegal.
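To make the search-keyword point concrete, here is a minimal Python sketch of what a browser does with a keyword bookmark: it substitutes the typed query into a URL template. The bookmark table and the `expand_keyword` helper are hypothetical names for illustration, and the two URL patterns are just examples of widely used but mostly undocumented query URLs, not anything taken from the article.

```python
from urllib.parse import quote_plus

# Hypothetical bookmark table: keyword -> URL template with a %s placeholder.
# The templates are illustrative examples of common query-URL patterns.
BOOKMARKS = {
    "w": "https://en.wikipedia.org/w/index.php?search=%s",
    "hn": "https://hn.algolia.com/?q=%s",
}

def expand_keyword(typed: str) -> str:
    """Turn 'keyword some query' into a full URL by filling in the template."""
    keyword, _, query = typed.partition(" ")
    template = BOOKMARKS[keyword]
    # URL-encode the query so spaces and punctuation survive the trip.
    return template.replace("%s", quote_plus(query))

print(expand_keyword("w nova scotia"))
# https://en.wikipedia.org/w/index.php?search=nova+scotia
```

Nothing here relies on the site publishing its URL scheme; the pattern is simply observed from the address bar, which is the point being made above.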

> In fact many resources can be discovered by programmatic access, and there is no inherent reason to think this is wrong.

Agreed. There is no inherent or intrinsic reason to expect that any given document or any given URL ought to be restricted. However, a look at the documents could have provided some extrinsic reason to stop looking. For example, if I find a filing cabinet full of classified documents, I will not continue leafing through them after I see the first one. I will stop immediately and notify someone appropriate (after contacting my lawyer). I do not intend to access classified documents.

The question is one of intent. Did the individual intend to access documents that they knew or should have known that they ought not access? If the kid pulled down one classified document, took a look, realized what he was looking at, and deleted it and notified the authorities, then I'm with the crowd. Likewise if they pulled down the entire archive without looking at any of them. I'll be on the front lines with my pitchfork.

On the other hand, if they saw the first classified document, then pulled down the rest of the trove hunting for more, some amount of punishment is probably warranted. Even then, I would say fifteen years is too much. Maybe a few months of time and probation, depending on exactly how much willfulness was on display.

I imagine these are the kinds of questions that will be resolved during the trial.

> if they saw the first classified document, then pulled down the rest of the trove hunting for more..

Per the articles: nothing was "classified", it was an archive of public documents that the government published periodically. The issue is that a small subset weren't redacted properly - but there's no apparent reason the teen would have known that.

It appears that someone simply archived a bunch of documents they reasonably believed to be public information.

In that case, hopefully, and probably, justice will be served.

ah yes the age old problem of actus reus sans mens rea

> In fact many resources can be discovered by programmatic access, and there is no inherent reason to think this is wrong. Just because an API isn't documented doesn't make using it illegal.

The license to access private property is based on the intent of the property owner. Where the intent is made express (through a sign), that governs. Where the intent is not made express, we try to figure out what a reasonable person would infer about the property owner's intent.

The method of access, therefore, is relevant insofar as it tells you about what the owner of the web server intended people to have access to. The fact that content on a website meant for the public to access is only accessible by "programmatic" means that ordinary users would not know, is strong evidence that the owners of the web server did not intend for people to access those documents.

> The fact that content on a website meant for the public to access is only accessible by "programmatic" means that ordinary users would not know, is strong evidence that the owners of the web server did not intend for people to access those documents

Sorry, that is complete BS. Have you scanned the entire internet and made sure there are no links to these files on other public pages?

Files publicly hosted by a web server (software explicitly designed and installed to make those files public) are in no shape or form private property.

Furthermore, in this specific case, there is an explicit statement saying the files are public and saying nothing about them not being accessible:

"The Access to Information website allows you to submit, pay and receive FOIPOP requests online. The Nova Scotia Government also posts responses to formal FOIPOP requests online on the Disclosure log. This is a free public repository of FOIPOP responses that have been approved for publication and have met a specific set of criteria (PDF file 800 KB)."


I can see where you're coming from but you're also describing the purpose of an API, documented or not. Ultimately, if you want to secure the boundaries of your property (whether that's your app or your domain or your honest to god physical land) it's up to you. If you find yourself in the position where other people are revealing information you or your company should have protected then you are accountable. You have to be accountable. The guy who found the problem or abused it is not accountable.

> The guy who found the problem or abused it is not accountable.

This is clearly wrong.

If I forget to lock my door when leaving my house one morning it's still trespassing if you enter the house without my permission.

> If I forget to lock my door when leaving my house one morning it's still trespassing if you enter the house without my permission.

This doesn't seem like an apt analogy for the actual case described in the article, though. That case seems more like: you left a bunch of stuff at the curb with a sign that says "free for the taking", but didn't realize that you left some stuff there that you actually didn't want taken.

in this context i don't think your metaphor applies. your house isn't a freely accessible entity that has been declared as such which has certain areas of it that are off limits but never clearly defined as just that.

So what you are saying is that what matters is the intent of the property owner and the general understanding of the population about what is typically public vs private property?

i would say that those things are certainly relevant, but it would be perhaps very myopic to say they are the only relevant concepts. one should also keep in mind the intent of the individual attempting to access the property, and the perception of what they're doing when they do within their mind as an expression of that intent, for instance. if i have been led to believe that your property is public through a means that is debatably authoritative, such as a sign indicating it as such, then for all intents and purposes, anything i interact with inside of it in my mind is perfectly acceptable because it falls within the context of pre-established permissions. and if there are parts of your property that for some reason you would prefer private but they are not demarcated as such, am i really in the wrong to interact with them or go inside of them if i have no external information telling me that i should not?

in the context of the article and the problem at hand, the teenager downloaded a bunch of things that were supposed to be public access, but also accidentally downloaded some things that were confidential though not clearly marked as such (im assuming based on available information) and so the only real way he would have known they were confidential was if he actually perused the contents of them. that would be like having a room that is private property and off-limits, and it is marked as such, but the marking is inside the room and can only be seen by entering it and as such violating the private nature of said room. but really these are all just my thoughts on it and i certainly don't think i'm right or anything. it's just fun to talk ya know

I hear what you are saying about the intent of the actor, but that's tricky. What if the actor doesn't believe in private property? What if the actor accidentally misreads "private" as "public."? These things aren't really mitigating.

This is why the law tends to fall back on what a hypothetical "reasonable person" would think.

I'm not as much of a hardliner as rayiner on this particular case as I think there are some facts in favor of thinking of these documents as public:

- it was a government website

- it was specifically set up for the purpose of sharing foia requests

- the data in the documents was not easily identifiable as private

But when it comes to the general principle where some HNers seem to think "If the webserver responds with a 200 then it's perfectly fine." I have to disagree.

Imagine a different scenario in which we were talking about tax returns instead of foia requests. You're looking at yours at http://www.canadataxes.com/return?id=1234 and you realize that if you inc the ID you get the tax return of some other random Canadian citizen. In that case it would be immediately obvious that someone had made a mistake and you were accessing information you shouldn't. A "reasonable person" would understand that a mistake had been made. It would then be clearly illegal to write a script to scrape down the docs for every ID.

But what if someone had just moved to Canada from Norway and didn't realize that tax return information was supposed to be private? I think it's fair to put most of the responsibility on those who have implemented a system that serves private data in response to public http requests. Presumably this is the party that has both been entrusted with the private data and has the technological and legal resources to understand what can and cannot be served publicly and appropriately configure their systems to do so.

Those that move from one country to another are responsible for learning the laws and customs of the place they are moving to.

No doubt. I still would consider the implementors of said contrived-example-website to be far more liable than our hypothetical confused batch downloading Norwegian friend. Just like I don't yell at my toddler when she knocks over the glass of water I left on the floor.

I agree that the Canadian government has been negligent in the situation under discussion. It's not an either or situation though. Both can be guilty at once.

Let's say he picked his first random URL and got someone's tax documents. A reasonable person would probably report it and quit prying. But it sounds like the bad documents were mixed in with good documents that he was downloading programmatically, so a reasonable person probably would not have the time to read every single document to determine if it contained personal information.

Yes, that's what I was getting at when I said:

"the data in the documents was not easily identifiable as private"

as a mitigating factor.

I think that's a terrible analogy, a better (albeit not perfect) one in my opinion is:

If I request access to your house (send a HTTP request) and you grant me access (give me whatever I was requesting), I don't think I should be arrested for trespassing.

If I install one of those new fancy IoT locks on my door but it has a bug and unlocks for you when it shouldn't, it's still trespassing if you enter my house.

Yet if you forget to lock your car doors or leave valuable items in plain sight on the back seat, the insurance company will not honor your claim.

You're right, it is. If you keep forgetting to lock your door then the situation changes.

It's not the lock that makes it illegal. It's the act. Somebody reaches into your pocket to take your wallet, it's not 'ok' because you didn't lock your pocket, is it?

We're not yet to the point where it's the victim's fault when victimized by a criminal. It may seem that way when there are so many active criminals. But some places it's still possible to trust your neighbors. I live in one.

That's not true at all.

I grew up in a relatively small town. Literal years went by where my parents didn't lock the back door. It would have been illegal for someone to enter the house without their permission for that entire time period.

Isn't the assumption always that accessing a web page programmatically is allowed unless specifically asked not to in robots.txt?
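For anyone unfamiliar with the convention being asked about: robots.txt is an advisory file that crawlers are expected to consult, not an access control and not a statement of law. A minimal sketch of how a polite crawler checks it, using Python's standard `urllib.robotparser`; the robots.txt content, the paths, and the `example-archiver` agent name are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# parse() takes the file's lines directly, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

def allowed(url: str, agent: str = "example-archiver") -> bool:
    """Ask whether the given user agent may fetch the URL under these rules."""
    return parser.can_fetch(agent, url)

print(allowed("https://example.org/docs/1234"))     # no Disallow rule matches
print(allowed("https://example.org/private/1234"))  # matches Disallow: /private/
```

A site with no Disallow rule for a path gives a crawler no machine-readable signal to stay away from it; whether that settles the legal question is a separate matter.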

> Where the intent is not made express, we try to figure out what a reasonable person would infer about the property owner's intent.

We do. But if we fail to do so correctly, as people occasionally will, have we committed a serious crime for which we should be facing prison? Particularly in a case where societal custom is not well formed, and analogies to more familiar situations are all strained?

This young man, by his account, likes to archive stuff he finds on the Web. From the sound of it, he's done URL incrementation many times, and this is the first time he's gotten in trouble for it. Let's suppose for the sake of argument that that's true, and also that there really were no indications on this site that the information was unintentionally left accessible. Do you really want to send him to prison for that?

You might reply that as cruel as that seems, its deterrent value would make it worth doing. But I don't even want to live in a world where people without criminal intent are so successfully deterred from experimenting with the Internet. In such a world, site owners would take even less responsibility than they do now for securing their information, and therefore actual criminals would have even more unfettered access to it.

What is significantly more programmatic about manually incrementing a document ID by 1, than viewing a website in the first place?

Maybe we need to shift the metaphor.

A situation like this (security wise) isn't like leaving a window unlocked and having someone rob your house it's like

1. Leaving a pie on the window sill overhanging the side walk with paper plates and plastic utensils beside it.

2. A man knocking on your door, asking you for your bank account number without impersonating anyone of authority, you offering it up freely, then suing the man because you forgot to ask who he was first.

The problem is the metaphor itself. The metaphor avoids the problem by providing a more trivial or palatable debate, it takes the attention away from the teen in trouble to the definition of the problem and at that point you've stopped caring about the person, you're caring about the problem.

This guy facing prison doesn't give a shit about it feeling like a man stealing a pie from your window. It's nothing close to that because you can steal pies from windows and be held accountable in a much more reasonable way, and trying to reframe the situation only helps to an extent.

Reframing is a favorite tool of authoritarians. They just turn a knob and the new frame gets parroted by the media for weeks.

On reflection I think the metaphor itself is a red herring; we're always using them because they serve a great purpose. Tell an aspiring developer they can use goroutines to build a decently concurrent app, without much fuss, and they'll be delighted because the language provides a metaphor for it (way beyond a simple abstraction); tell them they have to understand mutexes, locks, processes, forks, and you've lost them.

What I see is a tendency towards the metaphor because that abstraction itself poses a challenge on top of the original one and the original problem itself is less interesting than the linguistic magic layered on top of it. You can talk at length about how bad the dog > animal OOP example is but you won't have much to say about OOP without that.

It's basically bike-shedding.

i don’t agree with any of that. powerful people choose to interpret law in the way most favorable to their power

wolf doesn’t care about the reasoning of sheep so long as they submit

And you're being the sheep who submits. What would you say otherwise?

It's more like leaving a bunch of manhole covers open on a public street and suing someone for falling in.

FFS if I go to https://www.booking.com/city/ie/cork.html it loads fine. Apparently I'm breaking the law if I use my criminal-mastermind hacking skills to ALSO go to https://www.booking.com/city/ie/dublin.html

It's just ridiculous.

It is not quite like that though.

More like, All items on this table are free.

So far, so good.

Then someone included a couple that aren't free.

Writing a line of code to fetch a batch of info is ordinary to a literate user.

Putting some burden on him to understand that has happened is a very hard sell to me.

It's more like having a 'Please take these unwanted items' bin and sending people to prison when they take something that you mistakenly put in the bin.

A HTTP server may as well equate to a public building with a sign outside saying "Come in and take whatever you want from any room at any time, unless the room's door is locked!" (locked = authentication)

The Nova Scotia govt forgot to lock one of those doors and are now furiously trying to shift blame. I haven't seen them apologize or address their own internal incompetence in any of the news articles I am seeing about this.

"These dang kids these days just don't have a sense of responsibility!"

When somebody uses the "leave the window unlocked" comparison, I try to rephrase it as "imagine you have your satellite TV with thousands of channels. You browse through channels and realize that, on channel #12985 if you press the A button on your remote, the channel will broadcast you videos that were supposed to be private". This is probably a closer analogy as you are not physically trespassing, you are "committing the crime" from your couch, you are not "stealing private property" (as, the docs are still on the owner's servers) and it's mostly on the broadcaster side to transmit sensitive information on a channel that is for everyone to see.

Yeah except it's a public space – So more like the library doors were unlocked.

Since it helps the older generation to think about digital content in metaphors, I'd argue that the kid was entering through an open window of a public building. Although a bit strange, no one would give this hypothetical person a third look.

I think the most precise metaphor is: a kid walked through the front door of a public library, borrowed a couple freely available books, then the government realized those books mistakenly included sensitive information.

In order to address that error, 15 police officers raided the kid's house.

That would be an accurate analogy if these documents were linked to from a publicly-accessible portion of the site. They were not. This is more like someone walking into an unlocked back room and grabbing books that hadn't been shelved.

These analogies are not helping. Here's what actually happened: the accused allegedly sent requests to a web server asking "may I please look at the document with id X?" for various values of X. Each time the web server had the option to say "no, you may not", or even "no, that document doesn't exist." Instead, it responded each time by sending the requested document.

That's all that happened: someone used HTTP in the way it's intended to be used, and inferred quite reasonably that the people who set up that web server knew what they were doing and meant to set it up that way. It turns out those people didn't know what they were doing, and they got embarrassed about it.
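The exchange described above can be sketched with a throwaway local server. This is a toy under stated assumptions, not a reconstruction of the actual site: the `/doc/<id>` path, the `DOCUMENTS` table, and the `is_public` flag are all hypothetical. It only illustrates that a server answering a request for document X has the option of replying 403 or 404 instead of 200.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical document store: id -> (is_public, body). Not the real site's data.
DOCUMENTS = {
    1: (True, b"public response"),
    2: (False, b"unredacted response"),
}

class DocHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expect paths of the form /doc/<id>.
        try:
            doc_id = int(self.path.rsplit("/", 1)[-1])
        except ValueError:
            self.send_error(400)
            return
        if doc_id not in DOCUMENTS:
            self.send_error(404)  # "no, that document doesn't exist"
            return
        is_public, body = DOCUMENTS[doc_id]
        if not is_public:
            self.send_error(403)  # "no, you may not"
            return
        self.send_response(200)   # "here is the document"
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

# Ignore any proxy settings so the request really goes to localhost.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

def status_for(base_url: str, doc_id: int) -> int:
    """Request one document by id and return the HTTP status code."""
    try:
        with opener.open(f"{base_url}/doc/{doc_id}", timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

server = HTTPServer(("127.0.0.1", 0), DocHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"
statuses = {doc_id: status_for(base, doc_id) for doc_id in (1, 2, 3)}
server.shutdown()
server.server_close()
print(statuses)  # {1: 200, 2: 403, 3: 404}
```

From the client side the three requests are identical in form; the only thing that differs is what the server was configured to answer.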

The computer is not a person and what it does only matters insofar as you may infer that the owner of the property programmed it to do what the owner intended.

As you admit, the property owners did not intend those documents to be accessible. So the only relevant question is: would a reasonable person infer that documents which could only be accessed by editing a URL (by "tricking the HTTP server," if you insist on anthropomorphizing a dumb machine) were intended or not intended to be accessed?

I think most people would assume that documents that can only be accessed by editing an ID were not meant to be accessed. And that really is the end of the analysis.

> "tricking the HTTP server,"

I don't think you understand the web. I'm not anthropomorphizing anything. He literally sent a request for each document he wanted to look at and the server sent a response.

You keep referring to this hypothetical "reasonable person" who doesn't understand the very basic facts about technology, but the opinion you attribute to the "reasonable person" is just one you invented that happens to match your own.

> I think most people would assume that documents that can only be accessed by editing an ID were not meant to be accessed.

How would anyone know if the documents could only be accessed by editing the URL? Others in this thread have pointed out that some of those documents were indexed by Google, so actually, editing the URL is not the only way to get to them.

I disagree, because the analysis is faulty.

Computers always do what you _tell_ them to do, not what you want them to do.

The onus for keeping computerized material private is on the owner, and the owner screwed up royally by wrongly allowing sensitive material to be placed unprotected on a _public_ web site. Whether or not it was indexed is irrelevant - it was on a publicly accessible site, permissions set to publicly accessible, and the entire site was meant to be publicly accessible. One can close the analysis until the cows come home, it will not change this fact.

Accessing that material is as illegal as finding a diamond ring (or personal files) while dumpster diving. Dumpster diving may be seen as tasteless or low-class, but as far as I know, it’s not illegal.

Do we prosecute reporters for ferreting out publicly available, yet embarrassing, information?

Dumpster diving is legal (in most places but not all) because the owner has, by putting something in the trash, expressed their intent to not own the item in question anymore.

A website isn't a trash can though.

If I accidentally leave a diamond ring (or personal files) in public somewhere and you take them that is absolutely theft.

A web server is a thing people use to make files publicly accessible - it has no other purpose. It has stronger expectations against privacy than a trash can.

As such, your analogies to situations (locked houses, unattended jewelry) with the opposite expectation just disprove your point. Assuming a file is private even though it's publicly accessible on a web server is as nonsensical as assuming an object is free for the taking even though it's an unattended diamond ring.

I mean, it's in the name, web server. It serves things to people when people ask:

   - Hey, can I GET this drink?

   - 200 OK, here it is pal. 
   - 204 Uh, the bottle appears to be empty
   - 206 I have only half the ingredients for the mix
   - 300 Stirred or shaken?
   - 301 That drink is now called this, but here it is!
   - 400 I can't understand what you say buddy, are you drunk?
   - 403 I'm sorry, but I must refuse to serve you that drink
   - 404 Oops, I can't seem to find the bottle
   - 411 How much do you want?
   - 413 That's too much drink!
   - 418 I'm actually a teapot 
   - 503 Too busy right now!
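For reference, the official reason phrases behind the codes in that list can be looked up from Python's standard library. A small sketch; the `reason` helper is a hypothetical name, and the fallback exists because 418 (from an April Fools' RFC) is only known to newer Python versions.

```python
from http import HTTPStatus

# Codes taken from the list above.
CODES = [200, 204, 206, 300, 301, 400, 403, 404, 411, 413, 418, 503]

def reason(code: int) -> str:
    """Return the standard reason phrase, or a fallback if this Python lacks the code."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        return "(unknown to this Python version)"

for code in CODES:
    print(code, reason(code))
```

The one that matters for this thread is 403 Forbidden: it is the server's standard way of refusing a request it understood.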

> A web server is a thing people use to make files publicly accessible


There are lots of things on webservers that aren't public. Try to access:


You can't, because github hasn't made a mistake and accidentally made all private repos public.

If github screwed up one day and all private repos were temporarily made public it would be illegal for you to run a script that tried to scrape them all down to your personal hard drive.

But...you would have left the diamond ring by accident.

Files don't "accidentally" become publicly accessible via HTTP. i.e. you don't return to your computer one day to find everything is public.

Someone specifically took the steps to make this data public. The fact they didn't realize what they were doing isn't the fault of people that then view the data.

>Files don't "accidentally" become publicly accessible via HTTP

Hmm? It's certainly possible to configure a web server incorrectly by accident.

That's true.

But as the person knows they are configuring a web server, I would say this is more carelessness / incompetence rather than an "accident" in the same way as losing a Diamond Ring would be.

Sure, but your carelessness doesn't absolve me of my crime. If you lose your diamond ring out of carelessness, it's still not ok for me to steal it.

These analogies involving valuable physical items are way off base. If I'm walking along the street and see a diamond ring lying there, I can only assume that it belongs to someone else and they've misplaced it (because it's very valuable and, crucially, there is no way for the owner to make use of its value if they've lost possession of it). I may not have any way to locate the owner, but I still recognize that it belongs to someone else and that for me to take it and keep it would deprive them of their property (probably, I should take it to the police).

If you insist on analogies involving lost rings, this situation is more like taking a picture of a ring someone lost in the street than it is like taking the ring.

No, that's a terrible analogy, because a picture of a diamond ring is worth much less than the ring. A copy of valuable information is generally worth just as much as the original. And while no-one would care if someone else had a picture of their diamond ring, they would care if someone else had a copy of their private information.

> Sure, but your carelessness doesn't absolve me of my crime.

If his carelessness meant communicating that you could take the ring without stealing it (say placing it in the donation basket instead of his wallet), that would absolve you of your crime.

> The onus for keeping computerized material private is on the owner

I don’t think that’s a sensible rule and at the end of the day, it’s not the one that’s going to prevail. The Internet will be sanitized and made safe for all the people who forget their passwords and write them on their monitors. The Internet is for ordinary people now, not curious teenager hackers. And ordinary people will make the rules to suit themselves.

> The Internet will be sanitized and made safe for all the people who forget their passwords and write them in their monitors.

And how, exactly, is this "sanitization" going to occur? Are you saying that having 15 police officers raid a home and confiscate multiple computers (all but one of which had nothing to do with the incident in question), arresting a completely uninvolved person on his way to school, and taking no action at all against the stupid contractor who set up the website, is an acceptable form of "sanitization"?

> The Internet is for ordinary people now, not curious teenager hackers.

That's not what the police action described in the article is saying. It's saying the Internet is for government and corporations, and God help the ordinary people who get in their way. (Btw, I include "curious teenager hackers" in "ordinary people". Perhaps the fact that you don't is part of the problem.)

This can't work, for the simple reason that the Internet has global reach. Unlike other kinds of personal property, a Web server is accessible to the entire world. There are lots of people out there who have no reason to respect US or Canadian law, and there always will be. Prosecuting this young man, or Weev, may make some "ordinary" people feel better, but doesn't begin to deter any actual criminals.

The only solution is site owners taking responsibility for securing their sites, in accordance with the sensitivity of the information on them. The sooner "ordinary" people realize that, the better.

Ordinary young people already laugh at this sort of ignorance. Ordinary old people will die soon.

Ordinary young people today are probably even less computer literate than ordinary people my age (mid 30s). They grew up being spoon fed the Internet through the FB and Snapchat apps on iPhones.

Your entire argument is based on this idiosyncratic theory of widespread ignorance. This theory is simply wrong, as a matter of fact. Even if it were true, no historical case ever turned on the unprovable notion that most people are too dumb to understand the truth.

Since you didn't respond when I raised it elsewhere in-thread, I would highlight again the fundamental imbalance between the rules you would impose on Facebook etc. and those you would impose on users. Firms that spend billions of dollars developing their systems only have to be as smart as the most ignorant person we can imagine. Their users, in contrast, must be geniuses to keep up with their many changes to TOS, interfaces, and functionality, while simultaneously those genius users aren't allowed to notice that numbers follow each other in sequence. This is nonsense on its face, but then again authoritarian maneuvers are their own justification, aren't they?

> The Internet is for ordinary people

If we're on the topic of wishful thinking, I wish ordinary people understood basic things about the internet. The purpose of humanity as a whole shouldn't be to dumb things down for "ordinary people". It should be to better teach and educate new generations, so we won't be able to assume ordinary people are dumb.

Web scraping is a normal occurrence on the internet, not something limited to curious teenage hackers. When I grew up we didn't lock our doors because we knew and trusted our neighborhood. It's like you're asking 4chan to use polite vocabulary.

> I think most people would assume that documents that can only be accessed by editing an ID were not meant to be accessed. And that really is the end of the analysis.

You do realize HN provides an API that allows you to request any item by using an ID? [1]

    Stories, comments, jobs, Ask HNs and even polls are just items. 
    They're identified by their ids, which are unique integers, and 
    live under /v0/item/<id>.

If you really know better than everyone else who has replied to you on this story, why don't you point out the exact law that states accessing resources over HTTP is forbidden if not initiated from another resource originating from the target server? Otherwise, I'll assume your "analysis" is simply a subjective view on how you would like the web to work. A pretty limited and unrealistic view that wouldn't work in the real world.

For example: here is the link to the first story posted on HN: https://news.ycombinator.com/item?id=1

1. I don't think you can access that story by starting from the front page, because scrolling for more stories only gets you to page 25. Does that mean the intention is the story is private?

2. You can now access it by using the DOM element generated for my comment. Does that mean it's public?

[1] https://github.com/HackerNews/API
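To make the point concrete, here is a minimal sketch of the ID-to-URL mapping that the API README describes. The base URL is the one published in the HackerNews/API repo; the names `item_url` and `fetch_item` are made up for illustration, and `fetch_item` needs network access, so treat this as a sketch rather than an official client.

```python
import json
from typing import Optional
from urllib.request import urlopen

# Base URL as documented in the HackerNews/API README.
API_BASE = "https://hacker-news.firebaseio.com/v0"


def item_url(item_id: int) -> str:
    """Return the API URL for an item (hypothetical helper name)."""
    if item_id < 1:
        raise ValueError("item ids are positive integers")
    return f"{API_BASE}/item/{item_id}.json"


def fetch_item(item_id: int) -> Optional[dict]:
    """Fetch one item as a dict; hypothetical helper, requires network access.

    The API answers with JSON null for ids that do not exist, which
    json.load turns into None.
    """
    with urlopen(item_url(item_id), timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # "Editing the ID" is nothing more than arithmetic on an integer.
    for i in range(1, 4):
        print(item_url(i))
```

Nothing in the URL itself distinguishes an ID you were handed via a link from one you typed in.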

Not to mention there's more to things than HTTP. There could be plenty of other sources. Maybe I like using netcat just for kicks. Maybe I like hand-typing HTTP.

While odd, `printf "GET / HTTP/1.0\r\n\r\n" | nc news.ycombinator.com 80` gets you the HN home page as well as anything.
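For anyone without netcat handy, the same hand-typed request can be sketched in Python over a bare TCP socket. The function names `build_request` and `fetch_raw` are illustrative, and the added `Host` header is my assumption (HTTP/1.0 doesn't require it, but servers hosting several sites usually need it). Many sites now just answer plain port 80 with a redirect to HTTPS, so what comes back may be a 301 rather than a page.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build the bytes of a minimal HTTP/1.0 GET request."""
    # Host is optional in HTTP/1.0 but helps virtual-hosted servers
    # pick the right site (an assumption added for this sketch).
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def fetch_raw(host: str, path: str = "/", port: int = 80) -> bytes:
    """Send the request over plain TCP and return the raw reply bytes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks)
```

From the server's side, these bytes look the same whether a browser, netcat, or a script wrote them.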

Unfortunately, the main counter argument in this thread, from what I gathered, is "ordinary people wouldn't do that". Including someone claiming that if your mom wouldn't do it (in this case), it's not legal: https://news.ycombinator.com/item?id=16854087

Sure, and the fact that HN has a note to that effect on its website is evidence that all of these items are intended to be publicly accessible. It's also obvious in any case that stories, comments, jobs, ask HNs and polls are intended to be public. In the case we're talking about here, it was far less obvious that the relevant documents were intended to be publicly available.

> It's also obvious in any case that stories, comments, jobs, ask HNs and polls are intended to be public.

> it was far less obvious that the relevant documents were intended to be publicly available

My browser and the respective HTTP servers consider them equally obviously publicly available.

But it's not your browser or the HTTP servers that are being prosecuted. Browsers and HTTP servers don't 'consider' anything.

> HTTP servers don't 'consider' anything.

Of course they do. They consider whether or not to give me access. If they respond with 200, they are effectively telling me that the information is public and the request is approved. There's no law moral or legal that stops me from asking for information.

I could ask a law agent for classified information, but he's not going to prosecute me for asking questions. He could be suspicious and ask "how do you know a document with that number exists?". And I can reply "oh, I'm just asking for random numbers".

You can describe what a webserver does in anthropomorphic terms if you like, but it's not the webserver's "intentions" that are relevant. It's the intentions of the people who control the website and the intentions of the person who accesses it.

>There's no law moral or legal that stops me from asking for information.

I wouldn't be so confident of that if you haven't read up on the relevant laws. Many countries have prohibitions against unauthorized access that apply in circumstances where the access is not "unauthorized" in a technical sense relating to the details of the HTTP protocol. The law doesn't necessarily say what you would want it to say or what you would expect it to say. See e.g. the following example from the US. (I'm aware that the incident we're discussing occurred in Canada.)


> You can describe what a webserver does in anthropomorphic terms if you like, but it's not the webserver's "intentions" that are relevant. It's the intentions of the people who control the website and the intentions of the person who accesses it.

And how do you prove intent? This is a technical problem with technical protocols involved. Intent should be provided via the protocol. If the protocol says resources are public, unless otherwise stated, you can't rely on a human to answer, post factum, what resource is private.

>And how do you prove intent?

I believe that’s something they teach you in law school. Lawyers have been working on that problem for a while! IANAL, but I don't think you are going to be able to find a concise answer to that question that goes beyond the immediately obvious.

>Intent should be provided via the protocol.

Sure, if you say so. That’s not how the law works, though.

Editing a url is not "tricking the web server", the web server is designed to respond to urls with the information they point to. Tricking the server would be doing something like sending malformed packets designed to cause the server to leak memory and display the contents of "hidden data" in an exposed field, ie causing it to behave in a way for which it was not intended.

> And that really is the end of the analysis.

You're being hugely disingenuous. The owner of these files set up their website, which includes deciding which files are and are not publicly accessible, and it is reasonable to expect that the files they made publicly accessible are the files they intended to be publicly accessible.

One can certainly make the counterargument that a lack of public links suggests the owner wanted them to be private, but you are pretending that there's no evidence whatsoever that the files were meant to be public, and that's plainly not true.

> I think most people would assume that documents that can only be accessed by editing an ID were not meant to be accessed.

I think most people don't have an intuitive understanding of this at all, which means you can get them to give any answer you want by crafting your description of the problem appropriately. That doesn't make such a procedure reasonable.

> would a reasonable person infer that documents which could only be accessed by editing a URL

Except there's no way to know whether that's the only way to access those documents. That's what access control is for. They could be linked from elsewhere for all you know, and it's perfectly reasonable to assume that if you can access the document by punching in a URL, then it is so accessible.

Just curious, not trying to trip you up: In your perspective, would it be trespassing to make a new website which has links to the original site with the edited URLs, without actually accessing those edited URLs? If such a website with links containing edited URLs already existed, would it be trespassing to follow those links?

Just curious about which one, or both, of those are trespassing in your perspective.

That's not true at all, for instance a number of the records that are part of the download were indexed by Google.[1]

So it's more like going in to your library, using the card stack, learning about a book, going to the shelf it is on, and then looking at all the books on the same shelf.

Somebody noticed that you were looking at all the books and called the cops on you. The cops break in and arrest you for looking at books. They tell you that the bookshelf is off-limits and has personal information.

Sure, the library creates its own card stack and Google is an external service; however if you design websites for a living you expect Google to perform that functionality.

I mean, I designed a service where we wanted to make it easy to share private information, so we didn't use authorization. However I realized that if I wanted the data to be private I should use a suitably long non-consecutive random ID for the resource. If anyone is guilty of criminal misconduct, it's the person who designed this asinine system or the executive who allowed it to be used on the internet.
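A rough sketch of what "a suitably long non-consecutive random ID" can look like, using Python's standard `secrets` module. The helper names and the example base URL are hypothetical; this is not the code from the service described above.

```python
import secrets


def new_share_id(nbytes: int = 16) -> str:
    """Return a URL-safe ID drawn from nbytes * 8 bits of randomness."""
    # secrets uses the OS's cryptographic random source, so the next ID
    # cannot be predicted from earlier ones the way id + 1 can.
    return secrets.token_urlsafe(nbytes)


def share_url(base: str, share_id: str) -> str:
    """Join a base URL and a share ID (hypothetical URL layout)."""
    return f"{base.rstrip('/')}/{share_id}"


if __name__ == "__main__":
    print(share_url("https://example.org/docs/", new_share_id()))
```

With 128 random bits per ID, walking the ID space the way you can with sequential integers stops being practical.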

Hell, I'd go so far as to say that the fact that the exact same system is still being used across the US is a sign that the company who runs the system is criminally negligent.

[1] https://evandentremont.com/some-information-on-the-freedom-o...

I think that's a bit harsh. The documents at that URL were understood to be freely available to the public.

As a physical analogy, I'd think about it more as one of those restaurant straw dispensers. He got tired of pressing the button each time for a new straw, and instead opened the lid and grabbed a bunch out.

I have been arrested for taking too many straws.

Going to a restaurant and taking every single straw out of the top of the straw dispenser is clearly anti-social and is probably theft.

But there’s a near infinite supply of straws and it didn’t damage the dispenser.

It did, however, damage the privacy of various Canadian citizens.

> It did, however, damage the privacy of various Canadian citizens.

Did it? I understand that the stupid contractor who put this data on the website did (potentially--but note that nobody is saying that anyone has actually suffered harm because of that data being accessible). But did the teenager who got this bomb dropped on him damage anyone's privacy? As I understand it, he downloaded the data, put it on his hard drive, and left it there; it never went anywhere else.

Can you please send me a copy of your last 3 tax returns? My email address is in my HN profile.

I don't know you and don't particularly care about your financial situation, so I'm not gonna read them or share them with anyone else. I'll just keep them on my hard drive.

> Can you please send me a copy of your last 3 tax returns?


A) Sure, here you go. Oh wait! I didn't mean to send you those. You tricked me and stole my information. I'm going to send 15 police officers round to arrest you and then you're going to prison for years.

B) No, that's confidential.

^^ Which option do you think is more reasonable?

A) is not comparable to the current situation because you are the one initiating the action. I can't stop you from sending me an email so it can't be a crime on my part if you do so.

No, you are initiating the action by requesting the file from me. You did request the file didn't you? Even though you should have known it wasn't public information?

That's not a correct counter argument, the information he got was understood to be public and there was no reason to expect or think there was any private information on there. If the site had said "This site provides tax returns" then there would be reason to expect the files would contain private information. The site in question gave no indication there would be private information in those files. Also, technical nitpick, there are some countries where tax information is public so probably not the best thing to go with.

So you think this teen was sending the data he downloaded to all his Facebook friends, or what? Do you have any evidence at all to support this?

That may be so, but did he intend to damage their privacy? Probably not.

He can't be faulted for accidentally downloading some private information that was improperly mixed in with a bunch of public information that he was trying to download. He had no indication that the information he was retrieving was not supposed to be public.

...which is not a felony, because that would be insane.

It's more like the restricted section of the public library, and they left the door open. He just walked in, and started reading the books on the shelves, and photocopied all of the books.

If there were books in this section that shouldn't have been in there, that's not his fault. That's the librarian's fault.

No, it's much more like calling the government on the phone and asking a series of questions. Or like sending them a series of letters. Or entering a government office during business hours and conducting business there. We don't need a metaphor, there are already laws in place governing how we speak to our public servants and how they respond.

Computers are dumb machines. They have no free will, and cannot serve as agents whose actions bind humans.

If your car runs me over, I'm going to sue you.

If my car runs you over, and you sue me, what does the court do? It tries to figure out my intent. Did I intentionally run you over? If yes, I'm guilty of vehicular assault (not my car). Or did the brakes fail and I had no intent to hurt you? If yes, I'm not guilty of anything.

Likewise, what the computer does is irrelevant, except insofar as it tells you about the owner's intent. So the question is not "did the computer let you access the file" but "what does how the computer let you access the file tell you about what the computer owner intended?"

The car is still a dumb object that you own that hurt me, and as a result someone is still paying my hospital bills. You might blame your mechanic, if he did a poor job of brake maintenance. When he attempts to defend himself, how is this "intent" mishmash going to fly? "Your honor, please ignore that there are several RFCs defining how one installs calipers, none of which I followed! It was clearly my intent for the brakes to work!"

Also, I'm not sure your analogy works at all. In the first paragraph, you seem to analogize the car to the accused "hacker", while in the second you're talking about the supposedly "hacked" host. To be clear, the point of the car example is that a machine's intelligence has no bearing on how its actions affect the duties of its operators.

The web server sends a response code with each response.

The best, and most accurate, way of determining if the resource you requested is meant to be accessible, is to check to see if you got a 200 OK response or a 403 Forbidden response.
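As a rough illustration of what "checking the response code" looks like from the client side, here is a sketch that sorts status codes into what the server is saying about access. The labels and the names `classify` and `survey` are made up for this example, and `get_status` stands in for whatever actually performs the request.

```python
from http import HTTPStatus
from typing import Callable, Dict, Iterable


def classify(status: int) -> str:
    """Map an HTTP status code to a coarse label (labels are illustrative)."""
    if status == HTTPStatus.OK:
        return "served"
    if status in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN):
        return "refused"
    if status == HTTPStatus.NOT_FOUND:
        return "missing"
    return "other"


def survey(ids: Iterable[int], get_status: Callable[[int], int]) -> Dict[int, str]:
    """Classify the response for each document id.

    get_status is injected so the sketch can run without a network.
    """
    return {doc_id: classify(get_status(doc_id)) for doc_id in ids}
```

The status code only reports how the server was configured; whether that configuration matched what its owner meant is the part this thread is arguing about.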

Given the numerous articles about documents inadvertently being exposed through URL ID incrementing, clearly response codes do not accurately convey what people meant.

I didn't say it was perfectly accurate, just that it was the best.

So your argument is that a better way to check this is to crawl the entire web looking for links to a resource to determine if it was meant to be publicly accessible?

> Given the numerous articles about documents inadvertently being exposed through URL ID incrementing, clearly response codes do not accurately convey what people meant.

Your intent argument is really shallow. People do bad things with good intentions all the time. Doesn't mean their actions are good or legal.

> If my car runs you over, and you sue me, what does the court do? It tries to figure out my intent. Did I intentionally run you over? If yes, I'm guilty of vehicular assault (not my car). Or did the brakes fail and I had no intent to hurt you? If yes, I'm not guilty of anything.

Or you failed to follow the rules, were careless, and hit him by mistake. Was your intention to kill him? No. Was it your fault? Yes.

The open window metaphor implies that the rest of the building's facade was a wall. I don't think that is an accurate description of a website.

If we have to resort to metaphors, then let's describe this section of the site as a ring binder, and each FOIPOP publication as a single page in the ring binder. What the kid did, then, is take out the entire stack of pages, feed it to an automatic copier, put the originals back in the binder, and leave with the copied stack.

There is no indication that the "perpetrator" even looked at any page in that stack. And since the binder was clearly labeled as "free public repository of FOIPOP responses that have been approved for publication", the act of copying the entire stack is no reason to assume foul play.

More like finding a classified FBI interrogation manual in the Library of Congress...

Ok, not actually a metaphor -- https://www.motherjones.com/politics/2013/12/fbi-copyrighted...

How about looking into the display window of a shop? If they wanted to keep something secret, it should have been out of sight.

>I'd argue that the kid was entering through an open window of a public building.

Given that he tried to download all possible documents by sequentially incrementing the document ID in the URL, it's more like trying to open every window in the public building to see what happens.

>the kid

Is a 19 year old a kid?


The "older generation" does not use metaphors because they are limited in their thinking. They do it to illustrate the principles underlying law. A basic principle of private property is that you don't have to secure it. The burden is on the would-be trespasser to figure out what rights she has with respect to the property and act accordingly. Snooping around private property out of "curiosity" is illegal, whether or not there are locks preventing you from doing so. (This is an idea teenagers have had trouble with long before the Internet, but we manage to beat it out of them every generation).

Those principles apply equally well to the Internet. Ordinary law-abiding people don't go fiddling with URLs, just like law-abiding people don't jiggle door handles or peek into windows to satisfy their curiosity.

This is absurd. This is not private property. Ordinary law-abiding people do walk into government offices and ask questions, and when they get answers, do continue asking questions and getting answers. Ordinary law-abiding people do browse all the products on display at a store. Ordinary law-abiding people do flip through all the pages of a catalog that is sent to their home.

This Orwellian attitude that looking at anything is criminal if the government retroactively decides they didn't want you to see it, is terrifying.

But as usual when it comes to authoritarian overreach by government, you're not de facto wrong about how the government sees things, but you are eloquently defending a morally horrific attitude.

Government documents and web servers are private property. (The information within might be public, but it's illegal to access private property in an unauthorized way to get public information). Ordinary law-abiding people do the things that property owners expect them to do (or what they reasonably infer the property owners intend them to do). If there are signs (literal or figurative) that the property owner doesn't want you doing something, then you don't do it--and you're charged by the law with heeding those signs.

That's not Orwellian or authoritarian--it's a basic part of "social" behavior in a society with private property.

Files on a server may be considered "private".

However, when you make those files available through a web server, you make them "public".

You then have the ability to limit the access to those files through any one of a large number of techniques to make them private again. Now if there were evidence that they tried (and failed) to use one of these techniques or that the teenager in question deliberately circumvented these techniques, then you would have a point.

One (not particularly good) way of limiting access to files without verifying identity would be to create a hash (say using the requester's email address and the request ID) and use this in the URL to access the document (similar to how Google Docs implements shareable document links).

If they had done this, then perhaps you could legitimately claim that there was evidence of intent to restrict access.

An incremented ID is the opposite. It is a sign that you wanted people to be able to easily predict the correct url to download the next file from. Using an incremented ID is in fact evidence that this information was intended to be public.
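A minimal sketch of the hash-in-the-URL idea, with one substitution that is mine rather than part of the suggestion above: a plain hash of the email and request ID can be recomputed by anyone who knows those two values, so this uses a keyed HMAC with a server-side secret. All names here are hypothetical.

```python
import hashlib
import hmac


def access_token(secret: bytes, email: str, request_id: int) -> str:
    """Derive a URL token from the requester's email and the request ID.

    secret is a server-side key (an assumption of this sketch); without
    it the token cannot be recomputed from the email and ID alone.
    """
    msg = f"{email.lower()}:{request_id}".encode("utf-8")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def is_valid(secret: bytes, email: str, request_id: int, token: str) -> bool:
    """Check a presented token using a constant-time comparison."""
    expected = access_token(secret, email, request_id)
    return hmac.compare_digest(expected, token)
```

Unlike `id + 1`, the token for request 43 tells you nothing about the token for request 44.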

Except this teenager was explicitly authorized to access all those files.

He literally asked the web server "can I have these files" and it responded with "yes, you are authorized, here you go".

If he wasn't authorized, the server should have responded with a 403 Forbidden!

Web servers are built around authentication and access rights! It is not the teen's fault that the government doesn't know how to configure them properly.

Failure to properly secure one's private property does not make it legal for someone else to access it.

Ummm... You put your furniture or your trash on the curb, then accuse someone of stealing it? If it's private it shouldn't be on the curb. Anything on the curb is considered public. Clicking a picture of anything on your curb is also considered public. If you wanted to keep it a secret, you should have bought a box and kept it in the locker.

Edit: after a few days you realize that the trash on your curb shouldn't have been there. Then you raid the trash company because your brother is a cop.

Sorry- I edited. What I’m saying is that the server represents its owner, so when the server grants someone access in normal operation it’s not intrusion.

Servers don't grant access. People grant access. People can make mistakes and set up servers to mistakenly allow access to things.

Making a mistake doesn't revoke someone's property rights.

> Making a mistake doesn't revoke someone's property rights.

This is a non-sequitur, nobody is saying anything about anyone's property rights being revoked.

The teen asked for access, and the content owners, via the permissions they had configured, granted it. Sure they can later decide that this was a mistake, but that doesn't make it theft for the teen to have asked for access.

Making a mistake doesn't revoke someone's property rights.

They made a mistake when configuring their web server. It's obvious that this was a mistake because some of the documents contained private information from Canadian citizens.

> They made a mistake when configuring their web server. It's obvious that this was a mistake because some of the documents contained private information from Canadian citizens.

Per the tech article, it was an open archive of public documents that the government published periodically. The reasonable assumption is that the files were all public, and there's no reason to suspect the teenager in this case thought otherwise. The fact that ~3% of the files weren't properly redacted (whatever that means) is hardly "obvious".

> Making a mistake doesn't revoke someone's property rights.

Let's keep things constructive please.

Except they’ve been given access. The server should represent its owners in its sharing of data.

Failure to properly secure one's private property does not make it legal for someone else to access it.

It doesn't matter if we're talking about physical property in the real world or virtual property on a server.

It does if that other person has a reasonable expectation that you intended for it to be public.

Leaving your property on the curb is a good example. If someone takes it, you would be hard pressed to get it back from a legal standpoint.

This is very similar. The government left all those documents on the curb.

Putting it out on the curb with a "FREE INFORMATION" sign, however, does. And this kid is being thrown under the bus for taking it all because it was on a site labeled FREE INFORMATION.

> If there are signs (literal or figurative) that the property owner doesn't want you doing something

But there weren't in this case. The express purpose of the site was to make that information publicly accessible. If you leave stuff out at your curb with a sign that says "Free to all takers", and someone takes something you didn't mean to put there, how are they supposed to know you didn't want them to take it?

What literal or figurative signs were there that he should not have accessed the private information? By all accounts, most of the information he retrieved was clearly intended to be public so there would have been no way for him to know that he wasn't supposed to access the small portion that wasn't intended to be public.

> This Orwellian attitude that looking at anything is criminal if the government retroactively decides they didn't want you to see it, is terrifying.

This is the big problem here. There's no way the Freedom of Information Act in Canada is written the way it is because of the democratic wishes of Canadians. Every day our government moves further away from governing according to the will of Canadians and more toward the will of.....I don't even know. Saying it is the will of politicians doesn't explain some of the strange behavior we've been seeing in this country for quite some time.

If our fate is to ultimately live under a quasi-dictatorship masquerading as a democracy, then so be it, but I wish we could just be honest about it. This objectively false "Canada is a democratic nation" claim is infuriating to me.

Man, if it bugs you, have some sympathy for us directly south! People get straight-up fierce about the civic platitudes around here.

> A basic principle of private property is that you don't have to secure it.

That’s not even true. Trespassing requires that you be told not to be on the property, that’s why people post signs. You can’t be charged with trespassing because you went hiking and wandered onto unfenced land with no signs, it doesn’t matter if the dumb owners thought nobody would ever hike over there.

> Ordinary law-abiding people don't go fiddling with URLs

I’m an ordinary law-abiding person, and I fiddle with URLs. I know lots of people in that group.

You're not. Ordinary people don't even know that you can do that.

You are saying that ordinary people don't know what a URL actually is. This is ridiculous. It takes a certain level of computer illiteracy to never notice that you can access stuff by typing into the address bar directly, or by modifying what's already there.

It also doesn't take a computer whiz to use DownThemAll to enumerate URLs and download them all. They even have a dedicated function for this!

Yes, one does have to have some computer literacy to be able to do that. No, they don't have to be out of the ordinary.

Well maybe you shouldn't be using antisocial hacker tools like DownThemAll! An ordinary user would never use such tools. /s

Incrementing URLs by hand is one of the ways I learned about how the internet works, as a young kid. Kids are curious. This is normal behavior!

Well, some people believe being curious and trying things out is not what "normal people" do so it's "antisocial" and you shouldn't do it.

I remember a teacher yelling at me for trying some slightly advanced features in a hardware design language. I was really proud I could implement something I didn't think possible, but her reaction was along the lines of "Do you want attention? Why can't you just stay quiet and do what the rest of the class does without showing off?".

Stifling creativity and curiosity, especially in children, encouraging them to be mediocre "like ordinary people" is disgusting and counter productive.

Reminds me of the first time I got to touch a word processor. We were supposed to insert a picture with the "clip art" feature. I made a mistake and inserted a "graph" instead. An empty graph, with grey hashed borders, that I didn't know how to delete or undo at the time. So I asked for help.

> Aaahh, he crashed my computer!

Went the teacher, who then swiftly closed my unsaved document. 15 minutes of work, gone. As well as any remaining trust I had for her. I had done something unexpected, and she was afraid.

I don't think I was quite able to articulate it at the time, but she would have made a fine witch hunter. I do recall a sense of unpredictability though, and reminded myself not to step on that tiger's tail ever again.

Imagine a caveman...

Whether it is common knowledge or not has no bearing on whether it’s criminal, so I wasn’t using ordinary to mean “of normal computer literacy”. If you want to use some specific definitions that turn your claim into something trivially obvious, please define them ahead of time.

But lots of things are legal or not based on what typical people would do or think. "Battery" is contact a normal person would find offensive. So touching someone's shoulder to get their attention is not battery, but poking someone in the rib might be. More relevantly, implied licenses to access property are defined by reference to what a normal person would consider implied. So an invitation to enter a store implies a license to access the parts a normal person would assume they can access, and not parts a normal person would assume they are not supposed to access.

It's not a technical shell game. If you asked your mom, "hey, do you think they meant to have people be able to access those documents, where you can only get to them by editing numbers in the URL," she would say "no." That's what defines what is legal or not in this context.

Google, that hotbed of criminal activity, is displaying links to and even accepting paid advertising for these illegal devices that can access telecommunications systems reachable only by editing numbers!


If you asked your mom, "Am I free to access all the public-facing information on the Government Freedom of Information server", what would she say? The technical details of how to make the connection are irrelevant. My mother doesn't know how to connect to a BBS, does that mean that anyone accessing a BBS is breaking the law?

> If you asked your mom, "hey, do you think they meant to have people be able to access those documents, where you can only get to them by editing numbers in the URL," she would say "no."

No, she would say "I don't know what you're talking about, can you put that in plain English?" And then you could get her to give any answer you wanted by phrasing the plain English appropriately.

> But lots of things are legal or not based on what typical people would do or think.

Good thing computers use unambiguous protocols to communicate explicit intent.

> If you asked your mom, [...] That's what defines what is legal or not in this context.

I'm really terrified of a world where the law is made by asking laypeople what they think. Just like we don't define borders by asking random strangers on the street where countries are, I don't see how it's a good idea to define laws for technical services and protocols by asking people who barely understand computers what they think.

Ordinary people don't access Freedom of Information requests. The whole site exists solely for the use of people who are not ordinary.

Now you're just trolling.

> Ordinary law-abiding people don't go fiddling with URLs

That's complete nonsense. I've often changed a URL because it didn't work and had a typo. It's right there at the top of the web browser asking everyone to fiddle with it. If you were right, the URL bar would not be editable in web browsers, so you should be complaining to Google, Apple, MS, Mozilla for leaving this criminal-use-only feature so prominently on their products.

A consideration of real property illustrates nothing about URLs. Speech is a much more apt comparison, in that one host says one thing, and then another, if it freely chooses to do so, responds by saying something else. (In Canada, though, the speech comparison might not help much?) This comparison is much more cogent than anything involving a "trespasser". Of course, authoritarians prefer the stupid comparison.

Computers are dumb pieces of property. They are not capable of speech (though people might use them for speech), nor are they capable of "choos[ing]" to do anything. Analogies that anthropomorphize computers are nonsensical.

The only people here are the teenager and the property owner. And the intent that matters is the intent of the property owner. Did the property owner intend those documents to be publicly accessible? Would a reasonable person have assumed that those documents were not intended to be publicly accessible, because they could only be accessed by editing a URL?

Phones, bullhorns, billboards, postcards, bumper stickers, guitars, TV transmitters, etc. are dumb pieces of property that people use to communicate. (Why don't I have to get permission to look at your billboard?) Computing devices are different from most of these in a single respect: they can act autonomously, as their operators intend. In fact it is customary that they do so, just as it is customary that billboards are viewed by the public. Jeff Bezos doesn't have to stay at the console approving everything whizzing in and out of AWS. There's nothing anthropomorphic about recognizing that intent may be coded in such a way that a computer obeys that intent. In future please consider reading more charitably.

You post often enough on this topic that we all know your position, before you post. Consider, if you will, whether your preferred position is one that will lead to improvements. I posit that it will not. Your position, if adopted, would lead more faceless totalizing organizations to amass, against our will, more of our personal data, and to be less careful stewards of the same. We have far more to fear from those organizations than from 19yos.

> There's nothing anthropomorphic about recognizing that intent may be coded in such a way that a computer obeys that intent.

What a computer does may be evidence of intent, just as a lock (or lack thereof) may be evidence of intent. But just like an unlocked door is not evidence of intent to make something accessible, neither is an unlocked computer.

> Consider, if you will, whether your preferred position is one that will lead to improvements.

The Internet belongs to ordinary people, not folks who have read the HTTP spec. (It's their world, we just live in it.) "Improvements" will be had when the rules comport with what ordinary people want and expect. Ordinary people don't think about computer security; they expect that, like in the real world, people won't go into places that don't look like they're meant for the public just because there's no locks to prevent them from doing so. The law should reflect those expectations.

Laws exist to create social norms. HN users are preoccupied with data security, but ordinary people hate security measures and are bad at it. So it seems completely backward to me to codify in the law the idea that accessing data should be presumed to be permissible just because the owner of the data didn’t secure it.

Sending a packet in response to some other packet is something "a computer does". Back when "ordinary people" thought that witch-dunking was an "ordinary" security measure, "ordinary people" were wrong. Improvements came not when the courts surrendered to ignorance, but when they corrected it.

Very few "ordinary people" would describe websites as "places", anyway. They don't say they're "at" Facebook, they say they're "on" it, much like they could be "on the phone" or "on TV". Maybe this hasn't always been the case, but the courts aren't tied to 1990s-era metaphors. No one on a jury remembers those silly "Welcome to the BatCave, Come on in if you Dare" geocities pages.

Incidentally, Facebook and its ilk hold ordinary people to much more complicated standards of behavior than those to which you and they would hold sites, all the time. Oh, you didn't read all 50 pages of TOS and then update the (hidden) configuration, every week? Silly user, that's why we gave all your data to the English!

Meanwhile, you don't think Facebook should have to understand how HTTP works, just because one person working at the company might not. Interesting, that the benefits go one direction and the duties go the other.

I think your arguments are well-reasoned, but I also think that you, along with others here making analogies to libraries/filing cabinets/etc. are too eager to equate physical access to internet access.

In the physical world, one can accidentally walk into a room they shouldn't have, perhaps mistaking it for the bathroom, and then leave without having committed any transgression. Entering a room you shouldn't be in doesn't mean you've automatically taken the contents of the room. On the internet, however, visiting a URL means just that. There's no "oh, it looks like I shouldn't be here" opportunity.

URLs are not doors. They aren't rooms. The same reasoning can't be applied to them, as they behave in fundamentally different ways.

There was no sign that the supposedly private documents were intended to be private. I’ve often fiddled with URLs on public sites because I’ve had the intuition that something hasn’t been indexed that perhaps should have been indexed, or because of a dead link.

If those sensitive documents were on a _public_ website intended to be browsed by the _public_, who presumably did not require authentication, and the documents did not cause an “Authorization required” response when accessed, it feels rather totalitarian to treat that as a crime.

Most of the metaphors I’ve seen about this are not fitting. As excessive as the barrage of metaphors may be, allow me to add my own:

As part of a free treasure hunt, a person gives you the address of their house and says, “Whatever is not locked up is fair game for you to look over, take photos, or copy.”

You go there and have a great time. Then the homeowner has a fit because you discovered a hidden cellar full of pornography, which was apparently off limits but the door was inadvertently left unlocked. Now the homeowner is charging you with breaking and entering, saying you should have known better and it was common sense.

Of course a computer doesn't choose to do anything, but its actions do represent the choices of its operator. If I put a sign on my door the onus is on me to check if it says "All members of the public are welcome here" or "No trespassing".

At what point does the owner or their agents find themselves possessing the onus to clearly communicate their will?

The answer, even in the realm of physical property, is clearly not 'never', so where is it, and what leads you to believe its threshold was not crossed here?

> At what point does the owner or their agents find themselves possessing the onus to clearly communicate their will?

The law is that the onus falls on the owner or their agents at the point where a reasonable person would not be able to infer the scope of the implied license from the circumstances.

I posit that a reasonable person (not an HN reader) would infer from a document being only accessible by editing a URL that it was not intended to be publicly accessible.

>I posit that a reasonable person (not an HN reader) would infer from a document being only accessible by editing a URL that it was not intended to be publicly accessible.

Do you view 'a HN reader' as a reasonable representation of someone skilled in the art [of creating and serving websites]?

Unless I'm missing something, the only conclusion that I can see following this line of reasoning is that skill in the art is inversely proportional to a person's 'reasonableness' in this matter.

If a quorum of experts are coherently proposing that certain actions are reasonable, even if you find them distasteful, at what point is 'reasonable' no longer reasonable?

For what it's worth, despite sounding like a rhetorical question, I am truly interested to know your thoughts on that last matter.

Except in certain limited areas (e.g. “the reasonable doctor” standard for medical malpractice), the standard is the “reasonable person,” not the “reasonable person skilled in the art.”

People on HN are not representative, because they know about computer security and HTTP access codes. We don’t live in a world where those people get to make the rules. We live in a world where the rules are set by reference to ordinary people. My mom gets to set the rules for what’s “reasonable” (what are the social norms everyone has to follow). Not you or me.

My point is that a reasonable layman would assume that if a document was not linked or indexed from a public portion of the site, it was not meant to be accessed. That makes sense, because if the document was meant to be accessed, it would be made accessible in a way a reasonable lay person would know how to access it.

> My point is that a reasonable layman would assume that if a document was not linked or indexed from a public portion of the site, it was not meant to be accessed.

And others point out that editing a URL to increment an ID which is obviously sequential is absolutely a reasonable way of browsing the web. That doesn't mean a lay person has to know how to do it, but that they wouldn't think anything criminal was happening if they watched someone else do it.

How about a "reasonable mechanic", "a reasonable grocery clerk", or a "reasonable carpenter"? Dare we imagine a "reasonable web developer"?

And I posit you are wrong.

An ordinary person would infer from accessing a url and receiving information, that the information was intended to be public

This is more like your friend inviting you over but telling you not to open a certain door in their house. It's not illegal to then open that door, but it would be to break or pick the door's lock and then open it.

It's worse than that. It's the friend saying "open any unlocked doors you want", then after you do it, he remembers he left some secret things in his unlocked bedroom and wants you arrested because he didn't know the lock was broken.

You are wrong about the law here.

It might be hard to prosecute, but just because I invite you over to my house I have absolutely not granted you permission to enter any room you want.

If you, for example, went into my office and started rifling through my file cabinet that would be a huge invasion of my privacy despite the fact that I (like many people) do not have a physical lock on my office or filing cabinet.

I found the most disturbing aspect of it that the defendant is barred from using the internet. This was perhaps a valid sanction to use 20 years ago but today this is tantamount to removing a citizen from society. I understand that you might want to bar vandals from computer access. This seems to be the wrong case to employ such a heavy handed measure.

For those who haven't read it, The Hacker Crackdown was released by Sterling as freeware:


there are also PDF and epub versions if you google for it.

This seems more like a failure to develop proper social norms with regard to Internet information.

> If the information is not supposed to be public, it should not be reachable without authorization or authentication.

I don't lock my car, and often not my house either. I don't think that means you should be able to snoop around and see what interests you. Websites are private property. It is obvious what parts you're supposed to see and what parts you're not supposed to see. You should be able to prosecute snoops as an alternative to locking things down, as you would with any other private property.

But the difference is that a website is built to be publicly accessible on the public internet.

Your car, presumably, is not offered as a public resource.

Except this data was obviously not intended to be publicly accessible, or else it would have been reachable from some public-facing portion of the site.

You’ve never heard of an API then?

This is silly. Your door analogies have no place at all.

A great many websites work this way.

Unless you’re injecting metacharacters into URLs, or requesting AAAAAAAA * 65535 followed by shellcode, changing parameter values is using HTTP exactly as designed, and a well-formed request has many possible error codes for the exact purpose of letting you know what you are allowed to access.
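
To make the point about error codes concrete, here is a minimal sketch in Python of how a client can read the server's answer. The function name `access_signal` and the wording of the messages are hypothetical, made up for illustration; only the status codes themselves are standard HTTP. Nothing here reflects how the site in the article actually behaved.

```python
from http import HTTPStatus


def access_signal(status_code: int) -> str:
    """Translate an HTTP status code into what it tells the client about access.

    The returned phrases are illustrative wording, not part of any standard.
    """
    status = HTTPStatus(status_code)  # raises ValueError for unknown codes
    if status == HTTPStatus.UNAUTHORIZED:   # 401: credentials are needed
        return "authentication required"
    if status == HTTPStatus.FORBIDDEN:      # 403: the server refuses the request
        return "access refused"
    if status == HTTPStatus.NOT_FOUND:      # 404: no such resource
        return "nothing here"
    if status.value // 100 == 2:            # any 2xx: the request was served
        return "request served"
    return status.phrase.lower()            # fall back to the standard phrase


for code in (200, 401, 403, 404):
    print(code, access_signal(code))
```

A server that wants to say "not for you" has 401 and 403 available; a 2xx response carries no such signal.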

It’s perfectly normal for people to alter URLs. The fact that people who are unfamiliar with URLs don’t do that is irrelevant, and you could say the same for any subject. Just yesterday I changed lat= and lon= to get a NOAA forecast. Is that snooping or hacking? How about when we change the integer at the end of an XKCD comic to view another one without previously confirming there is a hyperlink somewhere?
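
For anyone who has not done this by hand, a rough sketch of what "changing lat= and lon=" or bumping an integer amounts to. The helper `with_param` and the example.org URLs are assumptions for illustration; only the parameter names `lat`, `lon` and `id` come from this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url: str, name: str, value) -> str:
    """Return `url` with the query parameter `name` set to `value`.

    Simplification: repeated parameters with the same name are collapsed.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query[name] = str(value)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical forecast URL; the host and path are placeholders.
forecast = "https://example.org/forecast?lat=47.6&lon=-122.3"
print(with_param(forecast, "lat", 44.6))

# Incrementing a document ID is the same operation on a different parameter.
document = "https://example.org/foo.php?id=24530"
print(with_param(document, "id", 24531))
```

The resulting request is an ordinary GET; nothing in it tells the server, or the person making it, whether the new URL was ever linked from a page.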

At least in the Weev case, people could take his IRC logs out of context as well as argue the fact that it was plainly obvious that the server was misconfigured and he was seeing content that he should not. But getting a response when incrementing an integer, generally speaking, does not mean you are viewing something unintended for you. When you are downloading public documents, it would be entirely unreasonable to assume that the material was non-public.

Also, take a step back to really think about what you are advocating. Is society better off by ruining this kid’s life? It blows my mind that someone even remotely technical can think this particular case is a good use of the justice system, or can even compare it to someone snooping around their neighborhood and trying their doors.

I agree. Even if we assume what he did is illegal, it's sort of like impounding someone's car and arresting them for dangerous driving because they went 1 mph over the speed limit. We don't do that even though it is technically illegal, because that would not be a good use of the legal resources we have, which are finite. Same thing if we arrested and imprisoned everyone who fiddled with a URL parameter: it would be a waste of finite resources. A phone call saying 'Hey, can you please stop doing that?' would probably suffice in most cases.

Supporting your examples, in previous GIS work I have iterated through a series of public URLs with file ID integers in them (Corresponding with individual document IDs) to batch-mirror whole public GIS datasets from government websites. This was on a public server, no http authentication or username/password, just something that was presented with a terrible javascript user interface only designed to retrieve one file at a time, slowly. Exactly the sort of thing that got this guy arrested.

If you are just iterating through an archive, it would be impossible to know whether foo.php?id=24530 was linked to anywhere or not. It seems crazy to criminalize this sort of thing.

> obviously not intended to be publicly accessible, or else it would have been reachable from some public-facing portion of the site.

Doesn't seem obvious to me. So now I have to check for a specific anchor to a URL to see if a URL is considered publicly accessible?

But it was, precisely, reachable from a public-facing portion of the site!

>I don't lock my car, and often not my house either. I don't think that means you should be able to snoop around and see what interests you. Websites are private property

Here's a better analogy; you put up a "yard sale" sign in your front yard, fill the driveway with property, and then call the police on the first person who shows up claiming they are trespassing.

When you put up a "yard sale" sign, you're conveying what is called an "implied license" to access the property. The scope of a trespasser's right to access a property is limited to what a reasonable person would consider to be granted by the license. A reasonable person would assume that a "yard sale" sign grants a license to access the yard on which the sign is posted, but not to go around to the back yard and peek into the basement windows.

A public web page is no different. A reasonable person would not assume that content you can only get to by editing a URL manually is supposed to be accessible to the public. A typical person would not even know that you can do that. Those typical people are the ones that get to set the rules, not hackers.

If a url is not supposed to be edited by people, address bars wouldn't be text fields. It wasn't long ago that one of the primary ways of going to a website or even a particular webpage was entering it manually in the address bar as opposed to going to google first.

Websites are not only meant to be accessed by humans either. Are you telling me that bots should employ human reason to guess what should be viewed or not?

I'm not even sure what you're proposing. The web has always been public space.

If a website puts a link to a resource that is not protected by authentication means, and hides it behind some other html element, or styles it with javascript to a white font on a white background, or does something else that is similarly difficult for a script to guess is not viewable by humans on some subset of browsers, is it now breaking the law because obviously the url was not viewable on any webpage?

How about if someone takes the url, and doesn't view it, but posts it on some other website with high traffic. Now all people that click the link have broken the law?

You are old enough to remember a world before Google, yes? I typed http://DeltaAirlines.com into my browser's address bar, looking for info about Delta Airlines. I didn't click a link from another website or copy it from a brochure. Why is that trespassing?

> A reasonable person would not assume that content you can only get to by editing a URL manually is supposed to be accessible to the public.

A reasonable person could totally use crawlers like DownThemAll, and fail to notice that some URLs they request are not, in fact, accessible by clicking through a web page. That's different from accessing something you know isn't accessible by mainstream means.

I did that several times to download some porn. The process is simple: search for whatever I'm interested in in a search engine, click on whatever image looks interesting, and see if the URL has numbers I can modify to access nearby images (they will hopefully have the same theme, or even depict the same scene).

The first URL was clearly publicly available. I got it legitimately through a search engine, or by clicking around. How am I supposed to guess that some of the others are off limits?

Is it more likely that editing a url is a reasonable thing to do or that literally every other person in this thread is an unreasonable person?

I assume you also don't park your car in an area where millions of people can access it in the same second with no cost to them.

It's really not a reasonable comparison.

> with no cost to them.

or to the owner of the car!

I never locked my car, back when I was driving a shitty old Cavalier. That seems to say something about how government views its citizens' data.

Seems like a move by the provincial government to shift blame from its poor security to an imaginary bad actor; this article also from the CBC goes into more detail and asserts that fraudulent intent is necessary for a conviction, so hopefully this goes nowhere.


Yes, it's really not clear that any crime was committed. The relevant section of the Canadian Criminal Code[1] requires either fraudulent intent or some actual manipulation/destruction of the server - not simply downloading data. It seems like overreach by the police to distract from the fact that the government failed to secure private data.

[1] http://laws-lois.justice.gc.ca/eng/acts/C-46/section-342.1.h...

Honestly, I'd like to see the kid's lawyers push back and claim damages against the province and its contractors.

How else shall we force problems like this to be fixed?

Anyone familiar with the Streisand effect would have predicted that this would in fact result in this failure getting as much attention as the circumstances can imaginably furnish.

> The relevant section of the Canadian Criminal Code[1] requires either fraudulent intent or some actual manipulation/destruction of the server.

Not quite. The test is:

> Everyone is guilty... who, fraudulently and without colour of right, obtains, directly or indirectly, any computer service (including... the storage or retrieval of computer data)

The Crown can argue that the documents were retrieved/obtained using manipulation of the server (since the public URLs were manipulated to find non-public URLs.)

>"Unfortunately, what had happened is someone went in through the URL and just sequentially went through every document available on the portal," she said.

It's not clear if the links to the documents with sensitive information were public or not. Yes, it's absolutely stupid to have security by obscurity, but from a legal point of view abusing someone's stupidity may look like a crime. The story, while sad, doesn't look black and white to me as CBC tries to paint it.

I'm not familiar with the Canadian legal system, but most countries have entrapment laws which forbid prosecution if there was no way for the suspect to know it was illegal and no criminal intent was involved. The "public, free, information, disclosure" words would make this a slam dunk if Canada has some of those laws.

I agree with many of the comments here, along the lines of "intent?" and "bad law", etc... How can I provide material assistance to either this kid and/or to the problem at large?

I'm looking for something other than "donate to the EFF [or equiv.]" ideally though; I'd prefer to donate directly to his legal fund, or even do some legwork myself that will help, etc.

And ideally in a way that not only helps him, but that helps prevent these situations from occurring in the future -- i.e. working towards law change, influencing prosecutorial discretion (meh), etc...

Propose changes (additions, subtractions, et al) to current law. Talk to your (I don't know Canadian politics) political figureheads about your proposals. Shop them around.

Talk to your law enforcement agencies about computer crimes.

Write and talk publicly about the issues.

I'm not an expert in Canadian law, but are there any elected officials in the chain of the decision to conduct a raid? If so this ought to be severely career limiting. If it was an elected judge who approved the warrant, for example.

The provincial government or federal government appoints judges. [0]

But if it is anything like here in Australia (we are both Westminster systems) then this does not mean the government is held accountable. Judge appointments are assumed to be fairly neutral and it hardly ever comes up at election time.

[0]: https://en.wikipedia.org/wiki/Judicial_appointments_in_Canad...

FYI - Judges aren't elected in Canada. Good point though. The real question is who continues this process now that it has clearly been exposed.

The teenager was downloading publicly available records on a Freedom-Of-Information portal. Why law enforcement is involved, or why this is even remotely a criminal act, completely baffles me.

It’s easier to prosecute than to fix. More than that, the kid pointed out the service is broken. Pretty humiliating. It’s not at all unusual to prosecute to save face, especially rather than admit that your org gave the info to a teenager when that’s illegal.

I think it also included records of the submitter (private data).

Sometimes you get Freedom of Information Act information because you got an individual's consent to release information that would otherwise not be released (e.g. something about your spouse/family member that consented to the information not being redacted).

Perhaps the same portal handled Privacy Act requests where people requested their own information that the government holds.

Really worried about what the authorities might find in his 30TB of 4chan backups. Hoping for the best outcome for them.

It doesn't actually matter, unless he KNOWS he's downloaded something illegal and KNOWS how to find it on his computer, or has opened it already and not deleted it.

Basically all possession laws (really all laws in general) require some sort of knowledge and intent. Just like you can't be convicted for possession of drugs if they were stuck to the bottom of your shoe when you came out of a club (don't actually try this), you can't be convicted of possession of illegal digital material if you weren't aware you were in possession of it.

I did digital forensic work at an old job and one of the cases was someone who liked to indiscriminately download huge amounts of porn off dodgy p2p applications. We found gigabytes of indecent images of children, but no searches in the p2p application for commonly used IIoC terms, and no evidence that he'd either opened the folders containing the IIoC or actually viewed any of it. Police gave his hard drives back (after wiping them ofc) and let him go.

I wouldn't be concerned about his 4chan backups, any lawyer will be able to get him off the hook for anything illegal in there provided he hasn't actually opened it.

> Basically all possession laws (really all laws in general) require some sort of knowledge and intent. Just like you can't be convicted for possession of drugs if they were stuck to the bottom of your shoe when you came out of a club...

Clearly not all laws in general: https://www.npr.org/templates/story/story.php?storyId=188420...

Fun fact: in Germany possession of child porn is illegal even if you're unaware of it. If you find child porn in your e-mail inbox technically you have to report yourself to the police for possession of child pornography.

Yes, that seems extremely exploitable. No, I don't know if anyone has ever weaponised child porn this way. I wouldn't be surprised if it happened (and worked) though.

Ugh, this comment has reminded me of the disgusting dystopian fascist police state we find ourselves in; I immediately thought, "I'm sorry, but Mr. Deitrich's dead. I thought they'd arrest him, but when they found a Koran in his house, they had him executed."

I thought you were going to quote "Show me the man, and I'll show you the crime."

Exactly. Why did this kid admit that? How astronomically low are the odds there's nothing illegal to possess in there?

"How astronomcally low are the odds there's nothing illegal to possess in there?"

That is a haunting reflection of the state of affairs with regards to information and laws regarding information. Criminalizing the mere possession of information should be a tool only of the despotic and/or idiotic.

Of course, these clowns left "private" information accessible by public urls without identification, so we know that they're at least idiots.

As the amount of 4chan material you've archived increases the probability of not archiving something illegal quickly diminishes to zero.

Depending on the boards archived there's a pretty good chance he doesn't have CP. Only /b/ (and I hear /sp/ as well but I never go there) really ever have child porn, and then very very rarely and quickly deleted to the point that an archiver might not pick it up. Due to the sheer size and uselessness of a possible /b/ archive I kind of doubt there's anything bad.

If his archive was automated it's basically guaranteed he's saved a fair bit of loli content. Not sure of the legality in Canada, but this seems like a case where they'd railroad you for something they normally don't enforce, e.g., Chris Handley in the states.

CP would not be unlikely even if he just archived "safe" boards like /a/, /tv/, and /g/.

If he's scraping the softcore or hardcore boards (because nobody's archiving 4chan for its intellectual discussions), he's absolutely going to have more than a few under-18 subjects in there doing sexually explicit things. If the state can verify the age and identity of any of them, he's toast.

>nobody's scraping 4chan for its intellectual discussions

You'd be surprised. There's quite a few archives out there just for keeping up board history and discussions.

>because nobody's archiving 4chan for its intellectual discussions

There's plenty of good discussion as long as you stay away from /b/, /v/, and the porn boards.

/diy/ always has interesting projects. /g/ seems to be the one tech board on the internet that takes consumer privacy seriously (Even HackerNews seems to give Microsoft a free pass in this regard)

Fun fact: in Germany even "lolicon" (i.e. sexual drawings of prepubescent characters) count as child pornography and as far as I know they're tolerated on /b/.

I'm hoping Canadian laws treat this differently but if he were German and they found anything like that in his archive, he'd be facing child pornography possession charges.

The more systematic your collection the better your defense will be.

Intent and knowledge of the illegal material of which you are in possession could easily be a requirement.

I'm no lawyer, but archiving the internet is hardly illegal.

> I'm no lawyer, but archiving the internet is hardly illegal.

Tell that to the part of the law enforcement that decided to arrest this kid for accessing a public website.

If one would give me six lines written by the hand of the most honest man, I would find something in them to have him hanged.

He's a 19 year old kid, not a criminal mastermind

There's quite a bit of child pornography that has graced the pages of 4chan over the years.

My stomach dropped when I read that line in the article. I imagine the cops will drop these charges, but I'll be amazed if they don't try to pin something else on him.

On a different note, is this level of backup a common thing?

30TB is a stupid amount of anything for any one person -- I've always used wayback if I wanted to see an old page

Here's an entire forum of them on reddit, it's called /r/datahoarder.

I love that sub.

A very good friend of mine used to obsessively back up and catalog music and books. He had all the local bands and artists well organized and preserved for posterity. Once he had a computer he started to do similar things digitally. Some people are just into it.

I wish I'd done that with a lot of the indie music and otherwise unpopular things I was interested in years ago. There's a particular album I wish I could find and listen to, but I've only ever been able to find one article talking about it, all links dead, and that itself was long enough ago that I've forgotten the name of the album and band.

I had a lot of things saved through Instapaper right from its inception, but I stopped using it regularly and one day when I went back I didn't appear to have anything saved anymore.

Since then, I've been much quicker to download things I might want in the future. Storage is cheap, archaeological effort is not.

Just like Aaron Swartz. Is it really necessary to send 15 officers and / or sentence people to 20+ years in prison for downloading freely available information?

Aaron Swartz faced a longer prison sentence than most murderers receive within the US, and longer than the sentence for murder in almost any other country in the world.

Common sense has completely gone out the window in both policing, and the criminal justice system.

Was anyone injured? Did anyone suffer financial loss? Fear to self? Any form of significant damage at all? The answer to all the previous questions is a definitive and resounding NO!

God dammit. An almost identical thing happened to me after submitting a public records request to Seattle's IT department for email metadata for January 2017. Instead of sending me the email metadata I requested, they ended up accidentally sending me millions of actual emails. FBI investigations, cheating husbands' texts, SSNs, credit cards, zabbix alerts (so many 100% disk space alerts).

When I contacted Seattle to tell them what happened (of my own free will), the conversation quickly turned to a point where we had to get lawyers involved. Basically, they told me that if I agreed to have Kroll [1] scan my hard drives to prove that I deleted the records, then they would give me "legal indemnification". They eventually agreed to accept an affidavit stating that I deleted everything, that I had to wipe TRIM, and that I wrote a script to confirm deletion to the effect of "grep -r $FILES_HEADER_FIELDS /".

One part that led to such a strong action by them was that they didn't see in their logs how I downloaded everything and thought that I found a backdoor to download all of their emails. They had some annoying rate limiting that prevented too many files from being downloaded at once, so I copied the files from the page's source, then ran a wget against everything. Since the files were being downloaded from S3, their webserver logs didn't include most of the downloads, which led to some suspicion.

Funny enough, Seattle told me it would cost $32m and 320 years of employee salary, but I ended up sending them $40.

It just blows my mind.


[1] https://www.kroll.com/

The way things played out for you is exactly how we handle corporate exfiltration. Employees are unilaterally terminated for the violation, but we'll agree to not press charges if they disclose any dissemination and attest to its deletion.

Good for you for not having to deal with Kroll.

I am absolutely incensed reading this.

The government made no reasonable effort to conceal the information and put it on a _publicly accessible_ web server. They made the information available to the public whether or not that was their intention. How can any reasonable person conclude that typing in an HTTP url qualifies as an illegal breach?

They've made the information publicly accessible via HTTP, yet react like this when someone then views it. Scary stuff.

I just can't comprehend this at all. To even describe it as a "breach" is inaccurate -- the real headline is "government publishes data they hadn't intended to".

Incensed. Learned a new word today. Danke.

I've contacted the reporter to see if we can set up a legal fund for this guy. It sounds like he's being bullied. This could also set a very bad precedent in Canada, as this is totally absurd.

Please keep us informed on the outcome. I have written a letter to my MP and MLA as well as the MP for Halifax.

He got back to me and said there is no legal fund at this time. These people need to hire a lawyer ASAP. Their son's life is over because he wrote a for loop with wget.

  for i in $(seq 1 7000); do
    wget "http://foi.example-gov-domain.gc.ca/foi-download.cgi?id=$i"
  done

Boom, you are now going to federal prison for 10 years! What a complete joke!

As an act of solidarity, we should all run it until they drop charges and fix the damn thing. They can't charge 5000 people.

Awesome, thanks!

I would love to help out if possible. I'm a Canadian citizen but not in Halifax, so I guess I can write my MP but I don't think they have much authority to look into a provincial matter like this.

Call Andy Fillmore's [1] office anyway.

[1] https://www.ourcommons.ca/Parliamentarians/en/members/Andy-F...

Keep us informed. I'll chip in.

Add "help avoid sending teenagers to prison" to the list of reasons why you should prefer UUIDs over integers in your Internet-facing REST API.

This API was supposed to be private and yet supported trivial enumeration?

Many UUIDs aren't secure either and can be trivially enumerated. A better approach might be a long number generated using a secure random number generator and converted to a BASE-64 string.
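For what it's worth, that approach is only a few lines. A minimal sketch in Python, assuming the standard-library `secrets` module is an acceptable randomness source; `new_document_id` is a hypothetical name and 32 bytes (256 bits) is just one reasonable length:

```python
import secrets


def new_document_id(num_bytes: int = 32) -> str:
    """Return an unguessable, URL-safe Base64 identifier (hypothetical helper)."""
    # token_urlsafe draws bytes from the OS CSPRNG and encodes them with the
    # URL-safe Base64 alphabet, with the '=' padding stripped.
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    print(new_document_id())
```

This only makes enumeration impractical. Anyone who is handed the URL can still fetch the document, so it is not a substitute for access control.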

If things should be secure, they should be behind authentication and authorization for that content. To argue about what is the best ID to be used is just trying to fix what is not broken, IMHO.

Security is multi layered.

Obscurity is not a security layer.

Obscurity is a perfectly fine security layer as you said yourself (i.e. it is one of many layers).

Relying ONLY on obscurity is a failure.

So a password or private encryption key isn't security?

What is a UUID if not a fairly predictable object-based shared authorization token?

EDIT: I'm mostly kidding.

Just call it an object-specific API key, that forms part of the URI instead of being a header value. Since we are using https it's all good.

I think a cryptographic hash is fine.

If you know the hash, you likely already know the file.

A cryptographic hash of what?

Cryptographic hashes aren't random by nature.

The document, from what I could glean they were PDFs.

Edit: even if they aren't PDFs, you can feed the content to the hash function.

I suppose that might work but seems needlessly complex compared to just a long securely generated random number.

Yeah, randomly generated IDs are fine for most use cases IMO, and perhaps more importantly they’re easy to implement well enough—grabbing 256 bits from /dev/urandom isn’t bad.

With a hash it’s more expensive to compute the ID, but you get advantages such as content-addressability, data integrity without trust, and easily mergeable databases. It’s a good amount of bang for not much more buck.
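A rough sketch of the content-addressed variant in Python, assuming SHA-256 over the raw document bytes; `content_id` and `verify` are hypothetical names, not anything the portal used:

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive the ID from the document bytes themselves (hypothetical helper)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_id: str) -> bool:
    """Check that the fetched bytes match the ID they were requested under."""
    return content_id(data) == expected_id


if __name__ == "__main__":
    pdf_bytes = b"%PDF-1.4 example contents"
    doc_id = content_id(pdf_bytes)
    print(doc_id, verify(pdf_bytes, doc_id))
```

The caveat is that such an ID is only as unguessable as the content. For low-entropy documents such as form letters, someone who can guess the content can compute the ID.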

At the cost of leaking more metadata in the ID, by including a checksum/namespace, you can recognise a valid ID or determine the type of object it refers to without fetching anything from storage, mitigating some DOS attacks.
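To make the checksum/namespace idea concrete, here is a hedged sketch in Python. The `kind_random_checksum` layout, the CRC32 check, and the names `make_id` and `parse_id` are all assumptions made for illustration:

```python
import secrets
import zlib
from typing import Optional


def make_id(kind: str) -> str:
    """Build an ID like 'doc_<32 hex chars>_<crc32>'; assumes kind has no '_'."""
    body = f"{kind}_{secrets.token_hex(16)}"
    check = zlib.crc32(body.encode("utf-8"))
    return f"{body}_{check:08x}"


def parse_id(value: str) -> Optional[str]:
    """Return the object kind if the ID is well-formed, else None.

    No storage lookup is needed, so junk IDs can be rejected cheaply.
    """
    body, sep, check = value.rpartition("_")
    if not sep or len(check) != 8:
        return None
    try:
        expected = int(check, 16)
    except ValueError:
        return None
    if zlib.crc32(body.encode("utf-8")) != expected:
        return None
    kind, sep, _random = body.partition("_")
    return kind if sep else None


if __name__ == "__main__":
    new_id = make_id("doc")
    print(new_id, parse_id(new_id))
```

A CRC only filters typos and random junk. Anyone can compute it, so it does not stop a forged ID; an HMAC would be needed for that.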

IDs are a subtle thing, and in my experience, often overlooked as a design issue. A lot of times it ends up as something like “id integer primary key autoincrement” without any thought.

A better approach would be to not put private data in the public domain.

Generally you generate a random UUID which is quite difficult to enumerate.

>Many UUIDs aren't secure either and can be trivially enumerated.


uuidv4 is random assuming your rng source is random

And assuming you're legitimately getting a UUIDv4 instead of a UUIDv1 or UUIDv2 now and forever...
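If you want to guard against that, the version is at least easy to check. A small sketch using Python's standard `uuid` module; `is_random_uuid` is a hypothetical helper name:

```python
import uuid


def is_random_uuid(value: str) -> bool:
    """True only if the string parses as a UUID whose version field is 4."""
    try:
        return uuid.UUID(value).version == 4
    except ValueError:
        return False


if __name__ == "__main__":
    v1 = uuid.uuid1()  # built from a timestamp plus a node identifier
    v4 = uuid.uuid4()  # built from random bits
    print(v1, v1.version, v1.time)
    print(v4, v4.version)
    print(is_random_uuid(str(v1)), is_random_uuid(str(v4)))
```

The version field only tells you what the generator claims to be. It says nothing about whether the underlying random source was actually any good.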

UUID are easily shared, and there is no way of knowing who shared it. Good for Xmas photos, not so good for sensitive documents.

You could just keep a record of to whom you gave each UUID?

The thing is it wasn't even supposed to be private! https://twitter.com/SwiftOnSecurity/status/98536562414151270...

If the information is supposed to be private wouldn't you want to use token based auth or something similar? I wouldn't feel comfortable exposing information publicly even if it was behind a UUID.
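One common shape for that is a token bound to both the user and the document, so a leaked link is useless to anyone else. A minimal sketch in Python, assuming HMAC-SHA256 and a server-side secret; `SECRET_KEY`, `sign`, and `is_authorized` are hypothetical names:

```python
import hashlib
import hmac

# Hypothetical placeholder; a real deployment would load this from config.
SECRET_KEY = b"replace-me"


def sign(user: str, doc_id: str) -> str:
    """Issue a token tying one user to one document (assumes no ':' in user)."""
    message = f"{user}:{doc_id}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def is_authorized(user: str, doc_id: str, token: str) -> bool:
    """Recompute the token and compare in constant time."""
    expected = sign(user, doc_id).encode("utf-8")
    return hmac.compare_digest(expected, token.encode("utf-8"))


if __name__ == "__main__":
    token = sign("alice", "42")
    print(is_authorized("alice", "42", token))  # same user, same document
    print(is_authorized("alice", "43", token))  # incremented ID
```

With something like this in place, incrementing the ID in the URL gets you nothing, because the token no longer matches.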

Sure, but remember the lowest-bid contractor here used an integer. Baby steps.

Since this is publicly accessible, what would be the chance that search engines indexed the files? In this case, would Google bot be charged? Or if this were, say, Equifax or Facebook. I mean, in those situations, the companies were blamed for "the leak". It seems rather convenient to cherry pick the law to apply on this poor teenager.

A quick Google search returns some (cached) results: https://webcache.googleusercontent.com/search?q=cache:4N3wSV...

I think I read that Google had, in fact, indexed all of these pages.

Google did indeed index these files according to the following article:


All? Or just the ones that others posted links to?

Yeah, what about that Cloudflare memory leak they had a while back? Are all the caches that retained the info complicit?

>Officers took her 13-year-old daughter to question her in a police car.

Why are the cops allowed to do this? Why do you have to be "rescued" by your lawyer in order to not be questioned by the police without legal representation? Not sure how it works elsewhere but the cops badgered the fuck out of me until my lawyer finally got to the station and chased them out. So, if I was a 13 year old on the way to school and thrown into the police car, they could just do that until I crack?

Not the kind of response I would expect as a Canadian. Can't they send someone undercover ahead of time to find out it's just some kid at his family house and then take the appropriate response? There are better ways to handle this, and there are certainly better ways to secure government files!

If it's publicly accessible, it's public information. Obfuscation doesn't count!

There really should be accountability of whoever chose to store the data in an insecure manner.

If you can see the data by iterating a single number in a URL, and there's zero authentication or verification of credentials, there's no possible way to call it malicious. The fact this family's home was raided was already a colossal mistake. The fact charges are even _suggested_ is such a joke I don't even have words to describe it.

Please write to CIPPIC [0] and the Members of Parliament [1] and Members of the Provincial Legislature [2] for both your local jurisdiction if appropriate and Halifax, Nova Scotia to help protect this kid. The federal Minister of Justice [3] and Technology [4] may be good additions. Remember what happened last time we let a government go wild on a kid incrementing a number in a public URL.

The fact is, it is the organization who published "personally identifiable information" on the public internet who should be punished - and, in any case, criminal law is not the tool to do it. The kid who incremented a number in a URL to download that information is not the bad guy. What if the kid was not Canadian? Are you going to try to extradite a Russian national over accessing information on a public web server?

When a server announces to the world that it can answer HTTP requests, making a reasonable number of HTTP requests is, to me and most technologists I know, authorization (and thus, should be seen as with colour of right or non-fraudulent). The fact those HTTP requests released data he was apparently not entitled to is a security issue, a bug, a problem to be paid for by the actor who manages the HTTP server, not a problem of law. Unfortunately, this section of law has not been used often enough to clarify to me the interpretation of those words.

Here are some follow on questions:

- Why was there "personal information" in FOI releases? Surely a FOI release was intended for the public, as that is the intent of the act. Whose fault is it that there was undesired information in the releases?

- How do we get this law changed? As the law is written, it hangs on the words "fraudulently and without colour of right" - the rest of the clause is incoherent babble of a 1985 technophobe.

[0] https://cippic.ca/

[1] https://www.ourcommons.ca/Parliamentarians/en/members/Andy-F...

[2] https://nslegislature.ca/members

[3] http://www.justice.gc.ca/eng/contact/index.html

[4] http://www.ic.gc.ca/eic/site/icgc.nsf/eng/h_00279.html

If you're frustrated by this story and live in Georgia, USA, you should immediately contact the Governor's office to express your concern and ask that he veto a similar bill sitting on his desk, SB 315.


And this is why I'm going to route all my kids traffic through an offshore VPN by default and whitelist low latency stuff.

I'm starting a proxy project that can be used in countries where VPN is illegal. I'm looking for collaborators. https://github.com/UncleGrape/UncleGrape

I'm interested in contributing, although I don't have much experience with networking code. Any idea what language you're going to be using?

Typos in README: "versitle", "explictly"

I've been thinking I wanted a router with two wi-fi networks.

One that goes through the ISP and one that goes out over a proxy.

I haven't found a solution just yet. I guess a raspberry pi with iptables and routing based on device ID could do the trick too.

I use the following configuration:

I run two tinyproxy instances on my home server and point all browser traffic to the first instance. The first instance runs with the default routing table on port 8888 and has entries like upstream localhost:8889 ".somesite.com". The second instance, which runs on port 8889, uses the VPN as its default route (I use setfib under FreeBSD).

With this setup, traffic goes by default directly on the net, but the tinyproxy config file can be used to redirect some traffic through the VPN.

Of course, you can do it the other way around to have traffic by default on the VPN and direct some traffic.

Won't this leak DNS, and won't lots of system processes bypass the HTTP_PROXY settings?

I have my own DNS server on my local network. And yes, this is only for my browser. But this is intended.

If you want to route all traffic to a vpn for a specific machine, you can use pf rules to forward an ip through another routing table.

I have just set this up a week ago at my apartment. I use hostapd and dnsmasq on a raspberry pi to make it a wireless AP. It is connected to the primary router via ethernet and uses iptables to route wifi traffic via ethernet. Then I just installed OpenVPN to route traffic via a VPN. I haven't tested for DNS leaks yet as this is an ongoing project.

There are a couple of other things on my todo list still. Such as easier switching of VPN node (current method is to ssh in and restart OpenVPN with a new config...) and ad blocking.

Hope this can help you although it's still a bit immature and quite hacky IMO


I considered using a raspberry pi, but I'm not sure the range is ideal.

Why not just set up two wifi routers? This is surely also the best solution for ensuring data compartmentalization.

This is appalling. The operators of the site should be charged for criminal negligence. You don't get to call it stealing if all it took was 3 keystrokes in the address bar of a browser. Backspace. Number. Return. Hacked!!

Good thing he’s in Canada and only got raided. If he were in the USA, they would have tossed flashbangs and tear gas into his house, vaulted in through the windows, shot the family dog, and held the whole family at gun point, boots on their necks.

It’s a shame that police departments think these “shock and awe” tactics are even remotely appropriate for dealing with non-violent suspects.

I don't like or agree with it either, but it is unfortunately necessary for the preservation of digital evidence.

Many nonviolent actors involved in cybercrimes have prepared killswitches or some other manner of instantly burning everything to the ground if you give them enough time to react when you show up with a warrant.

If we can't enforce the law without making society terrible, let's get rid of the law.

I wonder how they noticed.

Perhaps the lowest-bid contract company that made the site decided to use something like amazon glacier for storage of boring documents nobody will ever need. Then along comes someone that causes them all to be extracted at great cost, some middle manager receives a bill for $millions and wants to blame the kid rather than his own failings.

(added: the link to evandentremont.com elsewhere in the comments discusses how this was supposedly discovered, and other details of interest.)

that would make its own interesting information request. you probably couldn't directly ask "how'd you find him out?" at this point, but you could ask for maybe IT costs per month over the last X months broken out by organization the money was paid out to.

also possible (probable, even, in my mind) he just crawled too hard, the machine was slow, and the folks in the office working on it complained. (god only knows how much processing the service does behind the scenes when a PDF is requested. for all we know it is being reassembled from tiffs of individual pages every time.)

The truth is even sadder than you might expect (the rest of this post is a quote from this article[0]):

Conrad said the breach was detected by a provincial employee, but it was a fluke.

“The employee was involved in doing some research on the site and inadvertently made an entry to a line on the site — made a typing error and identified that they were seeing documents they should not have seen,” Conrad told a technical briefing.

[0]: http://toronto.citynews.ca/2018/04/11/halifax-police-probing...

There's a bit of a leap from that to knowing this dude had done the same thing? That describes the employee finding a vulnerability. It probably took some study of the logs to find "the breach". How many similar breaches by actors overseas and less-vulnerable Canadians did they ignore?

I don't think you can get from glacier in "real time", you need to prefetch it first

This might be a controversial opinion here, but intent does matter. If I see a bunch of stuff sitting on the sidewalk and I take some because I think it's free, that's a reasonable thing to do. But going into someone's house and taking their TV is not. "It's their own fault for not locking the door" isn't a valid legal defense, and I would prefer not to live in a country where victim-blaming becomes a get-out-of-jail-free card.

Based on what little I've read thus far, the teenager does indeed seem to have good intent. If that's the case, I'm cautiously optimistic that the court system will set him free without any consequences. But if the prosecution can prove that he was aware of the data's confidentiality and was acting with malicious intent, then he deserves a conviction. Let's let the legal system run its course before gathering our pitchforks.

A publicly accessible HTTP web server is not analogous to a locked home.

The kid sent a request to the server for a document and Nova Scotia's web server graciously provided the content with a "200 OK" response code. The Nova Scotia government doesn't know how the internet works.

That analogy assumes a lot about how hard/hidden obvious id numbers in URLs are. I'd counter that this situation is more like "putting your stuff on the curb and being mad when people take it". Rather than scapegoat the kid, the government should be investigating themselves for criminal negligence.

> "That analogy assumes a lot about how hard/hidden obvious id numbers in URLs are"

Well, I did give 2 different analogies, and without knowing more specifics, I'm not taking a stand on which analogy better fits this case. Depending on the specific design the government used, and the steps the teenager took to access the content, either analogy could be applicable.

> "Rather than scapegoat the kid, the government should be investigating themselves for criminal negligence."

That's a false dichotomy. Investigating government officials for negligence shouldn't preclude prosecuting a (hypothetical) malicious hacker.

You can find some documents from https://foipop.novascotia.ca in Google cache, so should Google be sued too?

I get what you're saying and I don't disagree with your premise, I just don't think it is applicable in a situation where the purpose of the website he was visiting was to access information. Not _that_ particular information, but if I accidentally put my wife's jewelry out on the curb with the old Nordic Trac it'd be pretty crazy to charge the people who took it off of the curb as jewel thieves.

That's because we know what a "house" is - a bunch of private property with a wall around it (even if the wall is not locked). Usually comes in sets with other bunches of private property with walls around it. And we know the default - you aren't welcome in those walled forts unless you are welcomed in by the owner.

Not at all clear that files on a public webserver look very much like a private house.

I think someone's house is "private by default"... even if the door is unlocked, you know you shouldn't go in there.

The internet is public by default.

I agree with you but I don't think the law does. The CFAA says that if access isn't authorized, it's no good. Now we can say that if the system was programmed to give it up (200) instead of telling you you aren't authorized (403/401) then you are authorized, but I think the law is more about whether a human intended to authorize you. Accidentally programming the authorization is (however stupid it may be) not what it's about.
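For reference, the server-side distinction being described (200 versus 401/403) looks roughly like this. A sketch in Python with hypothetical names (`status_for`, `allowed`, `is_public`), not the portal's actual code:

```python
from http import HTTPStatus
from typing import Optional, Set


def status_for(user: Optional[str], allowed: Set[str], is_public: bool) -> HTTPStatus:
    """Pick the response status for a document request (hypothetical helper)."""
    if is_public:
        return HTTPStatus.OK            # anyone may read it
    if user is None:
        return HTTPStatus.UNAUTHORIZED  # 401: no credentials presented
    if user not in allowed:
        return HTTPStatus.FORBIDDEN     # 403: known user, not permitted
    return HTTPStatus.OK


if __name__ == "__main__":
    print(status_for(None, {"clerk"}, is_public=False))
    print(status_for("visitor", {"clerk"}, is_public=False))
    print(status_for("clerk", {"clerk"}, is_public=False))
```

A server that never runs a check like this returns 200 for everything, which is the only signal the requester ever sees.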

How is someone supposed to determine that one unauthorized thing is hidden among many authorized, similarly named things?

I guess because the unauthorized thing isn't linked. Giving you the link is like giving you a password... They're both just strings although one is considered to be more secret than the other. Guessing at links is like guessing at passwords: it's overcoming the fact that you weren't provided with the string that gets the server to respond with the stuff.

I don't like this but I think it's how it legally could play out.

The way I see it, these files were sitting on the sidewalk. Public facing websites are public spaces.

He didn't walk in and take something, he sent a request (via HTTP) and they responded with the content he requested.

If I sent them a dead tree letter requesting a document and they replied with a copy of that document, would you consider this the equivalent of going into their house and taking their TV?

There is one difference which makes the analogy irrelevant. You can easily distinguish houses where you are allowed to enter from houses where you are not. There are simple rules and we all know them.

URLs offer no way to classify them into legal and illegal ones. You can propose a plan to the W3C and to the government to mark URLs with the string 'illegal' if they are illegal to visit without special permission. That would make them distinguishable from legal URLs, and then it would be normal to charge people for visiting illegal URLs. But this rule should be a widely known social norm, not a local rule of some site, hidden on some obscure page that is easy to miss.

Yes, the house analogy is not accurate.

It would be logical to assume that as the files have specifically been made public via HTTP then no laws are being broken by viewing them unless a warning message appears saying otherwise.

The 'locking the door' metaphor is just flat-out incorrect. A public-facing webserver is simply not a place to store your shit..

A closer analogy would be two tables out the front of your house covered in fruit, with a sign saying "Free Fruit" on one table, and then expecting people not to take fruit from the other table.

I like this analogy a lot. But I think even better would be someone setting up a store in an area advertised like "everything is included with the price of admission" (admission being your ISP fee) and taking from the store without realizing it's not truly all inclusive. The store needs to be a vending machine, not a shelf.

Wow. The trauma inflicted upon the children in the family when the Canadian government bursts into their house can never be undone or taken back. Nor can the financial and mental, emotional stress of losing their computers and ability to do productive work and go to school. (Edit: will there be any reparations for this abhorrent behaviour? Have they apologized? Will they, at least? Not that it matters, the damage to this entire family is done.)

Some questions. Is the website still online? What happens if every Canadian downloads the files?

What a dystopia. Do we only have one part of the story? Can the situation really be as bad as depicted in the article? This is atrocious.

Edit: And where is the case against the people in the office who put the sensitive information of others into public view (assuming that is against the law), the actual perpetrators of an actual crime?

The Gov of Nova-Scotia shut down the site, but you can still find some documents on Google (they're cached).

The subcontractor of the site fucked up and they're blaming this kid.

Is it trauma, or an education in tyranny? Admittedly, it's not the kind of education I'd prefer for my children, but in the long run, nakedly tyrannical behavior is better than concealed.

It's worth noting this sort of thing is highly unusual which is why it's getting so much media attention.

Cases like this make me want to attend law school. I am well versed in technology, have acted as CISO and other capacities. I bet I could decimate many prosecuting attorneys trying to make their weak cases.

Interdisciplinary skills are really undervalued in law practice. Usually (US) lawyers try to fill a room with single-subject experts, which is uneconomical, and a government typically is not going to do it. So, in a run-of-the-mill matter like this there are no interdisciplinary skills available to stop the slow motion train wreck.

Do it! Then when you get enough experience, become a judge. So when a prosecutor brings you a stupid case, you can just throw it out!

This story resonates with me. I faced a similar charge in college. I got indicted by a grand jury, and it was years before the DA dismissed the charges. Absolutely nerve-wracking, and I was innocent!

This is crap. Late last year I found a public S3 bucket with 23,000 JSON files in it, which I used to make a visualisation: https://vimeo.com/249970399

The reason I felt confident to do this was because there was no access control on the files and I'd reported it to PUBG Corp, with the bucket remaining public weeks later.

Before people are punished for downloading unprotected information, the person who left it like that should be hauled up in front of the courts.

I've actually done a similar thing myself. When my wife was doing her Nursing degree she was downloading some documents she wanted to reference from an NHS web site. The report for one year wasn't linked, so I checked the URL scheme, figured out what the URL for the report should be (only the date was different in the file names of reports for different years) and downloaded it directly.

It never occurred to me I might be committing a crime.

He's archived portions of 4chan?

It's likely then that he's gotten some "bad" stuff without him really knowing it. The police will search through his files, find the bad stuff, and charge him with some sort of possession/accessing/downloading charge.

Life = ruined.

Am I crazy? Aren't FOIA requests, by definition, public information?

This happened in Canada. They might have a FOIA equivalent, but I wouldn't assume it has the same rules as in the US.

In this case, most of the documents were public, but there were a small number which had confidential information and were inappropriately stored on the public portal.

In the UK and most of Europe, you cannot request the personal information of anyone else via a freedom of information request, since it is likely to contain sensitive information to that person.

For the UK, you can find out more information about the specifics of this from the data protection act, which includes clauses about FOI requests.

If a URL responds to any unauthorized HTTP request with data, how is the requester supposed to know that the data they received is supposed to be private or sensitive?

A better (more accurate) analogy than finding an open window/door is that of asking a government employee for data.

Kid: "Hi, what is the personal info in that file?"


    What they should say: "You are not authorised to see the contents of that file."

    What they actually said: "Sure, here's all the information in that file."

In France, we had a similar case, a computer guy with the pseudonym Bluetouff[0].

He downloaded loads of national agencies confidential documents, because they were available on Google.

However, he was sentenced (3,000€ fine), because when he explored the website, he arrived at a login page, thus realizing he should not have accessed these files, but continued anyway.

I just hope for the teenager that he did not encounter any login page in his search (which seems unlikely because he used a script).

[0] (in french): http://www.maitre-eolas.fr/post/2014/02/07/NON%2C-on-ne-peut...

Damn kids. They're all alike.

You sir, won the internet today with that quote

> His bedroom is upstairs. That's where police found him sleeping when 15 officers raided the family home last Wednesday morning.

Calculate the cost of the 15-officer raid plus prosecution plus the damages to the teenager and repeatedly bash it over the head of the responsible officer in the next election. This is how to deal with this shit in a democracy. Even if people are insensitive to someone else's freedom, they are sensitive about their money.

The most interesting part of this to me is that for the charge to stick, they have to prove he did what he did with malicious intent. Keep in mind that the article states that other employees of this business also viewed these classified documents and are facing no repercussions because the company states they did it by accident. While in every scenario this kid should get off completely, that very well may not be the case. The US is extremely stringent when it comes to cyber crime; more often than not they like to make an example out of people rather than show mercy. The technical writeup for this was spot on: it seems like the company is embarrassed and, instead of admitting they severely screwed up, they are doubling down on trying to portray this teen as some super high tech malicious hacker who was trying to steal government secrets. It doesn't matter how lax your security is: if you can convince the population that this teen was nothing but an unethical, scumbag hacker, no one will show him sympathy.

Isn't this Canada though?

This reminds me of a purported "hack" back in the Governor Schwarzenegger days. An employee from a rival campaign found a public-accessible FTP directory full of audio files, which they then leaked to the press. IIRC, the California Highway Patrol opened up an investigation but ended up not pursuing charges.


edit: the other parallel, IIRC, was that part of the web site was kept private. But the user found the audio by navigating to a parent directory which was apparently open to the public:


> Essentially, aides opened the Web address, or URL, from one of Schwarzenegger’s speeches and lopped a few characters from the end of the address. That yielded a directory of audio recordings.

Each officer and official involved in this should face prison, as would any common burglar who holds a family at gunpoint in their own home.

Couldn't it be argued that clicking links on a web page is no different from changing an ID in the URL?

Web pages contain loads of URLs. You can't tell if you have the right to access the content behind it. The URL itself is simply an address to something - or nothing (404).

Having an ID in the URL is a compact way of signaling a huge list of URLs.

Thus, the kid simply followed links published on the website.

perhaps we need an RFC that defines this type of approach (pages "secured" behind easily guessable urls) as public information.

That actually seems like a good idea to me. I wonder if there isn't already one? It'd be a good symbolic source of authority on issues like this.

There could also be other RFCs covering our usage of the internet, and our expectations of what our rights are as internet users. Or perhaps stick that all in one "definitive" RFC.

Instead of actual security why not have a spec for /humans.txt which can say things like "Please don't read anything in the /secret directory."

and what would that achieve? all it's going to do is force companies to add legal boilerplate (eg. those "this message is intended for the recipient only..." that you see in email signatures) to every imaginable place to cover their ass, meanwhile doing nothing to improve security.

Talk about no sense of humour.

As if the brilliant minds behind this website would even know what an RFC is.

The RFC would be more for future "legal" defense around this type of issue to use as evidence for support of enumerable urls = public api.

What is it about headlines involving charges that they love to focus on “facing prison”? At the very least this should indicate the RANGE of punishments, and there is a hell of a range.

From the article, he’s been charged with “unauthorized use of a computer”. IANAL but there would seem to be at least two possible interpretations of this charge [1], and the “Summary Election” variant has a MAXIMUM punishment of $5000 fine or 6 months. The other interpretation “Indictable Election” is a maximum of 10 years.

As with any case, details matter. Judges aren’t just sending every hacker to prison for 10 years. He may be judged not guilty (evil intent must be proven; then there’s his age, etc.), or given a way, way, way smaller punishment than this “prison” he “faces”.

[1] http://criminalnotebook.ca/index.php/Unauthorized_Use_of_Com...

Remember Aaron.

Aaron Swartz's situation is substantially different. Swartz knowingly violated the terms of service of JSTOR, and deliberately circumvented its rate limiting. And he knew what he was doing was against the law; he even published a manifesto outlining his intentions to do this as a form of civil disobedience.

The kid in this story just incremented sequential IDs on what was supposed to be public information.

>He estimates he has around 30 terabytes of online data on hard drives in his home, the equivalent of "millions" of web pages.


This is pretty disgusting. The provincial government should be absolutely ashamed of themselves.

No one will face prison for making these documents public in the first place, I'm sure.

what part of _freedom of information portal_ do they not understand?

> Around the same time, his Grade 3 class adopted an animal at a shelter, receiving an electronic adoption certificate.

> "The website had a number at the end, and I was able to change the last digit of the number to a different number and was able to see a certificate for someone else's animal that they adopted," he said. "I thought that was interesting."

He's like, 20 years ahead of his classmates.

It is really funny to see people comparing downloading a file to breaking in through an unlocked window.

You guys don't really have a clue what the Internet is.

It’s more like taking a photo of a public bulletin board with a hundred posts on it, where three of the posts contain private information and so shouldn’t have been posted.

Anyone could have viewed the posts on the board one by one; he just copied them all at once for later viewing.

I work in a US city government doing government transparency work. I have a title that can get people on the phone. What can I do to help?

The levels of security:

1) Formal methods and cryptography

2) Obfuscation

3) Litigation

Looks like they chose the last one.

After figuring out that they screwed up, government agents should have politely visited the teen, interviewed him, gone through his computer together and deleted the compromised files.

Then quietly fix the vulnerability.

Instead they produced a PR disaster by disrupting the lives of a law-abiding family.

These are signs of arrogance and incompetence of decision makers at that government department.

It isn't a breach if the administrators of the repository failed to secure the information. Regardless of the likelihood of conviction it is reprehensible to terrify some kid with the threat of losing his freedom as a means of saving face, which is what this appears to be. I certainly hope he gets out of this.

I assume it was the large number of requests over a short period from a single IP that drew their eye to it?

I wonder how many other people found the same thing and slurped this data down in a more circumspect way before this kid was kind enough to expose this privacy breach for us?

This is very sad. Any information that is not secured, and thus CAN be accessed, should be considered publicly available. If that rule or precedent were in place (a law would never happen) it might force system owners to be more cautious.

>At the family's request, CBC News is granting him anonymity because of his hope the charge will be dropped and his reputation preserved.

The only thing about this article that didn't irritate me.

This is shockingly heavy handed. Would they try and extradite Sundar Pichai if Google's crawler happened to index the pages?

Teenager facing prison for looking at poster the government (mistakenly) put up...

Mandatory xkcd: https://xkcd.com/932/

I can give a very anecdotal example. Where I live, all the doors in the flat look the same; they just have small numbers over them. The exit door is the same, just without a number, and it is next to my flatmate's door. Because I was in a hurry, I accidentally entered my flatmate's room when I intended to leave the flat. This is more or less the same.

I think people would be less upset about the teenager going to jail if the chain of highly paid Peter principle executives responsible for the files being accessible would also have to face any investigation at all.

From the article: "When he was around eight, he remembered playing around with the HTML of the Google search page, making the coloured letters spell out his name."

Isn't the Google logo an image? Smells a bit fishy to me.

I make that about 11 years ago - the page was rather different then.

Sounds an awful lot like the Computer Fraud and Abuse Act.

What an age we live in, utterly depressing.

American culture strikes again.

I find his story of "archiving the Internet" extremely amusing. Good luck with that defense. He was 19 at the time and knew exactly what he was doing - or should I say "archiving".

He estimates he has around 30 terabytes of online data on hard drives in his home, the equivalent of "millions" of web pages. He usually copies online forums such as 4chan and Reddit, where posts are either quickly erased or can become difficult to locate.

"I preserve things, I archive the internet. I have history on my computer, and all of that should be saved and preserved," he said.

You are not providing any argument for why he should not do that.
