Germany's laws on github, machine-readable and ready to be forked (github.com)
325 points by jaseg on Aug 8, 2012 | 73 comments

There was a great discussion on HN previously about this topic which also explains why a straight git implementation isn't viable for US law; I'm not sure if doing the same with Germany's laws would be similarly difficult:


As discussed in the linked HN discussion, VCS/diffs are not compatible with the established workflow for discussing and changing laws. However, as far as I understand, this repository is not primarily intended as a tool for supporting lawmakers but serves two purposes:

a) It allows the public to track all changes made to a law.

b) It allows NGOs and other parties to suggest changes to a law by forking the repo and sending a pull request. [1]

In summary: no revolutionary shift but a nice tool.

[1] https://github.com/bundestag/gesetze/pull/2/files

This is about as revolutionary as officially letting the plebs read the bible.

Pretty damn revolutionary.

Anyone can already read the laws, though. I'm less sure that reading them specifically via git is a major revolution. It's possible something compelling will be built on top of it, I'll admit.

For the U.S. code, something like Cornell's LII interface, which for a long time has displayed both the current version of the law and, for any section, the history of amendments to that section, seems more user-friendly than a git repo: http://www.law.cornell.edu/uscode/text

>I'm less sure that reading them specifically via git is a major revolution.

You're very right. The hard part is knowing where to look in the documents, not where to go to find the documents.

In a former life I spent a bit of time with my nose in the CFR. The hard part was finding which regulations pertained to me; googling for "47 cfr 73.3526" was a cakewalk.

Thanks for the link. I read the following from that earlier discussion and immediately thought, isn't this what the Darcs version control system was designed to do?

"What you have to do is to just record the conflict and create two parallel universes, one in which the conflict has been resolved using branch A and another one in which the conflict is resolved using branch B. You then keep these two universes alive and apply all the later changes twice. You have to do this until a judge or a legal body declares one of the "branches" the correct one; this may take years and the decision reverted (even partially) many times."

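A minimal sketch of the "parallel universes" bookkeeping described in the quoted passage, assuming a law can be modelled as a mapping from section id to text. The names `Universe`, `fork_on_conflict`, `apply_change` and `resolve` are hypothetical and are not taken from Darcs or git.

```python
from dataclasses import dataclass, field


@dataclass
class Universe:
    """One candidate reading of the law: section id -> text."""
    label: str
    sections: dict = field(default_factory=dict)


def fork_on_conflict(base, section, reading_a, reading_b):
    """Record an unresolved conflict by creating one universe per reading."""
    a = Universe(base.label + "/A", {**base.sections, section: reading_a})
    b = Universe(base.label + "/B", {**base.sections, section: reading_b})
    return [a, b]


def apply_change(universes, section, new_text):
    """Apply a later amendment to every universe that is still alive."""
    for universe in universes:
        universe.sections[section] = new_text


def resolve(universes, winning_label):
    """A court picks one reading; only that universe is returned."""
    return [u for u in universes if u.label == winning_label]


base = Universe("law", {"s1": "original text of section 1"})
alive = fork_on_conflict(base, "s1", "reading from branch A", "reading from branch B")
apply_change(alive, "s2", "a later, unrelated amendment")
alive = resolve(alive, "law/B")
```

Since the quote notes that a decision may be reverted many times, a real system would presumably archive the losing universe rather than drop it as `resolve` does here.
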
Isn't US law not public-domain anyway? Or at least on a state by state basis? I remember a copyright issue concerning someone reprinting some of Oregon's laws.

The law is non-copyrightable. However a lot of state laws say "You will build your building according to Document X by Corporation Y" and that is copyrighted and not given out without a lot of money. The reason why these exist is that lawmakers aren't experts in, say, concrete; and those who are tend to be employed privately, and make industrial standards. As an example consider specifying that a program will be written in C (a bad idea here, but saying "structural members for buildings shall be steel as defined by this standard" isn't a terrible idea). C is defined by the relevant ANSI or ISO standard; now if you want to know what the law actually says, you have to pony up the money for an official standard.

I don't know what settled law is about this, but it is at the very least a morally questionable activity.

The law is fairly settled, however the Supremes have not said anything on the matter yet. The Veeck decision is the best answer as of now and Veeck seems to be pretty clear:

"The issue in this en banc case is the extent to which a private organization may assert copyright protection for its model codes, after the models have been adopted by a legislative body and become "the law". Specifically, may a code-writing organization prevent a website operator from posting the text of a model code where the code is identified simply as the building code of a city that enacted the model code as law? Our short answer is that as law, the model codes enter the public domain and are not subject to the copyright holder's exclusive prerogatives." (Veeck v. SOUTHERN BLDG. CODE CONGRESS INTERN., 293 F. 3d 791 - Court of Appeals, 5th Circuit 2002)[1]

You do not need to pony up the money if Carl Malamud already has a copy of the code. If you have purchased one of these codes and are done with the hardcopy get in touch with Malamud and see if he wants it.

Cory Doctorow explains Carl Malamud's efforts after the Veeck decision: http://boingboing.net/2012/03/19/liberating-americas-secret....

May god continue to bless Mr. Malamud and all of the great work he has done...

[1] http://scholar.google.com/scholar_case?case=6755260615473645...

The scary consequence could be that eventually there will be laws prescribing certain programming techniques for software projects.

In Germany I think it is almost already the case, because if your software project goes awry, the judge will want to know that you used current best practices of software development. I only picked that up in passing, though, reading about a case with another focus. But it scares me, because I don't necessarily agree with all current "best practices". Imagine being sued because you didn't include 99% test coverage. Or worse, because you didn't use Java.

Requiring a certain degree of test coverage by law sounds like a good idea for life-critical systems.

So people will write some crappy tests to satisfy the law. I don't think much would be gained. But consultants would be enabled to earn more money, that is true.

That doesn't make any sense. How can laws not be in the public domain? Can any lawyers around comment?

IANAL but you are right from the basic premise:

“If a Law Isn't Public, It Isn't a Law”—Justice Stephen Breyer

But I believe the distinction here is that just because the law is public, that doesn't mean that you can call up anyone who has a copy of the law (including the organization who wrote the law) and say "Hey, give me a copy of the law."

So orgs like Malamud's get copies of the law and provide them freely to all:


Merges definitely seem like they'd be difficult if there are two concurrent changes to a statute, but that's always going to be the case (and is the case with or without source control).

I think a lot of the initial focus/discussion here is about machine-readable interpretation and management of laws themselves, and hence the challenges of turning gray into black and white - but I think that having a historical, digital record of changes in laws -- especially if clear attributions to individuals or groups can be made -- could be just as significant, if not more so.

Imagine having a full historical record of legal changes across and within a nation, and the data measurements to back up the effects of those changes. The results could be linked back to the individuals/groups involved in drafting the laws - data-based legal review.

Since Germany has a civil law instead of a common law system, this might be easier. (But I don't know, really.)

Not really. Common law systems still have statutes that are compiled and amended. The "common law" portion of the system just means that court rulings are (potentially) binding upon other courts.

The difference between civil and common law is an elusive thing. On the surface it is enormous, but the deeper you dig, the less you find.

Sometimes it's claimed it's about precedent, and indeed some civil law jurisdictions claim that they do not believe in binding precedent. But of course for a legal system to be at all useful, decisions need to be consistent, and the idea that you can achieve consistency by writing every detail in a code so that every decision logically follows from the code is bullshit; if that were the case, all civil lawsuits would be 100% predictable and therefore rational actors would settle them and the judges could all go on vacation.

The reasoning I once read in some Dutch first-year law course notes was along the lines of "we don't do _stare decisis_, but we support equal treatment in equal cases, and it would be unequal treatment to treat you in way X when we treated the other guy in way Y, so we're going to follow precedent, but not because we must follow precedent, but only to avoid unequal treatment." I suppose that it is true that digging up absolutely ancient judgments is a little bit less convincing in a civil law setting, especially if there are periodic recodifications so that you can simply toss away a 17th-century precedent by saying it was an interpretation of the old code, not the new one.

Some say the difference is codification, but as you point out, not all statutes in common law jurisdictions are just piles of unrelated acts: a lot of the time, they are organized as systematic codes that are amended just like civil law codes. And besides, civil law countries have uncodified case law, too. The section on torts in the French civil code, for instance, is incredibly terse, saying basically that if you unfairly harm somebody you must compensate them. But of course France has tort law just like England does. Interestingly, since French court decisions do not normally provide much reasoning aside from quoting sections of the codes, the details of that tort law get elaborated mostly by law professors in books and articles; but in other civil law jurisdictions, like Germany, judges write long, reasoned decisions just like in the US (except more stilted and formulaic in style). And Scotland, considered traditionally a civil law country, has lots of English-style uncodified legislation.

So maybe then it's the Roman basis? Nah. English law had lots of Roman influence, too, and continental law had lots of influence from local customary law, canon law, and the law merchant. (Just read Berman to find out the details.) Maybe the continentals were bigger on pretending that it was all Corpus Iuris Civilis all the time, but nobody ever really believed that.

So then what? Sometimes you hear particular doctrines called out as being significantly different, like consideration in common-law contract law as opposed to the intention to be bound in the civil law of obligations. But the consideration rule has so many exceptions that if you can reasonably be thought to have intended to be bound, you'd better know the law very precisely if you still want to get out from under things based on lack of consideration. Besides, consideration may not be required in the civil law, but a payment can serve as evidence of a nonwritten contract.

That's not to say that there are no differences, but it's hard to pin down anything that really applies in all civil law jurisdictions and no common law jurisdictions or vice versa. Notaries, I think, are a pretty consistent difference, although they don't exist in some Asian civil-law jurisdictions.

Sorry for the semi-OT humour, but this is one of the very few times where you can actually fork with the law and come out on top :).

Sadly, though, a lot of laws can be hard to understand because of amendments and wordsmith perversions, so it would be nice if there were some universal way to express laws, one that could take any law in any country and express it. That would be impressive, though hard to do. The only comparison would be picture-based traffic signs, which are about as close to universal as laws can get.

It will be nice when all the countries have their laws up in such a way. It will make grepping a lot more fun and will probably be the birth of lgrep (law-grep).

The closest thing to a universal expression of law is probably English. In older times, it would have been... Latin?

The problem with translating law is that a lot of the time words are used as "terms of art" that have a special meaning based on tradition or, worse, precedent: some court at some point was forced to decide on the meaning of some very fuzzy word, they came down one way, and now the very fuzzy word has a very precise meaning and lawyers like to use it precisely because it has a precise meaning!

That sort of stuff easily gets lost in translation, which is why legal translation is such a pain to do. And probably a good part of why it took the English courts so long to switch from Law French (an old dialect of Norman French long used for English legal writing) to English.

They should definitely do this for bills also, so you can easily see who has incorporated what into each bill, and how the bill is evolving as it happens

In Poland we have a project called "Sejmometr" (Parliament-o-meter) [1]. It's a website with nearly up-to-date information about the parliament's work: bills, votes, speeches. It even has a JSON API [2].

[1] http://sejmometr.pl

[2] http://sejmometr.pl/api/dokumenty

Agreed. Every proposed change and amendment should be attributed to whichever legislator requested it so we can see who's inserting all these crazy clauses and loopholes.

Awesome stuff. I couldn't find anywhere though: is this an 'official' project, or is it just someone who processed the XML forms into markdown?

Also, this: "All German citizens can easily find an up-to-date version of their laws online."

And it's only 130 megs of markdown when zipped (246 unzipped)! A mere 4,737,628 lines[1]! Surely you have time to read it, right? And therefore be a well-informed, law-abiding citizen?

I wonder how big America's would be :|

  [1] `wc -l $(find . -name '*.md')` admittedly very rough

This is not an official project, "just" someone who registered the organization "Bundesregierung" ("Federal Government") at github and processed the official XMLs into markdown.

Yeah, sorry - 'just' is a relative term :) Didn't mean to belittle the act, it's still a big and interesting project.

Nitpick: not "Bundesregierung" (Government) but "Bundestag" (House of Representatives).

My guess is that about 129 megs of those are tax laws.

I think this is where "German citizens can easily find an up-to-date version of their laws online" :


It's not. You still have to have the complete text of the law and manually add the changes that are described in the Bundesanzeiger. It states, for example, something like "Change in BGB: Paragraph 123, section 45, change the word 'and' to 'and/or'". Only completely new texts would be printed in full.

The Bundesanzeiger is merely the last step in the lawmaking process. First, the two lawmaking institutions (Bundestag, Bundesrat) have to vote in favor of the law, and the president has to put his X under it. Only after the law is published in the Bundesanzeiger does it take effect.
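
To make the "manually add the changes" step concrete, here is a rough sketch of applying one amendment instruction of the kind quoted above to a consolidated text. Everything in it is an assumption made for illustration: the `Amendment` fields, the `(law, paragraph, section)` key, and the sample sentence are invented, and real amendment instructions are prose that would first have to be parsed.

```python
from dataclasses import dataclass


@dataclass
class Amendment:
    law: str        # e.g. "BGB"
    paragraph: int  # hypothetical numbering, as in the quoted example
    section: int
    old: str        # wording to be replaced
    new: str        # replacement wording


def apply_amendment(laws, amendment):
    """Return a copy of `laws` with one amendment applied to the addressed section."""
    key = (amendment.law, amendment.paragraph, amendment.section)
    text = laws[key]
    if amendment.old not in text:
        raise ValueError(f"{key}: wording to replace not found")
    updated = dict(laws)
    updated[key] = text.replace(amendment.old, amendment.new, 1)
    return updated


# Placeholder text, not the wording of any real statute.
laws = {("BGB", 123, 45): "The buyer and the seller may withdraw."}
change = Amendment("BGB", 123, 45, old=" and ", new=" and/or ")
consolidated = apply_amendment(laws, change)
```

Raising an error when the old wording is missing is a deliberate choice in this sketch: an instruction that no longer matches the text is exactly the situation where a human has to look.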

Does the government publish a version where the deltas are already applied?

The concept is great, especially seeing the specific contributions legislators make (of course in this example all commits are coming from one guy, so not so useful here). I'd love to see this advance, as well as seeing more developments in semantic markup of laws (think: 'Siri, how do I get out of this ticket?'). Not to mention just better avenues for laypeople to educate themselves on the law. I find it a little ridiculous that the legal system, which pretty much runs our lives, is so complex that it requires an industry of some of the most highly paid people in our society to interact with it. The whole thing is ripe for hacking IMO.

I do not know about Germany, but in my country the issue is that "An update to the law X" may introduce changes not just to parts of X but to parts of laws Y and Z too. Or it may introduce completely new regulations that are not part of any existing law text.

This is why it is hard to maintain current versions of X, Y, or Z under version control.

It is also common to have laws X and Y both applying in the same context, and sometimes it is not clear which one is newer or how to apply "An update to X".

It is a little easier to work at a finer grain, in terms of sections and articles rather than the law text as a whole, but this makes it a lot less official.

Isn't that precisely taken into account by patches, hunks, and branches?

The readme suggests that he only gets a copy of the final laws. That means he can diff two versions of a law, but more work would need to be done to say that a change to two different laws belongs in the same patch. Apparently that data is not machine readable.
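
A small sketch of what can be done with final texts alone, assuming two snapshots are available as mappings from file name to text (the file names and contents below are made up). It collects every law that differs between the snapshots into one combined patch. It cannot tell whether those changes came from the same amending act, which is the missing piece described above.

```python
import difflib


def changed_laws(old_snapshot, new_snapshot):
    """Return {law name: unified diff lines} for every law whose text differs."""
    changes = {}
    for name in sorted(set(old_snapshot) | set(new_snapshot)):
        before = old_snapshot.get(name, "").splitlines()
        after = new_snapshot.get(name, "").splitlines()
        if before != after:
            changes[name] = list(difflib.unified_diff(
                before, after,
                fromfile=f"a/{name}", tofile=f"b/{name}", lineterm=""))
    return changes


old = {"law_x.md": "Section 1\nold wording", "law_y.md": "Section 1\nunchanged"}
new = {"law_x.md": "Section 1\nnew wording", "law_y.md": "Section 1\nunchanged"}
patch = changed_laws(old, new)

for name, lines in patch.items():
    print("\n".join(lines))
```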

Gesetze sind Prosa, sie enthalten keine maschinenlesbare Semantik. (Laws are prose, they contain no machine readable semantics)

And there was me hoping they had fixed that! Ahwell, one step at a time :)

Rewriting law to be machine-readable, as in "instructions a machine can understand", would be quite a task, I suppose. Interesting project though. Would that even be possible? How well does the law map onto the black-and-white logic of a computer?

There's a large area of legal philosophy around those kinds of questions. Pragmatism vs. formalism in judging is one split that's sometimes identified, with "formalism" being closer to a view that the law is a precise set of procedures that must be followed mechanically, and "pragmatism" closer to the view that the law is a set of principles that must be applied using common sense to reach equitable outcomes. Lots of other positions as well, around that "what is law, anyway" question.

There are some attempts to formalize something like the pragmatic view, too (oddly enough), in artificial intelligence "legal argumentation" systems, which try to model the back-and-forth of adversarial legal systems, determining when to bring up an argument, how to counter an argument, etc.

Sometimes, I fantasize about turning that around. What if parts of the law (say, fiscal law sounds like a candidate) are chosen to be as black and white as the logic of a computer?

Policymakers could get automated compile errors when trying to craft conflicting laws. They could instantly compute what the effect of a law change is on this and that demographic or persona.
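
As a toy illustration of such a "compile error", here is a sketch that assumes fiscal rules can be reduced to income ranges with a prescribed rate. The `Rule` type, the brackets and the rates are all invented for the example. Two rules conflict when their ranges overlap but their rates differ.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rule:
    name: str
    low: int     # lower income bound, inclusive
    high: int    # upper income bound, exclusive
    rate: float  # tax rate prescribed for that range


def conflicts(rules):
    """Return name pairs of rules that overlap in range but prescribe different rates."""
    found = []
    for a, b in combinations(rules, 2):
        overlap = a.low < b.high and b.low < a.high
        if overlap and a.rate != b.rate:
            found.append((a.name, b.name))
    return found


rules = [
    Rule("bracket 1", 0, 10_000, 0.0),
    Rule("bracket 2", 10_000, 50_000, 0.2),
    Rule("amendment", 40_000, 60_000, 0.3),
]
errors = conflicts(rules)
print(errors)
```

Real statutes are nowhere near this regular, so this only shows the shape of the check, not a workable method.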

There are serious problems with trying to make laws black and white. A certain amount of flexibility is frequently a good thing. Let the judge's knowledge/judgement decide. An example of problems with the alternative is 3 strikes laws.

> automated compile errors when trying to craft conflicting laws

That just made my day :)

It's actually a good idea, I think. I'm just not exactly sure whether that would work because laws govern the real world and in the real world, logic isn't binary.

However, such laws would at least be understandable for mere programmers ;)

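    data Offense = MensRea | ActusReus
    data Punishment; data Circumstance; data ExtenuatingEffect  -- placeholder types
    type CriminalLaw = [Offense] -> Punishment
    extenuatingCircumstances :: CriminalLaw -> [Circumstance] -> ExtenuatingEffect -> Punishment
    extenuatingCircumstances = undefined  -- body intentionally left open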

Logic in computers doesn't have to be binary either. There are plenty of implementations of fuzzy logic, including some programming languages.
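
For anyone who hasn't seen it, a minimal sketch of fuzzy logic using the standard min/max (Zadeh) operators, where truth values are degrees between 0 and 1. The speeding example, the tolerance and the wet-road value are made up purely for illustration.

```python
def fuzzy_and(a, b):
    """Conjunction: the weaker of the two degrees."""
    return min(a, b)


def fuzzy_or(a, b):
    """Disjunction: the stronger of the two degrees."""
    return max(a, b)


def fuzzy_not(a):
    """Negation: the complement of the degree."""
    return 1.0 - a


def speeding_degree(speed, limit, tolerance):
    """Degree in [0, 1] to which `speed` counts as speeding.

    0 at or below the limit, 1 at limit + tolerance or above, linear in between.
    """
    return min(1.0, max(0.0, (speed - limit) / tolerance))


excess = speeding_degree(55, limit=50, tolerance=10)
wet_road = 0.8
reckless = fuzzy_and(excess, wet_road)
print(excess, reckless)
```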

Someone should come out with an artificial intelligence app, which reads through the laws and past history of cases and helps lawyers build cases. :-)

If only they'd accept pull requests.

"You are encouraged to open pull request. Of course only valid legistation voted on by the Bundestag will be merged."

> Of course only valid legistation voted on by the Bundestag will be merged.

And I was excited for a moment...

Germany is killing it. Their economy is roaring, they've set very strong renewable energy goals and are acting on them, and to top it all off, they protect their civil liberties and are a relatively benevolent nation.

I find it fascinating that for a repository of German law, written in German, the README and commit history are all in English. I wonder if that will have the effect of scaring off any would-be contributors.

The README has a German section right above the English one.

Ah, right you are! Didn't notice it thanks to the anchor link.

Doesn't change the commit history's melange of English and German.

All Germans I have ever met have had excellent English. Certainly good enough to read short English commits with only the occasional dictionary reference required.

The Eurobarometer report from 2006 says 56% of Germans speak English.

But I suspect that rather underestimates the case here. There is a huge difference between someone checking a box on a form that says they can speak English and being able to parse short messages in English.

And given only fairly educated, tech-savvy Germans are likely to participate in this, I think the negative effect from English commits is straight up zero, or at worst incredibly low.

Source: http://en.wikipedia.org/wiki/List_of_countries_by_English-sp...

It's been a few years since I visited Germany, but it varied quite a bit with region, level of education, and age (pretty universal among younger Gymnasiasten, very rare among older people in former East Germany, particularly more rural parts).

I have gotten reactions from "Why did you bother to learn German? Everyone here speaks English" to "Gott sei dank! Du kannst Deutsch" ("Thank god! You can speak German")

I second that.

I'd like my government to do the same! Don't really care about the versioning system, as long as it's open-source and alive.

Actually, it is not "the government" doing this. It is somebody scraping an official web site containing every law, processing it and pushing the result to this repository.

Start with the IP-related laws. They seem to have some of the most aggressive ones in the world, which could explain why the Pirate Party there is also the fastest growing branch.

I doubt it. Germans started getting sued for illegal downloads right when DSL grew popular (a decade ago), without any political response. We also have no software patents; we can still (CMIIW) freely share copyrighted works with friends, just not with the public; we can crack what wouldn't otherwise run.

The only part that is so terrible and draconian that everybody knows about it is the GEMA, which is concerned with music and royalties.

This is not completely true. You can share copyrighted materials but you are not allowed to bypass or crack any security measures to do so. Also, the information on cracking a protection system, like ripping a DVD or an LP, may not be discussed publicly. Tutorials on these topics are illegal. The same is true for software products that would allow network intrusions. Wireshark is illegal, as well.

That is all true. I probably live too deep within my bubble of DRM-free media :)

Yeah, and now the GEMA wants kindergartens to pay fees for singing children's songs that are in the GEMA's portfolio. Ridiculous!

Wouldn't it make sense to give each sentence its own line, and use double linefeeds to demarcate paragraphs? It would make diffs much easier to read.
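
A rough sketch of that reflow, assuming a naive sentence splitter is good enough to show the effect. The sample sentences are invented, and the regex is an assumption that would break on legal abbreviations such as "Abs." or "Nr.".

```python
import difflib
import re


def one_sentence_per_line(text):
    """Put each sentence on its own line; keep a blank line between paragraphs."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    reflowed = []
    for paragraph in paragraphs:
        flat = " ".join(paragraph.split())
        # Naive split: whitespace that follows '.', '!' or '?'.
        reflowed.append("\n".join(re.split(r"(?<=[.!?])\s+", flat)))
    return "\n\n".join(reflowed)


old = one_sentence_per_line(
    "The fee is ten euros. It is due on receipt. Late payment incurs interest.")
new = one_sentence_per_line(
    "The fee is ten euros. It is due within a week. Late payment incurs interest.")
diff = list(difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="", n=0))
print("\n".join(diff))
```

With one sentence per line, the diff touches only the sentence that changed instead of the whole paragraph.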

The Python scripts that generate the repository from the XML files freely available from the German government are also on GitHub: https://github.com/bundestag/gesetze-tools So, go ahead, hack! ;)

Applying versioning to laws is a fantastic idea, if only to make an easily accessible account of legislation's evolution.

Amazing idea, which should be implemented by all sorts of governments around the world!

Great idea! I can see how this could apply to many fields other than law.

Where is the compiler for this?

this is the best gravatar I've ever seen

we are living in the future

What sucks is this would never happen in the US. The reason being that most of the earmarks and so forth are added to bills after they're passed. And not only that, they're added as images in tiny fonts so that they can't even be scanned in.

God bless America.

Unfortunately I do not remember the link to the GPO page on XML or the XML page on thomas.loc.gov. However, there are an awful lot of government documents in XML format. What specifically are you missing? For an interesting site that is using some of the new federal technology openness initiatives (for lack of a better term), take a look at: http://www.govtrack.us/ . Carl Malamud also has an extensive collection of machine-readable government documents, but sadly http://bulk.resource.org/ and http://public.resource.org/ seem to be down.


Gov XML initiative: http://xml.gov/

House XML Initiative: http://xml.house.gov/

Code of Federal Regulations (XML format): http://www.gpo.gov/fdsys/bulkdata/CFR e.g. CFR Title 12 (regulation of banks; 12 CFR 30 covers customer information security at banks, if you are interested): http://www.gpo.gov/fdsys/pkg/CFR-2012-title12-vol1/xml/CFR-2...

GovTrack's About / Data Sources page: http://www.govtrack.us/about

EDITED: To include links to the Code of Federal Regulations

You mean except for this? https://github.com/divegeek/uscode

Honest question, did you even search for reasons that this would be feasible or did you just rush in to make an anti-US comment?

No part of your comment is true.
