
The entire US Code is now online in XML - liscovich
http://uscodebeta.house.gov/download/download.shtml
======
vog
I find it amusing that here in Germany, we have had that for years:

[http://www.gesetze-im-internet.de/](http://www.gesetze-im-internet.de/)

All laws are available in XML, HTML, PDF, etc. The site also provides an RSS
feed.

In addition, some enthusiasts regularly download the laws from there and commit
them to a Git repository:

[https://github.com/bundestag/gesetze](https://github.com/bundestag/gesetze)

That way, this repository contains not only the current laws, but also the
history of how the laws developed!

For the Git repository, the XML version is not used directly, but converted to
markdown. This produces very readable diffs:

[https://github.com/bundestag/gesetze/commit/f90e8fc8eb20f081...](https://github.com/bundestag/gesetze/commit/f90e8fc8eb20f08173e608f493e15f986d7e43ba)

Wouldn't it be cool if we could finally manage our laws by filing pull
requests?

~~~
gsnedders
Likewise in the UK:

[http://www.legislation.gov.uk/developer/formats/](http://www.legislation.gov.uk/developer/formats/)

XML, HTML, RDF/XML for everything, as well as browsable online (often with
original source PDFs of the printed laws), with both online and RDF/XML
representations showing all alterations to the law (with date, cross-reference
to the Act that made the amendment, etc.).

The website is one of the big success stories of RDF, IMO, as it is all based
around a model of the laws in RDF, with everything else just being varying
serializations thereof. It also allows the website to show what has been
amended — no need for dumping stuff to GitHub and then diffing it! (e.g., see
the annotation on 28(1)c in
[http://www.legislation.gov.uk/ukpga/1998/29/part/IV](http://www.legislation.gov.uk/ukpga/1998/29/part/IV))

That all said, there tends to be a delay between the PDF being uploaded and
everything being marked up and entered into the RDF database (see the "new
legislation" on the home page).

------
pnathan
I'm really tempted to collect the XML files and put them on github, with
periodic checkpoints to update it with the latest.

Watching the evolution of law over time is a fascinating thing and using SW
engineering tools to help would be really fun.

~~~
sc68cal
This is currently done via scraping:

[https://github.com/divegeek/uscode](https://github.com/divegeek/uscode)

The diffs are huge.

~~~
DannyBee
Remember that diff is an algorithm to generate the smallest set of operations
to produce version B from version A, _not_ an accurate reconstruction of what
happened. Diff algorithms are also often tuned not to try as hard to find the
smallest set of changes for larger documents, due to speed concerns.
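
A minimal sketch of that point, using Python's standard `difflib` on made-up text (the three subsections below are invented for illustration). A subsection that was simply moved is reported as a deletion plus an insertion, because the diff only describes one way to get from A to B:

```python
import difflib

# Hypothetical "before" and "after" versions of a short statute.
version_a = [
    "(a) Definitions.",
    "(b) Penalties.",
    "(c) Exemptions.",
]
# The drafter moved (a) to the end; nothing was rewritten.
version_b = [
    "(b) Penalties.",
    "(c) Exemptions.",
    "(a) Definitions.",
]

def summarize(a, b):
    """Return the non-equal opcodes difflib uses to turn a into b."""
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]

changes = summarize(version_a, version_b)
for tag, i1, i2, j1, j2 in changes:
    # Each opcode is an edit instruction, not a record of what the drafter did.
    print(tag, version_a[i1:i2], "->", version_b[j1:j2])
```

Nothing in that output says "moved"; recovering the intent takes information the two texts alone don't carry.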

~~~
bliker
I did some work & research on diffs when I tried to visualise the progression
of Slovak law. My best attempt was a diff method that would understand the
inner structure of the law. I ended up with a simple draft, but I am sure
somebody more competent could look into that.
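
One way such a structure-aware diff could look, as a rough sketch only (this is not the method described above, and the section identifiers and texts are invented): compare the law section by section, keyed on the section identifier, instead of line by line.

```python
def diff_sections(old, new):
    """Compare two {section_id: text} mappings and classify each section.

    Unchanged sections are left out of the report.
    """
    report = {}
    for key in sorted(old.keys() | new.keys()):
        if key not in new:
            report[key] = "removed"
        elif key not in old:
            report[key] = "added"
        elif old[key] != new[key]:
            report[key] = "changed"
    return report

# Hypothetical versions of a law, already split into sections.
old_law = {"1": "Short title.", "2": "Fine of 100.", "3": "Repealed provisions."}
new_law = {"1": "Short title.", "2": "Fine of 200.", "4": "Transitional rules."}

report = diff_sections(old_law, new_law)
print(report)
```

The hard part in practice is the step this sketch assumes away: reliably splitting the text into identified sections, and tracking sections that get renumbered.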

~~~
DannyBee
At least in the US, a lot of the laws that get passed are in the form of
diffs.

That is, the law that they enact says "This law is to do blah blah blah.

Subsection 1373(a) of the US code is replaced with the following text 'blah
blah blah'"

The wording used is pretty standard. So you can actually parse it in most
cases to see what the actual changes are.
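
A minimal sketch of that kind of parsing. The sentence pattern below is one simplified phrasing assumed for illustration, and the title number and statute text are made up; real bills use many more variants than a single regular expression can cover.

```python
import re

# Assumed, simplified amendment phrasing; real drafting language varies.
AMENDMENT = re.compile(
    r'Section (?P<section>[\w()]+) of title (?P<title>\d+), United States Code, '
    r'is amended by striking "(?P<old>[^"]+)" and inserting "(?P<new>[^"]+)"'
)

def parse_amendment(sentence):
    """Return the parts of an amendment instruction, or None if it doesn't match."""
    match = AMENDMENT.search(sentence)
    return match.groupdict() if match else None

def apply_amendment(text, amendment):
    """Apply a strike-and-insert instruction to the first occurrence in text."""
    return text.replace(amendment["old"], amendment["new"], 1)

# Invented example sentence and invented current text.
sentence = ('Section 1373(a) of title 8, United States Code, is amended by '
            'striking "30 days" and inserting "60 days".')
amendment = parse_amendment(sentence)
current = "A report shall be filed within 30 days."
print(amendment)
print(apply_amendment(current, amendment))
```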

------
OldSchool
Caveat for many of us overly rational thinkers: the powers that be
deliberately are allowed to 'interpret' this code nondeterministically by many
different means including its 'spirit,' admissibility of relevant information,
manipulation of venue and participants, apparently even extrajudicial
proceedings lately.

In short, that allows a lawyer to answer almost any question with "it
depends," and start billing.

~~~
lobotryas
You make this sound like a bad thing. Would you really prefer "black and
white" laws that leave zero room for flexibility or interpretation in light of
a given situation?

~~~
jamieb
Yes, we should have black and white laws, and unit tests for them. So someone
could write a unit test for "Will this allow the NSA to create a secret court
that is outside the rule of law?" or "Does accessing a university computer
that has an open access policy allow for a sentence of 130 years?"

I believe we'd see a lot simpler laws.

~~~
rmc
I think that's impossible with the current English language, and the fact that
human beings disagree about lots of things.

~~~
minor_nitwit
you wouldn't write in English, you'd write it in Lisp.

~~~
liscovich
Do you know of any examples of actual legal documents written in Lisp, or some
other formal language?

~~~
minor_nitwit
haha, no. It was meant as a quasi-joke. I'd like to see one though.

Certain laws are very cut-and-dry (speeding for instance) and perhaps laws
could be proven on a functional basis.

You could even make it axiomatic from the constitution and declaration of
independence.

Of course, you'd need to define the axiomatic meaning of things like 'the
right of the people to keep and bear arms shall not be infringed' - which is
hard to do even with such simple language.

And then after loading the axioms, you'd spend a lifetime going through errors
in the existing laws.

~~~
rmc
_Certain laws are very cut-and-dry (speeding for instance)_

Ha! No.

* do you want to give an exemption for speeding for, say, ambulances? So how do you define "ambulance"? Do you have to be registered to drive an ambulance? What defines the "duty of an ambulance" (or are they allowed to speed no matter what). What about a van carrying an organ to be donated?

* What if I have my sick child in the car and I'm rushing them to the hospital because it's faster than waiting for an ambulance? Is it right that I can be arrested and convicted for that? I think that would be a perversion of the spirit of any just law.

* What if I'm being chased by a maniac relative who wants to kill me? Can I speed then to drive away from them? Do I have to believe my life is in danger? How do you define that?

* Let's pretend I'm driving along and some other vehicle is about to move into my lane and crash into me because they are stupid and don't see me, and let's pretend that if I speed up a little bit (over the speed limit) I am able to get in front of them and avoid an accident. Should I be convictable for speeding even though I sped to prevent an accident?

Sure, these are all edge cases and there are loads of cases where it _is_
clear cut, but you have to write a law that can handle the edge cases. Without
accepting the vagueness of human life, you'll wind up with an unfair
conviction that is horrible.

~~~
randallsquared
Or, alternatively, you can write the law simply and let people decide for
themselves if it's worth breaking.

So, no exemptions for anyone, but then part of the cost of running an
ambulance service is paying speeding fines regularly. Given the cost of
ambulance services without that, the additional expense is lost in the noise,
making it obviously a good choice to speed when useful (for ambulances).

Trying to figure out after the fact whether someone had a good reason to break
the law (and therefore shouldn't be penalized) is one of the things that
complicates legal systems enormously. Instead, we should write the law clearly
and specify the penalties for breaking it directly, and let those who have the
best information about the situation, the potential lawbreaker(s), choose
whether it is worth breaking the law in a given instance.

~~~
joesb
And if repeated speeding got the driver banned from driving, then you'd run out
of ambulance drivers after a day.

Or let's put a man in jail because he broke the law three times (even if all
those cases would be exempted in the current system).

~~~
randallsquared
The whole point of the proposed system is to capture all the downside of a
given action in a single penalty. Therefore, escalations based on repeated
actions wouldn't make any sense.

------
ChuckMcM
This is pretty awesome, and if it were in git/hg you would have the ability to
write a 'blame' tool to figure out who voted on the part of the law that is
pissing you off :-)

~~~
DannyBee
Actually, you wouldn't. This is the US code, not the legislative info :)

I actually tried to create this once (with a team behind me, in fact) with
what's available or possible with the legislative info.

THOMAS theoretically published in XML, but it's missing a _lot_ of info.

Not only is this info not published, it's not even stored. They still are
literally passing bills around to each other in some cases. You'd have to sit
in on committee markup hearings, etc.

Even the small amount that is published from markups or whatever doesn't tell
you who did it, only that it was done.

Some committees were willing to give more info. None were willing to
essentially publish the in-flight staff attorney or other versions that would
tell you for real who changed something.

Remember also that hg/git diff does not display what really happened.

It is displaying "how can i produce version B from version A using some
minimal or near-minimal set of changes", converted to a line (or sub-line)
based set of changes.

This does not tell you history, only one possible set of changes.

THOMAS does publish some in-flight versions of bills, but again, it's not
really enough to do what you really want to do. I can tell you a bill changed
between introduction and markup, and was different again when it got back to
the floor. I can even, in some cases, tell you what was amended/deleted. I
can't tell you who did it.

(Well, i can tell you with some percentage accuracy, because we built a
machine learning model, but ..)

~~~
jacques_chester
The US system of legislative development is quite flawed. There are a lot of
places where changes can be made without attribution, emerging from committees
without saying who added what.

In most Parliamentary democracies you can only propose amendments from one of
the two chambers, so any amendment can always be traced to the Parliamentarian
who moved it.

~~~
saraid216
I'd support having a nice and loud discussion on doing this. Being able to
trace the authorship of line items doesn't seem to have stopped pork in other
countries, but it might mitigate it. And we'd be able to yell at people
better.

~~~
jacques_chester
There are other amendments that cut back on some of the worst pork.

The Australian Constitution in section 55 forbids tax and spending legislation
to deal with any other subject. If an Act includes other material, it's void,
it has no effect.

This prevents the American situation of omnibus bills, rider amendments and
the like.

We still get pork barrelling here; but between S 55, fused executive and
legislative and strict party discipline, the political incentives are
differently structured for individual politicians. It creates a stronger check
on profligacy.

------
antitrust
This actually makes law accessible to the technologically-savvy out there, and
is going to launch a thousand apps giving specialized legal advice.

This could in turn mean a reduction in the cost of litigation, which would
hopefully be passed on to the rest of us.

Hopefully I won't get sued for that statement.

~~~
liscovich
Hopefully, as a result, the disturbing statements like the ones below will
become obsolete:

James Duane, Regent Law School professor, defense attorney:

"Estimates of the current size of the body of federal criminal law vary. It
has been reported that the Congressional Research Service cannot even count
the current number of federal crimes. These laws are scattered in over 50
titles of the United States Code, encompassing roughly 27,000 pages. Worse
yet, the statutory code sections often incorporate, by reference, the
provisions and sanctions of administrative regulations promulgated by various
regulatory agencies under congressional authorization. Estimates of how many
such regulations exist are even less well settled, but the ABA thinks there
are 'nearly 10,000.'"
[http://youtu.be/6wXkI4t7nuc?t=5m18s](http://youtu.be/6wXkI4t7nuc?t=5m18s)

Supreme Court Justice Breyer:

"the complexity of modern federal criminal law, codified in several thousand
sections of the United States Code and the virtually infinite variety of
factual circumstances that might trigger an investigation into a possible
violation of the law, make it difficult for anyone to know, in advance, just
when a particular set of statements might later appear (to a prosecutor) to be
relevant to some such investigation."
[http://www.law.cornell.edu/supct/html/98-93.ZD.html](http://www.law.cornell.edu/supct/html/98-93.ZD.html)

------
liscovich
If you were to start a new country, what would the legislative process look
like there? For example, how should new "startup nations" like BlueSeed
([http://blueseed.co](http://blueseed.co)) inspired by Seasteading Institute
go about passing and storing laws? Should they have some sort of open github
repo to which anyone can make pull requests? How do you see the congress of
the future?

~~~
rmc
Usually countries inherit the laws from the country they are descended from.

~~~
liscovich
True, but what would it look like in the future? Would you just fork a repo?

~~~
jacques_chester
You just say that the laws in force on a particular date apply locally, minus
anything that physically can't apply (eg laws about particular locations).

The law of every country in the English-speaking world started that way.
Australia's current system of law, for example, commenced as a branch of
British law on 26 January 1788. All the laws in force in Britain at that
moment were presumed to apply; laws governing institutions and issues peculiar
to Britain-as-a-place just weren't of any consequence.

Periodically you go into the collection and clear it out. In Australia in the
80s and 90s there was a law reform movement and as a positive side-effect
enormous research was done to discover the true coverage of "Imperial" laws
still technically in force. Our various Parliaments passed various Acts to
repeal and replace old laws.

For example, where I come from, in the Northern Territory, the _Law of
Property Act_ repealed laws going all the way back to shortly after the Norman
conquest.

~~~
rmc
_You just say that the laws in force on a particular date apply locally, minus
anything that physically can't apply (eg laws about particular locations)._

Yep. Or you give yourself a new constitution and say "All laws previously in
force automatically come into force, unless they contradict the new
constitution". Ireland did that.

Likewise Ireland cleared out a lot of old laws in the 200X's. It was done by
saying "Anything pre-1922 is repealed unless it's on this list".

I'm unsure _why_ they kept some laws, like the 1204 law on "Erection of castle
and fortifications at Dublin; establishment of fairs at Donnybrook, Waterford
and Limerick", but at least now we can refer to it as the Fairs Act of
1204....

------
fnordfnordfnord
Doesn't appear to include codes and standards which are included by reference
such as NFPA, IBC, IRC, SAE, etc. (see [1] for a non-gov't project to publish
those)

Nevertheless it is a very good thing to see the gov't publish (most of)
the law in an easy to use format.

[1] [https://public.resource.org/](https://public.resource.org/)

------
calpaterson
For those of us who don't know anything about it, what are the uses of machine
readable law?

~~~
bjr-
The law can be interpreted as a set of rules matched against an action in a
particular context to determine whether said action is legal.

Right now lots of people do this manually.

Maybe computers can do it better.

If computers can do it better, maybe people could focus on writing new laws
and refactoring old ones instead of repeatedly interpreting old laws for each
case.
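
As a toy illustration of "rules matched against an action in a context" (every name, field and threshold here is hypothetical, and a single hard-coded exemption stands in for everything a court would actually weigh):

```python
from dataclasses import dataclass

@dataclass
class Action:
    """A hypothetical driving action plus the context it happened in."""
    speed_kmh: float
    limit_kmh: float
    is_emergency_vehicle: bool = False

def is_speeding_violation(action):
    """Toy rule: over the limit is a violation unless the one exemption applies."""
    if action.is_emergency_vehicle:
        return False
    return action.speed_kmh > action.limit_kmh

cases = [
    Action(speed_kmh=60, limit_kmh=50),
    Action(speed_kmh=45, limit_kmh=50),
    Action(speed_kmh=90, limit_kmh=50, is_emergency_vehicle=True),
]
results = [is_speeding_violation(case) for case in cases]
print(results)
```

Every boolean field added to `Action` is a definition somebody has to write down and defend, which is where the manual interpretation comes back in.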

~~~
BenoitEssiambre
Yup, legal code isn't that different from computer code where a judge is the
interpreter and evidence is fed as input.

If we could invent a computer interpreter that acts as a judge, laws and
contracts being its code might make it obvious to lawyers why it doesn't make
sense to allow code to be patented.

Every time lawyers would write laws or contracts they would have to avoid
using legal ideas that have already been patented subjecting them to the same
legal difficulties software developers face every time they write code.

I guess I should patent that invention.

~~~
rhizome
You could also add in EDGAR data that touches the laws to make estimates on
the way the judiciary would interpret a certain action. It could also be used
to find loopholes, conviction-free zones near boundaries in the law that could
be closed.

------
techtalsky
I'm probably a little late to the party, but I think it's worth mentioning
that some of the "XML" looks like this:

<tr style=" -uslm-lc:II22; "><td style=" text-align:left; vertical-align:top;
border-right:1px solid black; padding-right:2pt;"><p style=" text-align:left;
text-indent: -1em; padding-left:1em;">

Wow. I wonder what -uslm-lc does.

~~~
nxn
Similar question and answer on another site:
[http://www.reddit.com/r/technology/comments/1jfufh/the_entir...](http://www.reddit.com/r/technology/comments/1jfufh/the_entire_us_code_is_now_online_in_xml/cbeiub8)

------
tmoertel
It's tragic that the United States (Federal) Legislative Model (USLM) is
defined in terms of W3C XML Schema Definition language (XSD) instead of the
comparably sane RELAX NG and its easily interpreted compact syntax. You would
think that something this important ought to be made clear and understandable.

EDITED TO CLARIFY: The tragic part isn't that the schema is _given_ in XSD but
that it's _defined_ in XSD, which lacks RELAX NG's simple semantics and
composability rules. For a good summary of what I'm referring to, see James
Clark's message to the IETF on the subject:

[http://www.imc.org/ietf-xml-use/mail-archive/msg00217.html](http://www.imc.org/ietf-xml-use/mail-archive/msg00217.html)

~~~
mindcrime
In practice, it probably doesn't make any difference. Assuming, for the sake
of argument, that XSD is expressive enough to allow the authors to say what
they're trying to say, in an unambiguous way, it's more or less a moot point.
RELAX/NG is awesome, sure... but to some extent the "war" between it and XSD
is a religious war.

Pretty much every popular & widely used language / platform has XSD support...
hell, it's so commonplace there are probably schema aware xml parsers in
Brainfuck, INTERCAL and Befunge.

RELAX NG may well be a better choice, but saying this is "tragic" strikes me
as a bit of excessive hyperbole.

~~~
tmoertel
The tragic part isn't that the schema is _given_ in XSD but that it's
_defined_ in XSD. What I'm lamenting (yes more hyperbole) is that a schema
this important wasn't _defined_ in a schema language that had clear semantics
and composability rules, fostering reuse and adoption for related domains, of
which we would expect there to be many. (Later, of course, the definition
could always be _extruded_ into XSD and other popular yet semantically stunted
formats as a practical publication step.)

~~~
mindcrime
Fair enough. I haven't really dug into this schema in detail yet, but I'm
guessing that XSD is expressive enough, and that - worst case - somebody could
define an equivalent schema in $WHATEVER, and mirror the data after
transforming it. It's not a perfect setup, but at least it might make the
content more usable for some purposes.

I'm kinda curious to see what can be done with it in terms of transforming to
RDF triples myself, but time will tell...

------
liscovich
An alternative XML version of the US Code from Cornell Law School:
[http://www.law.cornell.edu/wiki/lexcraft/united_states_code_...](http://www.law.cornell.edu/wiki/lexcraft/united_states_code_in_xml)

------
ilaksh
This reminds me of a recent discussion where someone mentioned tools over
process. ([http://rc3.org/2013/07/29/seven-signs-of-dysfunctional-engin...](http://rc3.org/2013/07/29/seven-signs-of-dysfunctional-engineering-teams/))

I would say that the 'law' is just subjective manual process, and we
desperately need more tools for every-day judgement and decision making.

For example, if there were a computer system that logged all corporate
financial transactions including income, then we could automatically tax large
corporations, rather than waiting for them to report income through loopholes.

------
lisper
This information has been available for a long time on plainsite:

[http://www.plainsite.org/laws/index.html?corpusid=3](http://www.plainsite.org/laws/index.html?corpusid=3)

------
thinkcomp
I'm attempting to centralize many different sets of laws on PlainSite:

[http://www.plainsite.org/laws](http://www.plainsite.org/laws)

Feel free to contribute.

------
tbatterii
Now if the same could be provided for bills (ideally before they are voted on),
and that should go on GitHub or something.

~~~
BWStearns
Maybe try to organize some tech-minded staffers on the hill? Get them to put
bills onto a github repo when they are made public, maybe even XMLizing it to
make it more exposed to machine analysis. It'd be a huge step in congressional
transparency, especially now that bills are going into the high hundreds and
thousands of pages (not a dig at the healthcare bill in particular, just a
trend).

~~~
tbatterii
Well it looks like the sunlight foundation has done some nice work in this
area, but as far as I can tell these are all bills that have made it to vote.
[http://www.opencongress.org/bill/all](http://www.opencongress.org/bill/all)

Whereas, I would like to see the drafts and how they progress behind closed
doors with all the lobbying influence.

~~~
BWStearns
I think it's quite understandable that they have the deliberations in private.
Imagine if your customers saw every minute of your office's day. What irks me
is that although bills are available, they are available in such a way that it
is a fulltime job to even look for things of interest. With better machine-
searchable formats, you could set up alerts for potentially interesting sub-
components of legislation that is up for debate.

Edit: I don't endorse some skeevy lobbying activities, but that isn't the only
influence, and there are some legitimately difficult choices that our reps
make (when they're working right... so I guess rarely) that they believe are
in the best interest of the country and their constituents. Many of these
decisions would be made even more difficult and actually discourage much of
what thoughtful and earnest decision making does exist.

~~~
jacques_chester
Amendments should be moved in the open by individual members.

Australian Parliaments do it, so clearly it's not a great impediment to
lawmaking.

------
thehme
Since this is hacker news, I was curious to see which code the subject was
talking about and surprisingly not the code I was thinking of. I wonder how
much more of this gov code has actually been read by the people in all those
countries where it's been available longer.

------
mathattack
Putting something online is very different than actually getting meaning out
of it. I'm afraid that this will push us towards more laws rather than less.
But... Perhaps there will be good machine learning apps that can make sense
out of all the contradictions.

------
mpyne
This is awesome. They even have a stylesheet apparently.

However, though the file claims to be UTF-8, vim seems to disagree, at least
for title 10. I can't tell what encoding it really _is_ though, doesn't seem
to be latin1 or windows-1252 either.

~~~
drv
The encoding looks like valid UTF-8, at least for the first few pages that I
glanced at.

I did notice the section references look a little strange in vim, e.g. "act
Aug. 10, 1956, ch. 1041, § 1" near the top; it consists of c2 a7 (section
sign), which looks fine, followed by e2 80 af (narrow no-break space), which
shows up as a box in vim.
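
A quick way to check those bytes, sketched with Python's standard library. The byte string below is typed in from the description above (section sign, the `e2 80 af` sequence, then the digit), not read from the actual file:

```python
import unicodedata

# Bytes as described: c2 a7, then e2 80 af, then the ASCII digit "1".
raw = b"\xc2\xa7\xe2\x80\xaf1"

# decode() raises UnicodeDecodeError if the bytes are not valid UTF-8.
text = raw.decode("utf-8")

names = [unicodedata.name(ch) for ch in text]
for ch, name in zip(text, names):
    print(f"U+{ord(ch):04X} {name}")
```

If the sequence decodes cleanly and a character still renders as a box, the usual suspect is a font without a glyph for it, not the file's encoding.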

~~~
mpyne
OK, maybe it's just a font thing here on Windows (wouldn't surprise me one
bit). I'll try again when I get home tonight.

~~~
mpyne
For posterity's sake, it does work fine here on a system with actual Unicode
fonts. So I was wrong to blame the file itself.

------
sinzone
The developer friendly RESTful version:
[https://www.mashape.com/community/united-states-code#!docume...](https://www.mashape.com/community/united-states-code#!documentation)

------
liscovich
Does anyone know what software is used by the hill staffers when drafting new
bills? How do they make sure that the laws do not contradict each other?

------
pseingatl
I can't wait to look for the law whereby Congress established a church in
violation of the Establishment Clause. But no one complained.

------
pdw
I'm annoyed that Title 38 - Veteran's Benefits is out of alphabetical order.
Was it originally called Pensions or something?

~~~
dailyrorschach
Yes, old name is: Title 38 - Pensions, Bonuses, and Veterans' Relief

[http://www.gpo.gov/fdsys/pkg/CFR-2004-title38-vol1/content-d...](http://www.gpo.gov/fdsys/pkg/CFR-2004-title38-vol1/content-detail.html)

------
jingo
The USC has been available in HTML or ASCII for many years. From house.gov,
gpo.gov and cornell.edu, to name a few sources.

------
tlrobinson
In case anyone was wondering, it's about 80MB zipped, 500MB unzipped.

------
methehack
Does anyone know how people search this now and how much it costs?

~~~
ww520
Congress has a site for it.

[http://uscode.house.gov/search/criteria.shtml](http://uscode.house.gov/search/criteria.shtml)

------
pkinnaird
repo with the contents of the house.gov site:
[https://github.com/peterkinnaird/US-Code](https://github.com/peterkinnaird/US-Code)

------
prmobiledev
Portugal should do this to their civil code!!!

------
bandushrew
Except for the secret laws, of course...

------
krob
does that mean these are all the current federal laws?..

~~~
liscovich
This is only the codified portion of the federal law. So called "Statutes at
Large" that are passed by Congress but not codified into USC are not included.
Plus this does not include the federal common law based on the rulings of
federal courts.

~~~
mpyne
Also there is the Code of Federal Regulations to contend with, but those are
available online already.

------
rogerchucker
Genuinely curious - how can this data be used from a software perspective?

------
tianhe
XML? this should be in json!

~~~
liscovich
As the schema guide explains, in 1999 they picked XML based on the study from
1996:

"Following a 1999 feasibility study on XML/SGML, the Committee on House
Administration adopted XML as a data standard for the exchange of legislative
documents" [http://uscodebeta.house.gov/download/resources/USLM-User-Gui...](http://uscodebeta.house.gov/download/resources/USLM-User-Guide.pdf)

It took the US Govt 17 years to release 200,000 pages of the US code in XML.

~~~
tianhe
I think the most time consuming part of the conversion is from text to
digital. That's what took so long.

Interchanging formats should be relatively easy. Back in 1999, json wasn't
even around.

IMO, in today's API-centric, JavaScript-ruled world, JSON would be a lot
more useful.

~~~
chc
Given that it can be mechanically converted to JSON, I have a hard time seeing
how JSON would be a lot more useful. "Very slightly more convenient" seems
like the most we could reasonably say.
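
A rough sketch of what "mechanically converted" can mean, using only Python's standard library. The XML fragment is made up and is not the real USLM schema, and the output shape (`tag` / `attrs` / `text` / `children`) is just one arbitrary choice:

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Convert an element into a plain dict of tag, attributes, text and children.

    Tail text (text after a child's closing tag) is ignored in this sketch,
    so mixed content would need extra handling.
    """
    return {
        "tag": elem.tag,
        "attrs": dict(elem.attrib),
        "text": (elem.text or "").strip(),
        "children": [element_to_dict(child) for child in elem],
    }

# Made-up fragment for illustration only.
xml_source = '<section id="s1"><num>1.</num><heading>Short title</heading></section>'
as_json = json.dumps(element_to_dict(ET.fromstring(xml_source)), indent=2)
print(as_json)
```

The conversion is a few lines, but nothing about the data becomes easier to interpret afterwards, which is the point being made above.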

