
Why isn't there a Google for the law? - vqc
http://www.openlawlib.org/blog/why-isnt-there-a-google-for-the-law/
======
ChuckMcM
Anecdote: I was at Google in 2006 and there was (is?) a database of ideas that
people shared that Google might pursue, since their motto was to organize the
world's information and make it easily accessible.

I had gotten a ticket for obstructing the intersection (the "anti-congestion"
law in California that sought to limit gridlock by making it illegal to enter
an intersection that you couldn't exit before the other light changed). I was
fighting it and wanted to find other cases that had been decided on this
law[1]. The only way to do that was to go to the public library and look
through their published volumes of decisions and cases. So I thought here is a
really useful thing Google could do, it isn't even a _hard_ problem, collect
the decisions that the courts publish anyway, and just connect the ones that
are about the same part of the code. Match a number, match a date. And very
useful to people who are fighting cases. It saved me some time and money [2].
But the idea never made it past the discussion stage because, as I was
counseled, doing that would take on "powerful interests" who would really
fight back hard and Google didn't want to draw that level of scrutiny. It
wasn't until Carl Malamud started attacking this problem in earnest[3] that
it became clear to me what it means to take away a revenue stream from
lawyers.

[1] And learned this is called 'Shepardizing' based on finding citations --
[https://en.wikipedia.org/wiki/Shepard's_Citations](https://en.wikipedia.org/wiki/Shepard's_Citations)

[2] turns out there had been no case law on this particular law and lots of
dismissals so I just entered a plea of not guilty at the clerk, and the court
informed me a week before my trial date that the prosecution had declined to
prosecute.

[3]
[https://www.techdirt.com/articles/20150726/23080731763/even-...](https://www.techdirt.com/articles/20150726/23080731763/even-if-state-georgia-can-copyright-legal-annotations-should-it.shtml)

~~~
randyrand
The anti-gridlock law is actually a great law that DOES stop gridlock. The
problem is you need to get people to actually follow it, and in California for
some reason no one does. In Chicago we all know about this law and I had never
seen gridlock the whole time I lived there (23 years). Crazy, right? Not
really. But to the people in LA it seems crazy. Within a year of
moving to LA, I saw gridlock. I had no idea this was still a problem in the
United States. I thought it was only a 3rd world problem. But nope. In LA they
have it. And it's because people block intersections here ALL THE FUCKING
TIME. It's crazy.

Why? I don't get it. Do LA drivers really not know it causes gridlock? Or do
they just not care? I remember talking to an Uber driver who didn't even know
what gridlock was, or why blocking intersections was bad. As someone from
Chicago, I was amazed. And even on HN, OP doesn't know that blocking
intersections causes gridlock; it amazes me. I thought this was common
knowledge. For some reason people in these gridlocked towns are skeptical that
the law works.

For example, in downtown LA they blocked one street for a festival and the
entire downtown became gridlocked. I've never seen traffic this bad IN MY
ENTIRE LIFE. All because of people blocking intersections. I was able to walk
10 blocks faster than cars did. I saw cars sitting in place for an hour and
more. It was crazy. I've never seen anything like it. And when the lights
turned green people still blocked the intersection. THEY WERE SITTING IN
GRIDLOCK AND STILL DIDN'T STOP BLOCKING INTERSECTIONS. Like wow. I was
dumbfounded. How are LA drivers so bad at driving?? I was so glad to be on
foot.

It's a completely fair law; it's easy to follow, and it benefits everyone.
blocking intersections DOES CAUSE GRIDLOCK. IF YOU STOP BLOCKING
INTERSECTIONS, GRIDLOCK WILL GO AWAY. Get this through your head!

The problem is people don't know about the law in California, and don't have
the common sense to figure it out themselves. It needs to be enforced more. LA
has some of the worst gridlock problems I've ever seen.

~~~
Practicality
For what it's worth, I was taught to sit in the intersection when making a
left turn in driver's ed.

Maybe it's an east west thing, but here is a video of a guy complaining about
people who DON'T enter the intersection:
[https://www.youtube.com/watch?v=q_bcjCOzob4](https://www.youtube.com/watch?v=q_bcjCOzob4)

Again, he is saying that the proper thing to do is to enter the intersection
and sit, regardless.

So, it seems like the issue may be that in some areas this behavior is
promoted (even in drivers ed classes) and in others it's illegal.

Of course, I am not from a large city that deals with gridlock, so that may be
part of the difference.

~~~
rconti
You're supposed to enter the intersection to make a restricted turn across
oncoming traffic. Though I believe only one car should do it at a time.

It's different from gridlock because you know that when the oncoming traffic
stops, there is somewhere for your car to go.

~~~
ensignavenger
Except that you block the view for oncoming vehicles turning the opposite
direction (their left). A properly labeled intersection has white lines (in
the US, other places may use something else) that indicate where to stop, and
if engineered properly, stopping before these white lines leaves a clear view.

~~~
dsfyu404ed
which doesn't matter because after the oncoming traffic gets the red light you
take your left and are no longer blocking anyone's view.

~~~
ensignavenger
Huh? You are still blocking the view and preventing the other side from
turning left safely until the traffic clears and you can turn.

------
sfRattan
Appropriate that this subject makes an appearance on HN so close to the
anniversary of Aaron Swartz's suicide under duress and attack from the Federal
Bureau of Investigation.

Making court documents more publicly accessible was one of Aaron's projects
(circa 2008). He and project collaborators downloaded more than a million
documents from the government's PACER electronic access system using public
library terminals and attracted the attention of the FBI.[1] Part of the goal
at the time was to uncover privacy violations in filed court documents that
were legally a matter of public record but behind a lucrative,
government-administered paywall.[2]

There is something important to be said for the social and moral importance of
keeping the public record _publicly accessible_. The justification for these
intermediaries to exist and extract rent from the cataloging of public
information grows slimmer and slimmer, but cataloging and indexing everything
in a common law (precedent based) system is tremendously expensive. I suspect
that developing an algorithm to usefully search the dense and interweaving
web of judicial opinions, case history, written legislation, and jurisdictions
in which all those elements apply/overlap/supersede each other is also a
massive capital investment.

It all does have to be paid for somehow, and I don't think how to fund is a
settled question. Pay walls clearly have pernicious externalities (privacy
violations go unnoticed; access to law is practically limited to professionals
for whom the costs are a business expense). But I don't trust the state to
properly fund or develop such a service through general tax either.

Consider supporting the individuals in this thread who are working to make
that sort of open information access in law a reality, and consider also who
will seek rent from the finished service and who will not.

[1]:
[https://en.wikipedia.org/wiki/Aaron_Swartz#PACER](https://en.wikipedia.org/wiki/Aaron_Swartz#PACER)

[2]: [https://public.resource.org/crime/](https://public.resource.org/crime/)

~~~
chris_wot
The law is formed by the government and the judiciary and applies to everyone
living in or visiting the State. It's entirely reasonable to require the State
to pay for unfettered access to the law.

~~~
Dangeranger
While access to case law is in fact required unless the case has been sealed
by a judge, convenient and free access is not required. The states can charge
for paper copies and make those copies only accessible via a USPS mailed form.
That's the underlying problem. Barriers to access can be an effective tool for
creating "expert" silos, where only those who have the means and the correct
keys may travel.

~~~
TheSageMage
Any idea of what the cost to, say, buy up all the case laws would be? And what
is the law on republishing said copies?

~~~
kalu
Most docs are available through Pacer. Pacer pricing is $0.10 per page capped
at $3 per document. Buying everything would cost a lot. In addition, new case
law is continuously being created. So you couldn't buy everything and be done
with it.
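
To put rough numbers on that pricing, here is a minimal sketch in Python. It assumes only the two figures quoted above ($0.10 per page, capped at $3 per document); the function names and the sample page counts are made up for illustration, and any other billing rules PACER may have are not modeled.

```python
PER_PAGE_CENTS = 10   # $0.10 per page, as quoted above
CAP_CENTS = 300       # $3.00 cap per document, as quoted above


def pacer_cost_cents(pages: int) -> int:
    """Cost of one document in cents under the per-page rate and the cap."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return min(pages * PER_PAGE_CENTS, CAP_CENTS)


def total_cost_dollars(page_counts) -> float:
    """Total cost in dollars for a batch of documents, given their page counts."""
    return sum(pacer_cost_cents(p) for p in page_counts) / 100


if __name__ == "__main__":
    # Hypothetical batch: a 5-page order, a 30-page brief, a 200-page exhibit.
    print(total_cost_dollars([5, 30, 200]))
```

Working in integer cents avoids floating-point drift when summing many small charges; the cap kicks in at 30 pages, so long documents are comparatively cheap per page.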

You can republish... court docs are in the public domain. RPXCorp tried doing
this with patent law. For a time, they made everything free. That practice
didn't last and they now pass on their costs to customers.

------
eelliott
As a lawyer I've thought a lot about this (and if anyone is working on this
and wants to talk get in touch). There are two reasons:

1\. Reasoning in law relies on complex language semantics, both in statute and
case law. Take for example a court decision that says "in the circumstances of
this case I do not agree that John v Doe applies". That can be expressed a
million ways and I'm not sure our natural language processing can replace
humans yet in this area.

2\. There are a lot of copyright problems that need to be overcome. Companies
like Lexis and Westlaw own the rights to a lot of decisions and even statutes
and can paywall them. This is slowly changing, however; for example, in the UK
recently the courts took back the rights to publish decisions.

~~~
nawtacawp
It does not seem ethical to hold a person accountable for not following a law,
if they do not have free access to read that law and the various ways the
court has ruled to how that law should be applied.

~~~
eelliott
I agree entirely; however, law is like all professions in that access to
information is only half the equation: its application and interpretation are
derived from extensive training and experience. So I'd argue that until we
nail 'Google for the law', access to free lawyers, at least for the poor etc., is
more important than access to the legal databases.

~~~
kevinpet
If you start from the assumption that the law is whatever lawyers and judges
tend to think it is, then access to lawyers is more important. If you take the
egalitarian perspective that the law is the law and a lawyer is just someone
particularly skilled in applying it, then a person of average education should
be able to handle a routine legal dispute without paying a specialist. This is
what people have in mind when they want to make the law more available online.

If there are laws out there that are currently applied or interpreted
differently than their plain meaning as written down, that's a failure of
government. Either legislators should have fixed a stupid law, or judges
should have thrown it out for vagueness.

~~~
arcbyte
The problem is that both of your assumptions are true. The fact is that the
law is constantly being discovered. To the extent that an area of law is well
explored, a layperson should be empowered to handle it alone, but to the
extent that it is not, it requires abilities that have not been instilled in
the average citizen.

------
igurari
The premise of this blog post is a little off base. (Though I think Open Law
Library is doing good work.) The difficulty in building a high quality legal
search engine is not in parsing the links between the documents. High quality
links matter, but they only get you about 25% of the way there. The more
important thing is to have a highly accurate and structured understanding of
the law. (Think of Google's Knowledge Graph, or the maps they use for their
driverless cars.)

Disclaimer: I worked on Google Scholar and am the CEO of Judicata.

A recent evaluation of various legal search engines [1] found: "The oldest
database providers, Westlaw and Lexis, had the highest percentages of relevant
results, at 67% and 57%, respectively. The newer legal database providers,
Fastcase, Google Scholar, Casetext, and Ravel, were also clustered together at
a lower relevance rate, returning approximately 40% relevant results."

Westlaw, Lexis and Google Scholar all have high quality citation parsing
(i.e., links). And Scholar relies very heavily on PageRank (as [1]
demonstrates). But it is Westlaw and Lexis that are the better search engines.
That's because they have invested more into going beyond just links; they've
invested a lot into understanding what is happening with the law.

At Judicata our own findings are that the average legal search query is
significantly more complex than the average Google query -- having more terms
and more concepts. Moreover, whereas only 15% of Google queries are unique,
the inverse is true in legal research: more than 85% of queries are unique.
What that means is that in order to return a good result, you need to
understand a lot more about the query and the documents you've indexed. You
can't rely on links between documents and past searches and clicks to power a
quality search engine (the way that Google.com can).

As has been mentioned in other comments here, the real challenge for legal
research is extracting structure out of the law (Shepardization, Procedural
Postures, Causes of Actions, Dispositions, Legal Principles, Arguments, Facts,
etc.). That is what will get legal search engines closer to where Google
really shines -- results that are powered by the Google Knowledge Graph.

[1]
[https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2859720](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2859720)

~~~
tomcam
Thank you for your perspective. It's quite helpful. However, wouldn't even a
plain text searchable database be better than nothing? And I don't understand
how this can be monopolized by Westlaw when law should be public domain…

~~~
igurari
Getting free or low cost access to a plain text searchable database is no
longer a problem for lawyers. It was 8-10 years ago, but since the entry of
Google Scholar (and Casetext, Ravel, and a half dozen or so other providers)
getting access to the law is no longer difficult. To echo the original post,
the law today is in a place like the "deep, dark, early days of the Web, using
search engines like Lycos and Alta Vista". We do need a "Google for the law,"
but Google isn't good enough to be that. It's a very hard problem to create a
good search engine for the law, but legal search engines will eventually get
there.

------
BenderV
Doctrine, a French startup, is actually doing this.
[https://www.doctrine.fr](https://www.doctrine.fr) [Disclaimer: I'm a Data
Scientist @ Doctrine]

Part of the reason why we are the only one providing something clear is that,
indeed, law data is a mess, and we are working hard to have a clean & consistent
database.

Even something as simple as legal references is really complex. Every country
has its own identifier system, some editors have their own identifier system,
and people cite in really different ways...

The second point is that we rely heavily on NLP/DL to extract insights and
information from the data. This is something that couldn't have been done
/easily/ in the past.

Shameless plug: We are hiring!
[https://doctrine.typeform.com/to/uyjXoE](https://doctrine.typeform.com/to/uyjXoE)
[French only]

~~~
touristtam
> Shameless plug: We are hiring!
> [https://doctrine.typeform.com/to/uyjXoE](https://doctrine.typeform.com/to/uyjXoE)
> [French only]

Great, but for what position? This link is a multistep application form, and I
am pretty sure others will find it equally annoying to fill it out just to see
the description of the job.

~~~
a455bcd9
Hey! You're right, we have 4 open positions on Angel List:
[https://angel.co/doctrine-/jobs](https://angel.co/doctrine-/jobs)

[Disclaimer: Cofounder here ;)]

------
dvdhnt
> The reasons this problem exists are complex, but they boil down to the fact
> that laws and the links between them are not being published in ways
> computers can easily process, making it difficult to extract the valuable
> information they contain.

I'd argue that laws and the links between them are not being published in ways
that the average person can easily process or understand.

Furthermore, I believe this ambiguity directly impacts the governed, causing
them to be, in general, distrustful of most laws that do not affect them in an
observable way.

Making laws more digestible is only part of the solution; the "what" should be
annotated with the "why". Otherwise, with so many decentralized cities in an
already decentralized nation, fundamentally sound, universally-applicable
legislation may be ignored due to stereotypes and generalizations. Whereas,
documented and annotated legislation can be analyzed, duplicated, and modified
to fit different environments around the country, or reasonably ignored on
verifiable grounds.

~~~
ksikka
> I'd argue that laws and the links between them are not being published in
> ways that the average person can easily process or understand.

So true. A premise of this blog post is that laws need to be easier for
computers to understand, but that's skipping a step: making laws easy for
humans to understand.

I really like what TLDRLegal did: made software licenses digestible by humans.
I would love to see similar sites pop up for other verticals of the law, but
it's a lot of work and there's not much incentive to get it done at the
quality-level of TLDRLegal.

~~~
dvdhnt
I had to note I also enjoy TLDRLegal... I wish they'd open an API.

------
rayiner
> The reasons this problem exists are complex, but they boil down to the fact
> that laws and the links between them are not being published in ways
> computers can easily process, making it difficult to extract the valuable
> information they contain. If the format were computer-friendly, it is easy
> to imagine leveraging the links between laws to improve search results.

This is totally untrue. Legal documents are linked together with citations
written according to very precise rules, which lawyers spend a lot of time
getting correct. Almost all laws and cases are published in a quasi-append-
only record: sequential publications in reporters organized by volume and page
number. So unlike URLs on the Internet, 47 F.3d 167 will always refer to the
same page of the same case. Forever. Most agency decisions, etc, have similar
sequential records. Statutes and regulations are precisely identified by
structured citations as well. WestLaw and Lexis have no problem parsing these,
and will happily find you all the cases that cite to say a specific Supreme
Court case from 1880.
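
As a rough illustration of how regular these citations are, here is a minimal Python sketch that pulls "volume reporter page" citations such as 47 F.3d 167 out of free text. The reporter list is a tiny illustrative subset chosen for this example, and the names `Citation` and `find_citations` are hypothetical; a production parser would need many more reporters plus pinpoint pages, years, and statute formats.

```python
import re
from typing import List, NamedTuple


class Citation(NamedTuple):
    volume: int
    reporter: str
    page: int


# Illustrative subset of reporter abbreviations; a real parser needs far more.
REPORTERS = ["U.S.", "S. Ct.", "F.", "F.2d", "F.3d", "F. Supp.", "F. Supp. 2d"]

# Try longer abbreviations first so "F.3d" is matched before the bare "F.".
_alternatives = "|".join(
    re.escape(r) for r in sorted(REPORTERS, key=len, reverse=True)
)
CITATION_RE = re.compile(rf"\b(\d+)\s+({_alternatives})\s+(\d+)\b")


def find_citations(text: str) -> List[Citation]:
    """Return every volume/reporter/page citation found in the text."""
    return [
        Citation(int(volume), reporter, int(page))
        for volume, reporter, page in CITATION_RE.findall(text)
    ]


if __name__ == "__main__":
    for cite in find_citations("See 47 F.3d 167 and 347 U.S. 483."):
        print(cite)
```

Because the volume and page never change once a case is printed, the parsed tuple can serve directly as a stable key into a citation index.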

The reason lawyers use terms and connectors searches instead of a
"Google-like" engine is that the underlying concept of Page Rank absolutely sucks
for legal research. Page Rank equates in-degree in the link graph with
relevance. This will get you highly cited cases that you knew anyway that are
only tangentially related to the cases you actually need.
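
A toy sketch of that effect, in Python. This ranks by raw in-degree, which is the simplification named above and not the full PageRank algorithm; the case names and the citation graph are entirely hypothetical.

```python
from collections import Counter

# Hypothetical citation graph: each key cites the cases in its list.
CITES = {
    "trial_case_a": ["landmark_case"],
    "trial_case_b": ["landmark_case"],
    "appeal_case_c": ["landmark_case", "trial_case_a"],
    "landmark_case": [],
}


def rank_by_in_degree(cites):
    """Order cases by how many other cases cite them, most cited first."""
    in_degree = Counter({case: 0 for case in cites})
    for cited_list in cites.values():
        in_degree.update(cited_list)
    # Break ties alphabetically so the ordering is deterministic.
    return sorted(in_degree, key=lambda case: (-in_degree[case], case))


if __name__ == "__main__":
    print(rank_by_in_degree(CITES))
```

The landmark case comes out on top simply because everything cites it, while an uncited trial decision lands at the bottom regardless of how close its facts are to the query.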

In a legal brief, a couple of trial court decisions that are factually similar
but uncited are infinitely more valuable than a highly cited Supreme Court
case that happens to pertain to the same general area of law.

~~~
rpedela
Why are they more valuable?

~~~
rayiner
In addition to VCQ's point below, there is the fact that such cases are often
simply not helpful despite technically being relevant. A seminal Supreme Court
case might state the broad principle of law: e.g. you have a right to due
process before losing government entitlements. But stuff like that is never
actually disputed. In practice the dispute is over e.g. "how much process is
enough?" or "when does a government benefit rise to the level of an
entitlement?" The judge _knows_ the big overarching Supreme Court case. While
it's technically relevant, it's not helpful. What you need as the lawyer is to
show the judge case law that supports your specific argument applying that
general principle.

------
chris_wot
When I was an active editor on Wikipedia, I rewrote the USA PATRIOT Act from
scratch.

It wasn't easy, and I'm not talking about the law itself here (though at 10
titles long, with title III literally an anti-money laundering bill they bunged
into the Act and had passed, it is still an extremely complex bit of
legislation). No, I'm talking about the ability to find information on certain
laws - I'm an Australian, so it was a major challenge to find good quality
sources. I was lucky in a way, as the Patriot Act is so controversial I did
eventually manage to track down info. But it wasn't easy, and when I tried to
find sources for some truly ancient and tangential legislation a few times I
hit a brick wall entirely.

It makes me think: ignorance of the law is not an excuse for breaking it...
but with the current system you are often going to be ignorant of the law no
matter what you do! Unless, of course, you have the money to pay for expensive
legal searches.

How anyone could consider restricted access to information about the law and
_the law itself_ to be anything but a violation of human rights is beyond me.

~~~
nradov
What sort of information and sources were you looking for? The Patriot Act
text is available online, as are the (unclassified) notes from Congressional
debates and voting records.

~~~
chris_wot
All primary sources, which are allowed. However, there are plenty of other
primary sources that I can't get easy access to, including case law. In fact,
there are old Acts I found I didn't have any access to at all.

------
vqc
Co-founder of Open Law Library here. Happy to answer any questions about what
we're building, the law, and anything else people are interested to discuss.

~~~
pryelluw
How can people (like me) contribute? The word Open makes me think you accept
non-monetary contributions.

~~~
vqc
With respect to contributing code, the short answer is that we're not sure
yet. There is certainly no shortage of code to write, but we want to make sure
that we work with volunteers in the right way. Anyone who's interested,
please reach out here or through the contact form on our website. I will
personally respond to everyone.

Other than writing code, we need advocates in and out of government who
understand and believe in the value of free (as in freedom) and accessible
laws. Contact your local, state, or federal representative and let them know that
free and accessible laws are important to you. Let them know that a system
exists that can not only make this a possibility, but that it will also make
their lives a lot easier.

We would also love to hear about what you would want to build on top of
computer-readable, always up-to-date laws that could programmatically alert
you when something changed and let you diff against old versions of the law.
E.g. a) internal annotations for civil servants that wouldn't immediately be
obsolete once the legal code changed; and b) legal alert system for the part
of the law you care about.
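
For a sense of what "diff against old versions" could look like once the text is machine-readable, here is a minimal sketch using Python's standard `difflib`. It is not Open Law Library's implementation; the function name `law_diff` and the sample section text are invented for illustration.

```python
import difflib


def law_diff(old_text: str, new_text: str, label: str = "section") -> str:
    """Return a unified diff between two versions of a section of law."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{label} (old)",
        tofile=f"{label} (new)",
        lineterm="",
    )
    return "\n".join(diff)


if __name__ == "__main__":
    # Hypothetical before/after wording of a single section.
    old = "No driver shall enter an intersection\nunless there is room to exit."
    new = "No driver shall enter an intersection or crosswalk\nunless there is room to exit."
    print(law_diff(old, new, label="sec. 1"))
```

An empty result means the section is unchanged, so an alert system could poll each section someone follows and notify them only when the diff is non-empty.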

We'll put together a form that makes it easier to collect this information!

~~~
chris_wot
Aside from code, how are you gathering the laws themselves? Public libraries?

~~~
vqc
We get them directly from the source.

Previous attempts at accomplishing our mission saw organizations scraping
government websites and re-hosting the laws on prettier websites. The problem
was that a) the laws were only as up-to-date as the laws the governments made
available (which, unfortunately, are not up-to-date at all) and b) the
projects were not sustainable because no one pays to access the law and
websites needed updating every time the law changed.

Sites like these are potentially very harmful. They haven't been updated in
years and people who stumble upon them and miss the fine print end up relying
on laws that have long since changed.

Because timeliness matters, the only way to guarantee that we get it is by
working directly with the governments. So we build software into the law
drafting, codifying, and publishing process that governments can really
benefit from and enjoy using. The software changes the economics of
codification and publication and permits publishing the laws freely and
openly.

------
prohor
I have always wondered instead about a Stack Overflow for law, where one could
get advice from other knowledgeable people. But I talked with a few lawyers
(technologically open-minded) and they weren't interested. It seems the
root cause is that in software development the level at which you get help is
not the level at which the end products exist and compete. For lawyers it would
be different: advice is the core of their service, so it is the end product.
So if they were helping with advice, they would directly help the competition.
Secondly, their customers could end up using that site instead (while software
customers cannot benefit from Stack Overflow).

~~~
ralfd
There is [http://law.stackexchange.com/](http://law.stackexchange.com/)

But I think it is not as active as other Stack Exchange sites.

------
SmellTheGlove
Casetext is a startup working on that, at least for case law, which is the
more difficult body to gain efficient access to in the US:

[https://casetext.com/](https://casetext.com/)

I'm not associated with them at all, other than once emailing the founder best
wishes.

~~~
vqc
Casetext and Judicata are great companies working on searching through
judicial opinions. There's a lot to be said about PACER and publishing state
judicial opinions. Suffice it to say, those companies could probably save a
lot of time on parsing and focus on other problems instead if judicial
opinions were released with some structured information, e.g. party names and
procedural posture. Open Law Library makes it possible for legislatures to,
among other things, publish their laws in structured formats.

One thing to consider is that on a day-to-day basis, an individual might be
impacted more by city/county/state law than by federal law.

------
Ericson2314
Uh, the real question is "why isn't there git for the law".

~~~
Animats
There is at the Federal level.[1] It's XML-based. Here's an example of a bill
in raw XML.[2] It displays in the form that a bill is printed.[3] The GPO even
puts in the XML, "Pursuant to Title 17 Section 105 of the United States Code,
this file is not subject to copyright protection and is in the public domain."

There's a change control system behind all this. Here's a history of a bill,
again, in XML.[4] There are change transactions, which are also in XML, but
they're not in this database.

[1] [https://www.gpo.gov/fdsys/bulkdata](https://www.gpo.gov/fdsys/bulkdata)

[2] view-source:[https://www.gpo.gov/fdsys/bulkdata/BILLS/114/2/hconres/BILLS...](https://www.gpo.gov/fdsys/bulkdata/BILLS/114/2/hconres/BILLS-114hconres106ih.xml)

[3]
[https://www.gpo.gov/fdsys/bulkdata/BILLS/114/2/hconres/BILLS...](https://www.gpo.gov/fdsys/bulkdata/BILLS/114/2/hconres/BILLS-114hconres106ih.xml)

[4]
[https://www.gpo.gov/fdsys/bulkdata/BILLSTATUS/114/sres/BILLS...](https://www.gpo.gov/fdsys/bulkdata/BILLSTATUS/114/sres/BILLSTATUS-114sres99.xml)

~~~
Ericson2314
Thank you! I was unaware of any of this.

It is my understanding that bills are patches, and thus the law works like
darcs—patch-oriented rather than revision-oriented.

Does this sound correct to you? I arrived at this conclusion asking people who
know nothing about VCS, so something might have been lost in translation.

~~~
kuschku
Yes, and no. Each bill is subject to revisions, but once it is passed, it is a
single patch – you could imagine that each bill is a branch in git, developed
in commits one after another, and then the latest status is squashed and
merged.
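
A small Python sketch of that squash-and-merge picture. Everything here is a hypothetical model made up for illustration: the code of law is a mapping from section id to text, each draft of a bill is a full statement of the changes it would make, and only the final draft is applied on passage.

```python
from typing import Dict, List, Optional

Code = Dict[str, str]             # section id -> current text
Patch = Dict[str, Optional[str]]  # section id -> new text, or None to repeal


def enact(code: Code, bill_drafts: List[Patch]) -> Code:
    """Apply only the final draft of a bill, like a squashed merge."""
    updated = dict(code)  # leave the caller's copy of the code untouched
    if not bill_drafts:
        return updated
    final_draft = bill_drafts[-1]
    for section, text in final_draft.items():
        if text is None:
            updated.pop(section, None)  # repeal the section
        else:
            updated[section] = text     # amend or add the section
    return updated


if __name__ == "__main__":
    code = {"1": "A", "2": "B"}
    drafts = [{"1": "A1"}, {"1": "A2", "2": None, "3": "C"}]
    print(enact(code, drafts))
```

The earlier drafts never touch the code itself, which matches the point that a bill's own revision history is separate from the history of the law it amends.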

~~~
Ericson2314
Yeah, that is what I meant. The revision of the bill would be some meta-
history that doesn't quite fit the darcs metaphor.

I also hear the applying of all these patches is a slow manual process only
done periodically? :|

~~~
Animats
Amendment updating is done nightly, and the results are on "congress.gov",
which is the user-friendly access interface. The XML dump is the raw data,
made available to the public.

The user guide for the XML data is on Github.[1]

[1] [https://github.com/usgpo/bill-status/blob/master/BILLSTATUS-...](https://github.com/usgpo/bill-status/blob/master/BILLSTATUS-XML_User_User-Guide.md)

~~~
Ericson2314
"Amendment updating" would be the meta-history of the bills themselves,
right? I meant applying passed and signed bills to the overall body of
law. When is that done?

------
MistahKoala
And not just why isn't there something that links legislative citations
together, but why isn't there something that can tell laypeople what is
current legislation and what isn't? In the UK, we have numerous acts of the
same name but different years. If I find some information about a slightly
obscure issue of concern - say, a website that's ten years old - it might cite
an act of a particular year - and I then look up the details of that
legislation, I can see it, but I've no idea if or how it currently applies, if
it's been superseded by newer legislation etc; it might even just be a list of
'edits' to previous legislation that aren't easily researched by someone
without a formal understanding of law.

The common response to this has usually been "that's why you need a [jurist]",
but I take issue with the idea that the legislation that applies equally to us
should only be understood by those equipped with the means to make sense of
it.

------
jacobheller
Our startup, Casetext (YC S13), was mentioned a few times here, so I thought I'd
stop in.

The crux of the article is that most legal research solutions have ignored the
immense power contained in the links between laws:

> Laws frequently reference other laws in order to reuse definitions,
> introduce exceptions, or make it clear that two concepts are meant to work
> together. Consequential laws tend to get referenced in other laws as their
> influence spreads throughout the legal system. Experienced lawyers build up
> detailed mental maps of these links, allowing them to jump immediately to
> core issues of complex legal problems.

> However, most laws can only be searched using the dark-age, Lycos
> strategy—guess at keywords and hope—and it’s often necessary to pay for even
> that limited functionality.

We at Casetext are taking a very different approach than the "dark-age, Lycos
strategy" that you have to pay for:

1\. On Casetext, the law is free, as is basic search. Honestly, it's insane
that Westlaw and LexisNexis charge as much as they do for basic keyword search
over a database that should have been free to begin with.

2\. We make money by charging for advanced, data-driven ways that lawyers can
research more efficiently. CARA, our premium product, enables a lawyer to
drag-and-drop upload a document they're working on, and will recommend the
research that the lawyer missed but is very relevant to what they're working
on ([https://casetext.com/cara](https://casetext.com/cara)). A key ingredient
behind this awesome tech is the network of citations that the article
mentions.

Whether it's us or other startups, I agree with the article that in the next
few years you'll see a trend towards more "Google for Law" \-- companies will
make legal research free, and their comparative advantage will be on their
technology, often driven by ML/AI. As a lawyer/coder, it's a pretty exciting
time to be in the space.

Oh yeah, and we're hiring!
[https://casetext.com/jobs](https://casetext.com/jobs)

------
usloth_wandows
There is, it just isn't free. Law firms pay mountains of money for up to date
'google for laws'. At least in the U.S.

------
mrleiter
Speaking from a global perspective: every nation has its own way of writing
and linking their laws. Although civil law countries, and case law countries
respectively, have a lot of similarities (historically induced), they are
still unique.

That is not the case when it comes to searching the world wide web. Here, by
design, everything is linked and national borders are mostly irrelevant. So if
you want to implement a Google of law, you have to do it locally. The only
exception would be international law, which itself can be seen as local.

~~~
a3camero
You can also create a normalized system that aggregates all of the local
systems together. And this also applies to international law (assuming you're
talking about treaties) because there's a treaty and then there's the national
implementing legislation in all of the treaty countries.

------
WhiteSource1
LexisNexis and FindLaw? As usloth_wandows said, there are law databases, but
that's a huge part of a lawyer's value and knowledge, so these databases are
extremely expensive.

------
a3camero
I'm the CTO of Global-Regulation.com, the search service with the largest
number of countries (78) and machine-translated laws. We are often described
by clients as the "Google of laws" but there are huge differences. Getting to
Google-level search, where the engine understands your intent, is very
challenging. People often search for industry terms like "SAR reporting" and
what they want is Suspicious Activity Reports (SAR). A Google-like engine
would need to understand what the query means from an industry point of view
(since the terms often don't actually appear in legislation) and then
translate that to the specific term used in each country. This is far from
trivial and requires looking at secondary sources, not just the laws
themselves.
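
A rough sketch of that two-step translation (industry jargon to a concept, then concept to each country's wording) might look like the following. The lookup tables are hypothetical and hard-coded only for illustration, and the Canadian phrase is an assumed example rather than a checked legal term; a real system would have to derive these mappings from secondary sources.

```python
# Hypothetical lookup tables; a real system would build these from
# secondary sources rather than hard-coding them.
INDUSTRY_TERMS = {
    "sar": "suspicious activity report",
}
JURISDICTION_TERMS = {
    "suspicious activity report": {
        "US": ["suspicious activity report"],
        "CA": ["suspicious transaction report"],  # assumed wording
    },
}


def expand_query(query, jurisdiction):
    """Return the search phrases to run for `query` in `jurisdiction`."""
    phrases = []
    for token in query.lower().split():
        concept = INDUSTRY_TERMS.get(token)
        if concept is None:
            continue
        # Use the local wording if known, else fall back to the concept name.
        local = JURISDICTION_TERMS.get(concept, {}).get(jurisdiction, [concept])
        for phrase in local:
            if phrase not in phrases:
                phrases.append(phrase)
    # Fall back to the raw query when no industry term was recognised.
    return phrases or [query]
```

For example, `expand_query("SAR reporting", "CA")` looks up "sar", finds the concept, and returns the phrase listed for that jurisdiction instead of the acronym the user typed.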

Other problems include official vs. unofficial laws, slow consolidations,
updates to the law (of various kinds) and attempting to normalize the world's
laws to a US standard (like excluding municipal laws and avoiding guidance-
type documents from civil law countries). These are problems Google doesn't
have to deal with and customers expect a very high standard for legal search
results.

------
richardboegli
There is already one being worked on by Casetext.

Just saw this posted to HN RSS feed by lever...

Maybe lever watches what's trending on HN and then puts job adverts? If so,
NICE ;)

Become a Data Scientist/Machine Learning Engineer at Casetext
[https://news.ycombinator.com/item?id=13307644](https://news.ycombinator.com/item?id=13307644)
[https://jobs.lever.co/casetext/c7f0129e-af9b-461e-b791-a9323...](https://jobs.lever.co/casetext/c7f0129e-af9b-461e-b791-a93235cea2af)

Machine learning is at the core of Casetext's mission to make the law free and
understandable and we're looking for an ML engineer/data scientist to help us
build the next generation of legal research products. The data team at
Casetext is working on groundbreaking legal technology for document
recommendation and search. If you have industry experience developing
production software for machine learning, especially in areas like NLP, graph
models, topic modeling, and/or recommendation engines, we'd love to talk to
you.

------
kaa2102
The closest thing right now is Cornell's Legal Information
Institute--[https://www.law.cornell.edu](https://www.law.cornell.edu). There
is also the same problem with academic journals--LexisNexis has a searchable
database with cases for a pretty penny. Also, pacer.gov enables users to
access cases and dockets but the structure and "per page" cost make it
difficult to be a useful search engine.

I've argued a couple cases in district court (and one case is on the docket of
the Supreme Court) and I've used a mix of law school textbooks,
Scotusblog.com, Cornell's Legal Information Institute and lawyer's blogs to
start background research.

[https://www.supremecourt.gov/search.aspx?filename=/docketfil...](https://www.supremecourt.gov/search.aspx?filename=/docketfiles/16-6814.htm)

------
iam4xzor
There's a French startup killing it:
[https://www.doctrine.fr/](https://www.doctrine.fr/) (works for France only)

------
showerst
Another group that's taking this on is
[https://github.com/statedecoded/statedecoded](https://github.com/statedecoded/statedecoded)
\-- although they're targeting converting existing data rather than being a
workflow for the states.

------
maxboisvert
We have free access to law with canlii.org in Canada.

~~~
vqc
I'm not familiar at all with Canadian law. Does canlii.org provide statutes,
regs, case law from every jurisdiction from the local level (cities/towns) up
through the federal level? Is this a distinction that matters?

~~~
citeright
You can see CanLII's coverage here:
[https://www.canlii.org/en/databases.html](https://www.canlii.org/en/databases.html).

~~~
vqc
That is magnificent. It seems to "end" at the province/territory level (*this
is not a critique at all). How does lawmaking work at the city level?

~~~
a3camero
Cities pass "bylaws" that affect their municipality. In Canada municipalities
are more limited in their lawmaking powers than cities in some other
countries.

Bylaws are hard to look up and are city-specific.

------
palunon
In France we have Legifrance, which is government-run, gives you access to
every French law, code, etc., and lets you see modifications over time and
where the article is cited (the "links").

Eg.
[https://www.legifrance.gouv.fr/affichCodeArticle.do;jsession...](https://www.legifrance.gouv.fr/affichCodeArticle.do;jsessionid=3178D9B64C307BBB4A2D9CA27EE167A2.tpdila22v_1?cidTexte=LEGITEXT000006071154&idArticle=LEGIARTI000032655793&dateTexte=20170103&categorieLien=cid#LEGIARTI000032655793)

It's probably made simpler by the fact that we are not a federation with law
making bodies everywhere...

------
cakeface
Another reason why it is difficult to create the "google" for the law is that
some laws are copyrighted and are not in the public domain. Stop. Think about
what I just said. There are laws that you must pay to read.

A good example of this is when a state or municipality enacts a building code.
A common building code is the electrical code published by NFPA. Most states
use this. NFPA owns the copyright for this. You cannot publish a PDF of the
electrical code on your website, yet you are required by law to follow it.

There may be other cases of this, I don't know. But I think it is crazy!

------
Mathnerd314
I'll just mention PlainSite:
[http://www.plainsite.org/](http://www.plainsite.org/)

It basically is "google for the law" (provided that your definition of law
only extends to federal courts and state appeals courts). But they typically
have full opinions available for free.

General summary: [http://thelegalpioneer.blogspot.com/2014/02/plainsite-
puttin...](http://thelegalpioneer.blogspot.com/2014/02/plainsite-putting-law-
in-plain-sight.html)

------
jarjoura
In university I had full access to LexusNexus, of course it's not free, but
you could literally find any obscure legal text written in the country's
history, in the click of a button.

~~~
wfunction
It's "LexisNexis"... it's not referring to the nexus of a car company. :)

------
oarfish
Well, for German law, there is [https://lawly.org](https://lawly.org), the
result of a recent bachelor's project at my university.

~~~
AlbertoGP
This looks nice. Are you involved in it?

I took a look at the Umsatzsteuer [Value Added Tax] part as I had to read it
some years ago when I started freelancing:
[https://lawly.org/gesetz/UStG%201980/4.1#12-steuersaetze](https://lawly.org/gesetz/UStG%201980/4.1#12-steuersaetze)

The Inhaltsübersicht [content overview] list at the right side is yellow on
white which makes it hard to read.

In comparison with the place where I've read the German law before, there seem
to be surplus list elements in the HTML: [https://www.gesetze-im-
internet.de/ustg_1980/BJNR119530979.h...](https://www.gesetze-im-
internet.de/ustg_1980/BJNR119530979.html#BJNR119530979BJNG001204301)

It's nice though that by registering I could download the content for off-line
use. AFAIK gesetze-im-internet.de does not provide that.

Are there any other relevant differences between those two services?

------
seshagiric
One thing that deserves a mention is the Lexis add-in for Microsoft Word.
[http://www.lexisnexis.com/en-us/products/lexis-for-
microsoft...](http://www.lexisnexis.com/en-us/products/lexis-for-microsoft-
office.page)

It's a pretty cool utility to integrate Shepardizing, research and info from the
online Lexis law database into the context of a document a lawyer/paralegal
may already be working on.

------
katpas
This reminds me of the concepts written about here* basically trying to
develop robust definitions for legal terms and objects so a question like
'what are my rights in X situation' could be answered accurately by a
computer.

*[http://blog.stephenwolfram.com/2016/10/computational-law-sym...](http://blog.stephenwolfram.com/2016/10/computational-law-symbolic-discourse-and-the-ai-constitution/)

~~~
cookiecaper
The issue with our system of justice is that it's _too_ unfeeling and robotic.
Everyone has an inherent sense of what's just, yet it takes at least 7 years
of schooling and many difficult technical achievements to be allowed to
participate in its implementation (beyond simply plucking suspects from the
street).

Making it computerized does not seem like the correct course of action. Human
judges and juries are needed to fully evaluate the context and pass judgment.

My personal preference would be to go toward a less rigid system of law, not
one so rigid that computers could reliably enforce it.

~~~
nradov
When the law becomes less rigid that means that whoever happens to be in power
at the time gets to decide on a whim what's legal or illegal. Historically
that hasn't worked out very well.

~~~
cookiecaper
"Whoever happens to be in power" already decides what's legal or illegal.
That's what being in power means. No one respects a piece of paper; power
ultimately comes down to the ability to exert force to see the decrees of the
powerful imposed. If this isn't underneath the covers somewhere, the power
moves to someone who does have this ability.

There's already a huge amount of finagling by powerful individuals and groups
in our government, they just have a lot of pomp and circumstance to try to
cover it up. Removing some of the formalities makes flexibility more
accessible.

Sure, you can spend the millions of dollars it takes to successfully lobby
Congress if you're a big multinational corporation. If you're a niche concern,
you're stuck.

Everyone hates mandatory minimum sentences these days. They were put in for a
lot of drug crimes in the late 80s-early 90s, and they result in a lot of
unneeded incarcerations, not only costing the taxpayer a lot of money, but
costing society, family, and community the participation and productivity of
someone who would be much more beneficial outside than in. Because of our
rigid legal traditions, mandatory minimums must be enforced regardless of
circumstances.

When you get down to the bottom of it, no matter what system of governance you
have, you need its administrators to be benevolent and wise to get desirable
outcomes. I believe that more local authorities are more able to make wise
decisions because they not only know the area more intimately, but are more
impacted by the outcomes. A far-off judge doesn't care if he sends 40% of the
community to jail. A local judge does.

This is kind of like being entitled to being judged by a jury of _your peers_.
Peers know the cultural norms and the local expectations. High-powered
attorneys sitting on a bench in Washington, D.C. may not.

------
anotherhacker
>Why isn't there a Google for the law?

There's no (or little) money to be made doing it.

Maybe in the future when collecting and modeling such knowledge is cheap. For
now, it's not cheap.

------
aurizon
Quite simply, this is a racket. Courts are public, and anyone can attend and
write down what goes on and publish it. There is an official court reporter
who does this. There is no authorship, but the reporter is granted a
copyright. They then charge fees for access. The problem lies there. The
courts should publish all cases in the open, with no copyright. That however
blocks lawyer and reporter revenue streams and they will protect their racket.

------
angled
Is WorldLII not already a good start?

[http://www.worldlii.org/databases.html](http://www.worldlii.org/databases.html)

~~~
emmelaich
I was about to mention AustLII
[http://www.austlii.edu.au/](http://www.austlii.edu.au/) which has been around
for about 20 years. I hadn't heard of WorldLII and CanLII; seems they're part
of a network.

And it looks like AustLII was where it started; the WorldLII contacts are all
AustLII people.

~~~
a3camero
There are many others: [http://www.saflii.org/](http://www.saflii.org/)
(Southern African countries) [http://www.paclii.org/](http://www.paclii.org/)
(Pacific Island countries)

------
nickjamespdx
Has no one mentioned Ravel yet?
[https://www.ravellaw.com/search](https://www.ravellaw.com/search)

------
Apocryphon
"I want my lawyer program."

[http://www.technovelgy.com/ct/content.asp?Bnum=864](http://www.technovelgy.com/ct/content.asp?Bnum=864)

------
drdeadringer
I remember when Google Patent was around. My uncle and cousin found it very
useful as patent lawyers on both ends of the experience spectrum. When it got
shut down they were disappointed.

~~~
arthurcolle
Google Patent shut down? Whaaat

~~~
loqwe
[https://patents.google.com/](https://patents.google.com/)

~~~
lgas
It has risen.

------
abrbhat
Indian Judiciary System has its own online repository of court orders.
[http://www.judis.nic.in/](http://www.judis.nic.in/)

------
erikb
Why is google not enough google for the law? All my law requirements have been
met by Google. And my lawyer friends also use it.

------
known
[https://indiankanoon.org/](https://indiankanoon.org/) in India

------
josto
Judicata is a start. You can search California statutes and case law.

But lexis and westlaw are the tools needed for serious research.

~~~
meddlepal
My brother, who is a lawyer, said Westlaw and LexisNexis have the unfortunate
problem of being 99.9% accurate, but the original source material still
sometimes needs to be pulled because of errors during import, such as a dropped
comma, which can change the entire meaning of a law.

------
mtdewcmu
It would be a good idea to make a wiki, since the laws themselves aren't very
readable.

------
waspleg
because money

source: my dad was a lawyer for 20+ years.

------
gioele
There are plenty of sources for acts (many national and supra-national
services [1]) and for US case law (Google Scholar).

The _text_ of law alone is worthless. In very few cases you should search for
keywords.

The unsolved problem is that what is needed (and there are private systems
that can do this) is the ability to make queries like

    
    
       In 2014
       my only child was 17,
       my family lived in Italy
       but I worked most of my time in the UK;
       which version of the Italian child-care law applied to me at the time?
    

In order to answer this query you need to:

1\. Know all the text of all the acts out there at the Italian level, European
level and supra-national level.

2\. Find the main acts that deal with child-care law.

3\. Find all the acts that modify those main acts (they could extend its
duration, modify its content).

3b. Find all the acts that modify the acts that modify those main acts (maybe
the extension has been repealed).

3c. Find all the acts that modify the acts that modify the acts that modify
those main acts (I think the point is clear now)

4\. Consolidate (merge) all these acts using the rules that were in place at
the time of the enactment. This produces a tree of versions for each point in
time, not just a single version.

5\. Find all the judgments that reference any of these acts.

6\. Highlight the points that have to do with the user query.
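
Steps 3 to 3c and the "point in time" part of step 4 can be sketched as a graph walk plus a date filter. This is only an illustration under simplifying assumptions: the `ACTS` records, their ids and dates are made up, and real consolidation has to merge the text itself, which this does not attempt.

```python
from datetime import date

# Hypothetical records: (act id, id of the act it amends or None, date in force)
ACTS = [
    ("main", None, date(2010, 1, 1)),
    ("amend_1", "main", date(2012, 6, 1)),
    ("amend_2", "amend_1", date(2013, 3, 1)),
    ("amend_3", "main", date(2015, 9, 1)),
]


def amendment_chain(main_id, acts):
    """Collect every act that modifies `main_id`, directly or indirectly."""
    amended_by = {}
    for act_id, target, _ in acts:
        if target is not None:
            amended_by.setdefault(target, []).append(act_id)
    found, stack = [], [main_id]
    seen = {main_id}
    while stack:
        current = stack.pop()
        for amender in amended_by.get(current, []):
            if amender not in seen:  # guards against circular references
                seen.add(amender)
                found.append(amender)
                stack.append(amender)
    return found


def in_force_on(main_id, acts, when):
    """Amending acts relevant to `main_id` in force on `when`, oldest first."""
    in_force = {act_id: start for act_id, _, start in acts}
    chain = [a for a in amendment_chain(main_id, acts) if in_force[a] <= when]
    return sorted(chain, key=lambda a: in_force[a])
```

Asking for the state at the end of 2014 with `in_force_on("main", ACTS, date(2014, 12, 31))` picks up the 2012 amendment and the 2013 amendment to that amendment, but not the one that only took effect in 2015.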

Truth be told, having the raw text (point 1) is the easiest part. The rest is
what is extremely complicated. Regardless of this, there are private systems
in place that can perform this kind of query (although in a very limited
fashion: their idea of "the whole corpus of law" is extremely narrow).

To make the life of implementers easier, markup formats like AkomaNtoso [2] or
Oasis LegalDocumentML/LegalRuleML [3] are being used, sadly not enough.

Making the corpus of the law accessible is an important first step. But the
corpus alone is not going to be much help. It may even be dangerous if
the single texts are not cross-referenced with other relevant texts.

 _Appeal to authority: I worked on versioning legal documents (bills, acts,
judgments, etc) during my PhD. I also worked in the research group that shaped
the early versions of AkomaNtoso._

[1] IT: [http://normattiva.it](http://normattiva.it) (ex NormeInRete) DE:
[https://www.gesetze-im-internet.de](https://www.gesetze-im-internet.de) EU:
[http://eur-lex.europa.eu](http://eur-lex.europa.eu) US-CA:
[http://legisweb.com](http://legisweb.com)

[2] [http://akomantoso.org](http://akomantoso.org)

[3] [http://www.legalxml.org/](http://www.legalxml.org/)

~~~
matt4077
I worked on a platform for EU law for about half a year and got quite deep
into Akoma Ntoso. It's certainly the right direction but it was unfortunately
extremely difficult to even get an overview of the status. The various
websites felt dead and the documentation wasn't inspiring my confidence either
– IIRC it was actually provided as a word document :).

Plus, I just couldn't find anybody publicly using it. The EU parliament
supposedly does, but they could never give me an answer why they weren't
sharing it online (only .doc and .pdf).

~~~
ocky7
Apparently, the UN and some African countries also make use of Akoma Ntoso,
but like you I've never seen it in practice. Open data is not provided well.
For example, it is possible to get XML documents from the EU parliament, but
you have to request access to their FTP server and it is a laborious process.

I've met Monica Palmirani of Akoma Ntoso recently and she told me they have
just launched a case law standard, and so they're still working on it. I'm
actually amazed she hasn't burned out yet. Trying to get governments to play
nice data-wise is i n c r e d i b l y hard.

~~~
oever
The case-law standard is called ECLI. It's used in a few countries and being
rolled out in more. At least one Dutch newspaper uses ECLI when referring to
court cases.

[https://en.wikipedia.org/wiki/ECLI](https://en.wikipedia.org/wiki/ECLI)
[http://bo-ecli.eu/](http://bo-ecli.eu/)

Dutch law and government publications are available as XML and ODF.

The constitution:
[http://wetten.overheid.nl/BWBR0001840/2008-07-15](http://wetten.overheid.nl/BWBR0001840/2008-07-15)

A publication about standards:
[https://zoek.officielebekendmakingen.nl/stcrt-2015-39782.htm...](https://zoek.officielebekendmakingen.nl/stcrt-2015-39782.html)

Each article can be linked and documents referring to each article can be
found as well.

For example, all known documents that link to article 1 (equality) of the
constitution:

[http://linkeddata.overheid.nl/embedded/portal/spiegel-
lijstw...](http://linkeddata.overheid.nl/embedded/portal/spiegel-
lijstweergave?juriconnect=jci1.3%3ac%3aBWBR0001840%26hoofdstuk%3d1%26artikel%3d1%26z%3d2008-07-15%26g%3d2008-07-15)

The links are available as RDF.
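
The `juriconnect` parameter in that URL is just a percent-encoded reference, so it can be pulled apart with the standard library. A minimal sketch, assuming the layout seen in this one example (a colon-separated head followed by `&`-separated key/value pairs); the dictionary keys `version`, `kind` and `document_id` are my own labels, not names from the juriconnect standard.

```python
from urllib.parse import parse_qsl, unquote


def parse_juriconnect(encoded):
    """Split a percent-encoded juriconnect reference into its parts."""
    decoded = unquote(encoded)
    # Head is "jci1.3:c:BWBR0001840"; tail is "hoofdstuk=1&artikel=1&...".
    head, _, tail = decoded.partition("&")
    version, kind, document_id = head.split(":", 2)
    return {
        "version": version,
        "kind": kind,
        "document_id": document_id,
        "parameters": dict(parse_qsl(tail)),
    }


# The reference string is copied from the link above.
reference = parse_juriconnect(
    "jci1.3%3ac%3aBWBR0001840%26hoofdstuk%3d1%26artikel%3d1"
    "%26z%3d2008-07-15%26g%3d2008-07-15"
)
```

The `document_id` that comes out matches the BWBR identifier in the constitution URL above, and the `hoofdstuk` and `artikel` parameters give the chapter and article being pointed at.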

Currently work is underway to publish law as XML and RDF with ODF/PDF/HTML as
secondary formats. This will allow embedding of data such as property lines,
lists of medicines, reusable financial reports.

------
LoSboccacc
liabilities

------
X86BSD
LexisNexis? Westlaw?

