A Modern Compiler for the French Tax Code

weinzierl · on Nov 25, 2020

> In France, income tax is computed from taxpayers' individual returns, using an algorithm that is authored, designed and maintained by the French Public Finances Directorate (DGFiP). This algorithm relies on a legacy custom language and compiler originally designed in 1990, which unlike French wine, did not age well with time.

This is interesting. When it comes to legacy code we think of COBOL and FORTRAN most of the time, but there is probably a huge amount of even more exotic code out there that does its duty day in, day out.

semi-extrinsic · on Nov 25, 2020

In my country, with an admittedly quite simple tax code, the authorities used to publish each year official COBOL code that computed income tax.

They switched to Java in 2018-ish - and to Github instead of a random FTP server.

Doing your taxes has also been a no-op for many people the past decade or so. You don't do anything except read the tax report, unless you spot some mistakes or items missing in the report that was autofilled by the government, then you can go into the web form and submit the amendments necessary.

specialist · on Nov 25, 2020

That's genius.

A buddy of mine retired rich after creating property tax administration systems. They only had a few client counties.

They'd just about roll out new system updates before the next wave of tax code revisions. Never-ending work.

Sounded like living hell. I couldn't do that kind of work and stay sane. But he seemed to enjoy it.

e12e · on Nov 25, 2020

Recently helped my dad clear out his office - and we came across a binder with fan-folded, faded dot matrix printed copy of COBOL code he used in order to calculate taxes in the 80s - translating the code to BASIC. The original COBOL code came from the Norwegian tax office - I'm not sure if they still publish something similar.

Ed: ah, same country https://news.ycombinator.com/item?id=25210009

A little unfortunate that we threw it out.

sdfin · on Nov 25, 2020

That sounds amazing. Which country is it?

ukd1 · on Nov 25, 2020

Cool - which country? Could you link to the Github repo?

semi-extrinsic · on Nov 25, 2020

Yeah, it's Norway.

https://github.com/Skatteetaten/trekktabell

danuker · on Nov 25, 2020

How can you tell? All "Languages" I get is "Java 100.0%".

semi-extrinsic · on Nov 25, 2020

The COBOL part is from memory. I'll have a look to see if I can find a copy floating around somewhere.

Edit: the transition appears to be more like 2016, and one of the commits then refers to a file generated from COBOL: https://github.com/Skatteetaten/trekktabell/commit/5181d86c5...

e12e · on Nov 25, 2020

I did find an awkward source that lists the 2017(?) COBOL code for the tables - the core program seems to be last updated 1993 (Norwegian only) : https://docplayer.me/38761458-Beregning-av-forskuddstrekk-in...

> IDENTIFICATION DIVISION.  PROGRAM-ID. FT7P200T. AUTHOR. PER J. RISTUN. DATE-WRITTEN. NOVEMBER 1993 ----------------------------------------------------------------  * BESKRIVELSE : PROGRAMMET BEREGNER FORSKUDDSTREKK FOR ÅR 2017 * (...)

semi-extrinsic · on Nov 25, 2020

This is the one!

If memory serves, they updated the program for each year (I guess only numerical constants and small rule changes), but they probably didn't update the source in the documentation.

1f60c · on Nov 25, 2020

I think it's Norway.

(I just googled GP's email address (it's in their profile).)

pratik661 · on Nov 25, 2020

I’m guessing it’s Estonia. It’s really ahead with digital governance.

skeletal88 · on Nov 25, 2020

Estonian here - no it is not Estonia.

Here our income tax declaration is very simple, the tax office already knows how much salary we have earned, and there aren't many deductions, so most people either don't have to file taxes at all, or just click next a few times and get their tax returns after a week or so. For 90% of people it takes less than 5 minutes.

toolslive · on Nov 25, 2020

In Belgium, calculation of the pensions was done in a mix of BS2000 assembly language and COBOL until at least 2007, when they entered a race to switch to Java before the last of the original developers retired.

specialist · on Nov 25, 2020

I worked for an org that tried to modernize, again, their mainframe legacy code, and failed.

I always thought it'd be easier to use an emulator and keep using the old stuff. Then over time whittle the code base down to just the business logic.

toolslive · on Nov 25, 2020

So you think they still had a 1970s mainframe? No: Siemens themselves replaced it with an x86-based emulator in the 90s (which ran on a desktop machine).

Icathian · on Nov 25, 2020

My entire industry (subset of healthcare) mostly uses a single application for 95% of our core business functions. That application is written in Visual FoxPro and is, predictably, pretty terrible. Obviously it isn't going to get better, either, but it'll be a decade or more before there's a serious competitor.

ghaff · on Nov 25, 2020

I'm not near the space any longer (in a former life, the healthcare industry was a big customer of our computer systems). But a significant portion of healthcare in the US used to use a proprietary language called MUMPS. https://en.wikipedia.org/wiki/MUMPS

alextheparrot · on Nov 25, 2020

Epic has been trying to move on for years, or so a recruiter told me half a decade ago. Their job posting now describes “Using leading-edge technologies and languages like JS, TS, and C#”, though I wouldn’t put a bait and switch past them.

akx · on Nov 25, 2020

Yeah, well, they have a thing called TS2M, referred to e.g. here... https://www.reddit.com/r/epicsystems/comments/9pmsjj/ts2m_in...

ghaff · on Nov 25, 2020

MEDITECH, maybe others, had a proprietary operating system too. (We had a special deal to sell them hardware without an OS at a time when they were normally bundled.) Eventually they moved on to NT.

p_l · on Nov 25, 2020

MUMPS is quite popular in finance, mostly through Cache product.

Including for new developments.

legulere · on Nov 25, 2020

Hah I’m also working in healthcare and we also use Visual FoxPro for storing data. Luckily most of the Visual FoxPro UI is gone.

knaq · on Nov 26, 2020

We have it right here. Hacker News is written in the Arc programming language, which was developed by Paul Graham and Robert Morris.

https://en.wikipedia.org/wiki/Arc_(programming_language)

tester34 · on Nov 25, 2020

I was asked to write a parser for _my_country_law_ docs, plus something capable of doing diffs and putting them together (diff + original => newest version), without any knowledge of that domain

after seeing sample doc I've estimated it on something like 1 week of work (XD)

month later I've been crying and having like 20-30% done

this shit has been so sensitive (insanely error prone) and debugging was time costly. I think I didn't spend enough time thinking about its architecture, but on the other hand my experience with this stuff was pretty small

easy & fast to test and reliably

have solid abstraction over original documents

have solid abstraction over operation (e.g Article 5's meaning is changed)
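A minimal sketch of the "diff + original => newest version" step, assuming a law can be modelled as a mapping from article identifier to text and that each amendment touches exactly one article. The `Amendment` type, the operation names and the sample articles are all hypothetical; real amending acts are far messier than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Amendment:
    """One change to a law (hypothetical model, one article per change)."""
    op: str          # "replace", "insert" or "repeal"
    article: str     # article identifier, e.g. "5"
    text: str = ""   # new wording; unused for "repeal"


def apply_amendments(law: dict, amendments: list) -> dict:
    """Return a new consolidated version; the original mapping is not modified."""
    result = dict(law)
    for a in amendments:
        if a.op == "replace":
            if a.article not in result:
                raise KeyError(f"cannot replace missing article {a.article}")
            result[a.article] = a.text
        elif a.op == "insert":
            if a.article in result:
                raise KeyError(f"article {a.article} already exists")
            result[a.article] = a.text
        elif a.op == "repeal":
            if a.article not in result:
                raise KeyError(f"cannot repeal missing article {a.article}")
            del result[a.article]
        else:
            raise ValueError(f"unknown operation {a.op!r}")
    return result


# Made-up sample data, for illustration only.
original = {"4": "Income is taxed yearly.", "5": "The rate is 10%."}
changes = [
    Amendment("replace", "5", "The rate is 12%."),
    Amendment("insert", "5a", "Pensions are exempt."),
]
newest = apply_amendments(original, changes)
```

Keeping amendments as plain data and the function free of side effects means each change can be tested in isolation, which is the "easy & fast to test" property listed above.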

_____________

project died because it was needed "fast" (at that time there was a very specific period of changes in law) and we weren't getting to a viable version fast enough

mywittyname · on Nov 25, 2020

I feel like legislation is a field that would benefit a great deal from the progress that software engineering has made in a number of fields.

It's already evolved somewhat into a DSL. With some nudging and technical leadership, I suspect that we could move it over entirely into a format that can be readily parsed, tested, and version controlled. The tax code is especially well-suited to this, because it's a lot of rules and math formulas.

In fact, I bet most tax software probably does something very similar to this.
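To illustrate what "rules and math formulas" that can be parsed and tested might look like, here is a minimal sketch of a progressive income tax. The bracket thresholds and rates are invented for the example and do not correspond to any real tax code; amounts are whole currency units and rates are integer percentages, so the arithmetic stays exact.

```python
# Hypothetical brackets: (upper bound of bracket, marginal rate in percent).
# None marks the open-ended top bracket.
BRACKETS = [
    (10_000, 0),
    (30_000, 10),
    (None, 30),
]


def income_tax(income: int) -> int:
    """Tax owed on a non-negative income, in whole currency units."""
    if income < 0:
        raise ValueError("income must be non-negative")
    tax = 0
    lower = 0
    for upper, rate_percent in BRACKETS:
        if upper is None or income < upper:
            # Income ends inside this bracket: tax only the part above `lower`.
            tax += (income - lower) * rate_percent // 100
            break
        # Income covers the whole bracket.
        tax += (upper - lower) * rate_percent // 100
        lower = upper
    return tax
```

With the rule expressed as data plus a small function, a change in the law becomes a change to `BRACKETS` that can be reviewed as a diff and checked by a handful of assertions.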

javajosh · on Nov 25, 2020

Laws are supposed to be systematic invariants, and to specify them requires a comprehensive vocabulary containing the union of all the things people do in a life. This language is necessarily bound to your time and place, and it may not translate well in the future. (I feel like the problem of laws that live long enough to outlast the language in which they were defined hasn't actually been solved! Certainly we should have to reword the Constitution at least once every 200 years!)

javajosh · on Nov 26, 2020

BTW how do you think we would create a new Constitution today? What would it be made of, how would it be made, and how many copies would be made?

logicx24 · on Nov 25, 2020

I heard about the startup Legalese [1] a while ago, which does exactly this.

[1]: https://legalese.com/

an_opabinia · on Nov 25, 2020

It would be more valuable to write many fewer, simpler policies, but transparently document the why when it comes to decisions by stakeholders. For example, the only difference between getting something from the government (say, a California Public Records request) and a court is (1) cost and (2) the court almost always has to write a defensible opinion for why something happened some way, but a government agency does not. At the end of the day you wind up with the same possibilities of results.

jcranmer · on Nov 25, 2020

> the court almost always has to write a defensible opinion for why something happened some way, but a government agency does not.

That's not true. Agencies that do rulemaking have to provide justification for why they're enacting or repealing particular regulations, otherwise those regulations can be struck down as capricious. This is a major problem the Trump administration had: a lot of their attempts to rollback Obama's policies or enact new ones foundered on the ability to provide this justification.

an_opabinia · on Nov 26, 2020

> a lot of their attempts to rollback Obama's policies or enact new ones foundered on the ability to provide this justification

...in the court of law. Someone had to go and sue the administration. When they issued the policies, they simply provided no meaningful justification. You're proving my point.

jcranmer · on Nov 26, 2020

It's not the court that does the justification, it's the administration that has to. It has to be litigated in the court because, well, no other branch of the government has the power to decide that someone is violating the law.

walshemj · on Nov 25, 2020

This sounds easy until you actually get to grips with how parliamentary procedure and law making actually works.

For example last year I was on a SOC (Standing orders committee) for a 3.5 day conference with 600+ delegates.

We had 120 motions submitted. Working out consequentials (if motion 15 passes, motions 16, 99 and 120 fall) is non-trivial.

We also had to do compositing of 20/21 motions on one topic, which were all worded slightly differently and had slightly different effects. It took 4 of us about 2/3 of a day just for that.
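The bookkeeping side of consequentials can be sketched as follows, assuming the committee has already decided which motions fall if a given motion passes (that decision is the hard, human part). The function name, the agenda and the vote outcomes are hypothetical; only the "15 knocks out 16, 99 and 120" relation comes from the example above.

```python
def run_agenda(agenda, consequentials, would_pass):
    """Walk the agenda in debate order.

    agenda:         motion numbers in the order they are taken
    consequentials: motion -> set of motions that fall if it is carried
    would_pass:     motions that conference would carry if they reached a vote
    Returns (carried, fallen).
    """
    carried = []
    fallen = set()
    for motion in agenda:
        if motion in fallen:
            continue  # a fallen motion is never put to the vote
        if motion in would_pass:
            carried.append(motion)
            fallen |= consequentials.get(motion, set())
    return carried, fallen


agenda = [15, 16, 99, 100, 120]
consequentials = {15: {16, 99, 120}}
carried, fallen = run_agenda(agenda, consequentials, would_pass={15, 16, 100})
```

Here motion 16 would have been carried on its own merits but never reaches a vote, because motion 15 was taken first.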

Another example is, say, the various legal documents for pensions: a choice of a different two letter word can lead to years of legal arguments - the difference between CPI and RPI

sjy · on Nov 25, 2020

The current buzzword for this is “rules as code” and many governments are exploring it. You’re right that it is particularly relevant to complicated financial laws like the tax code, and necessarily has been done in a limited way by tax software for many decades now. I am skeptical that this technology can meaningfully assist us in resolving contentious legal questions – it’s more about generating wizards that can help you navigate a 1,000 page tax statute by only showing you what’s relevant to your problem.

gamblor956 · on Nov 25, 2020

This is mostly true, however laws are already generally written as (legal) diffs, and many agencies issue regulations as diffs as well.

Cornell Law has been providing version control releases of the federal laws for free for some time, and Lexis and their competitors have been doing the same commercially for decades.

eecc · on Nov 25, 2020

IMHO we shouldn't define the problem in terms of reducing legislation into some computer language or schema; indeed the effort should be in describing and linking the terms into a graph with consistency validation rules and shape identities.

paulvorobyev · on Nov 25, 2020

Do you plan on open-sourcing it, or do you have any recommendations for similar projects/research? Very excited about the computational law space.

sjy · on Nov 25, 2020

Check out http://austlii.community/wiki/DataLex. In Australia we are fortunate enough to have AustLII publishing virtually all Australian laws and court decisions (at least from this century) for free in a somewhat consistent HTML format. DataLex is the name for the “computational law” research AustLII staff have been doing since the 1980s. It’s interesting, but the real value of AustLII’s work is in getting the courts and legislatures to allow them to collect all the raw data and publish it in a free database with full text search. Just getting to that point is a huge improvement over what’s freely available in the US and the UK.

tester34 · on Nov 26, 2020

I don't think I could legally do that

specialist · on Nov 25, 2020

Who was the client?

I ask because my state's legislature has a staff that reviews proposed legislation and then maintains the revised official laws. If I was tasked with doing diffs, I'd first interview them, try to make their jobs easier.

tester34 · on Nov 26, 2020

Private Persona

vmception · on Nov 26, 2020

yeah complex tax laws often have interrelated and cocontingent things to compute the final tax on.

Is this deduction before or after your AGI, what's the maximum % of AGI that it can be, does taking the deduction lower your AGI and thus the max percentage of AGI that it can be?
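The circular case can be sketched as a fixed-point computation. This assumes a made-up rule, not taken from any real tax code: the deduction is capped at a fraction of AGI, and AGI is gross income minus that same deduction. The function and parameter names are hypothetical.

```python
def capped_deduction(gross_income, requested, cap_rate, tol=1e-9, max_iter=1000):
    """Solve deduction = min(requested, cap_rate * (gross_income - deduction)).

    Plain fixed-point iteration; it converges when 0 <= cap_rate < 1
    because each step shrinks the error by a factor of cap_rate.
    """
    deduction = 0.0
    for _ in range(max_iter):
        agi = gross_income - deduction
        updated = min(requested, cap_rate * agi)
        if abs(updated - deduction) < tol:
            return updated
        deduction = updated
    raise RuntimeError("deduction did not converge")


# Cap is binding: the naive answer would be 30% of 100000 = 30000,
# but the deduction itself lowers AGI, so the consistent value is smaller.
binding = capped_deduction(100_000, requested=40_000, cap_rate=0.30)

# Cap is not binding: the full requested amount is allowed.
not_binding = capped_deduction(100_000, requested=10_000, cap_rate=0.30)
```

When the cap binds, the limit is the closed form `cap_rate * gross_income / (1 + cap_rate)`; the iteration is shown because real rules rarely reduce to one line.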

tipain · on Nov 26, 2020

Ex Dee

vishnugupta · on Nov 25, 2020

During one of those rabbit hole journeys I discovered that Dutch Tax authorities use MPS, a DSL creation language/tool by JetBrains [1]

It makes me wonder, why haven't DSLs caught on? Their claim that it allows developers to spend more time implementing the business logic makes sense. But somehow that promise hasn't been realised. I'm curious to hear from those who have tried DSLs.

[1] https://www.jetbrains.com/mps/documentation/

ethbr0 · on Nov 25, 2020

The nearest I've been able to articulate this is lack of overlapping skillsets & departments' tendencies to hire people who look like them.

The kind of person who (a) lives in business problem land & (b) is proficient in programming language design (even guided DSL generation)... doesn't exist. At least not in hireable numbers.

And those that do are buried deep in the guts of consultancies, who can afford to pay them way more than customers can.

It's a shame, because it results in suboptimal software. And suboptimal tooling available to the devs that do work in that space.

The best solutions I've seen are products that lower the knowledge barrier of entry (at least for creating a proof of concept) + designing for non-programmers as your primary users.

denismerigoux · on Nov 25, 2020

Hi, OP here :) I've come to the same conclusion about programming language creation. However, for a very large organization, it makes sense to have a team in charge of language tooling (see Dropbox/Python, Facebook/Hack/Flow, Apple/Swift). Which is why I'm trying to convince the French state that they should have such a team of permanent people.

ethbr0 · on Nov 25, 2020

I worked in a similar kind of role recently, in a large public retail company.

The only advice I'd give would be to lean hard on finding, maturing, and then advertising end-user champions.

Cross-department / -traditional boundary products are frustratingly difficult to push top-down, as the leader of the space that "owns" the product (i.e. IT) doesn't directly see the value, because they're not the end user (business).

What mostly worked for me was being as loud as possible with open-attendance educational events, continually taking meetings from interested areas, and then mentoring developing teams.

The goal is to help them create a killer product using your product, such that (highly-placed leader on their side) talks to (your leader) in glowing terms about your product. And that usually happens because your product helped them get a win that moved an important metric to them.

Hint: Ask them about things they've always wanted to do, but couldn't because it was technically impractical. There's probably at least one diamond in there that would be "easy" with your product.

Hint2: Think more broadly about the kind of thing you're trying to do, and get your team in that area. I've worked under CFOs as often as I've worked under CTOs, because "saving money" is near and dear to the former.

(Adapt as necessary to how French government works. Good luck!)

vishnugupta · on Nov 26, 2020

Quite insightful! Being a part of product teams, I've noticed "platform" teams struggle and the reasons have mostly been not doing what you've pointed out above. As in, instead of working with their customers (i.e., other product teams) to identify their problems and fix them, they would push down their generic platforms down the throat. It invariably didn't end well.

I tend to think that platform/framework teams within large orgs should be run as a B2B SaaS, or at least with that mindset.

Also, if a platform team isn't run well, it ends up being the first one on the chopping block during layoffs. Uber laid off an entire developer-platform team earlier this year. One casualty was the Screenflow team, a promising product that didn't gain wider adoption due to terrible marketing/evangelism.

ethbr0 · on Nov 26, 2020

One can turn it into a flywheel too.

What features should you work on next? The things your users are asking for at your touchbases.

There's a time and place for top-down, but it works best when there are few edge cases. Platform work tends to be a normal distribution with the usual number of "Oh. We never thought anyone would want to do that" tails.

denismerigoux · on Nov 25, 2020

Very insightful update, I'll remember it. Thanks!

ethbr0 · on Nov 25, 2020

Thank you for applying your skills to make the world a better place! If everyone did that, we'd all be better off.

908B64B197 · on Nov 25, 2020

The issue you'll bump into is attracting the same caliber of talent.

Compare the offers from Dropbox, Facebook and Apple.

breck · on Nov 25, 2020

I make them all the time (https://jtree.treenotation.org/designer/), and track many thousands of DSLs.

IMO, the problem is our languages are unnecessarily complicated. they are all linearly parsed BNFs, and you don't need any of that. I think things will start changing big time.

That being said, my favorite ecosystem in the traditional DSL world is ANTLR, and I'd highly recommend Terence Parr's books on the field if you are interested.

specialist · on Nov 25, 2020

ANTLR 3.x was a game changer for me. I was able to refine my grammars so that the resulting parse tree and abstract syntax trees were the same thing. No goofy inlined tree construction pragmas, term rewriting, post parse tree walk processing.

I'm just a grammar mechanic, so I don't really grok the underlying theory or use the right word for this stuff.

breck · on Nov 25, 2020

I was late to the party and didn't start until ANTLR4 https://pragprog.com/titles/tpantlr2/the-definitive-antlr-4-...

divtiwari · on Nov 25, 2020

Does his book actually teach you how to create a DSL using ANTLR, and not just give an overview? Btw his 2 books are almost a decade old, so are they any good now?

breck · on Nov 25, 2020

Yes, it walks you through it step by step. I have https://pragprog.com/titles/tpantlr2/the-definitive-antlr-4-... and at least 1 more of his, forget where it is. I would say it is ageless (at least, until something better than ANTLR comes along; ohm is promising but not sure what the latest is with that).

"The Definitive ANTLR 4 Reference" is absolutely the most understated title I've ever seen in a book. It's really more like "The Book That Will Change How you Look at Programming Languages Forever"

IMO anyway. I guess it depends on how much you've been exposed to parsers and grammars and compiler compilers already.

divtiwari · on Dec 2, 2020

Thanks, I'll start reading his book, already have some experience with interpreters and compilers :)

eecc · on Nov 25, 2020

They do? That's bloody awesome... there's something to this country I love, despite the bad weather and worse food :)

Can you share the resources you found?

vishnugupta · on Nov 26, 2020

JetBrains has a DSL creation language called MPS (Meta Programming System)[1]. I stumbled upon it while exploring their various offerings. I haven't played around with MPS but came across Dutch Tax system's use of MPS in their case studies, it's near the end of the page [1].

You will find decent info to get started here [2]

[1] https://www.jetbrains.com/mps/ [2] https://www.jetbrains.com/mps/concepts/domain-specific-langu...

throwaway_pdp09 · on Nov 25, 2020

Could you clear something up for me, do you mean DSLs generally, or DSLs for tax codes specifically? TIA

vishnugupta · on Nov 25, 2020

DSLs in general, say in banking system, e commerce apps etc.

throwaway_pdp09 · on Nov 25, 2020

Well, this is (one of my) areas so here goes. DSLs are a concept, not an implementation. As implemented they can vary from chained procedure calls to actual sub languages with lexers and parsers (and I tend to consider the latter to be 'proper' DSLs, but that's just my view).

To have a 'proper' DSL I reckon you need two things: an understanding that a thing can and should be broken out into its own sublanguage, and the ability to do so. The first takes a certain kind of nouse, or common sense. The latter requires knowing how to construct a parser properly and some knowledge of language design.
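For a sense of scale at the "actual sublanguage with a lexer and parser" end of that range, here is a minimal sketch of a hand-written lexer and recursive-descent evaluator for arithmetic formulas over named variables. The grammar, the variable name `income` and the sample formula are made up for illustration; a production DSL would add error positions, an AST and a type checker.

```python
import re

# One token per match: a number, a name, or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z_]\w*)|(.))")


def tokenize(src):
    tokens, pos, src = [], 0, src.rstrip()
    while pos < len(src):
        match = TOKEN.match(src, pos)
        number, name, other = match.groups()
        if number is not None:
            tokens.append(("num", float(number)))
        elif name is not None:
            tokens.append(("name", name))
        elif other in "+-*/()":
            tokens.append(("op", other))
        else:
            raise SyntaxError(f"unexpected character {other!r}")
        pos = match.end()
    return tokens


class Parser:
    """expr := term (('+'|'-') term)* ; term := atom (('*'|'/') atom)*"""

    def __init__(self, tokens, env):
        self.tokens, self.pos, self.env = tokens, 0, env

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("end", None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.atom()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.take()
            rhs = self.atom()
            value = value * rhs if op == "*" else value / rhs
        return value

    def atom(self):
        kind, val = self.take()
        if kind == "num":
            return val
        if kind == "name":
            return self.env[val]  # KeyError for unknown variables
        if (kind, val) == ("op", "("):
            value = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {val!r}")


def evaluate(src, env):
    parser = Parser(tokenize(src), env)
    value = parser.expr()
    if parser.peek() != ("end", None):
        raise SyntaxError("trailing input")
    return value


result = evaluate("(income - 10000) * 0.25 + 50", {"income": 30000})
```

Operator precedence falls out of the grammar itself: `term` binds tighter than `expr`, so `2 + 3 * 4` evaluates the multiplication first.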

Knowing how to write a parser is not particularly complex but as the industry is driven by requirements more of knowing 'big data' frameworks rather than stuff that is often more useful, well, that's what you get, and that includes people who try to parse XML with regular expressions (check out this classic answer <https://stackoverflow.com/questions/1732348/regex-match-open...> Edit: if you haven't seen this check it out cos it's brilliant).

I think this reflects the fundamental problem in software development of the market's not knowing what's actually needed to solve real business problems.

++++

Edit, some reading material

https://www.amazon.co.uk/Language-Implementation-Patterns-Do...

https://www.amazon.co.uk/Definitive-ANTLR-Reference-Domain-S...

https://www.amazon.co.uk/yacc-Nutshell-Handbook-Doug-Brown/d...

They're all worth investing the time in.

vishnugupta · on Nov 26, 2020

Thanks a lot! This is very useful information and follow up books.

hirundo · on Nov 25, 2020

Modern computer languages are DSLs for writing DSLs, which you do by defining public domain specific classes, methods, modules, etc. A library for computing PI is a PI DSL.

At least, that's the right way to name things in code.

throwaway_pdp09 · on Nov 25, 2020

That's really stretching the concept of a DSL, but at an extreme it can be seen that way. What you're really describing is hierarchical structure.

sandcha · on Nov 25, 2020

The French government and parliament also use this implementation of the law (taxes, benefits, ...): https://github.com/openfisca/openfisca-france It's open source and open to contributions, so a common structure is used by multiple countries. They are listed here: https://openfisca.org/en/countries/

Eikon · on Nov 25, 2020

Here's the source code of the implementation of the tax code: https://github.com/etalab/calculette-impots-m-source-code

protz · on Nov 25, 2020

Hi, one of the paper authors here. This is unfortunately only part of the story. This "calculette" covers only a fraction of the tax computation; furthermore, without knowing the crazy semantics and computational rules of the M language, it's very hard to reproduce the tax computation.

As a side-note, the source code appears to have moved here: https://gitlab.adullact.net/dgfip/ir-calcul

rozab · on Nov 25, 2020

The languages breakdown is interesting. It seems GitHub refuses to admit defeat and categorises the .m files into various other languages which use the extension (M, MATLAB, Objective-C, Mathematica and Mercury). I wonder if they use some sort of fuzzy ML solution for categorising them rather than conventional parsing.

remram · on Nov 25, 2020

Wonder no more! https://github.com/github/linguist/blob/7c2adbdb15d4efd25d92...

As most AI, it's regex and ifs all the way down.
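The general shape of "regex and ifs" disambiguation for an ambiguous extension like `.m` can be sketched as below. The patterns are invented for illustration and are not copied from Linguist; its real rules live in the file linked above.

```python
import re

# Hypothetical heuristics, tried in order; the first pattern that matches wins.
HEURISTICS = [
    ("Objective-C", re.compile(r"^\s*(@interface|@implementation|#import)\b", re.M)),
    ("Mercury", re.compile(r"^\s*:-\s*(module|pred|func)\b", re.M)),
    ("MATLAB", re.compile(r"^\s*(function\b|%)", re.M)),
]


def guess_language(source, default=None):
    """Return the first language whose pattern matches, else `default`."""
    for language, pattern in HEURISTICS:
        if pattern.search(source):
            return language
    return default


objc = guess_language("#import <Foundation/Foundation.h>\n@interface Foo\n@end\n")
mercury = guess_language(":- module hello.\n")
matlab = guess_language("% compute something\nx = 1;\n")
unknown = guess_language("application : pro, batch , iliad,oceans ;")
```

A file in an unlisted language, such as the DGFiP's M, matches none of these patterns and falls through to the default, or matches one by accident, which is how a repository ends up labelled as a mix of unrelated languages.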

nolok · on Nov 25, 2020

> As most AI, it's regex and ifs all the way down.

This made me laugh way more than expected, mostly because of how true it is.

rozab · on Nov 25, 2020

These incredibly vague regexes are hilarious. But I guess if it works, it works (until it doesn't)

Eikon · on Nov 25, 2020

Here you go!

https://github.com/github/linguist#how-linguist-works

riffraff · on Nov 25, 2020

I can't help but wonder what awesome things happen in "iliad" and "ocean" mode after looking at some of those files:

    application : pro, batch , iliad,oceans ;

Eikon · on Nov 25, 2020

Iliad means "Informatisation de L'inspection d'Assiette et de Documentation" which roughly translates to "computerisation of tax base and documentation inspection".

denismerigoux · on Nov 25, 2020

Hi, author here :) You seem to be well-informed of the DGFiP jargon, do you know if news of my work has been spreading among the IT department there?

Eikon · on Nov 25, 2020

Hi! Unfortunately I have no idea, I don't work there nor have any affiliation with them.

nitsky · on Nov 25, 2020

If tax codes are so complex that authorities struggle to maintain the code that implements them, how are humans supposed to understand them well enough to follow the incentives they are designed to create?

ip26 · on Nov 25, 2020

Like every other piece of software they are probably spending 90% of their time on edge & corner cases while citizens are spending 90% of their time solidly in the simple core functionality.

consp · on Nov 25, 2020

Most of the time you only need a small subset which is manageable. I once tried to increase my deductions as a normal citizen but this went nowhere quickly as you need to create giant loopholes to get it done as a person. Businesses on the other hand are the ones using the complex stuff and they can hire an expert (in this case expensive doctors in the case of the body metaphor of the other poster) at quite some cost.

dragontamer · on Nov 25, 2020

Most people don't know about the muscles or bone structures in their hands, but most people seem to know how to use their hands anyway, despite the gross complexity involved.

valuearb · on Nov 25, 2020

Most people aren’t at a gross disadvantage to those who know how to game the system or bribe authorities to make their hands work far better than others.

speps · on Nov 25, 2020

From some of the same authors: https://github.com/CatalaLang/catala HN comments: https://news.ycombinator.com/item?id=24948342

secondcoming · on Nov 25, 2020

Translating legal texts to mathematical form is very interesting. It could decimate most legal jobs if a lawsuit can be converted to mathematical form and then 'executed' against the laws that are also in mathematical form. You get your judgement and the explanation as to how that conclusion was reached, all automatically.

It could even cause headaches if contradictions in legal judgements are detected.

It all relies on the conversions to mathematical form being done correctly though which, given that some laws can be intentionally vague, may be impossible.

smarx007 · on Nov 25, 2020

Australian CSIRO is working exactly on that

https://theconversation.com/csiro-wants-our-laws-turned-into...

https://research.csiro.au/bpli/our-research/reasoning/

https://people.csiro.au/G/G/Guido-Governatori

They are using https://en.wikipedia.org/wiki/Deontic_logic and https://en.wikipedia.org/wiki/Defeasible_logic to describe laws in terms closest to how it's done in the legal community.

Can't find a link but they codified some parts of the Australian import duties laws that we went through in a workshop of theirs that I had a chance to attend.

contingencies · on Nov 25, 2020

https://theconversation.com/csiro-wants-our-laws-turned-into... claims law-as-code is a bad idea because it's "dynamic"/"always changing" and "discretionary"/"requires or open to interpretation".

The first argument is nonsensical (computers are great at changing data: in fact way faster and more accurate than humans, having the capacity for things like single source of truth, change logs, peer consensus, and dynamic versioning while preserving historic versions automatically... and also better at publishing it, analyzing it, and documenting it).

The second argument is an outright straw man fallacy. Who cares if some laws require interpretation. Just write MAY instead of WILL (like RFC language) to make it clear the judge can decide, then provide statistical information regarding past case law. Nobody is saying "fire all the judges", they're just saying make the law clear. Right now it's not clear, and it's a big problem.

Refugee? Read up to date law in your language, automatically. Business? Determine what is required by law in order to execute a transaction. Human? Determine what you are and are not allowed to do in some field (like architecture, driving, sailing, pet walking or rock collecting) without being fined.

riffraff · on Nov 25, 2020

> The second argument is an outright straw man fallacy. Who cares if some laws require interpretation. Just write MAY instead of WILL (like RFC language) to make it clear the judge can decide, then provide statistical information regarding past case law. Nobody is saying "fire all the judges", they're just saying make the law clear. Right now it's not clear, and it's a big problem.

But that is not how law works. A judge is generally expected to interpret the law because we cannot expect someone who wrote the law to have predicted all possible things, especially those that did not even exist when the law was written.

It's (often) not close to a machine-interpretable spec, but to a visual mockup, to stay in the software area.

For example, you may have a law that forbids euthanasia. Does it also extend to assisted suicide? What if the dying person can't physically trigger their own death? What if assisted dying is illegal here, but someone takes the patient to the neighboring country?

Also, I hardly believe "read up to date law in your language" is possible, there are entire legal concepts that do not exist in different jurisdictions, or literally the same expression may mean different things ("voir dire" for example).

It's good to attempt formalizing things, but I don't think this is a strawman.

contingencies · on Nov 25, 2020

The straw man was "interpreted law is uncodifiable" - essentially throwing the baby out with the bathwater. We can interpret where necessary without claiming that because interpretation is required in specific cases that the whole concept fails.

smarx007 · on Nov 25, 2020

> law-as-code is a bad idea because its "dynamic"/"always changing"

I don't think it was meant in classical terms of an SQL UPDATE. It was meant in a way that a new rule may affect the application of an existing rule. Academically speaking, it makes reasoning non-monotonic. This is exactly why https://en.wikipedia.org/wiki/Defeasible_logic is proposed.

> The second argument is an outright straw man fallacy.

I think they are trying to codify the laws in logic _without_ changes to the law.

[Edit] The journalist also failed to read up on the basics of deontic logic and defeasible reasoning: "The law says cars must drive on the left in Australia. But what if they have to cross the road to avoid hitting a child?"
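The non-monotonic behaviour can be sketched with a heavily simplified stand-in for defeasible reasoning: every rule carries a priority, and among the rules that apply, the highest priority wins. This is not an implementation of defeasible logic proper, and the rule names, fact keys and conclusions are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    applies: Callable[[dict], bool]  # does the rule fire for these facts?
    conclusion: str
    priority: int                    # higher priority defeats lower


def conclude(rules, facts) -> Optional[str]:
    """Conclusion of the highest-priority applicable rule, or None."""
    applicable = [rule for rule in rules if rule.applies(facts)]
    if not applicable:
        return None
    return max(applicable, key=lambda rule: rule.priority).conclusion


keep_left = Rule("keep_left", lambda facts: True, "must keep left", priority=0)
avoid_child = Rule(
    "avoid_child",
    lambda facts: facts.get("child_in_lane", False),
    "may cross to the right",
    priority=1,
)

facts = {"child_in_lane": True}
before = conclude([keep_left], facts)               # only the general rule
after = conclude([keep_left, avoid_child], facts)   # exception added
```

Adding a rule changed the conclusion for the same facts, which cannot happen in classical monotonic logic, where new premises only ever add conclusions. That is the sense in which a new rule "may affect the application of an existing rule".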

contingencies · on Nov 25, 2020

Yes, in the current system it's shifting goalposts. That is a bad system.

You cannot completely successfully codify something that is based on the vagaries of wishy-washy language, nor specific legal concepts like the intent of regulations or the context of prior judgements. Therefore, improve the language: don't give up!

Imagine if latitude and longitude weren't invented because "sorta over there a few days sail beyond the cape" was too hard to quantify. This is the same ridiculous argument. It just so happens that there are also a vast number of ingratiated rent seeking and powerful people and corporations interested in the status quo: literally all of them.

I believe that as engineers and as optimists within the greater human endeavour, over time in all fields we should seek to create means of trust and means of precision: in our measurements, in our communications, in our analyses, in our references and in our collaborations.

We don't need to fire all the judges. But maybe 90% of the solicitors and standard procedural lawyers, a large part of whose job is explaining to the average citizen what exactly is the done thing in some particular area or how exactly they can expect to be treated at the hands of a system that cannot otherwise explain itself.

Also, in terms of community governance if it becomes crystal clear that a law is being abused through increased fidelity in the logging of police actions brought about by such a system, then the law can more rapidly be identified and repealed.

sjy · on Nov 25, 2020

The last century has seen a huge increase in the quantity of law that is written out explicitly in statutes [1], rather than being worked out on a case-by-case basis according to the common law method. This is an attempt to “improve the language” as you have suggested, but it has not made the law easier to comprehend. Detailed statutes make it harder for lawyers to build up a coherent picture of the entire legal system, because there’s a much greater risk that a solid argument based on general principles collapses due to a specific statute that the lawyer has never heard of.

[1] https://www.gov.uk/government/publications/when-laws-become-...

contingencies · on Nov 26, 2020

Again, this points to poor scoping, an imprecise language and a broken system. According to your assertion it's currently impossible for even professionals to know what should be considered in scope. This should be the trivial basis of a legal case, not the difficult and dubious extended research result of paid professionals.

sjy · on Nov 26, 2020

Yes, that is my assertion – it’s impossible to verify that you have considered and correctly interpreted all relevant law, although it should be rare for professionals to make a mistake, especially after extended research. I’m curious as to what makes you think this problem can be solved trivially? Formal verification is hard enough for algorithms over the integers.

contingencies · on Nov 27, 2020

Law needs to be rewritten to be efficient and transparent. Logically speaking, the first countries to do so should score major investments and trade bonuses from multinationals. If you have a tinpot dictatorship with US protection or an out of the way novelty country with few useful industries, you could do worse than throw down on this project.

dane-pgp · on Nov 25, 2020

> It could even cause headaches if contradictions in legal judgements are detected.

I remember hearing an anecdote about researchers who tried to formally specify the benefits system of some European country, and they found that the system was slightly non-deterministic, in the sense that the outcome for the citizen would depend on which order various government departments processed the various forms that the citizen used to enrol into various programmes they were eligible for.

It makes me wonder if these mathematical formalisations should support some kind of fuzzing, to check that entities doing things in slightly different orders (or earning slightly more/less) don't produce dramatic changes in outcome.
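A rough sketch of what such a check could look like, assuming a toy system of two programmes whose awards count as income for whichever programme is processed next. The thresholds and amounts are invented, not taken from any real benefits system.

```python
from itertools import permutations

# Hypothetical toy rules: each programme looks at current income.
def housing_benefit(income):
    return 200 if income < 1000 else 0

def child_benefit(income):
    return 150 if income < 1100 else 0

PROGRAMMES = {"housing": housing_benefit, "child": child_benefit}

def total_income(base_income, order):
    """Process applications in `order`; each award counts as income afterwards."""
    income = base_income
    for name in order:
        income += PROGRAMMES[name](income)
    return income

def order_dependent_incomes(candidates):
    """Return the base incomes whose outcome depends on processing order."""
    flagged = []
    for base in candidates:
        outcomes = {total_income(base, order)
                    for order in permutations(PROGRAMMES)}
        if len(outcomes) > 1:
            flagged.append((base, sorted(outcomes)))
    return flagged

def cliffs(order, lo, hi, step=1, tolerance=50):
    """Return base incomes where earning `step` more moves the outcome
    by more than `tolerance`."""
    return [base for base in range(lo, hi, step)
            if abs(total_income(base + step, order)
                   - total_income(base, order)) > tolerance]

print(order_dependent_incomes([700, 900, 1200]))
print(cliffs(("housing", "child"), 0, 1300))
```

Exhaustive enumeration only works because the toy has two programmes and a small income range; a real system would need random sampling or a solver.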

MaxBarraclough · on Nov 25, 2020

It's not possible to reduce all lawsuits to compact formulae. Part of the legal profession is in interpreting old laws in the modern context, for instance, which would take something close to a general AI. More broadly, reasoning about legal edge-cases needs a sophisticated understanding of the law and of the world. Also, jurors can't be replaced by software, as they must be 'peers' by definition.

I imagine it would be much easier to build a system to roughly estimate the odds of a lawsuit being successful, inputting salient features of the case and running the numbers, without deep reasoning about the particulars. I don't know if work is already being done on this but I wouldn't be surprised. A lot of money must ride on knowing which corporate law battles can be won.

sjy · on Nov 25, 2020

Work is being done on this – here’s some in the human rights context: https://link.springer.com/article/10.1007/s10506-019-09255-y

But the outcome of high-stakes commercial litigation is much more complex than “win or lose.” Most cases settle, so the ones that result in published legal judgments which can be fed into a machine learning tool are an unrepresentative minority. Participating in this kind of litigation also costs millions and attracts public attention, leading to innumerable second-order effects (eg. public relations damage, law reform) that are hard to predict and may be more significant than whatever specific legal decision a judge makes. So being able to put a numerical probability on the legal opinion “we think the company has a good chance of winning this case” may not be that helpful.

Zhyl · on Nov 25, 2020

It sounds like what you are describing are Ricardian Contracts. [0] Also worth viewing is this talk by Clay Shirky on technology tools for Government [1].

Ultimately, I don't see law being automated. As others have pointed out, most cases that go to court are all about interpretation and establishment of the circumstances (did events X, Y and Z happen), interpretation of the law (does law F mean that event X was illegal, and is event Y a mitigator or irrelevant?) and, more elusively, impact (how serious was X?).

This doesn't mean, though, that there aren't tools that can be used in the creation and management of law. If we have a simple dependency tree of documents and sources, we can trigger a review of downstream documents if a document is changed. We can statically check some aspects of the document, like a Legal IDE that would present itself more like an advanced spellcheck. Meta rules can ensure consistency and add support for in-line reading (include a definition next to a paragraph even though it is defined elsewhere).
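The dependency-tree part is easy to sketch. Assuming a hypothetical citation graph (the document names below are invented), a breadth-first walk lists everything downstream of a changed document:

```python
from collections import deque

# Hypothetical citation graph: each document maps to the documents citing it.
CITED_BY = {
    "Tax Act": ["Tax Regulation 12", "Pension Act"],
    "Tax Regulation 12": ["Agency Guidance A"],
    "Pension Act": ["Agency Guidance A", "Agency Guidance B"],
    "Agency Guidance A": [],
    "Agency Guidance B": [],
}

def documents_to_review(changed, cited_by):
    """Return every document downstream of `changed`, nearest first."""
    seen = {changed}          # also guards against cycles in the graph
    queue = deque([changed])
    to_review = []
    while queue:
        doc = queue.popleft()
        for dependant in cited_by.get(doc, []):
            if dependant not in seen:
                seen.add(dependant)
                to_review.append(dependant)
                queue.append(dependant)
    return to_review

print(documents_to_review("Tax Act", CITED_BY))
```

The hard part in practice is building the graph from natural-language cross-references, not walking it.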

Probably the closest thing you'll get to an 'automated' system would be for the 'process' that is currently done by administrators to be partly or fully automated. When charges are brought, a court date is automatically booked. When a judgement is made, a court report is generated, prison alerted etc etc.

Basically, everything that people in tech think should be automated, shouldn't be, and all the things technology people think shouldn't exist should be automated.

[0] https://en.wikipedia.org/wiki/Ricardian_contract

[1] https://www.youtube.com/watch?v=CEN4XNth61o

dalbasal · on Nov 25, 2020

Well technically (joke)... a decimation is 10%.

I agree. Very interesting. Realistically, paperwork & bureaucracy is an ideal target for automation. That said, every input that you feed an algorithm is legally debatable... Structuring your inputs to be both legal and optimal is more storytelling than mathematics. There's enough play in these "subjective" parts that results could be anything.

Still, having a major algorithm in the mix changes the game.

The fact is, that while PCs have been (a) on every desk and (b) paperwork efficiency machines for the last 30 years... the total number of legal/accounting/hr/admin jobs has increased in this period.

The effects of technology/efficiency on "clerical" work are not predictable in the way they are on factories.

petercooper · on Nov 25, 2020

In countries or legal frameworks based around civil law, I think that could possibly occur. It would be impossible to do this in a formulaic way in legal areas covered by common law, although having digitized legal "assistants" that could bring up and analyze relevant precedents seems almost guaranteed (if they don't exist already?)

Bayart · on Nov 25, 2020

Civil law probably has a big edge in computer-driven legal implementation, if that's where things are headed.

I have no idea what would be in it for Common Law countries. Perhaps they'd be more driven towards using data mapping techniques to lay out networks of heterogeneous legal stuff (jurisprudence etc.).

naringas · on Nov 25, 2020

IMO this is such an 'obvious' idea once you think about it that it will eventually happen.

The law will go from written natural language to executable computer programs (formal languages).

However, a change of this magnitude cannot happen quickly (and might even be impossible to do peacefully).

Abusing the "software is eating the world" analogy, we ain't seen nothing yet. Software is just barely opening its mouth.

AndrewOMartin · on Nov 25, 2020

There is no floating point representation for 'gross', 'egregious', or 'blatant'. At least not one that will be suitable in all cases.

MaxBarraclough · on Nov 25, 2020

Right on. Consider whether forcing someone to unlock their smartphone should constitute compelled self-incrimination. This is a problem that US courts have flip-flopped on. [0] How would an SMT solver derive an answer? It can't. You'd need a general AI.

Even ignoring the subtleties of the real world that the ̶l̶a̶w̶ edit legal system must cope with, nuts-and-bolts legal concepts like mens rea can't be mapped to algebraic expressions.

[0] https://arstechnica.com/tech-policy/2020/06/indiana-supreme-...

naringas · on Nov 25, 2020

but the law is not the judge

noisy_boy · on Nov 25, 2020

Yes but you can do a what-if analysis based on the parameter $degree being equal to 'gross', 'egregious' or 'blatant'.
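As a sketch of that what-if approach: the code does not decide which word applies, it only shows the outcome under each one. The multipliers here are invented for illustration; the law does not assign numbers to these words.

```python
# Hypothetical multipliers, one per possible finding by the judge.
PENALTY_MULTIPLIER = {"gross": 1.5, "egregious": 2.0, "blatant": 3.0}

def what_if(base_penalty, multipliers=PENALTY_MULTIPLIER):
    """Return the penalty under each possible value of the degree parameter."""
    return {degree: base_penalty * factor
            for degree, factor in multipliers.items()}

for degree, penalty in what_if(1000).items():
    print(degree, penalty)
```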

ape4 · on Nov 25, 2020

They optimized it by removing dead code etc. So I wonder if that could be fed back into the original laws to simplify them.

anticristi · on Nov 25, 2020

I was rather thinking about fuzzing it to find gaps in tax laws.

mjul · on Nov 25, 2020

In a previous job we built an advisor for banking and pensions using a mathematical optimiser solving over a model of the Danish tax rules. It would normally find enough tax savings to retire about a year earlier by finding the best savings strategies (e.g. exploiting dynamic and temporal effects between taxation on spouses retiring at different times, using the house mortgage as a savings buffer to shift pension payouts to lower tax years etc). It was quite an eye opener.

stevesimmons · on Nov 25, 2020

You should write a blog post or a HN story about that!

mjul · on Nov 27, 2020

I can give a few highlights: it had an exact model of a personal economy (savings, pensions, debt, houses, cars, boats and other assets) and the tax system and a projection engine capable of predicting the future cash flows for a person.

The projections were the basis for advising the client how to best manage their finances.

The advisor model could request an optimisation of how to best allocate the assets and liabilities over time from a mixed integer programming model built with GAMS. The latter optimisation model was not exact but it could generate close-to-optimal strategies for e.g. pension savings, buying houses and how to best spend the savings after retirement. The best strategies were then fed back into the exact model, evaluated and presented to the user.
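To give a feel for the kind of question the optimiser answers, here is a toy sketch that shifts a pension payout between two tax years. It uses brute-force search rather than mixed integer programming or GAMS, and the two-bracket schedule and amounts are invented, not the Danish rules.

```python
# Hypothetical schedule: 30% up to the threshold, 50% above it.
# Integer arithmetic keeps the results exact for round amounts.
THRESHOLD = 50_000

def tax(income):
    return (30 * min(income, THRESHOLD)
            + 50 * max(income - THRESHOLD, 0)) // 100

def best_payout_split(pot, other_income, step=1_000):
    """Try every split of `pot` over two years (in multiples of `step`)
    and return (lowest total tax, payouts per year).
    Assumes `pot` is a multiple of `step`."""
    best = None
    for first in range(0, pot + step, step):
        payouts = (first, pot - first)
        total = sum(tax(other + payout)
                    for other, payout in zip(other_income, payouts))
        if best is None or total < best[0]:
            best = (total, payouts)
    return best

# Year 1 still has a salary, year 2 has only a small other income.
print(best_payout_split(40_000, (45_000, 10_000)))
# Compare with taking the whole pot in year 1:
print(tax(45_000 + 40_000) + tax(10_000))
```

With more years, more asset types and interacting rules the search space explodes, which is presumably why a proper solver was needed.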

It requires a pretty complex tax system to really generate a lot of value for the clients so its value was lower in the neighbouring countries that had separated their social systems more from the tax system.

diordiderot · on Nov 25, 2020

Where could one read more about this?

mjul · on Nov 27, 2020

It was developed by a Danish startup named Financys, which was acquired by Schantz (another Danish company) and this was later acquired by Keylane from the Netherlands. Unfortunately I don’t think there is any good, detailed information about the technology available any more.

pradn · on Nov 25, 2020

Someone did do this - finding discontinuities in the tax law - using formal methods, for the French tax law:

https://blog.merigoux.fr/en/2019/12/20/taxes-formal-proofs.h...

dolmen · on Nov 25, 2020

He's also the author of the research paper we are discussing.

pradn · on Dec 4, 2020

Oh, oops!

creuset · on Nov 25, 2020

Most of the French socio-fiscal system can be simulated in Python through https://github.com/openfisca/openfisca-france/.

valuearb · on Nov 25, 2020

I remember when the US tax code was going to be simplified so that an entire tax return form could be printed on a post card.

sjburt · on Nov 25, 2020

The majority of taxpayers actually don’t have very complex taxes. If you only have W-2 income and don’t itemize deductions, you just need the single page 1040. Even small businesses aren’t complicated if they have proper bookkeeping, nor is it complicated if you’re just deducting mortgage interest. The piles of tax regulations really are for wealthy people and corporations with complicated accounting.

Taniwha · on Nov 25, 2020

Here in NZ we got rid of almost ALL deductions. The tax system is so simple that almost all people with one job pay exactly the right tax through employer PAYE (as someone who runs a small company I do my PAYE with 1 line of spreadsheet formulae, it's easy).

This means most people don't need to file. If you want to, it's a 1-2 page web form, and, starting last year, the IRD will do it for you anyway and directly credit your refund (with interest) to your bank account without you asking.

johntb86 · on Nov 26, 2020

What do you do about interest on savings accounts or capital gains on stocks?

Taniwha · on Nov 26, 2020

The bank takes it out at a standard rate - when you open an account you tell them your marginal rate, if you get it wrong it's obvious to the IRD.

We have no tax on capital gains (we should, it badly distorts our markets, leaves too much money in real-estate).

kccqzy · on Nov 25, 2020

Was that the 9-9-9 plan? To be honest such simple tax rules were eye-opening and refreshing but I don't think it was particularly fair.

valuearb · on Nov 25, 2020

It actually goes back to the flat tax proposals to set the top rate to 28% with very few deductions.

The idea was that you don’t have some rich paying the top rate of 39%, and some paying near zero (cough, cough, Mr. Trump) due to large discrepancies in deductions.

rsynnott · on Nov 25, 2020

You'll notice they never specified how big the writing was going to be.

simonebrunozzi · on Nov 26, 2020

> This algorithm relies on a legacy custom language and compiler originally designed in 1990, which unlike French wine, did not age well with time.

Gotta love the sense of humour.

tlb · on Nov 25, 2020

I wonder if you could backpropagate gradients all the way through the tax calculation, so it could tell me what to change about my finances to minimize taxes.
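A rough sketch of the idea, using central finite differences instead of real backpropagation and a made-up toy tax function (the brackets, rates and field names are assumptions). Each gradient entry is the extra tax per extra unit of that input, so negative entries point at what to increase.

```python
def tax_due(finances):
    """Hypothetical toy tax on a dict of yearly amounts."""
    taxable = max(finances["salary"] - finances["pension_contribution"], 0.0)
    threshold = 50_000.0
    income_tax = (0.30 * min(taxable, threshold)
                  + 0.50 * max(taxable - threshold, 0.0))
    capital_tax = 0.25 * finances["capital_gains"]
    return income_tax + capital_tax

def gradient(fn, finances, eps=1.0):
    """Central finite-difference estimate of d(fn)/d(input) for each input."""
    grads = {}
    for key in finances:
        up = dict(finances)
        up[key] += eps
        down = dict(finances)
        down[key] -= eps
        grads[key] = (fn(up) - fn(down)) / (2 * eps)
    return grads

finances = {"salary": 80_000.0,
            "pension_contribution": 5_000.0,
            "capital_gains": 2_000.0}
print(gradient(tax_due, finances))
```

The catch is that a real tax function is full of thresholds and branches, so the gradient is only valid locally and says nothing about a cliff a little further away.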