
One of my more painful anecdotes goes like this: years ago, we were coding a piece of government regulation that decides eligibility of some kind. The rules went something like this: "if it's a man and over 60, or over 55 and part of these classes of occupation but not in .. or ..., then .... If, otoh, it's a woman and ...". The law was written down in the exact same wording. We coded it the same way: 12 pages of conditionals, split out into different functions to make it more readable. Then I came along, armed with my knowledge of algebra, Cayley tables and optimization of boolean functions. I created a truth table for the boolean function at hand, optimized that function into a small equivalent expression, and replaced the 12 pages of code with a one-pager. A job well done.

Two months later the law changed adding a new special case somewhere deep inside. I could start over. Lesson learned: don't outsmart the logic of the business.
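To make the trade-off concrete, here is a minimal sketch (Python, with invented rules far simpler than the real ones): the spec-shaped version mirrors the drafting clause by clause, while the minimised version is equivalent but no longer maps back to the legal text.

    from itertools import product

    def eligible_spec_shaped(is_male, over_60, over_55, protected_occupation):
        """Mirrors the (hypothetical) drafting of the law, clause by clause."""
        if is_male:
            if over_60:
                return True
            if over_55 and protected_occupation:
                return True
            return False
        else:
            if over_60 and protected_occupation:
                return True
            return False

    def eligible_minimised(is_male, over_60, over_55, protected_occupation):
        """Equivalent boolean expression after truth-table minimisation."""
        return (is_male and over_60) or (protected_occupation and
                                         (over_60 or (is_male and over_55)))

    # Exhaustive check that the two formulations agree on every input.
    for combo in product([False, True], repeat=4):
        assert eligible_spec_shaped(*combo) == eligible_minimised(*combo)

The minimised function is correct, but when a new special case lands in one clause of the law, there is no longer a clause in the code to attach it to.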




When coding from a well-maintained specification, I always try to find an architecture where the structure of the code mimics the structure of the specification. This way, maintenance of the code becomes a lot easier. Sometimes there are small points in the specification that make it difficult to keep the same structure in the code. That is a "spec smell". Often, after discussion with the spec writer, it turned out to be an incorrect point that had to be fixed in the spec.
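A sketch of what that looks like, assuming a hypothetical spec with numbered sections: each function maps to exactly one section, so a change to a section touches exactly one function.

    def section_4_1_validate_header(msg):
        """Spec section 4.1 (hypothetical): header validation rules."""
        return "version" in msg and msg["version"] in (1, 2)

    def section_4_2_validate_body(msg):
        """Spec section 4.2 (hypothetical): body validation rules."""
        return isinstance(msg.get("payload"), bytes)

    def section_4_3_accept_message(msg):
        """Spec section 4.3 (hypothetical): accept iff 4.1 and 4.2 hold."""
        return section_4_1_validate_header(msg) and section_4_2_validate_body(msg)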


The most extreme version of this was a colleague who wrote a parser for the "human readable" version of the H265 spec. This could then be turned into executable code - and also a set of test streams.

This resulted in him filing a large number of minor bugs between the spec and the reference implementation: https://hevc.hhi.fraunhofer.de/trac/hevc/query?status=accept...

And a product: http://www.argondesign.com/products/argon-streams-hevc/ (page includes explanatory video)


Very cool :)

On a much smaller scale, VPRI made part of their TCP stack by parsing ASCII diagrams of packet contents (e.g. see the STEPS reports at http://vpri.org/writings.php )

Edit: Ah, I see user nradov has already mentioned this in a sibling comment :)


Wow, that is a truly impressive piece of technology! Kudos to those who successfully completed the daunting task of actually implementing this pretty crazy idea.


It was largely the work of one very smart person; not so much a 10x programmer as a mathematician^10.

Edit: the tool critical to the project was "Ometa": https://en.wikipedia.org/wiki/OMeta


If I actually have to match a tech spec or a law, I usually put references to the related text in comments, next to the code handling it, e.g. the spec version + page, or the law's paragraph number.

This way you know when you come back to it, what must stay, what must go, and what must change.
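Something like this (the spec and statute references below are invented for illustration):

    REDUCED_RATE = 0.055  # Finance Act s.12(3) (hypothetical reference)

    def vat_rate(category):
        # Spec v2.3, p.14, 4.1.2 (hypothetical): books use the reduced rate.
        if category == "book":
            return REDUCED_RATE
        # Spec v2.3, p.15, 4.1.3 (hypothetical): everything else is standard-rated.
        return 0.20

When the spec or the law changes, grepping for the reference tells you exactly which lines are affected.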


Forget the cleverness of the code... If I see comments like that (providing direct references to the spec/law) I will know I am reading the work of a true artisan!

Hats off to you, you wonderful person!


VPRI wrote an experimental TCP/IP stack which parsed the structure of the relevant IETF RFCs to generate code.

http://www.vpri.org/pdf/tr2007008_steps.pdf (Appendix E).


Another similar route is to code it as an interpreter over a data structure that represents the rules. When stored in something like a database, the rules can then be hot-reloaded.
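A minimal sketch of that approach, with a made-up rule format: the rules live in a JSON document (inline here; in practice a database row or a file), and re-reading them at runtime is the hot reload.

    import json
    import operator

    OPS = {"==": operator.eq, ">": operator.gt, ">=": operator.ge, "<": operator.lt}

    RULES_JSON = """
    [
      {"if": [["age", ">=", 60], ["sex", "==", "M"]], "then": "eligible"},
      {"if": [["age", ">=", 55], ["occupation", "==", "mining"]], "then": "eligible"}
    ]
    """

    def evaluate(rules, facts):
        """Return the outcome of the first rule whose conditions all hold."""
        for rule in rules:
            if all(OPS[op](facts[field], value) for field, op, value in rule["if"]):
                return rule["then"]
        return "not eligible"

    rules = json.loads(RULES_JSON)  # re-running this load is the "hot reload"
    print(evaluate(rules, {"age": 61, "sex": "M", "occupation": "teacher"}))  # eligible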


Perfect example of reality vs. classroom.

There is a reason people tell you to "KISS": just code what you need right now, and don't try to be too smart about it.

Life is full of special cases: try to code an international calendar application, and you'll see what I mean. So many cultural things, so many political things, so many legal things...

So yeah, if you can find an elegant way to generalize your problem, do so. But unless you work on a purely theoretical topic, you will have special cases. And they will change.


This is something I learned: when coding for these kinds of legal tests, just make your code read as close as possible to how the actual law reads.

It's easier to check that the code matches the law (after all, there are no test vectors for the law apart from case law, so you have to go by the drafting), and it makes updates much simpler when an amendment is made.

That, and there is usually zero need to optimise.


Or instead of optimizing it by hand, you have a machine do it automatically.


Indeed. Besides, it makes audits and compliance demos so much easier.


>Lesson learned: don't outsmart the logic of the business.

Very often when people try to solve problems like the one you faced with software, they fail to recognize that you might have to rethink the business logic. It's my belief that the majority of software designed for specific niche businesses and governments fails because there was a lack of willingness to change and simplify the rules.

Of course a project to create a new system for taxation will fail if you have 6000 pages of rules, not including the tax law itself.

Often we think we have a software bug/issue, when in reality what we have is flawed business logic. Until the people in charge fix the logic, we're not able to do anything clever or efficient in software.


> they fail to recognize that you might have to rethink the business logic

Oh, we often recognize this.

You try telling a paying client that you think their business process is sub-optimal. I'll fetch the popcorn...

There are systems out there essentially replicating the workflow of older systems which replicate the workflow of an even older one, which was designed simply to automate a paper-based process all those years ago. If you are talking to a large business then the chances are that the people you are talking to don't have authority to change the business processes even a little bit.


Or, as is more often the case in my experience, the logic is dealing with real world cases that are outside the software's control and the business logic must follow the real world cases to be worthwhile. Which I guess is what the parent's spec example is really an example of.


I worked on ERP software in a past life and this was the main challenge. The software had to deal with "normal" cases well enough to improve efficiency, but had to be flexible enough to allow employees to handle unavoidable exceptions, without becoming a giant kludge. There was nothing worse for a user than having to deal with a special case and finding that they can't because they are blocked by the software.


You're right. But when a government is involved, you cannot even start explaining to `the business` they have flawed logic. (Believe me, I've tried... and failed)


When a business with multiple layers of management is involved it also quickly becomes impossible.

“It was decided!”

“By whom? Why?”

“???, It was decided!”

“...”


It was decided over dozens of meetings with dozens of people, and there are no good notes. Those people are smart and considered more details than you did (but not always the same details). There is rightly fear that if we reconsider this again, we will forget one of those other details and come up with a decision that is better for your case but worse overall, because it breaks some other case.


Isn't it more likely they just keep on adding little exceptions in order to be able to show they pushed something through for their constituents? That's the impression I got when reading these laws in the first place (a bone for the liberals, a bone for the socialists, ....)


Depends. Sometimes things work that way, sometimes they do not. It depends on how much attention the extras would bring.


Isn't this a feature, not a bug? When things are decided by a democratically elected body, you don't necessarily want a contractor coming in and tweaking things "because this is a better way."


Alas the real world is often not amenable to the fiery rage of logical-minded programmers. I’m sure there are plenty of examples of poor logic in real-world policies that could be simplified but there are just as many or more exceptions that serve a purpose, whether the justification is encoded in your requirements docs or not.


That's also why large software projects for government usually fail.

Create a system that handles all the rules of a single municipality or so -- doable. But too expensive for a single municipality.

So we create one system that will work for all municipalities! That way they can afford it, right?

Yes, but none of them is going to change their process. All the different ways of doing more or less the same thing have to be supported, in one system.

End result: nothing working, way over budget. Every single time.


Interestingly, the APL community (or at least the kdb/K community) and the Forth community (from what I hear) often push in the exact opposite direction. They seem to love to take advantage of any accidental regularity in the problem as given to simplify the code as much as possible.

The logic is that often the specs don't change, and if you make the code concise enough, you can just rewrite it if you need to make it more general.


Oh, some things never change. Others change all the time... And there are a few that absolutely never changed, but as soon as you automate them, they will change.

If it was easy, developers wouldn't be well paid.


It's an uphill struggle, but I think the EU proposal to abolish DST changes may have been driven by this kind of desire for simplification.


Most projects like this fail because they don't practice TDD. They have pages of code and not a single proof that it works as expected, apart from feedback from people testing it, which is prone to errors. 6000 pages of rules is not a problem. You can implement anything, but if you don't care to prove you implemented it correctly, then it will almost never work.


Proof and test are different things; you seem to be mixing them up. A proof is a math concept to verify that all cases are handled; a test is an implementation concept to check that one particular case is handled the way you think it should be. Both are important because both uncover a different type of bug. (Proofs uncover cases where you didn't think of something; tests uncover cases where you thought of something but didn't get some detail right.)


That’s a genuinely great anecdote, and very instructive. Please tell me you used a Karnaugh map.


yup.


Reminds me of working in the UK for a telecoms company on billing.

Every year on budget day I used to listen to the live speech from the House of Commons in case they changed something that affected billing - one year they changed VAT (sales tax) in the middle of the month.


I realise these are very uncool, but I've always had great success with using rules engines for this kind of thing (insurance / government rules especially).

If you can get over their lack of coolness (some implementations use spreadsheets!?!?!) they are a nice way to separate the crazy complicated stuff that changes all the time from the rest of the more static logic.


My question with rules engines is, “how do you test and deploy changes to the rules?” If you go through the same process as the rest of your software, then why not just use code? If you have a different process, how do you make sure your rules don’t conflict or overlap in unexpected ways?


The rules are code and should be handled as such - but they're written in a domain specific way that makes them easier for humans to compare against the spec.
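One possible shape for that, with invented clause numbers: each rule carries a reference to the clause it implements, so a reviewer can walk the rule set against the document clause by clause.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Rule:
        clause: str                      # e.g. "s.3(1)(a)" in the source document
        applies: Callable[[dict], bool]  # predicate over the case facts

    RULES = [
        Rule("s.3(1)(a)", lambda c: c["age"] >= 60 and c["sex"] == "M"),
        Rule("s.3(1)(b)", lambda c: c["age"] >= 55 and c["occupation"] in {"mining", "fishing"}),
    ]

    def eligible(case):
        return any(rule.applies(case) for rule in RULES)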


That is up to the business, though with a rules engine I imagine you can empirically test that all cases are covered by just running all possible inputs through it.
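If the input domain is finite, "running all possible inputs through it" can be taken literally. A toy sketch with an invented two-rule eligibility check:

    from itertools import product

    def eligible(age, sex, occupation):
        # Invented rules, for illustration only.
        return (age >= 60 and sex == "M") or (age >= 55 and occupation == "mining")

    for age, sex, occupation in product(range(50, 71), ["M", "F"],
                                        ["mining", "teaching"]):
        # Log every decision (or diff against a previous run / an oracle) so a
        # domain expert can review the complete decision table.
        print(age, sex, occupation, "->", eligible(age, sex, occupation))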


Whatever happened to the Mechanism vs Policy mantra? It was all the rage in the nineties but perhaps X11 soured people on the idea.

Anyway, business logic is a perfect example of stuff that should go in Policy: functional, side-effect-free code (but not necessarily tidy or easy to read) which computes a value from a set of inputs. Like the parent says, effort spent optimising here may end up being a poor investment if the requirements change every year.

Mechanism code, OTOH, generally ends up containing all the side effects. It's harder to test so it tends to be dumb and, therefore, less susceptible to changing requirements. Optimisations here can pay off because the code lives longer.

Really though it all comes down to tests, tests and more tests. Automation is an asset and test cases outlive everything so make sure they work against the interface, not the implementation.
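For what it's worth, a small sketch of that split (names invented): the policy is a pure function that is cheap to retest when the rules change, while the mechanism owns the side effects and stays dumb.

    def discount_policy(customer_tier, order_total):
        """Pure policy: maps inputs to a decision, no I/O."""
        if customer_tier == "gold" and order_total > 100:
            return 0.10
        return 0.0

    def apply_discount(order_id, db, customer_tier, order_total):
        """Mechanism: fetch/persist, with the policy called in the middle."""
        rate = discount_policy(customer_tier, order_total)
        db.execute("UPDATE orders SET discount = ? WHERE id = ?", (rate, order_id))

    # Tests target discount_policy() through its interface, not the database.
    assert discount_policy("gold", 150.0) == 0.10
    assert discount_policy("silver", 150.0) == 0.0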


Tests are great and all, but they wouldn't have helped in this particular case. It also seems the policy was separated cleanly - the problem was that it was optimized instead of kept maintainable.


Or you could have written a code generation tool. But I guess for this kind of job it might have been overkill and introduced its own special bitrot.


GP was essentially being a protein-based transpiler. The reference source code was the appropriate legislation; the code GP wrote was transpilation output. Since biological transpilers are slow and error prone, it makes sense to keep the output as close as possible to the input, to allow for "incremental transpiling" ;).


It can still make sense to translate the rules into a parseable format instead of hard-coding them, to allow for code generation (or dynamic evaluation) and programmatic analysis of the rules.


I've often thought tax laws should be expressed in some kind of domain specific language that you could directly interpret.


A very sensible idea, and not just for tax laws, but I can imagine a lot of people having problems with that kind of approach, because they survive by interpreting/creating the ambiguities that go into most legislation.


"Two months later the law changed adding a new special case somewhere deep inside."

Time for a DSL and a compiler from the DSL to a truth table! :)
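A toy version of that pipeline (Python, with the rule text invented): the rule is written once as a boolean expression over named conditions, and the truth table is generated rather than minimised by hand.

    from itertools import product

    VARS = ["male", "over_60", "over_55", "protected_occupation"]
    # Toy DSL: just Python's own boolean syntax over the named conditions.
    RULE = "(male and over_60) or (over_55 and protected_occupation)"

    def truth_table(rule, variables):
        rows = []
        for values in product([False, True], repeat=len(variables)):
            env = dict(zip(variables, values))
            # eval() is acceptable here because the rule text is trusted input.
            rows.append((values, eval(rule, {"__builtins__": {}}, env)))
        return rows

    for inputs, result in truth_table(RULE, VARS):
        print(inputs, "->", result)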



