New Explainable AI Algorithms (wagtaillabs.com)
111 points by pete_b_condon on Nov 17, 2020 | 18 comments



Why doesn't this end up with an infeasible number of rules and equally huge reassembled trees once you go beyond toy examples?


That's a very good question.

You're right that the way we typically train Tree Ensembles creates a massive number of rules: the walk-through Random Forest has more than 100,000 leaves per Decision Tree. Once we start grafting, the number of rules starts to vastly outnumber the amount of training data.
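
For a concrete sense of that scale, here's a minimal sketch (a scikit-learn RandomForestClassifier on a synthetic dataset, not the walk-through model from the article) that counts the leaves, and therefore the leaf rules, in a trained forest:

    # Sketch only: dataset and parameters are hypothetical, chosen to show the
    # scale of the rule count, not to reproduce the article's numbers.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=50_000, n_features=20, random_state=0)
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    leaves_per_tree = [tree.get_n_leaves() for tree in forest.estimators_]
    print("leaves per tree:", min(leaves_per_tree), "to", max(leaves_per_tree))
    print("total leaf rules in the ensemble:", sum(leaves_per_tree))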

I have some follow-up articles planned that will cover this in more detail, but the short answer is that I feel we often jump to overly complex models up front without fully considering whether the accuracy/complexity tradeoff is worth it. Using Amalgamate I showed how I could halve the number of rules without significantly increasing validation error (+5%). I believe that if we're careful about when we use more sophisticated techniques (i.e. Boosting and dense/fully-connected/tabular neural networks) then we should be able to create reasonably accurate models that are reasonably straightforward to explain.
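
As a hedged illustration of that tradeoff (this is not Amalgamate, just a generic comparison on a synthetic dataset), you can watch the rule count and the validation accuracy move together as the trees are constrained:

    # Sketch only: compares an unconstrained forest against deliberately smaller
    # ones on held-out data; the dataset and depths are illustrative.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=50_000, n_features=20, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

    for max_depth in (None, 8, 4):  # None = fully grown trees
        forest = RandomForestClassifier(n_estimators=100, max_depth=max_depth,
                                        random_state=0).fit(X_tr, y_tr)
        n_rules = sum(t.get_n_leaves() for t in forest.estimators_)
        print(f"max_depth={max_depth}: {n_rules} rules, "
              f"validation accuracy {forest.score(X_val, y_val):.3f}")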


I've been writing new Explainable AI algorithms in my spare time. Always interested to hear what people think.


This is really nice work.

If you had a mechanism to subscribe for future updates, I'd do it.

Either way, this is cool stuff.


Thanks :)

I've been trying to work out where to put updates; so far I've been using GitHub & Twitter (both @wagtaillabs). I'll keep posting to HN as well (I just had two orders of magnitude more traffic than any other day).

I'd be more than happy to hear suggestions for places where people could follow along (I've thought about an email list, but I'm not sure how many people actually read emails any more).


I took Show HN out of the title because this seems more like reading material (https://news.ycombinator.com/showhn.html). It's very good though, so I also put it in the second-chance pool (https://news.ycombinator.com/item?id=11662380).

(While I have you, would you mind adding an email address to your profile that we can contact you at? We do that sometimes when we want to invite a repost.)


No worries, I thought I was ok because the post links to the GitHub repo but I'll make the link more explicit in future.

Yep, I've added an email address now :)


Great—thanks!


Fantastic read! As a long-term fan of RF, a lot of things immediately clicked and made sense. It's also a great new direction compared to the SHAP-style explanations most of the industry is using at the moment.

I'm excited to play with the code.


Thanks :)

Yep, code's on GitHub. I'm more than happy to collaborate; there's heaps of things that need to be done.


I only see a single Jupyter notebook in the GRANT repo. Is all the code in there? https://github.com/wagtaillabs/GRANT/blob/main/GRANT_walkthr...


Yep, all the code is in there. I do have another piece of code that has all the algorithms in a single class (which makes it much easier to use), I'll double check that it's up to date and post that tonight.


An alternative is to re-label the data with the ensemble's outputs and then learn a decision tree over that.
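
A minimal sketch of that idea, assuming a scikit-learn-style ensemble and a synthetic dataset (names here are illustrative, not from the article):

    # Re-labelling sketch: train an ensemble, replace the labels with the
    # ensemble's own predictions, then fit a single decision tree as a surrogate.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
    ensemble = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    y_relabelled = ensemble.predict(X)                # the ensemble's outputs
    surrogate = DecisionTreeClassifier(max_depth=6).fit(X, y_relabelled)
    print("surrogate agreement with ensemble:",
          (surrogate.predict(X) == y_relabelled).mean())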


That works to a point, but it doesn't necessarily find all the rules of the model. In the post I walked through a model with three training records (yellow, blue, red) which created six prediction boundaries. Half of the rules weren't covered by the training data, which makes them hard to find without an efficient algorithm that searches out all possible rules. The risk of undiscovered rules is that they may cause unexpected behaviour that leads to bad predictions, and if you haven't described the whole model it's impossible to know how many of these potentially bad predictions exist.
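
As a rough illustration of that risk (again a hedged sketch on synthetic data, not the article's code), compare how well such a surrogate agrees with the ensemble on the training inputs versus on points sampled from regions the training data never covered:

    # The surrogate is fitted only on the training inputs, so it can match the
    # ensemble there yet disagree in regions the training data never reached,
    # which is where the ensemble's undiscovered rules live.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=5_000, n_features=10, random_state=0)
    ensemble = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    surrogate = DecisionTreeClassifier(max_depth=6).fit(X, ensemble.predict(X))

    rng = np.random.default_rng(0)
    X_far = rng.uniform(X.min() * 2, X.max() * 2, size=(5_000, X.shape[1]))

    print("agreement on training inputs:",
          (surrogate.predict(X) == ensemble.predict(X)).mean())
    print("agreement on unseen regions:",
          (surrogate.predict(X_far) == ensemble.predict(X_far)).mean())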


Do you have any references/explainers for that approach? Would be interested to read!


The best I have is this one I wrote a long time ago:

https://www.aaai.org/Papers/Workshops/1999/WS-99-06/WS99-06-...

But I apologize! It's a bit pimped up compared to the one-liner above; I think step 7 in section 4.3 is what I was thinking of :) I did laugh when I dug it out, as I have been working on the first bullet in the conclusion this week!


Check out this work from Rich Caruana & collaborators on model compression: http://www.cs.cornell.edu/~caruana/compression.kdd06.pdf

which was a precursor to the model distillation work from Geoff Hinton: https://arxiv.org/abs/1503.02531


I've run some experiments that you can check out here:

http://www.clungu.com/Distilling_a_Random_Forest_with_a_sing...



