Mildly unrelated, but I wanted to point out the awesomeness of the K-map[1] for handling simple minimization problems like these.
Unfortunately, once your dimensions get large it becomes difficult for humans to visualize the optimizations, which is where Espresso[2][3] comes in handy.
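For anyone who hasn't seen it, Espresso consumes truth tables in the Berkeley PLA format; if I remember the format correctly, a tiny sketch (2 inputs, 1 output, labels are my own) looks roughly like this, and the tool spits back a minimized cover:

    .i 2
    .o 1
    .ilb a b
    .ob f
    00 1
    01 1
    10 1
    11 0
    .e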
Ooh, espresso. There was also a tool called eqntott for converting boolean logic expressions in a human-readable form into the truth tables that Espresso takes as input. The source code was written in an archaic dialect of C, but here's a resurrected version that should compile on modern compilers:
I haven't touched this code since 2008, so beware of grues, but I remember it being a really slick system for minimizing boolean logic when you combine eqntott with espresso. Just the thing for an electrical engineering student who's sick of doing K-maps, which described me nicely at the time.
While I'm not sure thousands of if/then/else rules are actually the best knowledge representation for his domain (it sounds like he really wants to learn some model from data, in which case throwing some off-the-shelf machine learning at it may be a better fit), the area of "expert systems" may be relevant if the rules really are extracted directly from some source, such as a human domain expert, and need to be managed.
Expert systems were particularly big in AI in the '80s, and considerably less hot now, but still widely used in industry. There are a lot of issues that come up, most of which aren't really purely technical, such as: 1) how you extract knowledge from people who might know it; and 2) how you validate that the knowledge you've extracted is actually what they know, for example by validating that it produces the same decisions that they would make; and 3) how you allow updates/revisions to the knowledge base over time. Once you have the rules and some reason to believe that they're any good, actually managing and applying them can be done through one of several rule engines, such as Drools or Jess.
The keyword "knowledge engineering" may also turn up relevant info.
Is it just me? It's not a good question; it shows a lack of core coding experience.
For example, a video game consists of millions of possible movement outcomes based on the environment; it's not approached as a collection of thousands of if/then/else statements.
In fact, the movement of cows seems very much like video game coding: it needs a "cow engine".
It's a great question. It expresses a problem from the worldview of a coder with a particular, very common skillset. Which is something that other coders with the same problem and the same skillset will find in their searches. As long as the answers are good, it's a perfect use of the system.
I think it was a really good question. It showed that he had the intuitive instinct that there had to be a better way. He just didn't know how to ask for a "cow engine" by name.
I am not so certain. Note that their question is not whether there is another way; they are asking how to manage all the statements they are planning to write.
Imagine being given a problem such as "add any two given integers but do not use math operators".
A person without anything but if/then/else knowledge would proceed to try to write
if (a==1 && b==2) c = 3;
and repeat it for every variation until they gave up.
Someone with more core experience and coding instinct would immediately recognize there must be a way to set patterns for 0-9 and then analyze each decimal place, apply a matrix, or use a number of other approaches.
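To make the contrast concrete, here's a minimal sketch of one pattern-based answer to that toy problem (plain Python, assuming non-negative integers):

    def add(a, b):
        # repeatedly fold the carry back in until nothing is left to carry
        while b:
            carry = a & b       # positions where both bits are 1
            a = a ^ b           # sum of the bits, ignoring carries
            b = carry << 1      # carry moves one column to the left
        return a

    print(add(17, 25))  # 42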
I write AI perception systems for "realtime simulations", and you are exactly correct.
While on the surface it may seem like "if/else" will give you the solution you need, the sheer number of permutations will make it infeasible.
That said, there are "known solutions" and some novel solutions to this problem too. Perhaps if/else is really what this person needs, but there are other potentially better solutions documented.
Often through physics engines. You express a bunch of constraints then let a physics engine solve them on each time step (and sometimes across multiple time steps).
Custom (non-physics) solutions involve declarative sets of constraints being solved by some sort of engine; such solutions often start to resemble physics engines even if the constraints are not exactly physical.
Learning how to use a physics engine, first, then how those physics engines work, is a good start to understanding these kinds of problems.
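As a rough sketch of "influences as forces" (constants and field names are made up here, not from any particular engine), each influence contributes a spring-like pull or push and the state is integrated once per tick:

    # Toy force-based step (semi-implicit Euler): shade attracts, a hot spot repels.
    def step(cow, influences, dt=0.1, damping=0.9):
        fx = fy = 0.0
        for strength, (tx, ty) in influences:
            fx += strength * (tx - cow["x"])   # positive strength pulls toward (tx, ty)
            fy += strength * (ty - cow["y"])   # negative strength pushes away
        cow["vx"] = (cow["vx"] + fx * dt) * damping
        cow["vy"] = (cow["vy"] + fy * dt) * damping
        cow["x"] += cow["vx"] * dt
        cow["y"] += cow["vy"] * dt

    cow = {"x": 0.0, "y": 0.0, "vx": 0.0, "vy": 0.0}
    step(cow, [(1.0, (10.0, 5.0)), (-0.5, (2.0, 2.0))])
    print(cow)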
Edit: I'm really surprised the stackexchange answers don't mention this.
Using a physics engine to express an AI problem? That's an interesting solution, but it's certainly the first time I've heard of one being used like this.
But you know, the problem as stated was how a bunch of influences affect...cows. So think of them as a bunch of mechanical cows instead, and the influences are modeled through springs and collisions, things like that.
AI is often cast as constraint solving, and a physics engine is basically a constraint solver. Of course, in practice the AI engine runs alongside the physics engine. Ah! I should have mentioned Rodney Brooks' work:
where you model intelligence as a bunch of simple prioritized reflex-like behaviors. Now, on each step, you have each "entity" execute their reflexes according to the current environment, where some reflexes act to disable others. The reflexes are incredibly simple, you probably couldn't even express A* path finding, but you could have them act like ants following a diffuse scent trail. Also see his paper "Elephants Don't Play Chess."
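A minimal sketch of that reflex-priority idea (the behavior names are invented, not from Brooks' papers): behaviors are checked in priority order, and the first one that fires suppresses the rest.

    def avoid_fence(cow, world):
        if world["distance_to_fence"] < 1.0:
            return "turn_away_from_fence"

    def seek_shade(cow, world):
        if world["temperature"] > 30 and not world["in_shade"]:
            return "walk_toward_shade"

    def graze(cow, world):
        return "wander_and_graze"                  # default behaviour, always fires

    BEHAVIORS = [avoid_fence, seek_shade, graze]   # highest priority first

    def decide(cow, world):
        for behavior in BEHAVIORS:
            action = behavior(cow, world)
            if action:                             # a firing reflex disables the ones below it
                return action

    print(decide("daisy", {"distance_to_fence": 5.0, "temperature": 33, "in_shade": False}))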
If I was given the task to integrate cow herds in a computer game, my first instinct would be BOIDS/flocking, adding in weights for wind velocity and sun. It's fast and lends itself to predictable behaviour which is beneficial to the player.
It may not necessarily be realistic but then most players won't be familiar with cow movements.
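A rough sketch of that weighting idea on top of two standard flocking terms (the weights and the wind/sun handling are my own guesses, not from any shipped game):

    def vadd(a, b): return (a[0] + b[0], a[1] + b[1])
    def vsub(a, b): return (a[0] - b[0], a[1] - b[1])
    def vscale(v, k): return (v[0] * k, v[1] * k)

    def steer(cow, herd, wind, sun_dir, w={"coh": 1.0, "sep": 1.5, "wind": 0.3, "sun": -0.5}):
        # cohesion: pull toward the herd's centre of mass
        cx = sum(c[0] for c in herd) / len(herd)
        cy = sum(c[1] for c in herd) / len(herd)
        coh = vsub((cx, cy), cow)
        # separation: push away from neighbours closer than 2 units
        sep = (0.0, 0.0)
        for other in herd:
            d = vsub(cow, other)
            if 0 < d[0] ** 2 + d[1] ** 2 < 4.0:
                sep = vadd(sep, d)
        # weighted sum: flocking terms plus wind drift and a push away from the sun
        return vadd(vadd(vscale(coh, w["coh"]), vscale(sep, w["sep"])),
                    vadd(vscale(wind, w["wind"]), vscale(sun_dir, w["sun"])))

    herd = [(0.0, 0.0), (1.0, 0.5), (3.0, 2.0)]
    print(steer((0.0, 0.0), herd, wind=(0.2, 0.0), sun_dir=(0.0, 1.0)))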
I think it's a good question if only for the quality of answer it gained for the rest of us. But it's certainly not what I would expect from a very experienced developer.
Such a great question. Want to predict cow movement while taking into consideration sudden events, food and sun, using thousands of if-then rules? Sounds like the perfect problem for one of the following tree learners that not only manage themselves but build the model for you, in order of suitability to the problem*:
Genetic Programming
ADTree
Random Forest
Decision Tree
Most of the work would be in figuring out the best way to gather all the data.
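To make that concrete, here's a minimal sketch with scikit-learn's plain decision tree (the features, data and labels are invented for illustration); the learned tree is literally a bundle of if-then rules that export_text prints out:

    from sklearn.tree import DecisionTreeClassifier, export_text

    # Each row: [temperature_C, wind_kph, hours_since_fed]; label: observed behaviour.
    X = [[32, 5, 1], [18, 20, 6], [25, 10, 7], [35, 2, 2], [15, 25, 5], [27, 8, 8]]
    y = ["seek_shade", "huddle", "graze", "seek_shade", "huddle", "graze"]

    model = DecisionTreeClassifier(max_depth=3).fit(X, y)
    print(export_text(model, feature_names=["temp", "wind", "hours_since_fed"]))
    print(model.predict([[30, 8, 3]]))    # e.g. ['seek_shade']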
All machine learning techniques build models for you.
ADTrees, Random Forests, and Decision trees are all classifiers, so they serve this guy's needs. Genetic Programming is not a machine learning technique: it is an optimization method for a very specific task and is quite unsuited for his purposes.
No, I am certain you are confusing Genetic Programming [1] with Genetic Algorithms [2]. I am a fan of the former (I do a lot with them) but have never really been impressed by the latter.
Wikipedia says Genetic Programming is a specialization of genetic algorithms where the objects are programs. I think this is the only thing I have ever seen where the specialization turns out to be more general than the parent.
I'd also like to point out that although all ML learns models, all those I gave can learn what amounts to a bunch of IF-THEN statements. Well, GP need not be so limited. And Random Forests lose the trees in the forest (harder to interpret), but yeah, that's why I formed that particular list.
Then you will know that Genetic Programming can be used for classification, structured learning, symbolic regression and even meta learning [1]. For classification you can build a program that searches for a program that best uses the input to predict classes. I have done this (there are better heuristics than the old kind that lead to really big dumb programs). Or you can combine boosting with Genetic Programming, or you can use it in Learning Classifier Systems. All of these lead to classification based on genetic programming.
As an aside, my definition of machine learning is more inclusive than yours. I mean, if you are going to separate out optimization then I guess you don't count stochastic gradient descent or matrix factorization as part of machine learning? Machine learning is basically the combination of statistics and optimization where you can work with a lot of data and the output of your computation is more important than the model.
"Wikipedia says Genetic Programming is a specialization of genetic algorithms where the objects are programs. I think this is the only thing I have ever seen where the specialization turns out to be more general than the parent."
Obviously various optimization techniques can be used essentially as poor-man's machine learners. That doesn't mean they're particularly good at it.
In this case, Genetic Programming uses trees as its representation, and so it could be used as a tree-based classifier, but the OP has already indicated that he expects enormous numbers of rules, necessitating huge trees. For Genetic Programming, this translates into a gigantic search space.
As it was my thesis work, I'm always ready to promote Genetic Programming where appropriate. This is not the place. There are better tools for this task.
I do not think he knew that thousands of if-then rules would be needed; that was likely just his estimate.
But that aside, I believe that such a search space could be handled by GP. That very problem is what I've been tackling for some time now. Its position on my list reflects that particular bias. But there are techniques. I use EDAs like MOSES and was heavily inspired by http://www-ia.hiof.no/~rolando/ but have my own original contributions. The method is also interactive for tough problems, which really helps in reducing the search space. Although I have not succeeded in a use yet, I have been thinking hard about how category theory could be used to constrain the space or perform particular automatic transformations (https://www.cs.ox.ac.uk/jeremy.gibbons/publications/origami...., https://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/a...).
If you had samples of cow behavior such as videos, you could then simulate behavior at each step and optimize for a genetic algorithm that produces similar behaviors. In this case it's probably not the right solution, but it's not that hard to swap out different learning algorithms when things don't seem to jell.
I have already explained in the post above why optimization is very much a part of machine learning. Now, you say that classification and optimization are two different things, and that is true. But really, it can be more fruitful to look at supervised learning as a special case of unsupervised learning [1]. It is best to seek the most general framework from which to understand things, as it leads to a deeper understanding, broader applicability of concepts and easier cross-fertilization across fields.
For example, understanding the spectral theorem makes SVD (hence PCA) and the DFT class of algorithms much clearer. Understand the notions of Lp-norms, convexity, adjoints, loss functions and regularization, and a whole bunch of seemingly different algorithms collapse into facets of the same thing. Hook it up to automatic differentiation and some optimization algorithms and you can write anything from neural networks, SVMs and regularized logistic regression to non-negative tensor factorization in a few lines. You stop making arbitrary divisions between classification and optimization. Much the same kind of collapse can be done for the dual [2] notion of probabilistic algorithms by thinking in terms of graphs, simplices, parametrizations, families and conjugacy.
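To make the "collapse" concrete, here's a toy sketch (plain numpy with crude numerical gradients, everything invented for illustration) where swapping the loss term is the only difference between logistic regression, a linear SVM and ridge regression:

    import numpy as np

    def fit(X, y, loss, lam=0.1, lr=0.1, steps=300, eps=1e-5):
        # One generic learner: minimize mean(loss(w.x, y)) + lam * ||w||^2.
        w = np.zeros(X.shape[1])
        objective = lambda w: np.mean([loss(x @ w, t) for x, t in zip(X, y)]) + lam * w @ w
        for _ in range(steps):
            grad = np.array([(objective(w + eps * e) - objective(w - eps * e)) / (2 * eps)
                             for e in np.eye(len(w))])
            w = w - lr * grad
        return w

    logistic = lambda s, t: np.log1p(np.exp(-t * s))   # logistic regression (t in {-1,+1})
    hinge    = lambda s, t: max(0.0, 1 - t * s)        # linear SVM
    squared  = lambda s, t: (s - t) ** 2               # ridge regression

    X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
    y = np.array([1, 1, -1, -1])
    print(fit(X, y, logistic), fit(X, y, hinge), fit(X, y, squared))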
The best thing from all this is you stop thinking "which algorithm should I use?" and start thinking "what do I want to do? What is the best mathematical model for this?" What would really be great would be a machine learning language, where one could work with things akin to folds and maps on various structures and manifolds and make the incidental complexity disappear. Stuff like [3] is really encouraging for that direction.
[1] The problem of learning a distribution usually is called unsupervised learning, but in this case, supervised learning formally is a special case of unsupervised learning; if we admit that all the functional relations or associations that we are trying to learn have any element of noise or stochasticity, then this connection between supervised and unsupervised problems is quite general.
I agree with everything you said. I never said optimization isn't a part of ML, it was very much a part of my ML masters, in fact. I was just saying that, in this case, the OP doesn't have an explicit function to optimize, hence why he didn't need optimization...
Oh cool =) I don't really think in terms of classification and optimization anymore and don't really see much distinction between minimizing a loss or an objective function. But I did rant on because I am really passionate about how much unnecessary complexity (hehe) is in much of machine learning. And I am always eager to talk about promising unifying approaches. If you haven't read up on semirings or computational algebraic geometry you should. For an MLer, learning more math is like a programmer learning more math: it is a lot of upfront work, but it makes a seemingly arbitrary list of rules much more cohesive and simpler once you've achieved what is basically a conceptual compression.
Yeah. It must be pretty energy intensive for the brain, considering all the rewiring that occurs. I take the interlaced-GIF approach to learning hard things.
The solution is great and there are very nice algorithm suggestions given here as well. Now the question is how easily one can apply a growing number of rules in production without taking down the application.
I suggested using Prolog in the comments. His problem description was not quite good enough for me to assess whether or not Prolog will be a good choice, but at least it's a candidate.
A Prolog program is essentially a list of rules that are written as implications. Something like this (a sketch with made-up cow predicates):
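    % "Head :- Body" reads: Head holds if everything in Body holds.
    % (Predicate names here are invented for illustration.)
    seeks_shade(Cow) :-
        temperature(T), T > 30,
        in_direct_sun(Cow).

    moves_toward(Cow, Spot) :-
        seeks_shade(Cow),
        nearest_shade(Cow, Spot).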
Good answer. Nice to see the original questioner accepted it. To take it a bit further, so-called "answer set solvers" like clasp (http://potassco.sourceforge.net/) provide a fast way to run these kinds of logic programs.
You really don't want to think of this in terms of IF ... THEN ... ELSE but as individual binary or multi-path choices. These are most efficiently handled with hash tables that choose a route through a logic diagram in a way that is more intuitive than IF ... THEN ... ELSE, as well as more efficient.
You start by writing a logic flowchart that you understand, then you reduce the logic flowchart to code, usually automatically. This is how a logic flowchart looks:
Maybe I'm off the mark, but whenever I see a long run-on procedure full of if-then-else, I think "This is a job for a state machine!" With that approach, you can exhaustively provide actions for every combination of states and events. You can also immediately identify the code responsible for handling any state+event. You can also track which state+event transitions have been tested automatically.
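For instance, a minimal table-driven sketch (the states, events and actions are placeholders I made up):

    # Every (state, event) pair maps to exactly one (action, next_state), so each
    # transition has one obvious place to read, change and test.
    TRANSITIONS = {
        ("grazing",       "temp_above_30"): ("walk_to_shade",  "seeking_shade"),
        ("grazing",       "loud_noise"):    ("run",            "startled"),
        ("seeking_shade", "reached_shade"): ("lie_down",       "resting"),
        ("startled",      "calm_again"):    ("resume_grazing", "grazing"),
    }

    def handle(state, event):
        action, next_state = TRANSITIONS.get((state, event), ("ignore", state))
        return action, next_state

    print(handle("grazing", "temp_above_30"))   # ('walk_to_shade', 'seeking_shade')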
> Maybe I'm off the mark, but whenever I see a long run-on procedure full of if-then-else, I think "This is a job for a state machine!"
I'm a big fan of state machines, but systems like this are often multitasking and (for all practical purposes) stateless. OTOH it might be more accurate to say their state is formally unpredictable.
A system like the one being described would be more like a multi-threaded critical-path flowchart, where multiple processes are active at a given time. And at any time a high-level linear programming result might mandate a change of strategy based on available resources, even though individual processes continued to follow the tactical flowchart.
So even though a state machine approach looks attractive (as it always does), it's a matter of deciding how many states and how many levels. Because of the nature of the original problem and the number of connections with everyday reality, the fact that it was a state machine might escape the attention of even a careful observer.
You're not, imho. The problem is: how do we manage the data associated with the structure of this gigantic state machine? It is likely to change over time, so it cannot be hard-coded (although if it were, one could perhaps use a tool which compiles the state machine to hard code).
Basically, many of the solutions proposed are more or less specialized state machines (decision trees, Markov chains, the Rete algorithm).
Even with the clue that he wants to "predict how cows move around in any landscape", his description doesn't seem like it would be enough for any definite design advice.
What do the states of the cows and the landscape look like? What is changed by the if-then-else results? Does he already have his rules? Is this for a biology project or a video game, etc.?
You could use a finite automaton system, an array of cow-states, a cow DSL or something else depending on the answers to these and other questions.
Yes. The piece missing for me in most answers I've seen is eventing. Does one pump a series of events/state changes into the CowModel and see what actions pop out? Does the model involve interactions with other cows where the actions of one cow cause actions of others? If so, one must have an event queue to handle chain reactions. How discrete are the events? Is there a "temp is 89F" event where, for every N minutes the cow is in that temperature in direct exposure, it becomes more likely to seek shade? If so, the "temp is 89F" event must be triggered multiple times. And such a rule requires that the past states of the cow be retained. I think the if/then/else logic is only part of the solution to a simulation problem of this sort.
A finite state machine is actually not a great solution for this particular problem (in fact, FSMs are often over-used), IMHO. An expert system is a far better solution (as one commenter mentioned). The problem with a FSM is that you still must explicitly specify ALL of the nodes (without some complex algorithm to automate such a task). An expert system, like Prolog, only requires you to state all of the rules, and then it will figure out any query for you (via propositional logic, backwards chaining, etc). The number of nodes typically grows exponentially with the number of rules (if the rules are inter-related), so for a problem of any significant size an expert system would probably be the method of choice.
It's basically thousands of if-this-then-that statements, but dressed up into something you can reason about and most importantly train and build automagically.
This really sounds like a machine learning problem. Why hand-engineer all these rules yourself (which can be error-prone in any case) when a program could do it for you? If you can figure out how to get a large data set together and come up with a good general representation of the data, you're halfway there.
I'd consider taking a look at recurrent neural networks if you're looking at your data as a time series. If it's more of a static problem that doesn't take into account the last N things that came before, you might consider tree-based methods or even DBNs if you can get a lot of data together (say 50,000-100,000 samples).
Potential pitfall: if you're using a NN based approach, you will judge results based on test performance, but you won't get as much insight into the "rules" your network has learned.
I have an axiom "any sufficiently complex if-then-else ruleset can be modeled as a search problem with boosts".
I have modeled a logistics lookup system (assign "best" courier for an ecommerce setup) using a search solution. The rules were modeled as documents in a search index and looked up using weights. Fundamentally, what I did was flatten an if-then-else hierarchy and assign weights based on (roughly) level of nesting.
The con of this model is obvious - it is at best a heuristic. However, with clever overrides built in (as a separate search index?), you can get pretty close to the ideal solution.
The pro of this model is scalability: when your rules number in the millions, this system scales beautifully.
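Roughly the shape of it, as a toy sketch (the rule contents and scoring scheme here are invented, not the actual logistics setup):

    # Each flattened rule is a "document": a set of conditions plus a boost that
    # stands in for how deep/specific it was in the original if-then-else nest.
    RULES = [
        {"if": {"country": "IN"},                                  "courier": "C", "boost": 0.5},
        {"if": {"country": "IN", "max_weight_kg": 5},              "courier": "A", "boost": 1.0},
        {"if": {"country": "IN", "max_weight_kg": 5, "cod": True}, "courier": "B", "boost": 2.0},
    ]

    def matches(conds, order):
        for key, value in conds.items():
            if key == "max_weight_kg":
                if order.get("weight_kg", float("inf")) > value:
                    return False
            elif order.get(key) != value:
                return False
        return True

    def best_courier(order):
        # score = matched-condition count * boost; the highest-scoring rule wins
        scored = [(len(r["if"]) * r["boost"], r["courier"])
                  for r in RULES if matches(r["if"], order)]
        return max(scored)[1] if scored else None

    print(best_courier({"country": "IN", "weight_kg": 2, "cod": True}))   # 'B'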
The question is an interesting one, but it's futile in its context. You can't build a prediction engine for cattle movement. Adding factors like weather, food etc. is all well and good, but there are hundreds if not thousands of other factors that are definitely going to be missing and that will render the predictions pointless.
Never mind the fact that you can throw black-swan-type events into the mix, or just unknown unknowns you could never think of (maybe a cow has a frog phobia and panics when it sees a frog and starts a stampede, or whatever it is that cattle do).
You can't predict how an individual cow will move, but it's perfectly acceptable to use models to predict how cattle generally move. Similar to the way we use models to predict how humans navigate hallways and find paths across grass.[1] When you have a bulk of interacting particles, it tends to remove the "personality" of the individual, leaving you with how the group as a whole will move.
Philip Ball's (former editor of Nature) book Critical Mass[2] is an excellent read about the interesting effects that happen when you look at large number of complex interacting actors. He discusses the effects in traffic, pedestrian models, finance, plant growth etc. I highly recommend it.
I am late to this discussion, but I would like to add my experiences with compiling rules to a Rete network. I have done a lot of hacking on the old Lisp OPS5 code and found it easy to work with, once you read and understood Charles Forgy's papers. I hacked OPS5 to support multiple data worlds and a few other enhancements for specific projects.
The more modern Rete based software projects are probably also easy to work with.
What, are you insinuating that asking about great big if-then-else trees means you'd never hire that person? Because, what, you've never been a complete novice programmer before?
Considering the question-asker has a specific problem to solve, I would even hazard they probably aren't a programmer by trade.
... a very good sign that they're still too green.
When did HN become so touchy-feely? It's important that people start learning somewhere. If you show any interest in programming, I highly recommend learning more. However, there comes a time when you must have a certain amount of experience. This is a profession where important things are made. Even people who should know better end up building systems that leak our sensitive personal information, or let others impersonate us.
I didn't say that such a person shouldn't be hired, but I certainly implied that the original question is a better indicator of their capability than the always positive keyword soup and "years experience" that many people slather on their résumé.
If "avoiding IF-THEN spaghetti" is all you require from contributors, fine. Let's not pretend that this is better than an early high school effort.
Considering the question-asker has a specific problem to solve, I would even hazard they probably aren't a programmer by trade.
So why would someone hire them to be a programmer? And yet, unlike other skilled professions, rank amateurs often present themselves as competent programmers. It would be valuable to include evidence to the contrary with résumés.
you've never been a complete novice programmer before?
I certainly wasn't hired when I was. What kind of ridiculous statement is this?
The poster is asking the wrong question. They are asking about how to manage incidental complexity, when they should be asking how to avoid inherent complexity. Incidentally complex code is the worst possible code to maintain, and you'll thank yourself for not letting contributors commit it.
So you'd rather hire somebody who has the same problem but has too much pride to ask somebody for help? Seriously, if all the questions everybody ever asked were on their résumés, nobody would ever be able to get a job. At least this individual recognized that there was something wrong and sought out how to fix the problem from people with more experience than them.
Meh, put a date on it. If you asked this even a couple of years ago, fine. If you asked this last month, I have a pretty good feel for where you're at. You should also learn from your questions, which would make a great interview conversation.
I've littered the net with my share of dumb questions, and I have no regrets.
The idea of things on resumes shouldn't be to make hire/no-hire decisions (well, unless your job is to bin all the incoming resumes written in crayon..). Talking about this situation during an interview may provide insight though.
[1] http://en.wikipedia.org/wiki/Karnaugh_map
[2] http://embedded.eecs.berkeley.edu/pubs/downloads/espresso/in...
[3] ftp://ftp.cs.man.ac.uk/pub/amulet/balsa/other-software/espresso-ab-1.0.tar.gz