I was just thinking about that recently. One of the antennas my company designed had some weird constraints based on the plastics. The engineer took a wire and bent it by hand to fit while watching the impedance on a signal analyzer. And... it's a wire with some odd kinks in it.
Worked 'fine'
Later they redesigned the product to use a better classic PCB antenna and it 'sucked'.
Impedance is easy to measure. But this is not the only parameter. A 50 Ohm resistor has a perfect impedance, but makes a terrible antenna. I wonder how they optimized the radiation characteristic this way.
I mistyped, he uses a vector network analyzer for tuning RF stuff. It measures the amplitude and phase of a circuit and generates a Smith chart in real time. They've come down in price a lot, but they used to be $50k or more.
There are also tools that allow you to simulate antennas.
This sounds closer to genetic programming -- they evolved a program that describes an antenna.
The distinction arises because the basic idea of evolutionary computing was independently explored from at least four different directions: genetic algorithms (evolving fixed-length parameter strings scored by a fitness function), genetic programming (evolving program descriptions), evolution strategies (evolving real-valued vectors, often including parameters of the evolution process itself) and learning classifier systems (evolving populations of rules).
Edit: on the other hand, the paper linked specifically calls it out as genetic algorithms. Where's my hat, I feel peckish.
> This sounds closer to genetic programming -- they evolved a program that describes an antenna.
Based on my understanding of the paper, the members of the population were individuals which each encoded an antenna design as lengths of wire segments and rotations between segments. They weren't programs but rather what amounts to descriptions of geometric forms. So the output wasn't a program that was run to give an antenna, it was a list of wire segments to assemble together in a certain way. But maybe the difference in our interpretations comes down mostly to semantics.
On further thinking I think you're closer to it, I was blindly pattern matching on what looked like function names. Usually in GP the thing being evolved is basically an AST, though sometimes it's something like assembler. That's what I latched on to.
But seen as a string of symbol-value pairs, GA is the proper fit I think.
Took a GA class during my undergrad years -- actually went to a neighboring university to do it because mine didn't offer it -- and it informed a research project I got a tiny grant for. One of the things my prof noted was that many people looooved GAs as a research topic because, at least at the time, a good chunk of related work was in coming up with ideas for fitness functions and examining them and that's a vein a reasonably creative person could mine for publishing for a long time.
I think that the most fun part of writing this article was coming up with simple examples that adequately demonstrated the components of a GA. I agree that creativity is a huge asset in the topic!
When I was in college I was mystified by genetic algorithms, without knowing much about them. After taking two courses on the subject and reading some books, I came to the conclusion that apart from being inherently inefficient (which is why you only apply them when you have no alternative), they are actually outperformed by hill climbing (which can be seen as a special case of a GA with population = 1). Also, the crossover operator seems to do more harm than good, and its usefulness in nature is not fully understood, although there are some theories (this last point is taken from Pedro Domingos' book).
Interesting. I want to ask this in a way that is respectful because I'm not grandstanding, but coming to that conclusion after two undergraduate courses, based on preceding mystification, did it not occur to you that it might be your understanding, rather than the whole field?
My personal story: I worked for just under a decade on EAs (PhD and professionally, late 90s and early 00s). I don't entirely disagree with you about pure GAs. But even back then, pure GAs were rarely used. But I do disagree about evolutionary algorithms in general.
The usefulness of EAs is highly problem dependent. And I will definitely concede that the space of problems in which they excel and are tractable is smaller than the space of problems for something like a neural network, in my experience.
As for crossover, when it works (and in engineering contexts the operator needs to be designed with the problem in mind) it does so by allowing individuals at different points on the landscape to share the optimisations they find. At the cost of increased genetic load. This, of course, is only useful if the fitness landscape is both heavily multimodal and self-similar. And it requires some effort to avoid the population converging around a single solution (where you do have a stochastic hill climb).
In my experience, EAs come into their own with non-fixed genotypes or complex genotype-to-phenotype expression. Multivariate optimisation with a fixed number of variables is probably best done with other tools.
> And I will definitely concede that the space of problems in which they excel and are tractable is smaller than the space of problems for something like a neural network, in my experience.
The only thing that EAs and neural networks have in common is the buzzword abuse. Neural networks are actually useful as they are at their core an efficient way to approximate multivariate functions, while EAs are computationally expensive heuristics abused by being passed off as optimization algorithms even in domains where deterministic algorithms (and even brute force approaches) actually perform better.
That depends on the particular problem you are trying to solve. If hill climbing works for your problem, then you are lucky and you should not use GAs.
I've also had the experience that hill climbing (with random restarts) can outperform GAs. In many problems you can fairly easily define a neighborhood concept for moves from the current solution. That tends to be easier than getting a meaningful crossover and mutation working. Then you can just hill climb by making neighborhood moves, with random restarts. Run 1000 random restarts and pick the best one, and you're going to have a really good solution.
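A minimal sketch of that loop (the toy objective and parameter values here are made-up for illustration, not from any particular problem):

```python
import random

def hill_climb(initial, neighbors, score, max_steps=10_000):
    """Greedy hill climb: move to the best-scoring neighbor until none improves."""
    current, current_score = initial, score(initial)
    for _ in range(max_steps):
        best = max(neighbors(current), key=score, default=None)
        if best is None or score(best) <= current_score:
            break  # local optimum reached
        current, current_score = best, score(best)
    return current, current_score

def hill_climb_with_restarts(random_solution, neighbors, score, n_restarts=1000):
    """Run independent climbs from random starts and keep the best result."""
    return max((hill_climb(random_solution(), neighbors, score)
                for _ in range(n_restarts)),
               key=lambda result: result[1])

# Toy usage: maximise -(x - 3)^2 over the integers, neighbors are x +/- 1.
best, best_score = hill_climb_with_restarts(
    random_solution=lambda: random.randint(-100, 100),
    neighbors=lambda x: [x - 1, x + 1],
    score=lambda x: -(x - 3) ** 2,
    n_restarts=20)
```

The only problem-specific pieces are the three callables, which is why this approach is often quicker to set up than designing crossover and mutation operators.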
Genetic algorithms aren't useless, but naive evolutionary algorithms are: the crossover method is terrible at searching the space in any kind of dynamic way. Things like PEPG and covariance matrix adaptation (CMA-ES) are also genetic algorithms, and they work significantly better because they have essentially dynamic learning rates; they are to plain genetic algorithms what Adam is to vanilla backpropagation.
Out of curiosity, does anyone know of examples where genetic algorithms are the "right choice"?
I thought they were really nifty when I first heard of them, but thinking of them as an optimization procedure, they don't really stand up in my experience to basic gradient descent methods. Sure, you can e.g. train a neural network with genetic algorithms, but why would you?
The game's scoring system was based on Lines of Action but added an alternative goal which was to win by points. So in order to play this game well a bot had to balance progress towards one of the goals with the need to make sure the opponent didn't get too far ahead in the other goal. My program was the output of a genetic algorithm that learned a remarkably good balance between these two sometimes competing goals. Before I used the genetic algorithm I spent a lot of time hand tuning the evaluation function and actually came up with a quite good result that I was unable to improve on. When I ran the genetic algorithm I was thrilled to discover that the genetic algorithm's output totally crushed my hand tuned output.
After my program won the contest with an undefeated 74-0-0 record the author of one of the best Lines of Action computer programs in the world played his program against mine. If the organizer of the programming contest had chosen just a slightly different scoring system that was biased a little more towards the Lines of Action goal, then my program would probably lose against the Lines of Action program. The author played two games of his program against my program. Both games ended with his program thinking it won by Lines of Action rules and my program thinking it won by the tweaked contest rules! This suggests that the organizer of the contest had done a particularly good job of balancing the different goals of the game. (No, this paragraph is about the organizer's game design and has nothing to do with the question of genetic algorithms. I just thought it was an interesting side note.)
I really have no idea of whether gradient descent optimization methods would have been better or not. But this was an area where genetic algorithms worked very well.
All these meta heuristic type things (simulated annealing, GAs,...) are really just fancy dress on MCMC adapted for mode seeking (usually via importance sampling by scaling).
Specifically in this case the evolution bit is a sampler and evaluation function is the log likelihood.
Nothing wrong with that, MCMC is awesome. But you need to put in the features of your particular problem if you want good performance. i.e. if your input to cost surface mapping is continuous and you can efficiently calculate a gradient then you would be silly not to use it. Your sampler will be much more efficient.
American Fuzzy Lop[0] uses GAs to concentrate its fuzzing.
But in general the "No Free Lunch Theorem"[1] shows that no optimisation algorithm is "better" than any other across the universe of all possible problems.
So it's going to be some balance of your familiarity with GAs, having a problem domain that maps nicely to some simple input structure but which has a relatively gnarly surface to search and whether you could do it better in some other way.
These days the clear winner is various kinds of neural nets, largely because they can be represented with matrix multiplication, which made them suitable first for GPUs and now for much more specialised hardware.
I've had them in mind for searching autoscaler parameters, mostly because I feel it's easier to reason about a GA's optimisations than a series of matrix multiplications resulting in a cloud of floating point numbers.
I once saw an antenna design produced by GA that was very interesting. Met all the performance parameters in a very compact and easy to fabricate design. Extremely counter-intuitive design -- it looked like a mangled paper clip -- but it worked in simulation and in real life.
People have also had good results using GA to optimize Yagi-Uda arrays. In that case, it is very easy to map design parameters onto a "gene vector" in a way that is well-understood and such that it yields an understandable result. But that is really just a search problem, so maybe other algorithms would do as well.
Yep, I recently worked on an engineering project, where GAs were used to evolve new designs for large steel structures, with the aim of reducing weight (and, ergo, cost).
There were a lot of constraints, and several applications were used at different points (e.g. specialised 3D CAD) - a single generation took around 1 hour, so we had to let it run for days at a time on a cluster to be useful.
I wasn't in the AI team (I was the architect for the cloud infrastructure and backend), but my understanding was it was pretty much a "textbook" implementation (I dabbled with GAs, basic neural networks and swarm optimisation several years ago).
I was actually kind of surprised, because I didn't realise people still used GAs any more, let alone such a standard implementation.
In the specific area I worked on, minutes are borderline tolerable so I didn't think twice before posting. But now that you said it, I totally see how it can go on for days.
Well one area where genetic algorithms are better than basic gradient descent is, naturally, nondifferentiable functions. But that is facile.
I think the realistic strength is that there are some easy to use libraries that don't require the user to really know anything about optimisation or work too hard and can deal with very difficult optimization scenarios (and sometimes even deal with them well).
Plus they're neat-o. That's gotta count for something. You ever try the island variant?
> Well one area where genetic algorithms are better than basic gradient descent is, naturally, nondifferentiable functions. But that is facile.
That's a poor assertion because, even if derivative-free algorithms weren't already a thing, there are a myriad of smoothing techniques that enable nondifferentiable objective functions to be approximated by differentiable ones.
> I think the realistic strength is that there are some easy to use libraries that don't require the user to really know anything about optimisation or work too hard and can deal with very difficult optimization scenarios (and sometimes even deal with them well).
That's also not a reasonable assertion, because there are also plenty of software libraries for continuous and discrete optimization that are easy to use.
GAs in general are explored in academia because they are a fad that's easily publishable. Other than that this class of methods is in general very computationally expensive and very inefficient, and don't provide any advantage over plain old adaptive sampling, or even dumb regular lattice sampling.
> That's a poor assertion because even if detivative-free algorithms weren't already a thing, there are a myriad of smoothing techniques that enable nondifferentiable objective functions to be approximated by differentiable functions.
The function you wish to optimise may be non differentiable because the domain is discrete. E.g. to perform feature selection one can binary encode which features are included and which are excluded, with the fitness function being the accuracy of a model using the included features as inputs and trained for X iterations.
> GAs in general are explored in academia because they are a fad that's easily publishable.
I would say 90% of GA papers are because of this.
> Other than that this class of methods is in general very computationally expensive and very inefficient, and don't provide any advantage over plain old adaptive sampling, or even dumb regular lattice sampling.
Definitely false. The vast majority of OR-type papers in good journals include comparison to weak baselines like those, and wouldn't be published (in those venues) if they didn't win.
> Definitely false. The vast majority of OR-type papers in good journals include comparison to weak baselines like those, and wouldn't be published (in those venues) if they didn't win.
That statement is not correct. Although it's customary to accompany papers with benchmarks, these benchmarks focus practically exclusively on popular evolutionary algorithms. Worse, the "no free lunch" theorem grants authors the freedom to cherry-pick which benchmark problems are used.
Hmmm, maybe we're reading different papers. Ok, I admit that a lot of papers compare only against other metaheuristics, but only in cases where it is already accepted that those other metaheuristics far out-perform weaker methods.
About NFL, I agree that a lot of authors misuse it in that way. But again, in most papers in good journals, what we see is either a comprehensive experiment with a wide enough range of instances, or a real industry problem, not cherry-picking.
> Ok, I admit that a lot of papers compare only against other metaheuristics, but only in cases where it is already accepted that those other metaheuristics far out-perform weaker methods.
Those methods are invariably other evolutionary methods, and benchmarks are cherry-picked to show only encouraging results under the convenient guise of the "no free lunch" theorem. That's pretty much the norm, as in the repetitive recipe for inventing a metaheuristic algorithm: a) come up with a clever nature-inspired metaphor with a catchy name, b) put together an algorithm that is arguably inspired by the metaphor, c) come up with a benchmark that arguably portrays the algorithm as an improvement, even if only on a fortuitous corner case.
No, I'm using "weaker methods" in the technical sense, not "performs worse".
I agree that many papers of that recipe type exist -- Sorensen and Weyland have skewered them effectively -- they are just froth, to be ignored in a discussion of the true merits of evolutionary computation.
I wrote a GA for picking an optimal Fantasy Football (soccer, Brit here) team based on player scores from the previous season(1). The team did above average, even though it was hobbled by the fact that I didn't do any transfers after the initial pick.
The knapsack problem(2), of which fantasy team picking is an example, is a classic use case for genetic algorithms.
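As a sketch of that classic use case, a textbook bitstring GA for the 0/1 knapsack might look like the following (population size, rates, and the overweight penalty are illustrative assumptions, not anyone's actual implementation):

```python
import random

def ga_knapsack(values, weights, capacity, pop_size=100, generations=200,
                mutation_rate=0.02, seed=0):
    """Bitstring GA for 0/1 knapsack: gene i says whether item i is packed.
    Tournament selection, single-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(values)

    def fitness(genome):
        weight = sum(w for g, w in zip(genome, weights) if g)
        if weight > capacity:
            return 0  # crude penalty: overweight packs score nothing
        return sum(v for g, v in zip(genome, values) if g)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best, best_fit = None, -1
    for _ in range(generations):
        for genome in pop:                             # track the best ever seen
            f = fitness(genome)
            if f > best_fit:
                best, best_fit = list(genome), f
        next_pop = []
        while len(next_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)                  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]                   # bit-flip mutation
            next_pop.append(child)
        pop = next_pop
    return best, best_fit

# Toy instance: the optimum is to pack items 1 and 2 (value 220, weight 50).
best, fit = ga_knapsack(values=[60, 100, 120], weights=[10, 20, 30], capacity=50)
```

Fantasy team picking maps onto the same shape: each gene flags whether a player is bought, "weight" is price against the budget cap, and "value" is the projected point score (plus extra constraint handling for positions).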
Wow that's funny I did the exact same thing! I was working at a web/mobile gaming company building a sports draft platform and I was tasked with building a bot player to fill in empty player slots. I used a GA and it worked really well even though I didn't know anything about sports.
Funnily enough I did a similar thing - I was given a list of players with a "point score" and a "price", and I had to choose the best team I could get. One goal-keeper from a set of N, a couple of forwards, a couple of defenders, etc.
The solution my program "evolved" was the best in the company.
I was thinking about this the other day when trying to imagine a way to produce texture shaders through some kind of optimization. The connections between various nodes in a shader might be modeled as mutations, since they can't easily be modeled as continuous values.
You shouldn't view GA as a competitor to gradient descent. Whenever gradient descent applies to a problem, it'll probably be better than GA. But GAs are flexible and applicable to more problems, and can be used in conjunction with other optimisation methods.
A few days ago I used GA to perform feature selection (binary encoding with 0 indicating a feature is excluded, 1 if included, fitness function was the accuracy of a NN taking the included features as inputs and trained with gradient descent for x epochs). This made my final network smaller, train faster, etc.
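A rough sketch of that encoding, with the expensive "train a NN and measure accuracy" fitness swapped for a cheap stand-in (the INFORMATIVE set and every parameter below are made up for illustration):

```python
import random

N_FEATURES = 10
INFORMATIVE = {0, 3, 7}  # hypothetical "truly useful" features (an assumption)

def fitness(mask):
    # Stand-in for "accuracy of a NN trained on the included features":
    # reward including informative features, lightly penalize extras.
    hits = sum(1 for i in INFORMATIVE if mask[i])
    extras = sum(mask) - hits
    return hits - 0.1 * extras

def evolve(pop_size=50, generations=100, p_mut=0.05, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        for mask in pop:                        # track the best mask seen
            f = fitness(mask)
            if f > best_fit:
                best, best_fit = list(mask), f
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(genes) for genes in zip(a, b)]  # uniform crossover
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return best

best_mask = evolve()
```

In the real version each fitness call is a full (short) training run, which is why the population-level parallelism of a GA pays off here.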
For a uni assignment we had to do a project that organised a bunch of people into 4-5 person groups based on
* gender (try to get mixed-gender groups and at least have two of each gender in a group, e.g. no 1 female + 4 males)
* skill types and skill level (so correct combination of skills and skill levels)
* bunch of other rules and stats
At first I brute-forced it, but that didn't work very well; then with a few tweaks I used a simulated annealing algorithm, which worked perfectly. I believe genetic algorithms can solve problems like that too, though.
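A generic simulated-annealing loop of the kind described might be sketched like this (the toy objective and cooling schedule are assumptions for illustration, not the assignment's actual cost function; the grouping problem would plug in a swap-two-people neighbor and a constraint-violation cost):

```python
import math
import random

def simulated_annealing(initial, neighbor, cost, t0=10.0, cooling=0.999,
                        steps=20_000, seed=0):
    """Minimise cost(); worse moves are accepted with probability exp(-delta/T),
    so the search can escape local minima while the temperature is high."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = cost(candidate) - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling  # geometric cooling schedule
    return best, best_cost

# Toy usage: minimise (x - 7)^2 over the integers, moving x by +/- 1.
best, best_cost = simulated_annealing(
    initial=0,
    neighbor=lambda x, rng: x + rng.choice([-1, 1]),
    cost=lambda x: (x - 7) ** 2)
```

Like the hill-climbing case, the problem-specific work is all in `neighbor` and `cost`, which is usually less effort than designing GA crossover operators for a constraint-heavy assignment problem.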
If you need a GA, you haven't decomposed your problem well enough. Which is to say, if you need a quick example of an optimizer for a high dimensional space and you just want to tinker with a few cost functions, then GA is the right choice.
This is probably extremely cynical, but does floydhub have a vested interest in promoting genetic algorithms because they can take a long time to train, especially with neural network hyperparameters as the search space? These kinds of beefy, complex training processes would be fantastic for its business.
I think this is a reasonable insight into their business model. Especially if they are able to somehow cheaply spin up long-running processes. Not sure how they might be able to do that, though. Conventional tech is all about short term processes.
Hmm, so what is the theory of the mutation function? Is it just determined ad-hoc from looking the problem and throwing some standard examples at it?
It seems like you could implement effectively any function with a complicated enough combination of fitness, selection and mutation functions but without a theory of which to use, progress would be a bit hard.
Very true - there are lots of what the AI people call hyperparameters to tweak. And they do affect the balance between fast convergence and avoidance of local-minima traps, as you would expect.
Once you have used them for a while, you start to get a feel - if the optimization is rapidly converging to a poor solution, you need more variation - more mutation, bigger population, etc. If the solution is converging very slowly, you need less variation.
I like to set up the functions as Gaussian, then the parameter controlling the variation is a well understood sigma.
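That Gaussian mutation operator is easy to sketch (a minimal illustration, not anyone's production code):

```python
import random

def gaussian_mutate(genome, sigma, rng=random):
    """Return a copy of a real-valued genome with N(0, sigma^2) noise added
    to every gene. sigma directly controls how far offspring can wander."""
    return [g + rng.gauss(0.0, sigma) for g in genome]

# Usage: sigma = 0.5 perturbs each gene by about half a unit on average.
rng = random.Random(0)
mutated = gaussian_mutate([0.0] * 10_000, 0.5, rng)
```

Because sigma is the standard deviation of the perturbation, "more variation" or "less variation" becomes a single well-understood knob, and some strategies even put sigma itself on the genome and let it evolve.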
I think it's usually a function of what data structure your "genotype" is stored in.
In my experiments with GAs, setting the mutation coefficient far too high or far too low means you explore far too much of the problem's search space or far too little, respectively. Testing parameters such as the frequency of mutation is part of the iterative process of tuning your GA.
I've been working on a Genetic Algorithm/Evolutionary Computing framework in Scala, using network-parallelism to solve optimization problems fast. If you're interested, check it out at https://github.com/evvo-labs/evvo
An interesting case is where a human does the selection; this was pretty well covered in Dawkins' 'The Blind Watchmaker' [0] with his program for generating biomorphs. Some nice speculation in there too:
'Dawkins speculated that the unnatural selection role played by the user in this program could be replaced by a more natural agent if, for example, colourful biomorphs could be selected by butterflies or other insects, via a touch-sensitive display set up in a garden.' (from the wikipedia article).
A lot of the time there are better solutions that a GA, but something I've always liked about them is how easy they are to understand, even for AI noobs.
I guess it's because of the similarities to the evolution of life, but also because they're just quite simple.
GAs are a lot of fun, but a lot of the time they are very time-consuming for fairly straightforward optimisation problems. Other iterative searches like simulated annealing or just plain ol' dumb hill climbing are a lot faster most of the time.
If your problem is simple enough for solving with SA (let's face it, most problems are), there is really no gain on using GA.
Genetic algorithms beat simulated annealing when different dimensions of your solution interact strongly and unpredictably; SA and gradient searches completely fail in those cases.
I worked on both GA and SA and I think it really depends. There are some effective evolution strategies that can speed up the evolution, combine that with a parallel implementation GA can be really fast. But yeah it can take time to figure out a good evolution strategy and SA can solve some problems just fine.
Yeah, evolution is amazing as a biological process precisely because no intelligence is needed to get results, albeit after millions or billions of years. But given that we do have intelligence, reliance on a blind method seems pointless other than as a demonstration, as in Dawkins' "weasel" demo, that it can work.
> When you're solving a problem, how do you know if the answer you've found is correct? In many domains, there is a single correct answer. A mathematical function may have a global maximum or other well-defined attributes. However, other problems, like how a cell behaves in a petri dish, do not have clear solutions. Enter evolution, which does not design towards a known solution but optimizes around constraints.
What about situations when there are no real optima and minima, but only intransitive relations? So "rock/paper/scissor"-like scenarios[0]. Could you use genetic algorithms in any way in those circumstances?
One issue with “rock paper scissors” is the fixed input domain; there is nothing to mutate if your three starting options are the only possible inputs.
If you’re trying to do something like predict the next move based on historical patterns, I think a classifier would be better.
Not sure if @vanderZwan meant the same, but what about scenarios in which your goal is to pick the best member of an already existing population where each member has a set of variables like agility, strength, speed. Would it be possible to find a pareto-optimal solution with genetic algorithms, maybe even with priority-parameters?
So one step of a genetic algorithm, selection, does what you describe, but the main value of the approach is the iterative generation of new populations.
This is what the GA came up with for the design given the constraints: https://en.wikipedia.org/wiki/Genetic_algorithm#/media/File%...
And this paper describes the process: http://ti.arc.nasa.gov/m/pub-archive/1244h/1244%20(Hornby).p...