I want to call out:

> But sometimes the opposite happened. Glancing through the results of an algorithmic run showed a pattern I wasn't aware of. Taking that back to paper allowed me to make further theoretical breakthroughs that sped up runtime by additional orders of magnitude.
I worked with some mathematicians on an online load balancing problem, and we were trying to model the behavior of our connections. I did not have the mathematical chops to come up with the ideas for the models. But I was running the experiments and diving into the data and figuring out where and why our models broke, so I could reason about the behavior we were trying to model in the same way that I can walk around my apartment with my eyes closed.
Having this level of intuition of the behavior we were trying to model was invaluable during discussions. I may not have been able to come up with the mathematical models, or even understand them completely from first principles, but once they were explained to me, I could give instant feedback on how well it might work based on my understanding of the data.
I've written about this before, but my experience as a programmer attached to researchers at a university has involved a lot of this sort of thing. At first I was intimidated being the "dumb guy" in a room of "smart folks". But I've come to realize there is a huge benefit to this role. I get to ask questions like "I'm not following how X->Y->Z works, can you explain?". Since everyone else is making assumptions or following the intuitive leaps from lots of Ph.D. work - and I don't have that - I frequently find bad assumptions just by trying to follow what people are saying. Similarly, I get to point out where too much yak shaving is going on, by saying "look, we don't have to worry about that, I'm just going to use $existing_tool that already does it for us".
But the best part of working in an environment where I'm regularly over my head is that in ignorance I'm free (and safe) to say "hey, from down here this looks a lot like this other problem in a different space, can we adapt that solution?" Whether or not it's doable, it helps because I either learn something or directly shape what the "smart folks" are doing.
I also find that working at a level that's over my head for the last few years has made me fearless. At the startup I'm associated with, when one of the other programmers says "we don't know how to do that!" I can say "So I haven't known what I'm doing for years, doesn't mean we can't go ahead and just do it and fix it later". And we fix a lot, but with a solid straw-man it's much easier because we can say "this is the actual problem".
I think it's not just knowing your data - it's also learning to be comfortable in your own ignorance (not embracing it, just accepting it as a "for now" situation) and OK with uncertainty.
If you read his autobiographical essays, you'll notice Richard Feynman often took the position of the "dumb guy" in a room of "smart folks", and that seemed to help his approach.
I would imagine being willing to "be dumb" separates the "really smart" from the "merely smart". The ones who come up with the most amazing things tend to question assumptions in the fashion of "dumb" people who haven't been fully initiated into a field.
I don't think enough people appreciate (or understand) your point 4. I am fairly good at algorithmic optimizations and pretty good at your implementation optimizations, but I do not have the math to do much mathematical optimizations, so spending any time on these kinds of problems gives me a visceral understanding of your point 8.
But I've talked to too many people who don't understand that such problems exist, and therefore don't see the point of any optimization (and then wonder why others consider their products to be hideous monstrosities), or believe that micro-optimizations or algorithms are the best you can do. In reality, all three are closely tied together and, in fact, applicable to almost any problem.
Your points 6 and 7 could be elaborated more. Or at least stamped on the foreheads of some educators of my acquaintance. One of the things I learned from what math education I have had is that you won't really understand something without seriously using it.
I hope that this was actually part of your PhD research, or if not that it didn't delay you too much. And if you're out in the "real world", all you have to do to succeed is to forget you ever heard of your point 10.
It wasn't actually part of my PhD. But it didn't delay me too much, in the sense that I was able to stop in time to finish my PhD work. In fact, over the next four months I wrote up my thesis and coded up the prototype as well, which I hadn't done in the three years prior. I do conjecture that my performance in the PhD work relates to the time spent on the 17x17, but beyond certain specific items, it's just conjecture.
I'm in the real world now for good, and I might be lucky that problems of the kind that would invoke the reaction this one did are few and far between. I'd like to find something that I can enjoy this much while being profitable, but, oh well, a man can dream, right?
I was just about to say that this is what about four out of the six years of my math PhD have felt like. (And it's not quite done yet...May cannot get here soon enough!)
>As I started writing code for this problem, I found out I could work on it for hours on end without distraction, quite unusual for me. I chalk this up to a tight feedback loop. Have an idea, implement it, get back a number. Think of ways to improve the number, start the cycle all over again. This is a cycle I would run dozens of times a day. Obviously more fundamental ideas would take more time to see the fruits of, and that's when I actually lost focus, but when the brain has been rewarded so richly with small cycles, it can afford to go a bit longer without reinforcement. This tells me that A/B testing or a similar numeric optimisation area would be quite motivating to me and I've made a mental note to go into this in the future.
I've had this experience before. I have a severe problem of not being able to focus on anything, and it feels great to be working on a problem like this. I'd love to find a problem like that again.
For more stuff like this, check out Love and Math (http://www.amazon.com/gp/product/0465050743/ref=as_li_ss_tl?... , disclaimer: affiliate link). It's a great book I'm just in the middle of that talks a little about doing maths and the mindset and heart behind it. Some of the OP's observations are echoed in the book and it's nice to have the OP and the author of Love and Math give words to my own obsession to problems I've tackled and failed to solve.
"Of course, I also cheated. I realised that my code spent a huge amount of time counting the set bits on a byte. I could implement that on my side, or get a CPU that implements that as a single instruction."
Why not use a lookup table to memoize the results? There are only 256 possible cases for a single byte. Or is this still considerably slower than a single processor instruction on the i7?
With my limited understanding of processors, it seems like a lookup table would be significantly slower. The instruction itself can be executed very quickly in hardware. More importantly, the instruction can operate directly on the registers. Memoizing, at best, would require loading the relevant value from cache. Not only is this slower than registers, but it also pushes 256 bytes out of the cache. Additionally, using a single instruction may allow the CPU to do some form of out-of-order execution to get additional benefits.
> there will be other sized grids that are looking for solutions.
Nope! We now know for all grid sizes whether or not they are rectangle-free four-colorable. 17x17, 17x18, and 18x18 were the last remaining cases (there are no such grids larger than 18x18). So you have nothing to worry about.
Sounds like a challenge :-) Seriously though, I am not a mathematician - did the SAT solution to the 17x17 show there were no solutions past 18x18? I ask because the original challenge site had an update [1] which I was trying to parse. It talks about OBS4, but I couldn't tell whether that referred to the class of problem or the particular challenge problem.
Thanks for the link! That is pretty cool. I hadn't really thought about these kinds of questions, although operationally I can imagine applications for this, if you imagine columns and rows as redundant network links, and 'colors' being network services, you can prove that all your services have x4 redundancy this way. (I look at 16 x 16 grids all the time except they are 16 racks each with 16 servers :-)
Haha before I clicked I knew it would be that goddamn 17x17 problem. Hadn't heard that someone had solved it, but it will be interesting to read how they did it.
Short version: know your data.