I was working on exactly this last week. After a week of designing and coding, I found the fundamental flaw in my design for parallel computation of serial problems: it takes a "forked thread of execution" as much work to undo what it had computed as it took to compute it in the first place. I'll write up a blog post about it.