We'll be closer to cracking neural nets (and closer to the singularity) when we can train a net on two completely different tasks and have each task make the other's predictions better, i.e. train/test it on spam vs. non-spam emails, then train the same net on male/female classification of Twitter data.
In fact, something as simple as naive Bayes will work reasonably well for either of those classification tasks on its own.
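For concreteness, here's a minimal scikit-learn sketch of that baseline (the toy emails and labels are made up, just to show the shape of it):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical training examples; any labeled email corpus would do.
emails = [
    "win a free prize now", "cheap meds online",         # spam
    "meeting rescheduled to friday", "lunch tomorrow?",   # not spam
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam

# Bag-of-words features fed into a multinomial naive Bayes classifier.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(emails, labels)

print(clf.predict(["free prize meeting"]))  # predicted label for a new email
```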
I'm not sure if you're aware, but in (say) image classification it's pretty common to take a pre-trained net, freeze all the weights except the last layer, and then retrain that layer for a new classification task. You can even drop the last layer entirely, feed the extracted features to an SVM, and get entirely adequate results in many domains.
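Roughly, in PyTorch/torchvision that looks something like the sketch below; the model choice (ResNet-18), class count, and input batch are placeholders, just to illustrate the two variants:

```python
import torch
import torch.nn as nn
from torchvision import models

# Variant 1: freeze a pre-trained net and retrain only the last layer.
net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in net.parameters():
    p.requires_grad = False                        # lock all pre-trained weights
net.fc = nn.Linear(net.fc.in_features, 5)          # fresh head for 5 new classes
optimizer = torch.optim.Adam(net.fc.parameters())  # only the new head gets updated

# Variant 2: drop the last layer and hand the features to an SVM instead.
from sklearn.svm import LinearSVC
feature_extractor = nn.Sequential(*list(net.children())[:-1])  # everything but fc
feature_extractor.eval()
with torch.no_grad():
    x = torch.randn(8, 3, 224, 224)                # stand-in for a real image batch
    feats = feature_extractor(x).flatten(1).numpy()
svm = LinearSVC().fit(feats, [0, 1, 2, 3, 4, 0, 1, 2])  # dummy labels
```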
Transfer learning at the moment works within one domain (say, images), because the low-level shapes are still similar, but not between different domains of data.
I agree: this would be general intelligence. You can adjust weights for one specific problem with some degree of success, but having that carry over to a different, slightly related problem (and even being able to distinguish between the two problems) is going to require some rethinking, or compute power far beyond what we have now.