This technique is used when optimizing atomic structures, where the energy changes rapidly as a function of the distance between atoms. Newton's method (or, more typically, one of its approximations such as L-BFGS) without a maximum step size would just blast all the atoms apart, but with a maximum step size it works extremely well.
This technique does require some domain knowledge: you have to know what counts as a "big" step, which is not always easy to tell.
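A minimal sketch of the idea: plain gradient descent where each update is rescaled so its norm never exceeds a cap. The potential below (a Lennard-Jones-like 1-D pair energy with its minimum at r = 1) is my own illustrative choice, not from the comment; near the repulsive wall the gradient is enormous, so an uncapped step would fling the "atom" far away.

```python
import numpy as np

def clamped_gradient_descent(grad, x0, lr=0.01, max_step=0.1, iters=200):
    """Gradient descent where each update is capped at max_step in norm."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        step = -lr * grad(x)
        norm = np.linalg.norm(step)
        if norm > max_step:
            step *= max_step / norm  # rescale, preserving direction
        x = x + step
    return x

# Toy 1-D pair potential f(r) = 1/r**12 - 2/r**6, minimum at r = 1.
# Its gradient blows up as r -> 0, mimicking a repulsive atomic wall.
f_grad = lambda r: np.array([-12.0 / r[0]**13 + 12.0 / r[0]**7])

# Start at r = 0.8, where the raw gradient is ~ -161: an unclamped
# 0.01 * gradient step would jump past the minimum entirely.
r_min = clamped_gradient_descent(f_grad, [0.8])
```

The same capping idea is what trust-region methods and the `maxstep`-style options in structure optimizers formalize; the `max_step` value here is exactly the piece of domain knowledge the comment mentions.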
I believe gradient descent is king everywhere interesting right now (e.g. CV, NLP, RL).
Of course, the roots need to be isolated/bracketed first, and for this one can use Sturm's theorem: https://en.wikipedia.org/wiki/Sturm's_theorem
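A rough sketch of how Sturm's theorem is applied, assuming polynomials are given as NumPy coefficient arrays (highest degree first). The chain is p, p', then the negated remainders of successive polynomial divisions; the number of distinct real roots in (a, b] is the drop in sign changes between the two endpoints (assuming neither endpoint is itself a root, and modulo floating-point tolerance):

```python
import numpy as np

def sturm_chain(p):
    """Build the Sturm sequence: p, p', then negated division remainders."""
    chain = [np.trim_zeros(np.asarray(p, dtype=float), 'f'), np.polyder(p)]
    while len(chain[-1]) > 1:
        _, rem = np.polydiv(chain[-2], chain[-1])
        rem = np.trim_zeros(np.round(rem, 12), 'f')  # drop numerical noise
        if rem.size == 0:
            break  # previous entry was a GCD factor; chain terminates
        chain.append(-rem)
    return chain

def sign_changes(chain, x):
    """Count sign changes in the chain evaluated at x (zeros skipped)."""
    vals = [np.polyval(c, x) for c in chain]
    signs = [v > 0 for v in vals if abs(v) > 1e-12]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def count_roots(p, a, b):
    """Distinct real roots of p in (a, b], by Sturm's theorem."""
    chain = sturm_chain(p)
    return sign_changes(chain, a) - sign_changes(chain, b)

# x^3 - x = x(x - 1)(x + 1): three real roots at -1, 0, 1.
p = [1, 0, -1, 0]
```

Repeatedly bisecting an interval and calling `count_roots` on each half isolates every real root into its own bracket, after which a safeguarded method (bisection, Brent) finishes the job.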