The title of the paper by my company's CTO, "Hough Transform: Underestimated Tool In The Computer Vision Field", still holds true. The paper describes the FHT (fast Hough transform) algorithm rather well.
 pdf http://www.scs-europe.net/dlib/2015/ecms2015acceptedpapers/0...
Its competitor is RANSAC.
Compared to linear or orthogonal regression, it has the advantage of coping quite naturally with multiple lines. Each line shows up as a high value (a peak) in the Hough space, and the next line can be found by removing the current maximum and searching again.
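To make the peak-removal idea concrete, here is a minimal pure-Python sketch (not the FHT from the paper, just the textbook voting scheme). The name `hough_lines`, the parametrisation x*cos(theta) + y*sin(theta) = rho, the bin sizes, and the choice to "remove" a maximum by subtracting the votes of the points that fall into that bin are all my own assumptions.

```python
import math

def hough_lines(points, n_lines, n_theta=180, rho_step=1.0, rho_max=100.0):
    """Return up to n_lines tuples (rho, theta, votes) for lines
    x*cos(theta) + y*sin(theta) = rho, strongest first."""
    n_rho = int(2 * rho_max / rho_step) + 1
    thetas = [math.pi * t / n_theta for t in range(n_theta)]
    trig = [(math.cos(th), math.sin(th)) for th in thetas]

    def rho_bin(x, y, t):
        c, s = trig[t]
        return int(round((x * c + y * s + rho_max) / rho_step))

    def vote(x, y, weight):
        # A point votes for every (theta, rho) cell of a line passing through it.
        for t in range(n_theta):
            r = rho_bin(x, y, t)
            if 0 <= r < n_rho:
                acc[t][r] += weight

    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        vote(x, y, +1)

    remaining = list(points)
    found = []
    for _ in range(n_lines):
        votes, t, r = max((acc[t][r], t, r)
                          for t in range(n_theta) for r in range(n_rho))
        if votes == 0:
            break
        found.append((r * rho_step - rho_max, thetas[t], votes))
        # "Remove the current maximum": take back the votes of its supporting points.
        keep = []
        for x, y in remaining:
            if rho_bin(x, y, t) == r:
                vote(x, y, -1)
            else:
                keep.append((x, y))
        remaining = keep
    return found
```

Subtracting the supporting points' votes, rather than just zeroing one cell, also clears the weaker ridge that a line leaves in neighbouring cells, so the second search is less likely to return a near-duplicate of the first line.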
It is generalizable to more abstract shapes, but the Hough space becomes high-dimensional. It is nontrivial to represent hierarchical objects, for example line segments AND the square they form.
There's also a neat trick of finding vanishing points via a double Hough transform.
See also the paper linked below for an example of the HT applied to road marking recognition.
 pdf http://iitp.ru/upload/publications/7226/2015_ICMV_D.%20Krokh...
Basically, it's used to find the parameters of a model from sample data.
Also, can it be used to detect circular arcs?
Our lab has done some research on fast GHT using general-purpose computation scheme optimisation, but I cannot find any publications in English from that distant period. For the 3D Hough transform, there's an efficient solution for finding the argmax (a 3D line), also by my colleagues.
 pdf https://web.eecs.umich.edu/~silvio/teaching/EECS598/papers/B...
 pdf http://www.scs-europe.net/dlib/2016/ecms2016acceptedpapers/0...
Circular arcs can be detected, but how well depends on how long they are and on the number and shapes of whatever other features exist in the image.
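A rough illustration of why arc length matters, as a sketch under simplifying assumptions: the radius is taken as known, so only the centre (cx, cy) is searched, and the helper names `circle_center_votes` and `arc_points` are mine. Each edge point votes for every centre that would put it on a circle of that radius, so the peak at the true centre can never be higher than the number of points on the arc.

```python
import math
from collections import Counter

def circle_center_votes(points, radius, n_angles=360):
    """Accumulate votes for circle centres of a known radius.
    Returns a Counter mapping integer (cx, cy) cells to vote counts."""
    acc = Counter()
    for x, y in points:
        cells = set()
        for k in range(n_angles):
            a = 2 * math.pi * k / n_angles
            cells.add((round(x - radius * math.cos(a)),
                       round(y - radius * math.sin(a))))
        for cell in cells:  # each point votes at most once per cell
            acc[cell] += 1
    return acc

def arc_points(cx, cy, radius, start_deg, end_deg):
    """Sample one point per degree on an arc (test data only)."""
    return [(cx + radius * math.cos(math.radians(d)),
             cy + radius * math.sin(math.radians(d)))
            for d in range(start_deg, end_deg)]

long_arc = circle_center_votes(arc_points(50, 40, 20, 0, 90), radius=20)
short_arc = circle_center_votes(arc_points(50, 40, 20, 0, 30), radius=20)
print(long_arc[(50, 40)], short_arc[(50, 40)])
```

With the arcs sampled once per degree, the 90-degree arc puts 90 votes on the true centre and the 30-degree arc only 30, so a short arc is much easier to lose among votes from unrelated features. With an unknown radius the accumulator gains a third dimension.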
Hough can be thought of as a convolution of an image with a kernel that is a delta function in the parameter space. Peaks in the parameter space are then interpreted as features in the real space, with the height of a peak giving the strength of that feature.
It is called the randomized Hough transform.
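For anyone who hasn't seen it, a minimal sketch of the randomized variant for lines: instead of letting every point vote along a whole curve, repeatedly draw as many points as the model has parameters (two for a line), solve for the line through them, and cast a single vote in a sparse accumulator. The function name, the bin sizes, the fixed sample count and the seeded generator are my assumptions, not taken from any particular paper.

```python
import math
import random
from collections import Counter

def randomized_hough_line(points, n_samples=2000, rho_step=1.0,
                          theta_step_deg=1.0, seed=0):
    """Return (rho, theta_deg, votes) of the most-voted line
    x*cos(theta) + y*sin(theta) = rho, with theta in [0, 180] degrees."""
    rng = random.Random(seed)
    acc = Counter()  # sparse accumulator: only cells that were hit exist
    for _ in range(n_samples):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue  # coincident points do not define a line
        # Direction of the line's normal, folded into [0, pi).
        theta = math.atan2(dx, -dy) % math.pi
        rho = x1 * math.cos(theta) + y1 * math.sin(theta)
        cell = (round(rho / rho_step),
                round(math.degrees(theta) / theta_step_deg))
        acc[cell] += 1
    (r_bin, t_bin), votes = acc.most_common(1)[0]
    return r_bin * rho_step, t_bin * theta_step_deg, votes
```

The trade-off is that memory and time no longer depend on the size of the parameter grid, but the result depends on the fraction of sampled pairs that happen to lie on the same line, so heavy clutter needs more samples.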
Circular arcs is stretching it. :-) Do you need to find the endpoints of the arc? That would be two more dimensions to search over. If that's the case, I would just use a maximum likelihood approach. You have to be careful in that case: with a circle, the distance of points on the inside of the circle versus points on the outside of the circle messes up any naive fitting method.
R S Stephens, A Probabilistic Approach to the Hough Transform