Write a function that takes as input a list of sets, many of which are not disjoint, and outputs a list of sets in which all of the non-disjoint ones have been merged together. The output is therefore a list of pairwise disjoint sets, because any intersecting sets have been merged.
e.g. given the input:
[(1,2,3), (2,4,8), (10,11,12)]
the output should be:
[(1,2,3,4,8), (10,11,12)]
Whereas given the input:
[(1,2,3), (2,4,8), (10,11,12), (8,10)]
the output should be a single set, because (8,10) links the two groups:
[(1,2,3,4,8,10,11,12)]
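To make the expected behaviour concrete, here is a minimal sketch of one possible approach, using a union-find (disjoint-set) structure over the elements. The name `merge_overlapping` is hypothetical, the elements are assumed to be hashable, and this is not necessarily the algorithm referred to below.

```python
def merge_overlapping(sets):
    """Merge every group of transitively intersecting sets into one set."""
    parent = {}  # maps each element to its parent in the union-find forest

    def find(x):
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s in sets:
        first = None
        for x in s:
            parent.setdefault(x, x)  # register unseen elements as their own root
            if first is None:
                first = x
            else:
                # Link every element of this set to the set's first element.
                root_a, root_b = find(first), find(x)
                if root_a != root_b:
                    parent[root_b] = root_a

    # Collect elements by their root to build the merged sets.
    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


print(merge_overlapping([(1, 2, 3), (2, 4, 8), (10, 11, 12)]))
print(merge_overlapping([(1, 2, 3), (2, 4, 8), (10, 11, 12), (8, 10)]))
```

This sketch makes a single pass over the input and a second pass over the distinct elements. Because it uses path halving without union by rank, the cost should be roughly O(N log N) in the worst case for N total elements. Empty input sets contribute nothing to the output.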
I came up with an algorithm that is acceptable for the dataset we currently have, but I have no idea what its time complexity is (on our real dataset it finished in "one pass", but in principle it could be worse than that). I also don't know how I would implement a distributed version if the list of sets were too big to fit in memory.
I haven't found a good solution on Google either (but I'm not even sure what to search for).
