You don't need a metric that captures everything about the case; you just need a statistical model whose risk assessment is at least as accurate as the surgeon's. This is not as difficult as it sounds: in Chapter 21 of Thinking, Fast and Slow, Kahneman makes a strong case that simple algorithms are very often better than expert clinical judgement.
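One way to make "at least as accurate as the surgeon" concrete is to score both sets of risk estimates against what actually happened. The sketch below uses the Brier score (mean squared error between a predicted probability and the 0/1 outcome, lower is better). This scoring rule is my choice, not something the argument above prescribes. The numbers are made up purely for illustration, and the names `brier_score`, `surgeon_risk` and `model_risk` are hypothetical.

```python
def brier_score(predicted_risks, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Lower is better; 0.0 would be a perfect forecaster.
    """
    if len(predicted_risks) != len(outcomes):
        raise ValueError("need exactly one prediction per outcome")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_risks, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Made-up illustration data: 1 = complication occurred, 0 = it did not.
outcomes = [0, 0, 1, 0, 1, 0, 0, 1]

# Hypothetical pre-operative risk estimates for the same eight cases.
surgeon_risk = [0.10, 0.30, 0.40, 0.20, 0.50, 0.40, 0.10, 0.60]
model_risk = [0.10, 0.20, 0.60, 0.10, 0.70, 0.30, 0.10, 0.70]

surgeon_brier = brier_score(surgeon_risk, outcomes)
model_brier = brier_score(model_risk, outcomes)

# The model only has to match or beat the surgeon, not be perfect.
model_is_good_enough = model_brier <= surgeon_brier

print(f"surgeon Brier score: {surgeon_brier:.4f}")
print(f"model Brier score:   {model_brier:.4f}")
print(f"model at least as accurate: {model_is_good_enough}")
```

In practice you would run this comparison on held-out cases that were not used to fit the model, and on far more than eight of them, before trusting the result.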