
TL;DR: What is the state of the art for one-shot or few-shot longitudinal (time-series) machine vision tracking of object boundaries with ~1 pixel (~10 μm) precision?

Specifics: I'm tracking the edges of the knee meniscus in time-lapse video (~1000 frames) to measure its deformation under load, in the context of research to prevent osteoarthritis. Due to material rotation and irregular geometry, background edges that started out occluded move into and out of view over time, which tends to confuse both machine vision algorithms and human labelers. Because the tracking feeds strain measurements, the tracked edge must be the same physical edge in all frames; it is therefore a foreground vs. slightly-less-foreground edge of the same material (low contrast), not foreground vs. background (much easier). Only few-shot approaches are likely to save time, because only ~20 specimens are needed for the immediate objective, and follow-up experiments will probably differ enough to require re-training. A rough sketch of what "the same edge in all frames" means for the strain bookkeeping is below.
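For concreteness, here is a minimal baseline of the kind that tends to fail on this data: seed points along the edge by hand in frame 0, track them with pyramidal Lucas-Kanade optical flow, and report engineering strain from the change in arc length. The file name, seed coordinates, and LK parameters are placeholders, not my actual setup.

    # Sketch: track hand-seeded edge points with Lucas-Kanade optical flow
    # and compute engineering strain from the change in total arc length.
    # Paths, seed points, and parameters are hypothetical.
    import cv2
    import numpy as np

    def arc_length(pts: np.ndarray) -> float:
        """Total polyline length of an (N, 2) array of ordered edge points."""
        return float(np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1)))

    cap = cv2.VideoCapture("timelapse.avi")  # hypothetical input video
    ok, frame = cap.read()
    assert ok, "could not read first frame"
    prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Hand-labeled points along the tracked edge in frame 0, ordered along
    # the boundary; shape (N, 1, 2) float32 as calcOpticalFlowPyrLK expects.
    pts = np.array([[[120.0, 340.0]], [[135.0, 352.0]], [[150.0, 360.0]]],
                   dtype=np.float32)
    l0 = arc_length(pts.reshape(-1, 2))

    lk_params = dict(winSize=(21, 21), maxLevel=3,
                     criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT,
                               30, 0.01))

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None,
                                                  **lk_params)
        if status.min() == 0:
            # A point was lost, e.g. when a background edge swings into view;
            # this is exactly where the tracker jumps to the wrong edge.
            break
        strain = arc_length(pts.reshape(-1, 2)) / l0 - 1.0
        print(f"engineering strain: {strain:+.4f}")
        prev_gray = gray
    cap.release()

Plain LK drifts or hops edges on low-contrast, appearance-changing boundaries like this one, which is why I'm asking about few-shot segmentation rather than classical tracking.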

The current plan is to Google "few-shot image segmentation" and try things until something works or the manual labeling effort finishes first, but maybe one of you knows a shortcut. Work is also ongoing to bypass the problem by enhancing edge contrast or using 3D imaging, but machine vision would be the most cost-effective solution.
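As a sketch of the contrast-enhancement route, local histogram equalization (CLAHE) followed by Canny is the obvious first thing to try; this assumes OpenCV, and the clip limit, tile size, and thresholds are illustrative guesses to tune per specimen:

    # Sketch of the edge-contrast-enhancement bypass: CLAHE then Canny,
    # to see whether the foreground vs. slightly-less-foreground edge
    # becomes separable. Values are starting points, not tuned.
    import cv2

    gray = cv2.imread("frame_0000.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frame
    clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(gray)
    edges = cv2.Canny(enhanced, threshold1=30, threshold2=90)
    cv2.imwrite("edges_0000.png", edges)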



