Basically, I have a big corpus of text (novels, as I'm interested in getting learners to read them) and a dictionary. I annotate the words using the dictionary, then give the LLM the surrounding context, the target word, and the candidate dictionary definitions as input, and let it select or score which definitions "apply" in that context. Finally, I tally the counts.
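To make the steps concrete, here is a minimal sketch of that pipeline in Python. Everything in it is an assumption for illustration: the toy `DICTIONARY`, the sense IDs such as `bank.1`, the `build_prompt` wording, and especially `select_senses_stub`, which is a keyword-matching stand-in for the real LLM call so that the example runs offline. It is not the actual code or prompt I use.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, List

# Hypothetical toy dictionary: word -> {sense_id: definition}.
# The sense IDs come from the dictionary, not from the model.
DICTIONARY: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank.1": "a financial institution",
        "bank.2": "the land alongside a river",
    },
}

Selector = Callable[[str, str, Dict[str, str]], List[str]]


def build_prompt(context: str, word: str, senses: Dict[str, str]) -> str:
    """Assumed prompt layout: context, target word, candidate definitions."""
    options = "\n".join(f"{sid}: {text}" for sid, text in senses.items())
    return (
        f"Context: {context}\n"
        f"Target word: {word}\n"
        f"Candidate definitions:\n{options}\n"
        "List the IDs of every definition that applies in this context."
    )


def select_senses_stub(context: str, word: str, senses: Dict[str, str]) -> List[str]:
    """Stand-in for the LLM call: picks senses by hard-coded cue words.

    A real selector would send build_prompt(...) to a model and parse the
    sense IDs out of its reply.
    """
    cues = {"bank.1": {"loan", "money"}, "bank.2": {"river", "water"}}
    tokens = set(re.findall(r"[a-z]+", context.lower()))
    return [sid for sid in senses if cues.get(sid, set()) & tokens]


def tally_senses(
    sentences: Iterable[str],
    dictionary: Dict[str, Dict[str, str]],
    select: Selector,
) -> Counter:
    """Count how often each dictionary sense is judged to apply."""
    counts: Counter = Counter()
    for sentence in sentences:
        for token in re.findall(r"[a-z]+", sentence.lower()):
            senses = dictionary.get(token)
            if senses is None:
                continue  # word has no dictionary entry, nothing to annotate
            for sense_id in select(sentence, token, senses):
                if sense_id in senses:  # ignore IDs the dictionary lacks
                    counts[sense_id] += 1
    return counts


if __name__ == "__main__":
    corpus = [
        "He took out a loan from the bank.",
        "They fished from the bank of the river.",
    ]
    print(tally_senses(corpus, DICTIONARY, select_senses_stub))
```

The selector is passed in as a function so the stub can be swapped for a real model call without touching the tallying. Dropping any returned ID that is not in the dictionary entry keeps the counts tied to the dictionary's own sense labels.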
The disambiguated senses are provided by the dictionary, so each sense keeps the same label throughout. Does that answer your question?
Are you asking the LLM to annotate the text and then counting the number of annotations?
How do you make sure that each disambiguation has a stable label throughout?