Check this: http://huyle.de/2019/02/12/accessing-capacitive-images/ As you can see, the sensor elements are huge, 4×4 mm each, i.e. there are only 15×27 sensors for the complete touch screen. On top of that, there’s a high amount of noise in the signal of each sensor.
It works OK in practice for two reasons: fingers have a very predictable shape, and there’s a lot of software involved at all levels of the stack. Touch screen firmware filters out sensor noise and generates touch points. Higher-level GUI frameworks “snap” touches to virtual buttons. Some platforms go as far as making virtual keyboard buttons different sizes, depending on which keys are expected to be pressed next according to predictive input software, i.e. dictionaries.
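To give a rough idea of how a touch point finer than the 4 mm grid can come out of such a coarse image, here is a minimal sketch of a thresholded weighted centroid. This is my own illustration, not what any particular touch controller firmware does; the function name, the `pitch_mm` and `threshold` parameters, and the sample readings are all made up.

```python
def touch_centroid(image, pitch_mm=4.0, threshold=0.0):
    """Return the (x, y) centroid in mm of all readings above threshold.

    image is a list of rows of raw sensor readings (hypothetical units).
    Returns None when no reading exceeds the threshold.
    """
    total = sum_x = sum_y = 0.0
    for row, values in enumerate(image):
        for col, value in enumerate(values):
            weight = value - threshold
            if weight <= 0:
                continue  # treat everything at or below the threshold as noise
            total += weight
            # Use the centre of each sensor cell as its position.
            sum_x += weight * (col + 0.5) * pitch_mm
            sum_y += weight * (row + 0.5) * pitch_mm
    if total == 0:
        return None
    return (sum_x / total, sum_y / total)


# A finger straddling two neighbouring cells, pressing harder on the right one.
sample = [
    [0, 0, 0],
    [0, 10, 30],
    [0, 0, 0],
]
print(touch_centroid(sample))  # (9.0, 6.0)
```

The result lands between the two cell centres (6 mm and 10 mm), closer to the stronger reading. That only works because a finger spreads its signal over several cells in a predictable way, and noise in the readings shifts the estimate, which is why the filtering matters.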
What you propose can probably be done by using a finger-like object, but I don’t expect the resolution to be great. At least not in comparison with hardware knobs: even cheap ones can be made extremely precise. See https://en.wikipedia.org/wiki/Rotary_encoder and https://en.wikipedia.org/wiki/Incremental_encoder for more info; both are used a lot in a wide variety of applications. Old ball mice had 2 of them. The reason ball mice sucked was not sensor precision, it was dirt accumulation, which is a minor issue for a knob.
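For comparison, reading an incremental encoder takes very little software. Below is a minimal sketch of quadrature decoding with a lookup table, assuming you already have clean (debounced) samples of the two output channels A and B. The names, the sign convention for direction, and the 24 pulses-per-revolution figure are my own assumptions for illustration, not taken from any specific part.

```python
# Step lookup indexed by (previous_state << 2) | current_state,
# where state = (A << 1) | B. Valid single-step transitions give +1 or -1;
# no change and invalid double-step transitions give 0.
_STEP = [
    0, +1, -1, 0,
    -1, 0, 0, +1,
    +1, 0, 0, -1,
    0, -1, +1, 0,
]


def decode_quadrature(samples):
    """Return the net count for a sequence of (A, B) samples (each 0 or 1)."""
    count = 0
    previous = None
    for a, b in samples:
        current = (a << 1) | b
        if previous is not None:
            count += _STEP[(previous << 2) | current]
        previous = current
    return count


def count_to_degrees(count, pulses_per_rev):
    """Convert a count to degrees, with 4 counts per pulse (both edges of A and B)."""
    return count * 360.0 / (4 * pulses_per_rev)


# One full quadrature cycle in each direction.
forward = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
backward = list(reversed(forward))
print(decode_quadrature(forward))   # 4
print(decode_quadrature(backward))  # -4
print(count_to_degrees(4, 24))      # 15.0
```

Each position is a clean digital state rather than an estimate from noisy analog readings, so the resolution is set by the encoder's pulse count and not by signal processing.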