Edit, in response to the edit: built-ins and slots are even faster, but the key point is that the ctypes-based class is actually slower than a class that simply uses ordinary Python variables.
The reason ctypes is slower is that addition isn't defined on c_int objects, so each operation has to convert them back to Python ints (through .value) and wrap the result again. This can be avoided by using ctypes to call a C function that performs the addition.
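To make the c_int point concrete, here is a minimal sketch using only the standard ctypes module. The variable names are just for illustration and are not taken from the question's class:

```python
import ctypes

a = ctypes.c_int(2)
b = ctypes.c_int(3)

# c_int does not implement "+", so adding two of them fails outright.
try:
    a + b
except TypeError as exc:
    print("TypeError:", exc)

# The workaround: unwrap to Python ints, add, then wrap the result again.
# This unwrap/wrap round trip happens on every single operation.
total = ctypes.c_int(a.value + b.value)
print(total.value)
```

Every arithmetic step pays for the round trip through Python ints on top of the addition itself, which is where the extra time goes.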
I stand by my assertion that ctypes along with an external C library is a great way to do Python speed-ups. It's very simple to do; see here:
This is the kind of optimization I usually use ctypes for, along with interfacing with third-party shared libraries.
ctypes + C code can be quite efficient, but you have to write the entire fast path in C, not flip-flop between C and Python. It works best when you have one operation that needs to be fast (say, serving the most common path on a webserver, or running a complex matrix operation in NumPy), plus a bunch of additional features that are only called once in a while.
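As a rough sketch of "cross into C once and stay there", the snippet below calls the C standard library's strlen through ctypes. It assumes a POSIX-like system where ctypes can load libc; strlen is only a stand-in for whatever fast-path function you would write yourself:

```python
import ctypes
import ctypes.util

# Assumption: a POSIX-like system. If find_library returns None,
# CDLL(None) still exposes the libc symbols already loaded in the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

data = b"x" * 1_000_000

# One call across the Python/C boundary; the loop over a million bytes
# runs entirely inside C.
n = libc.strlen(data)
print(n)
```

The same pattern applies to your own shared library: declare argtypes and restype once, hand over the whole buffer, and let the C function do all of the looping instead of calling into C once per element.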