then you have to rewrite your whole app to use asyncio keywords and colored ORM methods. A gevent monkey patch, or eventually nogil concurrency makes a lot more practical sense.
You don't have to rewrite your whole app - you can continue using the regular stuff in synchronous view functions, then have a few small async views for your LLM streaming pieces.
I've never quite gotten comfortable with gevent patches, but that's more because I don't personally understand them or what their edge cases might be than a commentary on their reliability.
> When running in a mode that does not match the view (e.g. an async view under WSGI, or a traditional sync view under ASGI), Django must emulate the other call style to allow your code to run. This context-switch causes a small performance penalty of around a millisecond.