Manual testing runs into the same problems. Just as coupled, messy code needs to be mocked to high heaven, to the point that it's hard to be sure you're actually testing the functionality, messy code is also really hard to test manually.
I worked at a place that had a 100% test coverage requirement, and the codebase was highly coupled, so there were mocks everywhere and the tests were ridiculously brittle. Manual testing was nearly a non-starter because it was a microservice architecture with a lot of moving pieces, and non-deterministic Docker builds failed with surprising regularity (to the point that people shipped development VMs around, much to the chagrin of the project "architect"). Getting the stars to align for docker-compose to produce a working test system, then populating it with whatever specific data that test case needed so you could actually walk through the flow, was a minor miracle.