When you experience such a situation, it means your tests are written at too low a level and are testing the implementation, when what you'd rather know is whether the system still works as expected.
One solution, when appropriate, is to use a ports-and-adapters architecture and move your unit tests up a level.
A bunch of related blog posts have been published on this, e.g.
You can do it well, bad, in harmful or pathological ways, in healthy ways, too much, too little, for ideological or misguided reasons, at the wrong time ... but if you stop doing it completely, you will eventually die.
Don't follow any dogmatic advice about it; instead, learn to recognize for yourself whether the way you currently do it is providing value, and why.
And finally, move on, and learn the big set of other skills needed to keep your code bases under control.
FWIW, I'd say that "writing tests using a tool like RSpec/Cucumber/etc." isn't really "doing BDD" by itself.
Tools like RSpec are a way of organising test code.
I'd recommend "Specification by Example".
IIRC the idea is to come up with example outputs of a feature and refine them into a specification. Tools like RSpec help align the test code with that specification.
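A minimal sketch of that examples-first idea in plain Ruby. The feature (a bulk discount) and all the numbers are invented for illustration; in practice you'd express the same examples with RSpec or Cucumber, which mainly add structure and reporting on top of the same assertions:

```ruby
# Hypothetical feature: bulk discount on order quantity.
# Toy implementation written to satisfy the agreed examples below.
def discount(quantity)
  return 0.10 if quantity >= 100
  return 0.05 if quantity >= 10
  0.0
end

# The concrete examples come first; refined, they *are* the specification.
EXAMPLES = [
  # [quantity, expected discount]
  [1,   0.0],
  [10,  0.05],
  [99,  0.05],
  [100, 0.10],
]

EXAMPLES.each do |quantity, expected|
  actual = discount(quantity)
  raise "discount(#{quantity}) was #{actual}, expected #{expected}" unless actual == expected
end
puts "all examples pass"
```

The table of examples doubles as documentation: anyone reading it can see what the feature is supposed to do without reading the implementation.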
> I think there are real problems with these BDD(-ish) test tools, as they obfuscate what you’re actually doing. ... The more layers you add on top of [assertions], the harder it will be to debug.
Sure, but the point of naming a test case (or adding in statements like `it("should ...")`) is to make it more apparent what a test is trying to do. I doubt that even something like Cucumber makes it that much harder to debug the program code.
Executable Specifications also try to be 'living documentation', which (if it works out) is a benefit worth the added cost of writing the tests that way.
Although there isn't a strict trade-off between code simplicity and testability, I have personally found that focusing on unit testing can take my focus away from thinking about simplicity and correctness-by-design. Sometimes I have to recalibrate: simplicity and correctness-by-design should be considered during the "refactor" phase of red-green-refactor.
The article could do more to emphasise that both TDD and BDD are "test first" methods. They are ways to develop code by first specifying what the code should do by writing tests. Working this way, you often end up with different kinds of tests (and maybe even more testable code?) than if you just apply a post-hoc "write unit tests for everything" approach. Roy Osherove often mentions this in his test reviews.
The article questions whether integration tests should be preferred over unit tests, claiming that unit tests don't verify that the program performs its main task correctly. This raises the question of how to allocate resources to testing, and of being clear about what your goals for testing are. This reminds me of the "Test Pyramid".
It seems to me that it's a good idea to have integration tests that verify that the program satisfies its use cases. But if you want assurance that each component of the code does what it claims to do, then you want unit tests. The middle ground is what J.B. Rainsberger has called 'integrated tests': tests that exercise several units together at once. J.B. argues that you're better off with unit tests, since achieving good coverage with integrated tests entails a combinatorial explosion.
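A back-of-the-envelope sketch of that combinatorial-explosion argument. The branch counts are made up for illustration; the point is only that isolated unit tests grow with the sum of the units' branches, while exhaustive integrated tests grow with the product:

```ruby
# Suppose one code path goes through three collaborating units,
# with 4, 3, and 5 branches respectively (invented numbers).
branches = [4, 3, 5]

# Unit tests cover each unit's branches in isolation -> the sum.
unit_tests = branches.sum               # 4 + 3 + 5 = 12

# An integrated test exercises one branch of every unit at once,
# so covering everything needs every combination -> the product.
integrated_tests = branches.reduce(:*)  # 4 * 3 * 5 = 60

puts "unit tests needed:       #{unit_tests}"
puts "integrated tests needed: #{integrated_tests}"
```

Add a fourth unit with a handful of branches and the product side gets dramatically worse while the sum side barely moves, which is the core of Rainsberger's argument.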
http://artofunittesting.com/ (see Test Reviews in the right sidebar)
J.B. Rainsberger - Integrated Tests Are A Scam: https://vimeo.com/80533536