Believe me, we still need ACID/MVCC and by no means do I want to break the consistency part of CAP theorem if possible.
Also set operations break down under the following conditions:
1. when you have to aggregate a LOT of data together they don't scale. If it scaled well we wouldn't need OLAP solutions and could do everything with live, perfectly normalised data. That is just not the way it is.
2. They really don't work for temporal or complex non-uniformly structured data at all well.
You say you still need ACID, but you advocate manually writing joins on top of an ORM layer which does not give you ACID and introduces massive latency. Using an ORM on top of an a key value store doesn't absolve you of ACID or having joins. You still do joins and transactions, but you need to write them manually and test correctness manually. Both are very expensive to do correctly and running some unit tests is not sufficient to show correctness at this level. You are writing these things without recognizing what they are.
The whole NoSQL movement is fundamentally about trading correctness for performance, and usually read performance at that. There is nothing else new about NoSQL databases and ironically many even use a variant of SQL for queries! SQL is just a language, not an implementation.
As for aggregates, yes they are hard. By doing it yourself you are not making the problem less hard. Rather, you are possibly implementing solutions that RDBMS's can't yet do. I suspect that with a reasonable relational design you could probably do whatever aggregates you need to in a high performance manner while still taking advantage of the work done in an RDBMS.
It sounds like you don't understand the requirements of your application. If you need full ACID, your solution with nosql + ORM + manually written joins and queries is failing you. Unless, of course, if you are also writing your own global lock or using some form of lock-free algorithm.
NHibernate, SQLAlchemy, Hibernate are where you need to look and compare.
The ORM does a few joins where a performance gain is possible but we avoid them though as they are expensive on IO and transfer.
Most of our tables match the UI 1:1 and we populate these from the domain model when it is mutated or when we can afford it to be done (either directly or via message queues). Effectively they are materialized views which are updated when data changes or before it is required. Google CQRS.
I'm not suggesting NoSQL - just caution when using RDBMS features. I'm not an advocate of NoSQL.
In all those cases, the devil remains in the details. For example if you're just looking to add one or two columns to the result set, an INNER or OUTER JOIN usually reduces the amount of data that has to be shoved through the pipe. That's because you avoid transmitting the keys that are needed to join the data in-app. On the other hand, if you're joining a small number of larger values to a table with lots of rows, perhaps the join duplicates those values enough to really bulk up the result set. But it's effectively impossible to know whether that will be the case without measuring the performance for a couple different options.
Here's an alternative consideration: ORMs tend to encourage a style of coding wherein every column in the table is always grabbed all the time, regardless of how much of that information is actually needed for the task at hand. This is done in order to be able to generate full-fledged objects like the ORM wants to, and to reduce the amount of custom mapping that needs to be maintained. Now if you're looking for a needless waste of bandwidth, there's a needless waste of bandwidth.
If you're consistently having performance problems on joins, it's likely that the real culprit is poor configuration of keys and indexes. Or if you're not watching how the ORM chooses to join the data too closely, perhaps you've got some N+1 queries sneaking in there.