
[sorry. replied earlier incorrectly]. one of the two transactions would abort in this case. snapshot isolation needs to check that any data mutated in the transaction were not also mutated externally. in the example you are replying to, there is no mutation, and so no problem, but in your example there is. see http://en.wikipedia.org/wiki/Snapshot_isolation

note that it's only mutated values that are checked for conflicts, and only against other mutations (this is why it is efficient - the number of checks required is small). so you can get weird behaviour when several values are read while different transactions each modify one of them - there's a good example in the link above. this is called "write skew".
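to make that conflict rule concrete, here's a toy pure-python model of it (the class names and structure are mine, not how any real engine is built): each transaction works on a private snapshot, and at commit only the keys it wrote are checked against other committed writes. two transactions that read the same rows but write different ones therefore both commit, which is exactly the write skew case:

    class Abort(Exception):
        pass

    class Store:
        def __init__(self, data):
            self.data = dict(data)
            self.version = {k: 0 for k in data}   # bumped on every committed write

    class Txn:
        def __init__(self, store):
            self.store = store
            self.snapshot = dict(store.data)      # private snapshot taken at start
            self.seen = dict(store.version)       # versions seen at start
            self.writes = {}

        def read(self, key):
            return self.writes.get(key, self.snapshot[key])

        def write(self, key, value):
            self.writes[key] = value

        def commit(self):
            # conflict check: only the keys this transaction wrote are compared
            # against other committed writes; plain reads are never checked
            for key in self.writes:
                if self.store.version[key] != self.seen[key]:
                    raise Abort("write-write conflict on %r" % key)
            for key, value in self.writes.items():
                self.store.data[key] = value
                self.store.version[key] += 1

    # application invariant: x + y must stay >= 0
    db = Store({"x": 100, "y": 100})
    t1, t2 = Txn(db), Txn(db)

    if t1.read("x") + t1.read("y") >= 200:
        t1.write("x", t1.read("x") - 200)         # withdraw 200 against x
    if t2.read("x") + t2.read("y") >= 200:
        t2.write("y", t2.read("y") - 200)         # withdraw 200 against y

    t1.commit()
    t2.commit()      # disjoint write sets, so this commits as well
    print(db.data)   # {'x': -100, 'y': -100} -> x + y is now -200

under a truly serialisable schedule one of those two withdrawals would have had to see the other's result and fail the balance check.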

the whole approach, in a sense, exploits loose phrasing in the ansi sql-92 standard, which defines its isolation levels by the anomalies they forbid rather than by requiring true serialisation (as far as i understand things). so you can think of MVCC as "exploiting a loophole" that leads to a more efficient system, but one that is less intuitive. on the other hand, this is not new - it's already the standard behaviour for postgres, oracle, sql server, etc.




Thanks for the clarification. However, this does not sound like something I could use in practice.

If I have two transactions and one of them is aborted because of changes made by the other transaction, what am I as a developer supposed to do? Retry? I hope not, because a retry is, in other words, serializing the execution - one after another.

So, if I have a system with a lot of concurrency (e.g. bank accounts and transfers) I'd better not use NuoDB, because I'd get a lot of transfers aborted - not good. And if I have a system with very few write conflicts, I'd go for serializability, because that gives 100% consistency and will be fast anyway, since conflicts are rare.

From the Wikipedia link you posted, it is fairly clear that snapshot isolation is good when you don't need consistency. For consistency you'd have to either abort every time there is a conflict or introduce a write-write conflict (i.e. serialize).
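For illustration, one way to "introduce a write-write conflict" in the write-skew scenario is to make every transaction that reasons about a customer's combined balance also update a common per-customer row, so the write-write check can see the overlap. A rough sketch, assuming PostgreSQL accessed via psycopg2 at REPEATABLE READ (its snapshot isolation level); the connection string, tables and columns are invented for the example:

    import psycopg2

    conn = psycopg2.connect("dbname=bank")                  # hypothetical DSN
    conn.set_session(isolation_level="REPEATABLE READ")     # snapshot isolation in postgres
    cur = conn.cursor()

    # every withdrawal against alice's combined balance also writes this
    # one row, so two concurrent withdrawals now have overlapping write
    # sets and the second committer aborts instead of causing write skew
    cur.execute("UPDATE customers SET version = version + 1 WHERE name = %s",
                ("alice",))

    cur.execute("SELECT COALESCE(SUM(balance), 0) FROM accounts WHERE owner = %s",
                ("alice",))
    (total,) = cur.fetchone()
    if total >= 200:
        cur.execute("UPDATE accounts SET balance = balance - 200 WHERE id = %s",
                    (1,))
    conn.commit()    # a concurrent run of this block now conflicts on customers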


(1) serializability will not be "fast anyway". it will be slow. this is the problem with, for example, mongodb's global write lock.

(2) you would not get "lots" of aborts with bank accounts because each transaction is, typically, to a different account. you're only going to get a problem when two processes try to change the same person's account at the same time.

(3) what this provides is standard-compliant ACID sql, the same as postgres, oracle, sql server, etc. if you use any of those and don't have retry in your code for when transactions fail then you're already in a mess.
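for what it's worth, that retry is usually just a small loop around the transaction body - a rough sketch, assuming postgres accessed via psycopg2 and an accounts(id, balance) table (names, DSN and amounts are invented):

    import psycopg2
    from psycopg2 import errors

    def transfer(conn, src, dst, amount, attempts=5):
        # rerun the whole transaction if the database reports a
        # serialization conflict; transfers that touch unrelated
        # accounts never take this path
        for _ in range(attempts):
            try:
                with conn.cursor() as cur:
                    cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                                (amount, src))
                    cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                                (amount, dst))
                conn.commit()
                return
            except errors.SerializationFailure:
                conn.rollback()      # conflict: someone else changed the same rows
        raise RuntimeError("transfer kept conflicting, giving up")

    conn = psycopg2.connect("dbname=bank")                # hypothetical DSN
    conn.set_session(isolation_level="REPEATABLE READ")   # or SERIALIZABLE
    transfer(conn, src=1, dst=2, amount=50)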

i am not associated with this project, but what you're saying doesn't really make sense. as far as i can see, you're criticising it for being the same as everyone else in the standards-compliant sql world.


I'm not criticising it for being the same as everyone else. I'm just not sure how the whole thing works. I don't see enough information provided on the site, yet there are a lot of statements about providing full ACID and scalability.

Re 1) and 2): I think they are related. The serialization can be done at the account level; there is no need for a single global write lock. This also implies that if bank transfers are typically between different accounts, serialization will be needed about as rarely as the aborts would occur, so the system will perform equally well in both cases. However, serialization does not require retries and doesn't force application programmers to work around the problem.
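As a rough sketch of what account-level serialization could look like (assuming PostgreSQL via psycopg2 and an accounts(id, balance) table; names and the DSN are invented), each transfer locks just the two rows it touches with SELECT ... FOR UPDATE, so transfers on the same account queue up behind one another while everything else runs concurrently:

    import psycopg2

    def transfer(conn, src, dst, amount):
        with conn.cursor() as cur:
            # lock the two rows in a fixed order so concurrent transfers
            # on the same accounts wait here rather than abort
            balances = {}
            for account_id in sorted((src, dst)):
                cur.execute("SELECT balance FROM accounts WHERE id = %s FOR UPDATE",
                            (account_id,))
                balances[account_id] = cur.fetchone()[0]
            if balances[src] < amount:
                conn.rollback()
                raise ValueError("insufficient funds")
            cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                        (amount, src))
            cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                        (amount, dst))
        conn.commit()

    conn = psycopg2.connect("dbname=bank")   # hypothetical DSN
    transfer(conn, src=1, dst=2, amount=50)

The trade-off against the optimistic abort-and-retry approach is that a lock makes the second transfer wait rather than fail, which is what removes the need for retries.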



