What they should have done was wait an hour, then do an O(n) scan of all transactions globally in history to find the transaction by checking for parameters which exactly matched the ones they provided. That is, the Bitcoin developers now say, the correct use of the create transaction API.
Let me use an example programmers may be familiar with. Twilio lets you send SMS messages with three parameters: from_number, to_number, message. You are given back an SMS ID, which you can query to see the results of the SMS message (like, say, was it delivered successfully or did it fail with an error like "that telephone number did not exist").
Here's a discussion with Twilio in the bizarro world where it's like Bitcoin.
Me: "Hey Twilio I created an SMS message but when I try to query it for the results it 404s."
Them: "Are you sure you created the message?"
Me: "Yep pretty sure."
Them: "Are you sure you are looking for the right message ID in /messages/:id?"
Me: "Yep, I'm using the one that I got back when I created it."
Them: "Maybe it changed."
Me: "... What?"
Them: "Message IDs can change."
Me: "They don't usually change."
Them: "Of course, they don't usually change. Why have an ID if they usually changed? They only change some of the time."
Me: "What determines if a message ID changes?"
Them: "Oh, anyone globally can change your message IDs."
Me: "That sounds a bit insecure for a system which is, by its nature, deployed in a hostile environment."
Them: "Don't worry, they can't change after about an hour. Well, probably. It would be pretty expensive for an attacker to change them after an hour. Don't worry though, you'll never need an ID."
Me: "I find IDs useful for querying things. Like, say, messages. Which I have to do. To see whether the message was successful or not."
Them: "Well you're already downloading every message ever. Just scan through for one which matches the same from number, to number, and message contents."
Me: "... You're serious."
Them: "Don't worry though: they can't touch the from number, to number, or the message contents."
Me: "... Does this sound a little problematic to anyone else?"
Them: "It's on our wiki, noob!"
[Edit: Maybe somebody thinks I'm joking. Let me point you to one of the dangerous functions.
Parameters: <bitcoinaddress> <amount> [comment] [comment-to]
Comments: <amount> is a real and is rounded to 8 decimal places. Returns the transaction ID <txid> if successful.
You should naturally, upon reading this documentation, figure "I should immediately discard that transaction ID, because it could be changed instantaneously after this message call. If I instead rely on that transaction ID, I will allow malicious users to break the software I am building."]
> Don't worry, they can't change after about an hour. Well, probably. It would be pretty expensive for an attacker to change them after an hour.
You use the term "pretty expensive" here without qualifying it. Changing a transaction encoded in the blockchain would require outpacing the current hashrate of the bitcoin network. That would require a significant hardware investment, on the order of tens of millions of dollars.
> Well you're already downloading every message ever. Just scan through for one which matches the same from number, to number, and message contents.
You make it sound as if you wouldn't have to do this if you had the transaction hash. You still need to iterate through the transactions regardless. It's just a question of whether you use the transaction hash, or derive your own from the parts of the transaction that are immutable.
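To make "derive your own" concrete, here is a rough sketch of the idea. It uses a deliberately simplified transaction model, not Bitcoin's real wire format, and the names `TxInput`, `TxOutput`, `Tx`, `full_id` and `normalized_id` are all made up for illustration. The assumption is that the signature script is the part a third party can alter, while the spent outputs, destinations and amounts stay fixed.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TxInput:
    prev_txid: str     # transaction whose output is being spent
    prev_index: int    # which output of that transaction
    script_sig: bytes  # signature script: assumed to be the malleable part


@dataclass(frozen=True)
class TxOutput:
    address: str
    amount_satoshi: int


@dataclass(frozen=True)
class Tx:
    inputs: List[TxInput]
    outputs: List[TxOutput]


def full_id(tx: Tx) -> str:
    """Hash over everything, including the malleable signature scripts."""
    h = hashlib.sha256()
    for i in tx.inputs:
        h.update(f"in:{i.prev_txid}:{i.prev_index}:".encode())
        h.update(i.script_sig + b";")
    for o in tx.outputs:
        h.update(f"out:{o.address}:{o.amount_satoshi};".encode())
    return h.hexdigest()


def normalized_id(tx: Tx) -> str:
    """Hash over only the parts a third party cannot change."""
    h = hashlib.sha256()
    for i in tx.inputs:
        h.update(f"in:{i.prev_txid}:{i.prev_index};".encode())
    for o in tx.outputs:
        h.update(f"out:{o.address}:{o.amount_satoshi};".encode())
    return h.hexdigest()
```

Two copies of a transaction that differ only in `script_sig` get different `full_id` values but the same `normalized_id`, so the second is the one you'd key your bookkeeping on.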
Let's make your example a touch more realistic:
Them: "Has the message been delivered?"
Me: "I don't think so. I'm querying it shortly after I create it."
Them: "How are you querying it?"
Me: "With the message hash."
Them: "Ah, that explains it, then. A pending message may be changed before it's delivered, altering the hash. This makes the hash unsuitable for identifying pending messages."
Me: "So how do I identify messages?"
Them: "Ideally you wait until they're delivered, but if you really need to check for pending messages, you can search through them looking for a message that matches on to, from and content."
Me: "That kinda sucks."
Them: "We know, but it's a difficult issue to fix. It's documented in our wiki."
Me: "What if I don't read your wiki, or follow your mailing list?"
Them: "Then should you really be running an exchange handling millions of dollars of transactions?"
Me: "... Good point."
You'd think they'd have at least one guy dedicated to nothing but breaking their software. They make my salary every day with transaction fees (well, maybe until recently) so you can't say they're unable to afford it.
Edit: Rereading this, it sounds more accusatory than I intended. I think your clarification was perfect, but at the same time I think that MtGox is at fault.
However, the hash over the malleable part is still protocol-significant: which exact incarnation of the isomorphic transaction is being passed around or cemented into blocks. So this new stable ID would be in addition to the older one, and might not even be necessarily expressed inside the protocol: it might just be a convention, and could vary across independent implementations.
The MtGox statement was a plea for the community to converge on such a consensus identifier before MtGox commits to a local fix. But that's not strictly technically necessary, so their stance looks like a strategy for blame-shifting and further delay. The Bitcoin core people don't like to rush into things.
So what happens to people who aren't running an exchange handling millions of dollars of transactions? It doesn't matter if they get screwed by this flaw?
I read much of the wiki and never encountered any reference to transactional malleability.
The point of bitcoin is being able to do it yourself and not rely on centralized institutions.
The bitcoin reference client seems to get confused by this. It seems to allow additional spending of the unconfirmed change addresses and forms a chain of double spent transactions. The bitcoin balance as reported by 'getbalance' also becomes unreliable as it computes the balance incorrectly. Eventually the wallet stops working.
It wasn't Twilio, but it turned out that when we submitted an SMS message of over 160 characters, the provider split it into 160-character chunks and sent them out as multiple SMS messages.
So far, so normal. But what happened when the first chunk sent successfully and the second chunk failed?
We got back a notification to say "MessageID: 4ACB-etc Result: OK" but the customer never got the message, and scanning the report on the provider's site showed the customer number, time and message as having failed.
But then the representative agreed it was a problem and set out to fix it rather than blaming our dependence on the ID!
I find it quite ridiculous that people are trying to lay the blame on not reading an obscure wiki page. I remember reading much of the bitcoin wiki myself and never seeing ANYTHING about not relying on transaction IDs. The API list doesn't even warn you about it.
Why bother returning a transaction ID if it is spoofable? That is simply misleading.
I guess it shows you how biased all the bitcoin backers are.
You don't have to scan all transactions: only those from a firm reference-point of available-funds state, essentially the same point that was used to compose the outbound transaction.
Robust software already has to examine all incoming confirmed-in-block transactions for whether those transactions have consumed prior funds. If they have, even if the local software had as its design goal exclusive control of those funds, the local software must adapt to the new information. (Given the possibility of backups/virtualization-clones/private-key-exports, software must always be open to the possibility another node elsewhere has spent pending funds first.)
So safety against this particular mischief is possible with the same practice that's necessary for other reasons: it doesn't involve extra work.
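As a minimal sketch of that practice: instead of looking up the txid you were handed, watch confirmed blocks from your reference point onward for any transaction that consumes the outputs you tried to spend. The block and transaction shapes below (dicts with `height`, `txs`, `txid`, `spends`) and the function name `find_confirmed_spend` are hypothetical, chosen only to illustrate the idea; a real node would read these from its own block index.

```python
from typing import Dict, List, Optional, Set, Tuple

# (txid of the funding transaction, output index)
Outpoint = Tuple[str, int]


def find_confirmed_spend(
    blocks: List[Dict],
    start_height: int,
    our_outpoints: Set[Outpoint],
) -> Optional[Tuple[int, str]]:
    """Return (height, txid) of the first confirmed transaction at or after
    start_height that spends any of our_outpoints, or None if none has yet."""
    for block in blocks:
        if block["height"] < start_height:
            continue  # before the reference point: nothing to learn here
        for tx in block["txs"]:
            if our_outpoints & set(tx["spends"]):
                return block["height"], tx["txid"]
    return None
```

If the txid that comes back differs from the one you were given at send time, the funds were still consumed: either by a mutated copy of your transaction or by a spend from another copy of the wallet. Comparing the outputs tells you which.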
Also, it's not "an hour" that lets a node know when it can rely on transaction-state, but block-confirmations, a precise and observable transition. One block is almost always enough, but each additional block adds more certainty. Still, all Bitcoin software already needs to handle occasional orphaned blocks and short forks, so being sensitive to periods of uncertainty is an essential part of all implementations, not extra work because of this one gotcha.
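For a rough sense of how "each additional block adds more certainty", the calculation in section 11 of the Bitcoin whitepaper can be transcribed into a few lines. This is a sketch of that model only: `q` is the attacker's assumed share of total hash power (pick your own value), `z` is the number of confirmations, and the `confirmations` helper is my own naming for the usual "blocks on top, counting the one containing the transaction" convention.

```python
import math


def confirmations(tip_height: int, tx_block_height: int) -> int:
    """Number of confirmations: the containing block counts as the first."""
    return max(0, tip_height - tx_block_height + 1)


def attacker_success_probability(q: float, z: int) -> float:
    """Probability that an attacker holding fraction q of the hash power
    ever catches up from z blocks behind (whitepaper section 11 model)."""
    p = 1.0 - q
    lam = z * (q / p)  # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total
```

Under this model the probability falls off roughly exponentially as `z` grows, which is why the threshold is a number of blocks rather than a wall-clock interval.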
A better analogy than Twilio would be commercial payment systems: there you need systems that keep checking for chargebacks or reversals for weeks or months.
But an even better analogy than proprietary pay-per-use payment systems is SMTP or BitTorrent. The system is an emergent mess anyone can plug into. There are a lot of sharp edges, and even with great care, you're going to hit some painful and costly bugs. Those building billion-dollar businesses on such systems need to be experts, and will still take some arrows, but each incident that doesn't kill the software/business stacks only leaves them stronger.
Oh, you mean the scanning they have to do already, to verify "all transactions globally in history"? Inspecting all parameters on all transactions since the genesis, like you'd already have to do to verify they are not stealing or creating money from nothing? The inspection you have to do just to locate even the same transaction you submitted to the network yourself, to verify it was accepted? And you have to spend like 10 whole seconds of CPU time doing this, per ~10 minutes that a new block comes out, verifying the transactions from the last 10 minutes? Golly, that is sooooo much more onerous than just running the blockchain securely! /sarcasm
I'll agree that it's embarrassing, misleading, not documented well, and not gracefully handled by the community now that everyone points their fingers at each other. But you are deliberately making it sound worse by re-describing standard parts of the bitcoin protocol, as if they are new requirements in order to get a sane ID. Anyone writing financial software should be more than capable of quickly adding a few function hooks into the existing process to get a deterministic normalized ID, and the amount of extra computing resources is negligible compared to what you already have to do, just to use bitcoin safely.
sendtoaddress didn't always return a transaction id. It was changed to do that to facilitate bookkeeping. Sort of ironic.
There seem to be two prevailing extremes of opinion which appear a lot on Hacker News and many other places, as extremes are wont to do while those in the middle don't feel strongly enough to contribute. Those are 1) BitCoin will replace government control of money and fix freedom, dude! and 2) What fucking morons, can't wait until you crash and burn.
I love this whole thing. It's fascinating, it's an interesting solution to a problem, and watching DogeCoin take off is fun. In my opinion, BitCoin is kind of like when a naïve programmer decides to rewrite an existing library themselves, and comes up against the brutal reality that led the original developers to the compromises and apparently necessary hacks to get the thing working. The analogy here being regulation, insurance, all that jazz. It's educational, and I haven't been this interested in a technology for a while.
I'm picking on your reply here because it is one of many that exemplifies a "haha told you so" rather than really digging into the interesting technical and sociological aspects.
* Message IDs from one API do not equate to IDs from the other, so your example is a bit flawed; there's no way to check with Twilio except the ID.
* Is "ID" even the term used? I don't know for sure, but "Tx Hash" seems to be more widely spread. [Edit: patio11 edited his comment while I was typing mine; I withdraw this point!]
* "It's on our wiki noob" - someone running the 3rd largest exchange should hardly be a "noob"
* "O(n) scan of all transactions globally" - that's not particularly hard, nor is it necessary (why scan all transactions from all time?), nor is it unexpected (the entire thing requires everyone to have the complete ledger, so you have the data anyway)
There are valid points to be made that the BitCoin protocol needs improvements, and these are even acknowledged by the core devs. This whole situation is a bit ludicrous. But I wish we were talking about "what have we learned", not "told you so".
When it comes to BitCoin, the conversation seems to be full of radicals and optimists when success happens, and gloaters when it doesn't. I don't feel either adds to the conversation; we could be talking about how to improve this as a currency or (as I believe the long term actual application to be) how this can influence distributed trust, which is especially important in the current climate.
People often deploy the word FUD to describe arguments about technology which have no basis in technical fact. Can you identify statements which I've made about Bitcoin which have no basis in technical fact?
As for the second, FUD was then incorrect. Your points were factual. I believe they ignored certain other facts for the convenience of argument (like, most exchanges seemed to know about it). But FUD was the incorrect term.
I'm a bit sad to see my main argument derailed by semantic failures. I guess I need to learn a lot about debating on the internet.