
Bitcoin Transaction Malleability - eklitzke
https://eklitzke.org/bitcoin-transaction-malleability
======
clarkmoody
> _These txids are immaterial to how the Bitcoin blockchain works: their
> primary use is as a convenience for humans when referring to transactions._

This is incorrect. Each Bitcoin transaction input references a previous
transaction output as the txid+output index. Transactions spending unconfirmed
outputs are orphaned when the parent is malleated and confirmed.
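
A minimal sketch of that dependency, in Python. The serialization here is a toy (not Bitcoin's wire format), and `toy_txid`, `spends`, and the byte strings are hypothetical names made up for illustration. The only parts taken from Bitcoin are that a txid is a double SHA-256 of the serialized transaction, conventionally displayed byte-reversed, and that an input names its parent by txid plus output index.

```python
import hashlib

def toy_txid(serialized_tx: bytes) -> str:
    """Double SHA-256 of the serialized transaction, shown byte-reversed."""
    digest = hashlib.sha256(hashlib.sha256(serialized_tx).digest()).digest()
    return digest[::-1].hex()

# Toy parent transaction. The signature part can be re-encoded by a third party.
parent_body = b"pay 1 BTC to Bob"
parent_original = parent_body + b"|sig:3045"     # encoding the sender broadcast
parent_malleated = parent_body + b"|sig:003045"  # hypothetical re-encoding, same meaning

# The child spends output 0 of the parent, identified by (txid, index).
child_input = (toy_txid(parent_original), 0)

def spends(outpoint, confirmed_parent: bytes) -> bool:
    """True if the outpoint refers to the parent that actually confirmed."""
    return outpoint[0] == toy_txid(confirmed_parent)

print(spends(child_input, parent_original))   # child is valid if the original confirms
print(spends(child_input, parent_malleated))  # child points at a txid that never confirmed
```

If the malleated parent is the one that gets mined, the child's input refers to an output that does not exist on chain, so the child can never confirm.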

Also, as a data hash with no checksum, txids are not convenient for humans at
all.

> _Transaction malleability is already more or less fixed in Bitcoin_

A couple months ago, there was a significant malleability attack on the
Testnet, in which nearly every transaction was malleated as it was included in
a block.

~~~
nullc
It's also confused about the sources of malleability, listing DSA sign and DER
encoding (which you note it calls asn.1) as the only sources; unfortunately
there are a dozen of them... and as we came up with workarounds for some, we'd
find more. This is why a complete fix was needed rather than a series of
hacks.
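
The "DSA sign" source can be sketched in a few lines of Python: for an ECDSA signature `(r, s)`, the pair `(r, N - s)` also verifies, so anyone can flip it without knowing the key. This is a from-scratch toy over secp256k1, not the code Bitcoin uses; the private key `d`, the fixed nonce `k`, and the message are illustrative assumptions, and a fixed nonce is insecure outside a demo.

```python
import hashlib

# secp256k1 domain parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1] % P, -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)

def mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def sign(z, d, k):
    """Textbook ECDSA signing of digest z with key d and nonce k."""
    r = mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * d) % N
    return r, s

def verify(z, pub, r, s):
    """Textbook ECDSA verification."""
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    pt = add(mul(z * w % N, G), mul(r * w % N, pub))
    return pt is not None and pt[0] % N == r

d, k = 0xC0FFEE, 0xDECAF             # toy key and toy nonce (assumptions)
pub = mul(d, G)
z = int.from_bytes(hashlib.sha256(b"toy transaction").digest(), "big")
r, s = sign(z, d, k)

print(verify(z, pub, r, s))          # the signer's signature
print(verify(z, pub, r, N - s))      # the flipped signature, made without the key
```

Both calls report a valid signature, but the two encodings differ, so a transaction carrying one hashes to a different txid than a transaction carrying the other.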

~~~
abrkn
For the uninformed, nullc knows a thing or two about Bitcoin[1]

[https://github.com/bitcoin/bitcoin/commits/master?author=gma...](https://github.com/bitcoin/bitcoin/commits/master?author=gmaxwell)

------
f9beb4d9
> However, OpenSSL did not do strict validation of the ASN.1 data by default

The more interesting problem was that this was non-deterministic: you could
encode fields with 64-bit integers and they would bomb out on 32-bit systems.
ASN.1 is also mind-bogglingly complex: you can encode, to arbitrary depths,
completely nonsensical things like negative numbers and strings, or containers
of multiple elements, and none of the implementations manage to decode blocks
the same way or adhere to the same limits.

~~~
nullc
We've identified ~some~ of OpenSSL's strange behaviors and documented them for
the purpose of making a bug compatible implementation
([https://github.com/bitcoin-core/secp256k1/tree/master/contri...](https://github.com/bitcoin-core/secp256k1/tree/master/contrib)),
which required that:

\- All numbers are parsed as nonnegative integers, even though X.690-0207
section 8.3.3 specifies that integers are always encoded as two's complement.

\- Integers can have length 0, even though section 8.3.1 says they can't.

\- Integers with overly long padding are accepted, violating section 8.3.2.

\- 127-byte long length descriptors are accepted, even though section
8.1.3.5.c says that they are not.

\- Trailing garbage data inside or after the signature is ignored.

\- The length descriptor of the sequence is ignored.

But some things were just too awful to implement, e.g.

\- Using overly long tag descriptors for the sequence or integers inside,
violating section 8.1.2.2.

\- Encoding primitive integers as constructed values, violating section 8.3.1.

This last is especially fun, in OpenSSL you can create a constructed value
(like a struct) of constructed values of constructed values of strings.. and
it will just concatenate up all the bytes in the last level primitive elements
and treat the result as a number. ... but only if it's not more than 7 (IIRC)
levels deep.
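
A few of the accepted quirks listed above (overly long integer padding, an ignored sequence length, ignored trailing garbage) can be illustrated with a toy parser in Python. This is not the parser from the secp256k1 `contrib` directory; `der_int`, `der_sig`, and `lax_parse` are hypothetical helpers that handle only short-form lengths, written to show how several byte strings decode to the same `(r, s)`.

```python
def der_int(value: int, extra_padding: int = 0) -> bytes:
    """Encode a nonnegative integer as a DER INTEGER, optionally over-padded."""
    body = value.to_bytes((value.bit_length() + 7) // 8 or 1, "big")
    if body[0] & 0x80:
        body = b"\x00" + body               # the one zero byte strict DER requires
    body = b"\x00" * extra_padding + body   # superfluous zeros (not strict DER)
    assert len(body) < 128, "sketch only handles short-form lengths"
    return b"\x02" + bytes([len(body)]) + body

def der_sig(r: int, s: int, pad_r: int = 0) -> bytes:
    """Wrap two INTEGERs in a SEQUENCE, as an ECDSA signature is encoded."""
    body = der_int(r, pad_r) + der_int(s)
    assert len(body) < 128, "sketch only handles short-form lengths"
    return b"\x30" + bytes([len(body)]) + body

def lax_parse(sig: bytes):
    """Lax toy parser: ignore the sequence length, read two nonnegative
    integers, and ignore anything that follows them."""
    pos = 2                                 # skip the 0x30 tag and its length byte
    values = []
    for _ in range(2):
        assert sig[pos] == 0x02, "expected an INTEGER tag"
        length = sig[pos + 1]
        values.append(int.from_bytes(sig[pos + 2:pos + 2 + length], "big"))
        pos += 2 + length
    return tuple(values)

r, s = 0x1234, 0x80FF
strict = der_sig(r, s)
padded = der_sig(r, s, pad_r=3)             # overly long padding on r
trailing = strict + b"\xde\xad"             # garbage after the signature
bad_len = strict[:1] + b"\x00" + strict[2:] # wrong sequence length

for variant in (strict, padded, trailing, bad_len):
    print(variant.hex(), lax_parse(variant) == (r, s))
```

Every variant parses to the same pair, yet each is a different byte string, which is exactly what lets a third party change a txid without touching the signed data.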

~~~
richardwhiuk
What's the bar here for 'too awful to implement', out of curiosity?

------
kens
There's a list of nine different types of malleability here:
[https://github.com/bitcoin/bips/blob/master/bip-0062.mediawi...](https://github.com/bitcoin/bips/blob/master/bip-0062.mediawiki)

And if you want to see what a malleability attack looks like at the byte
level, I analyzed one three years ago:
[http://www.righto.com/2014/02/bitcoin-transaction-malleabili...](http://www.righto.com/2014/02/bitcoin-transaction-malleability.html)

------
Uptrenda
This is a very complicated way to explain TX malleability. I'd say that the
problem is that signatures only sign a portion of the transaction, while the
resulting TXID that is used in the blockchain is based on hashing the entire
transaction.

So the signature can be mutated as the author suggests, but the signature
doesn't cover the entire transaction anyway: the scriptSig, the section where
data is provided to the redeemScript to satisfy its conditions, includes the
sig, which cannot sign itself.

So with the scriptSig, anyone is free to add whatever new data they like to
this section which gets added to the input stack, and as long as you leave the
stack the same way as you found it you can insert any arbitrary junk as you
like and it will change the resulting TXID as seen in the blockchain without
invalidating the transaction.
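
The mismatch between what is signed and what is hashed can be sketched as a toy model in Python. Real Bitcoin signature hashing is more involved than this; `signed_digest`, `txid`, and the placeholder byte strings are assumptions for illustration only.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def signed_digest(body: bytes, script_sig: bytes) -> bytes:
    """What the signer commits to in this toy model: the body only.
    script_sig is deliberately ignored, since a signature cannot cover itself."""
    return sha256d(body)

def txid(body: bytes, script_sig: bytes) -> str:
    """What identifies the transaction: a hash over everything."""
    return sha256d(body + script_sig).hex()

body = b"inputs+outputs"
original = b"<sig><pubkey>"
tampered = b"<junk><drop>" + original   # stack-neutral junk added by a third party

# The signature still matches, because the signed digest is unchanged...
print(signed_digest(body, original) == signed_digest(body, tampered))
# ...but the identifier is different.
print(txid(body, original) == txid(body, tampered))
```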

This is a bad thing for "smart contracts" on Bitcoin because many contracts
depend on making chains of unconfirmed, future transactions, based on hashing
the entire transaction to compute its TXID (as Clarkmoody suggests.) An
example of this is a cross-chain contract where you might want to send funds
to an address partially shared between a stranger and yourself, and you need a
way to set up a time-locked refund in case the protocol doesn't succeed (no
longer necessary due to OP_CHECKLOCKTIMEVERIFY, but it's an example.)

To do refunds in this way you would need to be able to sign chains of
unconfirmed transactions without previous transaction IDs being changed from
transaction malleability. Bitcoin does include a fix for this called
"segregated witness" but the fix has been controversial. I don't keep up to
date with the "scaling progress" now but I doubt it has been merged yet.

~~~
uncoder0
Segregated Witness is almost locked in through BIP91.

[https://www.xbt.eu/](https://www.xbt.eu/)

~~~
rothbardrand
I hope you are right. There is reason to believe that some of the signaling
for BIP91 is false (e.g.: Bitmain has announced intentions to do a fork
[http://bitcoincash.org](http://bitcoincash.org) /
[http://bitcoinabc.org](http://bitcoinabc.org) )

I expect they are serious and we will have a fork into BCC (Bitcoin Cash) and
BTC (Bitcoin).

Certainly on the BTC chain segwit will be locked in. Whether it claims the
mantle of "bitcoin" or not is to be seen.

~~~
fpgaminer
Bitmain has stated that they will throw their hashrate at BitcoinABC _only_ if
BIP91 fails. It's their "big red button".

------
dfox
One thing that strikes me as weird is the reference to ASN.1. I always thought
that bitcoin only uses DER encoding for the signatures themselves (because
that is what is usual for ECDSA, even though it is suboptimal for multiple
reasons) and the rest of the protocol including transaction format is
specified in terms of bytes and varints. Have I missed something?

~~~
saurik
I thought that was the entire point (though it is possible that I
misunderstood myself): that the transaction identifier is formed by taking a
hash of the entire transaction _and the signature_ (which, of course, could
not have been signed); if anything in the data being signed were modified then
this would be a very different issue, so the only options for malleability are
the signature and any structure connecting the signature to the data.

~~~
f9beb4d9
The script language itself is malleable due to being executed. `NOP NOP NOP
PUSHDATA` has the same result as `PUSHDATA`, despite having different bytes
and a different resulting hash. The `PUSHDATA` opcodes are also malleable in
themselves: you can use a `PUSHDATA2` (the next 2 bytes give the length of the
data to push) or a `PUSHDATA4` (the next 4 bytes give the length) and get
exactly the same output. These can largely be fixed with policy, but a lot of
use cases add this behaviour back in. Segregated Witness simply doesn't
include this data in the TXID hash (but it is hashed in a commitment for the
block to avoid other attacks).
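
A toy interpreter in Python makes the script-level malleability concrete. It supports only `OP_NOP` and the push opcodes, using their real opcode values and little-endian length fields; `run` and the sample data are hypothetical, and everything else about Script is left out.

```python
import hashlib

OP_NOP, OP_PUSHDATA1, OP_PUSHDATA2, OP_PUSHDATA4 = 0x61, 0x4C, 0x4D, 0x4E
LENGTH_WIDTH = {OP_PUSHDATA1: 1, OP_PUSHDATA2: 2, OP_PUSHDATA4: 4}

def run(script: bytes) -> list:
    """Execute a script made only of pushes and OP_NOP; return the final stack."""
    stack, i = [], 0
    while i < len(script):
        op = script[i]
        i += 1
        if op == OP_NOP:
            continue                       # does nothing, but changes the bytes
        if 1 <= op <= 0x4B:
            size = op                      # direct push: the opcode is the length
        elif op in LENGTH_WIDTH:
            width = LENGTH_WIDTH[op]       # length field of 1, 2 or 4 bytes
            size = int.from_bytes(script[i:i + width], "little")
            i += width
        else:
            raise ValueError(f"opcode {op:#x} not supported in this sketch")
        stack.append(script[i:i + size])
        i += size
    return stack

data = b"\xde\xad\xbe\xef"
variants = [
    bytes([len(data)]) + data,                                       # direct push
    bytes([OP_NOP] * 3) + bytes([len(data)]) + data,                 # NOP NOP NOP push
    bytes([OP_PUSHDATA2]) + len(data).to_bytes(2, "little") + data,  # PUSHDATA2
    bytes([OP_PUSHDATA4]) + len(data).to_bytes(4, "little") + data,  # PUSHDATA4
]

stacks = [run(v) for v in variants]
hashes = {hashlib.sha256(v).hexdigest() for v in variants}
print(all(st == [data] for st in stacks))  # every variant leaves the same stack
print(len(hashes))                         # number of distinct script hashes
```

All four scripts leave the same single item on the stack, while their bytes, and therefore any hash computed over them, differ.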

~~~
saurik
For purposes of my intentionally super-zoomed-out view of this (as I think
that is what is most valuable from a cryptography perspective), the script is
part of the "structure connecting the signature to the data".

