Challenge Transaction
Introduction
Bitcoin and other cryptocurrencies provide the ability to sign an arbitrary message using a private key, so that anyone can verify the signature against the corresponding public key. This is used for a variety of tasks -- proving control over funds, signing released binaries, etc. But this facility only works for funds that are controlled by a single public/private keypair. The task is complicated if the funds are controlled by a smart contract, and extracting such a keypair is a form of the "halting problem" if the smart contract is a Turing-complete program.
But this problem can be solved very simply, given existing machinery to sign and validate transactions -- just produce a transaction that cannot be committed to the blockchain (it is invalid), and have the responder sign (solve) it as it would any other transaction. With this technique, any type of security available on the blockchain, such as pub/priv keypairs joined by boolean logic, multisig, or something much simpler or more complex, is available to act as an authorization key.
This blockchain-invalid transaction is called a "challenge" transaction.
Challenge Transaction Formulation
Making an Invalid Transaction
While there are many ways to make an invalid transaction, the challenge transaction should be obviously and minimally invalid. Using an obvious mechanism minimizes the possibility that client code could be tricked (or a bug exploited) into signing a valid transaction, and ensures that the transaction will be invalid forever. Using a minimal mechanism minimizes the likelihood that existing signing code will be disrupted.
A challenge transaction MUST contain an nVersion field with its most-significant-bit (MSB) set (e.g. 0x80). A transaction whose nVersion has the MSB set is invalid, but otherwise has the same form as the transaction with that bit unset. Therefore, checking for an invalid transaction consists of checking that the transaction nVersion is >127.
Note that Nexa only allows the nVersion field to be specifically defined values (currently only 0), so all versions with the MSB set are currently invalid.
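The version check above can be sketched in a few lines. This is a minimal illustration, assuming nVersion has already been decoded as a single unsigned byte (which the ">127" test implies); the function name is hypothetical and not part of any wallet API.

```python
def is_challenge_version(n_version: int) -> bool:
    """Return True if a one-byte nVersion has its most significant bit set."""
    # Assumption: nVersion is a single unsigned byte, so valid values are 0..255.
    if not 0 <= n_version <= 0xFF:
        raise ValueError("nVersion is assumed to fit in one byte")
    # Testing bit 0x80 is equivalent to testing n_version > 127.
    return (n_version & 0x80) != 0
```

A signer would refuse to treat a transaction as a challenge unless this check passes, and a normal validator would reject any transaction for which it passes.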
Making a Unique Challenge
The challenge transaction MUST contain only a single output, and that output MUST be in data-carrier (amount 0 and an OP_RETURN script) format. This will further ensure that the transaction is invalid, since imported nonzero inputs will not balance.
After the OP_RETURN, the output MUST contain a push of unique identifying information (such as the FQDN or IP address as defined by the encompassing protocol) of the challenger. It MUST also contain another push of an arbitrary challenge string of no less than 8 and no more than 64 bytes.
The challenger SHOULD ensure that the length of this OP_RETURN script plus the challenge string length is below the "data carrier" maximum, to ensure that signing wallets will accept this output. The challenge string length is counted twice because the response doubles its size (see the response below).
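As an illustration of the output script described in this section, the sketch below assembles OP_RETURN followed by the two pushes and applies the size rule. It assumes Bitcoin-style script encoding (OP_RETURN is 0x6a, pushes of up to 75 bytes use the length as the opcode, longer pushes use OP_PUSHDATA1). The function names are hypothetical, and the data-carrier limit is passed in by the caller rather than assumed here.

```python
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C

def push(data: bytes) -> bytes:
    """Encode one data push, Bitcoin-script style (an assumption of this sketch)."""
    n = len(data)
    if n == 0:
        raise ValueError("empty push not supported in this sketch")
    if n < OP_PUSHDATA1:          # 1..75 bytes: the opcode is the length itself
        return bytes([n]) + data
    if n <= 0xFF:                 # 76..255 bytes: OP_PUSHDATA1 plus a length byte
        return bytes([OP_PUSHDATA1, n]) + data
    raise ValueError("push too long for this sketch")

def build_challenge_script(identity: bytes, challenge: bytes) -> bytes:
    """OP_RETURN, then the challenger's identity, then the challenge string."""
    if not 8 <= len(challenge) <= 64:
        raise ValueError("challenge must be 8 to 64 bytes")
    return bytes([OP_RETURN]) + push(identity) + push(challenge)

def fits_data_carrier(script: bytes, challenge: bytes, max_size: int) -> bool:
    """Count the challenge length twice, since the response doubles it.

    Treating max_size as an inclusive limit is an assumption of this sketch.
    """
    return len(script) + len(challenge) <= max_size
```

For example, build_challenge_script(b"example.com", challenge) with an 8-byte challenge yields a 22-byte script, so the size rule compares 30 bytes against the limit.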
Requesting a UTXO
The challenge transaction MAY contain 0 or more inputs, with empty scripts. The responder must sign the transaction with these inputs.
If no inputs are provided, the responder may include one or more outpoints, as specified by the encompassing protocol, and may provide those outpoints' UTXOs via the same protocol.
The encompassing protocol may require valid UTXOs if the purpose is to prove authority over an existing UTXO. But if the purpose is to simply prove the ability to sign an address, fake outpoints may be provided which must be numbered 0,1,2... and correspond to UTXOs provided as part of the return data.
Or the required address may be known by the challenger, which can generate a fake UTXO.
Challenge Response
The responder MUST verify that the challenge transaction's nVersion field's MSB is set. The responder MAY verify that there is only a single output and reject any challenge with more than 1 output.
The responder MAY add to, remove, or modify the transaction's inputs as per the encompassing protocol's specifications.
The responder MUST verify that the first output is an OP_RETURN, and MUST decode the first push after the OP_RETURN and validate that it corresponds to the identity of the challenger, as defined by the carrier protocol. For example, this could be the IP address or FQDN of the web site being accessed, or the name of the app. This prevents attacks where an attacker web site A presents a challenge from another web site B as its own challenge, allowing A to meet the challenge to access B (with your credentials) as you access A.
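The identity check can be sketched as follows. This is a minimal example under the same assumption of Bitcoin-style push encoding; the function names are hypothetical, and how the expected identity is obtained (for instance, the FQDN of the site being accessed) is left to the encompassing protocol.

```python
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C

def parse_challenge_script(script: bytes) -> tuple:
    """Return (identity, challenge): the first two pushes after OP_RETURN."""
    if not script or script[0] != OP_RETURN:
        raise ValueError("output is not an OP_RETURN script")
    pushes = []
    i = 1
    while i < len(script):
        op = script[i]
        i += 1
        if 1 <= op < OP_PUSHDATA1:      # direct push: the opcode is the length
            n = op
        elif op == OP_PUSHDATA1:        # the next byte holds the length
            if i >= len(script):
                raise ValueError("truncated length byte")
            n = script[i]
            i += 1
        else:
            raise ValueError("unsupported opcode in this sketch")
        if i + n > len(script):
            raise ValueError("truncated push")
        pushes.append(script[i:i + n])
        i += n
    if len(pushes) < 2:
        raise ValueError("expected identity and challenge pushes")
    return pushes[0], pushes[1]

def check_identity(script: bytes, expected_identity: bytes) -> bytes:
    """Reject the challenge unless its first push names the expected challenger."""
    identity, challenge = parse_challenge_script(script)
    if identity != expected_identity:
        raise ValueError("challenge was issued for a different identity")
    return challenge
```

A responder talking to site A would pass A's identity as expected_identity, so a challenge relayed from site B fails before anything is signed.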
The responder MUST decode the 2nd push after the OP_RETURN and transform it by placing an arbitrary (random) byte before every byte in that array, producing data of between 16 and 128 bytes. For example, if the challenger's bytes are numbers and the responder's bytes are letters, the transformed OP_RETURN data would look like: a0b1c2d3e4f5... Although the transaction wrapper makes it unlikely that the signed bytes would be usable within another context, breaking up the challenge content makes it even less so.
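The byte interleaving can be sketched as below, using Python's secrets module for the responder's random bytes. The function names are hypothetical. The extract_challenge helper shows how a challenger might recover its original bytes from the response; the text does not spell out that step, so it is an assumption of this sketch.

```python
import secrets

def transform_challenge(challenge: bytes) -> bytes:
    """Place one random responder byte before every challenger byte."""
    if not 8 <= len(challenge) <= 64:
        raise ValueError("challenge must be 8 to 64 bytes")
    out = bytearray()
    for b in challenge:
        out.append(secrets.randbelow(256))  # responder's random byte comes first
        out.append(b)                       # then the challenger's byte
    return bytes(out)

def extract_challenge(transformed: bytes) -> bytes:
    """Recover the challenger's bytes, which sit at the odd indices."""
    if len(transformed) % 2 != 0:
        raise ValueError("transformed data must have even length")
    return transformed[1::2]
```

The output is always exactly twice the input length, which gives the 16 to 128 byte range stated above.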
Response Verification
If the challenger is verifying current UTXO ownership, they MUST verify the validity of the transaction against the current blockchain's UTXO set -- that is, execute a completely normal transaction validation except do not error upon detecting that the nVersion's MSB is set. Note that this technique is essential to prove ownership of a token.
Note that it is not sufficient to prove prior ownership of a token by just verifying the validity of transaction scripts (an attacker could make up a fake UTXO that has never existed in the blockchain). The responder must also provide a proof that all supplied TXOs were included in the blockchain. This could be done by providing a merkle proof of the transaction containing that TXO.
Alternately, a challenge transaction can be used to simply prove the ability to solve (sign) some script. In this case, the "UTXO" would not need to actually have existed in the blockchain.
Request and Response Roles
Typically the challenger will form the challenge transaction and send it to the signer. This allows the signer to run standard transaction signing code, with very few if any modifications (just removing validity checks if they exist). In this case it is recommended to use the parameter "chaltx" in encompassing parameterized protocols (e.g. HTTP and JSON).
However, in bandwidth-limited cases (e.g. QR codes), a challenge may be communicated by providing only the challenge bytes. The responder can formulate the full challenge transaction, with the provided bytes (interspersed with its own bytes) in the transaction's 2nd push after the OP_RETURN.
Similarly, the first push (the challenger's identity) should be defined by the encompassing protocol.
In this case it is assumed that the responder is determining the UTXOs that need to be signed, or that these UTXOs are implied by the encompassing protocol. It is recommended to use the parameter "chalby" in encompassing parameterized protocols (e.g. HTTP and JSON) when only the challenge bytes are provided.
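As a rough illustration of the two parameter names, the sketch below builds an HTTP query string carrying either a full challenge transaction ("chaltx") or only the challenge bytes ("chalby"). Hex encoding of the values, the rule that exactly one of the two is sent, and the function name are all assumptions of this sketch; the encompassing protocol defines the real encoding.

```python
from typing import Optional
from urllib.parse import urlencode

def challenge_params(chaltx: Optional[bytes] = None,
                     chalby: Optional[bytes] = None) -> str:
    """Build a query string with either 'chaltx' or 'chalby' (hex is assumed)."""
    # Assumption: a request carries exactly one of the two parameters.
    if (chaltx is None) == (chalby is None):
        raise ValueError("provide exactly one of chaltx or chalby")
    if chaltx is not None:
        return urlencode({"chaltx": chaltx.hex()})
    return urlencode({"chalby": chalby.hex()})
```

For a QR code, the "chalby" form keeps the payload to twice the challenge length in hex characters plus the parameter name, rather than a whole serialized transaction.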