Jonathan Becker
Dec 20, 2023 · 24 min read

On Decoding Raw EVM Calldata


With the release of heimdall-rs 0.7.0, the toolkit gained the ability to decode raw EVM calldata into its constituent types, without the need for a contract ABI or signature resolution. In this technical article, we'll dive deep into the inner workings of EVM calldata, how it's encoded, and how we can decode it arbitrarily.

Note: knowledge of the EVM and Solidity/Vyper is helpful, but not required.

Disclaimer: The method presented in this paper is not perfect and may still have some ambiguity when it comes to decoding types. However, without the ABI or signature, this is the best we can do.

Brief: EVM Calldata

When interacting with an EVM contract, a caller must provide a set of arguments in order to invoke a function call. These arguments are encoded into a byte array known as the calldata, which is passed to the contract as part of the transaction. The contract can then access the calldata, and decode it into its constituent types.

The EVM has three opcodes for accessing calldata: CALLDATASIZE, CALLDATALOAD, and CALLDATACOPY. These opcodes allow a contract to access the calldata, but do not provide any information about the calldata's structure, or the types of the arguments it contains.

As an exercise, let's build the calldata for a simple function call. Consider the following Solidity function:

snippet.sol
function balanceOf(address who) public view returns (uint256) {
    return balances[who];
}

Function Signatures & Selectors

The first four bytes of the calldata are typically used to identify the function being called. These bytes are known as the function selector, and are generated by taking the first four bytes of the keccak256 hash of the function's signature.

Our balanceOf function has the signature:

snippet.sol
balanceOf(address)

The keccak256 hash of this signature is:

snippet.txt
0x70a08231b98ef4ca268c9cc3f6b4590e4bfec28280db06bb5d45e689f2a360be

So the function selector is the first four bytes of this hash: 0x70a08231, and we can begin building the calldata:

snippet.txt
70a08231

Encoding Arguments

Arguments in calldata are encoded according to the ABI specification. The encoding of an argument is dependent on its type, and the encoding of a function's arguments is simply the concatenation of the encodings of each argument (with a few exceptions we'll touch on later).

Encoding Static Types

For elementary types, the encoding is straightforward:

Type     | Encoding
bool     | 00...00 for false, 00...01 for true
uint<N>  | Hex-encoded big-endian representation of the integer
address  | Encoded as a uint160
bytes<N> | Hex-encoded bytes, right-padded
int<N>   | Hex-encoded big-endian two's-complement representation, padded with ff bytes if negative and 00 bytes if positive
enum     | Encoded as a uint8

These types are encoded as a single word (32 bytes) in calldata, and are padded to the left with zeroes if necessary.
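As a quick sketch of this padding rule (a hypothetical helper, not part of heimdall), left-padding a static value into a single 32-byte word looks like:

```rust
// Hypothetical sketch: encode a static value as a single 32-byte,
// left-padded calldata word.
fn encode_word_left(data: &[u8]) -> [u8; 32] {
    assert!(data.len() <= 32);
    let mut word = [0u8; 32];
    // copy the value into the low-order bytes; the rest stays zero
    word[32 - data.len()..].copy_from_slice(data);
    word
}

fn main() {
    // uint256 0x123 -> 30 zero bytes followed by 0x01 0x23
    let word = encode_word_left(&0x123u64.to_be_bytes());
    assert_eq!(word[30], 0x01);
    assert_eq!(word[31], 0x23);

    // an address occupies the low 20 bytes of the word
    let addr = [0xd8u8; 20];
    let word = encode_word_left(&addr);
    assert_eq!(&word[..12], &[0u8; 12]);
    assert_eq!(&word[12..], &addr[..]);
}
```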

So, for our balanceOf function, the argument is an address, which is encoded as a uint160. The address we want to query is 0xd8da6bf26964af9d7eed9e03e53415d37aa96045 (vitalik.eth), so the encoding is:

snippet.txt
000000000000000000000000d8da6bf26964af9d7eed9e03e53415d37aa96045

And we can add this to our calldata:

snippet.txt
70a08231
000000000000000000000000d8da6bf26964af9d7eed9e03e53415d37aa96045

The calldata is now complete, and we can pass it to the contract using Foundry's cast:

snippet.sh
cast call 0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2 --data 70a08231000000000000000000000000d8da6bf26964af9d7eed9e03e53415d37aa96045

And we get the expected result:

snippet.txt
0x000000000000000000000000000000000000000000000001794bd6ed99652e51

Encoding Dynamic Types

Encoding dynamic types is a bit more complicated, since their encoding is dependent on the length of the data being encoded. The encoding of a dynamic type is as follows:

  1. The first word is the offset of the data in bytes from the start of context. We'll call this offset.
    • In most cases, context is the start of the calldata argument block (that is, the first word of the calldata after the function selector). However, if the dynamic type is nested within another dynamic type, context is the start of the outer dynamic type's data block.
    • The offset is encoded as a uint256, and is padded to the left with zeroes if necessary.
  2. The word at context[offset] is the length of the data in bytes. We'll call this length. The length is encoded as a uint256, and is padded to the left with zeroes if necessary.
    • For bytes and string, length is the number of bytes the encoded data takes up. The encodings themselves are right-padded with zeroes if necessary, and may span multiple words if the length is greater than 32 bytes.
    • For dynamic-length arrays (i.e, T[]), length is the number of elements in the array. The encoding is then the concatenation of the encodings of each element, in order.

Note: The above is a simplification of the ABI specification. For the purposes of this article, I believe this is sufficient.
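To make the length-plus-data layout concrete, here is a hedged sketch (with a hypothetical helper name) of building the tail of an ABI-encoded bytes value: a 32-byte length word followed by the data, right-padded to a word boundary. The offset word that points at this tail is written separately in the head, as described above.

```rust
// Hypothetical sketch: build the tail of an ABI-encoded `bytes` value,
// i.e. a 32-byte length word followed by the right-padded data.
fn encode_bytes_tail(data: &[u8]) -> Vec<u8> {
    let mut out = vec![0u8; 32];
    // length as a left-padded uint256 (the low 8 bytes suffice here)
    out[24..32].copy_from_slice(&(data.len() as u64).to_be_bytes());
    out.extend_from_slice(data);
    // right-pad the data to a multiple of 32 bytes
    let rem = data.len() % 32;
    if rem != 0 {
        out.extend(std::iter::repeat(0u8).take(32 - rem));
    }
    out
}

fn main() {
    let tail = encode_bytes_tail(b"Hello, world!");
    assert_eq!(tail.len(), 64);               // one length word + one data word
    assert_eq!(tail[31], 13);                 // length = 0x0d
    assert_eq!(&tail[32..45], b"Hello, world!");
    assert!(tail[45..64].iter().all(|&b| b == 0)); // right padding
}
```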

Again, we'll use an example to illustrate. Let's encode the following signature f(uint256,uint32[],bytes10,bytes) (selector is 0x8be65246) with arguments (0x123, [0x456, 0x789], "1234567890", "Hello, world!").

  1. The first parameter is a uint256, which is a simple elementary type. This encoding is straightforward:
snippet.txt
0000000000000000000000000000000000000000000000000000000000000123
  2. The second parameter is a dynamic-length array of uint32.
snippet.txt
0000000000000000000000000000000000000000000000000000000000000002 - length of array (2)
0000000000000000000000000000000000000000000000000000000000000456 - encoding of array element 1
0000000000000000000000000000000000000000000000000000000000000789 - encoding of array element 2
  3. The third parameter is a bytes10, which is a static type with a length of 10 bytes. The encoding is:
snippet.txt
3132333435363738393000000000000000000000000000000000000000000000
  4. The fourth parameter is a bytes, which is a dynamic type. The encoding is:
snippet.txt
000000000000000000000000000000000000000000000000000000000000000d - number of bytes in encoded data (13)
48656c6c6f2c20776f726c642100000000000000000000000000000000000000 - right-padded bytes

Now we can concatenate these encodings to get the final calldata:

snippet.txt
0x8be65246
0000000000000000000000000000000000000000000000000000000000000123 - first parameter
(a) - placeholder for second parameter offset
3132333435363738393000000000000000000000000000000000000000000000 - third parameter
(b) - placeholder for fourth parameter offset
0000000000000000000000000000000000000000000000000000000000000002 - length of array (2)
0000000000000000000000000000000000000000000000000000000000000456 - encoding of array element 1
0000000000000000000000000000000000000000000000000000000000000789 - encoding of array element 2
000000000000000000000000000000000000000000000000000000000000000d - number of bytes in encoded data (13)
48656c6c6f2c20776f726c642100000000000000000000000000000000000000 - right-padded bytes

Now we can fill in the placeholders (a) and (b) with the offsets of the second and fourth parameters, respectively. Since the fourth word (zero-indexed) of the calldata is the start of the second parameter's data, the offset of the second parameter is 4 * 32 = 128. Likewise, the offset of the fourth parameter is 7 * 32 = 224.

snippet.txt
0x8be65246
0000000000000000000000000000000000000000000000000000000000000123 - first parameter
0000000000000000000000000000000000000000000000000000000000000080 - offset of second parameter (128)
3132333435363738393000000000000000000000000000000000000000000000 - third parameter
00000000000000000000000000000000000000000000000000000000000000e0 - offset of fourth parameter (224)
0000000000000000000000000000000000000000000000000000000000000002 - length of array (2)
0000000000000000000000000000000000000000000000000000000000000456 - encoding of array element 1
0000000000000000000000000000000000000000000000000000000000000789 - encoding of array element 2
000000000000000000000000000000000000000000000000000000000000000d - number of bytes in encoded data (13)
48656c6c6f2c20776f726c642100000000000000000000000000000000000000 - right-padded bytes

And we're done! The final calldata is:

snippet.txt
0x8be6524600000000000000000000000000000000000000000000000000000000000001230000000000000000000000000000000000000000000000000000000000000080313233343536373839300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000e0000000000000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000000000000000000004560000000000000000000000000000000000000000000000000000000000000789000000000000000000000000000000000000000000000000000000000000000d48656c6c6f2c20776f726c642100000000000000000000000000000000000000

You may notice that for any function f(arg_1, ..., arg_n), the first n words of the calldata are either the arguments themselves, or the offsets of the arguments. This will be important later.

Decoding Calldata

Before we begin decoding raw calldata, let's set a few assumptions:

  1. The calldata is well-formed. That is, it is a valid sequence of bytes that can be decoded according to the ABI specification. Additionally, len(calldata) - 4 ≡ 0 (mod 32).
  2. We do not have access to the contract ABI, nor do we have access to the function signature. All we have is the raw calldata.
  3. Offset pointers to dynamic types are valid. That is, offset > 0 and offset ≡ 0 (mod 32). This is reasonable, since an offset pointer must point to the start of a word.
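Assumption (3) can be sketched as a simple predicate (a hypothetical helper operating on the word interpreted as an integer; it also rejects pointers that land past the end of the argument block):

```rust
// Hypothetical sketch of assumption (3): a word can only be an offset
// pointer to a dynamic type if it is nonzero, word-aligned, and lands
// inside the argument block.
fn is_plausible_offset(word: u128, calldata_words: usize) -> bool {
    word > 0 && word % 32 == 0 && (word / 32) < calldata_words as u128
}

fn main() {
    assert!(is_plausible_offset(0x80, 10));   // points at word 4 of 10
    assert!(!is_plausible_offset(0x81, 10));  // not word-aligned
    assert!(!is_plausible_offset(0, 10));     // zero offset
    assert!(!is_plausible_offset(0x140, 10)); // points past the calldata
}
```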

Helper Functions

We'll define a few helper functions that we'll use throughout the decoding process. For brevity, we'll use pseudocode to describe these functions, but for those interested, the full implementation can be found here.


First, we'll define a function get_padding(bytes: &[u8]) which determines if a word in calldata is left or right padded.

snippet.rs
pub enum Padding {
    Left,
    Right,
    None,
}

/// Given a string of bytes, determine if it is left or right padded.
fn get_padding(bytes: &[u8]) -> Padding {
    // 1. find the indices of all null-bytes in the string
    // 2. if any of the following are true, we cannot determine the padding (return None):
    //    - there are no null bytes
    //    - neither the first nor last byte is null
    // 3. if the first byte of the string is null and the last byte is not, return Left
    // 4. if the last byte of the string is null and the first byte is not, return Right
    // 5. find indices of all non-null bytes in the string
    // 6. count the number of null-bytes on the LHS and RHS of the string
    //    - if the number of null-bytes on the LHS is greater than the number of null-bytes on the RHS, return Left
    //    - if the number of null-bytes on the RHS is greater than the number of null-bytes on the LHS, return Right
    //    - otherwise, return None
}

Next, we'll define a function get_padding_size(bytes: &[u8]) which returns the number of padding bytes in a string.

snippet.rs
/// Given a string of bytes, return the number of padding bytes.
fn get_padding_size(bytes: &[u8]) -> usize {
    match get_padding(bytes) {
        Padding::Left => {
            // count the number of null bytes at the start of the string
        }
        Padding::Right => {
            // count the number of null bytes at the end of the string
        }
        Padding::None => 0,
    }
}
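As a runnable illustration of these two helpers, here is a simplified sketch that implements only the first/last-byte checks, not the full null-byte counting heimdall performs:

```rust
#[derive(Debug, PartialEq)]
enum Padding {
    Left,
    Right,
    None,
}

// Simplified sketch of `get_padding`: decide based only on whether the
// first or last byte is null.
fn get_padding(bytes: &[u8]) -> Padding {
    match (bytes.first(), bytes.last()) {
        (Some(0), Some(b)) if *b != 0 => Padding::Left,
        (Some(b), Some(0)) if *b != 0 => Padding::Right,
        _ => Padding::None,
    }
}

// Simplified sketch of `get_padding_size`: count null bytes on the
// padded side.
fn get_padding_size(bytes: &[u8]) -> usize {
    match get_padding(bytes) {
        Padding::Left => bytes.iter().take_while(|&&b| b == 0).count(),
        Padding::Right => bytes.iter().rev().take_while(|&&b| b == 0).count(),
        Padding::None => 0,
    }
}

fn main() {
    let left = [0u8, 0, 0, 1];    // e.g. a left-padded uint
    let right = [0x61u8, 0, 0, 0]; // e.g. right-padded bytesN
    assert_eq!(get_padding(&left), Padding::Left);
    assert_eq!(get_padding_size(&left), 3);
    assert_eq!(get_padding(&right), Padding::Right);
    assert_eq!(get_padding_size(&right), 3);
}
```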

Next, we'll define a function fn byte_size_to_type(byte_size: usize) -> Vec<String> which returns all potential static types that byte_size could represent. For example, byte_size_to_type(1) returns vec!["bool", "uint8", "bytes1", "int8"].

snippet.rs
/// Given a byte size, return all potential static types that it could represent.
fn byte_size_to_type(byte_size: usize) -> Vec<String> {
    let mut potential_types = Vec::new();

    match byte_size {
        1 => potential_types.push("bool".to_string()),
        // We check for 15..=20 because addresses may have leading null-bytes.
        // This allows for up to 5 leading null-bytes.
        15..=20 => potential_types.push("address".to_string()),
        _ => {}
    }

    // push standard types
    potential_types.push(format!("uint{}", byte_size * 8));
    potential_types.push(format!("bytes{byte_size}"));
    potential_types.push(format!("int{}", byte_size * 8));

    potential_types
}

We also have a simple type-conversion function fn to_type(type_str: &str) -> ParamType which converts a string to a ParamType enum variant.

snippet.rs
/// A helper function that converts a string type to a ParamType.
/// For example, "address" will be converted to [`ParamType::Address`].
pub fn to_type(string: &str) -> ParamType {
    ...
}

Finally, we'll define a function get_potential_types_for_word(word: &[u8]) -> (usize, Vec<String>) which returns all potential types that a word in calldata could represent, as well as the maximum size of the word in bytes.

snippet.rs
// Get the minimum size needed to store the given word, along with all
// potential types it could represent.
pub fn get_potential_types_for_word(word: &[u8]) -> (usize, Vec<String>) {
    // get padding of the word, note this is a maximum
    let padding_size = get_padding_size(word);

    // get the number of data (non-padding) bytes
    let data_size = word.len() - padding_size;
    (data_size, byte_size_to_type(data_size))
}

The Decoding Process

Now that we have our helper functions, we can begin decoding raw calldata.

  1. The first step of the decoding process is to convert the calldata into a list of words (split into 32-byte chunks, removing the function selector). We'll call this list calldata_words.

  2. Now we create a HashSet<usize> called covered_words. We'll use this to keep track of which words we've already covered, so we don't accidentally decode the same word twice, miss decoding a word, or incorrectly decode a word. We'll also create a mutable Vec<ParamType> called potential_inputs, which we'll use to keep track of the types of the function's inputs.

snippet.rs
// we're going to build a Vec<ParamType> of all possible types for each word
let mut potential_inputs: Vec<ParamType> = Vec::new();
let mut covered_words: HashSet<usize> = HashSet::new();
  3. Next, we'll use a while loop to iterate over each word in calldata_words. The loop will terminate when all words in calldata_words have been covered and decoded. Here's what each iteration of the loop looks like:
snippet.rs
let mut i = 0; // this is the current index in calldata_words
while covered_words.len() != calldata_words.len() {
    let word = calldata_words[i];

    // (1) try to decode the current word as an ABI-encoded dynamic type. if this succeeds,
    // add the type to `potential_inputs` and add the indices of all words covered by this type
    // to `covered_words`
    if let Some(abi_encoded) = try_decode_dynamic_parameter(i, &calldata_words)? {
        // convert the ABI-encoded type to a ParamType and add it to potential_inputs
        let potential_type = to_type(&abi_encoded.ty);
        potential_inputs.push(potential_type);

        // extend covered_words with the indices of all words covered by this dynamic type
        covered_words.extend(abi_encoded.coverages);

        i += 1;
        continue;
    }

    // (2) this is a static type, so we can just get the potential types for this word
    let (_, mut potential_types) = get_potential_types_for_word(word);

    // perform heuristics, since we can't determine the type of a word with 100% certainty
    // - if we use right-padding, this is probably bytesN
    // - if we use left-padding, this is probably uintN or intN
    // - if we use no padding, this is probably bytes32
    match get_padding(word) {
        Padding::Left => potential_types
            .retain(|t| t.starts_with("uint") || t.starts_with("address")),
        _ => potential_types
            .retain(|t| t.starts_with("bytes") || t.starts_with("string")),
    }

    // (3) convert the type with the highest potential to a ParamType and add it to `potential_inputs`
    let potential_type =
        to_type(potential_types.first().expect("potential types is empty"));
    potential_inputs.push(potential_type);

    // this word is now covered, so add it to `covered_words`
    covered_words.insert(i);

    i += 1;
}
  4. We'll attempt to decode the current word as an ABI-encoded dynamic type. If this succeeds, we'll add the type to potential_inputs and add the indices of all words covered by this type to covered_words. We'll discuss how this works in detail later.

  5. If this is not a dynamic type, we'll attempt to decode this word as a static type. We'll explain how this works in detail later.

NOTE: If f(arg_1, ..., arg_n) is a function with n arguments, then the first n words of the calldata are either the arguments themselves, or the offsets of the arguments. When we have decoded the first n words of calldata, covered_words will contain the indices of all words in the calldata that are part of the function's arguments. We can then use this to determine the function's signature, and the types of its arguments.

Decoding Static Types

Let's start with the simple case: static types. Given a word of calldata that is not an ABI-encoded dynamic type, we can determine the potential types of the word by using the get_potential_types_for_word function we defined earlier.

This function will return a list of potential types, as well as the maximum size of the word in bytes. We'll use this to determine the most likely type of the word by performing a few simple heuristics:

  • bytesN and strings are always right-padded, so if this word is right-padded, it's probably a bytesN or string.
  • uintN and intN are always left-padded, so if this word is left-padded, it's probably a uintN or intN.
    • We can also check if the padding is 00 or ff, indicating whether the number is positive or negative. If the padding is 00, it's probably a uintN. If the padding is ff, it's probably an intN.
  • If this word is not padded, it's probably a bytes32.
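These heuristics can be sketched as a small classifier (a hypothetical helper; heimdall's real implementation filters a list of type-name strings instead of returning one):

```rust
// Hypothetical sketch of the padding heuristics above: map a word's
// padding to the most likely family of static types.
fn guess_static_type(word: &[u8; 32]) -> &'static str {
    let leading = word.iter().take_while(|&&b| b == 0).count();
    let trailing = word.iter().rev().take_while(|&&b| b == 0).count();
    if leading == 0 && trailing == 0 {
        "bytes32" // no padding at all
    } else if leading >= trailing {
        "uintN"   // left-padded: an integer (or address)
    } else {
        "bytesN"  // right-padded: fixed-size bytes or a string chunk
    }
}

fn main() {
    let mut uint_word = [0u8; 32];
    uint_word[31] = 0x23; // a left-padded integer
    assert_eq!(guess_static_type(&uint_word), "uintN");

    let mut bytes_word = [0u8; 32];
    bytes_word[..10].copy_from_slice(b"1234567890"); // a bytes10 value
    assert_eq!(guess_static_type(&bytes_word), "bytesN");

    let full_word = [0xffu8; 32]; // no padding
    assert_eq!(guess_static_type(&full_word), "bytes32");
}
```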

Decoding Dynamic Types

Now we'll move on to the more complicated case: ABI-encoded dynamic types. Let's take a look at how try_decode_dynamic_parameter works behind the scenes:

snippet.rs
pub fn try_decode_dynamic_parameter(
    parameter_index: usize,
    calldata_words: &[&str],
) -> Result<Option<AbiEncoded>, Error> {
    // (1) initialize a [`HashSet<usize>`] called `word_coverages` with `parameter_index`
    // this works similarly to `covered_words`, but is used to keep track of which
    // words we've covered while attempting to ABI-decode the current word
    let mut coverages = HashSet::from([parameter_index]);

    // (2) the first validation step. this checks if the current word could be a valid
    // pointer to an ABI-encoded dynamic type. if it is not, we return None.
    let (byte_offset, word_offset) = match process_and_validate_word(parameter_index, calldata_words) {
        Ok((byte_offset, word_offset)) => (byte_offset, word_offset),
        Err(_) => return Ok(None),
    };

    // (3) the second validation step. this checks if the pointed-to word is a valid pointer to a word in
    // `calldata_words`. if it is not, we return an [`Error::BoundsError`].
    //
    // note: `size` is the size of the ABI-encoded item. It varies depending on the type of the
    // item. For example, the size of a `bytes` is the number of bytes in the encoded data, while
    // for a dynamic-length array, the size is the number of elements in the array.
    let size_word = calldata_words.get(word_offset.as_usize()).ok_or(Error::BoundsError)?;
    let size = U256::from_str_radix(size_word, 16)?.min(U256::from(usize::MAX));

    // (4) add the size word index to `word_coverages`, since this word is part of the ABI-encoded type
    // and should not be decoded again
    coverages.insert(word_offset.as_usize());

    // (5) check if there are enough words left in the calldata to contain the ABI-encoded item.
    // if there aren't, it doesn't necessarily mean that the calldata is invalid, but it does
    // indicate that this type cannot be an array, since there aren't enough words left to store
    // the array elements.
    let data_start_word_offset = word_offset + 1;
    let data_end_word_offset = data_start_word_offset + size;
    match data_end_word_offset.cmp(&U256::from(calldata_words.len())) {
        Ordering::Greater => try_decode_dynamic_parameter_bytes(
            parameter_index,
            calldata_words,
            byte_offset,
            word_offset,
            data_start_word_offset,
            size,
            coverages,
        ),
        _ => try_decode_dynamic_parameter_array(
            parameter_index,
            calldata_words,
            byte_offset,
            word_offset,
            data_start_word_offset,
            data_end_word_offset,
            size,
            coverages,
        ),
    }
}
  1. We initialize a HashSet<usize> called word_coverages (not to be confused with covered_words) with parameter_index. We'll use this HashSet to keep track of which words we've covered while attempting to ABI-decode the current word. If we successfully decode a dynamic type, this HashSet will contain all indices of words in calldata_words that are part of this ABI-encoded type, and will be joined with covered_words at the end of the decoding process.

  2. Now, we'll perform the first validation step by checking if the current word could be a valid pointer to an ABI-encoded dynamic type in the calldata. We do this by leveraging assumption (3), which states that all pointers to dynamic types are:

    • Greater than zero
    • Divisible by 32, since this should point to the start of a word (the size of the dynamic type).
  3. Next, we'll check if this pointer is actually valid and points to a valid word in calldata_words. If it is not, we return an Error::BoundsError. If it is, we'll parse the word into a U256 and call this size. We'll also add word_offset to word_coverages, since this word (size) is part of the ABI-encoded type and should not be decoded again.

    • We can now define data_start_word_offset as word_offset + 1, since the word at word_offset is the size of the ABI-encoded type, and the data starts at the next word. This is the start of the data block for this dynamic type, and we'll use this later. data_end_word_offset is data_start_word_offset + size, since the data block is size words long. In the case of bytes and string, we'll need to recalculate this later.
  4. Since size varies depending on the type of the item, we'll perform a simple check to see if there are enough words left in the calldata to contain the ABI-encoded item. If there aren't, it doesn't necessarily mean that this is not a valid ABI-encoded type, but it does indicate that this type cannot be an array, since there aren't enough words left to store the array elements. If there are enough words left, we'll call try_decode_dynamic_parameter_array, which we'll cover later. If there aren't, we'll call try_decode_dynamic_parameter_bytes.

Decoding bytes

Let's take a look at how try_decode_dynamic_parameter_bytes works behind the scenes:

snippet.rs
fn try_decode_dynamic_parameter_bytes(
    parameter_index: usize,
    calldata_words: &[&str],
    word: U256,
    word_offset: U256,
    data_start_word_offset: U256,
    size: U256,
    coverages: HashSet<usize>,
) -> Result<Option<AbiEncoded>, Error> {
    let mut coverages = coverages;

    // (1) join all words from `data_start_word_offset` to the end of `calldata_words`.
    // this is where the encoded data may be stored.
    let data_words = &calldata_words[data_start_word_offset.as_usize()..];

    // (2) perform a quick validation check to see if there are enough remaining bytes
    // to contain the ABI-encoded item. If there aren't, return an [`Error::BoundsError`].
    if data_words.join("").len() / 2 < size.as_usize() {
        return Err(Error::BoundsError);
    }

    // (3) calculate how many words are needed to store the encoded data with size `size`.
    let word_count_for_size = U256::from((size.as_u32() as f32 / 32f32).ceil() as u32);
    let data_end_word_offset = data_start_word_offset + word_count_for_size;

    // (4) get the last word in `data_words`, so we can perform a size check. There should be
    // `size % 32` bytes in this word, and the rest should be null bytes.
    let last_word = data_words.get(word_count_for_size.as_usize() - 1).ok_or(Error::BoundsError)?;
    let last_word_size = size.as_usize() % 32;

    // if the padding size of this last word is greater than `32 - last_word_size`,
    // there are too many bytes in the last word, and this is not a valid ABI-encoded type.
    // return an [`Error::BoundsError`].
    let padding_size = get_padding_size(last_word);
    if padding_size > 32 - last_word_size {
        return Err(Error::BoundsError);
    }

    // (5) we've covered all words from `data_start_word_offset` to `data_end_word_offset`,
    // so add them to `word_coverages`.
    for i in word_offset.as_usize()..data_end_word_offset.as_usize() {
        coverages.insert(i);
    }

    Ok(Some(AbiEncoded { ty: String::from("bytes"), coverages }))
}
  1. First, we'll join all words from data_start_word_offset (the index in calldata_words where the encoded data starts) to the end of calldata_words. This is where the encoded data may be stored. We'll call this data_words. This may contain extra words that are not part of the ABI-encoded type, but that's okay, since we'll be checking the size of the encoded data later.

  2. Next, we'll perform a quick validation check to see if there are enough remaining bytes in data_words to contain the ABI-encoded item. If there aren't, we return an Error::BoundsError.

  3. Now, we'll calculate how many words are needed to store the encoded data with size size. We'll call this word_count_for_size. We'll also calculate the end of data_words by adding word_count_for_size to data_start_word_offset. We'll call this data_end_word_offset.

  4. Now, we can perform a check on the last word in data_words. There should be size % 32 bytes in this word, and the rest should be null bytes. If the padding size of this last word is greater than 32 - last_word_size, there are too many bytes in the last word, and this is not a valid ABI-encoded type. We return an Error::BoundsError.

  5. We extend word_coverages with the indices of all words from data_start_word_offset to data_end_word_offset, since we've now covered all words in the ABI-encoded type. We then return an AbiEncoded struct containing the type (bytes) and the coverages to be joined with covered_words at the end of the decoding process.
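Steps (3) and (4) above boil down to a ceiling division and a padding check. A small sketch with hypothetical helpers:

```rust
// Hypothetical sketch of steps (3)-(4): compute how many words a `bytes`
// payload of `size` bytes occupies, and check that the last word is not
// "too full" for the claimed size.
fn words_needed(size: usize) -> usize {
    (size + 31) / 32 // ceiling division by the 32-byte word size
}

fn last_word_fits(size: usize, last_word: &[u8; 32]) -> bool {
    let last_word_size = size % 32;
    if last_word_size == 0 {
        return true; // the data fills its final word exactly
    }
    // everything after the first `last_word_size` bytes must be null padding
    last_word[last_word_size..].iter().all(|&b| b == 0)
}

fn main() {
    assert_eq!(words_needed(13), 1);
    assert_eq!(words_needed(32), 1);
    assert_eq!(words_needed(33), 2);

    // "Hello, world!" (13 bytes), right-padded with nulls
    let mut word = [0u8; 32];
    word[..13].copy_from_slice(b"Hello, world!");
    assert!(last_word_fits(13, &word));

    // a 14th non-null byte contradicts a claimed size of 13
    word[13] = 0x21;
    assert!(!last_word_fits(13, &word));
}
```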

Decoding string

Decoding string is very similar to decoding bytes, with a few minor differences. Let's take a look at how try_decode_dynamic_parameter_string works behind the scenes:

snippet.rs
fn try_decode_dynamic_parameter_string(
    data_words: &[&str],
    parameter_index: usize,
    calldata_words: &[&str],
    word: U256,
    word_offset: U256,
    data_start_word_offset: U256,
    size: U256,
    coverages: HashSet<usize>,
) -> Result<Option<AbiEncoded>, Error> {
    let mut coverages = coverages;
    // (1) check if the data words all have conforming padding
    // we do this check because strings will typically be of the form:
    // 0000000000000000000000000000000000000000000000000000000000000003 // length of 3
    // 6f6e650000000000000000000000000000000000000000000000000000000000 // "one"
    //
    // so, if the data words have conforming padding, we can assume that this is not a string
    // and is instead an array.
    if data_words
        .iter()
        .map(|word| get_padding(word))
        .all(|padding| padding == get_padding(data_words[0]))
    {
        return Ok(None)
    }

    // (2) calculate how many words are needed to store the encoded data with size `size`.
    let word_count_for_size = U256::from((size.as_u32() as f32 / 32f32).ceil() as u32);
    let data_end_word_offset = data_start_word_offset + word_count_for_size;

    // (3) get the last word in `data_words`, so we can perform a size check. There should be
    // `size % 32` bytes in this word, and the rest should be null bytes.
    let last_word =
        data_words.get(word_count_for_size.as_usize() - 1).ok_or(Error::BoundsError)?;
    let last_word_size = size.as_usize() % 32;

    // if the padding size of this last word is greater than `32 - last_word_size`,
    // there are too many bytes in the last word, and this is not a valid ABI-encoded type.
    // return an [`Error::BoundsError`].
    let padding_size = get_padding_size(last_word);
    if padding_size > 32 - last_word_size {
        return Err(Error::BoundsError);
    }

    // (4) we've covered all words from `data_start_word_offset` to `data_end_word_offset`,
    // so add them to `word_coverages`.
    for i in word_offset.as_usize()..data_end_word_offset.as_usize() {
        coverages.insert(i);
    }

    return Ok(Some(AbiEncoded { ty: String::from("string"), coverages: coverages.clone() }));
}

You may notice that this is almost identical to try_decode_dynamic_parameter_bytes, with a few minor differences:

  1. We first perform a padding conformity check on data_words. If all words in data_words have the same padding, we can assume that this is not a string and is instead an array. This is because strings will typically be of the form:

    snippet.txt
    0000000000000000000000000000000000000000000000000000000000000003 // length of 3
    6f6e650000000000000000000000000000000000000000000000000000000000 // "one"

    Where the first word is the length of the string, and the rest of the words are the string itself. If the words have conforming padding, we can assume that this is not a string and is instead an array.

  2. The remaining steps for string validation are identical to try_decode_dynamic_parameter_bytes, since strings are essentially just bytes!
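The conformity check above can be sketched as follows (a hypothetical helper operating on raw 32-byte words rather than hex strings, and using only a first/last-byte padding test):

```rust
// Hypothetical sketch of the conformity check: if every data word has
// the same padding direction, the data looks like an array of uniform
// static elements rather than a string.
fn looks_like_array(data_words: &[[u8; 32]]) -> bool {
    // simplified padding test: is the word left-padded?
    fn left_padded(word: &[u8; 32]) -> bool {
        word[0] == 0 && word[31] != 0
    }
    data_words
        .iter()
        .all(|w| left_padded(w) == left_padded(&data_words[0]))
}

fn main() {
    // two left-padded uint words: conforming padding, likely an array
    let mut a = [0u8; 32];
    a[31] = 0x56;
    let mut b = [0u8; 32];
    b[31] = 0x89;
    assert!(looks_like_array(&[a, b]));

    // a right-padded "one" chunk does not conform: likely a string
    let mut s = [0u8; 32];
    s[..3].copy_from_slice(b"one");
    assert!(!looks_like_array(&[a, s]));
}
```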

Decoding T[]

Finally, we'll cover decoding dynamic-length arrays. Let's take a look at how try_decode_dynamic_parameter_array works behind the scenes:

snippet.rs
fn try_decode_dynamic_parameter_array(
    parameter_index: usize,
    calldata_words: &[&str],
    word: U256,
    word_offset: U256,
    data_start_word_offset: U256,
    data_end_word_offset: U256,
    size: U256,
    coverages: HashSet<usize>,
) -> Result<Option<AbiEncoded>, Error> {
    let mut coverages = coverages;

    // (1) join all words from `data_start_word_offset` to `data_end_word_offset`. This is where
    // the encoded data may be stored.
    let data_words =
        &calldata_words[data_start_word_offset.as_usize()..data_end_word_offset.as_usize()];

    // (2) first, check if this is a `string` type, since some string encodings may appear to be arrays.
    if let Ok(Some(abi_encoded)) = try_decode_dynamic_parameter_string(
        data_words,
        parameter_index,
        calldata_words,
        word,
        word_offset,
        data_start_word_offset,
        size,
        coverages.clone(),
    ) {
        return Ok(Some(abi_encoded));
    }

    // (3) this is not a `string` type, so we can assume that it is an array. we can extend
    // `word_coverages` with the indices of all words from `data_start_word_offset` to
    // `data_end_word_offset`, since we've now covered all words in the ABI-encoded type.
    for i in word_offset.as_usize()..data_end_word_offset.as_usize() {
        coverages.insert(i);
    }

    // (4) get the potential type of the array elements. under the hood, this function:
    // - iterates over each word in `data_words`
    // - checks if the word is a dynamic type by recursively calling `try_decode_dynamic_parameter`
    //   - if it is a dynamic type, we know the type of the array elements and can return it
    //   - if it is a static type, find the potential types that can represent each element
    //     in the array
    let potential_type = get_potential_type(
        data_words,
        parameter_index,
        calldata_words,
        word,
        data_start_word_offset,
        &mut coverages,
    );
    let type_str = format!("{potential_type}[]");
    Ok(Some(AbiEncoded { ty: type_str, coverages }))
}
  1. First, we'll join all words from data_start_word_offset to data_end_word_offset. This is where the encoded data may be stored. We'll call this data_words. This will have a length of size, since size is the number of elements in the array.

  2. Next, we'll check if this is a string type, since some string encodings may appear to be arrays. If it is, we'll stop here and return the AbiEncoded struct containing the type string and the coverages to be joined with covered_words at the end of the decoding process.

  3. If this is not a string type, we can assume that it is an array. We can extend word_coverages with the indices of all words from data_start_word_offset to data_end_word_offset, since we've now covered all words in the ABI-encoded type.

  4. Now, we'll get the potential type of the array elements. Under the hood, this function:

    • Iterates over each word in data_words
    • Checks if the word is a dynamic type by recursively calling try_decode_dynamic_parameter
      • If it is a dynamic type, we know the type of the array elements and can return it
      • If it is a static type, find the potential types that can represent each element in the array

    We'll call this potential type potential_type. We'll then return an AbiEncoded struct containing the type {potential_type}[] and the coverages to be joined with covered_words at the end of the decoding process.

A Quick Example

Now that we've covered the decoding process, let's walk through a quick example. Let's say we have the following calldata (the one we built earlier):

snippet.txt
0x8be652460000000000000000000000000000000000000000000000000000000000000123000000000000000000000000000000000000000000000000000000000000008031323334353637383930000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000e00000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000045600000000000000000000000000000000000000000000000000000000000007890000000000000000000000000000000000000000000000000000000000000000d48656c6c6f2c20776f726c642100000000000000000000000000000000000000

We'll start by converting this into a list of words:

snippet.txt
0000000000000000000000000000000000000000000000000000000000000123
0000000000000000000000000000000000000000000000000000000000000080
3132333435363738393000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000e0
0000000000000000000000000000000000000000000000000000000000000002
0000000000000000000000000000000000000000000000000000000000000456
0000000000000000000000000000000000000000000000000000000000000789
000000000000000000000000000000000000000000000000000000000000000d
48656c6c6f2c20776f726c642100000000000000000000000000000000000000
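As a minimal sketch of this preprocessing step (standalone code, not heimdall-rs's internals; `split_calldata` is a hypothetical helper name), splitting the raw calldata into a selector and 32-byte words looks like this:

```rust
// Hedged sketch: split raw calldata into the 4-byte selector and 32-byte words.
// `split_calldata` is an illustrative helper, not part of heimdall-rs's API.
fn split_calldata(calldata: &str) -> (String, Vec<String>) {
    let data = calldata.trim_start_matches("0x");
    // the first 4 bytes (8 hex chars) are the function selector
    let (selector, body) = data.split_at(8);
    // every remaining 32-byte (64 hex char) chunk is one calldata word
    let words = body
        .as_bytes()
        .chunks(64)
        .map(|c| String::from_utf8(c.to_vec()).expect("valid ascii hex"))
        .collect();
    (selector.to_string(), words)
}

fn main() {
    let calldata = concat!(
        "0x8be65246",
        "0000000000000000000000000000000000000000000000000000000000000123",
        "0000000000000000000000000000000000000000000000000000000000000080",
        "3132333435363738393000000000000000000000000000000000000000000000",
        "00000000000000000000000000000000000000000000000000000000000000e0",
        "0000000000000000000000000000000000000000000000000000000000000002",
        "0000000000000000000000000000000000000000000000000000000000000456",
        "0000000000000000000000000000000000000000000000000000000000000789",
        "000000000000000000000000000000000000000000000000000000000000000d",
        "48656c6c6f2c20776f726c642100000000000000000000000000000000000000",
    );
    let (selector, words) = split_calldata(calldata);
    assert_eq!(selector, "8be65246");
    assert_eq!(words.len(), 9);
    println!("selector: {selector}, {} words", words.len());
}
```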

We'll also keep track of covered_words and potential_inputs:

snippet.rs
let mut potential_inputs: Vec<ParamType> = Vec::new();
let mut covered_words: HashSet<usize> = HashSet::new();

Now, we'll iterate over each word in calldata_words:

snippet.txt
i = 0
word = 0000000000000000000000000000000000000000000000000000000000000123

This word is not an ABI-encoded dynamic type, because U256::from(word) % 32 != 0, so this is not a pointer to a word in calldata_words. We'll attempt to decode this word as a static type:

snippet.rs
potential_types = ["uint16", "bytes2", "int16"]
get_padding(word) = Padding::Left

So this word is probably a uint16 or int16. We'll add uint16 to potential_inputs and add i to covered_words:

snippet.rs
potential_inputs = [ParamType::Uint(16)]
covered_words = {0}
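The padding heuristic used here can be sketched in a few lines. This is standalone, illustrative code under my own naming (`get_padding`, `min_size_bytes` are not heimdall-rs's actual functions): left-padding suggests a numeric type, right-padding suggests a fixed-bytes type, and the number of significant bytes bounds the smallest type that fits.

```rust
// Hedged sketch of padding detection for a single 32-byte calldata word.
#[derive(Debug, PartialEq)]
enum Padding {
    Left,
    Right,
    None,
}

fn get_padding(word: &[u8; 32]) -> Padding {
    let leading = word.iter().take_while(|&&b| b == 0).count();
    let trailing = word.iter().rev().take_while(|&&b| b == 0).count();
    if leading == 0 && trailing == 0 {
        Padding::None
    } else if leading >= trailing {
        Padding::Left
    } else {
        Padding::Right
    }
}

// smallest number of bytes needed to hold the word's significant content
fn min_size_bytes(word: &[u8; 32]) -> usize {
    let leading = word.iter().take_while(|&&b| b == 0).count();
    let trailing = word.iter().rev().take_while(|&&b| b == 0).count();
    32 - leading.max(trailing)
}

fn main() {
    // the word at i = 0: 0x...0123, left-padded with 30 zero bytes
    let mut word = [0u8; 32];
    word[30] = 0x01;
    word[31] = 0x23;
    assert_eq!(get_padding(&word), Padding::Left);
    assert_eq!(min_size_bytes(&word), 2); // fits uint16 / int16 / bytes2
    println!("{:?}, {} significant bytes", get_padding(&word), min_size_bytes(&word));
}
```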

Now, we'll increment i and continue:

snippet.txt
i = 1
word = 0000000000000000000000000000000000000000000000000000000000000080

This word does pass the first validation step, since U256::from(word) % 32 == 0. We'll attempt to decode this word as an ABI-encoded dynamic type:

snippet.rs
// (1) calculate byte_offset and word_offset
byte_offset = 128
word_offset = 4

// (2) check if word_offset is a valid pointer to a word in calldata_words
size_word = calldata_words[4] = "0000000000000000000000000000000000000000000000000000000000000002"
size = U256::from(size_word) = 2

// (3) are there enough words for this to be an array?
data_start_word_offset = 5
data_end_word_offset = 7
len(calldata_words) = 9 // yes, there are enough words

// (4) decode as an array!
data_words = [
    "0000000000000000000000000000000000000000000000000000000000000456",
    "0000000000000000000000000000000000000000000000000000000000000789",
]

// (5) get potential type of array elements
potential_type = get_potential_type(
    data_words,
    parameter_index,
    calldata_words,
    word,
    data_start_word_offset,
    &mut coverages,
) = "uint16"
word_coverages = {1,4,5,6}

So this is probably an array of uint16s. We'll add uint16[] to potential_inputs and add word_coverages to covered_words:

snippet.rs
potential_inputs = [ParamType::Uint(16), ParamType::Array(Box::new(ParamType::Uint(16)))]
covered_words = {0,1,4,5,6}
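The pointer arithmetic from steps (1) through (3) can be reproduced in standalone code. This is a hedged sketch, not heimdall-rs's implementation; `check_pointer` is a hypothetical helper that returns the size and data-word range when a word is a plausible dynamic-type offset:

```rust
// Hedged sketch of the dynamic-pointer validation steps (1)-(3).
// Offsets are in bytes, relative to the end of the 4-byte selector.
fn check_pointer(word_value: usize, calldata_words: &[&str]) -> Option<(usize, usize, usize)> {
    // (1) a dynamic-type pointer must be a multiple of 32
    if word_value % 32 != 0 {
        return None;
    }
    let word_offset = word_value / 32;
    // (2) the pointed-to word holds the size (element count or byte length)
    let size_word = calldata_words.get(word_offset)?;
    let size = usize::from_str_radix(size_word, 16).ok()?;
    // (3) are there enough words after the size word to hold the data?
    let data_start = word_offset + 1;
    let data_end = data_start + size;
    if data_end > calldata_words.len() {
        return None;
    }
    Some((size, data_start, data_end))
}

fn main() {
    let calldata_words = [
        "0000000000000000000000000000000000000000000000000000000000000123",
        "0000000000000000000000000000000000000000000000000000000000000080",
        "3132333435363738393000000000000000000000000000000000000000000000",
        "00000000000000000000000000000000000000000000000000000000000000e0",
        "0000000000000000000000000000000000000000000000000000000000000002",
        "0000000000000000000000000000000000000000000000000000000000000456",
        "0000000000000000000000000000000000000000000000000000000000000789",
        "000000000000000000000000000000000000000000000000000000000000000d",
        "48656c6c6f2c20776f726c642100000000000000000000000000000000000000",
    ];
    // the word at i = 1 is 0x80 = 128: word_offset 4, size 2, data words 5..7
    assert_eq!(check_pointer(0x80, &calldata_words), Some((2, 5, 7)));
    // the word at i = 3 (0xe0) fails this array check (data_end = 21 > 9),
    // so the decoder falls back to trying `bytes` / `string`
    assert_eq!(check_pointer(0xe0, &calldata_words), None);
}
```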

Now, we'll increment i and continue:

snippet.txt
i = 2
word = 3132333435363738393000000000000000000000000000000000000000000000

This word is not an ABI-encoded dynamic type: although its value happens to be a multiple of 32, as an offset it would point far beyond the end of calldata_words, so it cannot be a pointer to a word in calldata_words. We'll attempt to decode this word as a static type:

snippet.rs
potential_types = ["uint80", "bytes10", "int80"]
get_padding(word) = Padding::Right

So this word is probably a bytes10. We'll add bytes10 to potential_inputs and add i to covered_words:

snippet.rs
potential_inputs = [ParamType::Uint(16), ParamType::Array(Box::new(ParamType::Uint(16))), ParamType::FixedBytes(10)]
covered_words = {0,1,2,4,5,6}

Now, we'll increment i and continue:

snippet.txt
i = 3
word = 00000000000000000000000000000000000000000000000000000000000000e0

This word does pass the first validation step, since U256::from(word) % 32 == 0. We'll attempt to decode this word as an ABI-encoded dynamic type:

snippet.rs
// (1) calculate byte_offset and word_offset
byte_offset = 224
word_offset = 7

// (2) check if word_offset is a valid pointer to a word in calldata_words
size_word = calldata_words[7] = "000000000000000000000000000000000000000000000000000000000000000d"
size = U256::from(size_word) = 13

// (3) are there enough words for this to be an array?
data_start_word_offset = 8
data_end_word_offset = 21
len(calldata_words) = 9 // no, there are not enough words

// (4) decode as bytes!
data_words = [
    "48656c6c6f2c20776f726c642100000000000000000000000000000000000000"
]

// (5) we pass the padding check, since 32 - 13 = 19, and the padding size of this word is 19

// (6) we've covered all words from `data_start_word_offset` to `data_end_word_offset`,
// so add them to `word_coverages`.
word_coverages = {3,7,8}

So this is probably a bytes. We'll add bytes to potential_inputs and add word_coverages to covered_words:

snippet.rs
potential_inputs = [ParamType::Uint(16), ParamType::Array(Box::new(ParamType::Uint(16))), ParamType::FixedBytes(10), ParamType::Bytes]
covered_words = {0,1,2,3,4,5,6,7,8}
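Extracting the actual bytes payload from the data words is straightforward once the size is known. A hedged sketch (standalone code; `decode_bytes` is an illustrative helper, not heimdall-rs's API):

```rust
// Hedged sketch: once a pointer is decoded as `bytes`, the payload is the
// first `size` bytes of the joined data words; the rest is right-padding.
fn decode_bytes(data_words: &[&str], size: usize) -> Vec<u8> {
    let hex: String = data_words.concat();
    (0..size)
        .map(|i| u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).expect("valid hex"))
        .collect()
}

fn main() {
    let data_words = ["48656c6c6f2c20776f726c642100000000000000000000000000000000000000"];
    let size = 13; // from the size word 0x...0d
    let payload = decode_bytes(&data_words, size);
    // the remaining 32 - 13 = 19 bytes of the word are zero padding
    assert_eq!(String::from_utf8(payload).unwrap(), "Hello, world!");
}
```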

We've now covered all words in calldata_words, so we're done! Our final potential_inputs is:

snippet.rs
potential_inputs = [ParamType::Uint(16), ParamType::Array(Box::new(ParamType::Uint(16))), ParamType::FixedBytes(10), ParamType::Bytes]

We can now use this to determine the function's signature and decode the rest of the calldata.
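Joining the recovered types into a canonical signature string is the last step. In this sketch the placeholder name `Unresolved_8be65246` and the `build_signature` helper are my own illustrative conventions, not guaranteed heimdall-rs output:

```rust
// Hedged sketch: build a candidate signature string from the recovered types.
// The original function name is unrecoverable, so a placeholder is used.
fn build_signature(selector: &str, types: &[&str]) -> String {
    format!("Unresolved_{}({})", selector, types.join(","))
}

fn main() {
    let sig = build_signature("8be65246", &["uint16", "uint16[]", "bytes10", "bytes"]);
    assert_eq!(sig, "Unresolved_8be65246(uint16,uint16[],bytes10,bytes)");
    println!("{sig}");
}
```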

result

Everything looks good! We've successfully decoded the raw calldata, including its dynamic types!

Note: The recovered sizes of some parameters, namely the first uint16 and the uint16[], are smaller than the original uint256 and uint32[]. This is expected: ABI encoding left-pads small values with zeros, so without the ABI we can only recover the smallest type that fits each observed value.

Conclusion

In this paper, we've covered how to decode raw calldata, including dynamic types -- all without the contract's ABI or signature resolution. This functionality is automated in heimdall-rs, and I can't wait to see what you create with it!

Note: It's probable that this method is not 100% accurate, and can be iterated on and improved. If you notice any edge cases or bugs that I'm missing, please let me know by opening an issue or PR.


Resources & Citations


  1. The word "decode" is used loosely here. The EVM does not provide any mechanism for decoding calldata, and CALLDATALOAD / CALLDATACOPY only provide access to the raw bytes. For example, if the calldata contains an address, the corresponding assembly to cast the calldata to an address is:

    snippet.txt
    AND(PUSH20(0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF), CALLDATALOAD(4))
  2. This is true in most cases. Some languages, such as Huff, may use function selectors that are shorter than four bytes.

  3. The signature is the function name, followed by a comma-separated list of argument types, enclosed in parentheses. The argument names are not preserved in the signature.


More Reading

Diving Into Smart Contract DecompilationJan 19, 2023

In this article, we will delve deep into the inner workings of the heimdall-rs decompilation module, examining how it performs this conversion at a low level and exploring its various features and capabilities.

Compiler Fingerprinting in EVM BytecodeMay 30, 2024

In this experimental paper, we will dive into EVM bytecode and examine distinct patterns and markers left by different major EVM compilers. We'll also explore the potential for using these patterns to identify the compiler used to generate a given contract's bytecode.

Heimdall-rs 0.8.0 Release NotesMay 14, 2024

The heimdall-rs 0.8.0 release is our largest update to date with 34 merged PRs, hundreds of closed issues, countless hours of work, and six new contributors! Due to the sheer size of this release, we've decided to make a blog post to highlight the most significant changes.