feat: optimize map_abi_data #3697
Conversation
Thanks for the PR! There are a couple of breaking typing changes here that I listed, and a quick benchmark of test_map_abi_data shows minimal improvement on only one test case and either equal or worse performance on the other three, so I'm not quite convinced on this one. If you can show me data that this is more performant, though, I'd be happy to revisit!
How are you benchmarking? Is there a standard method that should be used? The current code does exactly the same stuff as the new code, same function calls and all, but it also creates and populates two lists and creates and iterates through an intermediary chain object, which the new code does not do. I didn't anticipate you'd see a substantial difference, since this is a micro-optimization after all, but a slowdown doesn't make any sense to me when the new PR is "call all the same functions with all the same inputs, minus a few that didn't need calling". We didn't modify the instructions we're sending to the CPU, we just removed a couple. I'd love to look into this further to figure out what's going on.
I did this benchmark with empty args so we're truly comparing apples to apples and not polluting the results with the time it takes to actually format all the data:

from time import time
import itertools
from functools import partial

from cytoolz import pipe

from web3._utils.abi import abi_data_tree, data_tree_map, recursive_map, strip_abi_type


def old(data, types, normalizers):
    pipeline = itertools.chain(
        [abi_data_tree(types)],
        map(data_tree_map, normalizers),
        [partial(recursive_map, strip_abi_type)],
    )
    return pipe(data, *pipeline)


def new(data, types, normalizers):
    return pipe(
        data,
        # 1. Decorating the data tree with types
        abi_data_tree(types),
        # 2. Recursively mapping each of the normalizers to the data
        *map(data_tree_map, normalizers),
        # 3. Stripping the types back out of the tree
        strip_abi_types,
    )


def strip_abi_types(elements):
    return recursive_map(strip_abi_type, elements)


for i in range(5):
    normalizers = (int,) * i

    start = time()
    for _ in range(100_000):
        old((), (), normalizers)
    print(f"old: {time() - start}")

    start = time()
    for _ in range(100_000):
        new((), (), normalizers)
    print(f"new: {time() - start}")
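For a lower-noise version of this comparison, here is a minimal sketch using the standard library's timeit.repeat, assuming the old and new helpers defined above are in scope. Taking the minimum of several repeats is usually less sensitive to background load than a single wall-clock measurement; the setup is illustrative, not part of the PR itself.

```python
import timeit

# Assumes `old` and `new` from the snippet above are already defined in this module.
for i in range(5):
    normalizers = (int,) * i

    # Five repeats of 100_000 calls each; the minimum is the least noisy estimate.
    old_best = min(timeit.repeat(lambda: old((), (), normalizers), number=100_000, repeat=5))
    new_best = min(timeit.repeat(lambda: new((), (), normalizers), number=100_000, repeat=5))

    print(f"{i} normalizer(s) -> old: {old_best:.4f}s, new: {new_best:.4f}s")
```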
We don't have any benchmarking at this granular a level. I just grabbed the test cases from this test, ran each test case 1000 times, and compared main to this branch. Here's the script I was using:
import time

from web3._utils.abi import map_abi_data
from web3._utils.normalizers import (
    BASE_RETURN_NORMALIZERS,
    abi_string_to_text,
    addresses_checksummed,
)


def benchmark_map_abi_data():
    test_cases = [
        # Case 1: Simple bool array and int
        {
            "types": ["bool[2]", "int256"],
            "data": [[True, False], 9876543210],
            "funcs": [
                lambda typ, dat: ((typ, "Tru-dat") if typ == "bool" and dat else (typ, dat)),
                lambda typ, dat: (typ, hex(dat)) if typ == "int256" else (typ, dat),
            ],
        },
        # Case 2: Address normalization
        {
            "types": ["address"],
            "data": ["0x5b2063246f2191f18f2675cedb8b28102e957458"],
            "funcs": BASE_RETURN_NORMALIZERS,
        },
        # Case 3: Address array normalization
        {
            "types": ["address[]"],
            "data": [["0x5b2063246f2191f18f2675cedb8b28102e957458"] * 2],
            "funcs": BASE_RETURN_NORMALIZERS,
        },
        # Case 4: Complex tuple with addresses
        {
            "types": ["(address,address)[]"],
            "data": [
                [
                    (
                        "0x5b2063246f2191f18f2675cedb8b28102e957458",
                        "0xebe0da78ecb266c7ea605dc889c64849f860383f",
                    )
                ]
                * 2
            ],
            "funcs": BASE_RETURN_NORMALIZERS,
        },
        # Case 5: String and address array
        {
            "types": ["(string,address[])"],
            "data": [
                (
                    b"a string",
                    [b"\xf2\xe2F\xbbv\xdf\x87l\xef\x8b8\xae\x84\x13\x0fOU\xde9["],
                )
            ],
            "funcs": [addresses_checksummed, abi_string_to_text],
        },
    ]

    for i, test_case in enumerate(test_cases, 1):
        # Warm up
        for _ in range(100):
            map_abi_data(test_case["funcs"], test_case["types"], test_case["data"])

        # Actual benchmark
        start_time = time.time()
        for _ in range(1000):
            map_abi_data(test_case["funcs"], test_case["types"], test_case["data"])
        end_time = time.time()

        total_time = end_time - start_time
        print(f"Test case {i}: total time for 1000 runs: {total_time:.4f} seconds")


if __name__ == "__main__":
    benchmark_map_abi_data()
Here is the data I was seeing:
Test Case | Types | Main Branch Run Time (s) | Feature Branch Run Time Yesterday (s) | Feature Branch Run Time Today (Run 1) (s) | Feature Branch Run Time Today (Run 2) (s) |
---|---|---|---|---|---|
1 | ['bool[2]', 'int256'] | 0.0933 | 0.0668 | 0.0907 | 0.0972 |
2 | ['address'] | 0.0316 | 0.0321 | 0.0317 | 0.0318 |
3 | ['address[]'] | 0.0604 | 0.0607 | 0.0602 | 0.0604 |
4 | ['(address,address)[]'] | 0.1217 | 0.1268 | 0.1204 | 0.1200 |
5 | ['(string,address[])'] | 0.0855 | 0.0862 | 0.0898 | 0.0847 |
Running it a few more times on this branch today, I'm seeing slight improvements over the feature branch numbers above. I think the type changes are good, and the performance change is minimal but doesn't impact readability, so I'm okay merging, but curious if you have an opinion @fselmo?
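Given the run-to-run variance visible in the table, here is a rough sketch of a tighter measurement, assuming the test_cases list from the script above is available. It times each case with time.perf_counter (monotonic and higher resolution than time.time) and reports the median and best of several repeats; the specific helper name and repeat counts are illustrative.

```python
import statistics
import time

from web3._utils.abi import map_abi_data

# Assumes `test_cases` is the list defined in the benchmark script above.
def time_case(test_case, runs=1000, repeats=7):
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(runs):
            map_abi_data(test_case["funcs"], test_case["types"], test_case["data"])
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), min(samples)


for i, test_case in enumerate(test_cases, 1):
    median_s, best_s = time_case(test_case)
    print(f"Test case {i}: median {median_s:.4f}s, best {best_s:.4f}s over 7 repeats")
```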
> curious if you have an opinion @fselmo?
I like it. Seems to be a slight improvement and doesn't hurt readability. I would certainly like for us to squash the 7 commits on a +24/-19 diff before merging though!
Thanks @BobTheBuidler!
Will squash and merge!
What was wrong?
Title says it all, this is ready for review.
Related to Issue #
Closes #
How was it fixed?
Todo:
Cute Animal Picture