-
Notifications
You must be signed in to change notification settings - Fork 11.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Base64 library to utils #2884
Add Base64 library to utils #2884
Conversation
73b6c96
to
00e6097
Compare
|
||
import "../utils/Base64.sol"; | ||
|
||
contract Base64Mock { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just my 2 cents but as an alternative, you can have the following line at the top of the contract:
using Base64 for bytes;
and then in your encode:
value.encode();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it looks better just using the .encode
but I'm following the same pattern used for Strings mock. I'll wait if there's somebody from the OZ team who has input on this and why to use one or the other.
e87d3b7
to
184adb8
Compare
184adb8
to
7c187e0
Compare
Hello @ernestognw, thank you very much for this PR, and sorry we didn't answer it sooner. This is really something interesting that we would love to merge. I addition to actually providing code and tests, you have done a great job documenting the changes. One of the reasons for the late answer, is we have been reluctant to dive into the assembly code. The tests are showing it works, but we are still not really comfortable with it. Our opinion is that this code would not be paid for, and just be used as part of "view" operation. Thus it might not be necessary to make it as efficient as possible, and maybe we would prefer having more easily understandable code. Have you tried writing it in pure solidity, without any assembly ? |
I just stumbled upon this PR and decided to give it a try myself to check how the new optimizer in the compiler would fare against that hand-optimized code and see what we could improve. Here's a version of the library in pure Solidity (WARNING: not tested at all, might be buggy): library Base64 {
bytes internal constant _TABLE = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
function encode(bytes memory data) internal pure returns (string memory) {
if (data.length == 0) return "";
bytes memory table = _TABLE;
uint256 encodedLen = 4 * ((data.length + 2) / 3);
bytes memory result = new bytes(encodedLen);
unchecked {
uint resultPtr = 0;
for (uint dataPtr = 0; dataPtr + 3 <= data.length; dataPtr += 3) {
uint input =
uint(uint8(data[dataPtr ])) << 16 +
uint(uint8(data[dataPtr + 1])) << 8 +
uint(uint8(data[dataPtr + 2]));
result[resultPtr++] = table[((input >> 18) & 0x3f)];
result[resultPtr++] = table[((input >> 12) & 0x3f)];
result[resultPtr++] = table[((input >> 6) & 0x3f)];
result[resultPtr++] = table[( input & 0x3f)];
}
if (data.length % 3 == 2) {
result[resultPtr + 1] = 0x3d;
result[resultPtr + 2] = 0x3d;
}
else if (data.length % 3 == 1)
result[resultPtr + 1] = 0x3d;
}
return string(result);
}
} You can see the resulting assembly at the end of this post. Unfortunately looks like there are still a few things that the optimizer could do but does not:
For now as you can see, the hand-optimized version should be cheaper. I think I'll add this code as a test case for the optimizer though and it should improve in future versions of the compiler. Pure Solidity version passed through Yul optimizerThis is the result from running function memory_array_index_access_bytes(baseRef, index) -> addr
{
if iszero(lt(index, mload(baseRef)))
{
mstore(0, shl(224, 0x4e487b71))
mstore(4, 0x32)
revert(0, 0x24)
}
addr := add(add(baseRef, index), 32)
}
function read_from_memoryt_bytes1(ptr) -> returnValue
{
returnValue := and(mload(ptr), shl(248, 255))
}
function convert_bytes1_to_uint8(value) -> converted
{ converted := shr(248, value) }
function convert_uint8_to_uint256(value) -> converted
{ converted := and(value, 0xff) }
function fun_encode(var_data_mpos) -> var__mpos
{
if iszero(mload(var_data_mpos))
{
var__mpos := allocate_memory_array_string()
leave
}
let expr_35_mpos := copy_literal_to_memory_84d8a590de33e00cbdc16e1f28c3506f5ec15c599fab9a6a4bcd575cc2f110ce()
let expr_mpos := allocate_and_zero_memory_array_bytes(checked_mul_uint256(checked_div_uint256(checked_add_uint256(mload(var_data_mpos)))))
let var_resultPtr := 0x00
let var_dataPtr := var_resultPtr
for { }
0x01
{
var_dataPtr := add(var_dataPtr, 0x03)
}
{
let _1 := 0x03
if gt(add(var_dataPtr, _1), mload(var_data_mpos)) { break }
let _2 := convert_uint8_to_uint256(convert_bytes1_to_uint8(read_from_memoryt_bytes1(memory_array_index_access_bytes(var_data_mpos, var_dataPtr))))
let _3 := 0x01
let _4 := shl(add(0x10, convert_uint8_to_uint256(convert_bytes1_to_uint8(read_from_memoryt_bytes1(memory_array_index_access_bytes(var_data_mpos, add(var_dataPtr, _3)))))), _2)
let _5 := 0x02
let result := shl(add(0x08, convert_uint8_to_uint256(convert_bytes1_to_uint8(read_from_memoryt_bytes1(memory_array_index_access_bytes(var_data_mpos, add(var_dataPtr, _5)))))), _4)
let _6 := 0x3f
mstore8(memory_array_index_access_bytes(expr_mpos, var_resultPtr), byte(0x00, read_from_memoryt_bytes1(memory_array_index_access_bytes(expr_35_mpos, and(shr(18, result), _6)))))
mstore8(memory_array_index_access_bytes(expr_mpos, add(var_resultPtr, _3)), byte(0x00, read_from_memoryt_bytes1(memory_array_index_access_bytes(expr_35_mpos, and(shr(12, result), _6)))))
mstore8(memory_array_index_access_bytes(expr_mpos, add(var_resultPtr, _5)), byte(0x00, read_from_memoryt_bytes1(memory_array_index_access_bytes(expr_35_mpos, and(shr(6, result), _6)))))
let _7 := read_from_memoryt_bytes1(memory_array_index_access_bytes(expr_35_mpos, and(result, _6)))
let _8 := add(var_resultPtr, _1)
var_resultPtr := add(var_resultPtr, 0x04)
mstore8(memory_array_index_access_bytes(expr_mpos, _8), byte(0x00, _7))
}
let _9 := mod(mload(var_data_mpos), 0x03)
switch eq(_9, 0x02)
case 0 {
if eq(_9, 0x01)
{
mstore8(memory_array_index_access_bytes(expr_mpos, add(var_resultPtr, 0x01)), 0x3d)
}
}
default {
mstore8(memory_array_index_access_bytes(expr_mpos, add(var_resultPtr, 0x01)), 0x3d)
mstore8(memory_array_index_access_bytes(expr_mpos, add(var_resultPtr, 0x02)), 0x3d)
}
var__mpos := expr_mpos
} Inline assembly version passed through Yul optimizerfunction checked_add_uint256_794(x) -> sum
{
if gt(x, not(32)) { panic_error_0x11() }
sum := add(x, 0x20)
}
function fun_encode(var_data_mpos) -> var_mpos
{
if iszero(mload(var_data_mpos))
{
var_mpos := allocate_memory_array_string()
leave
}
let expr_37_mpos := copy_literal_to_memory_84d8a590de33e00cbdc16e1f28c3506f5ec15c599fab9a6a4bcd575cc2f110ce()
let expr := checked_mul_uint256(checked_div_uint256(checked_add_uint256(mload(var_data_mpos))))
let expr_mpos := allocate_and_zero_memory_array_string(checked_add_uint256_794(expr))
mstore(expr_mpos, expr)
let usr$dataPtr := var_data_mpos
let usr$endPtr := add(var_data_mpos, mload(var_data_mpos))
let usr$resultPtr := add(expr_mpos, 0x20)
for { } lt(usr$dataPtr, usr$endPtr) { }
{
let _1 := 0x03
usr$dataPtr := add(usr$dataPtr, _1)
let usr$input := mload(usr$dataPtr)
let _2 := 1
let _3 := 0x3F
let _4 := mload(add(add(expr_37_mpos, and(shr(18, usr$input), _3)), _2))
let _5 := 248
mstore(usr$resultPtr, shl(_5, _4))
mstore(add(usr$resultPtr, _2), shl(_5, mload(add(add(expr_37_mpos, and(shr(12, usr$input), _3)), _2))))
mstore(add(usr$resultPtr, 0x02), shl(_5, mload(add(add(expr_37_mpos, and(shr(6, usr$input), _3)), _2))))
mstore(add(usr$resultPtr, _1), shl(_5, mload(add(add(expr_37_mpos, and(usr$input, _3)), _2))))
usr$resultPtr := add(usr$resultPtr, 0x04)
}
switch mod(mload(var_data_mpos), 0x03)
case 1 {
mstore(add(usr$resultPtr, not(1)), shl(240, 15677))
}
case 2 {
mstore(add(usr$resultPtr, not(0)), shl(248, 61))
}
var_mpos := expr_mpos
} |
Note I'm trying to evaluate the different implementation. It's been long overdue, but we would want this to be part of our next release. I'm afraid to say that @cameel's implementation is not correct. It just doesn't return the expected value. I'll be trying to fix it so I can then accurately measure the impact on gas cost (if that was ever called as part of a transaction) and deployment cost. |
There is a major issue with the solidity implementation, and that is that, if the data length is not a multiple of 3, the last iteration of the loop will fail. Reading |
Here is a working solidity implementation. Even with optimisation is is SIGNIFICANTLY more expensive to run:
|
The bytecode for the solidity version is 77.2% larger. This increases the deployment cost by 156k gas. I'll go for the assembly any days. Its actually pretty clear / simple when you known of base64 encoding works |
contracts/utils/Base64.sol
Outdated
uint256 encodedLen = 4 * ((data.length + 2) / 3); | ||
|
||
// Add some extra buffer at the end required for the writing | ||
string memory result = new string(encodedLen + 32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need the additional space for ?
the padding equals already fit into the encodedLen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's because the code operates on whole 32-byte slots rather than single bytes. It uses mstore
instead of mstore8
so when setting the value for a byte it also writes over 31 bytes after it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then lets use mstore8 !
Thanks for fixing it. Like I said, my version was not tested at all. I was mostly interested in a rough comparison of the generated bytecode to get a good picture of whether going to assembly here is worth it here or not and for that purpose it was good enough.
Some of it is because your version does not have the But yeah, it's not the only thing that makes it more expensive. I'm pretty disappointed by stuff that was not optimized out too. In the current version of the compiler the Solidity code is much more expensive. Just wanted to let you know that many of the missed optimization opportunities here are things we're actively working on right now so it's not going to stay like this indefinitely. |
Are we still need something to get this PR merged? 🤔 |
@Drjacky We still need some internal review. This should make it into 4.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
I'm happy it's getting merged soon. To be honest, the optimization discussion went out of my scope, so thanks @cameel for providing extra examples and doing some tests. Let me know if there's anything else I can do. Otherwise here's a lovely meme: |
Fixes #2859
Description
As previously discussed on #2859, seems like Base64 has been used as a way to construct Safe-URL
tokenURIs
forERC721
. This PR aims to provide a standard library to perform those operations through Open Zeppelin utils.PR Checklist
Notes
I haven't been able to run docs locally to start creating the documentation for this, and according to the checklist, I don't know how to update the Changelog entry. Would be helpful to have some guide from one of the team members