Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Small tweaks to Dict asm size #17096

Merged
merged 1 commit into from
Mar 21, 2018
Merged

Conversation

benaadams
Copy link
Member

@benaadams benaadams commented Mar 21, 2018

Shrinks from #17076

Dictionary``2:Clear() reduction from moving _version++ out of the if was unexpected

Total bytes of diff: -513 (-0.01% of base)
    diff is an improvement.

Total byte diff includes 0 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had    0 unique methods,        0 unique bytes

Top file improvements by size (bytes):
        -513 : System.Private.CoreLib.dasm (-0.01% of base)

1 total files with size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements by size (bytes):
        -319 : System.Private.CoreLib.dasm - Dictionary`2:Clear():this (29 methods)
         -24 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,struct,ubyte):bool:this (4 methods)
         -22 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(struct,struct,ubyte):bool:this (2 methods)
         -20 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(struct,ref,ubyte):bool:this (3 methods)
         -18 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,int,ubyte):bool:this (3 methods)
         -18 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(int,ref,ubyte):bool:this (3 methods)
         -12 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(long,ref,ubyte):bool:this (2 methods)
         -10 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(struct,ubyte,ubyte):bool:this
         -10 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(struct,int,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,ref,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,bool,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(long,bool,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(long,short,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ushort,ref,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(short,long,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(int,int,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(int,ubyte,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,long,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,ushort,ubyte):bool:this

19 total methods with size differences (19 improved, 0 regressed), 17254 unchanged.

/cc @jkotas

@benaadams
Copy link
Member Author

Clear has two returns with the _version++ in the if

Pre

; Lcl frame size = 40
G_M52189_IG01:
       push     rdi
       push     rsi
       sub      rsp, 40
       mov      rsi, rcx
G_M52189_IG02:
       mov      edi, dword ptr [rsi+56]
       test     edi, edi
       jle      SHORT G_M52189_IG04
       mov      rcx, gword ptr [rsi+8]
       mov      r8d, dword ptr [rcx+8]
       xor      edx, edx
       call     Array:Clear(ref,int,int)
       xor      ecx, ecx
       mov      dword ptr [rsi+56], ecx
       mov      dword ptr [rsi+60], -1
       mov      dword ptr [rsi+64], ecx
       inc      dword ptr [rsi+68]
       mov      rcx, gword ptr [rsi+16]
       mov      r8d, edi
       xor      edx, edx
       lea      rax, [(reloc)]
G_M52189_IG03:
       add      rsp, 40
       pop      rsi
       pop      rdi
       rex.jmp  rax
G_M52189_IG04:
       add      rsp, 40
       pop      rsi
       pop      rdi
       ret      
; Total bytes of code 81, prolog size 9 for method Dictionary`2:Clear():this

Post

; Lcl frame size = 40
G_M52189_IG01:
       push     rdi
       push     rsi
       sub      rsp, 40
       mov      rsi, rcx
G_M52189_IG02:
       mov      edi, dword ptr [rsi+56]
       test     edi, edi
       jle      SHORT G_M52189_IG03
       mov      rcx, gword ptr [rsi+8]
       mov      r8d, dword ptr [rcx+8]
       xor      edx, edx
       call     Array:Clear(ref,int,int)
       xor      ecx, ecx
       mov      dword ptr [rsi+56], ecx
       mov      dword ptr [rsi+60], -1
       mov      dword ptr [rsi+64], ecx
       mov      rcx, gword ptr [rsi+16]
       mov      r8d, edi
       xor      edx, edx
       call     Array:Clear(ref,int,int)
G_M52189_IG03:
       inc      dword ptr [rsi+68]
G_M52189_IG04:
       add      rsp, 40
       pop      rsi
       pop      rdi
       ret      
; Total bytes of code 70, prolog size 6 for method Dictionary`2:Clear():this

Array.Clear(_entries, 0, count);
}
_version++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this an improvement? I would expect that it prevents tailcall for Array.Clear. Should the version increment be done as the first thing, similar to insert?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't reduce the size, though keeping the tail call have moved now the two Clears together at end so it can just run through cache setting them before making the calls out

; Lcl frame size = 40
G_M52189_IG01:
       push     rdi
       push     rsi
       sub      rsp, 40
       mov      rsi, rcx
G_M52189_IG02:
       mov      edi, dword ptr [rsi+56]
       test     edi, edi
       jle      SHORT G_M52189_IG04
       xor      r8d, r8d
       mov      dword ptr [rsi+56], r8d
       mov      dword ptr [rsi+60], -1
       mov      dword ptr [rsi+64], r8d
       inc      dword ptr [rsi+68]
       mov      rcx, gword ptr [rsi+8]
       mov      r8d, dword ptr [rcx+8]
       xor      edx, edx
       call     Array:Clear(ref,int,int)
       mov      rcx, gword ptr [rsi+16]
       mov      r8d, edi
       xor      edx, edx
       lea      rax, [(reloc)]
G_M52189_IG03:
       add      rsp, 40
       pop      rsi
       pop      rdi
       rex.jmp  rax
G_M52189_IG04:
       add      rsp, 40
       pop      rsi
       pop      rdi
       ret      
; Total bytes of code 84, prolog size 9 for method Dictionary`2:Clear():this

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@benaadams
Copy link
Member Author

Total bytes of diff: -107 (0.00% of base)
    diff is an improvement.

Total byte diff includes 0 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had    0 unique methods,        0 unique bytes

Top file improvements by size (bytes):
        -107 : System.Private.CoreLib.dasm (0.00% of base)

1 total files with size differences (1 improved, 0 regressed), 0 unchanged.

Top method regessions by size (bytes):
          87 : System.Private.CoreLib.dasm - Dictionary`2:Clear():this (29 methods)

Top method improvements by size (bytes):
         -24 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,struct,ubyte):bool:this (4 methods)
         -22 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(struct,struct,ubyte):bool:this (2 methods)
         -20 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(struct,ref,ubyte):bool:this (3 methods)
         -18 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,int,ubyte):bool:this (3 methods)
         -18 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(int,ref,ubyte):bool:this (3 methods)
         -12 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(long,ref,ubyte):bool:this (2 methods)
         -10 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(struct,ubyte,ubyte):bool:this
         -10 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(struct,int,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,ref,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,bool,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(long,bool,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(long,short,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ushort,ref,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(short,long,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(int,int,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(int,ubyte,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,long,ubyte):bool:this
          -6 : System.Private.CoreLib.dasm - Dictionary`2:TryInsert(ref,ushort,ubyte):bool:this

19 total methods with size differences (18 improved, 1 regressed), 17254 unchanged.

What's better tail call or code size? (I assume Clear isn't called frequently?)

@jkotas
Copy link
Member

jkotas commented Mar 21, 2018

Codesize is better for Clear

@benaadams
Copy link
Member Author

Reverted to original diff as in summary

@jkotas jkotas merged commit 06f6f87 into dotnet:master Mar 21, 2018
dotnet-bot pushed a commit to dotnet/corert that referenced this pull request Mar 21, 2018
dotnet-bot pushed a commit to dotnet/corefx that referenced this pull request Mar 21, 2018
ahsonkhan pushed a commit to dotnet/corert that referenced this pull request Mar 21, 2018
dotnet-bot pushed a commit to dotnet/corefx that referenced this pull request Mar 22, 2018
dotnet-bot pushed a commit to dotnet/corefx that referenced this pull request Mar 22, 2018
dotnet-bot pushed a commit to dotnet/corefx that referenced this pull request Mar 22, 2018
ahsonkhan pushed a commit to dotnet/corefx that referenced this pull request Mar 23, 2018
* Small tweaks to Dict asm size (dotnet/coreclr#17096)

Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>

* Moving Span APIs that allow skipping visibility checks to MemoryMarshal (#17087)

Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>

* Improve DateTime{Offset} "r" and "o" formatting performance (#17092)

Two main changes:
1. Rewrote the formatting to use span, only to then discover that we already had almost exactly the same implementation in Utf8Formatter.  As that one had some extra optimizations around JIT behaviors, I ported that over instead.
2. Avoided [ThreadStatic] lookups unless necessary.

ToString/TryFormat for "o"/"O" improve by ~2.5x.

ToString/TryFormat for "r"/"R" improve by ~3x.

Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>

* Rename {Try}Read/WriteMachineEndian to just {Try}Read/Write (#17106)

Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>

* Fix incorrect array dereference. (dotnet/coreclr#17113)

Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>

* Move Span APIs that allow skipping visibility checks to MemoryMarshal

* Rename {Try}Read/WriteMachineEndian to just {Try}Read/Write

* Update calls to BinaryPrimitives.ReadMachineEndian

* Rename calls to ReadMachineEndian in System.Memory perf tests.

* Add ApiCompatBaseline for UWP NETNative

* Add to ApiCompatBaseline for UWP NETNative netstandard20
@benaadams benaadams deleted the dict-code-size branch March 28, 2018 04:42
jkotas pushed a commit to dotnet/corert that referenced this pull request Sep 18, 2018
…completely, which only showed modest size improvement

* Removing the _version increment from Clear() entirely to bring it in line with the behavior in Remove() and to keep size gains

[tfs-changeset: 1714543]
dotnet-maestro-bot pushed a commit to dotnet-maestro-bot/coreclr that referenced this pull request Sep 18, 2018
…tely, which only showed modest size improvement

* Removing the _version increment from Clear() entirely to bring it in line with the behavior in Remove() and to keep size gains

[tfs-changeset: 1714543]

Signed-off-by: dotnet-bot <[email protected]>
dotnet-maestro-bot pushed a commit to dotnet-maestro-bot/corefx that referenced this pull request Sep 18, 2018
…completely, which only showed modest size improvement

* Removing the _version increment from Clear() entirely to bring it in line with the behavior in Remove() and to keep size gains

[tfs-changeset: 1714543]

Signed-off-by: dotnet-bot <[email protected]>
jkotas pushed a commit to dotnet/corefx that referenced this pull request Sep 18, 2018
…completely, which only showed modest size improvement

* Removing the _version increment from Clear() entirely to bring it in line with the behavior in Remove() and to keep size gains

[tfs-changeset: 1714543]

Signed-off-by: dotnet-bot <[email protected]>
jkotas pushed a commit that referenced this pull request Sep 19, 2018
…which only showed modest size improvement

* Removing the _version increment from Clear() entirely to bring it in line with the behavior in Remove() and to keep size gains

[tfs-changeset: 1714543]

Signed-off-by: dotnet-bot <[email protected]>
@danmoseley
Copy link
Member

This was reverted, because

Executing the following code snippet
            Dictionary<string, string> dictionary = new Dictionary<string, string>();
            var valuesEnum = dictionary.GetEnumerator();
                                                          
            dictionary.Clear();
            valuesEnum.MoveNext();
produces an InvalidOperationException. The enumerator is invalidated regardless of the fact that the underlying dictionary collection has not changed.

This snippet:
            Dictionary<string, string> dictionary = new Dictionary<string, string>();
            dictionary.Add("key", "value_0");
            var valuesEnum = dictionary.GetEnumerator();
            dictionary.TryAdd("key", "value_1"); // this should not invalidate the enumerator

            valuesEnum.MoveNext();

Also invalidates the iterator, even though TryAdd fails, returns false and does not modify the dictionary values.

@danmoseley
Copy link
Member

@jkotas I assume we should add tests for the above, since apparently we didn't have any, or is this now covered by changes that went into dotnet/corefx@eca3f8d#diff-c0be16052840e601127fc0a5e7b5c3af ?

@jkotas
Copy link
Member

jkotas commented Sep 21, 2018

Right, @A-And is working on adding tests for this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants