ReadStringBytes allocates #145

Closed
Dieterbe opened this issue Feb 13, 2016 · 4 comments

Comments

@Dieterbe

Hello,
Thank you very much for your work; it works very nicely.
One thing I noticed, though:
in my app, github.com/tinylib/msgp/msgp.ReadStringBytes is the largest allocator of objects (note: it is far from the largest cause of memory usage; it allocates very often, but usually not much each time).

Is there anything that can be done? I'm using the Unmarshal method.

(pprof) list ReadStringBytes
Total: 5227186039
ROUTINE ======================== github.com/tinylib/msgp/msgp.ReadStringBytes in /home/ubuntu/.go_workspace/src/github.com/tinylib/msgp/msgp/read_bytes.go
 851134023  851134023 (flat, cum) 16.28% of Total
         .          .    779:// - ErrShortBytes (b not long enough)
         .          .    780:// - TypeError{} (not 'str' type)
         .          .    781:// - InvalidPrefixError
         .          .    782:func ReadStringBytes(b []byte) (string, []byte, error) {
         .          .    783:   v, o, err := ReadStringZC(b)
 851134023  851134023    784:   return string(v), o, err
         .          .    785:}
         .          .    786:
         .          .    787:// ReadStringAsBytes reads a 'str' object
         .          .    788:// into a slice of bytes. 'v' is the value of
         .          .    789:// the 'str' object, which may reside in memory
(pprof) %

Thanks.

@philhofer
Member

Unfortunately, this is a side-effect of the semantics of string in the language. This is as good as it gets if you want to deal with immutable strings. (Internally, string(v) will copy the slice into a new location on the heap, and must do so because mutating the []byte from which it came must not mutate the string.) (Note that the call is doing precisely one allocation each time you call it.)
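To make the copy concrete, here is a minimal standard-library sketch (it does not use msgp). The helper `toString` is a hypothetical name that only mirrors the `string(v)` conversion on the last line of ReadStringBytes:

```go
package main

import "fmt"

// toString is a hypothetical helper that performs the same []byte-to-string
// conversion as the final line of ReadStringBytes. The conversion copies the
// bytes, so the resulting string does not share memory with v.
func toString(v []byte) string {
	return string(v)
}

func main() {
	buf := []byte("hello")
	s := toString(buf)

	// Mutate the source slice after the conversion.
	buf[0] = 'j'

	// The string is unaffected because it owns its own copy.
	fmt.Println(s, string(buf)) // prints "hello jello"
}
```

If the conversion shared memory with `buf`, the write to `buf[0]` would change `s`, which the language does not allow.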

Things will go faster if you use []byte over string and pre-populate your structure with a slice from a memory pool. (The generated code will copy the data into the slice rather than allocating a new one.)
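A rough sketch of that idea, using only the standard library: the `Record` type, `decodeName`, and `bufPool` below are hypothetical names for illustration, and `decodeName` merely stands in for what generated code is described as doing (copying into the existing slice), not the actual generated code.

```go
package main

import (
	"fmt"
	"sync"
)

// Record is a hypothetical message type that stores its text field as
// []byte instead of string so the backing array can be reused.
type Record struct {
	Name []byte
}

// bufPool is a hypothetical pool of reusable buffers. Pointers to slices
// are stored so that Put does not allocate when boxing the value.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 64)
		return &b
	},
}

// decodeName stands in for the decoding step: it copies raw into the
// existing slice and only allocates if the capacity is too small.
func decodeName(r *Record, raw []byte) {
	r.Name = append(r.Name[:0], raw...)
}

func main() {
	bp := bufPool.Get().(*[]byte)
	rec := Record{Name: (*bp)[:0]} // pre-populate the field from the pool

	for _, raw := range [][]byte{[]byte("alpha"), []byte("beta")} {
		decodeName(&rec, raw)
		fmt.Printf("%s (cap %d)\n", rec.Name, cap(rec.Name))
	}

	// Hand the (possibly grown) buffer back for the next message.
	*bp = rec.Name[:0]
	bufPool.Put(bp)
}
```

The trade-off is that the field's contents are only valid until the buffer is reused, so anything that must outlive the record has to be copied out.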

Things get faster still if you use ReadStringZC, which elides the copy entirely, but also aliases the original slice of data.
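To illustrate what aliasing means in practice, here is a small standard-library sketch. `readFixStrZC` is a hypothetical stand-in for a zero-copy reader, not msgp's implementation: it handles only the MessagePack "fixstr" encoding (prefix byte 0xa0 to 0xbf, length in the low five bits) and returns a sub-slice of its input.

```go
package main

import (
	"errors"
	"fmt"
)

// readFixStrZC is a hypothetical, simplified zero-copy reader. It accepts
// only MessagePack fixstr values and returns v as a sub-slice of b, so no
// bytes are copied and v shares memory with b.
func readFixStrZC(b []byte) (v, rest []byte, err error) {
	if len(b) == 0 {
		return nil, b, errors.New("short bytes")
	}
	if b[0]&0xe0 != 0xa0 {
		return nil, b, errors.New("not a fixstr")
	}
	n := int(b[0] & 0x1f)
	if len(b) < 1+n {
		return nil, b, errors.New("short bytes")
	}
	return b[1 : 1+n], b[1+n:], nil
}

func main() {
	msg := []byte{0xa2, 'h', 'i', 0xc0} // fixstr "hi" followed by nil

	v, rest, err := readFixStrZC(msg)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s, %d byte(s) left\n", v, len(rest))

	// Because v aliases msg, overwriting the input changes v as well.
	msg[1] = 'H'
	fmt.Printf("%s\n", v)
}
```

The consequence is that a zero-copy result is only safe to use while the input buffer is left untouched; if the buffer is reused for the next message, the value has to be copied first.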

@philhofer changed the title from "ReadStringBytes causes a lot of allocations" to "ReadStringBytes allocates" on Feb 13, 2016

@Dieterbe
Author

Fair enough.

Just to make sure we're on the same page:

> (..) if you want to deal with immutable strings.

I would be OK with mutable strings if that improves performance, but this is probably not an option since Go only supports immutable strings, right?

Thanks for your response, and feel free to close this if there's nothing else to do.

@philhofer
Member

Yes; Go's string is immutable by design. It has benefits (everyone can safely reference the same string object) and drawbacks (allocation, as you've identified here). Of course, []byte is effectively just a mutable string.

Another alternative, if you're using a fixed set of strings, is to do string interning and transfer the data as some combination of integers and strings.
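As a sketch of the interning half of that suggestion (the `Interner` type and its `Intern` method are hypothetical names, not part of msgp): each distinct value is converted to a string once, and later sightings return the stored string. It relies on the standard Go compiler's optimization that a map lookup written as `m[string(b)]` does not allocate a temporary string.

```go
package main

import "fmt"

// Interner is a hypothetical table of previously seen strings. It is not
// safe for concurrent use without external locking.
type Interner map[string]string

// Intern returns a string equal to b, allocating only the first time a
// given value is seen.
func (in Interner) Intern(b []byte) string {
	// The standard gc compiler performs this lookup without allocating
	// a temporary string for the key.
	if s, ok := in[string(b)]; ok {
		return s
	}
	s := string(b) // first sighting: one allocation
	in[s] = s
	return s
}

func main() {
	in := Interner{}
	a := in.Intern([]byte("metric.name"))
	b := in.Intern([]byte("metric.name"))
	fmt.Println(a == b, len(in)) // prints "true 1"
}
```

This only pays off when the set of distinct strings is small and bounded; with unbounded input the table grows without limit.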

@Dieterbe
Author

> Things get faster still if you use ReadStringZC, which elides the copy entirely, but also aliases the original slice of data.

I think I missed this point last time we spoke. Are there any docs with more specifics on how to do this? I couldn't find it in the docs or on the wiki. Thanks.
