Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive Allocation for StringExp with Cache #3490

Closed
looked-at-me opened this issue Jul 3, 2020 · 3 comments
Closed

Excessive Allocation for StringExp with Cache #3490

looked-at-me opened this issue Jul 3, 2020 · 3 comments

Comments

@looked-at-me
Copy link
Contributor

There's an issue where e->toChars() linked below, allocates new memory and the string can be rather large sometimes. It is building a "header" pretty version of the string, so for a large string it allocates quite a bit extra memory (x3) each time. I tried it with peekString() and was able to reduce compile time of one my projects by about 1 second (from a 19 second total). I import("...") quite a few files but the sizes aren't that big but still quite significant considering what this code is doing.

Don't think peekString() would be portable though, all my strings are just regular strings. Any better ideas?

Some of the other optimizations below wouldn't be necessary if something else was used instead of the actual string. But otherwise they are quite slow for larger strings.

https://github.com/ldc-developers/ldc/blob/v1.22.0/gen/toir.cpp#L394

    llvm::StringMap<llvm::GlobalVariable *> *stringLiteralCache =
        stringLiteralCacheForType(cty);
    LLConstant *_init = buildStringLiteralConstant(e, true);           // *** These can go into the if statement below don't think they have side effects
                                                                       // *** only used to construct gvar first time
    const auto at = _init->getType();                                  // *** ditto

    llvm::StringRef key(e->toChars());                                 // *** allocates memory excessively
    llvm::GlobalVariable *gvar =
        (stringLiteralCache->find(key) == stringLiteralCache->end())   // *** can eliminate the operator[] here, avoid a second scan.
            ? nullptr
            : (*stringLiteralCache)[key];
    if (gvar == nullptr) {
      llvm::GlobalValue::LinkageTypes _linkage =
          llvm::GlobalValue::PrivateLinkage;
      IF_LOG {
        Logger::cout() << "type: " << *at << '\n';
        Logger::cout() << "init: " << *_init << '\n';
      }
      gvar = new llvm::GlobalVariable(gIR->module, at, true, _linkage, _init,
                                      ".str");
      gvar->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::Global);
      (*stringLiteralCache)[key] = gvar;
    }
@kinke
Copy link
Member

kinke commented Jul 3, 2020

Seems to make perfect sense, would you mind opening a PR with your suggested changes?

@looked-at-me
Copy link
Contributor Author

looked-at-me commented Jul 3, 2020

Yah I'll make them tomorrow. The only thing is, I'm not sure what to do about eliminating e->toChars() which allocates a new string (x3) size, but it does it a portable way expanding it into an string literial representation: "\x00\x01\0x2". My quick test just used peekString(), but that'd only work if StringExp is char[], it won't work for wchar[], or dchar[]. I'll need to look at the StringRef representation, if it ignores \0 it could work with just a peekString() with appropriate sizes. I'm not that familiar with wchar or dchar though.

I'll make a PR tomorrow, and look into this a bit more maybe tomorrow.

@kinke
Copy link
Member

kinke commented Jul 3, 2020

Yeah, StringExp::toChars() seems to be wrong (not overridden, so Expression::toChars() here). peekData() (not in C++ interface yet) seems to be the right thing (raw bytes). From the llvm::StringMap docs:

/// StringMap - This is an unconventional map that is specialized for handling
/// keys that are "strings", which are basically ranges of bytes.

So I hope they can deal with intermediate zeros for w/dstrings too. It works with llvm::StringRef, which is basically a char[] in D, i.e., has a length.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants