Phobos: Enable x87 asm versions for std.math.{tan,expi} #2855

Merged
merged 2 commits into from
May 15, 2019

Member

@kinke kinke commented Sep 22, 2018

Following #2854.

Member Author

kinke commented Sep 22, 2018

Hmm, calling another function by name directly doesn't work in naked DMD-style asm if compiled as PIC, so the return-real.nan-workaround by calling a helper function isn't viable either...


pineapplemachine commented Sep 24, 2018

I haven't got an inline asm tangent or sincos implementation to work with LDC yet. However, I did have some apparent success with this particular part of the problem by declaring immutable nan = real.nan for DMD and static immutable nan = real.nan for LDC (each failed to compile for the other) and then writing e.g. fld nan[RBP]; ret; inside the asm block instead of return real.nan; outside of it.

Member Author

kinke commented Sep 24, 2018

> static immutable nan = real.nan

That should have the same problem when compiled as PIC (-relocation-model=pic): the global won't be directly accessible.

@kinke kinke changed the title Phobos: Adapt & enable x87 asm version of std.math.tan() WIP: Phobos: Adapt & enable x87 asm version of std.math.tan() Sep 24, 2018
Member Author

kinke commented May 12, 2019

Revived this with a viable workaround for PIC (pass real.nan as extra default parameter); the x87 asm version is more than 3x faster than the generic 'software' one on my Ivy Bridge, tested on Linux x64 with -O and:

void main()
{
    enum MAXITER = 1 << 20;

    import std.datetime.stopwatch;
    import std.math;
    import std.stdio;

    real sum = 0;

    auto sw = StopWatch(AutoStart.yes);
    for(uint i; i < MAXITER; i++)
        sum += tan(real(i)); // also tested with `real(i) / MAXITER`
    sw.stop();

    writeln(sw.peek.total!"msecs", "ms\t", sum);
}
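
The default-parameter workaround mentioned above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the name tanWithNanParam is hypothetical, and the body uses plain D instead of naked x87 asm. The point it tries to show is that D evaluates a default argument at the call site, so the callee receives real.nan as an ordinary parameter and never has to address a global, which is what fails under PIC.

```d
import std.math : isInfinity, isNaN, tan;

// Hypothetical helper, not the function added by this PR.
// The caller materializes real.nan through the default argument,
// so the body never needs to reference a global constant.
real tanWithNanParam(real x, real nanValue = real.nan)
{
    // Stand-in for the asm path: invalid inputs return the NaN that was passed in.
    if (isNaN(x) || isInfinity(x))
        return nanValue;

    // Stand-in for the x87 computation.
    return tan(x);
}

void main()
{
    import std.stdio : writeln;

    writeln(tanWithNanParam(real.infinity)); // takes the NaN path
    writeln(tanWithNanParam(0.5L));          // takes the regular path
}
```

In a real naked-asm body the NaN would presumably be loaded from the parameter's stack slot; that part is omitted here because it depends on the ABI.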

Member Author

kinke commented May 12, 2019

The x87 asm for std.math.expi() is faster by ~5x (expi(real(i) / MAXITER)) to ~12x (expi(real(i))) on my box (edit: and still 3-5x faster than a llvm_cos(x) + llvm_sin(x)*1i implementation with an additional -ffast-math).
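
For readers unfamiliar with expi, the "cos(x) + sin(x)*1i" baseline mentioned above might look something like the sketch below. It is an assumption-laden illustration: expiReference is a hypothetical name, std.math.cos and std.math.sin stand in for the llvm_cos and llvm_sin intrinsics, and the result uses std.complex.Complex rather than the built-in complex type that std.math.expi returned at the time.

```d
import std.complex : Complex, complex;
import std.math : cos, sin;

// Hypothetical reference implementation of e^(i*x) = cos(x) + i*sin(x).
// Two separate library calls, in contrast to a single x87 fsincos instruction.
Complex!real expiReference(real x)
{
    return complex(cos(x), sin(x));
}

void main()
{
    import std.stdio : writeln;

    auto z = expiReference(0.25L);
    writeln(z.re, " ", z.im);
}
```

Swapping a function like this into the benchmark loop above in place of tan, and summing one of the components, would give a comparable baseline measurement.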

@kinke kinke changed the title WIP: Phobos: Adapt & enable x87 asm version of std.math.tan() Phobos: Adapt & enable x87 asm version of std.math.tan() May 12, 2019
@kinke kinke changed the title Phobos: Adapt & enable x87 asm version of std.math.tan() Phobos: Enable x87 asm versions for std.math.{tan,expi} May 12, 2019
@kinke kinke merged commit 9c0479b into ldc-developers:master May 15, 2019
@kinke kinke deleted the tan branch May 15, 2019 19:57