Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Enable FPGA max_concurrency memory attribute #80

Merged
merged 4 commits into from
Apr 22, 2019

Conversation

vmaksimo
Copy link
Contributor

Signed-off-by: Viktoria Maksimova [email protected]

Copy link
Contributor

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an attribute definitely useful on FPGA.

@@ -1474,6 +1482,22 @@ def IntelFPGANumBanks : Attr {
}];
}

def IntelFPGAMaxConcurrency : InheritableAttr {
let Spellings = [GNU<"max_concurrency">, CXX11<"intelfpga","max_concurrency">];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps intelfpga would be clearer if written intel_fpga.
Are these attributes defined to be compatible with some other tools or are just brand new attributes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these attributes defined to be compatible with some other tools or are just brand new attributes?

You can already use them, for example, in Intel® FPGA SDK for OpenCL™ 19.1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. With the "intelfpga::" namespace?
But honestly I have no problem about having Intel extension harder to use or unclear. ;-)

let Heading = "max_concurrency (IntelFGPA)";
let Content = [{
This attribute may be attached to a variable or struct member declaration and
instructs the backend to replicate the memory generated for the variable or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory generated might be unclear for people not familiar with FPGA.
But I do not now what to have instead. memory synthesized?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory generated might be unclear for people not familiar with FPGA.
But I do not now what to have instead. memory synthesized?

I agree.

@@ -53,7 +53,22 @@ void foo1()
//CHECK: IntelFPGANumBanksAttr
//CHECK-NEXT: ConstantExpr
//CHECK-NEXT: IntegerLiteral{{.*}}4{{$}}
__attribute__((__numbanks__(4))) unsigned int v_six2[32];
[[intelfpga::numbanks(4)]] unsigned int v_six2[32];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearer, indeed!

//CHECK: IntelFPGAMaxConcurrencyAttr
//CHECK-NEXT: ConstantExpr
//CHECK-NEXT: IntegerLiteral{{.*}}4{{$}}
__attribute__((max_concurrency(4)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little bit scared about naming conflicts with thousands of top-level attributes landing from many vendors...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. We'll think on how to solve this for C-style attributes.

Signed-off-by: Viktoria Maksimova <[email protected]>
@MrSidims
Copy link
Contributor

@bader @vladimirlaz why it's on 'Review required' state yet if I've already approved?

@bader
Copy link
Contributor

bader commented Apr 20, 2019

@MrSidims I think to change the state, someone with write access has to approve a PR.

@MrSidims
Copy link
Contributor

@MrSidims I think to change the state, someone with write access has to approve a PR.

Oh, I see, I thought the special rights are about merging PR not about reviewing.

@asavonic
Copy link
Contributor

@romanovvlad can you check this patch and merge if it's ok?

Copy link
Contributor

@asavonic asavonic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@romanovvlad romanovvlad merged commit 978ee22 into intel:sycl Apr 22, 2019
bb-sycl pushed a commit that referenced this pull request Jan 10, 2020
Summary:
This patch fixes pr23772  [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80".
The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable.
To enforce that the instruction t2(ADD|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP).
Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations.
When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant )
It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt).

Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma

Reviewed By: efriedma

Subscribers: john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70680
bb-sycl pushed a commit that referenced this pull request Jan 14, 2020
Summary:
This patch fixes pr23772  [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80".
The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable.
To enforce that the instruction t2(ADD|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP).
Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations.
When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant )
It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt).

Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma, andreadb

Reviewed By: efriedma

Subscribers: gbedwell, john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70680
vmaksimo pushed a commit to vmaksimo/llvm that referenced this pull request Mar 3, 2021
  CONFLICT (content): Merge conflict in clang/lib/Frontend/CompilerInstance.cpp
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Feb 23, 2023
iclsrc pushed a commit that referenced this pull request Sep 21, 2023
…… (#67069)

We noticed some performance issue while in lldb-vscode for grabing the
name of the SBValue. Profiling shows SBValue::GetName() can cause
synthetic children provider of shared/unique_ptr to deference underlying
object and complete it type.

This patch lazily moves the dereference from synthetic child provider's
Update() method to GetChildAtIndex() so that SBValue::GetName() won't
trigger the slow code path.

Here is the culprit slow code path:
```
...
frame #59: 0x00007ff4102e0660 liblldb.so.15`SymbolFileDWARF::CompleteType(this=<unavailable>, compiler_type=0x00007ffdd9829450) at SymbolFileDWARF.cpp:1567:25 [opt]
...
frame #67: 0x00007ff40fdf9bd4 liblldb.so.15`lldb_private::ValueObject::Dereference(this=0x0000022bb5dfe980, error=0x00007ffdd9829970) at ValueObject.cpp:2672:41 [opt]
frame #68: 0x00007ff41011bb0a liblldb.so.15`(anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::Update(this=0x000002298fb94380) at LibStdcpp.cpp:403:40 [opt]
frame #69: 0x00007ff41011af9a liblldb.so.15`lldb_private::formatters::LibStdcppSharedPtrSyntheticFrontEndCreator(lldb_private::CXXSyntheticChildren*, std::shared_ptr<lldb_private::ValueObject>) [inlined] (anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::LibStdcppSharedPtrSyntheticFrontEnd(this=0x000002298fb94380, valobj_sp=<unavailable>) at LibStdcpp.cpp:371:5 [opt]
...
frame #78: 0x00007ff40fdf6e42 liblldb.so.15`lldb_private::ValueObject::CalculateSyntheticValue(this=0x000002296c66a500) at ValueObject.cpp:1836:27 [opt]
frame #79: 0x00007ff40fdf1939 liblldb.so.15`lldb_private::ValueObject::GetSyntheticValue(this=<unavailable>) at ValueObject.cpp:1867:3 [opt]
frame #80: 0x00007ff40fc89008 liblldb.so.15`ValueImpl::GetSP(this=0x0000022c71b90de0, stop_locker=0x00007ffdd9829d00, lock=0x00007ffdd9829d08, error=0x00007ffdd9829d18) at SBValue.cpp:141:46 [opt]
frame #81: 0x00007ff40fc7d82a liblldb.so.15`lldb::SBValue::GetSP(ValueLocker&) const [inlined] ValueLocker::GetLockedSP(this=0x00007ffdd9829d00, in_value=<unavailable>) at SBValue.cpp:208:21 [opt]
frame #82: 0x00007ff40fc7d817 liblldb.so.15`lldb::SBValue::GetSP(this=0x00007ffdd9829d90, locker=0x00007ffdd9829d00) const at SBValue.cpp:1047:17 [opt]
frame #83: 0x00007ff40fc7da6f liblldb.so.15`lldb::SBValue::GetName(this=0x00007ffdd9829d90) at SBValue.cpp:294:32 [opt]
...
```

Differential Revision: https://reviews.llvm.org/D159542
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants