Fixing up PLIO compiler support and creating an example #1623

Merged 14 commits on Jul 23, 2024
7 changes: 5 additions & 2 deletions include/aie/Dialect/AIE/IR/AIEOps.td
@@ -1538,7 +1538,9 @@ def AIE_ShimDMAAllocationOp : AIE_Op<"shim_dma_allocation", [HasParent<"DeviceOp
ins FlatSymbolRefAttr:$sym_name,
DMAChannelDir:$channel_dir,
AIEI64Attr:$channel_index,
-AIEI64Attr:$col
+AIEI64Attr:$col,
+// If this is set we are using the PLIO in this ShimTile
+DefaultValuedAttr<BoolAttr, "false">:$plio
);

let results = (outs);
@@ -1634,7 +1636,8 @@ def AIE_ObjectFifoCreateOp: AIE_Op<"objectfifo", [HasParent<"DeviceOp">, Symbol]
TypeAttrOf<AIE_ObjectFifoType>:$elemType,
BDDimLayoutArrayAttr:$dimensionsToStream,
BDDimLayoutArrayArrayAttr:$dimensionsFromStreamPerConsumer,
-DefaultValuedAttr<BoolAttr, "false">:$via_DMA
+DefaultValuedAttr<BoolAttr, "false">:$via_DMA,
+DefaultValuedAttr<BoolAttr, "false">:$plio
Collaborator

This seems a bit odd to me. Can't the objectFifo figure out that it's connected to PLIO based on the shimDMAAllocation op? Could I have an objectFifo connecting 2 PLIOs?

Collaborator Author

I am using this as a way for the user to specify whether the corresponding shim tile(s) of an object FIFO should be connected to GMIO or PLIO. Here is an example: https://github.com/Xilinx/mlir-aie/blob/plio-rebase/programming_examples/basic/passthrough_dmas_plio/aie2-input-plio.py#L49.

We also need to encode this in the ShimDMAAllocationOp, but that happens later in compilation and, I believe, is usually not exposed via the IRON Python bindings.

I believe an object FIFO could connect 2 PLIOs but have never tested it. It can connect two GMIOs, correct?
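
For illustration, the effect of the flag on the two ends of a flow can be sketched as below. This is a minimal, standalone sketch modeled on the wire-type selection this PR adds to AIEObjectFifoStatefulTransform.cpp. The `WireBundle` enum here is a two-value stand-in for the dialect's enum, and `selectWireBundles` is a hypothetical helper name, not a function in mlir-aie.

```cpp
#include <utility>

// Stand-in for the dialect's WireBundle enum; only the two values used here.
enum class WireBundle { DMA, PLIO };

// Returns {producer bundle, consumer bundle} for one flow of an object FIFO.
// Assumption (same as the PR's logic): when plio is set, exactly one end is a
// shim tile, and only that end switches from DMA to PLIO.
std::pair<WireBundle, WireBundle> selectWireBundles(bool plio,
                                                    bool producerIsShim) {
  if (!plio)
    return {WireBundle::DMA, WireBundle::DMA};
  return producerIsShim ? std::make_pair(WireBundle::PLIO, WireBundle::DMA)
                        : std::make_pair(WireBundle::DMA, WireBundle::PLIO);
}
```

Note that, as in the PR, a PLIO FIFO whose producer is not a shim tile is assumed to have the shim tile on the consumer side; the sketch does not check this.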

Collaborator

The shimDMAAllocationOp is currently generated by the AIEObjectFifoStatefulTransform, and since the object FIFO also abstracts the AIEFlowOps between tiles, I think it can be confusing to know where to add new attributes to guide the configuration. A cleaner solution might be to "allow" shimDMAAllocationOps to exist before the object FIFO lowering pass and have the lowering pass check for them and either adopt whatever configuration they specify for the shim tile (including the channel index) or complete any missing information that was not provided (following the same generation it currently uses). Another possibility would be to rely on the current AIEFlowOp in the same manner, for a more general solution that would allow users to specify the ports they want for any tile, not just the shim tile.
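
The "adopt or complete" behaviour suggested above could look roughly like the following standalone sketch. `PartialShimAlloc`, `ShimAlloc`, and `resolveShimAlloc` are hypothetical names used only for illustration. The fields mirror the `shim_dma_allocation` operands in this PR (direction, channel index, column, plio), and nothing here is existing mlir-aie API.

```cpp
#include <optional>

enum class DMAChannelDir { S2MM, MM2S };

// Hypothetical: what a user might specify before the lowering pass runs.
// Any field left empty is filled in by the pass.
struct PartialShimAlloc {
  std::optional<int> channelIndex;
  std::optional<bool> plio;
};

// Fully resolved allocation, one field per shim_dma_allocation operand.
struct ShimAlloc {
  DMAChannelDir dir;
  int channelIndex;
  int col;
  bool plio;
};

// Adopt whatever the user specified; otherwise fall back to the values the
// pass would have generated on its own (generatedChannel, plio = false).
ShimAlloc resolveShimAlloc(const std::optional<PartialShimAlloc> &user,
                           DMAChannelDir dir, int col, int generatedChannel) {
  ShimAlloc out{dir, generatedChannel, col, /*plio=*/false};
  if (user) {
    out.channelIndex = user->channelIndex.value_or(generatedChannel);
    out.plio = user->plio.value_or(false);
  }
  return out;
}
```

With no user-provided op, the result matches what the pass generates today; with a partial one, only the missing fields are filled in.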

Collaborator Author

Either works for me; the only thing that matters to me is that this is user-facing. That said, allowing shimDMAAllocationOps to exist before the object FIFO lowering pass sounds like it could require some refactoring. If it is non-trivial, should that be a separate PR? If so, I can create an issue so we can track it.

Collaborator
@AndraBisca Jul 22, 2024

If we go this route, it will indeed require a bit of refactoring, and I do think it will be better to have it in a separate PR that this PR would then build on. Let's discuss further and see which of those solutions would be preferred.

Collaborator

> A cleaner solution might be to "allow" shimDMAAllocationOps to exist before the object FIFO lowering pass

+1 because it makes things more composable

);

let assemblyFormat = [{
20 changes: 8 additions & 12 deletions lib/Dialect/AIE/Transforms/AIECreatePathFindFlows.cpp
@@ -104,8 +104,8 @@
SwitchboxOp swOp = analyzer.getSwitchbox(rewriter, curr.col, curr.row);
int shimCh = srcChannel;
// TODO: must reserve N3, N7, S2, S3 for DMA connections
-if (curr == srcSB &&
-analyzer.getTile(rewriter, srcSB.col, srcSB.row).isShimNOCTile()) {
+if (curr == srcSB && analyzer.getTile(rewriter, srcSB.col, srcSB.row)
+.isShimNOCorPLTile()) {
// shim DMAs at start of flows
if (srcBundle == WireBundle::DMA) {
shimCh = srcChannel == 0
@@ -125,13 +125,10 @@
srcBundle, srcChannel, WireBundle::North, shimCh);
} else if (srcBundle ==
WireBundle::PLIO) { // PLIO at start of flows with mux
-if (srcChannel == 2 || srcChannel == 3 || srcChannel == 6 ||
-srcChannel == 7) { // Only some PLIO requrie mux
-ShimMuxOp shimMuxOp = analyzer.getShimMux(rewriter, srcSB.col);
-addConnection(
-rewriter, cast<Interconnect>(shimMuxOp.getOperation()),
-flowOp, srcBundle, srcChannel, WireBundle::North, shimCh);
-}
+ShimMuxOp shimMuxOp = analyzer.getShimMux(rewriter, srcSB.col);
Collaborator

It's unclear to me what the point of this change is, but I don't understand the original code necessarily. Is this related to the fact that you only 'support' channel 0 and 1 in the code below?

Collaborator Author

I don't know how correct the previous code was, but in the pass here:

for (auto op : targetOp.getOps<ShimMuxOp>()) {
Region &r = op.getConnections();
Block &b = r.front();
bool isEmpty = b.getOps<ConnectOp>().empty();
if (isa<TileOp>(op.getTile().getDefiningOp())) {
int col = op.colIndex();
int row = op.rowIndex();
if (!isEmpty) {
output << "// ShimMux column " << col << " row " << row << "\n";
output << "// NOTE ShimMux always connects from the south as "
<< "directions are defined relative to the tile stream "
<< "switch\n";
output << "x = " << col << ";\n";
output << "y = " << row << ";\n";
}
}
for (auto connectOp : b.getOps<ConnectOp>()) {
if(connectOp.getSourceBundle() == WireBundle::DMA || connectOp.getDestBundle() == WireBundle::DMA) {
if (connectOp.getSourceBundle() == WireBundle::North)
// demux!
output
<< "__mlir_aie_try(XAie_EnableAieToShimDmaStrmPort("
<< deviceInstRef << ", " << tileLocStr("x", "y")
<< ", "
// <<
// stringifyWireBundle(connectOp.sourceBundle()).upper()
<< connectOp.sourceIndex() << "));\n";
else if (connectOp.getDestBundle() == WireBundle::North)
// mux
output
<< "__mlir_aie_try(XAie_EnableShimDmaToAieStrmPort("
<< deviceInstRef << ", " << tileLocStr("x", "y")
<< ", "
// <<
// stringifyWireBundle(connectOp.sourceBundle()).upper()
<< connectOp.destIndex() << "));\n";
}
else if(connectOp.getSourceBundle() == WireBundle::PLIO || connectOp.getDestBundle() == WireBundle::PLIO) {
// Note: Right now this just works with PLIO channel 0 and 1 as those don't require to program
// the shim mux
if(connectOp.destIndex() != 0 && connectOp.destIndex() != 1) {
return connectOp.emitOpError("Currently only PLIO channel 0 and 1 are supported.");
}
if (connectOp.getDestBundle() == WireBundle::North)
// mux
output
<< "__mlir_aie_try(XAie_PlToAieIntfEnable("
<< deviceInstRef << ", " << tileLocStr("x", "y")
<< ", "
<< connectOp.destIndex()
<< ", PLIF_WIDTH_64));\n";
}
}
}
we use the information in the shim mux to know whether we should enable PLIO or GMIO.
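
The dispatch described here can be summarized in a standalone sketch. The four libXAIE function names are the ones emitted by the code in this PR; the three-value `WireBundle` enum and the `shimMuxCall` helper are stand-ins for illustration only, and the channel 0/1 restriction from the quoted snippet is left out.

```cpp
#include <optional>
#include <string>

// Stand-in for the dialect's WireBundle enum; only the values used here.
enum class WireBundle { DMA, PLIO, North };

// For one shim-mux connection, pick the name of the libXAIE call to emit.
// The bundle on the south side (DMA or PLIO) selects GMIO vs. PLIO, and the
// position of North (source or destination) selects the direction.
std::optional<std::string> shimMuxCall(WireBundle src, WireBundle dst) {
  const bool usesDma = src == WireBundle::DMA || dst == WireBundle::DMA;
  const bool usesPlio = src == WireBundle::PLIO || dst == WireBundle::PLIO;
  if (usesDma) {
    if (src == WireBundle::North) // AIE array -> shim DMA
      return std::string("XAie_EnableAieToShimDmaStrmPort");
    if (dst == WireBundle::North) // shim DMA -> AIE array
      return std::string("XAie_EnableShimDmaToAieStrmPort");
  } else if (usesPlio) {
    if (src == WireBundle::North) // AIE array -> PL
      return std::string("XAie_AieToPlIntfEnable");
    if (dst == WireBundle::North) // PL -> AIE array
      return std::string("XAie_PlToAieIntfEnable");
  }
  return std::nullopt; // nothing to emit for this connection
}
```

This is why the path finder has to record PLIO connections in the shim mux for every channel: without a connect op there, the emitter has nothing to tell PLIO and GMIO apart.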

Collaborator

OK, I get it. I think the original code here was conceptually incorrect. The intention is/was to be explicit about all the connections that exist, even the non-programmable ones. Conceptually, the 'shimMux' op is intended to represent all of the connectivity in the shim south of the stream switch.

+addConnection(rewriter,
+cast<Interconnect>(shimMuxOp.getOperation()), flowOp,
+srcBundle, srcChannel, WireBundle::North, shimCh);
}
}
for (const auto &[bundle, channel] : setting.dsts) {
@@ -146,7 +143,7 @@
bundle == WireBundle::NOC)) {
shimCh = channel;
if (analyzer.getTile(rewriter, curr.col, curr.row)
-.isShimNOCTile()) {
+.isShimNOCorPLTile()) {
// shim DMAs at end of flows
if (bundle == WireBundle::DMA) {
shimCh = channel == 0
@@ -162,8 +159,7 @@
addConnection(
rewriter, cast<Interconnect>(shimMuxOp.getOperation()),
flowOp, WireBundle::North, shimCh, bundle, channel);
-} else if (channel >=
-2) { // must be PLIO...only PLIO >= 2 require mux
+} else if (bundle == WireBundle::PLIO) {
ShimMuxOp shimMuxOp = analyzer.getShimMux(rewriter, curr.col);
addConnection(
rewriter, cast<Interconnect>(shimMuxOp.getOperation()),
@@ -416,7 +412,7 @@
keepPktHeaderAttr[{destTile, destPort}] =
StringAttr::get(Op.getContext(), "true");
Switchbox srcSB = {srcCoords.col, srcCoords.row};
if (PathEndPoint srcPoint = {srcSB, srcPort};

Check warning on line 415 in lib/Dialect/AIE/Transforms/AIECreatePathFindFlows.cpp (GitHub Actions, ubuntu-22.04 gcc, both assert=OFF rtti=OFF and assert=ON rtti=ON):
‘srcCoords.xilinx::AIE::TileID::row’ may be used uninitialized [-Wmaybe-uninitialized]
‘srcCoords.xilinx::AIE::TileID::col’ may be used uninitialized [-Wmaybe-uninitialized]

!analyzer.processedFlows[srcPoint]) {
SwitchSettings settings = analyzer.flowSolutions[srcPoint];
// add connections for all the Switchboxes in SwitchSettings
31 changes: 25 additions & 6 deletions lib/Dialect/AIE/Transforms/AIEObjectFifoStatefulTransform.cpp
@@ -973,11 +973,12 @@ struct AIEObjectFifoStatefulTransformPass
void createObjectFifoAllocationInfo(OpBuilder &builder, MLIRContext *ctx,
FlatSymbolRefAttr obj_fifo, int colIndex,
DMAChannelDir channelDir,
-int channelIndex) {
+int channelIndex, bool plio) {
builder.create<ShimDMAAllocationOp>(builder.getUnknownLoc(), obj_fifo,
DMAChannelDirAttr::get(ctx, channelDir),
builder.getI64IntegerAttr(channelIndex),
-builder.getI64IntegerAttr(colIndex));
+builder.getI64IntegerAttr(colIndex),
+builder.getBoolAttr(plio));
}

void runOnOperation() override {
@@ -986,6 +987,8 @@ struct AIEObjectFifoStatefulTransformPass
DMAChannelAnalysis dmaAnalysis(device);
OpBuilder builder = OpBuilder::atBlockEnd(device.getBody());
auto ctx = device->getContext();
+auto producerWireType = WireBundle::DMA;
+auto consumerWireType = WireBundle::DMA;
std::set<TileOp>
objectFifoTiles; // track cores to check for loops during unrolling

@@ -1125,13 +1128,15 @@ struct AIEObjectFifoStatefulTransformPass
producerChan.channel, 0, producer.getDimensionsToStreamAttr());
// generate objectFifo allocation info
builder.setInsertionPoint(&device.getBody()->back());

if (producer.getProducerTileOp().isShimTile())
createObjectFifoAllocationInfo(
builder, ctx, SymbolRefAttr::get(ctx, producer.getName()),
producer.getProducerTileOp().colIndex(), producerChan.direction,
-producerChan.channel);
+producerChan.channel, producer.getPlio());

for (auto consumer : consumers) {

// create consumer tile DMA
DMAChannel consumerChan =
dmaAnalysis.getSlaveDMAChannel(consumer.getProducerTile());
@@ -1141,18 +1146,32 @@ struct AIEObjectFifoStatefulTransformPass
consumerChan.channel, 1, consumerDims);
// generate objectFifo allocation info
builder.setInsertionPoint(&device.getBody()->back());

+// If we have PLIO then figure out the direction and make that a PLIO
+if (producer.getPlio()) {
+producerWireType = producer.getProducerTileOp().isShimTile()
+? WireBundle::PLIO
+: WireBundle::DMA;
+consumerWireType = !(producer.getProducerTileOp().isShimTile())
+? WireBundle::PLIO
+: WireBundle::DMA;
+} else {
+producerWireType = WireBundle::DMA;
+consumerWireType = WireBundle::DMA;
+}

if (consumer.getProducerTileOp().isShimTile())
createObjectFifoAllocationInfo(
builder, ctx, SymbolRefAttr::get(ctx, producer.getName()),
consumer.getProducerTileOp().colIndex(), consumerChan.direction,
-consumerChan.channel);
+consumerChan.channel, producer.getPlio());

// create flow
builder.setInsertionPointAfter(producer);
builder.create<FlowOp>(builder.getUnknownLoc(),
-producer.getProducerTile(), WireBundle::DMA,
+producer.getProducerTile(), producerWireType,
producerChan.channel, consumer.getProducerTile(),
-WireBundle::DMA, consumerChan.channel);
+consumerWireType, consumerChan.channel);
}
}

2 changes: 1 addition & 1 deletion lib/Dialect/AIE/Transforms/AIEPathFinder.cpp
@@ -167,7 +167,7 @@ ShimMuxOp DynamicTileAnalysis::getShimMux(OpBuilder &builder, int col) {
if (coordToShimMux.count({col, row})) {
return coordToShimMux[{col, row}];
}
-assert(getTile(builder, col, row).isShimNOCTile());
+assert(getTile(builder, col, row).isShimNOCorPLTile());
auto switchboxOp = builder.create<ShimMuxOp>(builder.getUnknownLoc(),
getTile(builder, col, row));
SwitchboxOp::ensureTerminator(switchboxOp.getConnections(), builder,
6 changes: 4 additions & 2 deletions lib/Targets/AIETargetHSA.cpp
@@ -14,7 +14,7 @@
#include "aie/Dialect/AIEX/IR/AIEXDialect.h"
#include "aie/Targets/AIETargets.h"

-#include "mlir/Dialect/Func/IR/FuncOps.h" // Eddie added to get the NPU func ops
+#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/IR/Attributes.h"
#include "mlir/IR/IRMapping.h"
#include "mlir/Pass/Pass.h"
@@ -134,6 +134,7 @@ mlir::LogicalResult AIETranslateToHSA(ModuleOp module, raw_ostream &output) {
uint32_t ChannelId = infoOp->getChannelIndex();
bool isMM2S = channelDir == AIE::DMAChannelDir::MM2S;
int col = infoOp->getCol();
+bool isPlio = infoOp->getPlio();

llvm::SmallVector<int64_t, 4> strides = llvm::map_to_vector(
llvm::reverse(op.getMixedStrides()),
@@ -182,7 +183,8 @@ mlir::LogicalResult AIETranslateToHSA(ModuleOp module, raw_ostream &output) {
output << "\tmlir_aie_packet_nd_memcpy(&pkt" << op_count
<< ", 0 /* herd_id */, " << col << " /* col */, " << isMM2S
<< " /* dir */, " << ChannelId
-<< "/* channel */, 4 /* Burst length */, 2 /* Memory space */, "
+<< "/* channel */, 4 /* Burst length */, " << (isPlio ? 1 : 2)
+<< " /* Memory space */, "
"(uint64_t)buf"
<< arg_idx << " + " << offset << " /* Address */, " << sizes[0] * 4
<< " /* 1d_length */, " << (strides[1] ? sizes[1] : 1)
55 changes: 37 additions & 18 deletions lib/Targets/AIETargetXAIEV2.cpp
@@ -725,24 +725,43 @@ mlir::LogicalResult AIETranslateToXAIEV2(ModuleOp module, raw_ostream &output) {
}

for (auto connectOp : b.getOps<ConnectOp>()) {
-if (connectOp.getSourceBundle() == WireBundle::North)
-// demux!
-output
-<< "__mlir_aie_try(XAie_EnableAieToShimDmaStrmPort("
-<< deviceInstRef << ", " << tileLocStr("x", "y")
-<< ", "
-// <<
-// stringifyWireBundle(connectOp.sourceBundle()).upper()
-<< connectOp.sourceIndex() << "));\n";
-else if (connectOp.getDestBundle() == WireBundle::North)
-// mux
-output
-<< "__mlir_aie_try(XAie_EnableShimDmaToAieStrmPort("
-<< deviceInstRef << ", " << tileLocStr("x", "y")
-<< ", "
-// <<
-// stringifyWireBundle(connectOp.sourceBundle()).upper()
-<< connectOp.destIndex() << "));\n";
+
+if (connectOp.getSourceBundle() == WireBundle::DMA ||
+connectOp.getDestBundle() == WireBundle::DMA) {
+if (connectOp.getSourceBundle() == WireBundle::North)
+// demux!
+output
+<< "__mlir_aie_try(XAie_EnableAieToShimDmaStrmPort("
+<< deviceInstRef << ", " << tileLocStr("x", "y")
+<< ", "
+// <<
+// stringifyWireBundle(connectOp.sourceBundle()).upper()
+<< connectOp.sourceIndex() << "));\n";
+else if (connectOp.getDestBundle() == WireBundle::North)
+// mux
+output
+<< "__mlir_aie_try(XAie_EnableShimDmaToAieStrmPort("
+<< deviceInstRef << ", " << tileLocStr("x", "y")
+<< ", "
+// <<
+// stringifyWireBundle(connectOp.sourceBundle()).upper()
+<< connectOp.destIndex() << "));\n";
+}
+
+else if (connectOp.getSourceBundle() == WireBundle::PLIO ||
+connectOp.getDestBundle() == WireBundle::PLIO) {
+if (connectOp.getSourceBundle() == WireBundle::North) {
+// mux
+output << "__mlir_aie_try(XAie_AieToPlIntfEnable(" << deviceInstRef
+<< ", " << tileLocStr("x", "y") << ", "
+<< connectOp.destIndex() << ", PLIF_WIDTH_64));\n";
+} else if (connectOp.getDestBundle() == WireBundle::North) {
+// mux
+output << "__mlir_aie_try(XAie_PlToAieIntfEnable(" << deviceInstRef
+<< ", " << tileLocStr("x", "y") << ", "
+<< connectOp.destIndex() << ", PLIF_WIDTH_64));\n";
+}
+}
}
}
for (auto switchboxOp : targetOp.getOps<ShimSwitchboxOp>()) {
75 changes: 75 additions & 0 deletions programming_examples/basic/passthrough_dmas_plio/CMakeLists.txt
@@ -0,0 +1,75 @@
# This file is licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# (c) Copyright 2023 Advanced Micro Devices, Inc.

# parameters
# -DBOOST_ROOT: Path to Boost install
# -DXRT_INC_DIR: Full path to src/runtime_src/core/include in XRT cloned repo
# -DXRT_LIB_DIR: Path to xrt_coreutil.lib
# -DTARGET_NAME: Target name to be built

# cmake needs this line
cmake_minimum_required(VERSION 3.1)

set(CMAKE_CXX_STANDARD 23)
set(CMAKE_CXX_STANDARD_REQUIRED YES)

find_program(WSL NAMES powershell.exe)

if (NOT WSL)
set(CMAKE_C_COMPILER gcc-13)
set(CMAKE_CXX_COMPILER g++-13)
set(BOOST_ROOT /usr/include/boost CACHE STRING "Path to Boost install")
set(XRT_INC_DIR /opt/xilinx/xrt/include CACHE STRING "Path to XRT cloned repo")
set(XRT_LIB_DIR /opt/xilinx/xrt/lib CACHE STRING "Path to xrt_coreutil.lib")
else()
set(BOOST_ROOT C:/Technical/thirdParty/boost_1_83_0 CACHE STRING "Path to Boost install")
set(XRT_INC_DIR C:/Technical/XRT/src/runtime_src/core/include CACHE STRING "Path to XRT cloned repo")
set(XRT_LIB_DIR C:/Technical/xrtNPUfromDLL CACHE STRING "Path to xrt_coreutil.lib")
endif()

set(TARGET_NAME test CACHE STRING "Target to be built")

SET (ProjectName proj_${TARGET_NAME})
SET (currentTarget ${TARGET_NAME})

if ( WSL )
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR})
endif ()

project(${ProjectName})

# Find packages
find_package(Boost REQUIRED)

add_executable(${currentTarget}
${CMAKE_CURRENT_SOURCE_DIR}/../../../runtime_lib/test_lib/test_utils.cpp
test.cpp
)

target_compile_definitions(${currentTarget} PUBLIC DISABLE_ABI_CHECK=1)

target_include_directories (${currentTarget} PUBLIC
${XRT_INC_DIR}
${Boost_INCLUDE_DIRS}
${CMAKE_CURRENT_SOURCE_DIR}/../../../runtime_lib/test_lib
)

target_link_directories(${currentTarget} PUBLIC
${XRT_LIB_DIR}
${Boost_LIBRARY_DIRS}
)

if (NOT WSL)
target_link_libraries(${currentTarget} PUBLIC
xrt_coreutil
boost_program_options
boost_filesystem
)
else()
target_link_libraries(${currentTarget} PUBLIC
xrt_coreutil
)
endif()
48 changes: 48 additions & 0 deletions programming_examples/basic/passthrough_dmas_plio/Makefile
@@ -0,0 +1,48 @@
##===- Makefile -----------------------------------------------------------===##
#
# This file licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# Copyright (C) 2024, Advanced Micro Devices, Inc.
#
##===----------------------------------------------------------------------===##

srcdir := $(shell dirname $(realpath $(firstword $(MAKEFILE_LIST))))

include ${srcdir}/../../makefile-common

targetname = passThroughDMAs
LENGTH ?= 1024

all: input output

build/aie-input-plio.mlir: ${srcdir}/aie2-input-plio.py
	mkdir -p ${@D}
	python3 $< ${LENGTH} > $@

build/aie-output-plio.mlir: ${srcdir}/aie2-output-plio.py
	mkdir -p ${@D}
	python3 $< ${LENGTH} > $@

input: build/aie-input-plio.mlir
	aiecc.py --link_against_hsa --host-target=x86_64-amd-linux-gnu build/aie-input-plio.mlir \
		-I${srcdir}/../../../install/runtime_lib/x86_64-hsa/test_lib/include \
		-L/lib/x86_64-linux-gnu/ \
		${srcdir}/test_vck5000.cpp \
		${srcdir}/../../../install/runtime_lib/x86_64-hsa/test_lib/src/test_library.cpp \
		-Wl,--whole-archive -Wl,--no-whole-archive -lstdc++ -ldl -lelf -o input.elf

output: build/aie-output-plio.mlir
	aiecc.py --link_against_hsa --host-target=x86_64-amd-linux-gnu build/aie-output-plio.mlir \
		-I${srcdir}/../../../install/runtime_lib/x86_64-hsa/test_lib/include \
		-L/lib/x86_64-linux-gnu/ \
		${srcdir}/test_vck5000.cpp \
		${srcdir}/../../../install/runtime_lib/x86_64-hsa/test_lib/src/test_library.cpp \
		-Wl,--whole-archive -Wl,--no-whole-archive -lstdc++ -ldl -lelf -o output.elf

run_vck5000:
	test.elf

clean:
	rm -rf build aie-output-plio.mlir.prj aie-input-plio.mlir.prj core_* input.elf output.elf
27 changes: 27 additions & 0 deletions programming_examples/basic/passthrough_dmas_plio/README.md
@@ -0,0 +1,27 @@
<!---//===- README.md --------------------------*- Markdown -*-===//
//
// This file is licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
// Copyright (C) 2024, Advanced Micro Devices, Inc.
//
//===----------------------------------------------------------------------===//-->

# <ins>Passthrough DMAs with PLIO</ins>

This reference design can be run on the VCK5000 Versal device. This design leverages the same data movement pattern as the [Passthrough DMAs](../passthrough-dmas) example design but it uses a soft DMA. Please see the [platforms repo](https://github.com/Xilinx/ROCm-air-platforms) for more information on how the programmable logic is integrated with the AIEs. This is meant to be an illustrative example to highlight how to integrate PL designs with AIE designs programmed using mlir-aie.

In the platform, tile (26, 0) has PLIO connected to a DMA implemented in the programmable logic. There are two designs: `aie2-input-plio.py` uses the soft DMA to push data from DRAM into the AIEs, whereas `aie2-output-plio.py` uses the soft DMA to receive data from the AIEs and push it to DRAM. The soft DMA is programmed using the same mechanism as the ShimDMAs.

In the [design](./aie2.py), data is brought from external memory to `ComputeTile2` and back, without modification from the tile, by using an implicit copy via the compute tile's Data Movement Accelerator (DMA). The data is read from and written to external memory through the Shim tile (`col`, 0).

The implicit copy is performed using the `object_fifo_link` operation that specifies how input data arriving via `of_in` should be sent further via `of_out` by specifically leveraging the compute tile's DMA. This operation and its functionality are described in more depth in [Section-2b](../../../programming_guide/section-2/section-2b/03_Link_Distribute_Join/README.md#object-fifo-link) of the programming guide.


To compile and run the design for VCK5000:
```
make all
./output.elf  # To run the kernel which outputs over PLIO
./input.elf   # To run the kernel which inputs over PLIO
```