LLVM for PULP Platform Projects

LLVM 12 with extensions for processors and computer systems of the PULP platform. These include:

HERO: mixed-data-model (64-bit + 32-bit) compilation and data sharing; automatic tiling of data structures and insertion of DMA transfers;
MemPool: Instruction scheduling model for the MemPool architecture; Xmempool extension to allow dynamic instruction tracing;
PULPv2 RISC-V ISA extension (Xpulpv2): automatic insertion of hardware loops, post-increment memory accesses, and multiply-accumulates; intrinsics, clang builtins, and assembly support for all instructions of the extension;
Snitch RISC-V ISA extensions (Xssr, Xfrep, and Xdma): automatic insertion of frep hardware loops; intrinsics and clang builtins for Xssr and Xdma extensions; assembly support for all instructions of the extension.

HERO and PULPv2 RISC-V ISA Extension Support

Refer to the HERO repository for build scripts and instructions to use these features.

Snitch RISC-V ISA Extension Support

Build instructions

Refer to snitch-toolchain-cd for build scripts and continuous deployment of pre-built toolchains.

Command-line options

Flag	Description
`--mcpu=snitch`	Enables all extensions for Snitch (`rv32imafd,xfrep,xssr,xdma`) and selects the Snitch machine model, which has not yet been adapted for Snitch
`--debug-only=riscv-sdma`	Enable the debug output of the DMA pseudo instruction expansion pass
`--debug-only=riscv-ssr`	Enable the debug output of the SSR pseudo instruction expansion pass
`--debug-only=snitch-freploops`	Enable the debug output of the FREP loop inference pass
`--ssr-noregmerge`	Disable SSR register merging in the SSR pseudo instruction expansion pass. Register merging is enabled by default.
`--snitch-frep-inference`	Globally enable the FREP inference on all loops in the compiled module.
`--enable-misched=false`	Disable the machine instruction scheduler. In a complex loop with multiple SSR push or pop instructions on the same data mover, the instructions must not be reordered, because the order in which the SSRs are accessed matters.

`clang` builtins

The following clang builtins can be used to directly make use of the SSR and DMA extensions.

SSR

/**
 * @brief Setup 1D SSR read transfer
 * @details rep, b, s are raw values written directly to the SSR registers
 * 
 * @param DM data mover ID
 * @param rep repetition count minus one
 * @param b bound minus one
 * @param s relative stride
 * @param data pointer to data
 */
void __builtin_ssr_setup_1d_r(uint32_t DM, uint32_t rep, uint32_t b, uint32_t s, void* data);

/**
 * @brief Setup 1D SSR write transfer
 * @details rep, b, s are raw values written directly to the SSR registers
 * 
 * @param DM data mover ID
 * @param rep repetition count minus one
 * @param b bound minus one
 * @param s relative stride
 * @param data pointer to data
 */
void __builtin_ssr_setup_1d_w(uint32_t DM, uint32_t rep, uint32_t b, uint32_t s, void* data);

/**
 * @brief Write a datum to an SSR streamer
 * @details Must be within an SSR region
 * 
 * @param DM data mover ID
 * @param val value to write
 */
void __builtin_ssr_push(uint32_t DM, double val);

/**
 * @brief Read a datum from an SSR streamer
 * @details Must be within an SSR region
 * 
 * @param DM data mover ID
 * @return datum fetched from DM
 */
double __builtin_ssr_pop(uint32_t DM);

/**
 * @brief Enable an SSR region
 * @details FT registers are reserved and the push/pop methods can be used to access the SSR
 */
void __builtin_ssr_enable();

/**
 * @brief Disable an SSR region
 * @details FT registers are restored and push/pop is not possible
 */
void __builtin_ssr_disable();

/**
 * @brief Start an SSR read transfer
 * @details Bound and stride can be configured using the respective methods
 * 
 * @param DM data mover ID
 * @param dim Number of dimensions minus one
 * @param data pointer to data
 */
void __builtin_ssr_read(uint32_t DM, uint32_t dim, void* data);

/**
 * @brief Start an SSR write transfer
 * @details Bound and stride can be configured using the respective methods
 * 
 * @param DM data mover ID
 * @param dim Number of dimensions minus one
 * @param data pointer to data
 */
void __builtin_ssr_write(uint32_t DM, uint32_t dim, void* data);

/**
 * @brief Start an SSR read transfer. `DM` and `dim` must be constant. This method 
 * lowers to a single `scfgwi` instruction as opposed to the non-immediate version which
 * does address calculation first.
 * @details Bound and stride can be configured using the respective methods
 * 
 * @param DM data mover ID
 * @param dim Number of dimensions minus one
 * @param data pointer to data
 */
void __builtin_ssr_read_imm(uint32_t DM, uint32_t dim, void* data);

/**
 * @brief Start an SSR write transfer. `DM` and `dim` must be constant. This method 
 * lowers to a single `scfgwi` instruction as opposed to the non-immediate version which
 * does address calculation first.
 * @details Bound and stride can be configured using the respective methods
 * 
 * @param DM data mover ID
 * @param dim Number of dimensions minus one
 * @param data pointer to data
 */
void __builtin_ssr_write_imm(uint32_t DM, uint32_t dim, void* data);

/**
 * @brief Configure repetition value
 * @details A value of 0 loads each datum once
 * 
 * @param DM data mover ID
 * @param rep repetition count minus one
 */
void __builtin_ssr_setup_repetition(uint32_t DM, uint32_t rep);

/**
 * @brief Configure bound and stride for dimension 1
 * @details 
 * 
 * @param DM data mover ID
 * @param b bound minus one
 * @param s relative stride
 */
void __builtin_ssr_setup_bound_stride_1d(uint32_t DM, uint32_t b, uint32_t s);

/**
 * @brief Configure bound and stride for dimension 2
 * @details 
 * 
 * @param DM data mover ID
 * @param b bound minus one
 * @param s relative stride
 */
void __builtin_ssr_setup_bound_stride_2d(uint32_t DM, uint32_t b, uint32_t s);

/**
 * @brief Configure bound and stride for dimension 3
 * @details 
 * 
 * @param DM data mover ID
 * @param b bound minus one
 * @param s relative stride
 */
void __builtin_ssr_setup_bound_stride_3d(uint32_t DM, uint32_t b, uint32_t s);

/**
 * @brief Configure bound and stride for dimension 4
 * @details 
 * 
 * @param DM data mover ID
 * @param b bound minus one
 * @param s relative stride
 */
void __builtin_ssr_setup_bound_stride_4d(uint32_t DM, uint32_t b, uint32_t s);

/**
 * @brief Wait for the done bit to be set on data mover `DM`
 * @details Creates a polling loop that might never exit if the SSR is not configured correctly
 * 
 * @param DM data mover ID
 */
void __builtin_ssr_barrier(uint32_t DM);
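
Because `rep`, `b`, and `s` are raw register values, callers typically convert element counts into the "minus one" encoding before calling the setup builtins. The following is a minimal sketch of that conversion for the 1D case. The `ssr_1d_args` type and the `ssr_1d_args_for` helper are hypothetical names introduced for illustration, and the sketch assumes that the 1D stride is given in bytes, which this README does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: raw values for __builtin_ssr_setup_1d_r/_w. */
typedef struct {
    uint32_t rep; /* repetition count minus one */
    uint32_t b;   /* bound minus one */
    uint32_t s;   /* stride; ASSUMPTION: expressed in bytes */
} ssr_1d_args;

/* Convert "n elements, each delivered `times` times, `elem_size` bytes apart"
 * into the minus-one encoding described in the builtin documentation. */
static ssr_1d_args ssr_1d_args_for(uint32_t n, uint32_t times, size_t elem_size) {
    /* The minus-one encoding cannot express a count of zero. */
    assert(n > 0 && times > 0);
    ssr_1d_args a;
    a.rep = times - 1;
    a.b = n - 1;
    a.s = (uint32_t)elem_size;
    return a;
}
```

Under these assumptions, streaming 128 doubles once each yields `rep = 0`, `b = 127`, and `s = 8`; the fields could then be passed on as `__builtin_ssr_setup_1d_r(0, a.rep, a.b, a.s, data)`.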

SDMA

/**
 * @brief Start 1D DMA transfer
 * @details non-blocking call, doesn't check if DMA is ready to accept a new transfer
 * 
 * @param src Pointer to source
 * @param dst Pointer to destination
 * @param size Number of bytes to copy
 * @param cfg DMA configuration word
 * @return transfer ID
 */
uint32_t __builtin_sdma_start_oned(uint64_t src, uint64_t dst, uint32_t size, uint32_t cfg);

/**
 * @brief Start 2D DMA transfer
 * @details non-blocking call, doesn't check if DMA is ready to accept a new transfer
 * 
 * @param src Pointer to source
 * @param dst Pointer to destination
 * @param size Number of bytes in the inner transfer
 * @param sstrd Source stride
 * @param dstrd Destination stride
 * @param nreps Number of repetitions in the outer transfer
 * @param cfg DMA configuration word
 * @return transfer ID
 */
uint32_t __builtin_sdma_start_twod(uint64_t src, uint64_t dst, uint32_t size, 
  uint32_t sstrd, uint32_t dstrd, uint32_t nreps, uint32_t cfg);

/**
 * @brief Read DMA status register
 * @details 
 * 
 * @param tid Transfer ID to check
 * @return status register
 */
uint32_t __builtin_sdma_stat(uint32_t tid);

/**
 * @brief Polling wait for idle
 * @details Block until all transactions have completed
 */
void __builtin_sdma_wait_for_idle(void);
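
As an illustration of how the two most basic SDMA builtins might be combined, the sketch below wraps a 1D transfer into a blocking copy. Several parts are assumptions rather than documented behavior: `USE_SNITCH_BUILTINS` is a hypothetical macro you would define yourself when compiling with this toolchain for Snitch, `dma_copy_blocking` is a hypothetical helper, a configuration word of `0` is assumed to be a usable default, and the host stand-in only mimics the effect of a transfer so that the sketch also compiles with an ordinary C compiler.

```c
#include <stdint.h>
#include <string.h>

#ifdef USE_SNITCH_BUILTINS /* hypothetical macro, not defined by the toolchain */
#define dma_start_1d  __builtin_sdma_start_oned
#define dma_wait_idle __builtin_sdma_wait_for_idle
#else
/* Host stand-in (ASSUMPTION): copy immediately and hand out increasing IDs. */
static uint32_t next_tid;
static uint32_t dma_start_1d(uint64_t src, uint64_t dst, uint32_t size, uint32_t cfg) {
    (void)cfg; /* configuration word is ignored by the stand-in */
    memcpy((void *)(uintptr_t)dst, (const void *)(uintptr_t)src, size);
    return next_tid++;
}
static void dma_wait_idle(void) {}
#endif

/* Start a 1D transfer of `size` bytes and block until the DMA is idle.
 * Returns the transfer ID reported by the start call. */
static uint32_t dma_copy_blocking(void *dst, const void *src, uint32_t size) {
    /* Note the argument order of the builtin: source first, then destination.
     * cfg = 0 is an ASSUMED default configuration word. */
    uint32_t tid = dma_start_1d((uint64_t)(uintptr_t)src,
                                (uint64_t)(uintptr_t)dst, size, 0);
    dma_wait_idle();
    return tid;
}
```

Since the start call does not check whether the DMA can accept a new transfer, code that issues many transfers back to back may need additional status checks with `__builtin_sdma_stat`; the sketch avoids the question by waiting for idle after every transfer.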

FREP hardware loops

Inference can be enabled globally with --snitch-frep-inference or locally with #pragma frep infer.

For FREP inference to work, clang must be invoked with at least -O1.

#pragma frep infer
for(unsigned i = 0; i < 128; ++i)
  acc += __builtin_ssr_pop(0)*__builtin_ssr_pop(1);
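
For context, the loop above could sit inside a complete function along the following lines. This is a sketch under stated assumptions, not a reference implementation: `USE_SNITCH_BUILTINS` is a hypothetical macro you would define yourself when building with this toolchain for Snitch, `dot128` is a hypothetical function name, the stride is assumed to be in bytes, and the host stand-in models only the simplest streaming behavior (one element per pop, `rep = 0`) so that the sketch also compiles with an ordinary C compiler.

```c
#include <stdint.h>
#include <string.h>

#ifdef USE_SNITCH_BUILTINS /* hypothetical macro, not defined by the toolchain */
#define ssr_setup_1d_r __builtin_ssr_setup_1d_r
#define ssr_enable     __builtin_ssr_enable
#define ssr_disable    __builtin_ssr_disable
#define ssr_pop        __builtin_ssr_pop
#else
/* Host stand-in with ASSUMED semantics: each pop returns the next element,
 * `s` bytes after the previous one. `rep` and `b` are not modeled. */
static struct { const unsigned char *next; uint32_t stride; } fake_dm[2];
static void ssr_setup_1d_r(uint32_t dm, uint32_t rep, uint32_t b, uint32_t s, void *data) {
    (void)rep; (void)b;
    fake_dm[dm].next = data;
    fake_dm[dm].stride = s;
}
static void ssr_enable(void) {}
static void ssr_disable(void) {}
static double ssr_pop(uint32_t dm) {
    double v;
    memcpy(&v, fake_dm[dm].next, sizeof v);
    fake_dm[dm].next += fake_dm[dm].stride;
    return v;
}
#endif

/* Dot product of two 128-element vectors streamed through data movers 0 and 1. */
static double dot128(double *x, double *y) {
    /* rep = 0 (each datum once), bound = 128 - 1, stride = one double. */
    ssr_setup_1d_r(0, 0, 127, sizeof(double), x);
    ssr_setup_1d_r(1, 0, 127, sizeof(double), y);
    ssr_enable();
    double acc = 0.0;
#ifdef USE_SNITCH_BUILTINS
#pragma frep infer
#endif
    for (unsigned i = 0; i < 128; ++i)
        acc += ssr_pop(0) * ssr_pop(1);
    ssr_disable();
    return acc;
}
```

The pragma is guarded only so that host compilers do not warn about an unknown pragma. On the target, remember that inference requires at least -O1.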

The LLVM Compiler Infrastructure

This directory and its sub-directories contain source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The README briefly describes how to get started with building LLVM. For more information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting Started with the LLVM System

Taken from https://llvm.org/docs/GettingStarted.html.

Overview

Welcome to the LLVM project!

The LLVM project has multiple components. The core of the project is itself called "LLVM". This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer. It also contains basic regression tests.

C-like languages use the Clang front end. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

The LLVM Getting Started documentation may be out of date. The Clang Getting Started page might have more accurate information.

This is an example work-flow and configuration to get and build the LLVM source:

Checkout LLVM (including related sub-projects like Clang):
- git clone https://github.com/llvm/llvm-project.git
- Or, on Windows, git clone --config core.autocrlf=false https://github.com/llvm/llvm-project.git
Configure and build LLVM and Clang:
- cd llvm-project
- mkdir build
- cd build
- cmake -G <generator> [options] ../llvm
  
  Some common build system generators are:
  - Ninja --- for generating Ninja build files. Most LLVM developers use Ninja.
  - Unix Makefiles --- for generating make-compatible parallel makefiles.
  - Visual Studio --- for generating Visual Studio projects and solutions.
  - Xcode --- for generating Xcode projects.
  Some common options:
  - -DLLVM_ENABLE_PROJECTS='...' --- semicolon-separated list of the LLVM sub-projects you'd like to additionally build. Can include any of: clang, clang-tools-extra, libcxx, libcxxabi, libunwind, lldb, compiler-rt, lld, polly, or debuginfo-tests.
    
    For example, to build LLVM, Clang, libcxx, and libcxxabi, use -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi".
  - -DCMAKE_INSTALL_PREFIX=directory --- Specify for directory the full path name of where you want the LLVM tools and libraries to be installed (default /usr/local).
  - -DCMAKE_BUILD_TYPE=type --- Valid options for type are Debug, Release, RelWithDebInfo, and MinSizeRel. Default is Debug.
  - -DLLVM_ENABLE_ASSERTIONS=On --- Compile with assertion checks enabled (default is Yes for Debug builds, No for all other build types).
- cmake --build . [-- [options] <target>] or your build system specified above directly.
  - The default target (i.e. ninja or make) will build all of LLVM.
  - The check-all target (i.e. ninja check-all) will run the regression tests to ensure everything is in working order.
  - CMake will generate targets for each tool and library, and most LLVM sub-projects generate their own check-<project> target.
  - Running a serial build will be slow. To improve speed, try running a parallel build. That's done by default in Ninja; for make, use the option -j NNN, where NNN is the number of parallel jobs, e.g. the number of CPUs you have.
- For more information see CMake

Consult the Getting Started with LLVM page for detailed information on configuring and compiling LLVM. You can visit Directory Layout to learn about the layout of the source code tree.
