LLVM JIT interface #2277
Not actually implemented, though. It does print out some jit-compiled stuff, but that's about it. For example, this query:

    select number from system.numbers where something(cast(number as Float64)) == 4

results in this on the server's stderr:

    define double @"something(CAST(number, 'Float64'))"(void**, i8*, void*) {
    "something(CAST(number, 'Float64'))":
      ret double 1.234500e+04
    }

(and an exception, because that's what the non-jitted method does.)

As one may notice, this function neither reads the input (first argument; tuple of arrays) nor writes the output (third argument; array), instead returning some general nonsense.

In addition, `#if USE_EMBEDDED_COMPILER` doesn't work for some reason, including LLVM headers requires -Wno-unused-parameter, this probably only works on LLVM 5.0 due to rampant API instability, and I'm definitely no expert on CMake. In short, there's still a long way to go.
The example from the previous commit doesn't need a cast to Float64 anymore.
It actually seems to work, so long as you only have one row, that is. E.g.

    > select something(cast(number + 6 as Float64), cast(number + 2 as Float64)) from system.numbers limit 1';
    8

with this IR:

    define void @"something(CAST(plus(number, 6), 'Float64'), CAST(plus(number, 2), 'Float64'))"(void**, i8*, double*) {
    entry:
      %3 = load void*, void** %0
      %4 = bitcast void* %3 to double*
      %5 = load double, double* %4
      %6 = getelementptr void*, void** %0, i32 1
      %7 = load void*, void** %6
      %8 = bitcast void* %7 to double*
      %9 = load double, double* %8
      %10 = fadd double %5, %9
      store double %10, double* %2
      ret void
    }
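For readers less used to LLVM IR, the function above is roughly equivalent to the C++ below. This is a hand-written sketch, not compiler output, and the name `something_compiled` is hypothetical. It reads only the first element of each input array, which is why the result is correct for a single row only.

```cpp
#include <cstdint>

// Hand-written C++ equivalent of the IR above (a sketch, not generated code).
// `columns` is the tuple of input arrays, the second argument is not used by
// this IR, and `result` is the output array.
void something_compiled(void ** columns, std::int8_t * /*unused*/, double * result)
{
    const double * lhs = static_cast<const double *>(columns[0]);
    const double * rhs = static_cast<const double *>(columns[1]);
    // Only element 0 is read and written: there is no loop over rows yet.
    result[0] = lhs[0] + rhs[0];
}
```

Called with one-element inputs holding 6.0 and 2.0, it stores 8.0 in `result[0]`, matching the query output shown above.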
I honestly can't tell if they work. LLVM has surprisingly bad API documentation.
Given that the list of supported types is hardcoded in LLVMContext::Data::toNativeType, this method is redundant because LLVMPreparedFunction can create a ColumnVector itself.
std::vector<bool> redundant(actions.size());
// an empty optional is a poisoned value prohibiting the column's producer from being removed
// (which it could be, if it was inlined into every dependent function).
std::unordered_map<std::string, std::unordered_set<std::optional<size_t>>> current_dependents;
It's better to use `struct { bool is_used; std::unordered_set<size_t> dependents; }` instead of `std::unordered_set<std::optional<size_t>>`.
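A minimal sketch of what this suggestion could look like, assuming the map stays keyed by column name. The names `ColumnDependents`, `DependentsMap` and `canRemoveProducer` are hypothetical and do not come from the PR.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// An explicit flag replaces the "empty optional" poison value that the
// original code stores inside the set itself.
struct ColumnDependents
{
    bool is_used = false;                        // true: the producer must not be removed
    std::unordered_set<std::size_t> dependents;  // indices of actions that read the column
};

using DependentsMap = std::unordered_map<std::string, ColumnDependents>;

// Hypothetical helper: a producer is removable only if the column is known
// and nothing has marked it as used outside the inlined functions.
bool canRemoveProducer(const DependentsMap & current_dependents, const std::string & column)
{
    auto it = current_dependents.find(column);
    return it != current_dependents.end() && !it->second.is_used;
}
```

The set then holds plain indices, so iterating over dependents no longer needs to skip or special-case an empty optional.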
if (MAKE_STATIC_LIBRARIES)
    # fix strange static error: undefined reference to 'std::error_category::~error_category()'
    target_link_libraries(clickhouse-compiler-lib PUBLIC stdc++)
This fix is needed only for old distributions (Ubuntu Trusty, Xenial).
In Ubuntu Artful everything is OK.
Also, this fix adds a dependency on a shared system library, which is unacceptable for our fully static package.
dbms/CMakeLists.txt
# TODO: global-disable no-unused-parameter
set_source_files_properties(src/Interpreters/ExpressionJIT.cpp PROPERTIES COMPILE_FLAGS "-Wno-unused-parameter -Wno-non-virtual-dtor")
else ()
list (REMOVE_ITEM dbms_sources src/Interpreters/ExpressionJIT.cpp)
Use this in these files instead:
#include <Common/config.h>
#if USE_EMBEDDED_COMPILER
...
#endif
template <typename T>
static bool typeIsA(const DataTypePtr & type)
{
    if (auto * nullable = typeid_cast<const DataTypeNullable *>(type.get()))
You can use removeNullable(type)
static MutableColumnPtr createNonNullableColumn(const DataTypePtr & type)
{
    if (auto * nullable = typeid_cast<const DataTypeNullable *>(type.get()))
removeNullable(type)->createColumn()
void LLVMPreparedFunction::executeImpl(Block & block, const ColumnNumbers & arguments, size_t result)
{
    size_t block_size = 0;
block.rows()?
/// assume the column is a `ColumnVector<T>`. there's probably no good way to actually
/// check that at runtime, so let's just hope it's always true for columns containing types
/// for which `LLVMContext::Data::toNativeType` returns non-null.
columns[i] = column->getDataAt(0).data;
It's very dangerous. What you can do is:
- Add `IColumn::isColumnVector()` and `StringRef IColumn::getData()`, implemented for columns which store data in a single contiguous memory segment; or
- Use `TypeListNumbers::forEach` and check typeid for each ColumnVector.
`IColumn::isFixedAndContiguous` seems to be what I wanted here. The compiled loop could be extended to arbitrary columns for which this method returns true by passing a tuple of strides (e.g. string length for `ColumnFixedString`) instead of a tuple of "is constant" flags, though I'm not sure how this could affect loop auto-vectorization.
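To make the stride idea concrete, here is a plain C++ sketch of such a loop. It is not the generated code; `StridedInput` and `addColumns` are hypothetical names. A stride of 0 models a constant column, so one representation covers both the "is constant" flag and fixed-width values.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical descriptor for one input: a pointer to its first byte and the
// distance in bytes between consecutive rows. A stride of 0 means the same
// value is re-read for every row, which models a constant column.
struct StridedInput
{
    const char * data;
    std::size_t stride;
};

// Adds two Float64 inputs row by row; stands in for the compiled loop body.
void addColumns(StridedInput lhs, StridedInput rhs, double * result, std::size_t rows)
{
    for (std::size_t i = 0; i < rows; ++i)
    {
        double a;
        double b;
        // memcpy avoids alignment and aliasing assumptions about the raw bytes.
        std::memcpy(&a, lhs.data + i * lhs.stride, sizeof(a));
        std::memcpy(&b, rhs.data + i * rhs.stride, sizeof(b));
        result[i] = a + b;
    }
}
```

Whether a loop written this way still auto-vectorizes when the strides are only known at run time is the open question raised above; this sketch does not answer it.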
Mostly for intrinsics like memcpy/memset/memmove, which are inserted during optimization by LLVM itself. (With a null resolver, a compiled version of something like `UInt64 < 0` would segfault.)
That one has some edge cases which I can't be bothered to code.
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
This pull request adds the capability to compile built-in functions with numeric (possibly nullable) arguments and return type to native code through `llvm::IRBuilder`. Compilable functions are automatically inlined into each other for a performance boost. The function can then also decide where to evaluate each of its inlined arguments, allowing for some laziness (non-compilable subexpressions still have to be evaluated eagerly).

This PR also includes implementations of the interface for most arithmetic and logic functions. Don't have any performance comparisons yet, though.
Known problems:
- this code will break with LLVM 7 when it's released due to API incompatibilities;
- `-Wno-unused-parameter` to clickhouse_functions;
- `and` has somewhat weird semantics in that `and(false, null)` is null. This means it's impossible for `and` to be lazy in the second argument if it's nullable. (`and(x, non-nullable)` and `and(null, x)` work fine, though.)
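The laziness problem with `and` can be illustrated in plain C++, modelling a nullable boolean as `std::optional<bool>`. This is only a sketch of the semantics described above, not code from the PR, and the function names are hypothetical.

```cpp
#include <functional>
#include <optional>

using NullableBool = std::optional<bool>;  // std::nullopt stands for NULL

// `and` as described above: NULL in either argument makes the result NULL,
// so and(false, NULL) is NULL rather than false. Both arguments are needed.
NullableBool andEager(NullableBool lhs, NullableBool rhs)
{
    if (!lhs || !rhs)
        return std::nullopt;
    return *lhs && *rhs;
}

// Lazy variant for a non-nullable second argument, passed as a thunk.
// The thunk runs only when the first argument is true.
NullableBool andLazyNonNullableRhs(NullableBool lhs, const std::function<bool()> & rhs)
{
    if (!lhs)
        return std::nullopt;  // and(NULL, x) is NULL whatever x is
    if (!*lhs)
        return false;         // rhs cannot be NULL, so the result is false
    return rhs();
}
```

With a nullable second argument, the `!*lhs` branch could not return early: the result would be false or NULL depending on the second argument, so it would have to be evaluated anyway.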