ORC-1356: [C++] Use Intel AVX-512 instructions to accelerate the Rle-bit-packing decode #1375

Merged
merged 110 commits into from
May 7, 2023
58c3ab6
Use AVX512 to optimize bit-packing decode functions. This will improve
wpleonardo Jan 10, 2023
acbc214
Fix some conflicts.
wpleonardo Jan 10, 2023
293d863
Fix some conflicts.
wpleonardo Jan 10, 2023
e7a9119
Fix the code format.
wpleonardo Jan 11, 2023
cfde08f
Modify TestRleVectorDecoder.cc to match the new format.
wpleonardo Jan 11, 2023
8341943
Fix a mistake on function name
wpleonardo Jan 11, 2023
e840649
Modified code into namespace orc
wpleonardo Jan 11, 2023
c7962d5
Modify function name to fix a build issue.
wpleonardo Jan 11, 2023
495a620
Modify code format.
wpleonardo Jan 12, 2023
5c937e6
Fix a build issue about int64 has different printf format between mac…
wpleonardo Jan 12, 2023
a87c281
Fix build issue on windows.
wpleonardo Jan 12, 2023
d8fcbe6
Fix some code format issue and function name.
wpleonardo Jan 12, 2023
668335c
1. Modified the code format;
wpleonardo Jan 14, 2023
46daa2d
1. Use clang-format to modify the code format of TestRleVectorDecoder.cc
wpleonardo Jan 14, 2023
415d1eb
1. Use clang-format -style=google to format code style of TestRleVect…
wpleonardo Jan 15, 2023
cd2f71d
1. Use clang-format to modify the code style of c++/test/TestRleVecto…
Jan 30, 2023
f9ee0b4
Use clang-format to modify code style of files:
Jan 30, 2023
6f8cb56
1.Add an Env parameter "ENABLE_RUNTIME_AVX512" to open or close AVX51…
wpleonardo Jan 31, 2023
c1c2448
Update CMakeLists.txt
wpleonardo Feb 1, 2023
edf164f
Update CMakeLists.txt
wpleonardo Feb 1, 2023
f360582
Merge pull request #3 from wpleonardo/fix_comments
wpleonardo Feb 13, 2023
743ac84
1.Add the dynamic dispatch function to distribute avx512 and default …
Feb 13, 2023
6bc9035
Delete some comments in code.
Feb 13, 2023
ca3af78
Fix some comments.
Feb 14, 2023
eeafccf
Fix some comments
Feb 14, 2023
1beb9b5
Merge pull request #4 from wpleonardo/fix_comments
wpleonardo Feb 15, 2023
d9c562b
1.Modified the CMakelists, delete the part of aarch64 and ORC_RUNTIME…
Feb 16, 2023
0cf5620
Modified the macro name
Feb 16, 2023
08e32f4
Merge pull request #5 from wpleonardo/fix_comments
wpleonardo Feb 16, 2023
8a6b9f7
Merge pull request #6 from wpleonardo/fix_comments
wpleonardo Feb 16, 2023
1b8301f
1.Fixed build error on macos
Feb 17, 2023
1924ecf
Merge pull request #7 from wpleonardo/fix_comments
wpleonardo Feb 17, 2023
3c4f2b8
Merge pull request #8 from wpleonardo/fix_comments
wpleonardo Feb 17, 2023
b1759a1
Merge pull request #9 from wpleonardo/fix_comments
wpleonardo Feb 17, 2023
dc81e79
Merge pull request #10 from wpleonardo/fix_comments
wpleonardo Feb 17, 2023
b37c7dd
Fixed the build error on macos.
Feb 17, 2023
23dd7ff
Fix the build error on macos, and code format.
Feb 17, 2023
6a6f491
Fix build error on macos.
Feb 18, 2023
b2abf44
Fix build error on macos
Feb 18, 2023
36f06aa
Fix a build error about "%ld" and "%lld" on macos.
Feb 18, 2023
3db8d1a
Merge pull request #11 from wpleonardo/fix_comments
wpleonardo Feb 18, 2023
15db3d1
Use std::cout instead of printf function
Feb 19, 2023
d77c81b
Merge pull request #12 from wpleonardo/fix_comments
wpleonardo Feb 20, 2023
42cc703
Fix build error on macos.
Feb 20, 2023
4fbe1d7
Merge pull request #13 from wpleonardo/fix_comments
wpleonardo Feb 28, 2023
9d86e3d
Macos doesn't support AVX512 fully. So skip Macos to support AVX512 d…
Mar 1, 2023
75e4cfa
Add the comments about arch=native compile option.
Mar 1, 2023
284a9a4
Merge pull request #14 from wpleonardo/fix_comments
wpleonardo Mar 1, 2023
2c9f93f
Merge pull request #15 from wpleonardo/fix_comments
wpleonardo Mar 2, 2023
d21705c
Merge pull request #16 from wpleonardo/fix_comments
wpleonardo Mar 2, 2023
197f2e6
Add the cpu flags information in the cmake process.
Mar 2, 2023
1d050af
Modified the cmake check of supporting AVX512.
Mar 2, 2023
10b7009
Merge pull request #17 from wpleonardo/fix_comments
wpleonardo Mar 3, 2023
a239e47
When user set BUILD_ENABLE_AVX512=on, but the compiler cannot support…
Mar 3, 2023
b2b6aff
Add the comment about -mtune=native in cmake process.
Mar 3, 2023
6f06b79
Merge pull request #18 from wpleonardo/fix_comments
wpleonardo Mar 6, 2023
a0aa823
Merge pull request #19 from wpleonardo/fix_comments
wpleonardo Mar 6, 2023
d7112e9
1.Add the new CI action to test AVX512 feature.
Mar 6, 2023
5b38980
Change the build_type back to Debug, keep consistent with the original.
Mar 6, 2023
2ad64bc
Merge pull request #20 from wpleonardo/fix_comments
wpleonardo Mar 8, 2023
6768165
Fix an error about _mm512_load_si512 on some CPU core when running wi…
Mar 9, 2023
d383035
Merge pull request #21 from wpleonardo/fix_comments
wpleonardo Mar 10, 2023
ce7f6de
Most hotspot of function RleDecoderV2::resetBufferStart locates in sa…
Mar 11, 2023
8f6806b
Modified some cmake options and status message
Mar 14, 2023
e27be9e
Delete macro ORC_HAVE_RUNTIME_AVX512. Modified CMakeLists.txt to choo…
Mar 16, 2023
fe5b6c7
Merge pull request #22 from wpleonardo/fix_comments
wpleonardo Mar 16, 2023
5b0e66d
Modified the code format.
Mar 16, 2023
8c99fcd
1.Delete the redundancy code in CpuInfo file
Mar 16, 2023
0f1adda
Add the cpu flags print on windows.
Mar 17, 2023
440d6d1
Merge pull request #23 from wpleonardo/fix_comments
wpleonardo Mar 17, 2023
070ca0f
Update cmake_modules/ConfigSimdLevel.cmake
wgtmac Mar 17, 2023
21de59a
1. Code format change in c++/src/Bpacking.hh
Mar 17, 2023
ae0d5c2
Code format change about c++/src/CpuInfoUtil.cc
Mar 17, 2023
3f47d1c
Merge pull request #24 from wpleonardo/fix_comments
wpleonardo Mar 17, 2023
1fdfe54
1. Deleted some useless header files included in source file
Mar 20, 2023
3f156b4
1. Code format about c++/src/BpackingAvx512.cc
Mar 20, 2023
4b166ee
1. Delete the redundant buffer array in class UnpackAvx512
Mar 22, 2023
3c21f2e
Use macros to replace some number
wpleonardo Mar 22, 2023
27d5b40
Change RleDecoderV2::readLongs return type back to void.
Mar 24, 2023
7cea68e
Added "how to build&use AVX512 in ORC" in README.md
Mar 27, 2023
3be42ee
1.Modified the description about how to use AVX512 in README.md
Mar 27, 2023
11ceeaa
When compiler doesn't support AVX512, but customer set BUILD_ENABLE_A…
Mar 27, 2023
277d9be
1. Update link information about apple avx512 in CMakeLists.txt
Mar 28, 2023
305a317
Fix an error about if judgement in windows CI test
Mar 28, 2023
4debd50
Add the align header and tailer code in the process of bit-unpacking.
Mar 29, 2023
62d373c
Fix an error in the CI test yaml file on windows platform.
Mar 29, 2023
3dca1d7
Modified the AVX512 enable description in the README.md
Mar 29, 2023
e23ca29
Add "shell: bash" in the CI test on windows, and make CI commands run…
Mar 31, 2023
1a32212
1. In function alignHeaderBoundary and alignTailerBoundary, rename pa…
Apr 11, 2023
fc2c288
Change the parameter bitMaxSize type to const uint32_t
Apr 12, 2023
3468df0
Change some parameter's type to const
wpleonardo Apr 13, 2023
93feaf9
1. Changed the parameters bufferStart, bufferEnd, bitsLeft and curByt…
Apr 17, 2023
3b831f2
Merge pull request #39 from wpleonardo/fix_comments
wpleonardo Apr 18, 2023
1deb2cf
Merge pull request #40 from wpleonardo/fix_comments
wpleonardo Apr 18, 2023
b48ec06
1. Modified vectorUnpack16,vectorUnpack24,vectorUnpack32 to support a…
Apr 18, 2023
b89870a
Added the comments of function alignHeaderBoundary and alignTailerBou…
Apr 18, 2023
fe09a92
Delete useless header file
Apr 18, 2023
596835d
Merge branch 'main' into fix_comments
Apr 18, 2023
e236773
Code format change
Apr 18, 2023
321ab63
Merge pull request #41 from wpleonardo/fix_comments
wpleonardo Apr 19, 2023
df6fe45
Add a parameter comments
Apr 19, 2023
ce77b50
Merge pull request #42 from wpleonardo/fix_comments
wpleonardo Apr 21, 2023
6c84d8d
Merge pull request #43 from wpleonardo/fix_comments
wpleonardo Apr 21, 2023
f3ff215
Change the invoking way about bufferstart,bufferend parameters.
Apr 21, 2023
af96de9
1. Code format change
Apr 22, 2023
d6fd57d
Merge pull request #44 from wpleonardo/fix_comments
wpleonardo Apr 23, 2023
0bfc862
Modified cmakefile about the checking of AVX512.
Apr 23, 2023
e584a42
Because check_cxx_source_run will be hung on windows, change check_cx…
Apr 24, 2023
4d261eb
Change check_cxx_source_runs back to CHECK_CXX_SOURCE_COMPILES
Apr 24, 2023
1f2085e
Merge pull request #45 from wpleonardo/fix_comments
wpleonardo Apr 23, 2023
40 changes: 38 additions & 2 deletions .github/workflows/build_and_test.yml
@@ -79,8 +79,16 @@ jobs:
cat /home/runner/work/orc/orc/build/java/rat.txt

windows:
name: "Build on Windows"
name: "C++ ${{ matrix.simd }} Test on Windows"
runs-on: windows-2019
strategy:
fail-fast: false
matrix:
simd:
- General
- AVX512
env:
ORC_USER_SIMD_LEVEL: AVX512
steps:
- name: Checkout
uses: actions/checkout@v2
@@ -89,13 +97,41 @@ jobs:
with:
msbuild-architecture: x64
- name: "Test"
shell: bash
run: |
mkdir build
cd build
cmake .. -G "Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=Debug -DBUILD_LIBHDFSPP=OFF -DBUILD_TOOLS=OFF -DBUILD_JAVA=OFF
if [ "${{ matrix.simd }}" = "General" ]; then
cmake .. -G "Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=Debug -DBUILD_LIBHDFSPP=OFF -DBUILD_TOOLS=OFF -DBUILD_JAVA=OFF
else
cmake .. -G "Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=Debug -DBUILD_LIBHDFSPP=OFF -DBUILD_TOOLS=OFF -DBUILD_JAVA=OFF -DBUILD_ENABLE_AVX512=ON
fi
cmake --build . --config Debug
ctest -C Debug --output-on-failure

simdUbuntu:
name: "SIMD programming using C++ intrinsic functions on ${{ matrix.os }}"
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os:
- ubuntu-22.04
cxx:
- clang++
env:
ORC_USER_SIMD_LEVEL: AVX512
steps:
- name: Checkout
uses: actions/checkout@v2
- name: "Test"
run: |
mkdir -p ~/.m2
mkdir build
cd build
cmake -DBUILD_JAVA=OFF -DBUILD_ENABLE_AVX512=ON ..
make package test-out

doc:
name: "Javadoc generation"
runs-on: ubuntu-20.04
17 changes: 15 additions & 2 deletions CMakeLists.txt
@@ -72,6 +72,10 @@ option(BUILD_CPP_ENABLE_METRICS
"Enable the metrics collection at compile phase"
OFF)

option(BUILD_ENABLE_AVX512
"Enable build with AVX512 at compile time"
OFF)

# Make sure that a build type is selected
if (NOT CMAKE_BUILD_TYPE)
message(STATUS "No build type selected, default to ReleaseWithDebugInfo")
@@ -121,7 +125,7 @@ if (CMAKE_CXX_COMPILER_ID MATCHES "Clang")
set (WARN_FLAGS "${WARN_FLAGS} -Wno-covered-switch-default")
set (WARN_FLAGS "${WARN_FLAGS} -Wno-missing-noreturn -Wno-unknown-pragmas")
set (WARN_FLAGS "${WARN_FLAGS} -Wno-gnu-zero-variadic-macro-arguments")
set (WARN_FLAGS "${WARN_FLAGS} -Wconversion")
set (WARN_FLAGS "${WARN_FLAGS} -Wno-conversion")
if (CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL "13.0")
set (WARN_FLAGS "${WARN_FLAGS} -Wno-reserved-identifier")
endif()
@@ -140,7 +144,7 @@ elseif (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
else ()
set (CXX17_FLAGS "-std=c++17")
endif ()
set (WARN_FLAGS "-Wall -Wno-unknown-pragmas -Wconversion")
set (WARN_FLAGS "-Wall -Wno-unknown-pragmas -Wno-conversion")
if (CMAKE_CXX_COMPILER_VERSION VERSION_GREATER "12.0")
set (WARN_FLAGS "${WARN_FLAGS} -Wno-array-bounds -Wno-stringop-overread") # To compile protobuf in Fedora37
endif ()
@@ -174,6 +178,15 @@ enable_testing()

INCLUDE(CheckSourceCompiles)
INCLUDE(ThirdpartyToolchain)
message(STATUS "BUILD_ENABLE_AVX512: ${BUILD_ENABLE_AVX512}")
#
# macOS doesn't fully support AVX512; it handles AVX512 differently from Windows and Linux.
#
# A description can be found here:
# https://github.com/apple/darwin-xnu/blob/2ff845c2e033bd0ff64b5b6aa6063a1f8f65aa32/osfmk/i386/fpu.c#L174
if (BUILD_ENABLE_AVX512 AND NOT APPLE)
INCLUDE(ConfigSimdLevel)
endif ()

set (EXAMPLE_DIRECTORY ${CMAKE_SOURCE_DIR}/examples)

15 changes: 15 additions & 0 deletions README.md
@@ -93,3 +93,18 @@ To build only the C++ library:
% make test-out

```

To build the C++ library with AVX512 enabled:
Review comment (Member): Please check if this looks good to you. @stiga-huang

```shell
% export ORC_USER_SIMD_LEVEL=AVX512
% mkdir build
% cd build
% cmake .. -DBUILD_JAVA=OFF -DBUILD_ENABLE_AVX512=ON
% make package
% make test-out
```
The CMake option BUILD_ENABLE_AVX512 can be set to "ON" or "OFF" (the default). At compile time, it determines whether the AVX512 SIMD level is compiled into the binaries.

The environment variable ORC_USER_SIMD_LEVEL can be set to "AVX512" or "NONE" (the default). At run time, it selects the SIMD level used to dispatch code that has a SIMD-optimized implementation.

Note that if ORC_USER_SIMD_LEVEL is set to "NONE" at run time, AVX512 does not take effect even if BUILD_ENABLE_AVX512 was set to "ON" at compile time.
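
How the compile-time option and the run-time variable combine can be illustrated with a minimal C++ sketch. This is not the code from this PR: `SimdLevel`, `parseSimdLevel`, `useAvx512`, `simdLevelFromEnvironment`, and the `builtWithAvx512` / `cpuHasAvx512` flags are hypothetical names, and treating every value other than the exact string "AVX512" as "NONE" is an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical names throughout; this is an illustration, not the ORC implementation.
enum class SimdLevel { None, Avx512 };

// Maps the text of ORC_USER_SIMD_LEVEL to a level.
// Assumption: an unset or unrecognized value falls back to None.
SimdLevel parseSimdLevel(const char* value) {
  if (value == nullptr) {
    return SimdLevel::None;
  }
  return std::string(value) == "AVX512" ? SimdLevel::Avx512 : SimdLevel::None;
}

// Reads the run-time request from the environment.
SimdLevel simdLevelFromEnvironment() {
  return parseSimdLevel(std::getenv("ORC_USER_SIMD_LEVEL"));
}

// builtWithAvx512 stands in for the compile-time choice (BUILD_ENABLE_AVX512),
// cpuHasAvx512 for a run-time CPU feature check (assumed to exist elsewhere).
// The AVX512 path is taken only when all three conditions hold.
bool useAvx512(bool builtWithAvx512, bool cpuHasAvx512, SimdLevel requested) {
  return builtWithAvx512 && cpuHasAvx512 && requested == SimdLevel::Avx512;
}
```

With these definitions, `useAvx512(true, true, parseSimdLevel("NONE"))` is false, which mirrors the note that "NONE" disables AVX512 at run time even in a build configured with BUILD_ENABLE_AVX512=ON.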
484 changes: 484 additions & 0 deletions c++/src/BitUnpackerAvx512.hh

Large diffs are not rendered by default.

34 changes: 34 additions & 0 deletions c++/src/Bpacking.hh
@@ -0,0 +1,34 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#ifndef ORC_BPACKING_HH
#define ORC_BPACKING_HH

#include <cstdint>

namespace orc {
class RleDecoderV2;

class BitUnpack {
public:
static void readLongs(RleDecoderV2* decoder, int64_t* data, uint64_t offset, uint64_t len,
uint64_t fbs);
};
} // namespace orc

#endif
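
For orientation, the kind of operation that `readLongs` accelerates can be sketched in scalar C++. The sketch assumes that values are packed most significant bit first and that `fbs` denotes the fixed bit size of each value; `unpackBits` and its signature are hypothetical and reproduce neither the `RleDecoderV2` interface nor the AVX-512 implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference sketch (hypothetical helper, not part of ORC):
// reads `count` values of `bitWidth` bits each from `packed`,
// consuming bits from the most significant bit of each byte first.
std::vector<int64_t> unpackBits(const std::vector<uint8_t>& packed, std::size_t count,
                                uint32_t bitWidth) {
  assert(bitWidth >= 1 && bitWidth <= 64);
  assert(packed.size() * 8 >= count * bitWidth);

  std::vector<int64_t> out;
  out.reserve(count);
  std::size_t bitPos = 0;  // absolute bit offset into the packed stream
  for (std::size_t i = 0; i < count; ++i) {
    uint64_t value = 0;
    for (uint32_t b = 0; b < bitWidth; ++b, ++bitPos) {
      const uint8_t byte = packed[bitPos / 8];
      const uint64_t bit = (byte >> (7 - bitPos % 8)) & 1u;
      value = (value << 1) | bit;  // append the next bit at the low end
    }
    out.push_back(static_cast<int64_t>(value));
  }
  return out;
}
```

Under these assumptions, unpacking the bytes 0xAB 0xCD at a width of 4 bits yields 10, 11, 12, 13. A vectorized decoder would produce the same values while handling many of them per instruction, which is the motivation stated in the PR title.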