[fix](es catalog) fix issue with select and insert from es catalog core #24318

Merged: 2 commits, Sep 13, 2023

@qidaye qidaye commented Sep 13, 2023

Proposed changes

Issue Number: close #24315

The root cause is that Elasticsearch's long type accepts floats and strings on insert, and Doris did not handle these cases during type conversion. With this fix, when a float or string is found in a long field, Doris takes the integer part before the decimal point.
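
As a rough illustration of the strategy described above, here is a minimal C++ sketch. It is not the code from this PR: the helper names `es_long_from_double` and `es_long_from_string` are hypothetical, and the handling of out-of-range values, empty strings, and exponent notation is an assumption made for the example.

```cpp
#include <cerrno>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper (not a Doris API): convert a double found in an ES
// long field by truncating toward zero. Returns false if the value is not
// finite or does not fit in int64_t.
bool es_long_from_double(double value, std::int64_t* out) {
    // 2^63 is exactly representable as a double; the valid range is [-2^63, 2^63).
    const double kLimit = 9223372036854775808.0;
    if (!std::isfinite(value) || value < -kLimit || value >= kLimit) {
        return false;
    }
    *out = static_cast<std::int64_t>(value); // the cast drops the fractional part
    return true;
}

// Hypothetical helper (not a Doris API): convert a string found in an ES
// long field by keeping only the text before the decimal point.
// Assumption: anything that is not a plain base-10 integer before the '.'
// (for example "abc" or "1e3") is rejected rather than guessed at.
bool es_long_from_string(const std::string& text, std::int64_t* out) {
    // find() returns npos when there is no '.', so substr keeps the whole string.
    const std::string head = text.substr(0, text.find('.'));
    if (head.empty()) {
        return false;
    }
    errno = 0;
    char* end = nullptr;
    const long long parsed = std::strtoll(head.c_str(), &end, 10);
    // Reject overflow and any trailing characters that were not consumed.
    if (errno == ERANGE || end != head.c_str() + head.size()) {
        return false;
    }
    *out = static_cast<std::int64_t>(parsed);
    return true;
}
```

Under these assumptions, the float 1.9 and the string "7.25" yield 1 and 7 respectively, while a non-numeric string is reported as a conversion failure that the caller can turn into an error or a NULL.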

@qidaye qidaye changed the title [fix](es catalog) fix issue with insert from es catalog core [fix](es catalog) fix issue with select and insert from es catalog core Sep 13, 2023
@qidaye
Contributor Author

qidaye commented Sep 13, 2023

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 50.97 seconds
stream load tsv: 595 seconds loaded 74807831229 Bytes, about 119 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.2 seconds inserted 10000000 Rows, about 342K ops/s
storage size: 17162267809 Bytes

@qidaye
Contributor Author

qidaye commented Sep 13, 2023

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@qidaye
Contributor Author

qidaye commented Sep 13, 2023

run beut

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.01% (7968/21527)
Line Coverage: 28.96% (63854/220464)
Region Coverage: 27.90% (33155/118815)
Branch Coverage: 24.46% (17011/69550)
Coverage Report: http://coverage.selectdb-in.cc/coverage/fda1276e13867cea1c9db359fcc225cfaf08429d_fda1276e13867cea1c9db359fcc225cfaf08429d/report/index.html

@doris-robot

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 51.05 seconds
stream load tsv: 595 seconds loaded 74807831229 Bytes, about 119 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.2 seconds inserted 10000000 Rows, about 342K ops/s
storage size: 17162411714 Bytes

@xiaokang xiaokang added the dev/2.0.2 and usercase (Important user case type label) labels Sep 13, 2023
Contributor

@morningman morningman left a comment

LGTM

Member

@zy-kkk zy-kkk left a comment

LGTM

@github-actions github-actions bot added the approved (Indicates a PR has been approved by one committer.) label Sep 13, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit 11afd32 into apache:master Sep 13, 2023
16 of 17 checks passed
@qidaye qidaye deleted the es_insert_core branch September 14, 2023 02:08
xiaokang pushed a commit that referenced this pull request Sep 14, 2023
…re (#24318)

Issue Number: close #24315

The root cause of this issue is that Elasticsearch's long type allows inserting floats and strings. Doris did not handle these cases when doing type conversion. The current strategy is to take the integer before the decimal point if a float or string is found.
lide-reed pushed a commit that referenced this pull request Aug 20, 2024
…alog core (#39585)

## Proposed changes

Issue Number: pick #24318

```
(gdb) 
#0  0x00007fb77df2b387 in raise () from /lib64/libc.so.6
#1  0x00007fb77df2ca78 in abort () from /lib64/libc.so.6
#2  0x000056189ef8e1d9 in ?? ()
#3  0x000056189ef837ed in google::LogMessage::Fail() ()
#4  0x000056189ef85d29 in google::LogMessage::SendToLog() ()
#5  0x000056189ef83356 in google::LogMessage::Flush() ()
#6  0x000056189ef86399 in google::LogMessageFatal::~LogMessageFatal() ()
#7  0x000056189ac09e08 in doris::vectorized::IColumn::append_data_by_selector_impl<doris::vectorized::ColumnNullable> (selector=..., res=..., 
    this=<optimized out>) at /var/local/ldb-toolchain/include/c++/11/ext/new_allocator.h:89
#8  doris::vectorized::ColumnNullable::append_data_by_selector (this=<optimized out>, res=..., selector=...)
    at /data/TCHouse-D-1.2/be/src/vec/columns/column_nullable.h:201
#9  0x000056189acab325 in doris::vectorized::Block::append_block_by_selector (this=this@entry=0x7fb6a9e3a510, columns=..., selector=...)
    at /var/local/ldb-toolchain/include/c++/11/bits/stl_vector.h:1043
#10 0x000056189e90e9a8 in doris::stream_load::VNodeChannel::add_block (this=0x56198399b090, block=0x7fb6a9e3a510, payload=...)
    at /data/TCHouse-D-1.2/be/src/vec/core/block.h:421
#11 0x000056189e914840 in doris::stream_load::VOlapTableSink::send (this=<optimized out>, state=<optimized out>, input_block=<optimized out>)
    at /data/TCHouse-D-1.2/be/src/vec/sink/vtablet_sink.cpp:608
#12 0x0000561899b13261 in doris::PlanFragmentExecutor::open_vectorized_internal (this=this@entry=0x56199fdaaf80)
    at /data/TCHouse-D-1.2/be/src/runtime/plan_fragment_executor.cpp:322
#13 0x0000561899b1423e in doris::PlanFragmentExecutor::open (this=this@entry=0x56199fdaaf80)
    at /data/TCHouse-D-1.2/be/src/runtime/plan_fragment_executor.cpp:261
#14 0x0000561899aec9a0 in doris::FragmentExecState::execute (this=0x56199fdaaf00) at /data/TCHouse-D-1.2/be/src/runtime/fragment_mgr.cpp:260
#15 0x0000561899aeffae in doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>) (this=this@entry=0x5618a9b11400, exec_state=..., cb=...) at /var/local/ldb-toolchain/include/c++/11/bits/shared_ptr_base.h:1290
#16 0x0000561899af04e2 in operator() (__closure=<optimized out>) at /data/TCHouse-D-1.2/be/src/runtime/fragment_mgr.cpp:737
#17 std::__invoke_impl<void, doris::FragmentMgr::exec_plan_fragment(const doris::TExecPlanFragmentParams&, doris::FragmentMgr::FinishCallback)::<lambda()>&> (__f=...) at /var/local/ldb-toolchain/include/c++/11/bits/invoke.h:61
#18 std::__invoke_r<void, doris::FragmentMgr::exec_plan_fragment(const doris::TExecPlanFragmentParams&, doris::FragmentMgr::FinishCallback)::<lambda()>&> (__fn=...) at /var/local/ldb-toolchain/include/c++/11/bits/invoke.h:111
#19 std::_Function_handler<void(), doris::FragmentMgr::exec_plan_fragment(const doris::TExecPlanFragmentParams&, doris::FragmentMgr::FinishCallback)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:291
#20 0x0000561899d9fcb5 in std::function<void ()>::operator()() const (this=<optimized out>)
    at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:560
#21 doris::FunctionRunnable::run (this=<optimized out>) at /data/TCHouse-D-1.2/be/src/util/threadpool.cpp:46
#22 doris::ThreadPool::dispatch_thread (this=0x5618ab4e0700) at /data/TCHouse-D-1.2/be/src/util/threadpool.cpp:535
#23 0x0000561899d9510f in std::function<void ()>::operator()() const (this=0x5618a9e02ef8)
    at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:560
#24 doris::Thread::supervise_thread (arg=0x5618a9e02ee0) at /data/TCHouse-D-1.2/be/src/util/thread.cpp:454
#25 0x00007fb77dce0ea5 in start_thread () from /lib64/libpthread.so.0
#26 0x00007fb77dff39fd in clone () from /lib64/libc.so.6
```

Labels: approved (Indicates a PR has been approved by one committer.), dev/2.0.2-merged, reviewed, usercase (Important user case type label)
Projects: None yet
Development: successfully merging this pull request may close the issue "[Bug] select/insert into select es catalog core".
5 participants