
Use TTaskGroup interface to unzip baskets in parallel. #1010

Merged (6 commits) on Feb 20, 2018

Conversation

zzxuanyuan
Contributor

@zzxuanyuan zzxuanyuan commented Sep 18, 2017

@bbockelm @pcanal @dpiparo

Here is the new imt unzipping basket with TTaskGroup interface.

Compared to #785, I noticed there is still a performance drop of ~3% (real time) to ~5% (CPU time) in the new implementation. The degradation is caused by the TBB function:

tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::receive_or_steal_task(long&)

I suspect the reason lies in how #785 behaves in the following function:

https://github.com/zzxuanyuan/root/blob/15cceff19b48dfe4a4b0c69c1ec07ea75bd1ccb5/tree/tree/src/TTreeCacheUnzip.cxx#L708

CreateTasks() explicitly creates 2 tasks (an empty_task and a MappingTask; set_ref_count(2) means 2 tasks in total). The scheduler might make a better decision here since it knows there will be only one task, besides the empty_task, running in the future.

On the other hand, TTaskGroup uses tbb::task_group which calls the following function:

https://github.com/01org/tbb/blob/b9805bacadd4d0474fd3358cf0c7153042ce50c3/include/tbb/task_group.h#L108

task_group_base() also first creates an empty_task. However, it registers only 1 task (itself) by setting the reference count to 1 (set_ref_count(1)). When it spawns another task by calling

https://github.com/01org/tbb/blob/b9805bacadd4d0474fd3358cf0c7153042ce50c3/include/tbb/task_group.h#L103

allocate_additional_child_of() creates a new task as a child and increments the reference count by 1. I suspect accumulating tasks on the fly might degrade performance, since the TBB scheduler could spend more time finding tasks to work on.

In short, I think explicitly defining the total number of tasks and the task graph up front should yield better performance (presumably it is more efficient for the scheduler) than adding tasks to the task_group as the program runs.

There are two alternative approaches that might improve the performance.

  1. Since we already know we will have only one task (besides the empty_task) to add to the task_group, we could revise the TTaskGroup interface and tell it in advance which task is going to run.
  2. We could get rid of TTaskGroup in my current implementation and synchronously map baskets to different tasks.

If we do not mind a small performance drop, the current implementation should be fine.

Thanks,

Zhe

@zzxuanyuan zzxuanyuan requested a review from pcanal as a code owner September 18, 2017 20:59
@phsft-bot
Collaborator

Can one of the admins verify this patch?

@dpiparo
Member

dpiparo commented Sep 19, 2017

Hi @zzxuanyuan ,

a lot of work. Thanks for exploring these aspects. I would have two comments:

  1. Are we really sure that ~3% real time is a number we have enough "resolution" to measure accurately?
  2. If that 3% is there systematically for several long runs of a trivial example and for something like a CMSSW skimming job, perhaps we can do something else. If I understand correctly, the TTaskGroup at this point is just a way to asynchronously schedule work. Wouldn't a thread running a lambda invoking TThreadedExecutor do the job too?

@zzxuanyuan
Contributor Author

Hi @dpiparo ,

  1. I tested MainEvent.cxx with 500~50000 events on my desktop running Ubuntu 14.04. I repeated 10 runs for each test case, and ~3% is the average performance drop. I have not had a chance to run a CMSSW skimming job; I am actually not familiar with CMSSW yet.

  2. What you said is correct. Since my case only needs one thread to invoke TThreadExecutor, using the tbb::task_group::run interface likely spends too much time in the scheduler's receive_or_steal_task function (according to the profiling results).

@dpiparo
Member

dpiparo commented Sep 19, 2017

Hi @zzxuanyuan , if this new setup proves good enough to get back that 3%, we can even consider having an execution policy for Async that lets the user choose between hitting the runtime, and therefore the worker pool, or spawning a new thread.

@zzxuanyuan
Contributor Author

@dpiparo , we could start with some simple APIs, and I hope we can validate that the ~3% is indeed caused by TTaskGroup::Run.

@dpiparo
Member

dpiparo commented Sep 20, 2017

@zzxuanyuan , sure. Before diving into the API upgrade, let's make sure the 3% is gone and start with the thread implementation of the parallel decompression and its tests. Upgrading your solid work will then be straightforward!

@dpiparo
Member

dpiparo commented Sep 20, 2017

Hi @zzxuanyuan , I think we discussed this already, but I'd like to go through this again. You need TTaskGroup to have something asynchronous. The real work is done by a "parallel for" incarnated in an invocation of the TThreadExecutor.
Given that the total number of workers is constant, can you remind me of the advantage we have of launching the "parallel for" from a different task/thread rather than having it invoked from the main thread directly? Is it because we think more tasks will be spawned by it in the meanwhile?

@zzxuanyuan
Contributor Author

zzxuanyuan commented Sep 20, 2017

Hi @dpiparo ,

We want to group small baskets together so that each task has a sufficient amount of work. Before TThreadExecutor starts decompressing baskets, our original plan (without async APIs) was that the main thread would iterate over all baskets and assign each task a group of baskets whose accumulated size is beyond 100 KB (for a single large basket >100 KB, we just assign that basket to its own task).

Iterating over all baskets and assigning them to tasks cannot be parallelized. It blocks the main thread, which could be harmful if there are lots of baskets to decompress. Therefore, we decided to hide this sequential processing in a background thread, which is how we came up with the idea of adopting asynchronous function calls in our code.

Contributor

@bbockelm bbockelm left a comment

A number of items to fix - some small (code formatting, header organization), some large (use TTaskGroup throughout instead of parallel_for).

See review comments.

Int_t fBlocksToGo;

// Unzipping related members
Int_t *fUnzipLen; ///<! [fNseek] Length of the unzipped buffers
char **fUnzipChunks; ///<! [fNseek] Individual unzipped chunks. Their summed size is kept under control.
Byte_t *fUnzipStatus; ///<! [fNSeek] For each blk, tells us if it's unzipped or pending
Long64_t fTotalUnzipBytes; ///<! The total sum of the currently unzipped blks
std::atomic<Byte_t> *fUnzipStatus; ///<! [fNSeek]
Contributor

Please fix alignment as best you can.

#include "TEnv.h"

#define THREADCNT 2
#include "ROOT/TThreadExecutor.hxx"
Contributor

Order headers alphabetically, group by generality.

Put #include "ROOT/TThreadExecutor.hxx" into the group of ROOT headers.

@@ -122,14 +118,9 @@ TTreeCacheUnzip::TTreeCacheUnzip(TTree *tree, Int_t buffersize) : TTreeCache(tre

void TTreeCacheUnzip::Init()
{
fMutexList = new TMutex(kTRUE);
fUnzipTaskGroup = nullptr;
Contributor

Please switch to std::unique_ptr instead.

}
// Prepare a static tmp buf of adequate size
if(locbuffsz < rdlen) {
if (locbuff) delete [] locbuff;
Contributor

Get rid of the manual memory management here; use std::vector instead.

if (locbuff) delete [] locbuff;
locbuffsz = rdlen;
locbuff = new char[locbuffsz];
//memset(locbuff, 0, locbuffsz);
Contributor

Remove commented-out code.

UnzipCache(reqi, locbuffsz, locbuff);
}
} else {
usleep(200000);
Contributor

Leave a TODO to implement a task-stealing scheme instead of sleeping for a random amount of time.

}

} // scope of the lock!
} else {
Contributor

Remove dead code block.

delete [] fCompBuffer;
fCompBuffer = new char[len*2];
fCompBufferSize = len*2;
delete [] fCompBuffer;
Contributor

Get rid of manual memory management (or at least use std::unique_ptr).

}


Contributor

Remove trailing whitespace.

accusz = 0;
}
ROOT::TThreadExecutor pool;
pool.Foreach(unzipFunction, basketIndices);
Contributor

I think this is the source of your speed issues.

Using TThreadExecutor causes you to have to pre-create all the tasks; the first one isn't executed until all are created. Use the TTaskGroup object and it will schedule the tasks as they are created.

Contributor Author

As we discussed earlier, the performance dropped by another ~3% when I used TTaskGroup::Run here.

Since I still keep the outer TTaskGroup, there is still a ~3% performance drop, but replacing the inner TTaskGroup with TThreadExecutor mitigates the slowdown.

Contributor

Ok - per our discussion at the ROOT IO meeting, let's leave this for now and just focus on the code style cleanups.

@bbockelm
Contributor

@zzxuanyuan - is it possible to get this PR updated / revised in the next day or two? I'll be at FNAL on Wednesday and would like to discuss this one with @pcanal.

@zzxuanyuan
Contributor Author

zzxuanyuan commented Sep 26, 2017

@bbockelm I fixed several issues raised in your comments. Could you look at it now?
clang-tidy-modernize fails due to "sleep(0.2)". I tried changing it to "usleep(200000)"; however, the performance was much slower. Removing the sleep call also causes a slowdown.

@amadio
Member

amadio commented Sep 29, 2017

Possible alternative to sleep, at least on Linux. http://man7.org/linux/man-pages/man2/sched_yield.2.html

@dpiparo
Member

dpiparo commented Sep 29, 2017

Hi, I think the sleep could be replaced by a condition variable. The STL also provides an implementation: http://en.cppreference.com/w/cpp/thread/condition_variable

@bbockelm
Contributor

bbockelm commented Oct 2, 2017

@dpiparo - we discussed the idea of a condition variable, but I'm wary. The condition variable approach causes a per-task overhead that is always paid (as notify would have to be done from the tasks). This per-task overhead is one of the things that make the pthreads implementation problematic.

However, this particular case is a fairly obscure corner case: this code is only triggered when there is exactly one remaining basket to unzip, the main thread needs it, and one of the TBB threads is currently working on it.

Talking to Zhe, I think the best way to go is a busy-loop (with sched_yield in the Linux case) instead. It has no penalty in the common case - and won't share the potential to overshoot the waiting by such a significant amount.

@zzxuanyuan
Contributor Author

@pcanal @bbockelm

The updates address some issues with the random-read case, and the code should be good now.

Some updates after last Friday meeting:

As we discussed last Friday, random-read performance is very slow. It technically cannot be improved if we decide to use the cache. I also tried the random-read workload with pthreads; the performance was the same as with TBB. I think the reason is clear: reading the next random event invalidates the current cache, all baskets need to be reset, and the cache buffer has to be refilled with the next cluster of events. Given the current cache replacement policy, the slow performance is expected.

Philippe pointed out that the common use case for ROOT is mostly sequential reads plus a few random reads. I was thinking it would not be helpful to store baskets decompressed by the main thread (when a cache miss happens) back into the cache: for sequential reads they will not be accessed again, and likewise for random reads, since the cache will be invalidated and all decompressed baskets in it marked invalid.

Contributor

@bbockelm bbockelm left a comment

@pcanal - I think it's now ready for your review!

@zzxuanyuan
Contributor Author

zzxuanyuan commented Oct 11, 2017

@dpiparo @bbockelm ,

Hi Danilo,

Based on the current cache replacement policy, the cache is invalidated (fIsTransferred is set to kFALSE) immediately once the first event miss occurs. In my current implementation, each task monitors fIsTransferred and returns immediately without doing actual unzipping work. But we still need to create tasks corresponding to the number of baskets. I am wondering if we should add a task_group::cancel() function to the TTaskGroup interface? In that case, the main thread would only need to cancel all tasks once the cache is invalid.

With the event simulation benchmark, I did not see much difference between task_group wait and cancel. But I guess cancel could be more efficient once the number of baskets in the cache buffer becomes larger.

if (fUnzipChunks) delete [] fUnzipChunks;
if (fUnzipStatus) delete [] fUnzipStatus;
}
void Clear(Int_t size);
Member

Sort the members alphabetically (at least within groupings of functionality).

pf = new TTreeCacheUnzip(this, cacheSize);
#else
pf = new TTreeCache(this, cacheSize);
#endif
Member

Would it make sense to write it as:

#ifdef R__USE_IMT
   if(TTreeCacheUnzip::IsParallelUnzip() && file->GetCompressionLevel() > 0)
      pf = new TTreeCacheUnzip(this, cacheSize);
   else
#endif
      pf = new TTreeCache(this, cacheSize);

@@ -41,21 +41,23 @@ TTreeCache::SetUnzipBufferSize(Long64_t bufferSize)
where bufferSize must be passed in bytes.
*/

#include "Bytes.h"
Member

Shouldn't it be with the other headers (after TTreeCacheUnzip.h)?

If it is needed inside TTreeCacheUnzip.h, then it should be included there (because each header should be standalone).

////////////////////////////////////////////////////////////////////////////////
/// Reset all baskets' state arrays.

void TTreeCacheUnzip::UnzipState::Reset(Int_t oldSize, Int_t newSize) {
Member

Since fUnzipStatus points to atomics, they are presumably accessed from multiple threads. Can Reset be called while there is potential access from other threads? If so, was the ordering of the operations here checked for thread safety? If so, can you add documentation/explanation? If Reset does not need to be thread safe, please note in a comment why this is the case.


////////////////////////////////////////////////////////////////////////////////

Bool_t TTreeCacheUnzip::UnzipState::IsUntouched(Int_t index) {
Member

Mark this function (and all functions to which it applies) as const.

// Triggered by the user, not the learning phase
if (entry == -1) entry=0;

TTree::TClusterIterator clusterIter = tree->GetClusterIterator(entry);
Member

The following code feels copy/pasted (and likely adapted) from somewhere else. If it follows the pattern of another routine and there is no easy way to factor them out, then please write down both here and in the original that the two are similar, and note here what the differences are. [I.e. at least something like: Inspired by XYZ::GetABC, adding calls to Some::Thing() inside the 2nd nested loop (or whatever is accurate :) ).]

return 1;
}

// Prepare a static tmp buf of adequate size
Member

The comment probably needs updating (it mentions a static buffer, but nothing seems (gladly) to be static).

Int_t loclen = UnzipBuffer(&ptr, locbuff);
if ((loclen > 0) && (loclen == objlen+keylen)) {
if ( (myCycle != fCycle) || !fIsTransferred) {
fUnzipState.SetFinished(index); // Set it as not done
Member

The comment and the function (name) seems to disagree on the semantic ...

std::vector<std::vector<Int_t>> basketIndices;
std::vector<Int_t> indices;
for (Int_t i = 0; i < fNseek; i++) {
while (accusz < 102400) {
Member

What is the semantics of 102400? Why was that particular value picked? Could the user ever want to customize this value?
Either way, please use a constexpr to record and name this value.

@@ -958,7 +872,7 @@ Int_t TTreeCacheUnzip::UnzipBuffer(char **dest, char *src)
/* early consistency check */
UChar_t *bufcur = (UChar_t *) (src + keylen);
Int_t nin, nbuf;
if(R__unzip_header(&nin, bufcur, &nbuf)!=0) {
if(objlen > nbytes-keylen && R__unzip_header(&nin, bufcur, &nbuf)!=0) {
Member

This line is changed in a commit titled "Use TTaskGroup interface to unzip baskets in parallel."
The link between this change and the title is not clear. If it is not related, can you put it in its own commit?

@dpiparo
Member

dpiparo commented Oct 16, 2017

Hi all. The implementation of this feature radically changed wrt ~1 month ago. How is the "Event" benchmark performing? How is the "CMSSW candle" performing?

@zzxuanyuan
Contributor Author

@dpiparo
The Event benchmark performs similarly to before. I have not tested CMSSW yet, since I do not know the correct version of the file to test with.

@zzxuanyuan
Contributor Author

zzxuanyuan commented Oct 17, 2017

Hi @dpiparo @bbockelm @pcanal

I ran two tests: the Event benchmark and B2HHH.root (compressed with zlib-6). Both tests disabled parallel TTree::GetEntry, since TBB's receive_or_steal_task function takes a disproportionately long share of the total runtime.

B2HHH.root (25 branches and 8556118 entries):

| Unzipping | Real Time (s) | CPU Time (s) |
| --- | --- | --- |
| Serial | 12.42 | 12.42 |
| IMT | 8.12 | 13.93 |

Event benchmark (56 branches and 10000 entries):

| Unzipping | Real Time (s) | CPU Time (s) |
| --- | --- | --- |
| Serial | 8.03 | 8.02 |
| IMT | 5.14 | 9.00 |

@zzxuanyuan
Contributor Author

@pcanal @bbockelm ,

This PR has been updated against upstream.

Zhe

@pcanal
Member

pcanal commented Jan 26, 2018

@phsft-bot build

@phsft-bot
Collaborator

Starting build on centos7/gcc49, mac1013/native, slc6/gcc49, slc6/gcc62, slc6/gcc62, ubuntu16/native, ubuntu16/native, windows10/vc15 with flags -Dvc=OFF -Dimt=ON -Dccache=ON
How to customize builds

@zzxuanyuan
Contributor Author

@pcanal Unfortunately, I can't see the output of the build; I do not have access permission. What are those failures about?

@zzxuanyuan
Contributor Author

zzxuanyuan commented Feb 1, 2018

@pcanal @bbockelm If I turn on TBB for both TTree::GetEntry() and TTreeCacheUnzip, the program crashes due to a misaligned memory read. I do not know why this happens, but if I turn on TBB for only one of TTree::GetEntry or TTreeCacheUnzip, the crash does not occur.

The stack trace from gdb is as follows:

//===========================================================
There was a crash.
This is the entire stack trace of all threads:
//===========================================================

Thread 4 (Thread 0x7f16176f2700 (LWP 30317)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f1624f3ee42 in __GI___pthread_mutex_lock (mutex=0x2e995b0) at ../nptl/pthread_mutex_lock.c:115
#2 0x00007f161a717930 in TLockGuard::TLockGuard (mutex=0x2e9b6e0, this=) at /home/zhe/buildimt/include/TVirtualMutex.h:85
#3 TTreeCacheUnzip::ReadBufferExt (this=0x2e97a30, buf=0x7f1616016010 "", pos=18817671, len=647382, loc=
0x7f16176ed584: -1) at /home/zhe/root/tree/tree/src/TTreeCacheUnzip.cxx:978
#4 0x00007f161a716b3c in TTreeCacheUnzip::GetUnzipBuffer (this=0x2e97a30, buf=0x7f16176ed620, pos=18817671, len=647382, free=0x7f16176ed61c) at /home/zhe/root/tree/tree/src/TTreeCacheUnzip.cxx:810
#5 0x00007f161a6a7d97 in TBasket::ReadBasketBuffers (this=this
entry=0x7f160c0008f0, pos=18817671, len=647382, file=file
entry=0x1e41b90) at /home/zhe/root/tree/tree/src/TBasket.cxx:474
#6 0x00007f161a6b22d0 in TBranch::GetBasket (this=this
entry=0x2f97910, basketnumber=0) at /home/zhe/root/tree/tree/src/TBranch.cxx:1159
#7 0x00007f161a6b29db in TBranch::GetEntry (this=0x2f97910, entry=0, getall=) at /home/zhe/root/tree/tree/src/TBranch.cxx:1285
#8 0x00007f161a6c6607 in TTree::<lambda()>::operator()(void) const (__closure=0x7ffdc1fce730) at /home/zhe/root/tree/tree/src/TTree.cxx:5478
#9 0x00007f161afe63a6 in std::function<void (unsigned int)>::operator()(unsigned int) const (__args#0=9, this=) at /usr/include/c++/5/functional:2267
#10 tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>::operator()(tbb::blocked_range const&) const (r=..., this=0x7f161ab3bd58) at /home/zhe/buildimt/include/tbb/parallel_for.h:162
#11 tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::run_body(tbb::blocked_range&) (r=..., this=0x7f161ab3bd40) at /home/zhe/buildimt/include/tbb/parallel_for.h:102
#12 tbb::interface9::internal::balancing_partition_type<tbb::interface9::internal::adaptive_modetbb::interface9::internal::auto_partition_type >::work_balance<tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>, tbb::blocked_range >(tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>&, tbb::blocked_range&) (range=..., start=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>'
#13 tbb::interface9::internal::partition_type_basetbb::interface9::internal::auto_partition_type::execute<tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>, tbb::blocked_range >(tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>&, tbb::blocked_range&) (range=..., start=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>'
#14 tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::execute() (this=0x7f161ab3bd40) at /home/zhe/buildimt/include/tbb/parallel_for.h:127
#15 0x00007f161adc854b in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0x7f161ab2fe00, parent=..., child=) at ../../src/tbb/custom_scheduler.h:501
#16 0x00007f161adc1522 in tbb::internal::arena::process (this=0x7f161ab4ed00, s=...) at ../../src/tbb/arena.cpp:159
#17 0x00007f161adbffa4 in tbb::internal::market::process (this=0x7f161ab57e80, j=...) at ../../src/tbb/market.cpp:677
#18 0x00007f161adbbbb6 in tbb::internal::rml::private_worker::run (this=0x7f161ab4fc00) at ../../src/tbb/private_server.cpp:271
#19 0x00007f161adbbe09 in tbb::internal::rml::private_worker::thread_routine (arg=) at ../../src/tbb/private_server.cpp:224
#20 0x00007f1624f3c6ba in start_thread (arg=0x7f16176f2700) at pthread_create.c:333
#21 0x00007f16258e741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 3 (Thread 0x7f1617af3700 (LWP 30316)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f1624f3ee42 in __GI___pthread_mutex_lock (mutex=0x2e995b0) at ../nptl/pthread_mutex_lock.c:115
#2 0x00007f161a717930 in TLockGuard::TLockGuard (mutex=0x2e9b6e0, this=) at /home/zhe/buildimt/include/TVirtualMutex.h:85
#3 TTreeCacheUnzip::ReadBufferExt (this=0x2e97a30, buf=0x7f1616016010 "", pos=1235260, len=1248359, loc=
0x7f1617aee584: -1) at /home/zhe/root/tree/tree/src/TTreeCacheUnzip.cxx:978
#4 0x00007f161a716b3c in TTreeCacheUnzip::GetUnzipBuffer (this=0x2e97a30, buf=0x7f1617aee620, pos=1235260, len=1248359, free=0x7f1617aee61c) at /home/zhe/root/tree/tree/src/TTreeCacheUnzip.cxx:810
#5 0x00007f161a6a7d97 in TBasket::ReadBasketBuffers (this=this
entry=0x7f16080008f0, pos=1235260, len=1248359, file=file
entry=0x1e41b90) at /home/zhe/root/tree/tree/src/TBasket.cxx:474
#6 0x00007f161a6b22d0 in TBranch::GetBasket (this=this
entry=0x2f884b0, basketnumber=0) at /home/zhe/root/tree/tree/src/TBranch.cxx:1159
#7 0x00007f161a6b29db in TBranch::GetEntry (this=0x2f884b0, entry=0, getall=) at /home/zhe/root/tree/tree/src/TBranch.cxx:1285
#8 0x00007f161a6c6607 in TTree::<lambda()>::operator()(void) const (__closure=0x7ffdc1fce730) at /home/zhe/root/tree/tree/src/TTree.cxx:5478
#9 0x00007f161afe60b3 in std::function<void (unsigned int)>::operator()(unsigned int) const (__args#0=13, this=) at /usr/include/c++/5/functional:2267
#10 tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>::operator()(tbb::blocked_range const&) const (r=..., this=0x7f161ab4ba58) at /home/zhe/buildimt/include/tbb/parallel_for.h:162
#11 tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::run_body(tbb::blocked_range&) (r=..., this=0x7f161ab4ba40) at /home/zhe/buildimt/include/tbb/parallel_for.h:102
#12 tbb::interface9::internal::balancing_partition_type<tbb::interface9::internal::adaptive_modetbb::interface9::internal::auto_partition_type >::work_balance<tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>, tbb::blocked_range >(tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>&, tbb::blocked_range&) (range=..., start=..., this=) at /home/zhe/buildimt/include/tbb/partitioner.h:429
#13 tbb::interface9::internal::partition_type_basetbb::interface9::internal::auto_partition_type::execute<tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>, tbb::blocked_range >(tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>&, tbb::blocked_range&) (range=..., start=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>'
#14 tbb::interface9::internal::start_for<tbb::blocked_range, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::execute() (this=0x7f161ab4ba40) at /home/zhe/buildimt/include/tbb/parallel_for.h:127
#15 0x00007f161adc854b in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0x7f161ab3fe00, parent=..., child=) at ../../src/tbb/custom_scheduler.h:501
#16 0x00007f161adc1522 in tbb::internal::arena::process (this=0x7f161ab4ed00, s=...) at ../../src/tbb/arena.cpp:159
#17 0x00007f161adbffa4 in tbb::internal::market::process (this=0x7f161ab57e80, j=...) at ../../src/tbb/market.cpp:677
#18 0x00007f161adbbbb6 in tbb::internal::rml::private_worker::run (this=0x7f161ab4fd00) at ../../src/tbb/private_server.cpp:271
#19 0x00007f161adbbe09 in tbb::internal::rml::private_worker::thread_routine (arg=) at ../../src/tbb/private_server.cpp:224
#20 0x00007f1624f3c6ba in start_thread (arg=0x7f1617af3700) at pthread_create.c:333
#21 0x00007f16258e741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7f1617ef4700 (LWP 30315)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f1624f3ee42 in __GI___pthread_mutex_lock (mutex=0x2e995b0) at ../nptl/pthread_mutex_lock.c:115
#2 0x00007f161a717930 in TLockGuard::TLockGuard (mutex=0x2e9b6e0, this=) at /home/zhe/buildimt/include/TVirtualMutex.h:85
#3 TTreeCacheUnzip::ReadBufferExt (this=0x2e97a30, buf=0x7f1616016010 "", pos=19528010, len=1132885, loc=
0x7f1617eef584: -1) at /home/zhe/root/tree/tree/src/TTreeCacheUnzip.cxx:978
#4 0x00007f161a716b3c in TTreeCacheUnzip::GetUnzipBuffer (this=0x2e97a30, buf=0x7f1617eef620, pos=19528010, len=1132885, free=0x7f1617eef61c) at /home/zhe/root/tree/tree/src/TTreeCacheUnzip.cxx:810
#5 0x00007f161a6a7d97 in TBasket::ReadBasketBuffers (this=this
entry=0x7f1610000ac0, pos=19528010, len=1132885, file=file
entry=0x1e41b90) at /home/zhe/root/tree/tree/src/TBasket.cxx:474
#6 0x00007f161a6b22d0 in TBranch::GetBasket (this=this
entry=0x2f99770, basketnumber=0) at /home/zhe/root/tree/tree/src/TBranch.cxx:1159
#7 0x00007f161a6b29db in TBranch::GetEntry (this=0x2f99770, entry=0, getall=) at /home/zhe/root/tree/tree/src/TBranch.cxx:1285
#8 0x00007f161a6c6607 in TTree::<lambda()>::operator()(void) const (__closure=0x7ffdc1fce730) at /home/zhe/root/tree/tree/src/TTree.cxx:5478
#9 0x00007f161afe60b3 in std::function<void (unsigned int)>::operator()(unsigned int) const (__args#0=6, this=<optimized out>) at /usr/include/c++/5/functional:2267
#10 tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>::operator()(tbb::blocked_range<unsigned int> const&) const (r=..., this=0x7f161ab4b858) at /home/zhe/buildimt/include/tbb/parallel_for.h:162
#11 tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::run_body(tbb::blocked_range<unsigned int>&) (r=..., this=0x7f161ab4b840) at /home/zhe/buildimt/include/tbb/parallel_for.h:102
#12 tbb::interface9::internal::balancing_partition_type<tbb::interface9::internal::adaptive_mode<tbb::interface9::internal::auto_partition_type> >::work_balance<tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>, tbb::blocked_range<unsigned int> >(tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>&, tbb::blocked_range<unsigned int>&) (range=..., start=..., this=<optimized out>) at /home/zhe/buildimt/include/tbb/partitioner.h:429
#13 tbb::interface9::internal::partition_type_base<tbb::interface9::internal::auto_partition_type>::execute<tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>, tbb::blocked_range<unsigned int> >(tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>&, tbb::blocked_range<unsigned int>&) (range=..., start=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>'
#14 tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::execute() (this=0x7f161ab4b840) at /home/zhe/buildimt/include/tbb/parallel_for.h:127
#15 0x00007f161adc854b in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0x7f161ab37e00, parent=..., child=<optimized out>) at ../../src/tbb/custom_scheduler.h:501
#16 0x00007f161adc1522 in tbb::internal::arena::process (this=0x7f161ab4ed00, s=...) at ../../src/tbb/arena.cpp:159
#17 0x00007f161adbffa4 in tbb::internal::market::process (this=0x7f161ab57e80, j=...) at ../../src/tbb/market.cpp:677
#18 0x00007f161adbbbb6 in tbb::internal::rml::private_worker::run (this=0x7f161ab4fc80) at ../../src/tbb/private_server.cpp:271
#19 0x00007f161adbbe09 in tbb::internal::rml::private_worker::thread_routine (arg=<optimized out>) at ../../src/tbb/private_server.cpp:224
#20 0x00007f1624f3c6ba in start_thread (arg=0x7f1617ef4700) at pthread_create.c:333
#21 0x00007f16258e741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7f1626ed7a40 (LWP 30289)):
#0 0x00007f16258ac0cb in __GI___waitpid (pid=30320, stat_loc=stat_loc@entry=0x7ffdc1fcb5c0, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
#1 0x00007f1625824fbb in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
#2 0x00007f162696e21d in TUnixSystem::Exec (shellcmd=<optimized out>, this=0x15da570) at /home/zhe/root/core/unix/src/TUnixSystem.cxx:2118
#3 TUnixSystem::StackTrace (this=0x15da570) at /home/zhe/root/core/unix/src/TUnixSystem.cxx:2412
#4 0x00007f162697085c in TUnixSystem::DispatchSignals (this=0x15da570, sig=kSigSegmentationViolation) at /home/zhe/root/core/unix/src/TUnixSystem.cxx:3643
#5 <signal handler called>
#6 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:37
#7 0x00007f1626206eb4 in memcpy (__len=1234974, __src=<optimized out>, __dest=0x7f1616858010) at /usr/include/x86_64-linux-gnu/bits/string3.h:53
#8 TFileCacheRead::ReadBufferExtNormal (this=0x2e97a30, buf=0x7f1616858010 <error: Cannot access memory at address 0x7f1616858010>, pos=286, len=1234974, loc=0x7ffdc1fcdf44: 0) at /home/zhe/root/io/io/src/TFileCacheRead.cxx:531
#9 0x00007f161a71794a in TTreeCacheUnzip::ReadBufferExt (this=0x2e97a30, buf=<optimized out>, pos=<optimized out>, len=<optimized out>, loc=<optimized out>) at /home/zhe/root/tree/tree/src/TTreeCacheUnzip.cxx:979
#10 0x00007f161a716b3c in TTreeCacheUnzip::GetUnzipBuffer (this=0x2e97a30, buf=0x7ffdc1fcdfe0, pos=286, len=1234974, free=0x7ffdc1fcdfdc) at /home/zhe/root/tree/tree/src/TTreeCacheUnzip.cxx:810
#11 0x00007f161a6a7d97 in TBasket::ReadBasketBuffers (this=this@entry=0x2e9b770, pos=286, len=1234974, file=file@entry=0x1e41b90) at /home/zhe/root/tree/tree/src/TBasket.cxx:474
#12 0x00007f161a6b22d0 in TBranch::GetBasket (this=this@entry=0x2f7f180, basketnumber=0) at /home/zhe/root/tree/tree/src/TBranch.cxx:1159
#13 0x00007f161a6b29db in TBranch::GetEntry (this=0x2f7f180, entry=0, getall=<optimized out>) at /home/zhe/root/tree/tree/src/TBranch.cxx:1285
#14 0x00007f161a6c6607 in TTree::<lambda()>::operator()(void) const (__closure=0x7ffdc1fce730) at /home/zhe/root/tree/tree/src/TTree.cxx:5478
#15 0x00007f161afe60b3 in std::function<void (unsigned int)>::operator()(unsigned int) const (__args#0=0, this=<optimized out>) at /usr/include/c++/5/functional:2267
#16 tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>::operator()(tbb::blocked_range<unsigned int> const&) const (r=..., this=0x7f161ab4bd58) at /home/zhe/buildimt/include/tbb/parallel_for.h:162
#17 tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::run_body(tbb::blocked_range<unsigned int>&) (r=..., this=0x7f161ab4bd40) at /home/zhe/buildimt/include/tbb/parallel_for.h:102
#18 tbb::interface9::internal::balancing_partition_type<tbb::interface9::internal::adaptive_mode<tbb::interface9::internal::auto_partition_type> >::work_balance<tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>, tbb::blocked_range<unsigned int> >(tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>&, tbb::blocked_range<unsigned int>&) (range=..., start=..., this=<optimized out>) at /home/zhe/buildimt/include/tbb/partitioner.h:429
#19 tbb::interface9::internal::partition_type_base<tbb::interface9::internal::auto_partition_type>::execute<tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>, tbb::blocked_range<unsigned int> >(tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>&, tbb::blocked_range<unsigned int>&) (range=..., start=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>'
#20 tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::execute() (this=0x7f161ab4bd40) at /home/zhe/buildimt/include/tbb/parallel_for.h:127
#21 0x00007f161adc854b in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0x7f161ab46600, parent=..., child=<optimized out>) at ../../src/tbb/custom_scheduler.h:501
#22 0x00007f161adc5450 in tbb::internal::generic_scheduler::local_spawn_root_and_wait (this=0x7f161ab46600, first=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>'
#23 0x00007f161afe48b2 in tbb::task::spawn_root_and_wait (root=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>'
#24 tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::run(tbb::blocked_range<unsigned int> const&, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int> const&, tbb::auto_partitioner const&) (partitioner=..., body=<optimized out>, range=<optimized out>) at /home/zhe/buildimt/include/tbb/parallel_for.h:90
#25 tbb::parallel_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int> >(tbb::blocked_range<unsigned int> const&, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int> const&, tbb::auto_partitioner const&) (partitioner=..., body=<optimized out>, range=<optimized out>) at /home/zhe/buildimt/include/tbb/parallel_for.h:200
#26 tbb::strict_ppl::parallel_for_impl<unsigned int, std::function<void (unsigned int)>, tbb::auto_partitioner const>(unsigned int, unsigned int, unsigned int, std::function<void (unsigned int)> const&, tbb::auto_partitioner const&) (first=0, last=<optimized out>, step=1, f=..., partitioner=...) at /home/zhe/buildimt/include/tbb/parallel_for.h:268
#27 0x00007f161afe4a7d in tbb::strict_ppl::parallel_for_impl<unsigned int, std::function<void (unsigned int)>, tbb::auto_partitioner const>(unsigned int, unsigned int, unsigned int, std::function<void (unsigned int)> const&, tbb::auto_partitioner const&) (partitioner=..., f=..., step=1, last=<optimized out>, first=0) at /home/zhe/root/core/imt/src/TThreadExecutor.cxx:92
#28 tbb::strict_ppl::parallel_for<unsigned int, std::function<void (unsigned int)> >(unsigned int, unsigned int, unsigned int, std::function<void (unsigned int)> const&) (f=..., step=1, last=<optimized out>, first=0) at /home/zhe/buildimt/include/tbb/parallel_for.h:275
#29 ROOT::TThreadExecutor::ParallelFor(unsigned int, unsigned int, unsigned int, std::function<void (unsigned int)> const&) (this=this@entry=0x7ffdc1fce720, start=start@entry=0, end=<optimized out>, step=step@entry=1, f=...) at /home/zhe/root/core/imt/src/TThreadExecutor.cxx:91
#30 0x00007f161a6cb094 in ROOT::TThreadExecutor::Foreach<TTree::GetEntry(Long64_t, Int_t)::<lambda()> > (nTimes=<optimized out>, func=..., this=0x7ffdc1fce720) at /home/zhe/buildimt/include/ROOT/TThreadExecutor.hxx:115
#31 TTree::GetEntry (this=0x2c85d30, entry=0, getall=0) at /home/zhe/root/tree/tree/src/TTree.cxx:5489
#32 0x00000000004012fd in main ()
//===========================================================


@pcanal (Member) commented Feb 2, 2018:

@phsft-bot build

@phsft-bot (Collaborator) commented:

Starting build on centos7/gcc49, mac1013/native, slc6/gcc49, slc6/gcc62, slc6/gcc62, ubuntu16/native, ubuntu16/native, windows10/vc15 with flags -Dvc=OFF -Dimt=ON -Dccache=ON

@phsft-bot (Collaborator) commented:

@zzxuanyuan (Contributor, Author) commented Feb 3, 2018 via email:

@zzxuanyuan (Contributor, Author) commented:

@pcanal @bbockelm
The failed test cases look like transient failures, reported as "Time Out".
I re-ran all the failed tests on my desktop and all of them passed except this one:

[projectroot.roottest.root.multicore.roottest_root_multicore_tp_process_imt]

It still times out on my desktop. I also tried this particular test with the latest upstream ROOT; it does not pass there either.

@bbockelm (Contributor) commented:

@phsft-bot build

@phsft-bot (Collaborator) commented:

Starting build on centos7/gcc49, mac1013/native, slc6/gcc49, slc6/gcc62, slc6/gcc62, ubuntu16/native, ubuntu16/native, windows10/vc15 with flags -Dvc=OFF -Dimt=ON -Dccache=ON

@phsft-bot (Collaborator) commented:

Build failed on ubuntu16/native.
See console output.

@bbockelm (Contributor) commented:

The test run seems to have failed because uploading the test results to CDash failed, not because of the tests themselves:

05:58:48 100% tests passed, 0 tests failed out of 1038

Two minutes later:

06:00:58    Error message was: Operation too slow. Less than 1 bytes/sec transferred the last 120 seconds
06:00:58    Problems when submitting via HTTP

@bbockelm (Contributor) commented:

@phsft-bot build

(from the Mattermost discussion, it seems there were overnight issues in CVMFS?)

@phsft-bot (Collaborator) commented:

Starting build on centos7/gcc49, mac1013/native, slc6/gcc49, slc6/gcc62, slc6/gcc62, ubuntu16/native, ubuntu16/native, windows10/vc15 with flags -Dvc=OFF -Dimt=ON -Dccache=ON

@pcanal pcanal merged commit bd37056 into root-project:master Feb 20, 2018