In some cases we may wish to rechunk our data prior to execution. This can help balance high scheduling overhead (too many tasks) against poor load balancing (too few tasks).
It appears that different algorithms have different optimal chunk sizes. For example, algorithms with low task counts, like ADMM, benefit from smaller chunk sizes, while algorithms with many small tasks, like gradient/proximal descent, benefit from larger chunk sizes.
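As a rough illustration with dask.array (the shapes and chunk sizes below are placeholders, not tuned recommendations):

```python
import dask.array as da

# Hypothetical design matrix: 10M rows, 100 columns, chunked by rows
X = da.random.random((10_000_000, 100), chunks=(1_000_000, 100))

# For an algorithm with few tasks (e.g. ADMM) we might rechunk into
# smaller blocks to improve load balancing across workers.
X_small = X.rechunk((100_000, 100))

# For an algorithm with many small tasks (e.g. gradient/proximal descent)
# we might rechunk into larger blocks to cut scheduler overhead.
X_large = X.rechunk((2_500_000, 100))
```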
Section 4.2, Cache-aware Access, of Tianqi Chen and Carlos Guestrin's XGBoost paper discusses the tradeoff between smaller and larger block sizes. Smaller blocks give each thread too little work and lead to inefficient parallelization, while larger blocks result in processor cache misses. They do this analysis for two data sets, Allstate-10M and Higgs-10M. Figure 9 plots time per tree versus number of threads for block sizes of 2^12, 2^16, 2^20, and 2^24. There is a significant difference between the performance of the 2^24 block size and the other block sizes. https://arxiv.org/pdf/1603.02754.pdf
Maybe we could duplicate this experiment for ADMM on those two data sets. They used S3 instead of HDFS.
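If we did, a minimal sketch of the timing loop might look like the following; the `fit` callable stands in for whichever solver we benchmark (e.g. a hypothetical `fit_admm`), and the block sizes just mirror the paper's 2^12 to 2^24 sweep:

```python
import time

def benchmark_block_sizes(X, fit, block_sizes=(2**12, 2**16, 2**20, 2**24)):
    """Time `fit` on a dask array rechunked to several row-block sizes."""
    timings = {}
    for rows_per_block in block_sizes:
        Xb = X.rechunk((rows_per_block, X.shape[1]))
        start = time.perf_counter()
        fit(Xb)  # assumed to block until the computation finishes
        timings[rows_per_block] = time.perf_counter() - start
    return timings
```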
Also, it is worth looking at Section 4.3, Blocks for Out-of-core Computation, which discusses block compression and sharding. You might also look at the Blosc/c-blosc package that HDF5 uses. https://github.com/Blosc/c-blosc
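For reference, a minimal sketch of round-tripping a block through the python-blosc bindings (the array is just a stand-in for a data block, and the LZ4 codec choice is an arbitrary example):

```python
import blosc
import numpy as np

block = np.arange(2**20, dtype="float64")  # stand-in for one data block
packed = blosc.compress(block.tobytes(), typesize=8, cname="lz4")
restored = np.frombuffer(blosc.decompress(packed), dtype=block.dtype)

assert np.array_equal(block, restored)
print(f"{block.nbytes} bytes -> {len(packed)} bytes compressed")
```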