Performance and portability testing #25
Comments
Raspberry Pi 3
ARM Cortex-A53, 1.2 GHz, 1 GB LPDDR2, Ubuntu 16.04 armv7l (32-bit), compiled with GCC 8.1.0
AMD Ryzen 7 1700
Clocked at 3.35 GHz (Turbo disabled), 16 GB of DDR4-2933 (dual channel), Ubuntu 16.04 x86_64, compiled with GCC 8.1.0
Intel Core i7-8550U
Boosts to ~3.5 GHz at the beginning, then throttles down to ~2.4 GHz (mining mode).
|
Intel Core i7-7700HQ
16 GB of DDR4. Base speed 2.8 GHz. Boosts to ~3.5 GHz in testing. L3 cache 6 MB. Windows 10 64-bit. Used precompiled binaries.
|
@SamsungGalaxyPlayer Can you try |
Intel Core i5-8250U CPU @ 1.60GHz
Compiled latest commit |
Intel(R) Xeon(R) CPU E5506 @ 2.13GHz
|
AMD Ryzen Mobile Pro 2700U @ 2.20GHz, 32GB DDR4-2400
|
Geekbox (Rockchip RK3368, 8xARM Cortex-A53 @ 1.5GHz, 2GB RAM)
I'm a bit overwhelmed with work at the moment, but I can get ARM AES working after my next conference is over in a couple weeks. Or someone else can beat me to it if they like. |
Xeon E5-2670v2 @ 2.50 GHz (2 processors), 128 GB ECC DDR3
|
Try mining mode with only 10 threads. It might run faster than 12 threads because of L2 cache size. |
You mean L3? It is 25 MB per CPU (plus 2.5 MB of L2). So 12 threads should be OK.
|
OK. Someone tested an 8 core Xeon with 20 MB of cache in #22 and got better performance with 8 threads than 10. So your case is different I guess. |
Note that the only system where the binary worked was an Ubuntu 18 box. Everything else I had to compile.
Now here's an interesting case. This is a 4-core virtual machine on top of some kind of Opteron, a 6138 or something. nproc spits out 4, but /proc/cpuinfo shows 1.
And here's where I grumble about this 4 GB dataset again.
I had to increase swap on the last one just to get it to mine. Is it going to require 8 GB of RAM even if the box is just GPU mining? I mean, I'm definitely not in favor of GPU-focused mining, but making every rig have a minimum of 8 GB... basically, GPU miners try to get the hardware that runs the GPUs as cheaply as possible. Check the Octominer: it maxes out at 8 GB of RAM. Grumble grumble. |
Can you please do:
and pastebin the result?
Even if those CPUs had enough RAM, I'm estimating the following hashrates:
Perhaps the Celeron G3900 would be worth it if you could find a cheap 4 GB stick to add to the system. But all these CPUs will have very low performance per watt. My laptop can do ~1670 H/s at 30 W, so it probably has 5-10x the efficiency of these machines. Trying to mine using the swap file is a pointless effort. If you insist on mining on a low-end system, you can mine in verification mode and at least get 5-25 H/s using just 256 MB of RAM. |
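The performance-per-watt comparison above is simple arithmetic. A minimal sketch, using only the laptop figures quoted in the comment (~1670 H/s at 30 W); the function name is hypothetical, and the low-end range is just what the stated "5-10x" factor would imply, not a measurement:

```python
def hashes_per_watt(hashrate_hs: float, power_w: float) -> float:
    """Return efficiency in hashes per second per watt (H/s/W)."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return hashrate_hs / power_w


# Figures quoted in the comment above: ~1670 H/s at 30 W.
laptop = hashes_per_watt(1670, 30)
print(f"laptop: {laptop:.1f} H/s/W")

# Range implied by the "5-10x" estimate (derived, not measured).
implied_low = laptop / 10
implied_high = laptop / 5
print(f"implied low-end range: {implied_low:.1f}-{implied_high:.1f} H/s/W")
```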
4x Xeon E5-4640 @ 2.5 GHz verification mode:
mining mode:
total hashrate: 10166.31 hashes per second best result with a single process:
|
2x Xeon E5-2450L @ 2 GHz
|
@tevador, it was just grumbling. Here are the results... I had to use pastebin. |
@Gingeropolous Sorry, I missed some options on the command line. Try this again:
|
@tevador, I fixed it on that VM. When I changed the makefile earlier today I didn't run make clean, so yeah. I
|
AMD Ryzen 5 2600 (6 cores), DDR4 2666 MHz (all stock), Windows 10. Precompiled binary:
Binary compiled with GCC 8.2.0 on msys64:
|
I have pushed a new release with a different form of the division instruction (issue #26). A slight performance increase (1-5%) can be expected. |
i5-4310U 12GB (hp840G1 mint19.1)
numactl -i all ./randomx --mine --nonces 100000 --init 1 --threads 1
numactl -i all ./randomx --mine --nonces 100000 --init 1 --threads 2
numactl -i all ./randomx --mine --nonces 100000 --init 2 --threads 1
numactl -i all ./randomx --mine --nonces 100000 --init 2 --threads 2
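The four runs above cover every combination of 1 or 2 initialization threads with 1 or 2 mining threads. A small sketch of how such a sweep could be generated; the helper name `sweep_commands` is hypothetical, and the flags are only those that appear in the commands in this thread:

```python
from itertools import product


def sweep_commands(init_counts, thread_counts, nonces=100000,
                   prefix="numactl -i all ./randomx"):
    """Build one benchmark command line per (init, threads) combination."""
    commands = []
    for init, threads in product(init_counts, thread_counts):
        commands.append(
            f"{prefix} --mine --nonces {nonces} --init {init} --threads {threads}"
        )
    return commands


# Print the same 2x2 sweep as in the comment above.
for command in sweep_commands([1, 2], [1, 2]):
    print(command)
```

The commands are only printed here; running them and collecting the hashrates is left to the reader.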
|
Intel Xeon E5-2651 v2 1.80 GHz (2 processors), 8 GB memory
RandomX - verification mode
RandomX - mining mode |
i5-4310U 12GB (hp840G1 mint19.1)
randomx --verify
randomx --verify --softAes
|
i5-8500 8GB (Windows10beta1803)
randomx --verify
randomx --verify --softAes
randomx --mine --nonces 100000 --init 1 --threads 4
randomx --mine --nonces 100000 --init 4 --threads 4
randomx --mine --nonces 100000 --init 6 --threads 6
randomx --mine --nonces 100000 --init 6 --threads 5
randomx --mine --nonces 100000 --init 6 --threads 8
|
AMD Ryzen 6 9850 (128 cores) 256Gb Memory
|
HP DL580 G7: verification mode:
mining mode (Power Reading at iLO3 = 726 Watts; Idle = 323 Watts)
Single process mining (Power Reading at iLO3 = 704 Watts):
|
@kdovijak
AFAIK there is no such CPU and also your result doesn't match any version of RandomX. Can you please fix your comment and include the whole command line? |
Intel i7-4770 @ 3.4 GHz
RandomX - verification mode
RandomX - mining mode |
Intel Core i5-8250U CPU @ 1.60GHz
Compiled latest commit xmrig
Compiled latest commit |
i7 3770K, default clocks, pre-compiled Windows binary
randomx --jit --largepages --verify
randomx --mine --init 8 --threads 4 --nonces 100000 --largepages
randomx --mine --init 8 --threads 5 --nonces 100000 --largepages
randomx --mine --init 8 --threads 6 --nonces 100000 --largepages
randomx --mine --init 8 --threads 7 --nonces 100000 --largepages
randomx --mine --init 8 --threads 8 --nonces 100000 --largepages
XMRig, 4 threads and large pages: ~260 H/s
CNV-4 is fastest at 4 threads; RandomX is fastest at 7 and 8 threads (equal). 4 threads of CNV-4 use about 50% of the CPU.
Reported CPU usages: 8T 105% |
@JustFranz Run randomx --mine --init 8 --threads 4 --nonces 100000 --largepages and, immediately after it starts, use Task Manager to set the affinity to threads 0, 2, 4, 6 and the priority to High. The hashrate should be around 1400 H/s (I think). |
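The affinity suggestion above can also be expressed as a bitmask, which is the form command-line tools usually expect. A minimal sketch with a hypothetical helper; it assumes that logical CPUs 0, 2, 4 and 6 sit on four different physical cores, which is the premise of the comment but depends on how the OS numbers hyperthreads:

```python
def affinity_mask(cpus):
    """Return an integer bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


mask = affinity_mask([0, 2, 4, 6])
print(f"hex mask: {mask:X}")
```

On Windows, a hexadecimal mask like this can be passed to the built-in `start` command together with `/high` (for example `start /high /affinity <mask> randomx.exe ...`) instead of setting both values in Task Manager after launch.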
AMD Ryzen 3 2200G
Clocked at 3.20 GHz (Turbo disabled), 16 GB of DDR4-2133 (single channel), NixOS kernel 5.0.2.
with clang 7
with gcc 7.4
Intel Celeron G3900 @ 2.80 GHz
8 GB of DDR3-1333 (dual channel), NixOS kernel 4.19.29.
with clang 7
with gcc 7.4
|
1 million nonces, high priority and affinity 0, 2, 4, 6:
--mine --init 8 --threads 4 --nonces 1000000 --largepages |
@nioroso-x3 Can you please retest the latest master on big endian? I have fixed some bugs thanks to your test files. If the result still doesn't match the reference, please repeat the same steps as before on the debug branch and send me the 2 files. |
Still doesn't match the reference. |
AMD FX(tm)-8320, 8 cores, 4.00 GHz, 8 GB RAM, Windows 10
randomx.exe --verify --jit
randomx.exe --mine --largePages --init 8 --threads 8 --nonces 100000 |
@nioroso-x3 I have pushed a possible fix directly into the debug branch. Please run it and see if the final hash at the bottom is |
Nope, still doesn't match LE x86 |
Intel i7-4790K CPU @ 4.00GHz
|
Noob question: does hashing in 'verify mode' earn you any Moneroj, or is it just for clients/nodes, etc.? |
@MoneroChan 'Verification mode' produces the same results as 'mining mode', it's just much slower. So you could theoretically mine with it, but it would be very inefficient. |
How far are we from the final release and testnet launch? |
@miki-bgd-011 Code freeze will happen at the end of April. We need to fix #34 and #35 first. After that, only critical bugs/weaknesses will be fixed and some parameter tuning might also happen. Work on pool/miner implementations can start in May. |
Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz (8 cores/16 threads)
|
Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz (10 cores/20 threads)
|
Intel Core 2 Duo E8400: randomx.exe --mine --init 1 --threads 1 --largePages --softAes --nonces 100000 |
Single-channel DDR4 bottleneck experiment
randomx.exe --mine --init 2 --threads 8 --largePages --nonces 100000
Note: CPU utilization shows 50%, which is probably just an artifact of hyperthreading.
@tevador I suspect single-channel DDR4 can hit 4000 H/s with a good OC, as I didn't even try to OC the RAM properly (just dropped CAS by 1). |
Ryzen 7 1700 @ 3.6 GHz, dual channel DDR4-2666, 14-16-16-35
|
Windows 10 Pro
Mining mode:
|
Dual 7501 64 core 2GHz
|
Dual 7601 64 core 2.2GHz
|
Why is my laptop's hashrate so low? (Intel Core i7-8550U, 16 GB of DDR4-2400, Windows 10 Home edition.) Full memory mode (2080 MiB). Also, the largePages parameter fails on my laptop.
|
It's because you are running in interpreter mode. Add the --jit option to use the compiled VM.
You need to enable the "Lock Pages in Memory" policy for your account. Search online if you don't know how to do it. |
Thanks a lot.
The --jit setting improved the hashrate to 1300 H/s on my i7-8550U laptop (HP 840 G5, 16 GB DDR4 RAM, Windows 10).
I'm trying to run RandomX on an ARMv8 Android phone. There are many mobile phones in China; I think this is a huge distributed computing cluster.
It would be nice to see the compilation options and benchmark data for ARMv8.
|
AMD FX-6300 (6 core, DDR3) w/ large pages:
|
Please post:
Precompiled binaries are available for Windows 64-bit and Ubuntu/Debian x86-64. Download the latest release here: https://github.com/tevador/RandomX/releases
Verification mode
Run as:
--jit option to get 2-3 times faster verification.
--softAes option to get a small increase in performance.
--largePages option to get a small increase in performance.
Mining mode
Mining mode is currently only supported on 64-bit x86 CPUs. Requires at least 2.25 GiB of RAM.
Run as:
Q (number of initialization threads) equal to the number of hardware threads of your CPU.
T (number of mining threads) which produces the highest hashrate. Starting point should be 1 thread per 2 MB of L3 cache, but some CPUs can benefit from running more threads, while some CPUs cannot run more than 1 thread per core efficiently depending on other factors such as L1/L2 cache sizes.
N (number of nonces) equal to 10000, 100000 or 1000000 depending on the performance of your system. Aim for at least a 60-second mining period for accurate results.
--softAes option. Your mining performance will be about 40% lower.
--largePages option to get a significant increase in performance (up to 25%).
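The guidance on mining threads and nonce counts can be turned into a rough starting point before any tuning. A minimal sketch under stated assumptions: both helper names are hypothetical, capping the thread count at the number of hardware threads is an added assumption, and the expected hashrate is a guess the tester supplies. The result is only where to begin; the instructions above still call for measuring which thread count gives the highest hashrate.

```python
def starting_threads(l3_cache_mb, hardware_threads):
    """Starting point: 1 mining thread per 2 MB of L3 cache.

    Capped at the number of hardware threads (an assumption made here).
    """
    by_cache = int(l3_cache_mb // 2)
    return max(1, min(by_cache, hardware_threads))


def pick_nonces(expected_hashrate, min_seconds=60,
                choices=(10000, 100000, 1000000)):
    """Pick the smallest suggested nonce count that runs for min_seconds."""
    if expected_hashrate <= 0:
        raise ValueError("expected hashrate must be positive")
    for nonces in choices:
        if nonces / expected_hashrate >= min_seconds:
            return nonces
    return choices[-1]


# 25 MB of L3 per CPU, as mentioned earlier in the thread.
print(starting_threads(25, 20))
# Roughly 1300 H/s, as reported for a laptop with --jit.
print(pick_nonces(1300))
```

For the 20 MB Xeon mentioned in the thread, the rule gives 10 threads, while the reported best result was 8, which illustrates why the rule is only a starting point.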