Example auto-tune as of 2018-12-06 with Titan X
magnum edited this page Jan 18, 2019 · 4 revisions
Example auto-tune:
Calculating best GWS for LWS=512; max. 200ms single kernel invocation.
Raw speed figures including buffer transfers:
gws: 3072 70471c/s 1154455922 rounds/s 43.591ms per crypt_all()! <--- ~45 ms
gws: 6144 141378c/s 2316054396 rounds/s 43.457ms per crypt_all()! <--- ~45 ms
gws: 12288 271768c/s 4452103376 rounds/s 45.215ms per crypt_all()+ <--- ~45 ms
gws: 24576 286046c/s 4686005572 rounds/s 85.916ms per crypt_all()+
gws: 49152 287033c/s 4702174606 rounds/s 171.241ms per crypt_all()
gws: 98304 288612c/s 4728041784 rounds/s 340.609ms per crypt_all()
gws: 196608 289690c/s 4745701580 rounds/s 678.682ms per crypt_all()+
gws: 393216 290196c/s 4753990872 rounds/s 1.354s per crypt_all()
gws: 786432 290719c/s 4762558658 rounds/s 2.705s per crypt_all()
gws: 1572864 290919c/s 4765835058 rounds/s 5.406s per crypt_all()
gws: 3145728 291966c/s 4782987012 rounds/s 10.774s per crypt_all()
Local worksize (LWS) 512, global worksize (GWS) 196608
DONE
Speed for cost 1 (key version [0:PMKID 1:WPA 2:WPA2 3:802.11w]) of 2
Raw: 289129 c/s real, 289129 c/s virtual, GPU util: 99%
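For the above example, we probably want to set min_keys_per_crypt to 12288. Up to that GWS each crypt_all() call still takes about 45 ms, so the speed scales almost linearly with GWS (from 70471 c/s to 271768 c/s). Beyond it the duration roughly doubles with every step while the speed barely improves.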
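The reasoning above can be sketched as a small script that reads the gws lines and picks the largest GWS whose crypt_all() duration is still close to the shortest one. This is only an illustration, not part of John the Ripper: the names `parse`, `knee` and the 10% tolerance are assumptions made for this example.

```python
import re

# The gws lines from the auto-tune output above (annotations removed).
LOG = """\
gws: 3072 70471c/s 1154455922 rounds/s 43.591ms per crypt_all()!
gws: 6144 141378c/s 2316054396 rounds/s 43.457ms per crypt_all()!
gws: 12288 271768c/s 4452103376 rounds/s 45.215ms per crypt_all()+
gws: 24576 286046c/s 4686005572 rounds/s 85.916ms per crypt_all()+
gws: 49152 287033c/s 4702174606 rounds/s 171.241ms per crypt_all()
gws: 98304 288612c/s 4728041784 rounds/s 340.609ms per crypt_all()
gws: 196608 289690c/s 4745701580 rounds/s 678.682ms per crypt_all()+
gws: 393216 290196c/s 4753990872 rounds/s 1.354s per crypt_all()
gws: 786432 290719c/s 4762558658 rounds/s 2.705s per crypt_all()
gws: 1572864 290919c/s 4765835058 rounds/s 5.406s per crypt_all()
gws: 3145728 291966c/s 4782987012 rounds/s 10.774s per crypt_all()
"""

UNIT_TO_MS = {"us": 1e-3, "ms": 1.0, "s": 1e3}
LINE = re.compile(
    r"gws:\s*(\d+)\s+(\d+)c/s\s+\d+ rounds/s\s+([\d.]+)(us|ms|s) per crypt_all\(\)"
)

def parse(log):
    """Return a list of (gws, candidates_per_second, duration_ms) tuples."""
    rows = []
    for m in LINE.finditer(log):
        duration_ms = float(m.group(3)) * UNIT_TO_MS[m.group(4)]
        rows.append((int(m.group(1)), int(m.group(2)), duration_ms))
    return rows

def knee(rows, tolerance=1.10):
    """Largest GWS whose duration is within `tolerance` of the shortest one.

    The 10% tolerance is an arbitrary choice for this sketch.
    """
    shortest = min(ms for _, _, ms in rows)
    return max(gws for gws, _, ms in rows if ms <= shortest * tolerance)

if __name__ == "__main__":
    print(knee(parse(LOG)))
```

With the figures above, the three smallest sizes all fall inside the tolerance and the script selects 12288, the same value chosen by eye.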
The same run with a per-kernel timing breakdown (each breakdown line precedes the gws line it describes):
Calculating best GWS for LWS=512; max. 200ms single kernel invocation.
Raw speed figures including buffer transfers:
xfer: 34.784us, init: 64.736us, loop: 78x555.744us, pass2: 46.080us, final: 83.584us, xfer: 9.472us
gws: 3072 70471c/s 1154455922 rounds/s 43.591ms per crypt_all()!
xfer: 67.040us, init: 53.120us, loop: 78x554.112us, pass2: 40.256us, final: 54.560us, xfer: 16.928us
gws: 6144 141378c/s 2316054396 rounds/s 43.457ms per crypt_all()!
xfer: 131.360us, init: 70.880us, loop: 78x574.848us, pass2: 51.264us, final: 86.304us, xfer: 31.584us
gws: 12288 271768c/s 4452103376 rounds/s 45.215ms per crypt_all()+
xfer: 259.872us, init: 110.880us, loop: 78x1.093ms, pass2: 82.816us, final: 90.400us, xfer: 60.832us
gws: 24576 286046c/s 4686005572 rounds/s 85.916ms per crypt_all()+
xfer: 516.672us, init: 210.816us, loop: 78x2.180ms, pass2: 168.864us, final: 162.400us, xfer: 119.488us
gws: 49152 287033c/s 4702174606 rounds/s 171.241ms per crypt_all()
xfer: 1.031ms, init: 350.688us, loop: 78x4.337ms, pass2: 334.848us, final: 298.560us, xfer: 237.024us
gws: 98304 288612c/s 4728041784 rounds/s 340.609ms per crypt_all()
xfer: 2.057ms, init: 623.840us, loop: 78x8.644ms, pass2: 641.504us, final: 551.424us, xfer: 471.200us
gws: 196608 289690c/s 4745701580 rounds/s 678.682ms per crypt_all()+
xfer: 4.125ms, init: 1.147ms, loop: 78x17.260ms, pass2: 1.235ms, final: 1.033ms, xfer: 940.096us
gws: 393216 290196c/s 4753990872 rounds/s 1.354s per crypt_all()
xfer: 8.295ms, init: 2.177ms, loop: 78x34.462ms, pass2: 2.372ms, final: 2.010ms, xfer: 1.876ms
gws: 786432 290719c/s 4762558658 rounds/s 2.705s per crypt_all()
xfer: 16.451ms, init: 4.284ms, loop: 78x68.882ms, pass2: 4.608ms, final: 3.950ms, xfer: 3.765ms
gws: 1572864 290919c/s 4765835058 rounds/s 5.406s per crypt_all()
xfer: 32.922ms, init: 8.428ms, loop: 78x137.270ms, pass2: 9.164ms, final: 7.818ms, xfer: 7.514ms
gws: 3145728 291966c/s 4782987012 rounds/s 10.774s per crypt_all()
xfer: 65.836ms, init: 16.762ms, loop: 78x274.142ms (exceeds 200ms)
Local worksize (LWS) 512, global worksize (GWS) 196608
DONE
Speed for cost 1 (key version [0:PMKID 1:WPA 2:WPA2 3:802.11w]) of 2
Raw: 289129 c/s real, 289129 c/s virtual, GPU util: 99%
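To see how a breakdown line relates to the crypt_all() time reported on the following gws line, the stages can be added up. The sketch below assumes that the total is simply the sum of all stages, with the loop kernel counted as many times as the "78x" prefix says; that reading is inferred from the numbers, not taken from the John the Ripper source. The helper name `total_ms` is made up for this example.

```python
import re

UNIT_TO_MS = {"us": 1e-3, "ms": 1.0, "s": 1e3}

# Matches e.g. "init: 64.736us" or "loop: 78x555.744us".
TOKEN = re.compile(r"(\w+): (?:(\d+)x)?([\d.]+)(us|ms|s)\b")

def total_ms(breakdown_line):
    """Sum all stages of one breakdown line, in milliseconds."""
    total = 0.0
    for _name, count, value, unit in TOKEN.findall(breakdown_line):
        # A stage without an "Nx" prefix runs once.
        total += int(count or 1) * float(value) * UNIT_TO_MS[unit]
    return total

if __name__ == "__main__":
    line = ("xfer: 34.784us, init: 64.736us, loop: 78x555.744us, "
            "pass2: 46.080us, final: 83.584us, xfer: 9.472us")
    print(f"{total_ms(line):.3f} ms")
```

For the gws 3072 line this gives roughly 43.59 ms, within a few microseconds of the reported 43.591ms, and the 78 loop invocations account for almost all of it. The 200ms limit in the log applies to a single loop invocation, which is why the last attempt stops at 274.142ms.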