v12.12
Benchmarking
Initializing environment...
Loading IL program
Found RV610 device at 550 MHz (2 SIMDs, wavefront size=64)
28 MB of cached, 4 MB uncached RAM available
Compiling...
Linking...
Allocating LOCAL buffers
Program info:
Scratch regs needed: 0
Number of shared GPRs: 0
Number of shared GPRs total: 0
Slow mode: no
Number of wavefronts per SIMD: 0
Is max wavefronts per SIMD?: no
---Benchmarking core, peak size (no readback)---
Using optimal size (8x16)
Iters: 1024, time=109 ms, 9394 iters/sec, 4 Mkeys/sec
Iters: 2048, time=235 ms, 8714 iters/sec, 4 Mkeys/sec
Iters: 4096, time=484 ms, 8462 iters/sec, 4 Mkeys/sec
Iters: 8192, time=953 ms, 8596 iters/sec, 4 Mkeys/sec
Using optimal size (16x8)
Iters: 1024, time=109 ms, 9394 iters/sec, 4 Mkeys/sec
Iters: 2048, time=235 ms, 8714 iters/sec, 4 Mkeys/sec
Iters: 4096, time=469 ms, 8733 iters/sec, 4 Mkeys/sec
Iters: 8192, time=953 ms, 8596 iters/sec, 4 Mkeys/sec
---Trying grid (24x24)---
Iters: 256, time=110 ms, 2327 iters/sec, 5 Mkeys/sec
Iters: 512, time=203 ms, 2522 iters/sec, 5 Mkeys/sec
Iters: 1024, time=422 ms, 2426 iters/sec, 5 Mkeys/sec
Iters: 2048, time=828 ms, 2473 iters/sec, 5 Mkeys/sec
---Trying grid (32x32)---
Iters: 256, time=187 ms, 1368 iters/sec, 5 Mkeys/sec
Iters: 512, time=360 ms, 1422 iters/sec, 5 Mkeys/sec
Iters: 1024, time=703 ms, 1456 iters/sec, 5 Mkeys/sec
---Trying grid (40x40)---
Iters: 256, time=281 ms, 911 iters/sec, 5 Mkeys/sec
Iters: 512, time=531 ms, 964 iters/sec, 6 Mkeys/sec
---Trying grid (48x48)---
Iters: 256, time=391 ms, 654 iters/sec, 6 Mkeys/sec
Iters: 512, time=766 ms, 668 iters/sec, 6 Mkeys/sec
---Trying grid (56x56)---
Iters: 256, time=546 ms, 468 iters/sec, 5 Mkeys/sec
---Trying grid (64x64)---
Iters: 256, time=704 ms, 363 iters/sec, 5 Mkeys/sec
---Trying grid (72x72)---
Iters: 256, time=890 ms, 287 iters/sec, 5 Mkeys/sec
---Trying grid (80x80)---
Iters: 256, time=1078 ms, 237 iters/sec, 6 Mkeys/sec
****Calculating readback speed*****
Using optimal size (8x16)
Iters: 2048, time=1984 ms, 1032 iters/sec
Using optimal size (16x8)
Iters: 1024, time=1000 ms, 1024 iters/sec
---Trying grid (24x24)---
Iters: 1024, time=1032 ms, 992 iters/sec
---Trying grid (32x32)---
Iters: 1024, time=1063 ms, 963 iters/sec
---Trying grid (40x40)---
Iters: 1024, time=1109 ms, 923 iters/sec
---Trying grid (48x48)---
Iters: 1024, time=1156 ms, 885 iters/sec
---Trying grid (56x56)---
Iters: 1024, time=1250 ms, 819 iters/sec
---Trying grid (64x64)---
Iters: 1024, time=1359 ms, 753 iters/sec
---Trying grid (72x72)---
Iters: 1024, time=1422 ms, 720 iters/sec
---Trying grid (80x80)---
Iters: 1024, time=1515 ms, 675 iters/sec
****Benchmarking full cycle (1b4******
Using optimal size (8x16)
Iters: 1024, time=125 ms, 8192 iters/sec, 4 Mkeys/sec
Iters: 2048, time=234 ms, 8752 iters/sec, 4 Mkeys/sec
Iters: 4096, time=485 ms, 8445 iters/sec, 4 Mkeys/sec
Iters: 8192, time=953 ms, 8596 iters/sec, 4 Mkeys/sec
Using optimal size (16x8)
Iters: 1024, time=125 ms, 8192 iters/sec, 4 Mkeys/sec
Iters: 2048, time=234 ms, 8752 iters/sec, 4 Mkeys/sec
Iters: 4096, time=484 ms, 8462 iters/sec, 4 Mkeys/sec
Iters: 8192, time=953 ms, 8596 iters/sec, 4 Mkeys/sec
---Trying grid (24x24)---
Iters: 256, time=110 ms, 2327 iters/sec, 5 Mkeys/sec
Iters: 512, time=203 ms, 2522 iters/sec, 5 Mkeys/sec
Iters: 1024, time=422 ms, 2426 iters/sec, 5 Mkeys/sec
Iters: 2048, time=844 ms, 2426 iters/sec, 5 Mkeys/sec
---Trying grid (32x32)---
Iters: 256, time=171 ms, 1497 iters/sec, 6 Mkeys/sec
Iters: 512, time=360 ms, 1422 iters/sec, 5 Mkeys/sec
Iters: 1024, time=719 ms, 1424 iters/sec, 5 Mkeys/sec
---Trying grid (40x40)---
Iters: 256, time=281 ms, 911 iters/sec, 5 Mkeys/sec
Iters: 512, time=531 ms, 964 iters/sec, 6 Mkeys/sec
---Trying grid (48x48)---
Iters: 256, time=391 ms, 654 iters/sec, 6 Mkeys/sec
Iters: 512, time=781 ms, 655 iters/sec, 6 Mkeys/sec
---Trying grid (56x56)---
Iters: 256, time=531 ms, 482 iters/sec, 6 Mkeys/sec
---Trying grid (64x64)---
Iters: 256, time=703 ms, 364 iters/sec, 5 Mkeys/sec
---Trying grid (72x72)---
Iters: 256, time=891 ms, 287 iters/sec, 5 Mkeys/sec
---Trying grid (80x80)---
Iters: 256, time=1094 ms, 234 iters/sec, 5 Mkeys/sec
Deallocating resources