Allwinner A40i China-Made Development Board: Comprehensive Performance Test

2022-11-30 09:45:24

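The board under test is a Tronlong development board based on the Allwinner A40i. It is rich in interfaces, bringing out dual Ethernet, dual CAN, dual USB, and dual RS485 communication ports, with on-board Bluetooth, WIFI, and (optional) 4G modules. It also brings out audio/video multimedia interfaces including MIPI LCD, LVDS LCD, TFT LCD, HDMI OUT, CVBS OUT, CAMERA, LINE IN, and H/P OUT. It supports dual displays showing different content, 1080P@45fps H.264 hardware video encoding, and 1080P@60fps H.264 hardware video decoding, and provides a SATA interface for high-capacity storage.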

The review below was written by an evaluation user.

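Preface

Having previously tried out the development environment, I now take a qualitative look at performance in several areas.

Benchmark (CoreMark)

Open a WSL terminal.

Download the code: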

git clone https://github.com/eembc/coremark.git

cd coremark/

vi simple/core_portme.h

Change

#ifndef COMPILER_FLAGS
#define COMPILER_FLAGS \
    FLAGS_STR /* "Please put compiler flags here (e.g. -o3)" */
#endif

to

#ifndef COMPILER_FLAGS
#define COMPILER_FLAGS \
    "-O3" /* "Please put compiler flags here (e.g. -o3)" */
#endif

For the -O0 build, use "-O0" instead.

Also change

typedef ee_u32 ee_ptr_int;

to

typedef unsigned long ee_ptr_int;

Compile:

export PATH=$PATH:~/lichee/out/sun8iw11p1/linux/common/buildroot/host/usr/bin

arm-linux-gnueabihf-gcc -o coremarko0 core_list_join.c core_main.c core_matrix.c core_state.c core_util.c simple/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=100000 -Isimple -I. -O0

arm-linux-gnueabihf-gcc -o coremarko3 core_list_join.c core_main.c core_matrix.c core_state.c core_util.c simple/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=100000 -Isimple -I. -O3
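As an alternative to editing the header before each build, the flags string could probably be supplied on the command line. The sketch below assumes that simple/core_portme.h still wraps the definition in an #ifndef COMPILER_FLAGS guard, so that a -DCOMPILER_FLAGS definition takes precedence. The helper name coremark_build_cmd is made up for illustration, and it only prints the two commands rather than running them.

```shell
# Print (not run) the cross-compile command for one optimization level.
# $1 = optimization level without the dash, e.g. O0 or O3
coremark_build_cmd() {
    opt="$1"
    # Output name follows the article's convention: coremarko0, coremarko3
    lower=$(printf '%s' "$opt" | tr 'O' 'o')
    printf '%s\n' "arm-linux-gnueabihf-gcc -o coremark${lower} core_list_join.c core_main.c core_matrix.c core_state.c core_util.c simple/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=100000 -DCOMPILER_FLAGS='\"-${opt}\"' -Isimple -I. -${opt}"
}

for opt in O0 O3; do
    coremark_build_cmd "$opt"
done
```

Piping this output to sh from inside the coremark directory would run both builds, and each binary should then report the flags it was really built with.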

Copy the binaries to the Windows side:

cp coremarko0 coremarko3 /mnt/d

Then transfer them to the development board over the serial port with rz.

Add execute permission:

chmod +x coremarko0 coremarko3

Run:

./coremarko0

./coremarko3

The results are below; the optimization level makes a large difference.

root@T3/A40i-Tronlong:~# ./coremarko0

2K performance run parameters for coremark.

CoreMark Size : 666

Total ticks : 146952831

Total time (secs): 146.952831

Iterations/Sec : 680.490463

Iterations : 100000

Compiler version : GCC9.4.0

Compiler flags : -O0

Memory location : STACK

seedcrc : 0xe9f5

[0]crclist : 0xe714

[0]crcmatrix : 0x1fd7

[0]crcstate : 0x8e3a

[0]crcfinal : 0xd340

Correct operation validated. See README.md for run and reporting rules.

CoreMark 1.0 : 680.490463 / GCC9.4.0 -O0 / STACK

root@T3/A40i-Tronlong:~# ./coremarko3

2K performance run parameters for coremark.

CoreMark Size : 666

Total ticks : 29362505

Total time (secs): 29.362505

Iterations/Sec : 3405.703975

Iterations : 100000

Compiler version : GCC9.4.0

Compiler flags : -O0

Memory location : STACK

seedcrc : 0xe9f5

[0]crclist : 0xe714

[0]crcmatrix : 0x1fd7

[0]crcstate : 0x8e3a

[0]crcfinal : 0xd340

Correct operation validated. See README.md for run and reporting rules.

CoreMark 1.0 : 3405.703975 / GCC9.4.0 -O0 / STACK

Search for Cortex-A7 at https://www.eembc.org/coremark/scores.php to compare against published scores for the same CPU core.

The CPU here is a Cortex-A7 at 1.2GHz.
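To put the two runs on a comparable footing, the scores can be reduced to a speedup ratio and a CoreMark/MHz figure, which is the form usually used when comparing cores at different clocks. This is a small sketch using the Iterations/Sec values from the logs above; the function names are illustrative, and it assumes the core really ran at 1.2GHz during the test, which the logs do not confirm.

```shell
# Ratio of two CoreMark scores, rounded to two decimals.
score_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# CoreMark per MHz for a given score and core clock in MHz.
coremark_per_mhz() {
    awk -v s="$1" -v mhz="$2" 'BEGIN { printf "%.2f\n", s / mhz }'
}

score_ratio 3405.703975 680.490463   # -O3 build relative to -O0 build
coremark_per_mhz 3405.703975 1200    # -O3 build, assumed 1200 MHz clock
```

With these inputs the -O3 build comes out about 5 times faster than the -O0 build, at roughly 2.84 CoreMark/MHz. Both commands above build the single-threaded configuration, so the figure describes one core only.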

RAM performance test

In WSL:

Download the code:

git clone https://github.com/qinyunti/STREAM.git

cd STREAM/

Compile:

export PATH=$PATH:~/lichee/out/sun8iw11p1/linux/common/buildroot/host/usr/bin

arm-linux-gnueabihf-gcc -O3 -DSTREAM_ARRAY_SIZE=5000000 stream.c -o stream.5M

Copy to the Windows side:

cp stream.5M /mnt/d

Then transfer it to the development board over the serial port with rz.

Add execute permission:

chmod +x stream.5M

Run:

./stream.5M

The results are as follows:

root@T3/A40i-Tronlong:~# ./stream.5M

-------------------------------------------------------------

STREAM version $Revision: 5.10 $

-------------------------------------------------------------

This system uses 8 bytes per array element.

-------------------------------------------------------------

Array size = 5000000 (elements), Offset = 0 (elements)

Memory per array = 38.1 MiB (= 0.0 GiB).

Total memory required = 114.4 MiB (= 0.1 GiB).

Each kernel will be executed 10 times.

The *best* time for each kernel (excluding the first iteration)

will be used to compute the reported bandwidth.

-------------------------------------------------------------

Your clock granularity/precision appears to be 1 microseconds.

Each test below will take on the order of 52219 microseconds.

(= 52219 clock ticks)

Increase the size of the arrays if this shows that

you are not getting at least 20 clock ticks per test.

-------------------------------------------------------------

WARNING -- The above is only a rough guideline.

For best results, please be sure you know the

precision of your system timer.

-------------------------------------------------------------

Function Best Rate MB/s Avg time Min time Max time

Copy: 972.1 0.083436 0.082297 0.084256

Scale: 868.5 0.092398 0.092110 0.092609

Add: 829.7 0.144716 0.144639 0.144788

Triad: 683.4 0.175755 0.175587 0.175917

-------------------------------------------------------------

Solution Validates: avg error less than 1.000000e-13 on all three arrays

Reference: https://www.cs.virginia.edu/stream/ref.html

RAM stress test

Reference: https://pyropus.ca./software/memtester/

In WSL:

Download the code:
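The Best Rate column can be reproduced from the Min time column, which makes the table easier to read. STREAM counts the bytes each kernel moves per pass: Copy and Scale touch two arrays, Add and Triad touch three, and each element is 8 bytes here. The sketch below redoes that arithmetic for the run above; the function name stream_rate is illustrative.

```shell
# STREAM rate in MB/s: arrays touched * 8 bytes * elements / best time.
# $1 = arrays touched per pass, $2 = array elements, $3 = min time in seconds
stream_rate() {
    awk -v k="$1" -v n="$2" -v t="$3" \
        'BEGIN { printf "%.1f\n", k * 8 * n / t / 1e6 }'
}

stream_rate 2 5000000 0.082297   # Copy
stream_rate 2 5000000 0.092110   # Scale
stream_rate 3 5000000 0.144639   # Add
stream_rate 3 5000000 0.175587   # Triad
```

The four values agree with the reported 972.1, 868.5, 829.7, and 683.4 MB/s, where MB means 10^6 bytes.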

wget https://pyropus.ca./software/memtester/old-versions/memtester-4.5.1.tar.gz

tar -xvf memtester-4.5.1.tar.gz

cd memtester-4.5.1/

Compile:

export PATH=$PATH:~/lichee/out/sun8iw11p1/linux/common/buildroot/host/usr/bin

arm-linux-gnueabihf-gcc -O3 memtester.c tests.c -o memtester

Copy to the Windows side, download to the development board, and add execute permission:

cp memtester /mnt/d

chmod +x memtester

Run:

./memtester

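By default the test repeats indefinitely; the number of loops can be given as the last argument. For example:

./memtester 128M 1

128M is the amount of RAM to test.

1 means run the test once.

The -p option can also be used to specify a physical address directly, which suits the board bring-up stage, when memory is tested at fixed physical addresses much as bare-metal code would do.

The output is as follows: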

root@T3/A40i-Tronlong:~# ./memtester 128M 1

memtester version 4.5.1 (32-bit)

Copyright (C) 2001-2020 Charles Cazabon.

Licensed under the GNU General Public License version 2 (only).

pagesize is 4096

pagesizemask is 0xfffff000

want 128MB (134217728 bytes)

got 128MB (134217728 bytes), trying mlock ...locked.

Loop 1/1:

Stuck Address : ok

Random Value : ok

Compare XOR : ok

Compare SUB : ok

Compare MUL : ok

Compare DIV : ok

Compare OR : ok

Compare AND : ok

Sequential Increment: ok

Solid Bits : ok

Block Sequential : ok

Checkerboard : ok

Bit Spread : ok

Bit Flip : ok

Walking Ones : ok

Walking Zeroes : ok

Done.

eMMC performance test

dmesg | grep mmc

The 4GB eMMC:

[ 4.008550] mmc0: new HS200 MMC card at address 0001

[ 4.009409] mmcblk0: mmc0:0001 S04111 3.56 GiB

And the 16GB SD card:

[ 8.202017] mmc1: new high speed SDHC card at address aaaa

[ 8.208872] mmcblk1: mmc1:aaaa SL16G 14.8 GiB

The eMMC runs in HS200 mode. The eMMC speed modes are:

Speed Mode        Clock (MHz)
Default Speed     26
High Speed SDR    52
High Speed DDR    52
HS200             200
HS400             200
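The clock alone does not give the bus throughput; it also depends on the bus width and on whether data moves on one clock edge (SDR) or both (DDR). As a rough guide, the sketch below computes the theoretical peak for each mode. It assumes an 8-bit data bus, which is not stated in this article, and the function name is illustrative.

```shell
# Theoretical peak bus throughput in MB/s.
# $1 = clock in MHz, $2 = bus width in bits,
# $3 = transfers per clock (1 = SDR, 2 = DDR)
emmc_peak_mbps() {
    echo $(( $1 * $2 * $3 / 8 ))
}

emmc_peak_mbps 26 8 1    # Default Speed
emmc_peak_mbps 52 8 1    # High Speed SDR
emmc_peak_mbps 52 8 2    # High Speed DDR
emmc_peak_mbps 200 8 1   # HS200
emmc_peak_mbps 200 8 2   # HS400
```

Under that assumption HS200 tops out at 200 MB/s on the bus, an upper bound that real read and write speeds stay well below.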

Check the mounts with df; the / directory is used for the read/write tests.

root@T3/A40i-Tronlong:~# df

Filesystem 1K-blocks Used Available Use% Mounted on

/dev/root 2029971 514680 1406338 27% /

devtmpfs 107996 0 107996 0% /dev

tmpfs 124604 0 124604 0% /dev/shm

tmpfs 124604 8 124596 0% /tmp

tmpfs 124604 12 124592 0% /run

cgroup 124604 0 124604 0% /sys/fs/cgroup

root@T3/A40i-Tronlong:~#

Without an SD card inserted, / is mounted on the eMMC.

bs/count (1GB total)   Command                                                 Result
16k/65536              time dd if=test.bin of=/dev/null bs=16k count=65536     98.5MB/s
4k/262144              -                                                       -
1k/1048576             -                                                       -
16k/65536              time dd if=/dev/zero of=/test.bin bs=16k count=65536    27.24MB/s
4k/262144              -                                                       -
1k/1048576             -                                                       -

root@T3/A40i-Tronlong:/# time dd if=/dev/zero of=/test.bin bs=16k count=65536

65536+0 records in

65536+0 records out

real 0m37.581s

user 0m0.080s

sys 0m15.230s

root@T3/A40i-Tronlong:/# time dd if=test.bin of=/dev/null bs=16k count=65536

65536+0 records in

65536+0 records out

real 0m10.386s

user 0m0.070s

sys 0m4.040s

root@T3/A40i-Tronlong:/#

These figures are for reference only; the effect of caching was not taken into account.

SD card performance test
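The rates in the table follow from the real time reported by time: 65536 blocks of 16k are 1GiB, divided by the elapsed seconds. A minimal sketch of that calculation, with an illustrative function name:

```shell
# Throughput in MiB/s from a dd run.
# $1 = block size in bytes, $2 = block count, $3 = elapsed seconds
dd_rate_mib() {
    awk -v bs="$1" -v n="$2" -v t="$3" \
        'BEGIN { printf "%.2f\n", bs * n / 1048576 / t }'
}

dd_rate_mib 16384 65536 37.581   # eMMC write, real 0m37.581s
dd_rate_mib 16384 65536 10.386   # eMMC read, real 0m10.386s
```

This gives about 27.25 and 98.59, matching the table up to rounding. Strictly these are MiB/s (2^20 bytes per second) rather than MB/s.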

After inserting the SD card and rebooting, the SD card is automatically mounted at /root.

bs/count (1GB total)   Command                                                      Result
16k/65536              time dd if=/root/test.bin of=/dev/null bs=16k count=65536    21.25MB/s
4k/262144              -                                                            -
1k/1048576             -                                                            -
16k/65536              time dd if=/dev/zero of=/root/test.bin bs=16k count=65536    11MB/s
4k/262144              -                                                            -
1k/1048576             -                                                            -

root@T3/A40i-Tronlong:~# time dd if=/dev/zero of=/root/test.bin bs=16k count=65536

65536+0 records in

65536+0 records out

real 1m32.412s

user 0m0.330s

sys 0m17.700s

root@T3/A40i-Tronlong:~# time dd if=/root/test.bin of=/dev/null bs=16k count=65536

65536+0 records in

65536+0 records out

real 0m48.177s

user 0m0.100s

sys 0m4.350s

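The speed depends on the SD card itself, and caching was not taken into account, so these results are for reference only.

Summary

The tests above give an overall picture of performance, which seems quite good. Every result is for reference only: numbers will differ with the environment and other factors, and the storage test method in particular is not very rigorous, since it ignores caching. These tests are only a qualitative performance check. A board's performance is a combined experience that is only meaningful against a real application scenario, and optimizing for that scenario matters as well.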
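One common way to reduce the influence of the page cache is to make the write include the final flush and to drop cached pages before the read. The sketch below shows that approach; it was not used for the numbers in this article. It assumes the board's dd supports conv=fsync (GNU dd does; a BusyBox build may not) and that the shell runs as root so /proc/sys/vm/drop_caches is writable. The function names are illustrative.

```shell
# Timed write that includes flushing data to the device (conv=fsync).
# $1 = target file, $2 = block size, $3 = block count
# Prints elapsed whole seconds.
write_test() {
    start=$(date +%s)
    dd if=/dev/zero of="$1" bs="$2" count="$3" conv=fsync 2>/dev/null
    end=$(date +%s)
    echo $(( end - start ))
}

# Timed read after asking the kernel to drop its page cache.
# Same arguments; dropping caches is skipped if not permitted.
read_test() {
    sync
    if [ -w /proc/sys/vm/drop_caches ]; then
        { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
    fi
    start=$(date +%s)
    dd if="$1" of=/dev/null bs="$2" count="$3" 2>/dev/null
    end=$(date +%s)
    echo $(( end - start ))
}
```

For example, write_test /root/test.bin 16k 65536 followed by read_test /root/test.bin 16k 65536 would repeat the 1GB tests above with the cache effect reduced.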
