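Xilinx FPGAs support online (in-field) upgrade, which can be implemented through ICAP commands. ICAP (Internal Configuration Access Port) lets user logic read and write the FPGA's internal configuration registers directly from the fabric, much like SelectMAP, and so implement specific configuration functions such as MultiBoot. An FPGA can issue IPROG in two ways: through ICAP, or by embedding the relevant commands in the bit file. Compared with the bit-file approach, ICAP is more flexible. Upgrading a Xilinx FPGA is in practice a MultiBoot operation. As the figure below shows, the Golden Image (bootloader) is stored at the base address, and the MultiBoot Image is stored at a higher address. This article gives a brief introduction to MultiBoot on the Xilinx 7 series.
During startup, the device first loads the MultiBoot Image and then checks whether configuration succeeded; this step is generally decided by external circuitry. If configuration succeeded, the FPGA runs the MultiBoot Image; if it failed, the device automatically falls back to the Golden Image.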
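To make the flash layout concrete, here is a minimal sketch of how the MultiBoot Image offset could be chosen so that it starts on an erase-sector boundary after the Golden Image. The function name, the 64 KiB sector size and the example image size are my own assumptions for illustration; take the real values from your flash datasheet and your generated bit file.

```python
def multiboot_offset(golden_bytes: int, sector_bytes: int = 0x10000) -> int:
    """Return the first erase-sector boundary at or after the end of the golden image.

    golden_bytes: size of the Golden Image in bytes (assumed to start at address 0).
    sector_bytes: flash erase-sector size in bytes (64 KiB is an assumption).
    """
    if golden_bytes <= 0 or sector_bytes <= 0:
        raise ValueError("sizes must be positive")
    sectors = -(-golden_bytes // sector_bytes)  # ceiling division
    return sectors * sector_bytes


if __name__ == "__main__":
    # Hypothetical golden image size, used only to exercise the helper.
    print(hex(multiboot_offset(11_443_612)))
```

Aligning the second image to a sector boundary means an update can erase and rewrite the MultiBoot Image without touching any sector that holds the Golden Image.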
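1. STARTUP primitive
An FPGA loses its configuration at power-down, so the design is normally stored in external flash with an SPI, BPI or QSPI interface, and the flash clock pin is usually wired to the FPGA's CCLK_0 pin in Bank 0. For remote update, user logic inside the FPGA drives the flash (the logic generates the flash timing), so it must drive the flash clock as well. That clock pin, however, is already tied to CCLK_0, so user logic has to reach it through the STARTUPE2 primitive on 7 series devices. The Spartan series uses the STARTUPE primitive, and the UltraScale series uses STARTUPE3. I am using an xc7k325 device, so: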
STARTUPE2 #(
.PROG_USR("FALSE"), // Activate program event security feature. Requires encrypted bitstreams.
.SIM_CCLK_FREQ(0.0) // Set the Configuration Clock Frequency(ns) for simulation
)
STARTUPE2_inst
(
.CFGCLK(), // 1-bit output: Configuration main clock output
.CFGMCLK(), // 1-bit output: Configuration internal oscillator clock output
.EOS(), // 1-bit output: Active high output signal indicating the End Of Startup.
.PREQ(), // 1-bit output: PROGRAM request to fabric output
.CLK(1'b0), // 1-bit input: User start-up clock input
.GSR(1'b0), // 1-bit input: Global Set/Reset input (GSR cannot be used for the port name)
.GTS(1'b0), // 1-bit input: Global 3-state input (GTS cannot be used for the port name)
.KEYCLEARB(1'b1), // 1-bit input: Clear AES Decrypter Key input from Battery-Backed RAM (BBRAM)
.PACK(1'b1), // 1-bit input: PROGRAM acknowledge input
.USRCCLKO(flash_clk), // 1-bit input: User CCLK input; connect the SPI flash clock here
.USRCCLKTS(1'b0), // 1-bit input: User CCLK 3-state enable input
.USRDONEO(1'b1), // 1-bit input: User DONE pin output control
.USRDONETS(1'b1) // 1-bit input: User DONE 3-state enable output
);
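Here flash_clk is the flash clock generated by the timing-control logic; simply connect it to USRCCLKO. Nothing else needs to change, and this pin must not be given a pin constraint (constraining it causes an error, as I found out the hard way). In fact, the Xilinx SPI controller IP already contains the STARTUP primitive, as the figure below shows, so for flash vendors that Xilinx supports, such as Micron, Winbond and Spansion, there is no need to instantiate the primitive again.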
SPI-controller
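2. ICAP primitive
The IPROG command sequence is the key step in making the FPGA reload itself. IPROG has an effect similar to a pulse on the PROGRAM_B pin, except that it does not reset the reconfiguration[4] logic. The ICAPE2 block inside the Kintex-7 can execute IPROG, which triggers the FPGA to reload a bitstream from SPI flash, starting at the address held in the Kintex-7 WBSTAR register. Issuing IPROG takes three steps:
- Send the sync word (AA995566);
- Write the next load address to the Kintex-7 WBSTAR register (00000000 in the table below);
- Send the IPROG command (0000000F).
The table below gives the order in which the IPROG command is sent to the reconfiguration module through ICAPE2.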
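As a cross-check on that order, the word sequence can be sketched as a small Python helper that a testbench or host tool could use to produce golden values. Only the sync word and the IPROG value come from this article. The dummy word, the NOOP words and the two Type 1 packet headers are my recollection of the IPROG example in the 7 series configuration user guide (UG470), so treat them as assumptions and verify them against the guide. The function name is hypothetical.

```python
from typing import List

DUMMY_WORD = 0xFFFFFFFF    # assumption: dummy word sent before sync (per UG470)
SYNC_WORD = 0xAA995566     # sync word, as listed in the steps above
NOOP = 0x20000000          # assumption: Type 1 NOOP packet
WRITE_WBSTAR = 0x30020001  # assumption: Type 1 header, write 1 word to WBSTAR
WRITE_CMD = 0x30008001     # assumption: Type 1 header, write 1 word to CMD
IPROG = 0x0000000F         # IPROG command value, as listed in the steps above


def iprog_sequence(wbstar: int) -> List[int]:
    """Return the 32-bit words to write to ICAPE2, before any bit swapping.

    wbstar is the raw value for the WBSTAR register; how the flash address
    maps into that value is left to the caller.
    """
    if not 0 <= wbstar <= 0xFFFFFFFF:
        raise ValueError("wbstar must fit in 32 bits")
    return [DUMMY_WORD, SYNC_WORD, NOOP, WRITE_WBSTAR, wbstar,
            WRITE_CMD, IPROG, NOOP]


if __name__ == "__main__":
    for word in iprog_sequence(0x00000000):
        print(f"{word:08X}")
```

The list is kept in logical (unswapped) form on purpose, so it can be compared directly with the values quoted in the configuration guide.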
The ICAPE2 instantiation used to issue these commands is shown below.
ICAPE2 #(
.DEVICE_ID(32'h3651093 ), // Specifies the pre-programmed Device ID value to be used for simulation
.ICAP_WIDTH ("X32" ), // Specifies the input and output data width.
.SIM_CFG_FILE_NAME ("C:\VivadoPrj\FPGAUartProgram\7Serial\7Serial_A7\7Serial_A7.runs\impl_1\OnlineProgram_top.bit" ) // Specifies the Raw Bitstream (RBT) file to be parsed by the simulation model.
)
ICAPE2_inst (
.O (ICAPE2_O ), // 32-bit output: Configuration data output bus
.CLK (ICAPE2_CLK ), // 1-bit input: Clock Input
.CSIB (ICAPE2_CSIB ), // 1-bit input: Active-Low ICAP Enable
.I (ICAPE2_I ), // 32-bit input: Configuration data input bus
.RDWRB (ICAPE2_RDWRB ) // 1-bit input: Read/Write Select input
);
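There are a few pitfalls here.
Pitfall 1: when the flash uses 24-bit addressing, the upper 24 bits of the warm-boot address must be set to the address of the image you want to switch to.
Pitfall 2: the data written to ICAPE2, including the WBSTAR value, must be bit-swapped within each byte.
The implementation is as follows: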
ICAPE2_I[0] <= icape2_data_r[7];
ICAPE2_I[1] <= icape2_data_r[6];
ICAPE2_I[2] <= icape2_data_r[5];
ICAPE2_I[3] <= icape2_data_r[4];
ICAPE2_I[4] <= icape2_data_r[3];
ICAPE2_I[5] <= icape2_data_r[2];
ICAPE2_I[6] <= icape2_data_r[1];
ICAPE2_I[7] <= icape2_data_r[0];
ICAPE2_I[8] <= icape2_data_r[15];
ICAPE2_I[9] <= icape2_data_r[14];
ICAPE2_I[10] <= icape2_data_r[13];
ICAPE2_I[11] <= icape2_data_r[12];
ICAPE2_I[12] <= icape2_data_r[11];
ICAPE2_I[13] <= icape2_data_r[10];
ICAPE2_I[14] <= icape2_data_r[9];
ICAPE2_I[15] <= icape2_data_r[8];
ICAPE2_I[16] <= icape2_data_r[23];
ICAPE2_I[17] <= icape2_data_r[22];
ICAPE2_I[18] <= icape2_data_r[21];
ICAPE2_I[19] <= icape2_data_r[20];
ICAPE2_I[20] <= icape2_data_r[19];
ICAPE2_I[21] <= icape2_data_r[18];
ICAPE2_I[22] <= icape2_data_r[17];
ICAPE2_I[23] <= icape2_data_r[16];
ICAPE2_I[24] <= icape2_data_r[31];
ICAPE2_I[25] <= icape2_data_r[30];
ICAPE2_I[26] <= icape2_data_r[29];
ICAPE2_I[27] <= icape2_data_r[28];
ICAPE2_I[28] <= icape2_data_r[27];
ICAPE2_I[29] <= icape2_data_r[26];
ICAPE2_I[30] <= icape2_data_r[25];
ICAPE2_I[31] <= icape2_data_r[24];
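The same bit swap is easy to model in software, which is handy for generating expected values for a simulation. The sketch below is a Python model of the assignments above, not part of the FPGA design; the function name is hypothetical.

```python
def bit_swap_bytes(word: int) -> int:
    """Reverse the bit order inside each of the four bytes of a 32-bit word.

    Mirrors the Verilog mapping: output bit 0 takes input bit 7,
    output bit 8 takes input bit 15, and so on. Byte order is unchanged.
    """
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("word must fit in 32 bits")
    out = 0
    for i in range(32):
        if (word >> i) & 1:
            byte_base = i & ~7      # first bit index of the byte holding bit i
            mirrored = 7 - (i & 7)  # mirrored position inside that byte
            out |= 1 << (byte_base | mirrored)
    return out


if __name__ == "__main__":
    # Sync word and IPROG command as they would appear on ICAPE2_I.
    print(f"{bit_swap_bytes(0xAA995566):08X}")
    print(f"{bit_swap_bytes(0x0000000F):08X}")
```

Because the swap is its own inverse, applying the function twice returns the original word, which makes a quick sanity check for any hand-written table of swapped values.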
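Xilinx also provides an ICAP IP core for Block Design, so a design can instantiate that IP directly to implement the image switch.
Pitfall 3: the external SPI interface needs timing constraints.
The constraints are as follows: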
set cclk_delay 6.7
# Following are the SPI device parameters
# Max Tco
set tco_max 7
# Min Tco
set tco_min 1
# Setup time requirement
set tsu 2
# Hold time requirement
set th 3
# Following are the board/trace delay numbers
# Assumption is that all Data lines are matched
set tdata_trace_delay_max 0.25
set tdata_trace_delay_min 0.25
set tclk_trace_delay_max 0.2
set tclk_trace_delay_min 0.2
### End of user provided delay numbers
# This is to ensure min routing delay from SCK generation to STARTUP input
# User should change this value based on the results
# Having more delay on this net reduces the Fmax
# Following constraint should be commented when the STARTUP block is disabled
set_max_delay 1.5 -from [get_pins -hier *SCK_O_reg_reg/C] -to [get_pins -hier *USRCCLKO] -datapath_only
set_min_delay 0.1 -from [get_pins -hier *SCK_O_reg_reg/C] -to [get_pins -hier *USRCCLKO]
# Following command creates a divide by 2 clock
# It also takes into account the delay added by the STARTUP block to route the CCLK
# This constraint is not needed when the STARTUP block is disabled
# The following constraint should be commented when the STARTUP block is disabled
create_generated_clock -name clk_sck -source [get_pins -hierarchical *axi_quad_spi_1/ext_spi_clk] [get_pins -hierarchical *USRCCLKO] -edges {3 5 7} -edge_shift [list $cclk_delay $cclk_delay $cclk_delay]
# Enable the following constraint when STARTUP block is disabled
#create_generated_clock -name clk_virt -source [get_pins -hierarchical *axi_quad_spi_1/ext_spi_clk] [get_ports <SCK_IO>] -edges {3 5 7}
# Data is captured into FPGA on the second rising edge of ext_spi_clk after the SCK falling edge
# Data is driven by the FPGA on every alternate rising_edge of ext_spi_clk
set_input_delay -clock clk_sck -max [expr $tco_max + $tdata_trace_delay_max + $tclk_trace_delay_max] [get_ports IO*_IO] -clock_fall;
set_input_delay -clock clk_sck -min [expr $tco_min + $tdata_trace_delay_min + $tclk_trace_delay_min] [get_ports IO*_IO] -clock_fall;
set_multicycle_path 2 -setup -from clk_sck -to [get_clocks -of_objects [get_pins -hierarchical */ext_spi_clk]]
set_multicycle_path 1 -hold -end -from clk_sck -to [get_clocks -of_objects [get_pins -hierarchical */ext_spi_clk]]
# Data is captured into SPI on the following rising edge of SCK
# Data is driven by the IP on alternate rising_edge of the ext_spi_clk
set_output_delay -clock clk_sck -max [expr $tsu + $tdata_trace_delay_max - $tclk_trace_delay_min] [get_ports IO*_IO];
set_output_delay -clock clk_sck -min [expr $tdata_trace_delay_min - $th - $tclk_trace_delay_max] [get_ports IO*_IO];
set_multicycle_path 2 -setup -start -from [get_clocks -of_objects [get_pins -hierarchical */ext_spi_clk]] -to clk_sck
set_multicycle_path 1 -hold -from [get_clocks -of_objects [get_pins -hierarchical */ext_spi_clk]] -to clk_sck
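It helps to work out the four I/O delay values by hand before trusting the timing report. The sketch below repeats the arithmetic in Python, assuming the operands combine as in the Xilinx constraint template: input delay is Tco plus the data and clock trace delays, maximum output delay is Tsu plus the data trace delay minus the clock trace delay, and minimum output delay is the data trace delay minus Th minus the clock trace delay. The function name is hypothetical, and the numbers are the example values from the constraints above, not datasheet figures for any particular flash.

```python
def spi_io_delays(tco_max, tco_min, tsu, th,
                  tdata_max, tdata_min, tclk_max, tclk_min):
    """Return the set_input_delay / set_output_delay values in ns."""
    return {
        "input_max": tco_max + tdata_max + tclk_max,
        "input_min": tco_min + tdata_min + tclk_min,
        "output_max": tsu + tdata_max - tclk_min,
        "output_min": tdata_min - th - tclk_max,
    }


if __name__ == "__main__":
    delays = spi_io_delays(tco_max=7, tco_min=1, tsu=2, th=3,
                           tdata_max=0.25, tdata_min=0.25,
                           tclk_max=0.2, tclk_min=0.2)
    for name, value in delays.items():
        print(f"{name}: {value:.2f} ns")
```

A negative minimum output delay is expected here: it simply reflects the flash hold requirement being larger than the trace delay difference.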