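In an FPGA design, the main job of the AXI DMA IP core is to move large blocks of data between the Verilog (hardware logic) side and a C program, using the AXI4-Stream protocol.
Many Xilinx IP cores are built on AXI4-Stream, for example the Floating-point IP core and the Tri Mode Ethernet MAC IP core. To bring data from the Verilog side into a C program for processing, you need the DMA IP core.
This article uses the Floating-point IP core, configured to convert fixed-point numbers to floating point, as a worked example of how to use the AXI DMA IP core.
The Floating-point IP core's input and output are both 32 bits wide, and both use AXI4-Stream. The C program first sends the fixed-point values to the Floating-point IP core through the DMA; once the core has converted them, the single-precision results come back to the C program through the DMA and are printed with printf.
The fixed-point values have type int, with the binary point placed at bit 4, i.e. XXXXXXX.X when written as hexadecimal digits: 28 integer bits and 4 fractional bits. A raw value of 314, for instance, represents 314 / 16 = 19.625.
The converted values have type float and can be printed directly with printf's %f.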
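To make the fixed-point format concrete, here is a minimal software reference for the same conversion, useful for checking the values that come back from the hardware. It is a sketch that runs on any host compiler; the names FRAC_BITS, q28_4_to_float and print_expected are made up for this illustration and are not part of the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Number of fractional bits in the fixed-point format described above. */
#define FRAC_BITS 4

/* Hypothetical helper: software model of the fixed-to-float conversion
 * (signed 32-bit input, 4 fractional bits). */
static float q28_4_to_float(int32_t raw)
{
    /* Dividing by 2^FRAC_BITS moves the binary point 4 places to the left. */
    return (float)raw / (float)(1 << FRAC_BITS);
}

/* Hypothetical helper: print the value expected for each raw input,
 * so the list can be compared with what the DMA returns. */
static void print_expected(const int32_t *raw, size_t count)
{
    assert(raw != NULL);
    for (size_t i = 0; i < count; i++)
        printf("expected[%u]=%f\n", (unsigned)i, q28_4_to_float(raw[i]));
}
```

For the raw value 314 this gives 314 / 16 = 19.625, and for -628 it gives -39.25.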
Project download: https://pan.baidu.com/s/1SXppHMdhroFT8vGCIysYTQ (extraction code: u7wf)
How to set up the MicroBlaze C project is not repeated here; see: https://blog.csdn.net/ZLK1214/article/details/111824576
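First, add the Floating-point IP core, which sits on the peripheral (stream) side of the DMA; the memory side is BRAM:
Note that TLAST must be enabled here; otherwise the DMA receive side reports a DMA Internal Error:
The Xilinx DMA manual describes the DMA Internal Error condition as the incoming packet being bigger than what is specified in the DMA length register: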
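Why a missing TLAST ends in an error can be illustrated with a toy model of the receive side. This is only a sketch of the behaviour described in this article, not of the real hardware state machine; the types and the function model_s2mm_receive are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 32-bit AXI4-Stream beat: the data word plus its TLAST flag. */
typedef struct {
    uint32_t tdata;
    int tlast;
} StreamBeat;

typedef enum {
    RX_OK,             /* TLAST arrived before the buffer overflowed */
    RX_INTERNAL_ERROR, /* buffer filled up and the packet still had not ended */
    RX_STILL_WAITING   /* stream stopped before TLAST and before the buffer was full */
} RxStatus;

/* Hypothetical model: accept beats until TLAST or until length_bytes is used up. */
static RxStatus model_s2mm_receive(const StreamBeat *beats, size_t n_beats,
                                   size_t length_bytes, size_t *bytes_received)
{
    assert(beats != NULL && bytes_received != NULL);
    size_t capacity = length_bytes / sizeof(uint32_t);
    size_t stored = 0;

    *bytes_received = 0;
    for (size_t i = 0; i < n_beats && stored < capacity; i++) {
        stored++;
        *bytes_received = stored * sizeof(uint32_t);
        if (beats[i].tlast)
            return RX_OK;             /* the packet ended inside the buffer */
        if (stored == capacity)
            return RX_INTERNAL_ERROR; /* buffer full, packet not finished */
    }
    return RX_STILL_WAITING;
}
```

In this model, 40 beats without TLAST into a 160-byte buffer end in RX_INTERNAL_ERROR, while the same 40 beats with TLAST set on the last one end in RX_OK with 160 bytes received.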
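Add the AXI DMA IP core:
The IP cores have been added but are not yet wired up:
Click Run Connection Automation to connect the DMA's S_AXI_LITE interface automatically:
Automatically connect the Floating-point IP core's clock pin:
Add a BRAM controller:
The final wiring:
Change the capacity of the newly created BRAM to 64 KB:
The final address map:
Save the Block Design, then generate the bitstream:
Once the bitstream has been generated, export the xsa file:
Re-import the xsa file into the Vitis Platform project:
Modify the C program (helloworld.c) as follows:
(It is best to replace XPAR_BRAM_2_BASEADDR with 0xc0000000 here, because the BRAM index in the generated xparameters.h may change.)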
/******************************************************************************
*
* Copyright (C) 2009 - 2014 Xilinx, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* Use of the Software is limited solely to applications:
* (a) running on a Xilinx device, or
* (b) that interact with a Xilinx device through a bus or interconnect.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* XILINX BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
* OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Except as contained in this notice, the name of the Xilinx shall not be used
* in advertising or otherwise to promote the sale, use or other dealings in
* this Software without prior written authorization from Xilinx.
*
******************************************************************************/
/*
* helloworld.c: simple test application
*
* This application configures UART 16550 to baud rate 9600.
* PS7 UART (Zynq) is not initialized by this application, since
* bootrom/bsp configures it to baud rate 115200
*
* ------------------------------------------------
* | UART TYPE BAUD RATE |
* ------------------------------------------------
* uartns550 9600
* uartlite Configurable only in HW design
* ps7_uart 115200 (configured by bootrom/bsp)
*/
#include <stdio.h>
#include <xaxidma.h>
#include <xparameters.h>
#include "platform.h"

// The DMA cannot reach MicroBlaze's own local BRAM through the AXI Interconnect;
// it can only access memory that is attached to the AXI Interconnect.
#define _countof(arr) (sizeof(arr) / sizeof(*(arr)))

typedef struct
{
    int numbers_in[40];
    float numbers_out[40];
} BRAM2_Data;

static BRAM2_Data *bram2_data = (BRAM2_Data *)XPAR_BRAM_2_BASEADDR;
static XAxiDma xaxidma;

int main(void)
{
    int i, ret = 0;
    XAxiDma_Config *xaxidma_cfg;

    init_platform();
    printf("Hello World\n");
    printf("Successfully ran Hello World application\n");

    // Initialize the DMA
    xaxidma_cfg = XAxiDma_LookupConfig(XPAR_AXIDMA_0_DEVICE_ID);
    XAxiDma_CfgInitialize(&xaxidma, xaxidma_cfg);
    ret = XAxiDma_Selftest(&xaxidma);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_Selftest() failed! ret=%d\n", ret);
        goto err;
    }

    // Fill in the DMA input data (alternating signs)
    printf("numbers_in=%p, numbers_out=%p\n", (void *)bram2_data->numbers_in, (void *)bram2_data->numbers_out);
    for (i = 0; i < _countof(bram2_data->numbers_in); i++)
    {
        bram2_data->numbers_in[i] = 314 * (i + 1);
        if (i & 1)
            bram2_data->numbers_in[i] = -bram2_data->numbers_in[i];
    }

    // Start the DMA transmit (the Length argument is in bytes)
    ret = XAxiDma_SimpleTransfer(&xaxidma, (uintptr_t)bram2_data->numbers_in, sizeof(bram2_data->numbers_in), XAXIDMA_DMA_TO_DEVICE);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_SimpleTransfer(XAXIDMA_DMA_TO_DEVICE) failed! ret=%d\n", ret);
        goto err;
    }

    // Start the DMA receive
    ret = XAxiDma_SimpleTransfer(&xaxidma, (uintptr_t)bram2_data->numbers_out, sizeof(bram2_data->numbers_out), XAXIDMA_DEVICE_TO_DMA);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_SimpleTransfer(XAXIDMA_DEVICE_TO_DMA) failed! ret=%d\n", ret);
        goto err;
    }

    // Wait for the DMA transmit to finish
    i = 0;
    while (XAxiDma_Busy(&xaxidma, XAXIDMA_DMA_TO_DEVICE))
    {
        i++;
        if (i == 200000)
        {
            // The memory accessed by the DMA must be attached directly to the AXI Interconnect;
            // otherwise a DMA Decode Error is reported here (the address request points to an invalid address)
            printf("DMA Tx timeout! DMASR=0x%08lx\n", XAxiDma_ReadReg(xaxidma.RegBase + XAXIDMA_TX_OFFSET, XAXIDMA_SR_OFFSET));
            goto err;
        }
    }
    printf("DMA Tx complete!\n");

    // Wait for the DMA receive to finish
    i = 0;
    while (XAxiDma_Busy(&xaxidma, XAXIDMA_DEVICE_TO_DMA))
    {
        i++;
        if (i == 200000)
        {
            // In the Floating-point IP configuration, the TLAST checkbox of channel A must be ticked
            // so that both the input and the output carry a tlast signal.
            // Otherwise s_axis_s2mm_tlast stays 0, the DMA assumes the data has not finished arriving,
            // and it reports a DMA Internal Error
            // (the incoming packet is bigger than what is specified in the DMA length register)
            printf("DMA Rx timeout! DMASR=0x%08lx\n", XAxiDma_ReadReg(xaxidma.RegBase + XAXIDMA_RX_OFFSET, XAXIDMA_SR_OFFSET));
            goto err;
        }
    }
    printf("DMA Rx complete!\n");

err:
    // Print the output buffer (also reached after an error, to show what arrived)
    for (i = 0; i < _countof(bram2_data->numbers_out); i++)
        printf("numbers_out[%d]=%f\n", i, bram2_data->numbers_out[i]);
    cleanup_platform();
    return 0;
}
Output of the C program:
Hello World
Successfully ran Hello World application
numbers_in=0xc0000000, numbers_out=0xc00000a0
DMA Tx complete!
DMA Rx complete!
numbers_out[0]=19.625000
numbers_out[1]=-39.250000
numbers_out[2]=58.875000
numbers_out[3]=-78.500000
numbers_out[4]=98.125000
numbers_out[5]=-117.750000
numbers_out[6]=137.375000
numbers_out[7]=-157.000000
numbers_out[8]=176.625000
numbers_out[9]=-196.250000
numbers_out[10]=215.875000
numbers_out[11]=-235.500000
numbers_out[12]=255.125000
numbers_out[13]=-274.750000
numbers_out[14]=294.375000
numbers_out[15]=-314.000000
numbers_out[16]=333.625000
numbers_out[17]=-353.250000
numbers_out[18]=372.875000
numbers_out[19]=-392.500000
numbers_out[20]=412.125000
numbers_out[21]=-431.750000
numbers_out[22]=451.375000
numbers_out[23]=-471.000000
numbers_out[24]=490.625000
numbers_out[25]=-510.250000
numbers_out[26]=529.875000
numbers_out[27]=-549.500000
numbers_out[28]=569.125000
numbers_out[29]=-588.750000
numbers_out[30]=608.375000
numbers_out[31]=-628.000000
numbers_out[32]=647.625000
numbers_out[33]=-667.250000
numbers_out[34]=686.875000
numbers_out[35]=-706.500000
numbers_out[36]=726.125000
numbers_out[37]=-745.750000
numbers_out[38]=765.375000
numbers_out[39]=-785.000000
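When one of the timeout branches fires, the program prints the raw DMASR value. A small decoder along the following lines makes that value easier to read. This is a host-side sketch: the SR_* constants are local copies of the status bit positions as I understand them from the AXI DMA register map, and both helper functions are hypothetical, so check the bit values against the product guide or xaxidma_hw.h before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed DMASR bit positions (verify against the AXI DMA documentation). */
#define SR_HALTED  0x00000001u
#define SR_IDLE    0x00000002u
#define SR_INT_ERR 0x00000010u /* DMA Internal Error */
#define SR_SLV_ERR 0x00000020u /* DMA Slave Error */
#define SR_DEC_ERR 0x00000040u /* DMA Decode Error */

/* Hypothetical helper: name the first DMA error flag set in a DMASR value. */
static const char *describe_dmasr_error(uint32_t sr)
{
    if (sr & SR_INT_ERR)
        return "DMA Internal Error";
    if (sr & SR_SLV_ERR)
        return "DMA Slave Error";
    if (sr & SR_DEC_ERR)
        return "DMA Decode Error";
    return "no error flagged";
}

/* Hypothetical helper: format a one-line summary of a DMASR value. */
static void format_dmasr(uint32_t sr, char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);
    snprintf(out, out_len, "DMASR=0x%08lx halted=%d idle=%d: %s",
             (unsigned long)sr, (sr & SR_HALTED) != 0, (sr & SR_IDLE) != 0,
             describe_dmasr_error(sr));
}
```

On the target, the value passed in would be the one read with XAxiDma_ReadReg in the timeout branches of the program above.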
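Next, let's look at the Scatter Gather interface that we disabled earlier. Once it is re-enabled, the previous C code no longer runs.
Without Scatter Gather, only one DMA request can be submitted at a time; the next DMA request can be submitted only after the data transfer for the current one has finished.
With Scatter Gather, many DMA requests can be queued in one go, leaving the CPU free to do other work. This can improve transfer efficiency considerably.
In addition, Scatter Gather can combine several buffers located at different memory addresses into a single AXI4-Stream packet.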
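Combining buffers into one packet comes down to how the start-of-frame and end-of-frame flags are spread over the descriptors: the first descriptor of a packet carries SOF, the last carries EOF, and any in between carry neither. The sketch below models this on the host with a simplified descriptor struct. ModelBd and chain_as_one_packet are invented for the illustration, and the two flag values are local copies of what I believe XAXIDMA_BD_CTRL_TXSOF_MASK and XAXIDMA_BD_CTRL_TXEOF_MASK are in the driver, which should be verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values of the driver's TXSOF / TXEOF control masks. */
#define CTRL_TXSOF 0x08000000u
#define CTRL_TXEOF 0x04000000u

/* Simplified stand-in for a buffer descriptor (not the real XAxiDma_Bd layout). */
typedef struct {
    uintptr_t buf_addr; /* where this piece of the packet lives */
    uint32_t length;    /* size of this piece in bytes */
    uint32_t ctrl;      /* SOF / EOF flags */
} ModelBd;

/* Mark a chain of descriptors as one packet and return its total length in bytes. */
static uint32_t chain_as_one_packet(ModelBd *bds, size_t count)
{
    assert(bds != NULL && count > 0);
    uint32_t total = 0;

    for (size_t i = 0; i < count; i++) {
        bds[i].ctrl = 0;
        if (i == 0)
            bds[i].ctrl |= CTRL_TXSOF; /* the packet starts in the first buffer */
        if (i == count - 1)
            bds[i].ctrl |= CTRL_TXEOF; /* the packet ends in the last buffer */
        total += bds[i].length;
    }
    return total;
}
```

Setting both flags on every descriptor, as the program in this article does, produces one packet per buffer instead.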
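The example below shows how to use Scatter Gather to send and receive three packets in one batch.
With Scatter Gather enabled, the DMA gains an extra M_AXI_SG interface. Click Run Connection Automation to connect it to the AXI Interconnect:
In Vivado, run Generate Bitstream and then export the xsa file. Back in Vitis, the Platform project must be deleted and recreated; otherwise the value of XPAR_AXI_DMA_0_INCLUDE_SG is not updated.
The original C program no longer works, so change the code as follows: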
#include <stdio.h>
#include <xaxidma.h>
#include <xparameters.h>
#include "platform.h"

/* Official Xilinx example: C:\Xilinx\Vitis\2020.1\data\embeddedsw\XilinxProcessorIPLib\drivers\axidma_v9_11\examples\xaxidma_example_sg_poll.c */

// The DMA cannot reach MicroBlaze's own local BRAM through the AXI Interconnect;
// it can only access memory that is attached to the AXI Interconnect.
#define _countof(arr) (sizeof(arr) / sizeof(*(arr)))

typedef struct
{
    int numbers_in[40];
    float numbers_out[40];
} BRAM2_Data;

typedef struct
{
    uint8_t txbuf[640];
    uint8_t rxbuf[640];
} BRAM2_BdRingBuffer;

static BRAM2_Data *bram2_data = (BRAM2_Data *)0xc0000000;
static BRAM2_BdRingBuffer *bram2_bdringbuf = (BRAM2_BdRingBuffer *)0xc0008000;
static XAxiDma xaxidma;

int main(void)
{
    int i, n, ret = 0;
    XAxiDma_Bd *bd, *p;
    XAxiDma_BdRing *txring, *rxring;
    XAxiDma_Config *cfg;

    init_platform();
    printf("Hello World\n");
    printf("Successfully ran Hello World application\n");

    // Initialize the DMA
    cfg = XAxiDma_LookupConfig(XPAR_AXIDMA_0_DEVICE_ID);
    XAxiDma_CfgInitialize(&xaxidma, cfg);
    ret = XAxiDma_Selftest(&xaxidma);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_Selftest() failed! ret=%d\n", ret);
        goto err;
    }
    if (!XAxiDma_HasSg(&xaxidma))
    {
        printf("XPAR_AXI_DMA_0_INCLUDE_SG=%d\n", XPAR_AXI_DMA_0_INCLUDE_SG);
        printf("Please recreate and build Vitis platform project!\n");
        goto err;
    }

    // Fill in the DMA input data for the three packets
    printf("[0] numbers_in=%p, numbers_out=%p\n", (void *)bram2_data[0].numbers_in, (void *)bram2_data[0].numbers_out);
    printf("[1] numbers_in=%p, numbers_out=%p\n", (void *)bram2_data[1].numbers_in, (void *)bram2_data[1].numbers_out);
    printf("[2] numbers_in=%p, numbers_out=%p\n", (void *)bram2_data[2].numbers_in, (void *)bram2_data[2].numbers_out);
    for (i = 0; i < _countof(bram2_data[0].numbers_in); i++)
    {
        bram2_data[0].numbers_in[i] = 314 * (i + 1);
        bram2_data[1].numbers_in[i] = -141 * (i + 1);
        bram2_data[2].numbers_in[i] = -2718 * (i + 1);
        if (i & 1)
        {
            bram2_data[0].numbers_in[i] = -bram2_data[0].numbers_in[i];
            bram2_data[1].numbers_in[i] = -bram2_data[1].numbers_in[i];
            bram2_data[2].numbers_in[i] = -bram2_data[2].numbers_in[i];
        }
    }

    // Set up the DMA transmit descriptors
    txring = XAxiDma_GetTxRing(&xaxidma);
    n = XAxiDma_BdRingCntCalc(XAXIDMA_BD_MINIMUM_ALIGNMENT, sizeof(bram2_bdringbuf->txbuf));
    ret = XAxiDma_BdRingCreate(txring, (uintptr_t)bram2_bdringbuf->txbuf, (uintptr_t)bram2_bdringbuf->txbuf, XAXIDMA_BD_MINIMUM_ALIGNMENT, n);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_BdRingCreate(txring) failed! ret=%d\n", ret);
        goto err;
    }
    printf("BdRing Tx count: %d\n", n);
    ret = XAxiDma_BdRingAlloc(txring, 3, &bd);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_BdRingAlloc(txring) failed! ret=%d\n", ret);
        goto err;
    }
    p = bd;
    for (i = 0; i < 3; i++)
    {
        // One descriptor per packet, so each one carries both SOF and EOF
        XAxiDma_BdSetBufAddr(p, (uintptr_t)bram2_data[i].numbers_in);
        XAxiDma_BdSetLength(p, sizeof(bram2_data[i].numbers_in), txring->MaxTransferLen);
        XAxiDma_BdSetCtrl(p, XAXIDMA_BD_CTRL_TXSOF_MASK | XAXIDMA_BD_CTRL_TXEOF_MASK);
        XAxiDma_BdSetId(p, i);
        p = (XAxiDma_Bd *)XAxiDma_BdRingNext(txring, p);
    }
    ret = XAxiDma_BdRingToHw(txring, 3, bd);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_BdRingToHw(txring) failed! ret=%d\n", ret);
        goto err;
    }

    // Set up the DMA receive descriptors
    rxring = XAxiDma_GetRxRing(&xaxidma);
    n = XAxiDma_BdRingCntCalc(XAXIDMA_BD_MINIMUM_ALIGNMENT, sizeof(bram2_bdringbuf->rxbuf));
    ret = XAxiDma_BdRingCreate(rxring, (uintptr_t)bram2_bdringbuf->rxbuf, (uintptr_t)bram2_bdringbuf->rxbuf, XAXIDMA_BD_MINIMUM_ALIGNMENT, n);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_BdRingCreate(rxring) failed! ret=%d\n", ret);
        goto err;
    }
    printf("BdRing Rx count: %d\n", n);
    ret = XAxiDma_BdRingAlloc(rxring, 3, &bd);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_BdRingAlloc(rxring) failed! ret=%d\n", ret);
        goto err;
    }
    p = bd;
    for (i = 0; i < 3; i++)
    {
        XAxiDma_BdSetBufAddr(p, (uintptr_t)bram2_data[i].numbers_out);
        XAxiDma_BdSetLength(p, sizeof(bram2_data[i].numbers_out), rxring->MaxTransferLen);
        XAxiDma_BdSetCtrl(p, 0);
        XAxiDma_BdSetId(p, i);
        p = (XAxiDma_Bd *)XAxiDma_BdRingNext(rxring, p);
    }
    ret = XAxiDma_BdRingToHw(rxring, 3, bd);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_BdRingToHw(rxring) failed! ret=%d\n", ret);
        goto err;
    }

    // Start transmitting
    ret = XAxiDma_BdRingStart(txring);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_BdRingStart(txring) failed! ret=%d\n", ret);
        goto err;
    }

    // Start receiving
    ret = XAxiDma_BdRingStart(rxring);
    if (ret != XST_SUCCESS)
    {
        printf("XAxiDma_BdRingStart(rxring) failed! ret=%d\n", ret);
        goto err;
    }

    // Wait until all 6 descriptors (3 Tx + 3 Rx) have completed
    n = 0;
    while (n < 6)
    {
        // Check whether any transmit descriptors have finished
        ret = XAxiDma_BdRingFromHw(txring, XAXIDMA_ALL_BDS, &bd);
        if (ret != 0)
        {
            n += ret;
            p = bd;
            for (i = 0; i < ret; i++)
            {
                printf("DMA Tx%lu Complete!\n", XAxiDma_BdGetId(p));
                p = (XAxiDma_Bd *)XAxiDma_BdRingNext(txring, p);
            }
            ret = XAxiDma_BdRingFree(txring, ret, bd);
            if (ret != XST_SUCCESS)
                printf("XAxiDma_BdRingFree(txring) failed! ret=%d\n", ret);
        }

        // Check whether any receive descriptors have finished
        ret = XAxiDma_BdRingFromHw(rxring, XAXIDMA_ALL_BDS, &bd);
        if (ret != 0)
        {
            n += ret;
            p = bd;
            for (i = 0; i < ret; i++)
            {
                printf("DMA Rx%lu Complete!\n", XAxiDma_BdGetId(p));
                p = (XAxiDma_Bd *)XAxiDma_BdRingNext(rxring, p);
            }
            ret = XAxiDma_BdRingFree(rxring, ret, bd);
            if (ret != XST_SUCCESS)
                printf("XAxiDma_BdRingFree(rxring) failed! ret=%d\n", ret);
        }
    }

err:
    // Print the three output buffers side by side (also reached after an error)
    for (i = 0; i < _countof(bram2_data[0].numbers_out); i++)
        printf("numbers_out[%d]=%f,%f,%f\n", i, bram2_data[0].numbers_out[i], bram2_data[1].numbers_out[i], bram2_data[2].numbers_out[i]);
    cleanup_platform();
    return 0;
}
Program output:
Hello World
Successfully ran Hello World application
[0] numbers_in=0xc0000000, numbers_out=0xc00000a0
[1] numbers_in=0xc0000140, numbers_out=0xc00001e0
[2] numbers_in=0xc0000280, numbers_out=0xc0000320
BdRing Tx count: 10
BdRing Rx count: 10
DMA Tx0 Complete!
DMA Tx1 Complete!
DMA Tx2 Complete!
DMA Rx0 Complete!
DMA Rx1 Complete!
DMA Rx2 Complete!
numbers_out[0]=19.625000,-8.812500,-169.875000
numbers_out[1]=-39.250000,17.625000,339.750000
numbers_out[2]=58.875000,-26.437500,-509.625000
numbers_out[3]=-78.500000,35.250000,679.500000
numbers_out[4]=98.125000,-44.062500,-849.375000
numbers_out[5]=-117.750000,52.875000,1019.250000
numbers_out[6]=137.375000,-61.687500,-1189.125000
numbers_out[7]=-157.000000,70.500000,1359.000000
numbers_out[8]=176.625000,-79.312500,-1528.875000
numbers_out[9]=-196.250000,88.125000,1698.750000
numbers_out[10]=215.875000,-96.937500,-1868.625000
numbers_out[11]=-235.500000,105.750000,2038.500000
numbers_out[12]=255.125000,-114.562500,-2208.375000
numbers_out[13]=-274.750000,123.375000,2378.250000
numbers_out[14]=294.375000,-132.187500,-2548.125000
numbers_out[15]=-314.000000,141.000000,2718.000000
numbers_out[16]=333.625000,-149.812500,-2887.875000
numbers_out[17]=-353.250000,158.625000,3057.750000
numbers_out[18]=372.875000,-167.437500,-3227.625000
numbers_out[19]=-392.500000,176.250000,3397.500000
numbers_out[20]=412.125000,-185.062500,-3567.375000
numbers_out[21]=-431.750000,193.875000,3737.250000
numbers_out[22]=451.375000,-202.687500,-3907.125000
numbers_out[23]=-471.000000,211.500000,4077.000000
numbers_out[24]=490.625000,-220.312500,-4246.875000
numbers_out[25]=-510.250000,229.125000,4416.750000
numbers_out[26]=529.875000,-237.937500,-4586.625000
numbers_out[27]=-549.500000,246.750000,4756.500000
numbers_out[28]=569.125000,-255.562500,-4926.375000
numbers_out[29]=-588.750000,264.375000,5096.250000
numbers_out[30]=608.375000,-273.187500,-5266.125000
numbers_out[31]=-628.000000,282.000000,5436.000000
numbers_out[32]=647.625000,-290.812500,-5605.875000
numbers_out[33]=-667.250000,299.625000,5775.750000
numbers_out[34]=686.875000,-308.437500,-5945.625000
numbers_out[35]=-706.500000,317.250000,6115.500000
numbers_out[36]=726.125000,-326.062500,-6285.375000
numbers_out[37]=-745.750000,334.875000,6455.250000
numbers_out[38]=765.375000,-343.687500,-6625.125000
numbers_out[39]=-785.000000,352.500000,6795.000000
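The addresses and descriptor counts printed at the top of this output can be cross-checked with a little arithmetic. The sketch below does this on the host, assuming 4-byte int and float, a 64-byte descriptor (16 words) and the 64-byte minimum alignment; model_bd_count mirrors what XAxiDma_BdRingCntCalc is understood to compute rather than calling the driver, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the structure used in the program above. */
typedef struct {
    int numbers_in[40];
    float numbers_out[40];
} BRAM2_Data;

/* Assumed size of one buffer descriptor in bytes (16 words of 4 bytes). */
#define MODEL_BD_SIZE 64u

/* Hypothetical model: how many descriptors fit in `bytes` at the given alignment. */
static uint32_t model_bd_count(uint32_t alignment, uint32_t bytes)
{
    /* The alignment must be a non-zero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* Each descriptor occupies its size rounded up to the alignment. */
    uint32_t slot = (MODEL_BD_SIZE + alignment - 1) & ~(alignment - 1);
    return bytes / slot;
}

/* Address of numbers_out in the index-th BRAM2_Data element starting at base. */
static uint32_t numbers_out_addr(uint32_t base, uint32_t index)
{
    return base + index * (uint32_t)sizeof(BRAM2_Data)
                + (uint32_t)offsetof(BRAM2_Data, numbers_out);
}
```

With a 640-byte ring buffer this yields 10 descriptors per ring, and with base address 0xc0000000 it yields 0xc00000a0, 0xc00001e0 and 0xc0000320 for the three numbers_out arrays, in agreement with the printed values.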
Reviewed and edited by: 符乾江