Run a compiled model

How to run your compiled model on a system with a Tensil accelerator

Things you’ll need

an FPGA board (e.g. the Ultra96-V2)
a compiled model (e.g. the set of three files: resnet20v2_cifar_onnx.tmodel, resnet20v2_cifar_onnx.tdata, resnet20v2_cifar_onnx.tprog)
a fully implemented bitstream (.bit) and a hardware handoff file (.hwh): if you don’t have these, learn how to integrate the RTL

In this guide we’ll assume you are using the PYNQ execution environment, but we also support bare metal execution with our embedded C driver.

1. Move files onto the FPGA

With PYNQ, you can achieve this by running

$ scp <my_model>.t* [email protected]:~/

and then doing the same for the .bit and .hwh files. For example:

$ scp resnet20v2_cifar_onnx.t* [email protected]:~/
$ scp design_1_wrapper.bit [email protected]:~/ultra96-tcu.bit
$ scp design_1.hwh [email protected]:~/ultra96-tcu.hwh

Note that with PYNQ, the .bit and .hwh files must have the same name up to the extension.

2. Copy the Python driver onto the FPGA

If you haven’t already cloned the repository, get the Tensil source code from Github, e.g.

curl -L https://github.com/tensil-ai/tensil/archive/refs/tags/v1.0.0.tar.gz | tar xvz

Now copy the Python driver over:

$ scp -r tensil-1.0.0/drivers/tcu_pynq [email protected]:~/

3. Execute

Now it’s time to hand everything over to the driver and tell it to execute the model. This guide will only cover the bare necessities for doing so, go here for a more complete example.

Import the Tensil driver

from pynq import Overlay
import sys
sys.path.append('/home/xilinx')
from tcu_pynq.driver import Driver
from tcu_pynq.architecture import ultra96

Flash the bitstream onto the FPGA

bitstream = '/home/xilinx/ultra96-tcu.bit'
overlay = Overlay(bitstream)
tcu = Driver(ultra96, overlay.axi_dma_0)

Load the compiled model

resnet = '/home/xilinx/resnet20v2_cifar_onnx_ultra96v2.tmodel'
tcu.load_model(resnet)

Run

Pass your input data to the driver in the form of a dictionary. You can see which inputs the driver expects by printing tcu.model.inputs.

img = ...
inputs = {'x:0': img}
outputs = tcu.run(inputs)

If all went well, outputs should contain the results of running your model.

Next Steps

You’ve successfully run your compiled model on Tensil’s accelerator implemented on your FPGA. You’re ready to use this capability in your application. Reach out to us if you need help taking it from here.

Troubleshooting

As always, if you run into trouble please ask a question on Discord or email us at [email protected].

Last modified March 4, 2022: Finish howto guides and add blog post (f3c8ba0)