Debugging Processor-based FPGA Designs

Highly integrated FPGA designs built around RISC cores require new debug methods

FPGA companies seem to announce higher-density devices on a quarterly basis. Multi-million-gate devices that seemed unimaginable just a year ago are shipping in volume today. Rushing to fill the millions of available gates is a myriad of synthesizable IP, including sophisticated RISC and DSP cores and countless peripheral devices. As a result, FPGA devices are getting a serious look from design engineers contemplating either a small- to medium-size ASIC or an ASIC design where the run rate of the final product is uncertain. An easy migration path to structured ASIC-like devices makes the FPGA choice even more attractive.

Embedding RISC and DSP cores, busses, and peripheral devices presents a new challenge to the tried-and-true methods of FPGA debug. These devices are now falling into the laps of software engineers, for whom issues like gate delays and timing closure are new and foreign concepts. While the configurable on-chip logic analyzers provided by the major FPGA vendors are indispensable for debugging the FPGA fabric, they are of little help in dealing with uninitialized variables, failing device drivers, and software performance issues. Fortunately, FPGA vendors have partnered with third-party tool providers to address the debug issues faced by the software engineer.

Enter On-Chip Instrumentation (OCI™)

Processors and DSPs have benefited from a host of tools over the years, such as In-Circuit Emulators (ICE), logic analyzers with disassembler probes, Background Debug Mode (BDM), and the like. The latest generation of programmable logic offers all of this and more in integrated environments that let you methodically build an FPGA-based SOC or SOPC and, at the same time, construct the debug tools for development and software debug.

Debug features require gates, and while most FPGA designs have plenty of extra gates to allocate to debug, there can be exceptions. Cost-sensitive applications are one case where gate utilization is paramount. Typically the FPGA embedded debug tools are completely scalable, giving the engineer a broad range of feature options versus required gates. In this scenario a design team will typically prototype in a larger device, giving them access to all of the debug tools. When it comes time to retarget the design to a smaller device, the debug capability can easily be scaled back to fit within the available gates.

Advanced Triggers

The most fundamental debug feature in any processor-based debug environment is run control: go, halt, single-step, and software breakpoints. Debug environments that far exceed these basic features, with a rich set of trigger options, are now available for embedded FPGA processors. The screen shot in Figure 1 shows just how powerful these trigger definitions can be, including such features as address ranges coupled with data values and masks. Triggers can be further qualified on cycle type: load, store, or either. Note that the triggers do more than just cause a breakpoint; they can also be used to gate trace collection or to trigger on-chip or external instrumentation such as logic analyzers.

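The trigger qualification described above can be modeled in a few lines of software. The sketch below is only an illustration of the matching logic (address range, masked data compare, cycle-type qualifier, and a resulting action). The names `Trigger`, `Cycle`, `uart_trigger`, and the action strings are hypothetical and do not correspond to any vendor's tool or register layout.

```python
from dataclasses import dataclass
from enum import Enum


class Cycle(Enum):
    LOAD = "load"
    STORE = "store"


@dataclass(frozen=True)
class Trigger:
    """Hypothetical software model of one hardware trigger definition."""
    addr_lo: int                  # inclusive start of the address range
    addr_hi: int                  # inclusive end of the address range
    data_value: int = 0           # value compared under data_mask
    data_mask: int = 0            # 0 means "ignore the data bus"
    cycles: frozenset = frozenset({Cycle.LOAD, Cycle.STORE})  # "either"
    action: str = "break"         # e.g. "break", "start_trace", "trigger_out"

    def matches(self, addr: int, data: int, cycle: Cycle) -> bool:
        in_range = self.addr_lo <= addr <= self.addr_hi
        # Only the bits selected by the mask take part in the comparison.
        data_ok = (data & self.data_mask) == (self.data_value & self.data_mask)
        return in_range and data_ok and cycle in self.cycles


# Assumed example: start trace collection when a store to a 256-byte
# peripheral window carries 0x8 in bits 7..4 of the data word.
uart_trigger = Trigger(
    addr_lo=0x8000_0000,
    addr_hi=0x8000_00FF,
    data_value=0x0000_0080,
    data_mask=0x0000_00F0,
    cycles=frozenset({Cycle.STORE}),
    action="start_trace",
)

print(uart_trigger.matches(0x8000_0010, 0x1234_5688, Cycle.STORE))
print(uart_trigger.matches(0x8000_0010, 0x1234_5688, Cycle.LOAD))
print(uart_trigger.matches(0x8000_0100, 0x0000_0080, Cycle.STORE))
```

In real on-chip instrumentation these comparisons are performed by dedicated comparators in the fabric on every bus cycle, which is why each additional trigger costs gates. The model only shows which conditions are combined.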
Large Real-time Trace Buffers

Real-time trace is considered by many software engineers to be the second most important tool in their debug arsenal. Basic trace collection usually takes the form of PC trace, or what some call Branch Trace Messages. This method of trace records only discontinuities in program flow, then reconstructs the execution path when the processor hits a breakpoint or is halted. This type of trace collection usually lends itself well to compression techniques particular to the processor architecture, resulting in deep trace buffers without requiring a significant amount of on-chip memory. Some trace schemes allow for both on-chip and off-chip trace collection, providing trace buffers well in excess of 100k frames deep.

Trace collection options are not limited to just executed instructions. Most sophisticated environments support the addition of load addresses, store addresses, and even data. Figure 2 shows a real-time trace display that includes the source code interleaved with the associated assembly code and the load/store data. Note that load and store data are color-coded for easy identification.

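The idea behind branch-style trace, recording only the discontinuities and filling in the sequential instructions afterwards, can be sketched as follows. This is a simplified model under stated assumptions: fixed 4-byte instructions, one record per taken branch stored as a `(branch_pc, target_pc)` pair, and a known start and halt address. The function name `reconstruct_path` and the record format are hypothetical; real trace protocols are compressed and specific to the processor architecture.

```python
def reconstruct_path(start_pc, halt_pc, branch_records, insn_size=4,
                     max_steps=1_000_000):
    """Rebuild the executed PC sequence from taken-branch records.

    branch_records is a list of (branch_pc, target_pc) tuples in the
    order the discontinuities occurred. Everything between two records
    is assumed to have executed sequentially.
    """
    path = []
    pc = start_pc
    next_record = 0
    while True:
        path.append(pc)
        if len(path) > max_steps:
            raise ValueError("trace does not reach halt_pc; records inconsistent")
        if (next_record < len(branch_records)
                and branch_records[next_record][0] == pc):
            # A recorded discontinuity: jump to the branch target.
            pc = branch_records[next_record][1]
            next_record += 1
        elif pc == halt_pc and next_record == len(branch_records):
            # All records consumed and the halt address reached.
            return path
        else:
            # No record for this address: execution fell through.
            pc += insn_size


# Assumed example: a loop body at 0x104..0x10C whose backward branch at
# 0x10C is taken twice, after which execution falls through and halts.
records = [(0x10C, 0x104), (0x10C, 0x104)]
path = reconstruct_path(start_pc=0x100, halt_pc=0x114, branch_records=records)
print([hex(pc) for pc in path])
print(f"{len(records)} trace records reconstruct {len(path)} instructions")
```

The ratio of reconstructed instructions to stored records is what makes deep trace buffers possible with a modest amount of on-chip memory.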
Performance Analysis Tools

One of the drawbacks often associated with using an FPGA versus an ASIC is performance, so extracting the maximum performance from the processor is essential. Recent FPGA processors allow for the embedding of counters and timers to track items that can have a significant impact on performance. Cache thrashing is a notorious problem that can reduce performance by an order of magnitude. Simple cache hit/miss measurements can go a long way toward identifying the source of the problem.

Performance tools usually allow trigger matches to be recognized as countable events, so performance monitoring hardware can count writes to specific memory-mapped peripheral registers. The user can then measure I/O activity such as interrupt frequency from each device, packet writes, UART traffic, DMA starts, or even a unique value written to a register.

Other measurements include the duration of a repetitive algorithm along with the number of instructions executed. The user can code different implementations and characterize each by its duration, the number of I- or D-cache misses per loop, and how many times an array of data is accessed per loop to determine the optimum algorithm. Unique to the FPGA processor environment is the ability of performance tools to measure how frequently inner loops are called and the accumulated time taken to execute them. The user can then determine whether the code should be replaced by hardware logic or changed to use custom instructions.

Integrated Debug Environment

Up to this point, debugging the programmable logic (or fabric) and debugging the embedded processor have been treated as completely independent activities. In reality, there are interdependencies between the fabric and the embedded core that need to be debugged. Recently, tools have emerged to address just such a problem.

Essentially, a configurable logic analyzer resides in the programmable logic and knows how to pass a trigger to the debug logic in the embedded processor. Most importantly, an integrated GUI is required to provide a visual correlation of this information. Getting the hardware engineers and software engineers looking at the same screen can solve fabric/core interdependence issues in a big hurry.

Coming Soon To an FPGA Near You

As FPGA devices grow, the debug problems are only going to get more difficult. Before the end of the year, tools to address multiprocessor debug support will be available for processors embedded in programmable logic. Look for such features as synchronous go/halt, cross triggers, and synchronized trace collection.

When multiple processors or lots of integrated peripherals are involved, system-level performance analysis will be required. No longer will performance measurements like cache hit/miss or interrupt latency be sufficient. System-level, bus-centric measurements address issues like bus utilization or how long a bus master waits to get access to the bus, i.e., access latency. Measuring these times can determine whether the bus master priorities should be adjusted or whether the software running on different processor cores should be modified to avoid these timing conflicts.

System-level performance tools can also help to determine whether data should reside off-chip and be processed with slower accesses, or be DMA’ed into on-chip memory for faster access at the cost of the transfer overhead. That overhead could be zero if the CPU can process other data while the transfer is occurring and the transfer doesn’t take up the majority of the bus bandwidth. Good system-level performance tools allow all these types of dynamic data transfers and data crunching to be analyzed.

Summary

Debug tools are important for both processor and general system-level analysis, and they become more important as the number of integrated IP cores increases.

Simulation is extremely helpful, but even the best simulations will not reveal problems that only manifest themselves during real-time operation. This is where top-quality debug tools really prove their value. In the increasingly sophisticated FPGA development environments, the design engineer can choose the tools appropriate to the application at hand. There are a lot of differences in debug capabilities at both the processor and chip level. Understand what you need up front and plan it in during the design stage. When your system is malfunctioning in the field and the FPGA is soldered on the board, that is the wrong time to be thinking about debug.

May 25, 2004

Comments on this article? Send them to comments@fpgajournal.com
All material on this site copyright © 2006 techfocus media, inc. All rights reserved.