XMOS vs FPGA Whitepaper
A Programmable Revolution
A Compelling Alternative to Low Cost FPGAs
Introduction

FPGAs and CPLDs are used in many industries covering a broad range of performance requirements, price points and power envelopes.

In the early days, FPGAs were used for prototyping ASICs and for high-end, low volume applications that could bear a high unit cost, such as the communications and defence sectors. Since then, FPGA vendors have driven down costs and power, through rapid process migration, to produce new lower cost and lower power device families to address new requirements.

Evolution

In many cases low end FPGA families are now considered for power and cost sensitive consumer and industrial applications. These sectors benefit sufficiently from the flexibility and time to market advantages offered by FPGAs to warrant the price premium of programmability.

Since 2009, new entrants to the programmable silicon market have been starting to win the hearts and minds of designers looking for the best possible mix of solution flexibility, price and performance. Some new players, such as SiliconBlue, Achronix and Tabula, have come up with new FPGA architectures. In parallel, other vendors such as Actel and Cypress have integrated FPGA fabrics with programmable analogue blocks and microcontrollers.

All of these efforts represent an evolution of the same FPGA concept.

Revolution: Now for the first time there is an all-digital flexible solution that will prove to be a better, cheaper, easier and lower power solution than an FPGA for many applications: XMOS.

XMOS XS1-L

The XMOS XS1-L family of devices is based on the XMOS XCore® processor, a 500 MIPS event-driven RISC processor with 100% deterministic operation, a 32x32 multiplier, programmable I/O and a host of other resources, all programmable entirely in C++, C and XC. XC includes extensions to C for concurrency, communications, and timed I/O operations.

XMOS devices can be used as a direct substitute for low cost SRAM based FPGAs. In other cases, they provide suitable replacements for some of the higher performance flash-based FPGAs.

Figure 1 shows the XS1-L family with respect to the price, capability (capacity and performance) and power consumption of various FPGA device families. For a large number of digital processing applications, the XS1-L outperforms Altera Cyclone III, Xilinx Spartan 3A FPGAs and equivalents on both price and power consumption.

Figure 1: XS1-L compared to popular FPGA families

A single core XS1-L device offers a capacity for general digital logic implementation roughly comparable to an FPGA having 7-20K logic elements (roughly 70K-200K ASIC gates).
2010-05-25
© 2010 XMOS Ltd
www.xmos.com
There will always be a place for micropower programmable devices and for high-end, DSP and bandwidth intensive FPGAs such as Virtex or Stratix parts. For applications residing in the space in between, however, XMOS can improve development speed and lower costs and power consumption without compromising solution flexibility and programmability.

In addition, the XS1-L provides the robust IP protection otherwise found only in flash-based FPGAs, whilst retaining performance much closer to that of an SRAM FPGA.

The rest of this paper describes how XMOS technology delivers a revolution in both the programmable silicon itself and the associated hardware design processes.

The XCore Processor

Instead of writing code in HDL to describe registers, gates and wires, designers who use XMOS technology write code in C, C++ or XC to implement deterministic processing functions, as shown in Figure 2.

Figure 2: Designing with the XCore

Parallelism

An XCore processor runs multiple real-time hardware threads simultaneously. Each thread has access to a dedicated set of general purpose registers, gets a guaranteed share of the processing power, and executes a program using common RISC-style instructions. Each thread can execute simple computational code, DSP code, control software (taking logic decisions, or executing a state machine) or handle I/O operations using intelligent I/O resources.

The eight hardware threads, generous MIPS, 100% deterministic architecture and intelligent I/O provide designers with the flexibility of HDL, while dramatically easing the design entry and verification tasks.

Threads, Memory and Channels

Threads can use channels to provide buffered, event-based communication between threads, allowing data exchange and synchronization using single cycle instructions. Alternatively, threads can share 64KB of on-chip SRAM memory to exchange data, using single cycle lock instructions to co-ordinate access.

This makes the implementation of lightweight protocol stacks (such as the uIP TCP/IP stack) that fit within the 64KB of memory essentially free when compared to an equivalent implementation in an FPGA. The FPGA version requires a soft core such as Xilinx's MicroBlaze and an external memory interface that would consume a large portion of the FPGA capacity, not to mention adding an external memory chip to the BOM cost.

| Task | XMOS approach | FPGA approach |
|---|---|---|
| Design Capture | High level, parallel C/XC code | HDL entry: always @(posedge clock) |
| Resources | Instructions, threads, channels, timers | Gates, LUTs, routing |
| DSP | Threads, 32x32 MAC | HDL entry, Embedded Block Multipliers |

Table 1: FPGA Design Concepts and XMOS Equivalents

Time

Each XCore has ten configurable timers, which can be directly instantiated in XC and used to control program execution or I/O operations with a nominal resolution of 10ns.

I/O and Interfacing

Each XCore provides up to 64 GPIO that can be set and sampled in a single instruction via intelligent, autonomous I/O resources called Ports. Simple input and output instructions transfer data to or from I/O ports, as shown in Figure 3. More complex use of ports allows data to be serialized and de-serialized, enabling the processor to keep up with high-speed data streams. The ports can timestamp data, synchronize transfers with an external or internal clock, and schedule data to be input or output at specific times.
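
As a rough illustration of the 10ns timer resolution described in the Time section, the sketch below converts microsecond delays into timer ticks and compares timestamps safely across counter wrap-around. It is plain C rather than XC, and it assumes a 32-bit free-running tick counter. The names `TICKS_PER_US`, `ticks_from_us` and `deadline_reached` are hypothetical and are not part of any XMOS library.

```c
#include <assert.h>
#include <stdint.h>

/* 10 ns per tick, as quoted in the text, means 100 ticks per microsecond. */
#define TICKS_PER_US 100u

/* Convert a delay in microseconds to timer ticks.
   ASSUMPTION: the tick count is held in 32 bits. */
uint32_t ticks_from_us(uint32_t us)
{
    /* Reject delays whose tick count would not fit in 32 bits. */
    assert(us <= UINT32_MAX / TICKS_PER_US);
    return us * TICKS_PER_US;
}

/* Wrap-safe test: has `now` reached or passed `deadline`?
   Valid while the two values are less than 2^31 ticks apart. */
int deadline_reached(uint32_t now, uint32_t deadline)
{
    return (int32_t)(now - deadline) >= 0;
}
```

For example, `ticks_from_us(250)` gives 25000 ticks for a 250 microsecond delay. Adding that value to a timer reading forms a deadline that `deadline_reached` can test even if the counter wraps in between.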
XMOS, the XMOS logo and XCore are trademarks of XMOS Ltd
All other trademarks are the property of their respective owners.
    #include <xs1.h>

    out buffered port:1 outP = XS1_PORT_1B;  // 1-bit output port
    in buffered port:4 inP = XS1_PORT_4A;    // 4-bit input port
    clock ref = XS1_CLKBLK_REF;              // reference clock block

    int main(void) {
      int value;
      // Clock both ports from the reference clock block; output starts at 0.
      configure_out_port_no_ready(outP, ref, 0);
      configure_in_port_no_ready(inP, ref);
      while (1) {
        inP :> value;    // input from the port into value
        if (value > 9)
          outP <: 1;     // output 1 when the input exceeds 9
        else
          outP <: 0;
      }
    }

Figure 3: XMOS Ports Use Example

Clock Blocks are used to select the internal XCore system clock, the timer reference clock, or an external clock connected via a 1-bit port to clock a given port. Clock blocks sample incoming external clocks and then provide a variety of conditioning options (for example, delaying the clock relative to the data associated with it).

| Task | XMOS approach | FPGA approach |
|---|---|---|
| I/O Interfacing | Ports, timers | HDL entry |
| Clocking | Clock blocks | Clock Management Units |

Table 2: XMOS and FPGA I/O Concepts

Event-Driven Processing

The XCore processor is event-driven. Threads waiting for events do not consume any processing resources. An event can be the completion of a communication or I/O operation, the release of a lock, or a timer reaching a programmed time. Threads can wait for any one of a set of events; the first event causes the thread to start in a single instruction.

The XS1-L XCore provides an Active Energy Conservation mode in which it automatically and instantly slows the XCore clock down to a user-specified speed whenever all threads are paused. The clock returns to its normal speed as soon as any thread has new work to do.

Selecting your programmable solution

Table 3 lists a range of application function examples and compares the utilization of XCore resources and FPGA logic elements required to implement the function.

| Function | XS1-L Threads | XS1-L MIPS | XS1-L Memory | XS1-L GPIOs | FPGA Logic Cells | ASIC Nand2 Gates |
|---|---|---|---|---|---|---|
| USB2 + 2EP | 5 | 400 | 30794 | 12 | 4400 | 44000 |
| Ethernet MAC+MII | 5 | 250 | 9982 | 14 | 3600 | 36000 |
| TCP/IP (uip) | 1 | 50 | 40000 | 0 | 6100 [1] | 61000 |
| S/PDIF | 2 | 100 | 5036 | 2 | 800 | 8000 |
| I2C Master | 0.5 | 50 | 3044 | 2 | 700 | 7000 |
| SDRAM Interface (D8, A14) | 1 | 100 | 2974 | 30 | 1100 | 11000 |

Table 3: Application Function Examples

IP Protection

Each XCore has 8KB of secure one time programmable (OTP) memory, secure execution mode, the ability to load AES encrypted firmware, and the option to disable JTAG and external channel access to a secured XCore. This all adds up to a level of IP protection that cannot be matched by an SRAM FPGA.

Applications requiring robust IP protection are often forced to use a slower but more secure flash-based FPGA, which can lead to timing closure issues. XMOS XS1-L devices offer a way to meet security and performance requirements with minimal effort.
[1] Assumes a Nios II soft processor and an external memory interface are required for TCP/IP running in a Cyclone III device.
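
The figures in Table 3 lend themselves to a first-order fit check: sum the threads, MIPS, memory and GPIOs of the functions you need and compare the totals with the single-core limits quoted in this paper (eight threads, 500 MIPS, 64KB of SRAM, up to 64 GPIO). The sketch below does this in plain C. It assumes that the Memory column is in bytes and that costs simply add, with no sharing between functions; neither is stated in the paper. The type, table and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One row of Table 3. Threads are held in half-thread units so the
   0.5-thread I2C master is represented exactly. */
struct function_cost {
    const char *name;
    unsigned half_threads;
    unsigned mips;
    unsigned memory;  /* ASSUMPTION: bytes; the table does not state a unit */
    unsigned gpios;
};

enum table3_row {
    USB2_2EP, ETH_MAC_MII, TCPIP_UIP, SPDIF, I2C_MASTER, SDRAM_IF, TABLE3_ROWS
};

/* XS1-L columns of Table 3, copied from the paper. */
const struct function_cost TABLE3[TABLE3_ROWS] = {
    [USB2_2EP]    = {"USB2 + 2EP",                10, 400, 30794, 12},
    [ETH_MAC_MII] = {"Ethernet MAC+MII",          10, 250,  9982, 14},
    [TCPIP_UIP]   = {"TCP/IP (uip)",               2,  50, 40000,  0},
    [SPDIF]       = {"S/PDIF",                     4, 100,  5036,  2},
    [I2C_MASTER]  = {"I2C Master",                 1,  50,  3044,  2},
    [SDRAM_IF]    = {"SDRAM Interface (D8, A14)",  2, 100,  2974, 30},
};

/* Single-core limits quoted in the paper. */
#define MAX_HALF_THREADS 16u     /* eight hardware threads */
#define MAX_MIPS         500u
#define MAX_MEMORY       65536u  /* 64KB of SRAM, taken as 64 * 1024 bytes */
#define MAX_GPIOS        64u

/* Return 1 if the summed costs of the chosen rows fit in one XCore, else 0. */
int fits_single_xcore(const enum table3_row *rows, size_t n)
{
    unsigned ht = 0, mips = 0, mem = 0, gpios = 0;

    assert(rows != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        assert((size_t)rows[i] < TABLE3_ROWS);
        const struct function_cost *f = &TABLE3[rows[i]];
        ht    += f->half_threads;
        mips  += f->mips;
        mem   += f->memory;
        gpios += f->gpios;
    }
    return ht <= MAX_HALF_THREADS && mips <= MAX_MIPS &&
           mem <= MAX_MEMORY && gpios <= MAX_GPIOS;
}
```

With these figures, Ethernet MAC+MII, TCP/IP (uip) and S/PDIF together total 8 threads, 400 MIPS, 55018 memory units and 16 GPIOs, so the check passes. USB2 + 2EP combined with Ethernet MAC+MII needs 10 threads and 650 MIPS, so it does not fit in a single core under these assumptions.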
DSP

XS1-L devices offer easily accessible DSP functionality via their 500 MHz 32x32 multiplier, offering a sustained rate (including load/store operations) of 59 MMACS per XCore (119 MMACS peak), which is sufficient for many audio, signal control and lower end DSP tasks that need low cost and power per MMAC.

The low cost FPGA families such as Altera Cyclone III, on the other hand, offer tens or hundreds of embedded block multipliers, which can be ganged together to create multipliers of arbitrary width. When many of these are employed in parallel, an aggregate DSP processing capability can be built up far in excess of what the XS1-L can achieve.

Consequently the FPGA provides a significant advantage for high throughput image and video processing or telecommunications infrastructure processing. For many emerging applications (such as consumer and prosumer digital audio), however, moderate DSP needs are just one item on the list of requirements alongside flexible control, low cost and integration. For these types of applications XMOS is likely to offer the ideal solution, all programmable in a high-level language.

Solution Scaling

An application that does not fit in a single XCore may be easily spread across multiple cores by selecting the two-core XS1-L2 device. Alternatively, multiple XMOS devices can be connected together by asynchronous off-chip links that join multiple XS1 processors into a single network, with communication carried over channels.

High I/O Capability

For applications that require many hundreds of I/Os, a low cost FPGA is likely to be a preferable choice. Likewise, for very high speed native I/O capabilities such as LVDS, gigabit SERDES transceivers, SSTL2 or other exotic I/O technology, choose an FPGA. However, a large majority of applications are well served with single ended 3.3V I/O, making large amounts of high speed I/O an expensive and unneeded feature.

Soft Processors

For FPGA designs that need to employ a soft processor to implement a protocol stack, the issue becomes the amount of code memory required. For many simple protocol stacks, such as TCP/IP for simple web-servers and various I/O related standard and proprietary protocols, the 64KB of internal SRAM on the XCore is sufficient.

In these cases the XS1-L is the cost-effective choice. To achieve the above in an FPGA would require either:

- A gate hungry soft processor core and external memory interface plus external memory chip, all of which add a sizeable penalty in device capacity, power consumption, I/O, BOM cost and board space.
- A soft processor core with additional logic cells used to implement a small code memory on the FPGA.

Many soft processor implementations may also find it impossible to achieve the clock speed required to meet processing requirements, leaving the designer to look for a product that integrates hardened 32-bit RISC cores with a suitable programmable fabric.

For applications that have code footprints well in excess of 64KB, an FPGA with external memory may be the only option.

Figure 4: Costs associated with Soft Core Usage in FPGAs
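
The multiply-accumulate workload discussed in the DSP section can be sketched in plain C as a dot product of filter coefficients and samples, the inner loop of a FIR filter. This is an illustrative sketch only, not XMOS library code: `fir_mac` is a hypothetical name, and whether a given compiler maps the loop onto the 32x32 multiplier at the quoted MMACS rates is an assumption that would need checking against the generated code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dot product of `taps` coefficients and samples. Each step is a
   32x32-bit multiply whose 64-bit product is added to a 64-bit sum. */
int64_t fir_mac(const int32_t *coeffs, const int32_t *samples, size_t taps)
{
    int64_t acc = 0;

    assert(coeffs != NULL && samples != NULL);
    for (size_t i = 0; i < taps; i++) {
        /* Widen one operand first so the product is computed in 64 bits. */
        acc += (int64_t)coeffs[i] * samples[i];
    }
    return acc;
}
```

Widening before the multiply keeps the full 64-bit product, so large coefficients and samples do not overflow a 32-bit intermediate. The caller remains responsible for keeping the running sum within 64 bits over long filters.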
Design Flow

Figure 5 compares the standard FPGA design flow to the XMOS design flow. Overall, the XMOS design flow offers dramatically shorter iteration times and more straightforward design entry than the traditional FPGA flow.

Figure 5: XMOS and FPGA Design Flows Compared

Design Entry

Design entry is C++, C or XC using either the XDE graphical development environment or your favorite text editor. The XDE offers syntax highlighting and indenting, and can compile code, launch simulations and start debugging.

Design in a High Level Language

EDA vendors have expended significant efforts to bring the advantages of high level languages to FPGA design, and still have a long way to go to deliver practical hardware design flows using C and high level synthesis.

Designers using XMOS technology, on the other hand, immediately reap the productivity benefits of coding in a high level language, yet avoid the pitfalls of high level synthesis.

Ultra Fast Compilation

Even large XMOS programs compile and link in seconds, compared to the minutes or even hours required to complete a typical iteration of FPGA synthesis and place and route.

Application Timing Closure

The XS1-L implements parallelism using its instruction set and native resources, all of which reliably run at 500 MHz. Designers using XMOS have no need to check register to register timing paths across multiple design corners.

One of the most powerful attractions of the
XMOS approach for FPGA designers is the ability to statically time paths through application code using the XMOS Timing Analyzer, which times critical application paths rather than register-to-register paths.

The Timing Analyzer achieves 100% coverage of enumerated constraints, unlike test-bench based simulation. For example, the Timing Analyzer can calculate the time in XCore cycles from a thread sampling a specific pattern on an input port to outputting a response on an output port.

The result can be graphically displayed, highlighting the critical path through the code and automatically signing off against user specified timing constraints expressed as pragmas in the code or entered using the XTA GUI.

For FPGA designers to access similar functionality they must deploy property checkers and formal proof methods, which rapidly reach their limits on even moderately sized designs, and require specialist design knowledge to apply.

The Timing Analyzer offers a whole-application level timing capability that does not rely on time consuming dynamic simulation, which will be appreciated by software and hardware engineers alike.

Simulation

Designers have the option to run XCore simulations of their code, visualizing the results with the XMOS VCD waveform viewer and debugging and single stepping with the debugger, all built into the XDE graphical environment.

The signals displayed in the VCD viewer are a range of actual signals that exist within the XS1-L silicon, including program counters, port resource signals, timers, channels and thread status.

These simulations run an order of magnitude faster than a corresponding dynamic simulation in an event-driven HDL simulator. XSIM also provides a range of simple testbench plug-ins and an API for users to create more of their own.

Bitstream Generation

After the design is ready, firmware for downloading to configuration flash memories is easily generated with XFLASH, which includes provision for multiple boot images and Dynamic Field Upgrade (DFU).

XBURN can be used to burn parts of the code image and selected user encryption keys to the 8KB of OTP on chip, or just to set security options such as disabling JTAG debug access.

In System Debug

XMOS offers a typical processor debugging environment using XGDB (built on top of gdb, the GNU Debugger) and the XS1-L JTAG interface.

Debug iterations with XMOS tools only require a recompile and regeneration of firmware. FPGA designers must pre-select the nodes they wish to view and iterate through synthesis, place and route and timing analysis for each debug iteration.

PCB Design considerations

XMOS offers its processors in QFP, QFN and BGA packages, suitable for 2 layer and 4 layer PCB implementations.

In addition, the XS1-L parts require only two voltage supplies: a 3.3V or 2.5V supply for the I/O, and a 1V core voltage.

The various port/pin configurations that can be realized with the XS1-L also offer some late pin assignment flexibility, although not to the same fine degree offered by FPGAs.

Toolchain Simplicity and Platform Support

Full FPGA design tool chains from the FPGA vendors and/or third party EDA suppliers run to multiple gigabytes of data.

The XMOS tools typically only require about 200 megabytes and work out of the box on Windows, Linux and Mac platforms, allowing you to develop your applications on desktop PCs or notebooks.
Summary
XMOS offers a lower cost and more secure
platform with dramatically enhanced
Revision History
| Revision | Released | Formats | Supported Tools |
|---|---|---|---|
| Version: 1.0 | September 15, 2010 | download | N/A |
