Full time, permanent
Location: Ideally Bristol, or elsewhere in the UK.
Reporting to: Software Manager, Compilation Tools
The role
Based in our Bristol office, the Senior AI Compiler Engineer works in the Compilation Tools team and plays a pivotal role in developing and enhancing the LLVM-based toolchain for our novel multi-core processors, with a particular focus on enabling AI and machine learning workloads on XMOS hardware.
Key Responsibilities:
- Working on a variety of tools and target libraries to build, maintain, and enhance a complete toolchain for a new multi-core RISC-V processor. The toolchain consists of a Clang C/C++ compiler, runtime libraries and the LLD linker.
- Maintaining and enhancing the currently shipping XCORE ISA toolchain. This includes a traditional compile chain plus bespoke tools and libraries for targeting the unique features of XCORE devices.
- Maintaining the currently shipping AI Compiler. This is an MLIR-based compile chain capable of transforming a TensorFlow Lite model to a representation optimised for execution on XCORE devices, including memory optimisations and use of a bespoke C/C++ runtime.
- Planning and implementing the next generation of our AI Compilation offering.
- Making benchmark-driven tool improvements, particularly optimisation for a resource-constrained target.
- Contributing to the specification of the next generation of silicon and toolchain.
The ideal candidate
You’re a software engineer with either prior experience of compiler or AI/ML toolchain development, or the ability (and strong desire) to learn about it on the job.
Key skills and qualifications
Required:
- Strong C/C++ programming skills.
- Deep understanding of tools and libraries used to build software (especially for embedded systems).
- The ability to quickly assimilate complex problems and develop solutions autonomously.
- Compiler development, in particular the back-end, including experience with or interest in AI/ML compiler frameworks such as TensorFlow (XLA, MLIR, or TFLite).
- An interest in working across the full AI stack – from ML framework integration and compiler front-end work through to low-level runtime optimisation and hardware bring-up.
- Minimum of 2-3 years of development experience in a commercial setting.
Preferred:
- Reading and writing assembly code.
- Implementing and maintaining runtime libraries for bare-metal targets.
- Knowledge of microprocessor architectures for embedded applications – for example, instruction set composition, pipeline stages, memory hierarchy and cache implementations.
- Development using Python.
- Hands-on experience with TensorFlow, including working with TFLite, XLA, or MLIR-based compilation pipelines.
- Interest in the future of embedded ML technologies, including LiteRT.
- Performing benchmarking and benchmark-driven optimisation.
- Releasing compiler technology to a user base and supporting internal and/or external users.
At XMOS, we believe that diverse experiences and perspectives drive innovation and success. We know that no one checks every box, and we don’t expect you to. If you’re excited about this role, passionate about what you do, and eager to learn, we want to hear from you, even if you don’t meet every qualification. Your unique background, skills, and potential to grow are just as important as ticking every box. If you believe you could make a valuable contribution to our team, we encourage you to apply.
About XMOS
XMOS is the leading producer of Generative Systems-on-Chip (GenSoC). XCORE® is a generative SoC platform, capable of integrating Control, IO, DSP and AI in a single chip to match any requirements customers may have. Its deterministic, parallel architecture makes it a game-changer for modern generative system design, enabling users to build entire differentiated systems more quickly and economically than on any other platform.
The parallel architecture means it can perform multiple tasks at the same time without interference, and complete them reliably and predictably. XCORE’s flexibility, scalability and determinism give it a critical edge over systems built on sequential processors as a platform upon which systems can be generated.
In the modern era of generative system design, XCORE chips enable systems to become endlessly reconfigurable in real time, allowing for flexibility and faster time to market, with lower development costs. Custom hardware requires an immense amount of forward planning, compromise, expertise, and cost. Fixed hardware, high development costs and long lead times mean that creative projects get stalled, compromised, or never built at all.
