Posted Mar 12, 2026

Software Engineer (AI Compiler)

Apply Now ✨
Requirements
• BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, or a related field
• 4+ years of hands-on ML compiler or systems engineering experience
• Demonstrated experience building and owning an end-to-end compiler stack (front end, IR, optimization, and backend code generation)
• Experience working with machine learning models, neural network graphs, and graph optimizations as part of lowering and acceleration, using frameworks such as TVM, XLA, or Glow
• Comfortable collaborating with hardware teams to map novel architectural primitives from IR to efficient lowerings, kernel implementations, and runtime support
• Strong understanding of compiler performance trade-offs, profiling, bottleneck analysis, and optimization strategies for ML workloads
• (Desirable) Prior experience with compilers for AI/ML accelerators, GPUs, DSPs, or domain-specific architectures
• (Desirable) Contributions to LLVM, MLIR, XLA, TVM, or related open-source compiler projects
• (Desirable) Experience in kernel performance optimization and accelerator-specific code generation
• (Desirable) Demonstrated work in hardware-software co-design where compiler insights shaped ISA or architectural decisions
• (Desirable) Experience building or contributing to cycle-accurate simulators for performance modeling
• (Desirable) Prior work building profiling tools, performance evaluation suites, or bottleneck analyzers for compiler or runtime stacks
• (Desirable) Familiarity with deep learning frameworks and model formats (e.g., JAX, ONNX, PyTorch, TensorFlow) and graph transformations
• (Desirable) Experience designing custom IR dialects, optimization passes, and domain-specific lowering transformations

What the job involves
• We're building an AI accelerator from the ground up, and we need a strong ML compiler engineer at the heart of hardware-software co-design. This isn't about inheriting a mature compiler stack - it's about creating one
• You'll join at the architecture definition stage, directly influencing ISA design and the trade-offs that determine what our hardware can do. As we progress toward hardware bringup, you'll build the complete compiler toolchain that takes machine learning models from high-level frameworks down to efficient execution on our novel architecture
• This role offers the rare opportunity to shape both silicon and software simultaneously. You'll work alongside hardware architects and researchers to co-design compiler strategies that unlock the full potential of our accelerator, building infrastructure that bridges the gap between ML model graphs and custom ISA primitives
• Your compiler decisions will directly inform hardware features, and hardware capabilities will open new optimization frontiers for your toolchain
• If you want to architect a compiler stack from first principles, optimize ML workloads on new hardware, and see your decisions realized in silicon, this is the role
• Work across the full stack with software, systems, and hardware teams to ensure correctness, performance, and deployment readiness for real workloads
• Contribute to shaping the long-term compiler architecture and tooling strategy in a fast-moving startup environment
• Design and implement parts of the compiler stack targeting our novel AI accelerator, including front-end lowering, IR transformations, optimization passes, and backend code generation
• Build and evolve MLIR/LLVM-based infrastructure to support graph lowering, hardware-aware optimizations, and performance-centric code emission
• Collaborate closely with hardware architects, microarchitects, and research teams to co-design compiler strategies that align with evolving ISA and hardware constraints
• Develop profiling and analysis tools to identify performance bottlenecks, validate generated code, and ensure high-throughput/low-latency execution of AI workloads
• Enable efficient mapping of high-level ML models to hardware by working with model frameworks and graph representations (e.g., ONNX, JAX, PyTorch)
• Drive performance tuning strategies including kernel authoring, schedule generation, and hardware-specific optimization passes

Apply Now