Tilde Research is a moonshot AI lab advancing mechanistic interpretability, new architectures, and pretraining science. We build foundational understanding of models to advance the frontier of intelligence.

About the role:

As a Kernel Engineer at Tilde, you'll design, implement, and optimize high-performance GPU kernels critical to scaling our training and inference workloads. Your work will enable faster iteration cycles, higher throughput, and lower latency. You'll work closely with ML researchers and engineers to co-design models and infrastructure that are deeply performance-aware, and help push the limits of what current hardware can support.

What you might work on:

- Design, develop, and tune custom GPU kernels for core model operations
- Work with ML engineers to prototype and scale novel model architectures
- Contribute to system-wide efforts to improve efficiency and throughput, beyond kernel-level optimizations alone

You're a good fit if you:

- Have experience in deep learning or related research areas
- Have demonstrated exceptional capability working on ML kernels, for example through:
  - Strong open-source contributions
  - Thoughtful technical blog posts or work logs
  - Previous experience with hardware-aligned algorithms
- Have deep familiarity with PyTorch and with more than one of Triton/TK/TileLang, basic familiarity with CUDA, and knowledge of GPU architecture
- Communicate clearly and effectively, both verbally and in writing
- Are a strong algorithmic thinker
- Learn quickly