
Senior / Staff AI Research Engineer, Real-Time Inference at RoboForce
Milpitas, CA | Full-time | AI | Posted about 2 months ago
About the Role
<p><strong>Why RoboForce</strong></p>
<div data-page-id="SQebduWNJoyXdvxrW44lL7ssgEf" data-lark-html-role="root" data-docx-has-block-data="false">
<div class="ace-line ace-line old-record-id-R8SQdhthJo8kHKxiTfzljrYOgug">
<div data-page-id="SQebduWNJoyXdvxrW44lL7ssgEf" data-lark-html-role="root" data-docx-has-block-data="false">
<div class="ace-line ace-line old-record-id-MJobd8l0Yo8hQ7xYFcolKmGHgfd">RoboForce is an AI robotics company developing Physical AI–powered Robo-Labor for dull, dirty, and dangerous work. The company's robots are engineered for demanding industrial environments, with a focus on real-world deployment and scalability.</div>
<div class="ace-line ace-line old-record-id-MZAodyEiwoqwSDxbFSAlgnCMgeg"> </div>
<div class="ace-line ace-line old-record-id-MZAodyEiwoqwSDxbFSAlgnCMgeg">We are looking for a <strong>Senior / Staff AI Research Engineer, Real-Time Inference</strong> to make embodied AI practical on the edge. In this role, you will drive the full stack of model optimization — from CUDA kernel engineering to quantization and compression — to deploy high-performance AI models on edge compute platforms powering RoboForce robots in the field.</div>
<div class="ace-line ace-line old-record-id-X0GId5bxCogqzxxLBAsl5STRgJc"> </div>
<div class="ace-line ace-line old-record-id-X0GId5bxCogqzxxLBAsl5STRgJc"><strong>Responsibilities</strong></div>
<ul class="list-bullet1">
<li class="ace-line ace-line old-record-id-JgE8deH2Jo1BVYxEToBlr2GvgMh" data-list="bullet">
<div>Develop and optimize inference pipelines for embodied AI models (VLA, perception, world models) targeting real-time execution on edge hardware such as NVIDIA Jetson platforms.</div>
</li>
<li class="ace-line ace-line old-record-id-RNJMdN4uqoPMwix0yhnl5loxgqc" data-list="bullet">
<div>Implement CUDA-level optimizations including custom kernels, memory layout tuning, and hardware-aware graph compilation to minimize model latency.</div>
</li>
<li class="ace-line ace-line old-record-id-CD7LdeLyLoMBl5x3BVslTMu3g5c" data-list="bullet">
<div>Apply and advance model compression techniques — quantization (INT8/FP16/INT4), pruning, distillation, and structured sparsity — to achieve production-grade throughput on constrained devices.</div>
</li>
<li class="ace-line ace-line old-record-id-BhgEdsJMiolGkyxyI4ul09Fdg4y" data-list="bullet">
<div>Profile and debug end-to-end inference stacks using tools such as Nsight, TensorRT, and Triton to identify and eliminate performance bottlenecks.</div>
</li>
<li class="ace-line ace-line old-record-id-TWiXdK6AUo0fCaxmq2dlm0s4gvb" data-list="bullet">
<div>Collaborate with ML research and robotics teams to co-design model architectures that meet real-time control-loop latency requirements.</div>
</li>
<li class="ace-line ace-line old-record-id-XxCMd1Vkio3cNdxTlsclbqr4gah" data-list="bullet">
<div>Establish benchmarking frameworks to evaluate model performance across latency, throughput, power consumption, and accuracy tradeoffs on target hardware.</div>
</li>
</ul>
<div class="ace-line ace-line old-record-id-DgbRdlUSGoXYLNxP4C8l83iGghd"><strong>Requirements</strong></div>
<ul class="list-bullet1">
<li class="ace-line ace-line old-record-id-Jzppd6t7yob01SxhoaZl6aFyg1f" data-list="bullet">
<div>Master's degree in Computer Science, Electrical Engineering, or a related field with 4+ years of experience, or a PhD.</div>
</li>
<li class="ace-line ace-line old-record-id-WcqodnpmQotLxExU0QGlMplUgSb" data-list="bullet">
<div>Deep expertise in CUDA programming, GPU architecture, and low-level kernel optimization, including custom kernel authoring with tools such as Triton.</div>
</li>
<li class="ace-line ace-line old-record-id-UC42dnlWuo6XHWxWAdel1v61gvc" data-list="bullet">
<div>Hands-on experience with model quantization, pruning, distillation, and deployment using frameworks such as TensorRT, ONNX Runtime, TVM, or Triton.</div>
</li>
<li class="ace-line ace-line old-record-id-DpasdwJS5o4zoLxHi9wlXOn8gVc" data-list="bullet">
<div>Proficiency in C++ and Python; strong systems programming and performance profiling skills.</div>
</li>
<li class="ace-line ace-line old-record-id-KJ46d2Vl6oiIrzx5kT6l4AkSgkd" data-list="bullet">
<div>Experience deploying ML models on edge or embedded hardware (e.g., NVIDIA Jetson, Orin, or equivalent ARM/GPU SoCs).</div>
</li>
<li class="ace-line ace-line old-record-id-D7thdMQtMogIJkx9JGflhsL8gZO" data-list="bullet">
<div><strong>Requires 5 days/week in-office collaboration with the teams.</strong></div>
</li>
</ul>
<div class="ace-line ace-line old-record-id-KXKWdrl4TokeUExmoWKlxZ8RgMc"><strong>Bonus Qualifications</strong></div>
<ul class="list-bullet1">
<li class="ace-line ace-line old-record-id-GKCNdalW1oW16hxpy2QlHfsOgXm" data-list="bullet">
<div>Familiarity with embodied AI models — VLA, multimodal transformers, or diffusion-based policies — and their inference characteristics.</div>
</li>
<li class="ace-line ace-line old-record-id-RM01dJ8iSoJltbx3NPRlmKUFgKf" data-list="bullet">
<div>Familiarity with compiler-based optimization pipelines such as XLA, torch.compile, or MLIR for graph-level model acceleration.</div>
</li>
<li class="ace-line ace-line old-record-id-Oxibdl8vyoQiBxxNBuUlcTsDg2d" data-list="bullet">
<div>Understanding of robotics system constraints such as control-loop timing, sensor fusion latency, and memory bandwidth limits on edge SoCs.</div>
</li>
<li class="ace-line ace-line old-record-id-RqYedyyHbocrlNxnTCilrQZRg4f" data-list="bullet">
<div>Publication or production work in efficient deep learning or on-device ML systems.</div>
</li>
</ul>
<div class="ace-line ace-line old-record-id-TFgTdWsPgoqoyOx1Bivl75Yrgtf"><strong>Benefits</strong></div>
<ul class="list-bullet1">
<li class="ace-line ace-line old-record-id-GTd6dPeZuo1GB9xU6XOlVvpnggd" data-list="bullet">
<div>Competitive stock options/equity programs.</div>
</li>
<li class="ace-line ace-line old-record-id-Ms7sdrJr5oC16OxYCcdlVm9dgwm" data-list="bullet">
<div>Health, dental, and vision insurance, plus a 401(k) plan.</div>
</li>
<li class="ace-line ace-line old-record-id-Sy2Sd3dHDoROyFxR9yrl72Fkg0M" data-list="bullet">
<div>Visa sponsorship and green card support for qualified candidates.</div>
</li>
<li class="ace-line ace-line old-record-id-UIOhdl9llo0X0cxMsdilPSB7gve" data-list="bullet">
<div>Lunches and dinners, a fully stocked kitchen, and regular team-building events.</div>
</li>
</ul>
</div>
</div>
</div>