
Senior / Staff AI Research Engineer, Data Infrastructure at RoboForce
Milpitas, CA | Full-time | AI | Posted about 2 months ago
About the Role
<p><strong>Why RoboForce</strong></p>
<div data-page-id="SQebduWNJoyXdvxrW44lL7ssgEf" data-lark-html-role="root" data-docx-has-block-data="false">
<div class="ace-line ace-line old-record-id-JWoMdQeYxodhHvxYF3rlsTdogif">RoboForce is an AI robotics company developing Physical AI–powered Robo-Labor for dull, dirty, and dangerous work. The company's robots are engineered for demanding industrial environments, with a focus on real-world deployment and scalability.</div>
<div class="ace-line ace-line old-record-id-JWoMdQeYxodhHvxYF3rlsTdogif"> </div>
<div class="ace-line ace-line old-record-id-PkTRdAygZofYhPxyIG4lYfIvgle">We are looking for a <strong>Senior / Staff AI Research Engineer, Data Infrastructure</strong> to build the data and learning engine behind RoboForce's Physical AI stack. In this role, you will own the full pipeline — from raw teleoperation and UMI device data collection through curation, annotation, and storage, to post-training infrastructure that scores demonstrations, identifies failure patterns, and closes the loop back into model retraining.</div>
<div class="ace-line ace-line old-record-id-PkTRdAygZofYhPxyIG4lYfIvgle"> </div>
<div class="ace-line ace-line old-record-id-DqY7dKHlYoWeSyxgmFpleqVTgF8"><strong>Responsibilities</strong></div>
<ul class="list-bullet1">
<li class="ace-line ace-line old-record-id-JRz4d1r0IoHSxTx588kldmGpgVb" data-list="bullet">
<div>Design and maintain end-to-end data collection pipelines ingesting multimodal demonstration data from teleoperation devices and UMI hardware, including synchronization, versioning, and distributed storage at scale.</div>
</li>
<li class="ace-line ace-line old-record-id-KoREdwqt0on7oRxDTxqlaPZTgGc" data-list="bullet">
<div>Build annotation tooling and data curation workflows — quality filtering, deduplication, episode scoring, and domain reweighting — to produce high-quality training datasets for robot policy learning.</div>
</li>
<li class="ace-line ace-line old-record-id-RtSTdxfk5o5nShxkraZlWHb4gez" data-list="bullet">
<div>Develop post-SFT reinforcement learning infrastructure: implement reward scoring on demonstrations, mine and categorize failure patterns, and feed curated failure data back into the retraining loop.</div>
</li>
<li class="ace-line ace-line old-record-id-WPwkd0aUEoH5gQxP566l74fSgof" data-list="bullet">
<div>Build evaluation and test infrastructure to log policy rollouts on-robot, capture structured results, and surface actionable diagnostics for the research team.</div>
</li>
<li class="ace-line ace-line old-record-id-AfUMdTYNjoDweDx4Hu3lZ3Isgrh" data-list="bullet">
<div>Collaborate with ML researchers to define data schemas, episode formats, and pipeline interfaces that support rapid iteration on VLA and manipulation policy training.</div>
</li>
<li class="ace-line ace-line old-record-id-F1zjdHSYBo7iFExobpol5g6PgDc" data-list="bullet">
<div>Architect scalable storage and retrieval systems for heterogeneous robot data (vision, proprioception, action, language) across both cloud and on-prem environments.</div>
</li>
</ul>
<div class="ace-line ace-line old-record-id-DoiNd72l1oS6mYxN70JlXEH8g17"><strong>Requirements</strong></div>
<ul class="list-bullet1">
<li class="ace-line ace-line old-record-id-CDuDdsj0jovq8WxsDo0llVR2gog" data-list="bullet">
<div>Bachelor's or Master's degree in Computer Science, Robotics, or a related field, with 5+ years of relevant experience.</div>
</li>
<li class="ace-line ace-line old-record-id-IA1TdZe7Vo0kMBxaYyWlYigegLh" data-list="bullet">
<div>Strong proficiency in Python and experience building production-grade data pipelines and ETL systems.</div>
</li>
<li class="ace-line ace-line old-record-id-YZAmd26ZzoyGDxxfYHCl8J9OgBh" data-list="bullet">
<div>Hands-on experience with large-scale dataset management, including versioning, deduplication, quality filtering, and distributed storage (e.g., S3, GCS, HDF5, WebDataset, Zarr).</div>
</li>
<li class="ace-line ace-line old-record-id-AZJtdI2ito9EPQxgpx3liFJjgXd" data-list="bullet">
<div>Experience building or working with post-training infrastructure — SFT pipelines, reward modeling, or RL training loops (e.g., PPO, DPO, rejection sampling).</div>
</li>
<li class="ace-line ace-line old-record-id-AMNud2znXocNsrxfLTFlHUpbgAd" data-list="bullet">
<div>Familiarity with deep learning frameworks (PyTorch, JAX) and ML training workflows sufficient to collaborate tightly with research teams.</div>
</li>
<li class="ace-line ace-line old-record-id-LQ2ld62doocvLBx7kmcllm9ngrf" data-list="bullet">
<div><strong>Requires in-office collaboration with the team 5 days per week.</strong></div>
</li>
</ul>
<div class="ace-line ace-line old-record-id-VMPddqZK4oKA6nx9J89lRUNSgBd"><strong>Bonus Qualifications</strong></div>
<ul class="list-bullet1">
<li class="ace-line ace-line old-record-id-N2RvdZOt6oM81UxP3EUlwEZvgae" data-list="bullet">
<div>Experience with robotics data collection hardware — teleoperation devices, UMI, GELLO, or similar — and the synchronization and preprocessing challenges they introduce.</div>
</li>
<li class="ace-line ace-line old-record-id-LsMZdfMfhoq6TVxnuEDldELmgWh" data-list="bullet">
<div>Familiarity with robot learning pipelines: imitation learning, behavior cloning, or VLA/VLM fine-tuning workflows.</div>
</li>
<li class="ace-line ace-line old-record-id-SdjldPu11oNJEOxhYYMlC4TkgXB" data-list="bullet">
<div>Experience building evaluation or experiment tracking infrastructure (e.g., Weights &amp; Biases, MLflow, custom rollout loggers).</div>
</li>
<li class="ace-line ace-line old-record-id-TT7Idxpmwo9VZ2xFGIqlkKHAgKg" data-list="bullet">
<div>Proven ability to design annotation tooling or human-in-the-loop labeling systems for structured or multimodal data.</div>
</li>
</ul>
<div class="ace-line ace-line old-record-id-TKnndD8KKooWyrxJsXSlqevIgDf"><strong>Benefits</strong></div>
<ul class="list-bullet1">
<li class="ace-line ace-line old-record-id-AHqyd1aX6ohZnqx6Odil4fr1gEc" data-list="bullet">
<div>Competitive stock options/equity programs.</div>
</li>
<li class="ace-line ace-line old-record-id-Hd6AdCk4Yo0EztxqcEHlNsjXgIc" data-list="bullet">
<div>Health, dental, and vision insurance, plus a 401(k) plan.</div>
</li>
<li class="ace-line ace-line old-record-id-U27WduxWfoOAwSx9SH9l7aO1g0w" data-list="bullet">
<div>Visa sponsorship and green card support for qualified candidates.</div>
</li>
<li class="ace-line ace-line old-record-id-C8fVdayw0oF2A0xA0yzl5jDngGh" data-list="bullet">
<div>Lunches and dinners, a fully stocked kitchen, and regular team-building events.</div>
</li>
</ul>
</div>