Research Engineer, Model Performance & Quality (San Francisco) Job at The Rundown AI, Inc., San Francisco, CA

  • The Rundown AI, Inc.
  • San Francisco, CA

Job Description

About the role

As a Research Engineer on the Model Performance team, you will help solve one of our greatest challenges: systematically understanding and monitoring model quality in real time. This role blends research and engineering responsibilities, requiring you to train production models, develop robust monitoring systems, and create novel evaluation methodologies.

Representative Projects

  • Build comprehensive training observability systems - Design and implement monitoring infrastructure to track how model behaviors evolve throughout training.
  • Develop next-generation evaluation frameworks - Move beyond traditional benchmarks to create evaluations that capture real-world utility.
  • Create automated quality assessment pipelines - Build custom classifiers to continuously monitor RL transcripts for complex issues.
  • Bridge research and production - Partner with research teams to translate cutting-edge evaluation techniques into production-ready systems, and work with engineering teams to ensure our monitoring infrastructure scales with increasingly complex training workflows.

You may be a good fit if you:

  • Are proficient in Python and have experience building production ML systems
  • Have experience with training, evaluating, or monitoring large language models
  • Are naturally curious about debugging complex, distributed systems and thinking about failure modes
  • Enjoy collaborative problem-solving and working across diverse teams - you'll work on virtually all stages of our model training pipeline
  • Can balance research exploration with engineering rigor
  • Have strong analytical skills for interpreting training metrics and model behavior
  • Want to directly impact the quality and safety of deployed AI systems

Strong candidates may have:

  • Experience with reinforcement learning and language model training pipelines
  • Experience designing and implementing evaluation frameworks or benchmarks
  • Background in production monitoring, observability, and incident response
  • Experience with statistical analysis and experimental design
  • Knowledge of AI safety and alignment research

Strong candidates need not have:

  • Formal certifications or education credentials
  • Academic research experience or publication history
  • Prior experience in AI safety or evaluation specifically

We're looking for thoughtful engineers who are excited about the challenge of measuring and monitoring capabilities we're still discovering. This role offers the opportunity to shape how the field approaches model quality assessment while working on systems that will be critical as AI capabilities continue to advance.

Job Tags

Full time
