Research Engineer

Posted 9 Days Ago
New York, NY
Mid level
Artificial Intelligence
The Role
Develop and implement advanced methods for optimizing AI applications, focusing on dynamic testing, reliable evaluation models, and data generation. Collaborate with customers to adapt tooling for various domains.
Summary Generated by Built In

Haize Labs gets LLM apps out of POCs and into production. We eliminate the risk and improve the reliability of LLM apps by haizing them -- i.e. rigorously, proactively, and continuously fuzz-testing them.

We are looking for Research Engineers to help develop our reliability platform, with a focus on:

  1. Data-efficient alignment of evaluation models
  2. Dynamic testing of AI applications
  3. Observability and anomaly detection 
  4. Discrete optimization (with applications in architecture search and automated prompting)

Our work is both intellectually stimulating and in high practical demand. Your work will result in net-new primitives, frameworks, and algorithms for developing robust LLM applications. Your work will directly influence how LLM apps are tested, verified, and deployed everywhere. You will directly influence how the world responsibly uses LLMs.

Responsibilities

  • Develop optimization, synthetic data generation, and fuzzing methods for breaking LLM systems.
  • Implement complex automated evaluation models and systems.
  • Go from research idea to code within hours, and iterate quickly on experiments and data.
  • Work directly with customers to adapt our tools for different domains.

Qualifications

  • First-author publications in top-tier ML venues (NeurIPS, ICML, ICLR, and others).
  • Bias for action & experimentation over philosophizing (though philosophizing is sometimes good).
  • Not interested in printing papers for papers' sake.
  • Some production engineering experience (e.g. ML in an applied setting). No spaghetti research code!
  • Some familiarity with ideas from active learning, weak supervision, synthetic data, functional verification, reinforcement learning, reward modeling, automated evaluation. A subset of this is fine.

Annual Salary

$150,000 – $600,000 USD

Logistics

  • Location policy: In NYC.
  • US visa sponsorship: If you are exceptional, we will sponsor.
  • Compensation and Benefits: We provide generous salary, equity, and benefits.

We're Not Here to Play Games.

We're not here to write GPT wrappers or get rich quick off the AI bubble. We're here to solve the hardest problem in AI: making it safe, reliable, and production-ready. 

Since our company's inception in 2024, we've amassed amazing customers like OpenAI, Anthropic, AI21, and several others. We've developed best-in-class tooling for evaluation, dynamic testing, red-teaming, observability, and continuous robustification. And we’re backed + advised by the founders of Cognition, Hugging Face, Weights and Biases, Nous, Etched, Okta, Replit and C-suite execs from Google, Stripe, Databricks, Robinhood, and more.

Our core team is exceptionally fit for this mission. We turned down Stanford PhDs, got into & rejected Y Combinator, wrote ML-guided matchmaking for 50,000+ students, built an educational nonprofit supporting 60 countries, and did some other cool things along the way. Our early hires include an MIT PhD with 21,000+ Physics/ML/Stats citations, a Datadog engineering manager who led their GenAI observability team, a Citadel quant with a huge open-source presence, and more.

We can only serve our mission with an incredibly high talent-density team. Come here to push yourself, learn fast, experience excellence, grow with each other, and pursue your life's work.

Top Skills

Active Learning
Automated Evaluation
Functional Verification
Fuzz Testing
Machine Learning
Optimization
Reinforcement Learning
Synthetic Data
Weak Supervision

The Company
HQ: New York City, New York
10 Employees
On-site Workplace

What We Do

Haize Labs is the robustness and reliability layer underlying any AI model, in any industry, for any use case.

To solve the robustness problem, we haize (i.e. stress-test and red-team) AI models to preemptively discover all their failure modes before production.
