AI Evaluations Research Scientist

RAND

Boston (MA)

Hybrid

USD 115,000 - 247,000

Full time

13 days ago

Job summary

RAND is seeking mission-driven AI Evaluations Research Scientists to join their Technology and Security Policy Center in Boston. This role focuses on evaluating AI capabilities relevant to national security, requiring strong analytic skills and proficiency in Python. Candidates will contribute to multidisciplinary teams addressing high-consequence risks associated with AI systems, with opportunities for career growth.

Benefits

Health insurance coverage

Life and disability insurance

Savings plan

Paid time-off

Qualifications

Strong interest in AI capability evaluations.
Proficiency in Python is required.
Familiarity with machine learning and computational infrastructure.

Responsibilities

Develop threat models for AI risks.
Design objective evaluations of AI capabilities.
Communicate research results to policymakers.

Skills

Analytic skills

Communication skills

Interest in national security risks

Proficiency in Python

Familiarity with AI systems

Education

PhD in relevant field

Master’s degree with 3 years of experience

Bachelor’s degree with 5 years of experience

Job Type:

Regular

Overview

RAND’s Technology and Security Policy Center (TASP) is seeking mission-driven AI Evaluations Research Scientists to develop and execute research projects and engineering efforts within our AI Capability Evaluations (ACE) team.

RAND's reputation for excellence is built on our commitment to high-quality, rigorous analysis and objectivity. TASP is at the forefront of research and implementation regarding the impact of high-consequence, dual-use technologies, such as artificial intelligence and biotechnology, on global competition and security. Our research has been used by the White House, government departments, the EU and UK governments, and industry leaders, among others. Our alumni have gone on to important roles at the NSC, Commerce, DOD, Congress, Google DeepMind, OpenAI, EU AI Office, UK AISI, and other key think tanks, and have founded mission-driven tech initiatives.

ACE develops and conducts evaluations of national-security-relevant capabilities of frontier AI systems, with a current focus on the intersection of large language models (LLMs) and AI agents with biological risk. We’re hiring people with research science and/or research engineering skills to play a key role in work that assists public policymakers at all levels in strengthening national security and mitigating catastrophic risks enabled by AI systems. They will work on complex problems at the intersection of AI and national security where technical details matter and will contribute to multidisciplinary project teams that include biosecurity experts, machine learning engineers, and policy researchers.

This position is initially structured as a focused 1-year appointment to create the urgency needed to drive ambitious change in this rapidly evolving field. Every day of your tenure will count toward that goal. The appointment may be renewed for up to a total of 3 years, with options for longer-term employment at RAND thereafter. Full-time and part-time (at least 20 hours per week) schedules will be considered, but with a strong preference for full-time.

Responsibilities

Given the breadth of valuable work our team could do, there is some ability to align responsibilities with an individual’s skills, interests, and career goals, including in terms of the balance of research scientist- versus research engineer-style responsibilities. Responsibilities may include but are not limited to:

Contribute to developing concrete threat models for high-consequence AI risks, working with internal and external partners

Design and execute rigorous, objective evaluations of AI capabilities relevant to key bottlenecks within those threat models

Develop and maintain the technical infrastructure required to support this research, working with relevant internal and external IT stakeholders

Develop and maintain code for fundamental evaluation components that can be used across research efforts (e.g., prompting, automated grading, statistical analysis)

Keep up to date with the latest advances in AI evaluation engineering and the science of evaluations to continually improve the rigor and efficiency of our evaluations

Contribute to setting strategic and research priorities, with an emphasis on the policy impact of evaluations

Communicate research results to policymakers and other key stakeholders at all levels through written products and oral presentations

A successful candidate could grow into leading a team and/or mentoring more junior staff.

Qualifications
All research positions at RAND require excellent analytic skills; the ability to communicate clearly and effectively in English, both orally and in writing; the ability to work effectively as a member of a multi-disciplinary team; and a strong commitment to RAND's core values of quality and objectivity.

Other required qualifications:

Strong interest in understanding and addressing potential national security risks related to autonomy or high-consequence misuse of LLMs and AI agents, and in AI capability evaluations as a route to impact

Proficiency in Python

Familiarity with technical aspects of AI systems and related technologies, such as machine learning, computational infrastructure, or information security

Preferred but not required:

Experience with evaluations and evaluation frameworks for LLMs and AI agents (e.g., Inspect)

Experience with LLM elicitation techniques (e.g., fine-tuning, retrieval-augmented generation, tool-use integration, agent scaffolding)

Experience working on ML model development/deployment or working at/with leading AI companies

Experience with cloud computing, in particular Azure and AWS, including government cloud environments

Familiarity with common LLM frameworks (e.g., LangChain, LlamaIndex)

Aptitude for project management and/or mentorship

Strong communication skills, both written and verbal, tailored to technical and non-technical audiences, or the ability to rapidly develop them

Experience in government, intelligence community, other relevant decision-making offices, or policy analysis roles

Education Requirements
RAND is hiring for this role at associate, specialist, and expert levels of experience. Minimum education requirements at the associate level include:

A PhD in a relevant field. This can include Artificial Intelligence, Machine Learning, Computer Science, Cybersecurity, Electrical Engineering, Physics, Mathematics, Engineering and Public Policy, Security Studies, or similar.

A Master’s degree in the fields listed above with at least 3 years of relevant professional experience.

A Bachelor’s degree in the fields listed above with at least 5 years of relevant professional experience.

Security Clearance
Ability to obtain and maintain a U.S. security clearance, including having U.S. citizenship, is preferred but not required.

Location

We are actively hiring for this position in Washington, DC; San Francisco, CA; Boston, MA; Santa Monica, CA; and Pittsburgh, PA. San Francisco and especially DC are preferred. We offer a hybrid work arrangement, combining work from home and on-site options. Fully remote work will also be considered.

Term

This position is a 1-year term appointment with a possibility of renewal for up to 3 years total, alongside options for longer-term employment.

Application

Applications must include:

A detailed resume highlighting relevant academic and professional experience.

A writing sample demonstrating analytical and communication skills. This sample may be a recent, previously written paper or report (e.g., journal article, master’s thesis or paper written for coursework, prior employment, or internship). Applicants whose study and work experience (e.g., model development) has not involved producing written products that are shareable may submit a short, written summary (i.e., less than one page) of one or more recent products they have developed.

A code sample.

A cover letter which contains only responses to each of the following prompts:

1) Summarize in .

2) Describe in g infrastructure project you may want to pursue in this role. For a research direction: Describe what questions you would try to answer, what methods you would use, how many months of work would be required from you and/or colleagues, and what outcomes this research might help achieve (e.g., what important policy decisions it might inform). For an infrastructure project: You may make guesses about our goals and existing infrastructure, and propose a way you might help improve that, noting how you would implement that, how many months of work may be required from you and/or colleagues, and why this might be useful. This is just an assessment step and does not mean you would definitely work on this if hired.

Salary Range: $115,400 - $246,600

Visiting Technical Associate = $115,400 - $167,300

Visiting Technical Specialist = $137,000 - $209,000

Visiting Technical Expert = $157,800 - $246,600

RAND considers a variety of factors when formulating an offer, including the specific role responsibilities; a candidate’s work experience, education/training, skills, expertise; and internal equity. In addition, RAND provides strong benefits including health insurance coverage, life and disability insurance, a savings plan, paid time-off, and more.

Equal Opportunity Employer
