The T2T scoring pipeline is currently offline for maintenance. Submissions for T2T Round 3 can be uploaded but will not be scored before Monday, Dec. 9, 11 a.m. EST.

Assessing Risks and Impacts of AI

A compelling set of scenarios will aim to explore risks and related impacts across three levels of testing: model testing, red-teaming, and field testing.

AI Challenge Problem Overview

ARIA, the latest in a portfolio of evaluations managed by the NIST Information Technology Laboratory, will assess models and systems submitted by technology developers from around the world. The ARIA evaluation environment is sector- and task-agnostic.

ARIA will support three evaluation levels: model testing, red-teaming, and field testing. ARIA is unique in that it will move beyond an emphasis on system performance and accuracy to produce measurements of technical and contextual robustness.

The program will result in guidelines, tools, methodologies, and metrics that organizations can use for evaluating their systems and informing decision making regarding positive or negative impacts of AI deployment. ARIA will inform the work of the U.S. AI Safety Institute at NIST.

ARIA 0.1

The initial evaluation (ARIA 0.1) will be conducted as a pilot effort to fully exercise the NIST ARIA test environment. ARIA 0.1 will focus on risks and impacts associated with large language models (LLMs). Future iterations of ARIA may consider other types of generative AI technologies such as text-to-image models, or other forms of AI such as recommender systems or decision support tools. A compelling and exploratory set of tasks will aim to elicit pre-specified (and non-specified) risks and impacts across three levels of testing: model testing, red-teaming, and field testing.

[Image: important dates for ARIA 0.1]
