By Bethany Abbate and Paul Lekas

As the AI landscape rapidly evolves, the safety of advanced foundation models has become a core focus for both policymakers and the private sector. The U.S. AI Safety Institute (AISI), launched earlier this year within the National Institute of Standards and Technology (NIST), offered a first glimpse into how the U.S. government is approaching AI safety with its inaugural publication, a draft of NIST AI 800-1, Managing Misuse Risk for Dual-Use Foundation Models.

We view this draft as an important first step toward developing a framework of voluntary best practices to address misuse risk in foundation models. We particularly appreciate the clarity with which AISI has outlined the key challenges associated with this effort, including the difficulty of predicting performance at scale, the unclear relationship between measured capabilities and potential harms, nascent methods for evaluating safeguards, the difficulty of profiling malicious actors, and more. Given these challenges, our September 9 submission offers constructive recommendations to help the final version of 800-1 serve as an effective resource for AI developers, policymakers, and other stakeholders.

Here are some of the key concerns SIIA raised in our response:

1. Limitations in Measurement Science and Technical Guidance

The NIST AI 800-1 draft acknowledges significant challenges in assessing and mitigating misuse risk, among them the absence of technical standards and gaps in measurement science. Recognizing these limitations is a start, but the document does not go far enough in adjusting its recommendations to reflect the technical and practical feasibility of carrying them out. Without robust technical guidance, many of the proposed best practices remain aspirational rather than actionable, which could set unrealistic expectations for AI developers and ultimately undermine the effectiveness of the guidelines.

SIIA’s recommendation: AISI should distinguish recommendations that depend on future research and the establishment of technical standards, labeling them “aspirational,” from recommendations that developers can reasonably and usefully implement today. NIST AI 800-1 should clearly indicate which recommendations fall into the aspirational category, and AISI should ensure that future iterations of the document align with what is technically feasible at the time of publication.

2. Uniform Approach to Foundation Models

Dual-use foundation models vary widely in architecture, openness, and intended application, yet the current draft of NIST AI 800-1 treats them as largely uniform. Misuse risks differ depending on factors such as how open a model is, who has access to its weights or API, and how it is intended to be used. Treating all models the same invites blanket recommendations that are not appropriate for every system.

SIIA’s recommendation: NIST AI 800-1 should adopt a nuanced, risk-based treatment of dual-use foundation models, with guidance tailored to specific types of models. A marginal-risk framework, for example, could differentiate between the risks posed by widely open models and those that are more tightly controlled.

3. Insufficient Focus on the Full AI Value Chain

The NIST AI 800-1 draft primarily targets model developers when, in reality, managing misuse risk requires appropriate responsibility from actors across the AI value chain, including deployers and the intermediaries between developers and deployers. Those who deploy or use AI systems, whether in academia, private industry, or government, also play a crucial role in mitigating risk. A narrow focus on developers overlooks the importance of post-deployment oversight and the shared responsibility of all AI stakeholders.

SIIA’s recommendation: AISI should expand the scope of NIST AI 800-1 to reflect the collaborative nature of AI risk management, recognizing the roles of all actors in the AI ecosystem. Effective risk management should span the entire lifecycle of AI systems, from development through deployment and ongoing use. Stakeholders in the AISI Consortium should have the opportunity to provide additional input as the draft proceeds toward final publication; we recommend that this work be undertaken in the appropriate Consortium task force so that its members can work collaboratively with AISI to address feedback and improve the final product.

4. Downstream Implications and Regulatory Pressures

NIST is widely viewed as a global thought leader, and lawmakers increasingly rely on its recommendations as a baseline for crafting public policy. Although NIST is not a regulatory body, a further concern with the 800-1 draft is that its recommendations could be incorporated into regulation without accounting for the difficulties of implementing them as currently written. For this reason, we recommend that AISI take care to manage expectations in light of what is achievable given the significant challenges acknowledged early in the draft.

Moreover, certain of the best practices recommended in the 800-1 draft should be revisited through an adversarial lens. For example, the recommendation that foundation model developers disclose detailed information about a model’s safeguards could hand malicious actors a roadmap for exploitation. Assessing each practice in this way will help avoid recommending actions that carry significant legal, operational, or societal risk.

SIIA’s recommendation: When refining the draft and reviewing public comments from stakeholders, AISI should consider the downstream effects of its recommendations, ensuring that they are not merely aspirational but practical and implementable, and that they do not create countervailing risks. AISI should also weigh the impact on innovation against the intended outcome of each proposed action: recommendations that are too stringent could stifle innovation over the long run, particularly in the open-source community, which has been a vital contributor to AI research and development.

Final Thoughts: A Call for Refinement and Collaboration

SIIA appreciates the effort AISI has put into the 800-1 draft and values AISI’s leadership in advancing AI safety and security. We are committed to continued advocacy for codifying AISI in law and providing it with the resources it needs to achieve its objectives in the United States and to enhance U.S. leadership in AI on the global stage. However, before finalizing NIST AI 800-1, we urge AISI to:

  • Leverage the expertise of the AISI Consortium to develop, red-team, and fine-tune the draft. Quality matters more than speed: it is better to get the draft right than to rush out a product with unintended consequences.
  • Clarify where measurement science and technical standards are still lacking, and label the affected recommendations accordingly.
  • Tailor guidance to the specific characteristics of dual-use foundation models.
  • Consider the potential regulatory implications and the feasibility of compliance.

By addressing these concerns, AISI can ensure that NIST AI 800-1 serves as an effective resource for managing AI misuse risk while promoting innovation and responsible development. SIIA remains committed to working with NIST and other stakeholders to shape a balanced approach to AI safety, one that accounts for the diverse landscape of AI systems and the real-world challenges faced by developers and users alike.