Controls & Evidence
A NIST AI RMF readiness review evaluates your controls across ten domains. For each domain, reviewers look for evidence that controls are designed properly and operating effectively. The core control domains are listed below with minimum requirements and example evidence artifacts.
What reviewers look for: Reviewers don't just check that policies exist. They verify that controls are operating as described, that evidence is produced on schedule, and that gaps are tracked and remediated. The evidence examples below show what "operating effectiveness" looks like in practice.
Comprehensive catalog of all AI systems with risk classification, purpose documentation, and stakeholder identification.
Requirements
- Catalog all AI systems with unique identifiers
- Classify each system by a defined risk tier (for example: minimal, limited, high, unacceptable)
- Document purpose and intended use for each system
- Identify stakeholders and affected populations
- Update the inventory on a defined cadence (for example, quarterly) and whenever a system changes
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| AI system registry with unique identifiers and descriptions | AI Governance Lead | Quarterly |
| Risk classification records with tier justification | Risk Management | Annually or on system change |
| Stakeholder analysis and affected population mapping | AI Governance Lead | Annually |
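
The inventory requirements above can be sketched as a small record type plus a gap check. This is a minimal illustration, not a prescribed schema: the field names, the `AIS-0001` identifier style, and the 92-day review window (roughly one quarter) are assumptions chosen to mirror the requirements and the quarterly registry cadence in the table.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class RiskTier(Enum):
    MINIMAL = "minimal"
    LIMITED = "limited"
    HIGH = "high"
    UNACCEPTABLE = "unacceptable"


@dataclass
class AISystemRecord:
    system_id: str                     # unique identifier (hypothetical scheme, e.g. "AIS-0001")
    name: str
    purpose: str                       # documented purpose and intended use
    risk_tier: RiskTier
    tier_justification: str
    stakeholders: List[str] = field(default_factory=list)
    affected_populations: List[str] = field(default_factory=list)
    last_reviewed: Optional[date] = None


def find_registry_gaps(records, today, max_age_days=92):
    """Return a list of human-readable gaps found in the registry."""
    gaps = []
    seen_ids = set()
    for rec in records:
        if rec.system_id in seen_ids:
            gaps.append(f"{rec.system_id}: duplicate identifier")
        seen_ids.add(rec.system_id)
        if not rec.purpose.strip():
            gaps.append(f"{rec.system_id}: purpose not documented")
        if not rec.tier_justification.strip():
            gaps.append(f"{rec.system_id}: risk tier lacks justification")
        if not rec.stakeholders or not rec.affected_populations:
            gaps.append(f"{rec.system_id}: stakeholders or affected populations missing")
        # Assumed cadence: a review older than max_age_days counts as overdue.
        if rec.last_reviewed is None or (today - rec.last_reviewed).days > max_age_days:
            gaps.append(f"{rec.system_id}: inventory review overdue")
    return gaps
```

Running a check like this before each review cycle gives a dated list of open gaps, which is the kind of tracked-and-remediated evidence reviewers ask for.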
Structured methodology for categorizing AI system risk levels and applying tiered governance requirements.
Requirements
- Risk categorization methodology documented and approved
- Factors include potential harm severity, reversibility, scale of impact, and automation level
- Tiered governance requirements established (higher risk = more oversight)
- Alignment with organizational risk tolerance documented
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Risk categorization framework document | Risk Management | Annually |
| Tier assignment records with justification for each AI system | AI Governance Lead | Quarterly or on system change |
| Risk tolerance documentation and board-level approval | Executive Leadership | Annually |
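
One way to make a categorization methodology repeatable is to score the four listed factors and map the total to a tier. The sketch below is illustrative only: the 1 to 4 scales, the thresholds, and the rule that severe harm always yields the top tier are assumptions, and a real methodology would take them from the organization's approved risk tolerance. The "unacceptable" tier is not scored here because it is treated as a policy decision rather than a score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskFactors:
    harm_severity: int      # 1 (negligible) .. 4 (severe)
    irreversibility: int    # 1 (easily reversed) .. 4 (permanent)
    scale_of_impact: int    # 1 (few people) .. 4 (population-wide)
    automation_level: int   # 1 (advisory only) .. 4 (fully automated)


def assign_tier(factors: RiskFactors) -> str:
    """Map factor scores to a tier using assumed, illustrative thresholds."""
    scores = (
        factors.harm_severity,
        factors.irreversibility,
        factors.scale_of_impact,
        factors.automation_level,
    )
    if any(not 1 <= s <= 4 for s in scores):
        raise ValueError("each factor must be scored from 1 to 4")
    total = sum(scores)
    # Assumed rule: severe potential harm lands in the top tier on its own.
    if factors.harm_severity == 4 or total >= 13:
        return "high"
    if total >= 8:
        return "limited"
    return "minimal"
```

Recording the factor scores alongside the resulting tier gives each assignment the justification that the tier assignment records call for.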
Systematic analysis of AI system impacts on stakeholders, including harm identification and deployment context evaluation.
Requirements
- Stakeholder impact analysis completed for each AI system
- Context-of-use documentation maintained
- Identification of potential harms at individual, group, societal, and environmental levels
- Assessment of deployment context distinguishing high-stakes from low-stakes use
- Comparative analysis of AI versus non-AI alternatives
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Impact assessment reports per AI system | AI Governance Lead | Prior to deployment and annually |
| Stakeholder engagement records and feedback | Product Management | Prior to deployment |
| Context analysis documents with deployment environment details | AI Governance Lead | Annually or on context change |
Pre-deployment and ongoing testing for bias across demographic groups with defined fairness metrics and remediation procedures.
Requirements
- Bias testing across demographic groups completed before deployment
- Fairness metrics defined and documented (demographic parity, equalized odds, etc.)
- Disparate impact analysis performed
- Ongoing monitoring for bias drift in production
- Remediation procedures established for detected bias
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Bias test reports with demographic group analysis | Data Science / ML Engineering | Prior to deployment and quarterly |
| Fairness metric dashboards with threshold definitions | AI Governance Lead | Continuous monitoring |
| Demographic impact analysis documentation | Data Science / ML Engineering | Prior to deployment and on model retrain |
| Bias remediation records and outcome tracking | AI Governance Lead | As needed |
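
The fairness metrics named in the requirements can be computed directly from predictions and group labels. The following is a minimal sketch for binary predictions (1 = favorable outcome), written without external libraries. The function names are illustrative, and any pass/fail threshold (such as the commonly cited four-fifths rule of thumb for the disparate impact ratio) is an assumption that your own documented fairness criteria should replace.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Fraction of positive predictions within each group."""
    positives, totals = defaultdict(int), defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(y_pred, groups):
    """Gap between the highest and lowest group selection rates."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())


def disparate_impact_ratio(y_pred, groups):
    """Lowest group selection rate divided by the highest."""
    rates = selection_rates(y_pred, groups)
    highest = max(rates.values())
    # If no group receives positive predictions, there is no gap to report.
    return min(rates.values()) / highest if highest > 0 else 1.0


def _conditional_rates(y_true, y_pred, groups, label):
    """P(pred = 1 | true = label) per group; groups with no such rows are skipped."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == label:
            totals[group] += 1
            hits[group] += int(pred == 1)
    return {g: hits[g] / totals[g] for g in totals}


def equalized_odds_difference(y_true, y_pred, groups):
    """Largest between-group gap in true positive rate or false positive rate."""
    gaps = []
    for label in (1, 0):  # label 1 gives TPR, label 0 gives FPR
        rates = _conditional_rates(y_true, y_pred, groups, label)
        if rates:
            gaps.append(max(rates.values()) - min(rates.values()))
    return max(gaps) if gaps else 0.0


# Toy data for illustration only.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(demographic_parity_difference(y_pred, groups))      # 0.5
print(equalized_odds_difference(y_true, y_pred, groups))  # 0.5
```

Storing these values per model version, together with the thresholds they were judged against, produces the bias test reports and dashboard inputs listed in the table.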
Continuous monitoring of AI system accuracy, drift detection, robustness testing, and reliability SLAs.
Requirements
- Accuracy and performance benchmarks established for each AI system
- Production monitoring for model drift and degradation
- Robustness testing against adversarial inputs and edge cases
- Reliability metrics and SLAs defined
- Automated alerting for performance degradation
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Performance benchmark records with baseline metrics | ML Engineering | Prior to deployment and on retrain |
| Drift detection dashboards and alert logs | ML Engineering / SRE | Continuous |
| Robustness test results (adversarial, edge case) | ML Engineering | Prior to deployment and quarterly |
| SLA documentation and uptime reports | SRE / Operations | Monthly |
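
Drift monitoring needs a concrete statistic to alert on. One commonly used option is the Population Stability Index (PSI), which compares the binned distribution of a feature or score in production against a baseline. The sketch below is one possible implementation, not a required method: the equal-width binning, the `eps` floor for empty bins, and the alert thresholds mentioned afterwards are assumptions.

```python
import math


def population_stability_index(baseline, current, bins=10, eps=1e-6):
    """PSI = sum over bins of (current_share - baseline_share) * ln(current_share / baseline_share)."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    low, high = min(baseline), max(baseline)
    width = (high - low) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            index = int((v - low) / width)
            # Values outside the baseline range are clamped into the edge bins.
            counts[min(max(index, 0), bins - 1)] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    expected = shares(baseline)
    actual = shares(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Toy example: the production sample has shifted toward higher values.
baseline_scores = [i / 100 for i in range(100)]
production_scores = [0.5 + i / 200 for i in range(100)]
psi = population_stability_index(baseline_scores, production_scores)
print(f"PSI = {psi:.2f}")
```

A widely quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift, but these cutoffs are conventions rather than requirements, so the thresholds wired into automated alerting should be documented alongside the SLA.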
Model cards, user-facing disclosures, explanation methods, and documentation of AI system limitations.
Requirements
- Model cards or system cards maintained for each AI system
- Explanation methods appropriate to audience (technical, user, affected person)
- Disclosure of AI use to end users
- Documentation of known limitations and failure modes
- Interpretability requirements scaled by risk tier
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Model cards or system cards for each AI system | ML Engineering / AI Governance Lead | On deployment and on material change |
| User-facing AI disclosure statements | Product / Legal | On deployment |
| Explainability documentation with audience-appropriate methods | ML Engineering | On deployment and annually |
| Limitation and failure mode disclosures | ML Engineering | On deployment and on material change |
Human-in-the-loop mechanisms, override capabilities, accountability chains, and appeal processes for AI decisions.
Requirements
- Human-in-the-loop required for high-risk decisions
- Override mechanisms documented and tested
- Clear accountability chain defining responsibility for AI decisions
- Escalation paths for AI system failures
- Appeal mechanisms available for affected individuals
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Human oversight procedures and decision authority matrix | AI Governance Lead | Annually |
| Override mechanism logs and testing records | Operations / ML Engineering | Quarterly |
| RACI matrix for AI system accountability | AI Governance Lead | Annually |
| Appeal process documentation and resolution records | Compliance / Legal | Ongoing |
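
The human-in-the-loop and override requirements can be enforced in code at the point where a model output becomes a final decision. The sketch below is a simplified illustration: the `Decision` type, the `finalize` function, and the log fields are hypothetical, and it assumes the risk tier of the system is already known from the inventory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Decision:
    system_id: str
    risk_tier: str        # e.g. "minimal", "limited", "high"
    model_output: str


def finalize(decision: Decision, audit_log: list,
             reviewer: Optional[str] = None,
             override: Optional[str] = None) -> str:
    """Return the final outcome and append an audit entry for it."""
    if decision.risk_tier == "high" and reviewer is None:
        raise PermissionError("high-risk decisions require a named human reviewer")
    if override is not None and reviewer is None:
        raise PermissionError("overrides must be attributed to a reviewer")
    final_output = override if override is not None else decision.model_output
    audit_log.append({
        "system_id": decision.system_id,
        "model_output": decision.model_output,
        "final_output": final_output,
        "reviewer": reviewer,
        "overridden": override is not None,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return final_output
```

Each log entry records who was accountable and whether the model was overridden, which supports both the override testing records and the accountability chain.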
Training data quality, provenance tracking, consent management, labeling QA, and synthetic data governance.
Requirements
- Training data quality assessed for accuracy, completeness, and representativeness
- Data provenance and lineage tracking from source to model
- Consent and authorization verified for data use in AI training and inference
- Data labeling quality assurance processes established
- Synthetic data generation and use governed
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Data quality reports with representativeness analysis | Data Engineering / Data Science | Per training cycle |
| Data provenance and lineage records | Data Engineering | Per training cycle |
| Consent and authorization documentation for training data | Legal / Privacy | Prior to data acquisition |
| Labeling QA audit records and inter-annotator agreement scores | Data Science | Per labeling campaign |
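
Inter-annotator agreement scores, listed above as labeling QA evidence, are often reported as Cohen's kappa for two annotators. A minimal sketch follows. It assumes both annotators labeled the same items in the same order, and the example labels are invented for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


annotator_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos"]
print(cohens_kappa(annotator_1, annotator_2))  # 0.5
```

Here the annotators agree on 6 of 8 items (0.75) while chance agreement is 0.5, giving a kappa of 0.5. The acceptance threshold for a labeling campaign is a judgment call that should be set and recorded in the QA process.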
Version control, validation gates, deployment approvals, production maintenance, and retirement procedures for AI models.
Requirements
- Version control for models and training data
- Validation and testing gates before deployment
- Deployment approval process with sign-off authority
- Production monitoring and maintenance procedures
- Retirement and decommissioning procedures
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Model version registry with training data references | ML Engineering | Per model version |
| Validation test results and gate approval records | ML Engineering / QA | Per deployment |
| Deployment approval sign-offs | AI Governance Lead / Engineering Manager | Per deployment |
| Decommission records with rollback and data retention details | ML Engineering | As needed |
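
A validation gate combined with a sign-off check can be expressed as a single function that either passes or returns the reasons it failed. This is a sketch under assumptions: the metric names, minimum values, and role names are placeholders, and a real gate would read them from the model's documented benchmarks and the deployment approval process.

```python
def evaluate_gate(metrics, minimums, approvals, required_roles):
    """Return (passed, reasons).

    metrics:        dict of metric name -> measured value
    minimums:       dict of metric name -> minimum acceptable value
    approvals:      dict of role -> name of the approver who signed off
    required_roles: iterable of roles whose sign-off is mandatory
    """
    reasons = []
    for name, minimum in minimums.items():
        value = metrics.get(name)
        if value is None:
            reasons.append(f"{name}: metric not reported")
        elif value < minimum:
            reasons.append(f"{name}: {value} is below the minimum {minimum}")
    for role in sorted(set(required_roles) - set(approvals)):
        reasons.append(f"{role}: sign-off missing")
    return (not reasons, reasons)


passed, reasons = evaluate_gate(
    metrics={"accuracy": 0.93, "recall": 0.81},
    minimums={"accuracy": 0.90, "recall": 0.85},
    approvals={"ml_engineering": "a.lee"},
    required_roles={"ml_engineering", "ai_governance"},
)
print(passed)   # False
print(reasons)
```

Saving the returned tuple with the model version identifier yields a gate approval record for each deployment attempt, including the failed ones.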
Assessment, contractual controls, and ongoing monitoring for AI systems provided by vendors and third parties.
Requirements
- Vendor AI system assessment completed before adoption
- Contractual requirements for AI transparency and data handling
- Ongoing monitoring of vendor AI system performance
- Supply chain risk assessment for AI components (models, data, infrastructure)
- Incident reporting requirements established for vendor AI systems
Evidence Examples
| Artifact | Owner | Frequency |
| --- | --- | --- |
| Vendor AI assessment questionnaires and evaluation results | Procurement / AI Governance Lead | Prior to adoption and annually |
| Contract provisions for AI transparency and incident reporting | Legal / Procurement | Per contract cycle |
| Vendor AI performance monitoring reports | AI Governance Lead / Operations | Quarterly |
| AI supply chain risk register | Risk Management | Annually |
Evidence Naming Conventions
Organized, traceable evidence is critical for a smooth review. Adopting a consistent convention makes evidence retrieval faster and reduces friction.
Recommended format:
ControlID_System_ArtifactType_YYYY-MM-DD_Period_Owner_v#
Key principles for evidence management:
- Centralized repository with access control and version history
- Consistent naming across all control domains and artifact types
- Defined cadence for each evidence type: event-driven, monthly, quarterly, or annual
- Immutable exports where possible to demonstrate evidence integrity
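
A naming convention is easier to keep consistent when names are generated and validated by code rather than typed by hand. The sketch below builds and parses names in the recommended format. The restriction that fields contain only letters, digits, and hyphens is an assumption (it keeps the underscore unambiguous as a separator), and the control ID and other values in the example are made up.

```python
import re
from datetime import date

_FIELD = r"[A-Za-z0-9-]+"  # assumed rule: no underscores inside a field
_PATTERN = re.compile(
    rf"^(?P<control_id>{_FIELD})_(?P<system>{_FIELD})_(?P<artifact_type>{_FIELD})_"
    rf"(?P<date>\d{{4}}-\d{{2}}-\d{{2}})_(?P<period>{_FIELD})_(?P<owner>{_FIELD})_v(?P<version>\d+)$"
)


def build_evidence_name(control_id, system, artifact_type, as_of, period, owner, version):
    """Assemble ControlID_System_ArtifactType_YYYY-MM-DD_Period_Owner_v# and validate it."""
    name = f"{control_id}_{system}_{artifact_type}_{as_of.isoformat()}_{period}_{owner}_v{version}"
    if not _PATTERN.match(name):
        raise ValueError(f"fields must use only letters, digits, and hyphens: {name!r}")
    return name


def parse_evidence_name(name):
    """Split a conforming name back into typed fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    parts = match.groupdict()
    parts["date"] = date.fromisoformat(parts["date"])
    parts["version"] = int(parts["version"])
    return parts


name = build_evidence_name("AI-INV-01", "ResumeScreener", "Registry",
                           date(2024, 3, 31), "Q1", "GovLead", 2)
print(name)  # AI-INV-01_ResumeScreener_Registry_2024-03-31_Q1_GovLead_v2
```

Because the names parse back into structured fields, the same pattern can drive repository checks, for example flagging files that do not match or listing the latest version of each artifact per period.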
AI and data companies: Standard controls are the baseline. See the AI-specific advisory modules for additional controls addressing data governance, prompt logging, RAG security, and model vendor risk.