How can I measure AI explainability?
A Scoresheet for Explainable AI
February 17, 2025
https://arxiv.org/pdf/2502.09861

This paper introduces a "scoresheet" for evaluating the explainability of AI systems, including multi-agent systems. It aims to bridge the gap between high-level standards for AI transparency and the practical need for concrete assessment methods.
Key points for LLM-based multi-agent systems:
- Veracity: Explanations must be reliable and reflect the system's actual reasoning, which is challenging with LLMs prone to "hallucinations." Directly deriving explanations from the system's internal workings or logs is preferred over using proxy models.
- Global vs. Local Explanations: Global explanations describe the system's overall functionality (how and how well it works), while local explanations pertain to specific decisions. Both are crucial for understanding multi-agent system behavior.
- Explanation Concepts: Using concepts like beliefs, goals, and preferences in explanations, mirroring human reasoning, can improve their understandability. This aligns with the tendency of LLMs to generate explanations using these concepts.
- Explanation Types: The scoresheet considers various explanation types based on questions the system can answer (e.g., factual, "why," "why not," hypothetical). This is relevant for designing interactive interfaces for querying LLM-based agents.
- Automation: Explanations should ideally be generated automatically; the degree of automation is a key practical factor in XAI, and it matters especially for multi-agent systems, where complex interactions make manual explanation impractical.
- Customization and Interactivity: Tailoring explanations to individual users and letting them explore explanations interactively can enhance understanding; LLMs' natural-language capabilities can facilitate both.
- Stakeholder Needs: Understanding the needs of various stakeholders is essential for designing effective explanations. LLMs can potentially be used to adapt explanations to different stakeholders' requirements.
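To make the dimensions above concrete, here is a minimal sketch of how a per-system scoresheet might be recorded and scored in code. All field names, the 0-3 veracity scale, and the coverage metric are illustrative assumptions for this sketch, not the paper's actual scoresheet format.

```python
from dataclasses import dataclass

# Hypothetical taxonomy of question types an agent can answer
# (illustrative names, not the paper's exact list).
EXPLANATION_TYPES = ("factual", "why", "why_not", "hypothetical")

@dataclass
class ExplainabilityScoresheet:
    """Illustrative record of the explainability dimensions discussed above."""
    system_name: str
    veracity: int              # assumed 0-3 scale: do explanations reflect actual reasoning?
    global_explanations: bool  # does the system explain its overall functionality?
    local_explanations: bool   # does it explain specific decisions?
    supported_types: tuple = ()  # subset of EXPLANATION_TYPES
    automated: bool = False      # are explanations generated without human effort?
    interactive: bool = False    # can users drill down into explanations?

    def coverage(self) -> float:
        """Fraction of question types the system can answer."""
        supported = set(self.supported_types) & set(EXPLANATION_TYPES)
        return len(supported) / len(EXPLANATION_TYPES)

# Example: a hypothetical LLM-based agent that answers factual and "why"
# questions automatically, but not "why not" or hypothetical ones.
sheet = ExplainabilityScoresheet(
    system_name="demo-agent",
    veracity=2,
    global_explanations=True,
    local_explanations=True,
    supported_types=("factual", "why"),
    automated=True,
)
print(sheet.coverage())  # 0.5
```

A structured record like this makes comparisons between systems straightforward, which is the practical gap between high-level transparency standards and concrete assessment that the scoresheet aims to fill.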