Workshop Structure (Tentative)

09:00 - 09:30 Welcome and Introduction
  • Opening remarks
  • Overview of workshop structure and objectives
09:30 - 11:00 Reflections on the Landscape
  • Collaborative reflection on the existing landscape
  • Talks, panels, and breakouts by modality (text, images, audio, video, and multimodal data)
  • Topics:
    • Underlying frameworks
    • Contextualization challenges
    • Defining robust evaluations
    • Incentive structures
11:00 - 11:15 Break
11:15 - 12:45 Talks + Provocations
  • Invited speakers on current technical evaluations for base models across all modalities
  • Key social impact categories covered:
    • Bias and stereotyping
    • Cultural values
    • Performance disparities
    • Privacy
    • Financial and environmental costs
    • Data and content moderation labor
  • Presentations of accepted provocations
12:45 - 13:45 Lunch Break
13:45 - 15:45 Group Activity
  • Participants break into groups focusing on key social impact categories
  • Activities include:
    1. Choosing Evaluations: Determining how to select evaluations from a large repository
    2. Reviewing Tools and Datasets: Assessing existing artifacts and identifying gaps
    3. Assessing Validity: Examining construct reliability, validity, and ranking methodologies
15:45 - 16:00 Break
16:00 - 17:45 What's Next? Documentation + Resources
  • Develop policy guidance highlighting impact categories, subcategories, and modalities requiring further investment
  • Discussions on:
    1. Documenting Methods: Creating a proposed framework for documenting evaluations
    2. Developing Shareable Resources: Improving the evaluation repository and conceptualizing new shared resources
    3. Underlying Frameworks: Examining foundational frameworks influencing evaluations
    4. Contextualization Challenges: Identifying the difficulties of contextualizing evaluations across different settings and populations
    5. Defining Robust Evaluations: Establishing criteria for robust and meaningful evaluations
17:45 - 18:00 Closing Remarks