Speaker Schedule

Date | Speaker | Talk Information | Recommended Reading
March 30th
[Virtual]
Anna Bethke,
Salesforce

The FATE of AI Ethics
In this hour-long session I will discuss the components of AI Ethics and ways that researchers and AI practitioners can incorporate these components into every stage of their development process. These components include fairness, accountability, transparency, explainability, and human rights.
An ethics checklist for data scientists

Consequence Scanning – an agile practice for responsible innovators

Ethics in AI research papers and articles
April 6th
Jerry Lopez, Motional

Challenges in AI Safety – A Perspective from an Autonomous Driving Company
There is a long legacy of deploying complex software in safety-critical applications in industries like aviation and automotive. The increasing use of Machine Learning (ML) models in these applications has forced the engineering community to rethink what it means to ensure that software applications can operate with extremely low probabilities of failure. This talk will highlight some of the biggest challenges with AI safety in the autonomous driving community and discuss some of the most promising methods currently being researched for ensuring a high degree of safety.
Risk-based safety envelopes for AVs under perception uncertainty

Decision-time postponing motion planning for combinatorial uncertain maneuvering
April 13th
Tom Lue, DeepMind

AI policy and Governance
AI is increasingly being deployed in both the public and private sectors with the potential for large-scale impacts in many industries. That trend will only accelerate with the development of more powerful general-purpose algorithms with the capacity to perform an increasingly wide range of tasks, which may have profound implications for society. My talk will give an overview of the state of AI technology and policy/regulation today, some predictions on where it is likely to go, and how we should think about ideal governance of AI given the anticipated technological developments in the coming years.
AI Governance: Opportunity and Theory of Impact

Stanford HAI issue brief – European Commission’s Artificial Intelligence Act

DeepMind Podcast – Interview with Demis Hassabis, CEO of DeepMind
April 20th
Tatsu Hashimoto, Stanford

Emerging risks and opportunities from large language models
Large, pre-trained language models have driven dramatic improvements in performance for a range of challenging NLP benchmarks. However, these language models also present serious risks such as eroding user privacy, enabling disinformation, and relying on discriminatory “shortcuts” for prediction. In this talk, we will provide a short overview of a range of potential harms from language models, as well as two case studies in the privacy and brittleness of large language models.
April 27th
Hadas Kress-Gazit, Cornell

Safety (and Liveness!) of Robot Behaviors
In this talk I will describe how formal methods such as synthesis – automatically creating a system from a formal specification – can be leveraged to design robots, guarantee their behavior, and provide feedback about things that might go wrong. I will discuss the benefits and challenges of writing formal specifications that capture safety as well as liveness properties, and will give examples of different robotic systems, including multi-robot systems and robots interacting with people.
Formalizing and Guaranteeing Human-Robot Interaction

Synthesis for Robots: Guarantees and Feedback for Robotic Behavior
May 4th
Zico Kolter, CMU
Recent progress in verifying neural networks
This talk looks at the task of verifying deep networks, guaranteeing that outputs of a network obey certain properties for certain classes of inputs.  Such approaches can be used to validate robustness and safety of neural networks, but such exact verification is a hard combinatorial problem, and off-the-shelf solvers typically perform quite poorly.  Nonetheless, over the past several years there has been a large amount of progress in the area, and recent methods are able to verify medium-sized networks thousands of times faster than generic solvers.  In this talk, I will provide a general overview of the verification problem, then highlight several advances we and others have made in recent years to achieve these speedups.  Many of these approaches were implemented in our submission to the Verification of Neural Networks Competition (VNNCOMP) in 2021, where our team, a collaboration with UCLA, Northeastern, and Columbia, won first place across most categories.
May 11th
Carlos Guestrin, Stanford University
How can you trust machine learning?
Machine learning (ML) and AI systems are becoming integral parts of every aspect of our lives. The definition, development and deployment of these systems are driven by (complex) human choices. And, as these AIs are making more and more decisions for us, and the underlying ML systems are becoming more and more complex, it is natural to ask the question: “How can you trust machine learning?”

In this talk, I’ll present a framework anchored on three pillars: Clarity, Competence and Alignment. For each, I’ll describe algorithmic and human processes that can help drive towards more effective, impactful and trustworthy AIs. For Clarity, I’ll cover methods for making the predictions of machine learning more explainable. For Competence, I will focus on methods for evaluating and testing ML models with the rigor that we apply to complex software products. Finally, for Alignment, I’ll describe the complexities of aligning the behaviors of an AI with the values we want to reflect in the world, along with methods that can yield more aligned outcomes. Through this discussion, we will cover both fundamental concepts and actionable algorithms and tools that can lead to increased trust in ML.
“Why Should I Trust You?”: Explaining the Predictions of Any Classifier

Beyond Accuracy: Behavioral Testing of NLP models with CheckList

Color film was built for white people. Here’s what it did to dark skin
May 18th
Jacob Steinhardt, UC Berkeley
Forecasting and Aligning AI
Modern ML systems sometimes undergo qualitative shifts in behavior simply by “scaling up” the number of parameters and training examples. Given this, how can we extrapolate the behavior of future ML systems and ensure that they behave safely and are aligned with humans? I’ll argue that we can often study (potential) capabilities of future ML systems through well-controlled experiments run on current systems, and use this as a laboratory for designing alignment techniques. I’ll also discuss some recent work on “medium-term” AI forecasting.
May 25th
James Zou, Stanford University
Lessons from evaluating and debugging healthcare AI in deployment
Translating trustworthy AI from research into healthcare deployment is a major (and exciting!) challenge. I will discuss insights that we learned from conducting the first real-time AI trials at Stanford and analyzing data from >100 FDA-approved medical AI systems. We will explore challenges and new opportunities in each step of translation: 1) data curation (quantifying how different data contribute to a model’s success or biases); 2) model testing and monitoring (continuous real-time testing and explaining a model’s mistakes); and 3) human-AI interactions (designing AI to optimize clinicians’ performance).
June 1st
Katherine Driggs-Campbell, UIUC
Fantastic Failures and Where to Find Them: Considering Safety as a Function of Structure 
Autonomous systems and robots are becoming prevalent in our everyday lives and changing the foundations of our way of life. In this talk, we’ll explore the notion of structure across high-impact application domains and consider how different contexts and tasks lend themselves to different mechanisms for safety assessment. First, we will consider reasonably structured driving environments, and explore tools for efficiently finding failures in our autonomous driving systems. Then, we’ll consider scenarios where the environment is highly unstructured and clearly defining failures becomes challenging. For such tasks, we’ll explore how anomaly detection can serve as a proxy for failure identification.
Adaptive stress testing with reward augmentation for autonomous vehicle validation

Multi-Modal Anomaly Detection for Unstructured and Uncertain Environments