Welcome to the MLBoost channel, where you always get new insights!
🌟 I am hosting Kexin Huang from Stanford University for a presentation on the channel 🎥. Kexin will present his work “Uncertainty Quantification over Graph with Conformalized Graph Neural Networks”.
The work proposes conformalized GNN (CF-GNN), extending 🔥🔥 conformal prediction (CP) 🔥🔥 to graph-based models for guaranteed uncertainty estimates.
Link to Paper: https://arxiv.org/abs/2305.14535
Link to Slides: https://lnkd.in/g75Su4rn
00:00 - Hello!
00:41 - Presentation Starts!
01:21 - Graphs are Everywhere!
04:06 - Graph Neural Networks
06:27 - Uncertainty Quantification, Coverage Guarantees, and Efficiency
12:00 - Study Goals and Presentation Agenda
13:20 - What is a Graph? Graph ML Tasks, and Graph Representation Learning
21:54 - Putting all Graph-Related Things Together
23:36 - Data Split on Graphs and Transductive Node-Level Prediction
26:13 - Existing Graph-ML Methods Fail on Coverage and Conformal Predictors to the Rescue!
27:03 - Overview of Conformal Predictors
36:46 - Does Exchangeability Hold for Graph Structured Data?
37:20 - Question 1: Where does the Dependency Between Train and Test Sets Come From?
40:10 - Conditions under which Graph Exchangeability Holds! and Why?
41:56 - Question 2: How are the Node Non-Conformity Scores Defined?
45:58 - GNNs are Permutation-Invariant when the Aggregation Function is Permutation-Invariant
46:54 - Question 3: What are Some Examples of Aggregation Functions that Are (Not) Permutation Invariant?
48:12 - When Are Graphs Not Permutation Invariant?
51:40 - Now that Coverage is Satisfied, How to Improve Efficiency?
Paper Abstract: "Graph Neural Networks (GNNs) are powerful machine learning prediction models on graph-structured data. However, GNNs lack rigorous uncertainty estimates, limiting their reliable deployment in settings where the cost of errors is significant. We propose conformalized GNN (CF-GNN), extending conformal prediction (CP) to graph-based models for guaranteed uncertainty estimates. Given an entity in the graph, CF-GNN produces a prediction set/interval that provably contains the true label with pre-defined coverage probability (e.g. 90%). We establish a permutation invariance condition that enables the validity of CP on graph data and provide an exact characterization of the test-time coverage. Moreover, besides valid coverage, it is crucial to reduce the prediction set size/interval length for practical use. We observe a key connection between non-conformity scores and network structures, which motivates us to develop a topology-aware output correction model that learns to update the prediction and produces more efficient prediction sets/intervals. Extensive experiments show that CF-GNN achieves any pre-defined target marginal coverage while significantly reducing the prediction set/interval size by up to 74% over the baselines. It also empirically achieves satisfactory conditional coverage over various raw and network features."
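For viewers new to conformal prediction, the split conformal procedure the abstract refers to can be sketched in a few lines. This is a minimal illustration of standard split CP for classification, not the paper's CF-GNN implementation; the non-conformity score used here (one minus the softmax probability of the true class) is one common choice, and all function and variable names are for illustration only.

```python
import numpy as np

def conformal_prediction_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Minimal split conformal prediction for classification.

    cal_probs:  (n, K) softmax outputs on a held-out calibration set
    cal_labels: (n,)   true labels for the calibration points
    test_probs: (m, K) softmax outputs on test points
    alpha:      miscoverage level (alpha=0.1 targets 90% coverage)
    Returns a boolean (m, K) mask: True means the class is in the set.
    """
    n = len(cal_labels)
    # Non-conformity score: 1 - softmax probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Conformal quantile with the finite-sample correction (n + 1),
    # capped at 1.0 for small calibration sets.
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q_hat = np.quantile(scores, q_level, method="higher")
    # Include every class whose score falls below the threshold.
    return (1.0 - test_probs) <= q_hat
```

Under exchangeability of calibration and test points (the condition the talk examines for graph data around 36:46), the returned sets contain the true label with probability at least 1 − alpha.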