publications

2024

  1. NeurIPS 2024 ATTRIB WS
    Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond
    Dilyara Bareeva, Galip Ümit Yolcu, Anna Hedström, Niklas Schmolenski, Thomas Wiegand, Wojciech Samek, and Sebastian Lapuschkin
    In Second NeurIPS Workshop on Attributing Model Behavior at Scale, Dec 2024
  2. ICML 2024 Mechanistic Interpretability WS
    Manipulating Feature Visualizations with Gradient Slingshots
    Dilyara Bareeva, Marina M.-C. Höhne, Alexander Warnecke, Lukas Pirch, Klaus-Robert Müller, Konrad Rieck, and Kirill Bykov
    In ICML 2024 Workshop on Mechanistic Interpretability, Jul 2024
  3. Finding the Right XAI Method—A Guide for the Evaluation and Ranking of Explainable AI Methods in Climate Science
    Philine Lou Bommer, Marlene Kretschmer, Anna Hedström, Dilyara Bareeva, and Marina M.-C. Höhne
    Artificial Intelligence for the Earth Systems, Jul 2024
  4. CVPR 2024 SAIAD WS
    Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression
    Dilyara Bareeva, Maximilian Dreyer, Frederik Pahde, Wojciech Samek, and Sebastian Lapuschkin
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Jun 2024

2023

  1. Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond
    Anna Hedström, Leander Weber, Daniel Krakowczyk, Dilyara Bareeva, Franz Motzkus, Wojciech Samek, Sebastian Lapuschkin, and Marina M.-C. Höhne
    Journal of Machine Learning Research, Jun 2023