publications | Dilyara Bareeva

2024

NeurIPS 2024 ATTRIB WS

Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond

Dilyara Bareeva, Galip Ümit Yolcu, Anna Hedström, Niklas Schmolenski, Thomas Wiegand, Wojciech Samek, and Sebastian Lapuschkin

In Second NeurIPS Workshop on Attributing Model Behavior at Scale, Dec 2024

ICML 2024 Mechanistic Interpretability WS

Manipulating Feature Visualizations with Gradient Slingshots

Dilyara Bareeva, Marina M.-C. Höhne, Alexander Warnecke, Lukas Pirch, Klaus-Robert Müller, Konrad Rieck, and Kirill Bykov

In ICML 2024 Workshop on Mechanistic Interpretability, Jul 2024

Finding the Right XAI Method—A Guide for the Evaluation and Ranking of Explainable AI Methods in Climate Science

Philine Lou Bommer, Marlene Kretschmer, Anna Hedström, Dilyara Bareeva, and Marina M.-C. Höhne

Artificial Intelligence for the Earth Systems, Jul 2024

CVPR 2024 SAIAD WS

Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression

Dilyara Bareeva, Maximilian Dreyer, Frederik Pahde, Wojciech Samek, and Sebastian Lapuschkin

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Jun 2024


2023

Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond

Anna Hedström, Leander Weber, Daniel Krakowczyk, Dilyara Bareeva, Franz Motzkus, Wojciech Samek, Sebastian Lapuschkin, and Marina M.-C. Höhne

Journal of Machine Learning Research, Jun 2023