Dilyara Bareeva

PhD Candidate in Interpretable AI. Fraunhofer Heinrich Hertz Institute.

Hi, I am Dilya 👋 I am a PhD candidate in Interpretable AI at the Fraunhofer Heinrich Hertz Institute, supervised by Sebastian Lapuschkin and Wojciech Samek. My research focuses on developing methods for understanding and improving the decision-making of deep learning models. Additionally, I am interested in evaluating the faithfulness and adversarial robustness of established interpretability methods.

I began my research journey in the field of Explainability at the Understandable Machine Intelligence Lab. Prior to this, I worked as a data scientist at EY, where I developed AI-based tools for automating financial processes. I hold a Bachelor’s degree in Computer Science from the Technical University of Berlin, a Bachelor’s degree in Economics from MGIMO, and a Master’s degree in Economics and Management Science from Humboldt University of Berlin.

news

Dec 06, 2024 🏆 I am honored to have been awarded the Rolf Niedermeier Prize for an outstanding final thesis by the Faculty of Electrical Engineering and Computer Science at the Technical University of Berlin.
Oct 09, 2024 🐼 quanda library released on GitHub and paper out on arXiv!
Aug 07, 2024 🐣 Started this website AND learned how to ride a bike!
Jul 27, 2024 Presenting our paper Manipulating Feature Visualizations with Gradient Slingshots at the ICML 2024 Workshop on Mechanistic Interpretability.
Jul 26, 2024 Presenting our paper Manipulating Feature Visualizations with Gradient Slingshots at the NextGenAISafety workshop at ICML 2024.

selected publications

  1. NeurIPS 2024 ATTRIB WS
    Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond
    Dilyara Bareeva, Galip Ümit Yolcu, Anna Hedström, Niklas Schmolenski, Thomas Wiegand, Wojciech Samek, and Sebastian Lapuschkin
    In Second NeurIPS Workshop on Attributing Model Behavior at Scale, Dec 2024
  2. ICML 2024 Mechanistic Interpretability WS
    Manipulating Feature Visualizations with Gradient Slingshots
    Dilyara Bareeva, Marina M.-C. Höhne, Alexander Warnecke, Lukas Pirch, Klaus-Robert Müller, Konrad Rieck, and Kirill Bykov
    In ICML 2024 Workshop on Mechanistic Interpretability, Jul 2024
  3. CVPR 2024 SAIAD WS
    Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression
    Dilyara Bareeva, Maximilian Dreyer, Frederik Pahde, Wojciech Samek, and Sebastian Lapuschkin
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Jun 2024