PhD student · LLM internals · AI systems

I study how language models think and understand the world.

I’m Drejc Pesjak, a PhD student at the Jožef Stefan Institute working on mechanistic interpretability for transformer language models.

My current work focuses on sparse autoencoders, probing, causal interventions, and practical tools for understanding LLM internals. I also build concrete AI systems when an idea is interesting enough to test.

GitHub · Email · Google Scholar · LinkedIn

About

Researcher-builder, currently focused on LLM interpretability.

My route into AI started at an IT-focused secondary school, where I learned to program and first found out about machine learning. I then continued at the Faculty of Computer and Information Science in Ljubljana, where I did my bachelor’s in computer science and began specializing in AI.

After that I moved into data science, went deeper into building working AI models, and picked up industry experience deploying computer vision models to embedded devices at Luxonis and doing big-data analysis at Outbrain. Now I’m a PhD researcher at the Jožef Stefan International Postgraduate School and the Jožef Stefan Institute, focused on mechanistic interpretability for language models.

Recent Work

Trained sparse autoencoders on large models on an HPC cluster.
Identified and suppressed deceptive features in LLaMA models.
Beat prior NER results using MI-based analysis.
Built agentic social simulations with LLMs.
Connected LLMs to a robotic arm for planning and control.

Technical Libraries

TransformerLens · SAE Lens · Logit Lens · GemmaScope 2 · Neuronpedia · PyTorch · Hugging Face · HPC/SLURM

Selected Work

Featured Projects and Notes

Scaling Monosemanticity on LLaMA

Reproducing and adapting sparse-autoencoder interpretability experiments for LLaMA-style models, including SAE training, feature inspection, and steering.

SAE · LLaMA · replication

Note · Code

LLM Activation Storm

A local browser app for visualizing LLM activations layer by layer from a single prompt forward pass, built to make transformer internals easier to inspect.

activations · Gemma · visualization

Code

Auto-GemmaScope

Exploring SAE features in Gemma 3 vision-language models, comparing autonomous feature discovery with human-guided interpretability workflows.

Gemma · VLMs · Neuronpedia

Code

DPhate: Hate Speech Paraphrasing

Bachelor thesis project on reducing hateful content through paraphrasing while preserving meaning, using an evaluated NLP rewriting pipeline.

NLP · transformers · thesis

Code

SPCA: Sense-Plan-Code-Act

A research prototype combining LLMs, PDDL planning, code generation, and ROS2/UR5 simulation for embodied robotic manipulation.

agents · planning · robotics

Code

Public notebook

Research Notes & Ideas

A place for longer notes, partial conclusions, and the parts of projects that do not fit nicely into a GitHub README.

Image: SAE feature dictionary with mixed syntax, token, phrase, and concept features

SAE features · 2026

SAE Features Are Useful, But They Are Not Magic

A practical note on why SAE features are hard to rank, interpret, and steer, even when they look beautifully interpretable at first.

Sparse Autoencoders · Steering · Feature Search

Research agenda · 2026

The Future of Mech Interp

Seven underexplored areas that feel especially important right now, pulled from a giant research mind map.

Interp Agents · Reasoning Models · Causality

SAE replication · 2025

Scaling Monosemanticity on LLaMA

Notes from replicating sparse-autoencoder-style interpretability experiments on LLaMA models: what broke, what mattered, and what I would try differently next.

SAE · LLaMA · HPC

Image: Agentic mechanistic interpretability schema

Agentic interpretability · TBA

The Agent Detective: Investigating Gender Bias

A small agentic interpretability experiment around gender-bias circuitry in language models. Write-up coming soon.

SAE agents · bias · Gemma

Project Archive

Projects and Experiments

Older experiments, small tools, and applied ML projects.

Research Tools and Simulations

Multi-agent simulation

Agentic Belief Propagation

LLM-powered agent simulation where beliefs evolve through structured debate, inspired by Axelrod-style models.

View code

Research tooling

arXiv News Filter

Daily cs.AI paper fetcher that filters for LLM and mechanistic interpretability papers using local Ollama.

View code

Applied NLP

Unbalanced Media

Technical analysis of Slovenian media framing using scraping, sentiment analysis, and zero-shot classification.

View code

Applied ML, Data Systems, and Older Experiments

Weather Prediction MLOps

University cloud project with model serving, Streamlit frontend, FastAPI/Ray Serve, Docker, and Google Cloud deployment.

Code

TimescaleDB Stock Forecasting

Real-time stock data streaming into TimescaleDB with XGBoost models and a Flask prediction API.

Code

NYC Violation Tickets

Large-scale public data analysis and prediction project using big-data tooling and ML workflows.

Code

Compare Color Detectors

A computer vision comparison project for color detection methods.

Code

Face Detection Study

Older computer vision comparison of face-detection algorithms on real-world video-style inputs.

Code

Tools, Frameworks, and Languages

Python · JavaScript · HTML/CSS · HTML5 · CSS3 · SQL · R · C++ · C · Java · C# · Kotlin · Bash · TypeScript · PyTorch · TensorFlow · Keras · Hugging Face · TransformerLens · SAE Lens · Logit Lens · GemmaScope 2 · Neuronpedia · pandas · NumPy · scikit-learn · Matplotlib · XGBoost · Jupyter · Linux · Docker · Git · GitHub · Bitbucket · VS Code · LaTeX · HPC/SLURM · FastAPI · Ray Serve · Streamlit · Flask · Bootstrap · Node.js · jQuery · Angular · TimescaleDB · MySQL · PostgreSQL · MongoDB · Kafka · HDF5 · Ollama · Kaggle · MATLAB · OpenCV · Google Cloud · GCP · BigQuery · Airflow · AWS · Heroku · Kubernetes · Helm · ROS2 · Arduino · Gazebo · Selenium · PDDL · Android Studio

Timeline

Events and Conferences

I have been around AI for a while. I like going to events, hearing about new ideas, and discussing where the field might go next.

2026 7 events

Brave Conversations Event on AI and Society 6. 5. 2026, IJS LJ
AI and Data Privacy and Security Training: Oshani Seneviratne 5. 5. 2026, IJS LJ
AI for Materials Science Tuning Laser-Induced Graphene Production: Lars Kotthoff 23. 4. 2026, IJS LJ
From Understanding People to Securing AI - A Human-Centered Research Journey Through Large Language Models: Erik Derner 16. 3. 2026, IJS LJ
Slo AI Meetup - From Pixels to Probabilities: AI for Sports Understanding 12. 3. 2026, Sportradar LJ
Deep Learning in the Search for Dark Matter: Dr. Roberto Ruiz de Austri Bazan 4. 3. 2026, IJS LJ
DATA_FAIR 2026 - Data Engineering and Data Science conference 12. 2. 2026, LJ

2025 12 events

AutoLearn-SI Two-Day Exploratory Workshop 24. - 25. 11. 2025, IJS LJ
Spark Sessions 005: Data Science@UL-FRI 18. 11. 2025, UL FRI
Cognitive Science conference 9. 10. 2025, IJS LJ
Slovenian Conference on Artificial Intelligence 8. 10. 2025, IJS LJ
Data Mining and Data Warehouses: SiKDD conference 6. 10. 2025, IJS LJ
Predicting Traffic Intensity on Motorway Sections: Matic Kladnik 22. 9. 2025, IPS LJ
Artificial Intelligence for Science: Toward Creative, Open AI Agents: Dr. Nenad Tomašev 1. 7. 2025, IJS LJ
Artificial Intelligence Day: NVIDIA & FRI AI Day 25. 3. 2025, UL FRI
Spark Sessions 003: Data Science@UL-FRI 25. 2. 2025, UL FRI
AI x Law - Artificial Intelligence for Law: Law Brainer 20. - 21. 2. 2025, FMF UL, PF UL
Lessons from Building AI Coding Assistants - Context and Evaluation: Slovenian AI Meetup 11. 2. 2025, Outbrain LJ
AI/LLM monitoring and observability with New Relic and OpenTelemetry: Harry Kimpel 15. 1. 2025, UL FRI

2024 14 events

Spark Sessions 002: Data Science@UL-FRI 10. 12. 2024, UL FRI
Will Artificial Intelligence Write Like Ivan Cankar?: Book Fair, RTV Slo 27. 11. 2024, LJ
Ai4Gov Evaluation Workshop 27. 11. 2024, IJS LJ
Competent and Ethical Use of Generative AI Tools: UL Career Centre 26. 11. 2024, online
LLMs and Audio Understanding: Slo AI Meetup 21. 11. 2024, Sportradar LJ
Spark Sessions 001: Data Science@UL-FRI 22. 10. 2024, UL FRI
No Longer Science Fiction: AI and Advanced Technologies in Science: BEST 22. 10. 2024, UL FRI
LLM Dojo: Jan Rupnik 18. 10. 2024, IJS E3
Round table on the impact of artificial intelligence on science and society: SMASH 8. 10. 2024, Vipava
Snowflake beginner course by In516ht 15. 2. - 5. 6. 2024
Geometric Deep Learning, Categorical Deep Learning: Petar Veličković, DeepMind 21. 5. 2024, FRI
Data Streaming with Kafka - A Modernist Approach: Jure Ham and Matic Žgur, Outbrain 15. 5. 2024, FRI
Usage of AWS in data engineering: Anžej Curk and Denis Turšič, Result 8. 5. 2024, FRI
Reinforcement fundamentals and state of the art: Joshua B. Evans 10.-11. 1. 2024, FRI

2023 6 events

Graph Deep Learning - Fundamentals and Applications of Graph Convolutional Layers: Florian Thamm 23. 11. 2023, FRI
Round table: AI - halt or accelerate its development? 29. 9. 2023, FRI
Outbrain DataScience summer school 10.-14. 7. 2023
ŠO FRI: Modern Artificial Intelligence: Technologies and Dilemmas 29. 3. 2023, FRI
EESTEC JobFair: World of the Future; Discovering the Power of Natural Language Processing and Artificial Intelligence 16. 3. 2023, FRI
Data Science@UL-FRI: Recommender systems with TensorFlow workshop 22. 2. 2023, FRI

Older 2017-2022 13 events

Data Science@UL-FRI: Introduction to state of the art NLP (BERT and GPT) 17.6.2021, online
Gašper Beguš, Ph.D.: How Does Artificial Intelligence Learn to Speak? 11.11.2020, online
Artificial Intelligence and Digital Marketing - What About Watson? - OpenLab 21.10.2019, Kranj
Artificial Intelligence - Partly Serious, Partly for Fun 17.10.2019, Ljubljana
The subtle art of recommendation (algorithms) 14.10.2019, Ljubljana
MEi:CogSci Conference 13.-15.6.2019, Ljubljana
Deep Machine Vision - From Research to Market 25.4.2019
Artificial Intelligence at your fingertips with Microsoft Azure 18.4.2019
Lecture by Vid Kocijan: Does Artificial Intelligence Have Common Sense? 16.4.2019, SAZU LJ
Weekly attendance at the KUI Thursday debates 2019, FRI
Round table "The Possibilities of Artificial Intelligence" EESTEC JobFair 21.3.2019, FE LJ
AI4GOOD conference - Huawei FRI, 26.9.2018
FeelTheFuture fair, Celje 19.-21. 10. 2017

Contact

Research conversations are the best reason to reach out.

If you have a strange idea or a question, shoot me a message.