Software engineering
Build reliable Python-backed software
Web backends, APIs, services, Linux deployments, automation, and maintainable applications that people can actually use.
Python · Django · Flask · Nginx · Gunicorn · Linux · CI/CD
I'm a computer scientist and ML engineer with a background in business information systems, applied data science, and computational neuroscience.
Right now, I'm finishing a PhD in Computer Science / Machine Learning at the Max Planck Institute in Leipzig, where I build large-scale data and ML infrastructure for NeuroAI and visual neuroscience. Before that, I worked on production-facing ML services, recommender systems, image-processing pipelines, and product data workflows.
Hello from the other side! If you're here, it means you want to learn more about me personally - so I'll keep things more casual!
Here you'll find some info about me and my hobbies, my taste in music, movies and books, and some notes on things I'm thinking about.
ML & data systems
Computer vision, image processing, recommender logic, model evaluation, data collection, and deployment-oriented inference workflows.
PyTorch · TensorFlow · scikit-learn · CLIP · Redis · BigQuery
Business technology
I studied business information systems and have worked on product matching, travel search, quality scoring, analytics, monitoring, and customer-facing data problems.
PostgreSQL · AWS · Grafana · data pipelines · analytics
Research depth
I can translate ambiguous domain questions into concrete datasets, experiments, codebases, and infrastructure with clear evaluation loops.
open source · HPC · datasets · reproducible workflows
NeuroAI, visual neuroscience, and large-scale data infrastructure
Hebart Lab · Max Planck Institute & University of Gießen
Researching NeuroAI approaches for efficient experimental design in visual neuroscience: choosing better stimuli from large naturalistic image spaces, comparing model and brain representations, and improving fMRI dataset coverage. Built the ML/data infrastructure behind these workflows, including public datasets and re:vision challenge infrastructure.
Python · PyTorch · CLIP · scikit-learn · SLURM · Docker
Machine learning, data analysis, and medical image processing
Leipzig University
Studied applied machine learning and data analysis, graduating with grade 1.2. Thesis used GANs to synthesize stimuli that maximally activate targeted brain regions. Project work ranged from clothing-color recognition with segmentation models to EEG eigenspectra for states of consciousness and cloud-formation classification from satellite imagery.
Machine learning · GANs · Segmentation · EEG · Satellite imagery
Computer science foundation with business-facing analytics and systems thinking
Leipzig University
Built a foundation across computer science, data systems, software engineering, and business-facing analytics, graduating with grade 1.5. Thesis extended a software-visualization framework to ABAP. Other projects included ABAP and full-stack work on an in-house metering app at GISA and a full e-commerce build spanning store implementation, SEO, catalogue work, design, and marketing.
Software engineering · ABAP · Full-stack development · Databases · E-commerce
Medical imaging models, segmentation, and uncertainty estimation
ScaDS.AI Dresden/Leipzig
Implemented and adapted recent deep-learning papers for medical imaging, especially brain-tumor segmentation and survival prediction. Built preprocessing, training, and evaluation workflows, ran experiments, and analyzed uncertainty in model predictions.
PyTorch · FT-Transformer · SAINT · UNet++
Production ML service for hotel images and travel product data
CHECK24 (Travel Vertical)
Built applied ML and backend features for the travel product. Designed and deployed a Flask + Redis service for fast image inference across millions of hotel images, using the outputs for deduplication, retrieval, classification, quality scoring, and recommendation tuning. Also worked on PHP backend features for the booking engine and outlier-detection workflows for hotel prices.
Python · PyTorch · Flask · Redis · PHP · BigQuery · Grafana
Django sites, Linux hosting, and deployment automation
Kimetric UG
Delivered full-stack web projects for academic clients, from backend implementation to deployment. Built Django-based websites, configured Linux servers with Nginx and Gunicorn, and wrote deployment scripts so the sites could be maintained reliably after handoff.
Django · Nginx · Gunicorn · Linux · CI/CD
Product matching, scraping, and model evaluation workflows
Webdata Solutions GmbH (now Vistex)
Worked on product matching for e-commerce data, where the hard part was turning noisy scraped web data into usable training and evaluation datasets. Rebuilt parts of the matching pipeline with a neural-network approach, improving matching accuracy from below 50% to 92%.
Python · TensorFlow · PostgreSQL · AWS
Dataset notes
Vision research needs naturalistic photographs, but web-scraped datasets like LAION are full of screenshots, memes, ads, and generated images. We scored all 2.1 billion images in ReLAION-2B for "naturalness" using a CLIP-based classifier, then extracted and published ViT-H/14 embeddings for the ~500M most photographic ones. The result is a 167 GB dataset on Hugging Face that lets researchers query half a billion images by visual similarity without downloading a single pixel.
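To make "query by visual similarity without downloading a single pixel" concrete, here is a minimal sketch of a nearest-neighbour lookup over precomputed embeddings. It is an illustration only, not the dataset's actual tooling: the image ids, the 3-dimensional toy vectors, and the function names are all hypothetical, and a real query would run over the published ViT-H/14 embeddings with an approximate index rather than a brute-force loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, embeddings, k=2):
    """Return the k (image_id, similarity) pairs closest to the query."""
    scored = [
        (image_id, cosine_similarity(query, vector))
        for image_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real image embeddings are far longer.
toy_embeddings = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
    "img_d": [0.0, 0.0, 1.0],
}

# Only vectors are compared; no image file is ever opened.
neighbours = top_k_similar([1.0, 0.05, 0.0], toy_embeddings, k=2)
print([image_id for image_id, _ in neighbours])  # → ['img_a', 'img_b']
```

Brute force is fine for a handful of vectors; at hundreds of millions of images an approximate nearest-neighbour index is the practical choice.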
Library notes
An open-source Python toolbox for extracting and comparing image representations from deep neural networks. Supports 100+ models across torchvision, timm, CLIP, self-supervised models (DINO, MAE, SimCLR), and more. Also includes tools for aligning DNN representations with human similarity judgments via RSA and CKA. I'm the third-largest contributor to the project, which has 460k+ PyPI downloads and is used across vision and cognitive neuroscience labs.
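For readers unfamiliar with RSA (representational similarity analysis), the idea can be sketched in a few lines of plain Python. This is a simplified illustration and not the toolbox's API: the function names, the toy feature vectors, and the "human" dissimilarities are all made up, and Pearson correlation is used for brevity where published analyses often use a rank correlation.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rdm_upper_triangle(features):
    """Pairwise dissimilarities (1 - Pearson r) above the RDM diagonal."""
    n = len(features)
    return [
        1.0 - pearson(features[i], features[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def rsa_score(rdm_a, rdm_b):
    """Correlate two RDMs given as flattened upper triangles."""
    return pearson(rdm_a, rdm_b)


# Hypothetical model features: 4 images x 3 feature dimensions.
model_features = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 2.5],
    [3.0, 1.0, 0.0],
    [0.0, 5.0, 1.0],
]
# Hypothetical human dissimilarities for the same 6 image pairs.
human_rdm = [0.1, 0.9, 0.8, 0.9, 0.7, 0.5]

model_rdm = rdm_upper_triangle(model_features)
print(round(rsa_score(model_rdm, human_rdm), 3))
```

The score is high when image pairs that the model treats as dissimilar are also judged dissimilar by people, regardless of how the two feature spaces are scaled.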
Dataset notes
LAION-fMRI is a deeply sampled 7T fMRI dataset for studying human visual representations with naturalistic images. It contains brain responses to 25,052 LAION-derived images across 5 subjects and 165 acquired scanning sessions, including single-trial GLMsingle betas, retinotopy, localizers, and diffusion data. The dataset supports work on model-brain comparison, stimulus selection, and replication through the re:vision initiative.
We analyzed the object similarity structure of 162 diverse vision models to ask what, if anything, converges across architectures, objectives, and datasets. The paper separates universal from model-specific dimensions and links the more universal structure to interpretability, semantic image properties, macaque IT activity, and human similarity judgments.
Vision neuroscience runs on large fMRI datasets, but nobody had checked whether the stimulus images in these datasets actually cover what humans see in the real world. We built LAION-natural - a reference distribution of ~120M naturalistic photographs filtered from 2 billion LAION images using a CLIP-based classifier trained on 25k actively sampled labels. Then we measured coverage: ~50% of the visual-semantic space is missing from the two most widely used datasets (NSD and THINGS).
The good news: you don't need millions of images to fix this. In both simulations and real fMRI data, out-of-distribution generalization saturates at 5-10k samples - as long as you draw them from a diverse enough pool. We compared seven sampling strategies (including random, stratified, k-Means, Core-Set, effective dimensionality optimization, and active learning) and found that pool diversity matters far more than which algorithm you use to sample from it.
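As an illustration of one of the strategies named above, Core-Set selection is commonly approximated with a greedy k-center (farthest-point) heuristic. The sketch below shows that heuristic on a toy 2-D pool; it is an assumption about the general technique, not the paper's implementation, and the point coordinates and function name are hypothetical.

```python
import math


def k_center_greedy(points, k, start=0):
    """Greedy k-center (farthest-point) selection; returns chosen indices."""
    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # Pick the point that is currently worst covered by the selection.
        next_idx = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(next_idx)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[next_idx]))
    return selected


# Hypothetical pool: two tight clusters plus one isolated point.
pool = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),  # cluster near the origin
    (5.0, 5.0), (5.1, 5.0),              # second cluster
    (10.0, 0.0),                         # isolated point
]

print(k_center_greedy(pool, k=3))  # → [0, 5, 3]
```

With a budget of three, the selection takes one point from each region instead of three near-duplicates from the densest cluster. That only helps if the pool contains those regions in the first place, which is the sense in which pool diversity sets the ceiling for any sampler.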
The pipeline processes billions of images using CLIP embeddings, Annoy indices for nearest-neighbor search, mini-batch k-Means clustering, and Ridge regression encoding models - all at a scale that runs on a university HPC cluster, not a cloud budget.
Most scientists learn to code informally - picking things up as they go, optimizing for "does it run?" over "will anyone else understand this?" This paper introduces a structured framework for writing better research code, built around the idea that researchers naturally switch between quick prototyping and careful development - and that being deliberate about which mode you're in makes all the difference.
The ten principles span three tiers: organizing code (standardized project structures, version control, automation), writing reusable code (testing, documentation, clean interfaces), and collaborating (code review systems, shared knowledge bases, lab-wide standards). Already at 22k+ accesses, it clearly hit a nerve - these are problems every computational lab deals with but rarely talks about explicitly.
You can't show the brain every possible image, so how do you figure out what a specific patch of visual cortex actually responds to? We trained a convolutional neural network end-to-end on fMRI data from a subject watching naturalistic movies - no ImageNet pretraining, just raw stimulus-response pairs. Then we used BigGAN to synthesize images that maximally activate individual voxels via gradient ascent through the model.
Early visual areas (V1-V3) preferred gratings in small receptive fields, as expected. More interesting: FFA showed a preference for faces but also for oval shapes and vertical symmetry, while PPA preferred places plus horizontal lines and high spatial frequencies. An SVM classifier could distinguish FFA-preferred from PPA-preferred stimuli based on their GAN latent vectors with 87% accuracy, confirming that the approach produces meaningfully different outputs per region. This was one of the first demonstrations of GAN-based preferred stimulus synthesis for the human visual system.
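The gradient-ascent step described above can be sketched with a toy stand-in. Everything here is hypothetical: `voxel_response` replaces the whole generator-plus-encoding-model chain with a simple function that peaks at a known latent vector, and gradients are estimated numerically instead of by backpropagation. It only shows the optimization loop, not the actual BigGAN pipeline.

```python
def voxel_response(latent):
    """Hypothetical predicted voxel activation; peaks at latent (1, -2)."""
    return -((latent[0] - 1.0) ** 2) - (latent[1] + 2.0) ** 2


def numerical_gradient(fn, point, eps=1e-5):
    """Central-difference estimate of the gradient of fn at point."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += eps
        down[i] -= eps
        grad.append((fn(up) - fn(down)) / (2 * eps))
    return grad


def gradient_ascent(fn, start, lr=0.1, steps=200):
    """Repeatedly step the latent vector in the direction that raises fn."""
    latent = list(start)
    for _ in range(steps):
        grad = numerical_gradient(fn, latent)
        latent = [z + lr * g for z, g in zip(latent, grad)]
    return latent


preferred_latent = gradient_ascent(voxel_response, start=[0.0, 0.0])
print([round(z, 3) for z in preferred_latent])
```

In the real setting the latent vector would be decoded into an image by the generator, and the image fed through the trained encoding model to obtain the predicted voxel response being maximized.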
Recognition / service
My love for moving my body started with joining a breakdance crew at the age of 17. Since then I've dabbled in all kinds of movement disciplines. Today, I really enjoy bouldering, but I also do resistance training, yoga and more recently Qi Gong (good way to wake up the body in the morning!). I also love hiking in nature (and combining it with climbing is the best way to spend a day, see above).
A couple of years ago, I asked a bunch of friends, "Hey, want to start a band?" To my surprise, everyone said yes! We had to drag our lead singer into it, but now he regularly screams his heart out. We mostly play pop punk covers for friends' birthday parties, but the dream of writing our own songs is still alive and kicking.
As a kid, one of my teachers would say "Johannes lives off plain air", because I refused to eat food from the school kitchen. Either I was very picky, or the food was just plain bad. Today I eat almost anything - but I love throwing whole-food ingredients together into something delicious. Lately, I've enjoyed hosting friends for dinner nights, which have been a blast (but surprisingly stressful!).
I've been meditating for over a decade now - although it's an on-off relationship. I even attended a 10-day silent meditation retreat in 2024, which was pretty life-changing for me! This is an image from the forest near the meditation hall - I walked that path probably 100 times and discovered something new every time.
I listen to lots of stuff - but Spotify knows me best, so here is my weekly discovery playlist for your perusal.
Once a month, my wife and I do a horror movie night with friends - here are the last three movies we watched (pulled from my Letterboxd account). Also, check out the recommender I built for us.
Silence
Old Path White Clouds
You Are Here
No Mud, No Lotus
Peace Is Every Step
When Things Fall Apart
Full Catastrophe Living
Wherever You Go...
Why We Meditate
The Art of Living
Be Here Now
After the Ecstasy, the Laundry
The Attention Revolution
The Book: On the Taboo
The Wisdom of Insecurity
Autobiography of a Yogi
Doors of Perception
The Psychedelic Explorer's Guide
Food of the Gods
Red Rising series
The Kingkiller Chronicle
A Song of Ice and Fire series
The Expanse series
The Belgariad series
Mistborn series
The 13th Paladin series
The Three-Body Problem
Dune
It
Island
Malice
Of Blood and Fire
Gods of the Wyrdwood
Magician
The Dwarves
Uldart: Die dunkle Zeit
A Wizard of Earthsea
Ender's Game
Braiding Sweetgrass
Wild
Into the Wild
Into Thin Air
Seven Years in Tibet
Born to Run
A Search in Secret India
Adult Children of Emotionally Immature Parents
How to Do the Work
Why Has Nobody Told Me This Before?
Why Zebras Don't Get Ulcers
Maybe You Should Talk to Someone
In the Realm of Hungry Ghosts
How to Change Your Mind
DARE
Feeling Good
A Liberated Mind
Four Thousand Weeks
Deep Work
The Myth of Normal
How Not to Die
How Not to Age
Food Rules
Omnivore's Dilemma
Deep Medicine
Total Immersion
24/6
Smartphone Nation
It Doesn't Have to Be Crazy at Work
The Age of Surveillance Capitalism
How to Break Up with Your Phone
Make Time
We Should All Be Feminists
Men Explain Things to Me
The Road to Reality
Surely You're Joking, Mr. Feynman!
Tools of Titans
How to Win Friends and Influence People
The Life-Changing Magic of Tidying Up
When Breath Becomes Air
Man's Search for Meaning
That Good Night
The Obstacle Is the Way
Stolen Focus
The Wim Hof Method
Atomic Habits
Indistractable
Waking Up
Built to Move
Kitchen Confidential
The Comfort Crisis
The Shallows
A passage from Art and Fear and what it says about how we see other people's work versus our own.
A month with Claude Max taught me how work, science, and life will change radically in the very near future.
Happy to hear from friends, collaborators, future coworkers, or people who want to talk about brains, tools, climbing, books, or dinner. Email is best.
Happy to chat about research, potential collaborations, or opportunities. Email is best. Also on LinkedIn, GitHub, Hugging Face, and Google Scholar.