Jeff Hajewski

Reliable AI systems, evals, context, and performance economics.

About

I build reliable, measurable, economical AI systems.

I currently lead a company-wide AI infrastructure and training initiative at SAP. My work spans large language model training, the infrastructure behind it, and the systems required to make model capability usable in practice.

Before SAP, I built ML platforms at Noom and worked on distributed inference at Salesforce. I did my PhD at the University of Iowa under Prof. Suely Oliveira, where I focused on neural architecture search for variational autoencoders and biologically inspired optimization methods.

What I work on

I tend to sit at the layer where model capability has to become dependable software. The problems I care about most are:

  • reliable agent and LLM runtime systems
  • evaluation and post-deployment monitoring
  • context and retrieval systems that scale
  • performance, latency, and cost economics
  • infrastructure for accountable AI products

How I think

I approach engineering as a builder with research habits: measure first, name the tradeoffs, and make systems legible enough to operate. I am less interested in demo-driven novelty than in the work that makes a system survivable after deployment.

That is why this site focuses on durable artifacts: technical essays, case studies, and open-source projects that show how I think in public.

Background

Across industry and research, my work has centered on the intersection of distributed systems and machine learning infrastructure. I have built production ML platforms, run distributed inference at scale, taught computer science, written The Art of Elixir, and spent a long time learning where elegant ideas break once they meet operational constraints.

Contact