Zaher Khateeb zahere

Zaher Khateeb | AI/ML Engineer

Founder of AgentiCraft — infrastructure layer for production multi-agent systems.

I specialize in multi-agent systems architecture, LLM infrastructure, and distributed systems reliability. My work sits at the intersection of formal methods and production engineering — building systems that are provably correct, not just empirically okay.

Current Research

Fault-Dependent Resilience in Multi-Agent LLM Systems

Extending classical network reliability theory to stochastic agent quality. The core result is an if-and-only-if characterization of when topology choice actually matters: under crash-stop faults all mesh topologies are equivalent (a mathematical identity), while Byzantine faults break that equivalence in ways determined by the coordination protocol, not the graph structure.

Validated across ~34,000 LLM experiments spanning 13 coordination topologies, two fault regimes, two task domains, and two model generations. Preparing for submission to a top-tier ML systems venue.
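
For readers unfamiliar with the classical starting point, here is a minimal sketch of the textbook all-terminal reliability polynomial: each edge survives independently with probability p, and the result is the probability that the graph stays connected. This is the classical edge-failure quantity only, not the quality-weighted generalization described above. The function names are hypothetical and are not taken from any of the libraries listed here.

```python
from itertools import combinations


def is_connected(n, edges):
    """Return True if the graph on nodes 0..n-1 with the given edges is connected."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def all_terminal_reliability(n, edges, p):
    """Sum p^|S| * (1-p)^(m-|S|) over edge subsets S that keep the graph connected.

    Brute force over all 2^m subsets, so only suitable for small graphs.
    """
    m = len(edges)
    total = 0.0
    for k in range(m + 1):
        for subset in combinations(edges, k):
            if is_connected(n, subset):
                total += p**k * (1 - p) ** (m - k)
    return total


# Triangle: connected with all 3 edges or any 2, so Rel(p) = 3p^2 - 2p^3.
triangle = [(0, 1), (1, 2), (0, 2)]
rel = all_terminal_reliability(3, triangle, 0.9)
```

For the triangle the closed form 3p^2 - 2p^3 gives 0.972 at p = 0.9, which is a quick sanity check on the enumeration.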

Standalone libraries from this research:

Library	Description
stochastic-circuit-breaker	CUSUM-optimal circuit breaker for LLM agents and stochastic systems. 4-state FSM with statistically principled degradation detection and provably minimax detection delay.
reliability-polynomials	Generalized reliability polynomials where coefficients encode quality, not just connectivity. Fault-dependent crossover analysis, three theorems.
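
To illustrate the kind of change detection the circuit breaker row refers to, here is a minimal sketch of a one-sided CUSUM statistic that raises an alarm when a quality score drifts below its expected level. It covers only the detection statistic, not the 4-state FSM, and it is not the stochastic-circuit-breaker API: the class name, parameters, and sample scores are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CusumDetector:
    """One-sided CUSUM for a downward shift in a quality score (hypothetical helper)."""

    target: float     # expected in-control mean quality
    slack: float      # allowance k: shortfalls smaller than this are ignored
    threshold: float  # decision threshold h: alarm once the statistic exceeds it
    stat: float = 0.0

    def update(self, quality: float) -> bool:
        # Accumulate evidence that quality has fallen below target - slack,
        # resetting to zero whenever the evidence turns negative.
        self.stat = max(0.0, self.stat + (self.target - quality - self.slack))
        return self.stat > self.threshold


detector = CusumDetector(target=0.9, slack=0.05, threshold=0.5)
scores = [0.9, 0.92, 0.88, 0.6, 0.55, 0.6, 0.58]  # made-up quality scores

# Index of the first observation at which the alarm fires, or None if it never does.
alarm_at = next((i for i, s in enumerate(scores) if detector.update(s)), None)
```

With these illustrative numbers the statistic stays at zero for the first three scores and crosses the threshold on the second degraded score (index 4). A smaller threshold detects shifts sooner at the cost of more false alarms.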

Technical Focus

Multi-Agent Systems — mesh coordination architecture, fault-dependent topology selection, Byzantine fault tolerance for LLM systems, stochastic service mesh, MCP/A2A protocol integration

Formal Methods — session type theory for deadlock-freedom guarantees, runtime property verification, CSP process algebra, refinement checking

LLM Infrastructure — provider-agnostic inference abstraction, statistical circuit breakers with CUSUM-optimal change detection, quality-weighted reliability theory

Distributed Systems — consensus protocols, fault injection and fault modeling, observability, Kubernetes-native deployment

Tech Stack

Languages: Python (expert), C++, TypeScript, SQL, Bash

AI/ML: PyTorch, RAG, fine-tuning (LoRA, QLoRA), LLM evaluation, OpenTelemetry

Infrastructure: Kubernetes, Docker, Helm, CI/CD, service mesh, PostgreSQL, Redis, Qdrant

Cloud: AWS, GCP, Azure, Nebius AI Cloud

Background

B.Sc. Industrial Engineering & Management (Data Science concentration) — Tel Aviv University
Advanced Data Science & AI Program — Nebius Academy (Y-DATA), Tel Aviv University
Previously: AI & Infrastructure Engineer at Visual Arena (Gothenburg, Sweden)
