Allen Philip J
Building AI platforms & making large models run fast
Dublin, Ireland
Background
I'm a Senior Product Engineer at Intercom, working on AI platforms. Previously, I spent 4 years at Adobe optimizing inference for Firefly's generative video and audio models.
My focus is post-training inference optimization: quantization, attention efficiency, runtime acceleration, and deep profiling. I work at the intersection of model architecture and system performance—getting the most out of inference frameworks and hardware (A100/H100).
I don't just run models—I make them run well.
Focus Areas
Attention & Kernels — Integrating FlashAttention variants, writing custom CUDA kernels, optimizing diffusion loop inference for video models.
Quantization — INT8 and FP8 workflows, calibration strategies, navigating the accuracy-performance tradeoff for production GenAI models.
Model Compilation — TensorRT optimization, torch.compile debugging, operator fusion strategies for real-world speedups on A100/H100.
AI Platforms — Building self-serve ML platforms, anomaly detection systems, and scalable inference pipelines.
Experience
- Building AI platform infrastructure for customer support automation
- Scaling LLM-powered features for enterprise customers
- Reduced inference latency by 35% via FP8 quantization and by 25% via custom TensorRT attention plugins
- Scaled Enhance Speech pipeline to 150 RPM on 50+ A100 GPUs; cut ops costs by 33%
- Led ML pipelines for demand modeling covering 90M+ SKUs for Walmart, Loblaws, and Woolworths
- Optimized batch processing to meet SLAs for retailers with massive product catalogs
- Full-stack development for B2B catalog, pricing, and user management features
- Recipient of Best Rookie Award (2015)
B.Tech Electrical Engineering, IIT Madras (2014)
Now
Building AI platform infrastructure at Intercom. Expanding deeper into LLMs while continuing to write about inference optimization, attention mechanisms, and the practical side of ML systems.
Get in touch
Have a question about ML inference or want to collaborate? Feel free to reach out.
allenphilip93@gmail.com