<feed xmlns="http://www.w3.org/2005/Atom"> <id>https://hiraditya.github.io/</id><title>Aditya Kumar</title><subtitle>Compilers, Static analysis, software performance optimizations.</subtitle> <updated>2026-06-19T21:49:06+00:00</updated> <author> <name>Aditya Kumar</name> <uri>https://hiraditya.github.io/</uri> </author><link rel="self" type="application/atom+xml" href="https://hiraditya.github.io/feed.xml"/><link rel="alternate" type="text/html" hreflang="en" href="https://hiraditya.github.io/"/> <generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator> <rights> © 2026 Aditya Kumar </rights> <icon>/assets/img/favicons/favicon.ico</icon> <logo>/assets/img/favicons/favicon-96x96.png</logo> <entry><title>Building vLLM from Source: A Field Guide (with all the pitfalls)</title><link href="https://hiraditya.github.io/posts/building-vllm-from-source/" rel="alternate" type="text/html" title="Building vLLM from Source: A Field Guide (with all the pitfalls)" /><published>2026-06-19T15:00:00+00:00</published> <updated>2026-06-19T15:00:00+00:00</updated> <id>https://hiraditya.github.io/posts/building-vllm-from-source/</id> <content type="text/html" src="https://hiraditya.github.io/posts/building-vllm-from-source/" /> <author> <name>Aditya Kumar</name> </author> <category term="Systems" /> <category term="Compilers" /> <category term="Engineering" /> <summary>A step-by-step field guide to building vLLM from source on Ubuntu 26.04, covering Python 3.14 compatibility, CUDA driver issues, and toolchain pitfalls.</summary> </entry> <entry><title>vLLM's op IR, or: where the inference engine meets the compiler</title><link href="https://hiraditya.github.io/posts/vllm-op-ir-where-inference-meets-compiler/" rel="alternate" type="text/html" title="vLLM&apos;s op IR, or: where the inference engine meets the compiler" /><published>2026-06-17T15:00:00+00:00</published> <updated>2026-06-19T17:06:15+00:00</updated> 
<id>https://hiraditya.github.io/posts/vllm-op-ir-where-inference-meets-compiler/</id> <content type="text/html" src="https://hiraditya.github.io/posts/vllm-op-ir-where-inference-meets-compiler/" /> <author> <name>Aditya Kumar</name> </author> <category term="Systems" /> <category term="Compilers" /> <summary>How vLLM's op‑level IR reconciles the tension between a compiler target and hand‑tuned kernel dispatch, enabling graph‑level fusion while supporting multiple back‑ends.</summary> </entry> <entry><title>Loop Unrolling in the ML Era</title><link href="https://hiraditya.github.io/posts/why-loop-unrolling-is-popular-again/" rel="alternate" type="text/html" title="Loop Unrolling in the ML Era" /><published>2026-06-16T15:00:00+00:00</published> <updated>2026-06-17T19:22:58+00:00</updated> <id>https://hiraditya.github.io/posts/why-loop-unrolling-is-popular-again/</id> <content type="text/html" src="https://hiraditya.github.io/posts/why-loop-unrolling-is-popular-again/" /> <author> <name>Aditya Kumar</name> </author> <category term="Systems" /> <category term="Compilers" /> <summary>If you have a massive compute architecture—whether it’s a modern wide-SIMD vector engine, a Tensor Core array, or a custom deep learning accelerator like a Systolic Array—you face one fundamental problem: feeding the beast. You have immense execution width, but if your instructions are bottlenecked by branch overhead and short basic blocks, those execution units sit idle. This architectural sh...</summary> </entry> <entry><title>"Hello, World!" 
in a Heterogeneous System</title><link href="https://hiraditya.github.io/posts/hello-world-in-a-heterogeneous-system/" rel="alternate" type="text/html" title="&quot;Hello, World!&quot; in a Heterogeneous System" /><published>2026-06-13T15:00:00+00:00</published> <updated>2026-06-16T14:25:05+00:00</updated> <id>https://hiraditya.github.io/posts/hello-world-in-a-heterogeneous-system/</id> <content type="text/html" src="https://hiraditya.github.io/posts/hello-world-in-a-heterogeneous-system/" /> <author> <name>Aditya Kumar</name> </author> <category term="Systems" /> <category term="Hardware" /> <summary>In a previous post, we explored the monumental software stack required to run a simple “Hello, World!” program on a modern operating system. But what happens when we apply these concepts to a heterogeneous system—where a host machine is solely responsible for launching the program on a completely different target architecture? Applying concepts like loaders, stack initialization, and ABI const...</summary> </entry> <entry><title>Hardening the ELF: Understanding RELRO and GOT Overwrites</title><link href="https://hiraditya.github.io/posts/hardening-the-elf-understanding-relro/" rel="alternate" type="text/html" title="Hardening the ELF: Understanding RELRO and GOT Overwrites" /><published>2026-06-12T15:00:00+00:00</published> <updated>2026-06-16T05:38:50+00:00</updated> <id>https://hiraditya.github.io/posts/hardening-the-elf-understanding-relro/</id> <content type="text/html" src="https://hiraditya.github.io/posts/hardening-the-elf-understanding-relro/" /> <author> <name>Aditya Kumar</name> </author> <category term="Security" /> <category term="Systems" /> <summary>In our previous post, we took a deep dive into the hidden complexities of the simplest C program. We discussed how modern Position Independent Executables (PIE) rely on the PLT (Procedure Linkage Table) and GOT (Global Offset Table) to dynamically resolve shared library functions like puts(). 
We noted that under “lazy binding”, the dynamic linker looks up the true memory address of puts on the...</summary> </entry> </feed>
