<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Technical Deep Dives on David An</title>
    <link>https://davidan.dev/categories/technical-deep-dives/</link>
    <description>Recent content in Technical Deep Dives on David An</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Sat, 01 Nov 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://davidan.dev/categories/technical-deep-dives/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Distributed Inference for Fun and Profit</title>
      <link>https://davidan.dev/posts/dif/</link>
      <pubDate>Sat, 01 Nov 2025 00:00:00 +0000</pubDate>
      
      <guid>https://davidan.dev/posts/dif/</guid>
      <description>Ever wonder how large models are served at scale, or how a query actually becomes an answer? Over the course of this article, we will look at several approaches to inference and explore their tradeoffs from a technical perspective.
We assume that the reader has basic knowledge of ML concepts and how Transformers work. Additionally, all of the work here is done on a single Nvidia RTX 3090 GPU with the respective drivers installed (nvidia-smi, nvidia-ctk, etc.</description>
    </item>
    
    <item>
      <title>A Dive into GPU Math</title>
      <link>https://davidan.dev/posts/gpumath/</link>
      <pubDate>Wed, 15 Oct 2025 00:00:00 +0000</pubDate>
      
      <guid>https://davidan.dev/posts/gpumath/</guid>
      <description>Ever wonder what goes on when you ask ChatGPT a question and how the answer is served? Or what people mean when they say they used an A100 to train a model, and how long it took? Or how many levels of abstraction sit between the model and the hardware? This article aims to shed light on many of the concepts related to GPUs and the math behind them.
We assume that the reader has a basic understanding of how recent LLM technologies work.</description>
    </item>
    
    <item>
      <title>A Discussion on Pandas and Data Mining</title>
      <link>https://davidan.dev/posts/datamining/</link>
      <pubDate>Tue, 17 Jan 2023 00:00:00 +0000</pubDate>
      
      <guid>https://davidan.dev/posts/datamining/</guid>
      <description>A typical data analytics or data science project begins by using Pandas to load the data into an object called a DataFrame and perform preprocessing tasks on it. While the Pandas library is convenient and comes with a trove of useful analytic tools, it does have some inefficiencies, many due to the nature of Python. To investigate this, let's look at how Pandas actually works on top of Python.</description>
    </item>
    
  </channel>
</rss>
