machine-learning on David An

https://davidan.dev/tags/machine-learning/
Recent content in machine-learning on David An

Fine Tuning Llama 3.2B with Unsloth
https://davidan.dev/posts/ftsql/
Mon, 12 Jan 2026 00:00:00 +0000

In this article, we will fine-tune the Llama 3.2B model with Unsloth on the Spider 1.0 SQL dataset. The goal is to improve the SQL capabilities of a general Llama 3.2B model.

Prerequisites

Before we get started, we assume that the reader has access to a GPU that they can use for training. Additionally, we assume that the reader has a Python setup.

Local-first GPU Cluster with nvkind and Time Splitting
https://davidan.dev/posts/nvkind/
Sun, 14 Dec 2025 00:00:00 +0000

You have a shiny new GPU and want to start experimenting with it by running some sample experiments in Kubernetes, but how would you start? In this short tutorial, we go over how to use nvkind and the gpu-operator to start running some basic experiments on your new GPU. We assume that the reader already has Docker, golang, and the relevant drivers and tools (nvidia-ctk, nvidia-smi, etc.) installed.

AI's Second Act: Predictions for the Future of AI
https://davidan.dev/posts/secondact/
Sun, 23 Nov 2025 00:00:00 +0000

I want to preface this by saying that any opinions expressed here are my own and not representative of any organization or my employers. In late 2022, ChatGPT was released to the world, fundamentally changing how technology is created and how we think about businesses. What once took weeks to build now takes hours or even minutes with a simple prompt and a few clicks. Tools like Cursor and Codex brought AI to our fingertips, allowing us to seamlessly interact with AI on a daily basis.

Distributed Inference for Fun and Profit
https://davidan.dev/posts/dif/
Sat, 01 Nov 2025 00:00:00 +0000

Ever wonder how large models are served at scale? Or how a query actually becomes an answer? Over the course of this article, we will look at several approaches to inference and explore their tradeoffs from a technical perspective. We assume that the reader has basic knowledge of ML concepts and how Transformers work. Additionally, all of the work here is done on a single NVIDIA RTX 3090 GPU with the respective drivers installed (nvidia-smi, nvidia-ctk, etc.

A Dive into GPU Math
https://davidan.dev/posts/gpumath/
Wed, 15 Oct 2025 00:00:00 +0000

Ever wonder what goes on when you ask ChatGPT a question and how that request is served? Or what people mean when they talk about using an A100 to train a model and the time it takes? Or how many levels of abstraction sit between the model and the hardware? This article aims to shed light on many of the concepts related to GPUs and the math behind them. We assume that the reader has a basic understanding of how recent LLM technologies work.

Tokenization and Embeddings: A Primer
https://davidan.dev/posts/tokenization/
Wed, 01 Oct 2025 00:00:00 +0000

Lately, all we have heard about is tokenization and embeddings and the role they play in the greater LLM and AI ecosystem. These two concepts are among the most fundamental in language modeling and remain the foundation of the technology we interact with on a daily basis. In this article, we will cover some of the basics of tokenizing and embedding sequences of text, along with their nuances.

Building an NLP-Powered Repository for Cyber Risk Literature
https://davidan.dev/research/nlpsearch/
Fri, 13 May 2022 00:00:00 +0000

[Poster] David An, Linfeng Zheng, Zhiyu (Frank) Quan

Abstract

With the large and growing body of cyber risk literature, we see three major challenges faced by the actuarial research community: there is no context-aware tool for finding cyber literature, no central repository of cyber risk resources, and no accounting of literature trends. To address these challenges, we propose to build a repository of cyber risk articles with an NLP-powered search tool that researchers can easily use to find relevant materials.