<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>nlp on David An</title>
    <link>https://davidan.dev/tags/nlp/</link>
    <description>Recent content in nlp on David An</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Wed, 01 Oct 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://davidan.dev/tags/nlp/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Tokenization and Embeddings: A Primer</title>
      <link>https://davidan.dev/posts/tokenization/</link>
      <pubDate>Wed, 01 Oct 2025 00:00:00 +0000</pubDate>
      
      <guid>https://davidan.dev/posts/tokenization/</guid>
      <description>Lately, all we have heard about is tokenization and embeddings and the role they play in the greater LLM and AI ecosystem. These two concepts are among the most fundamental in language modeling and remain the foundation of the technology we interact with on a daily basis. In this article, we will cover some of the basics of tokenizing and embedding sequences of text, along with their nuances.</description>
    </item>
    
    <item>
      <title>Building an NLP-Powered Repository for Cyber Risk Literature</title>
      <link>https://davidan.dev/research/nlpsearch/</link>
      <pubDate>Fri, 13 May 2022 00:00:00 +0000</pubDate>
      
      <guid>https://davidan.dev/research/nlpsearch/</guid>
      <description>Building an NLP-Powered Repository for Cyber Risk Literature [Poster] David An, Linfeng Zheng, Zhiyu (Frank) Quan
Abstract: With the large and growing body of cyber risk literature, we see three major challenges faced by the actuarial research community: there is no context-aware tool for finding cyber literature, no central repository of cyber risk resources, and no accounting of trends in the literature. To address these challenges, we propose to build a repository of cyber risk articles with an NLP-powered search tool that researchers can easily use to find relevant materials.</description>
    </item>
    
    <item>
      <title>Fake News Detection Using NLP (FaDe-Net)</title>
      <link>https://davidan.dev/research/fadenet/</link>
      <pubDate>Wed, 12 May 2021 00:00:00 +0000</pubDate>
      
      <guid>https://davidan.dev/research/fadenet/</guid>
      <description>FaDe-Net [Writeup] David An - AP Research Project
Abstract: The rapid development of social media and online news outlets has accelerated the spread of fake news across the internet. The accessibility and convenience of social media have further driven a drastic change in how information is consumed. As a consequence, fake news has become a significant concern because of 1) its inevitable exposure to large populations and 2) its potential to cause significant damage in modern society.</description>
    </item>
    
  </channel>
</rss>
