GPU is faster than CPU, right?!
Overview This post is a brief experiment comparing the performance of similar code on CPU and GPU. I ran my experiments in the context of ML on an Nvidia A100 GPU using PyTor...
Overview Historically, a Central Processing Unit (CPU) is a general-purpose processor designed to handle a wide variety of tasks, optimized for sequential processing and overall system management. O...
Overview The 2017 paper Attention Is All You Need had a significant impact on NLP and deep learning and paved the way for later breakthroughs such as BERT and GPT-3. The model of 201...
Issue with LSTM encoder-decoder LSTMs were a good step forward for machine translation, but they had their shortcomings. If you just want to translate a sentence or two, they perform well, but say y...
Overview In the previous parts, we covered how to tokenize the data, create word embeddings for training a large language model (LLM) & the fundamentals of RNNs. Now let’s look into the model ...
Overview In the previous parts, we covered how to tokenize the data & create word embeddings for training a large language model (LLM). Now let’s look into the model architecture itself. In...
Overview In the previous part, we covered how to tokenize the data for training a language model for next-word prediction. Tokenization was the first step; now let’s look into word embeddings. ...
Language Model One of the earliest and most popular applications of NLP was next-word prediction. All of us have probably used this capability on our smartphone’s keyboard! Next-word predictio...
What’s in this LLM Series? The purpose of this series is twofold: to serve as a template for people who are on the LLM learning journey, and to capture my learnings and serve as notes for me to go back...
Reference: Fluent Python, Chapter 5. Overview Python offers a few ways to build a simple class that is just a collection of fields, with little or no extra functionality. That pattern is known as a...