Introduction

LLM stands for Large Language Model.

These are machine learning models trained on massive amounts of text data to understand and generate human-like language. LLMs sit within the field of natural language processing (NLP) and are built on deep learning architectures, most commonly transformers (GPT, BERT, T5, etc.).

Key Features of LLMs:

  • Trained on huge datasets: Often comprising billions of words or more.
  • Understand context: They can handle long-range dependencies in language.
  • Generate text: Can write articles, summarize, translate, answer questions, code, and more.
  • Examples: OpenAI’s GPT-4, Google’s PaLM, Meta’s LLaMA, Anthropic’s Claude, etc.

What is a Token?

A token is the basic unit of text that an LLM processes. A token can be a whole word, part of a word, or even a single character.

  • 1,000 tokens ≈ 750 words in English (on average, one token is about 3/4 of a word, or roughly 4 characters).

Example:

  • “Artificial Intelligence is amazing!” → roughly 5 tokens
  • “AI is great!” → roughly 4 tokens (exact counts vary by tokenizer)

The number of tokens an LLM can handle determines how much information it can process at once.
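
To make this concrete, here is a minimal sketch that counts tokens with OpenAI's tiktoken library. This choice is an assumption for illustration: the examples above don't name a tokenizer, and each model family (GPT, Gemma, LLaMA, ...) splits text a little differently, so your exact counts may vary.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the tokenizer used by GPT-4 / GPT-3.5-turbo; other
# model families use their own tokenizers, so counts are approximate.
enc = tiktoken.get_encoding("cl100k_base")

for text in ["Artificial Intelligence is amazing!", "AI is great!"]:
    token_ids = enc.encode(text)                   # text -> list of token IDs
    pieces = [enc.decode([t]) for t in token_ids]  # each ID back to its text
    print(f"{text!r} -> {len(token_ids)} tokens: {pieces}")
```

Running this prints each sentence's token count along with the actual pieces the tokenizer produced, which is a quick way to build intuition for how words get split.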

What is a Context Window?

A context window is the maximum number of tokens an AI model can process at once, typically counting both the input prompt and the generated output.

  • Gemma 3 supports a 128K-token context window, meaning it can take in very long documents in a single pass.

Why is this important?

  • Larger context windows allow AI to maintain better memory and coherence across longer inputs.
  • For tasks like PDF summarization, a large context window helps retain key details without losing meaning.

Example:

  • A 2K-token model runs out of room on long papers, so the beginning of the document gets truncated and effectively forgotten.
  • A 128K-token model (like Gemma 3) can process entire chapters or research papers in one go!
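
As a rough illustration of why window size matters, the sketch below checks whether a document fits in a given context window before sending it to a model. The specific limits and the use of tiktoken as a stand-in tokenizer are assumptions; substitute the tokenizer and limit of whatever model you actually call.

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # stand-in tokenizer (assumption)

def fits_in_context(text: str, limit: int) -> bool:
    """Return True if `text` fits within a context window of `limit` tokens."""
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens:,} tokens vs. a {limit:,}-token window")
    return n_tokens <= limit

# Simulate a long research paper (~50K words of filler text).
paper = "word " * 50_000

fits_in_context(paper, limit=2_000)    # a 2K-token model: the paper does not fit
fits_in_context(paper, limit=128_000)  # a 128K-token model (e.g. Gemma 3): it fits
```

In practice, a document that does not fit has to be chunked or summarized in stages, which is exactly the extra work a larger context window saves you.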