Introduction

LLM stands for Large Language Model.

These are machine learning models trained on massive amounts of text data to understand and generate human-like language. LLMs sit within the field of natural language processing (NLP) and are built on deep learning architectures, most commonly transformers (GPT, BERT, T5, etc.).

Key Features of LLMs:

  • Trained on huge datasets: Often comprising billions of words or more.
  • Understand context: They can handle long-range dependencies in language.
  • Generate text: Can write articles, summarize, translate, answer questions, code, and more.
  • Examples: OpenAI’s GPT-4, Google’s PaLM, Meta’s LLaMA, Anthropic’s Claude, etc.

What is a Token?

A token is the basic unit of text that an LLM processes. A token can be a whole word, part of a word, or even a single character.

  • 1,000 tokens ≈ 750 words in English (on average, one token is about 3/4 of a word, or roughly 4 characters).

Example:

  • “Artificial Intelligence is amazing!” → roughly 5 tokens
  • “AI is great!” → roughly 4 tokens (exact counts vary by tokenizer)

The number of tokens an LLM can handle determines how much information it can process at once.
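
To make this concrete, here is a minimal sketch that counts tokens with OpenAI's tiktoken library. This choice is an assumption for illustration: the examples above don't name a tokenizer, and each model family (GPT, Gemma, LLaMA, ...) splits text a little differently, so your exact counts may vary.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the tokenizer used by GPT-4 / GPT-3.5-turbo; other
# model families use their own tokenizers, so counts are approximate.
enc = tiktoken.get_encoding("cl100k_base")

for text in ["Artificial Intelligence is amazing!", "AI is great!"]:
    token_ids = enc.encode(text)                   # text -> list of token IDs
    pieces = [enc.decode([t]) for t in token_ids]  # each ID back to its text
    print(f"{text!r} -> {len(token_ids)} tokens: {pieces}")
```

Running this prints each sentence's token count along with the actual pieces the tokenizer produced, which is a quick way to build intuition for how words get split.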

What is a Context Window?

A context window is the maximum number of tokens an AI model can process at once, typically counting both the input prompt and the generated output.

  • Gemma 3 supports a 128K-token context window, meaning it can take in very long documents in a single pass.

Why is this important?

  • Larger context windows allow AI to maintain better memory and coherence across longer inputs.
  • For tasks like PDF summarization, a large context window helps retain key details without losing meaning.

Example:

  • A 2K-token model runs out of room on long papers, so the beginning of the document gets truncated and effectively forgotten.
  • A 128K-token model (like Gemma 3) can process entire chapters or research papers in one go!
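
As a rough illustration of why window size matters, the sketch below checks whether a document fits in a given context window before sending it to a model. The specific limits and the use of tiktoken as a stand-in tokenizer are assumptions; substitute the tokenizer and limit of whatever model you actually call.

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # stand-in tokenizer (assumption)

def fits_in_context(text: str, limit: int) -> bool:
    """Return True if `text` fits within a context window of `limit` tokens."""
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens:,} tokens vs. a {limit:,}-token window")
    return n_tokens <= limit

# Simulate a long research paper (~50K words of filler text).
paper = "word " * 50_000

fits_in_context(paper, limit=2_000)    # a 2K-token model: the paper does not fit
fits_in_context(paper, limit=128_000)  # a 128K-token model (e.g. Gemma 3): it fits
```

In practice, a document that does not fit has to be chunked or summarized in stages, which is exactly the extra work a larger context window saves you.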