Scholarly Communications

Talking Tokens in Artificial Intelligence

by Aric Ahrens on 2025-11-21T07:28:04-06:00 in Research, Scholarly Communication | 0 Comments

In AI, especially in natural language processing (NLP), a token is a unit of text that the model processes. Tokens are the building blocks of input and output for language models.

So, what exactly is a token?

A token can be:

  • A word (e.g., "hello")
  • A subword (e.g., "un", "break", "able" from "unbreakable")
  • A character (in some models)
  • Or even punctuation (e.g., ".", ",")

Most modern models, like those based on the Transformer architecture, use subword tokenization. This helps handle rare or unknown words more efficiently.
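To see how subword tokenization might work, here is a minimal sketch using greedy longest-match lookup against a tiny hand-picked vocabulary. (Real systems such as BPE or WordPiece learn their vocabularies from data; the vocabulary below is purely illustrative.)

```python
# Toy subword tokenizer: greedily match the longest vocabulary entry.
# VOCAB is a hypothetical, hand-picked vocabulary for illustration only.
VOCAB = {"un", "break", "able", "hello", "ing"}

def subword_tokenize(word):
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest substring starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary match: fall back to a single character.
            tokens.append(word[i])
            i += 1
    return tokens

print(subword_tokenize("unbreakable"))  # → ['un', 'break', 'able']
```

This is why a model can handle a rare word like "unbreakable" even if that exact word never appeared in training: the familiar pieces "un", "break", and "able" did.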

Tokens are how the model understands and generates language. For example:

  • The sentence "I love pizza!" might be split into tokens like:
    ["I", "love", "pizza", "!"]
    
    Or, in subword form:
    ["I", "love", "piz", "za", "!"]
    
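The word-level split above can be approximated with a short regular expression that separates runs of word characters from punctuation (a simplification of what real tokenizers do, but enough to show the idea):

```python
import re

# Split text into word tokens and individual punctuation tokens.
def word_tokenize(text):
    return re.findall(r"\w+|[^\w\s]", text)

print(word_tokenize("I love pizza!"))  # → ['I', 'love', 'pizza', '!']
```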

Each token is converted into a numerical representation (embedding) that the model can process.
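That conversion is a two-step lookup: token → integer ID, then ID → vector. Here is a sketch of it (the vocabulary, IDs, and 4-dimensional random vectors are invented for illustration; real models learn their embeddings during training and use hundreds or thousands of dimensions):

```python
import random

# Hypothetical vocabulary mapping each token to an integer ID.
vocab = {"I": 0, "love": 1, "pizza": 2, "!": 3}

# Embedding table: one 4-dimensional vector per vocabulary entry.
# Random here; a real model learns these values.
random.seed(0)
embeddings = [[random.uniform(-1, 1) for _ in range(4)] for _ in vocab]

tokens = ["I", "love", "pizza", "!"]
ids = [vocab[t] for t in tokens]        # tokens → integer IDs
vectors = [embeddings[i] for i in ids]  # IDs → embedding vectors

print(ids)  # → [0, 1, 2, 3]
```

The model itself never sees the text "pizza"; it sees the vector stored at that token's row of the embedding table.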

In short:

  • A token is a chunk of text (word, subword, or character).
  • Models process text as sequences of tokens.
  • Tokenization helps models handle language efficiently and flexibly.
