Vocabulary Field Guide — Learning Machines

§ 0 · The map

13 terms · 4 modalities · one method

The vocabulary clusters by modality — the same arc the camp runs: text, then images, then video, then the cross-cutting ideas that apply to all three. The color on each term below tells you which world it lives in.

Text · Session 1 · 3 terms

Token · Temperature · Greedy vs. sampled

Image · Session 2 · 4 terms

Default · Diffusion · CFG · Latent space

Video · Session 3 · 3 terms

Drift · Spatiotemporal · Coherence

Cross-cutting · 3 terms

Modality · Hallucination · Human-in-the-loop

§ A · Quick glossary

term · plain meaning · the tool that shows it

Term	Plain meaning	See it in
Token	A chunk of text a model processes — often not a whole word.	Tokenizer
Temperature	How much randomness enters sampling — low stays safe, high takes risks.	Tokenizer
Greedy vs. sampled	Greedy always takes the most likely next token; sampling picks the next token at random in proportion to its probability, often from only the top-k candidates.	Tokenizer
Default	What appears when the prompt doesn't specify — the model fills in the blanks.	Default Test
Diffusion	An iterative noise-to-image process, refined step by step.	Diffusion Viewer
CFG (guidance)	How hard generation is pushed to obey the prompt versus its own defaults.	Prompt Pressure (CFG scale)
Latent space	The compressed numeric space a model works in instead of raw pixels or words.	Latent Space Compressor
Drift	Unwanted change over time — the subject won't stay put.	Temporal Telephone
Spatiotemporal	Across both space and time — what a video must hold consistent frame to frame.	Metronome Scrubber
Coherence	Staying consistent across frames: identity, objects, camera, physics.	Video Viewer
Modality	A kind of medium a model works in — text, image, or video.	Tool index
Hallucination	A plausible-sounding output without reliable grounding.	Claim Checker
Human-in-the-loop	Human judgment before, during, and after generation.	Model Card
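
The CFG row above can be made concrete with a little arithmetic. This is a minimal sketch, assuming "CFG" here means classifier-free guidance as it is commonly used in diffusion image models: the model makes one prediction without the prompt and one with it, and the scale decides how far to push from the first toward (and past) the second. The function name and the two-number "predictions" are hypothetical stand-ins, not output from any tool in this guide.

```python
def apply_guidance(uncond, cond, scale):
    """Combine two predictions: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# Hypothetical predictions, reduced to two numbers for readability.
uncond = [0.25, 0.5]   # what the model would do with no prompt (its defaults)
cond = [0.75, 0.0]     # what the model would do with the prompt

for scale in (0.0, 1.0, 2.0):
    print(scale, apply_guidance(uncond, cond, scale))
```

A scale of 0 ignores the prompt, 1 follows the prompted prediction exactly, and larger values overshoot it, which is the "pushed harder to obey the prompt" behavior the table describes.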

§ B · See it, don't define it

each term as a small instrument

A definition tells you what a word means; a picture shows you the mechanism. Here are the terms that are easiest to misread, each rendered as the thing the tool actually does.

[Figure: sample text "learning machines"]

Text · token

Token

Text is broken into chunks the model can count and predict — and the chunks may not match words. Try your own in the Tokenizer.
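
To see why chunks may not match words, here is a minimal sketch of a greedy longest-match splitter. The vocabulary is hypothetical and tiny; real tokenizers learn much larger vocabularies from data and use more involved rules, so treat this only as an illustration of the idea.

```python
# Hypothetical toy vocabulary, chosen only for illustration.
VOCAB = {"learn", "ing", "machine", "s", " "}

def tokenize(text, vocab=VOCAB):
    """Split text by always taking the longest vocabulary piece that fits."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, then shrink.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing matched: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens

tokens = tokenize("learning machines")
print(tokens)       # ['learn', 'ing', ' ', 'machine', 's']
print(len(tokens))  # 5 tokens for 2 words
```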

[Figure: four bars at 88%, 54%, 28%, and 12%]

Text · temperature

Temperature

Low temperature favors the top token; raise it and less-likely continuations enter the sample. Watch the bars flatten in the Tokenizer.
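
The flattening can be reproduced in a few lines. This is a minimal sketch, assuming the usual approach of dividing the model's raw scores by the temperature before turning them into probabilities (a softmax); the scores and function names are hypothetical. It also contrasts the greedy pick with a sampled pick from the glossary.

```python
import math
import random

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(probs):
    """Greedy: always return the index of the most likely token."""
    return max(range(len(probs)), key=probs.__getitem__)

def sample_pick(probs, rng):
    """Sampled: draw an index at random, weighted by probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

scores = [4.0, 3.0, 2.0, 1.0]  # hypothetical scores for four candidate tokens
cold = softmax_with_temperature(scores, 0.5)
hot = softmax_with_temperature(scores, 2.0)

print([round(p, 3) for p in cold])  # top token dominates
print([round(p, 3) for p in hot])   # bars are flatter

rng = random.Random(0)
print(greedy_pick(hot))                          # always the top token
print([sample_pick(hot, rng) for _ in range(10)])  # varies from draw to draw
```

Greedy picking ignores temperature entirely because the ranking of the tokens never changes; only sampling feels the difference.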

Image · default

Default

When a prompt is vague, the system invents bodies, settings, and roles. Compare those choices in the Default Test.

Image · diffusion

Diffusion

The image is refined step by step from noise toward a prompt-guided result. Pause each stage in the Diffusion Viewer.
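
The step-by-step shape of the process can be sketched with a toy loop. This is not a diffusion model: a real one uses a trained network to estimate what to remove at each step, while this sketch cheats by knowing the target in advance. The "image" is a hypothetical list of four numbers, and all names are illustrative.

```python
import random

def toy_refine(target, steps, rng):
    """Start from random noise and close part of the gap to target each step."""
    image = [rng.gauss(0.0, 1.0) for _ in target]
    history = [list(image)]
    for step in range(steps):
        remaining = steps - step
        # Close 1/remaining of the gap, so the final step reaches the target.
        image = [x + (t - x) / remaining for x, t in zip(image, target)]
        history.append(list(image))
    return history

def distance(image, target):
    """Total absolute difference between the current image and the target."""
    return sum(abs(x - t) for x, t in zip(image, target))

target = [0.0, 0.5, 1.0, 0.5]  # hypothetical 4-pixel "image"
history = toy_refine(target, steps=5, rng=random.Random(1))
errors = [distance(img, target) for img in history]

for step, err in enumerate(errors):
    print(step, round(err, 4))  # the error shrinks at every stage
```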

Video · drift

Drift

The subject keeps changing when it should stay stable. Temporal tools make it visible frame by frame — start with Temporal Telephone.
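
One way to make drift measurable is to compare every frame with the first. This is a minimal sketch, assuming each frame has already been reduced to a short list of numbers describing the subject; those numbers and the function name are hypothetical, and the tools in this guide may measure drift differently.

```python
def drift_from_first(frames):
    """Average absolute change of each frame relative to frame 0."""
    first = frames[0]
    return [
        sum(abs(a - b) for a, b in zip(frame, first)) / len(first)
        for frame in frames
    ]

# Hypothetical per-frame descriptions of the same subject.
frames = [
    [0.5, 0.5, 0.5],
    [0.5, 0.5, 0.75],
    [0.5, 0.75, 0.75],
    [1.0, 0.75, 0.75],
]

drift = drift_from_first(frames)
for index, value in enumerate(drift):
    print(index, round(value, 3))  # grows as the subject changes
```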

Video · coherence

Coherence

Frames belong together: identity, objects, camera, and physics stay consistent. Inspect it with the Frame-by-Frame Viewer.

Sounds confident: 94%
"According to Rivera (2021)…"

Evidence: source not found

Cross · hallucination

Hallucination

A response can sound fluent and specific while lacking grounding. Treat it as a claim to verify in the Confidence Is Not Truth Explorer.
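
The gap between sounding confident and being grounded can be sketched as a simple check. This is an illustrative assumption about how such a review might be recorded, not the logic of any tool named here; the source list and field names are hypothetical.

```python
# Hypothetical set of sources a human has actually located and read.
VERIFIED_SOURCES = {"Source A", "Source B"}

def review_claim(stated_confidence, cited_source, verified=VERIFIED_SOURCES):
    """Record how confident a claim sounds separately from whether it is grounded."""
    found = cited_source in verified
    return {
        "sounds_confident": stated_confidence,
        "evidence": "source found" if found else "source not found",
        "next_step": "keep" if found else "revise or refuse",
    }

result = review_claim(0.94, "Rivera (2021)")
print(result)
```

The confidence value never enters the decision: only the evidence check does.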

predict · generate or inspect · verify evidence · revise / refuse

Cross · human-in-the-loop

Human-in-the-loop

Human judgment belongs before, during, and after generation — setting purpose, checking evidence, naming limits. Document it with the Model Card.