Bridge 2 Before Session 2: Images

Default Is a Design Decision

When you leave something out of a prompt, the model still has to make it specific. Age, gender, skin tone, setting, style — all of it gets filled in. That filling-in is not random and it is not neutral. It reflects what appeared most often in training data and what the people who built the tool decided to preserve, filter, or amplify.

Pick a prompt to investigate its defaults
Generate an image of a doctor.
Body

Often: white or light-skinned, male, 35–55, standing, confident posture. Rarely: older, disabled, non-Western features.

Setting

Sterile hospital corridor or exam room. Western clinical architecture — not a clinic, field hospital, or home visit.

Role

Authority figure at the center. Patient absent or backgrounded. No one else appears unless specified.

Objects

Stethoscope around the neck. White coat. Clipboard or tablet. These props are treated as definitional.

Style

Realistic, professional lighting. Aspirational polish — the visual language of a hospital brochure.

Action

Examining, reviewing a chart, or explaining. Labor of expertise. Never administrative, emotional, or caregiving.

What this reflects: Doctors in English-language media, stock photography, and medical textbooks have historically skewed toward this profile. The model learned what "doctor" looks like from that corpus — not from who doctors actually are globally.
Generate an image of a CEO.
Body

Often: white, male, 45–60, formally dressed, tall stance. Power-coded appearance. Rarely: women, people of color, younger adults.

Setting

Corner office, boardroom, or branded stage. City skyline visible through floor-to-ceiling windows. No other workplaces.

Role

Centered, alone or with subordinates looking toward them. Sole decision-maker. Surveying from above.

Objects

Dark suit. Tie or watch. Podium or conference table. These signal "executive" in the visual language of business media.

Style

High-contrast, confident lighting. Clean backgrounds. The visual register of Forbes covers and LinkedIn profile photos.

Action

Speaking, presenting, pointing at a screen, or arms crossed. Projecting leadership. Never listening, collaborating, or learning.

What this reflects: Business media, corporate stock photography, and leadership profiles have long over-represented a specific demographic. Models trained on that corpus inherit a narrow visual definition of "executive" — one that excludes most of the world's actual leaders.
Generate an image of a criminal.
Body

Research shows these prompts produce people with darker skin tones at rates higher than their share of the population. Age skews young; gender skews male. Of the four prompts here, this is the most direct evidence of racial bias.

Setting

Urban street, alley, or low-light environment. Coded as dangerous and tied to particular kinds of neighborhoods.

Role

Dehumanized, threatening, alone. No context, no story, no complexity. Guilt is assumed, not depicted.

Objects

Weapons, masks, or hoodie. Props that signal threat rather than depicting any actual act or legal status.

Style

High contrast, dramatic shadow, menacing framing. The visual language of crime reporting — not neutral documentation.

Action

Lurking, running, or confronting. Never shown in a legal context, such as being arrested, acquitted, or incarcerated. Just threatening.

Why this matters most: Unlike "doctor" defaults, which are limiting, these defaults cause direct harm. Images like these reinforce stereotypes that affect how people are policed, judged, and feared. The model learned from a corpus that already over-represented certain groups in crime coverage. Generating more of that is not neutral.
Generate an image of a family.
Body

Often: two adults of different genders, multiple children, all of similar race. Nuclear structure treated as default. Multigenerational, same-sex, or single-parent families rarely appear.

Setting

Suburban home with yard or indoor living room. Vacation or leisure location. Not urban apartments, rural homes, or informal settings.

Role

Clearly gendered adult roles. Mother often cooking or caregiving. Father often standing and surveying. Children playing.

Objects

House, car, toys, dining table. Consumer goods that signal middle-class stability. Working-class or low-income home contexts excluded.

Style

Warm, bright, aspirational lifestyle photography. Soft filters. The visual language of advertising — not documentary life.

Action

Cooking together, playing, hugging, or gathered for a meal. Scenes of harmony and abundance. No stress, conflict, or ordinary monotony.

What this reflects: "Family" in English-language media, advertising, and stock photography has long meant one specific structure, class, and presentation. The model learned that definition and reproduces it. Most of the world's families don't look like this default.
Key line: "A default is not neutral. It is a choice made by whoever assembled the training data."
Training data

Images, captions, and labels teach the system what usually goes with a prompt. Overrepresented groups stay overrepresented.

Social history

Media, stock photography, textbooks, and journalism all had their own defaults before AI existed. Models inherit them.

Tool design

Safety filters, style presets, and ranking systems make additional choices about what gets surfaced first.

Prompt ambiguity

Vague prompts hand maximum control to the system. Specificity can shift defaults — but doesn't make them disappear.
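
One way to see how much a prompt leaves open is to check it against the six dimensions used in the prompt breakdowns above (body, setting, role, objects, style, action). The sketch below is illustrative only: `DIMENSIONS` and `build_prompt` are hypothetical names invented for this example, not part of any image tool. It assumes a prompt can be described as a subject plus optional details per dimension.

```python
# The six dimensions used in the prompt breakdowns above.
DIMENSIONS = ["body", "setting", "role", "objects", "style", "action"]


def build_prompt(subject, specified):
    """Return (prompt_text, dimensions_left_to_the_model).

    `specified` maps a dimension name to the wording you chose for it.
    Any dimension missing from `specified` is one the model fills in.
    """
    unknown = set(specified) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")

    # Keep details in a fixed order so prompts are comparable.
    details = [specified[d] for d in DIMENSIONS if d in specified]
    prompt = f"Generate an image of {subject}"
    if details:
        prompt += ", " + ", ".join(details)
    prompt += "."

    left_open = [d for d in DIMENSIONS if d not in specified]
    return prompt, left_open


vague_prompt, vague_open = build_prompt("a doctor", {})
specific_prompt, specific_open = build_prompt(
    "a doctor",
    {"setting": "in a rural clinic", "action": "listening to a patient"},
)

print(vague_prompt, "->", vague_open)
print(specific_prompt, "->", specific_open)
```

Even the more specific prompt still leaves body, role, objects, and style to the system, which is the point: specificity narrows the defaults without removing them.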

Defaults are not always wrong. Sometimes the most common pattern is accurate or harmless. The distinction that matters: when a default excludes, demeans, or reinforces a stereotype — especially for already-marginalized groups — it stops being a statistical pattern and becomes a systemic harm. The "criminal" prompt is not the same kind of problem as the "doctor" prompt.

Now open the tools

In the Diffusion Step-Through Viewer, watch structure emerge from noise — every frame is the model making a choice. In the Image Default Test Board, document what appeared in your prompts and build an evidence-based claim about what the model treated as normal.
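
If you want to turn your notes into counts before making a claim, a simple tally is enough. The sketch below is a minimal example, not how the Image Default Test Board works internally: the function names are hypothetical, and the `observations` list is placeholder data standing in for what you write down yourself after generating several images from the same prompt.

```python
from collections import Counter


def tally_defaults(observations):
    """Count how often each value appears per dimension across images."""
    tallies = {}
    for obs in observations:
        for dimension, value in obs.items():
            tallies.setdefault(dimension, Counter())[value] += 1
    return tallies


def share(tallies, dimension, value, total):
    """Fraction of images in which `value` appeared for `dimension`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return tallies.get(dimension, Counter())[value] / total


# Placeholder notes (not real results): one dict per generated image.
observations = [
    {"setting": "hospital corridor", "objects": "white coat"},
    {"setting": "exam room", "objects": "white coat"},
    {"setting": "hospital corridor", "objects": "stethoscope"},
    {"setting": "hospital corridor", "objects": "white coat"},
]

tallies = tally_defaults(observations)
corridor_share = share(tallies, "setting", "hospital corridor", len(observations))

print(tallies["setting"].most_common())
print(f"hospital corridor: {corridor_share:.0%} of {len(observations)} images")
```

A claim built this way names the prompt, the number of images, and the share in which a feature appeared. With only a handful of images, treat the share as a description of what you saw, not a measurement of the model.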