Developmental Science

The Research Behind the App

Why we built it the way we did

Every design decision in this app is grounded in peer-reviewed developmental science. This page explains the key findings — and links to the original papers so you can read them yourself.

Troseth & DeLoache (1998) — Child Development

Toddlers don't automatically learn from screens

This is the paper that started it all. Researchers showed two-year-olds a video of someone hiding a toy in the next room, then asked them to go find it. The toddlers couldn't do it — even though they had just watched exactly where it went. But when children watched through a window showing the identical scene, they found the toy almost every time.

Same information. Completely different result.

What this tells us: toddlers don't automatically treat what they see on a screen as real and reliable. The connection between flat surfaces and three-dimensional reality has to be built. That's what the calibration stages of this app are designed to do.

Troseth, Saylor & Archer (2006) — Child Development

But the right kind of interaction on screen can close that gap

Researchers gave two-year-olds just five minutes of live, back-and-forth video interaction with an adult who responded specifically to them: used their name, reacted to what they did, had a real conversation. After that experience, two-year-olds, who normally fail to use information from a screen, could use it.

The key wasn't just seeing a screen. It was experiencing a screen that genuinely responded to them.

This is why the early stages of this app use the front-facing camera rather than videos or images. A screen that responds to your child's specific movements is fundamentally different from a screen that plays content at them — and the research shows the difference matters enormously.

Miyazaki & Hiraki (2006) — Child Development

Timing matters more than you'd think

Researchers introduced a short delay, just two seconds, between a young child's movement and their live image on screen. That two-second lag was enough to keep many children from recognizing themselves.

Two seconds. That's it.

This finding shaped one of the most important technical decisions in building this app: the camera response has to be as close to instantaneous as possible. We engineered the app to respond in under 300 milliseconds, about the length of a blink, because the research suggests that a noticeable delay can break the very feedback loop the child needs to learn from.
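For readers curious how a response-time target like this can be checked, here is a minimal sketch. It is not the app's actual code: the function names and the idea of logging a capture time and a display time for each frame are assumptions made for illustration. Only the 300-millisecond figure comes from the text above.

```python
# Illustrative sketch only; not the app's implementation.
# Assumption: each camera frame is logged with two timestamps in seconds
# (for example from time.monotonic()): when it was captured and when it
# appeared on screen.

LATENCY_BUDGET_MS = 300.0  # the "under 300 milliseconds" target


def within_budget(capture_ts: float, display_ts: float,
                  budget_ms: float = LATENCY_BUDGET_MS) -> bool:
    """Return True if the capture-to-display delay is under the budget."""
    latency_ms = (display_ts - capture_ts) * 1000.0
    return latency_ms < budget_ms


def fraction_within_budget(frames: list[tuple[float, float]],
                           budget_ms: float = LATENCY_BUDGET_MS) -> float:
    """Share of (capture_ts, display_ts) pairs that meet the budget."""
    if not frames:
        return 0.0
    ok = sum(within_budget(c, d, budget_ms) for c, d in frames)
    return ok / len(frames)
```

Tracking the share of frames that meet the target, rather than only an average, makes occasional long stalls visible, and those are the moments most likely to interrupt the feedback loop.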

Bahrick & Watson (1985) — Developmental Psychology

Your child can sense contingency from the very beginning

Babies as young as five months can tell the difference between a video that responds to their movements and one that doesn't, and some three-month-olds already show signs of it. They aren't recognizing themselves yet, but they already notice when something moves because they moved.

That sensitivity is what the app's first stage taps into. Your child doesn't need to understand what a screen is for the early stages to work. They just need to notice that something responds when they do — and it turns out they're wired to notice that from very early on.

DeLoache (2000) — Child Development

The path from live video to photos has to be gradual

A photograph of a ball is not the same as a ball — and for a young toddler, that distinction is genuinely hard to bridge. Researcher Judy DeLoache spent decades studying how children learn to treat images as stand-ins for real things rather than as objects in their own right. Her finding: children have to hold both ideas at once — "this is a flat image AND it represents something real" — and that's a genuinely difficult mental move that takes time to develop.

This is why the app moves gradually from live camera through recorded video to still images. Each step removes one more cue linking the image to the real thing: first the live response, then the motion. The sequence isn't arbitrary. It follows the gradual path the research describes.

Witt, Cermak & Coster (1990) — American Journal of Occupational Therapy

Children can identify body parts on screen before they fully recognize themselves

Most children can reliably identify basic body parts — nose, eyes, ears, mouth — by 12 to 15 months. This matters for the app because it means body part mapping in live camera mode is a reasonable activity well before a child fully understands that the image on screen is them. The game works because the child already knows where their nose is — the camera just makes that knowledge visible in a new way.

Rochat & Morgan (1995) — Developmental Psychology

Self-awareness starts with sensing your own movement

Infants distinguish between a camera feed that responds to their movements and one that doesn't, and this sensitivity is one of the earliest forms of self-awareness we can detect. This research gives the app's approach a foundation in how babies first come to understand their own bodies as distinct from the world around them.

Meltzoff & Moore (1994) — Infant Behavior and Development

Babies arrive wired to connect action and image

Andrew Meltzoff's work on imitation and body awareness shows that infants have a surprisingly sophisticated sense of their own bodies from very early on, and that this early body awareness is connected to how they learn from watching others. The calibration sequence the app uses is consistent with this account of how self-awareness develops.

Sandhofer & Smith (1999) — Developmental Psychology

Learning a category means learning what it isn't, not just what it is

This is the research that changed how we designed the colors module. Researchers at Indiana University found that children don't learn color words one at a time — they learn them as a system. A child who only ever hears "red" doesn't really know what red means, because they have nothing to contrast it against. It's only when they encounter red next to blue, or yellow next to green, that the boundaries between categories become clear.

The implication for the app is significant. Simply labeling things ("that's red," "that's blue") is less effective than presenting colors in contrast with each other. The app is designed around this finding: when introducing a new color, we always show it alongside a clearly different color, so the child can see the boundary, not just the label. We apply the same principle to shapes, animals, and the other domains in the app.

Knowing what something is requires knowing what it isn't.
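As a rough illustration of the contrast rule, the sketch below picks a partner color for any color being introduced. It is not the app's actual method. The palette, the function names, and the choice of distance around the color wheel as the measure of "clearly different" are all assumptions made for this example.

```python
import colorsys

# Hypothetical palette for illustration: color name -> RGB values in 0..1.
PALETTE = {
    "red": (1.0, 0.0, 0.0),
    "orange": (1.0, 0.5, 0.0),
    "yellow": (1.0, 1.0, 0.0),
    "green": (0.0, 0.5, 0.0),
    "blue": (0.0, 0.0, 1.0),
    "purple": (0.5, 0.0, 0.5),
}


def hue(rgb: tuple[float, float, float]) -> float:
    """Position on the color wheel, as a fraction of a full turn (0..1)."""
    return colorsys.rgb_to_hsv(*rgb)[0]


def hue_distance(a: tuple[float, float, float],
                 b: tuple[float, float, float]) -> float:
    """Shortest distance around the color wheel between two colors (0..0.5)."""
    d = abs(hue(a) - hue(b))
    return min(d, 1.0 - d)


def contrast_partner(target: str,
                     palette: dict[str, tuple[float, float, float]] = PALETTE) -> str:
    """Pick the palette color farthest from `target` on the color wheel."""
    others = [name for name in palette if name != target]
    return max(others, key=lambda name: hue_distance(palette[target], palette[name]))


print(contrast_partner("yellow"))  # blue
```

With this palette, yellow is paired with blue rather than with orange, its nearest neighbor, so the two colors on screen sit far apart and the boundary between them is easy to see.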

Dehaene-Lambertz, Monzalvo & Dehaene (2018) — PLOS Biology

Letters aren't special — they're shapes the brain learns to recognize

Before a child can read, they have to learn to see. Not see in the general sense — they can do that from birth — but see letters specifically as distinct visual objects worth tracking and remembering.

Neuroscientist Stanislas Dehaene's research shows that the brain region that eventually specializes in reading — the visual word form area — doesn't start out specialized for letters at all. It's a general-purpose visual region that gets repurposed when a child starts learning to read. The brain essentially learns that these particular shapes — the curves, lines, and angles that make up letters — matter and deserve their own dedicated processing.

What this means for how we teach letters: before phonics can work well, children need to be able to visually discriminate letter forms from each other. A child who can reliably distinguish a shape that curves one way from one that curves another is laying the perceptual foundation that makes letter learning possible.

This is why the app introduces letters as visual objects — distinct shapes to recognize — before connecting them to sounds or names. The perceptual foundation has to be built first, and it's the same foundation used to learn any other visual category.

Massaro (2015) — Journal of Literacy Research

What actually makes reading aloud so powerful — and it's not the pictures

Parents are told to read to their children. But why does it work? Most assume it's something about the pictures — that seeing images while hearing words helps children connect language to the world.

The research tells a more surprising story. UC Santa Cruz psychologist Dominic Massaro analyzed the vocabulary in picture books and compared it to what parents actually say to their children in everyday conversation. Picture books, it turns out, contain two to three times as many rare and unusual words as normal parent-child conversation — and even more than most adult-to-adult conversation.

What drives the benefit of reading aloud is the language the parent uses while reading — the words on the page, spoken aloud, that a parent would rarely use in ordinary conversation. A child whose parent reads to them regularly is receiving a vocabulary lesson that conversation alone could rarely match.

This finding shapes how the app thinks about parent involvement. The parent's voice is the most powerful learning tool in the room. The app is designed to amplify that voice — to give it more to work with, at the right moment, in the right domain — not to replace it.

Want to go deeper?

The full framework behind this app — including why the three-dimensional world always comes first, and what it needs to look like before the screen becomes useful — is covered in the book Training Minds: What Building AI Models Taught Me About Teaching My Toddler and at the Training Minds Substack.