What builds fluency

Nobody can tell you how long it takes

Every timeline you've ever seen is made up. The interesting question is why.

Ahha · February 27, 2026 · 8 min read

You type "how long to learn Thai" into Google. The first result says 44 weeks. A language app says 6 months. A forum post says 2 years if you're serious, 5 if you're not. The numbers are specific enough to feel like data. They carry the confidence of measurement.

None of them are based on anything that applies to you.

This is the most common question in language learning, and it has a structural problem that makes it unanswerable. Seeing why is more useful than any of the numbers.

Where the numbers come from

Most timeline claims trace back to a single source: the Foreign Service Institute, the US government's language training program for diplomats. Since the 1970s, the FSI has published difficulty categories based on how long it takes their students to reach professional working proficiency. Category I languages like Spanish and French take around 600 to 750 class hours. Category IV languages like Arabic, Mandarin, and Japanese take around 2,200.

These are real numbers from a real program. The problem is who the students are. FSI trainees are adults selected for above-average language aptitude, studying full-time in intensive classroom settings with small groups and expert instructors, six to eight hours a day, five days a week. They're motivated by career requirements and supported by one of the most effective language programs ever built.

Applying these numbers to someone studying on their phone for twenty minutes a day is like using a restaurant kitchen's prep time to estimate how long dinner takes a home cook. The measurement is real, but it describes a different population doing a different thing.
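To see the size of the mismatch, here is a minimal sketch of the naive arithmetic people do with these figures. The hour counts are the ones quoted above. The twenty-minutes-a-day schedule and the function name naive_years are illustrative assumptions, and the calculation deliberately makes the mistake just described: it treats every phone minute as equal to an FSI classroom minute.

```python
# Naive conversion of FSI class-hour figures into calendar time.
# Hour figures are the ones quoted in the article; the schedule is illustrative.

FSI_CLASS_HOURS = {
    "Category I, low end (e.g. Spanish, French)": 600,
    "Category I, high end (e.g. Spanish, French)": 750,
    "Category IV (e.g. Arabic, Mandarin, Japanese)": 2200,
}


def naive_years(class_hours: float, minutes_per_day: float) -> float:
    """Years needed IF every study minute counted like an FSI class minute.

    That 'if' is the assumption the article argues against.
    """
    hours_per_day = minutes_per_day / 60
    days = class_hours / hours_per_day
    return days / 365


for label, hours in FSI_CLASS_HOURS.items():
    print(f"{label}: {naive_years(hours, 20):.1f} years at 20 min/day")
```

The conversion gives roughly five years for the easiest category and about eighteen for the hardest. Those figures look precise, but they inherit every assumption of the FSI setting without any of its conditions.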

Why prediction genuinely fails

The deeper problem isn't that the FSI data is narrow. It's that language learning timelines resist prediction even in principle.

Consider what would go into an honest estimate for a single learner. Your native language and how much structural overlap it has with the target. How quickly your ear adapts to unfamiliar sound contrasts. Whether incomplete understanding keeps you engaged or shuts you down. The quality of your input: whether you're getting material at the right difficulty or just logging hours of noise. Whether you live somewhere the language is spoken. Your sleep quality, since consolidation happens between sessions. Your age. Other languages you've learned. What kind of motivation is driving you and how it holds up when progress stalls.

These variables don't sum. They interact. Someone with high tolerance for ambiguity and mediocre input might progress faster than someone with low tolerance and excellent input, because the first person stays engaged while the second quits. Two people with identical study schedules can have radically different trajectories because the combination of factors produces different dynamics, not just different speeds.
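A toy sketch can make the difference between summing and interacting concrete. Everything here is hypothetical: the two 0-to-1 scales, the rule that tolerance sets how long a learner keeps showing up, and the names Learner, additive_score, and effective_hours. It is not a model of real acquisition, only an illustration of how two learners with the same additive score can end up in different places.

```python
from dataclasses import dataclass


@dataclass
class Learner:
    name: str
    input_quality: float        # 0..1, hypothetical: how well-matched the material is
    ambiguity_tolerance: float  # 0..1, hypothetical: how long confusion is bearable


def additive_score(learner: Learner) -> float:
    """What a 'variables just sum' model would predict."""
    return learner.input_quality + learner.ambiguity_tolerance


def effective_hours(learner: Learner, weeks: int = 100,
                    hours_per_week: float = 3.0) -> float:
    """Toy interactive model: tolerance decides how long the learner persists.

    Assumption: the learner quits after a fraction of the planned weeks equal
    to their tolerance, and each studied hour counts in proportion to input quality.
    """
    quit_week = round(learner.ambiguity_tolerance * weeks)
    total = 0.0
    for week in range(weeks):
        if week >= quit_week:
            break  # the learner has stopped; excellent input no longer helps
        total += hours_per_week * learner.input_quality
    return total


stayer = Learner("high tolerance, mediocre input",
                 input_quality=0.5, ambiguity_tolerance=0.9)
quitter = Learner("low tolerance, excellent input",
                  input_quality=1.0, ambiguity_tolerance=0.4)

for learner in (stayer, quitter):
    print(f"{learner.name}: additive={additive_score(learner):.1f}, "
          f"effective hours={effective_hours(learner):.0f}")
```

With these made-up values, both learners get the same additive score, yet the one with mediocre input accumulates more effective hours because they are still studying long after the other has quit.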

In the study of chaotic systems, this is called sensitive dependence on initial conditions. Small differences in starting state lead to large, unpredictable differences in outcome. Weather forecasting has the same property: beyond a week or two, the variables interact in ways that defeat point prediction regardless of model quality. Language learning timelines behave like the same kind of problem. The variable space isn't just large. It's interactive in ways that compound rather than average out.
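Sensitive dependence is easy to see in the logistic map, a standard textbook example of a chaotic system. The sketch below is not a model of language learning; it only shows the property named above. The starting values, the step count, and the helper names logistic_map and trajectory are arbitrary choices for illustration.

```python
def logistic_map(x: float, r: float = 4.0) -> float:
    """One step of the logistic map x -> r * x * (1 - x); r = 4 is a chaotic regime."""
    return r * x * (1.0 - x)


def trajectory(x0: float, steps: int) -> list[float]:
    """Return the starting value followed by `steps` iterations of the map."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic_map(xs[-1]))
    return xs


# Two starting states that differ by one part in ten million.
a = trajectory(0.2, 60)
b = trajectory(0.2000001, 60)

# Absolute gap between the two trajectories at every step.
gaps = [abs(p - q) for p, q in zip(a, b)]

for step in (0, 10, 20, 30, 40):
    print(f"step {step:2d}: gap = {gaps[step]:.6f}")
```

The two runs start out indistinguishable and end up unrelated within a few dozen steps. A better model would not help, because the rule is already known exactly. The limit comes from the precision of the starting measurement.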

John Carroll, one of the first psychologists to study language aptitude systematically, identified four independent components in the 1960s: phonetic coding ability, grammatical sensitivity, inductive language learning ability, and rote memory. Each varies across people on its own axis. And aptitude is just one cluster among many. Someone learning because they're fascinated by the culture acquires differently from someone learning for a promotion, and both differ from someone learning because their partner speaks the language. Dörnyei's motivational research has documented this pattern extensively: the type of motivation reshapes the acquisition process itself. The total space doesn't have a shape you can collapse into a single number.

Fluency is a gradient, not a finish line

Timeline questions have a second structural problem: they assume a fixed destination.

Every "how long" question treats fluency as binary, as a state you either have or haven't reached. But fluency is a gradient. Can you ask for directions? Follow a drama? Argue about politics? Write a poem? These abilities live at different points on a continuous scale, and people draw the line in different places.

The CEFR framework divides proficiency into six levels from A1 to C2. The FSI uses a zero-to-five scale. Neither defines a single point called "fluent." A B1 speaker can handle basic travel situations. A C1 speaker can follow a university lecture. Depending on the learner, months or years separate those levels. When someone asks how long it takes to become fluent, they might mean either one. Or they might mean something no framework captures at all, like understanding their partner's family at dinner.

You can't estimate an arrival time when the destination doesn't have fixed coordinates. The question feels precise, but it collapses two kinds of vagueness into one: the road has no known length, and the endpoint has no agreed-upon location.

Why fabricated numbers work

Given all of this, why does every app and blog post still offer a timeline?

Because humans struggle to begin an open-ended process without anchoring on a duration. Daniel Kahneman and Amos Tversky described two biases that matter here. One is the planning fallacy: people systematically underestimate how long their own tasks will take. The other is anchoring: a specific number, even an arbitrary one, pulls later judgments toward it. Together they make a specific wrong estimate more appealing than an honest admission of uncertainty. A concrete number, even a fabricated one, gives the planning system something to commit to. "Fluent in 3 months" generates action. "Months to years, depending on many variables" generates hesitation.

This isn't cynicism about the apps. It's how human decision-making works. A made-up timeline tends to sell better than an honest range, not because people are gullible, but because the brain's planning machinery needs a temporal anchor to allocate effort and attention, and a fabricated number provides exactly that.

The damage comes after commitment. Once a number takes hold, Goodhart's Law sets in: when a measure becomes a target, it ceases to be a good measure. "Conversational in six months" becomes "I need to study X hours per week to stay on schedule," which becomes "I did my thirty minutes today." The timeline has become the metric for success. The counting itself crowds out the kind of engagement that actually produces acquisition. A learner watching a countdown is optimizing for a number that was fabricated in the first place, and the optimization distorts the very behavior the number was supposed to predict.

What we can actually say

The honest answer isn't nothing. Real patterns emerge from the research, and they're more useful than a fabricated number.

Consistency matters more than intensity. Your brain consolidates between sessions, using sleep and rest to integrate what it absorbed while you were studying. Brief daily contact with the language gives the consolidation cycle steady material to work with. A short session every day, sustained over months, produces more acquisition than occasional binges followed by gaps.

Passive background listening contributes almost nothing to acquisition. What matters is focused listening at the right difficulty, where you follow most of what's being said but not all of it.

Comprehension has to develop before production can follow. Trying to force speaking before comprehension has a foundation leads to the kind of stalling that makes people think they've hit their ceiling.

Adults bring real advantages that compress the process compared to children, but only when those advantages are pointed at selecting good input and maintaining consistency rather than at analyzing grammar.

The hours have to accumulate. There's no shortcut around that. But understanding the shape of the process is more actionable than any timeline. A learner who knows these patterns can make real decisions: what to listen to, how to structure their time, when to push and when to be patient. A learner chasing a deadline can only count.

A better question

"How long until I'm fluent?" feels urgent. But it asks about an endpoint without fixed coordinates, on a timeline that can't be estimated, using a definition nobody agrees on.

A more useful question is smaller and more immediate. What did I understand today that I didn't last week? Am I still working with material at the right difficulty, or have I outgrown it? When I listen, am I following the meaning or just hearing sound?

These questions have answers you can check. They point at the process rather than at a finish line. And if you keep asking them and acting on the answers, the timeline question stops mattering. One day you realize you followed a conversation without thinking about it, and you can't point to when the shift happened.