How Vocabcord's vocabulary is built

Every phrase you learn in Vocabcord comes from one curated corpus: 29,000+ phrases across six learner languages paired with English, plus monolingual English, graded A1 to C2 and grouped by everyday theme. Here is exactly how it is put together, and where the underlying data comes from.

The corpus

The corpus covers Spanish, French, German, Italian, Portuguese, and Polish, each paired with English, plus a monolingual English track. Phrases are graded across the six CEFR levels (A1 through C2) and sorted into everyday themes such as greetings, food, travel, work, health, and numbers. Each entry carries the phrase, its translation, an example showing it in a sentence, and a recorded audio clip.
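For readers who think in data, here is a minimal sketch of what one entry could look like as a record. It is an illustration only, not Vocabcord's actual schema: the class name PhraseEntry, the field names, the two-letter language code, and the audio path are all assumptions made up for this example.

```python
from dataclasses import dataclass

# The six CEFR levels, lowest to highest.
CEFR_LEVELS = ("A1", "A2", "B1", "B2", "C1", "C2")


@dataclass(frozen=True)
class PhraseEntry:
    """One corpus entry (hypothetical layout, for illustration only)."""

    language: str     # learner language, e.g. "es" (code scheme is assumed)
    level: str        # one of CEFR_LEVELS
    theme: str        # e.g. "greetings", "food", "travel"
    phrase: str       # the phrase in the learner language
    translation: str  # its English translation
    example: str      # the phrase used in a sentence
    audio_path: str   # location of the recorded clip (path format is assumed)

    def __post_init__(self):
        # Reject anything that is not a valid CEFR level.
        if self.level not in CEFR_LEVELS:
            raise ValueError(f"unknown CEFR level: {self.level!r}")


entry = PhraseEntry(
    language="es",
    level="A1",
    theme="greetings",
    phrase="Buenos días",
    translation="Good morning",
    example="Buenos días, ¿cómo estás?",
    audio_path="audio/es/a1/buenos-dias.mp3",
)
```

Checking the level when the record is created means a mistyped level is caught at once rather than turning up later in a lesson.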

How phrases are chosen and graded

Selection is frequency-first: the words and phrases people actually use most come first, so a beginner spends their earliest minutes on the vocabulary that does the most work. Levels follow the CEFR scale published by the Council of Europe, which runs from A1 (survival basics) to C2 (mastery). Within a level, phrases are organized by theme so a learner can focus on the slice of the language they need next.
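The ordering described above can be sketched in a few lines: group phrases by level and theme, then put the most frequent phrases first within each group. This is a rough illustration under stated assumptions, not the actual build pipeline. The function name build_lessons, the dictionary keys, and the rank values are invented for the example, and the ranks are not real NGSL ranks.

```python
from collections import defaultdict

# Position of each CEFR level, so levels sort from A1 up to C2.
CEFR_ORDER = {"A1": 0, "A2": 1, "B1": 2, "B2": 3, "C1": 4, "C2": 5}


def build_lessons(phrases):
    """Group phrases by (level, theme), most frequent first in each group.

    Each phrase is a dict with the hypothetical keys "phrase", "level",
    "theme", and "rank", where a lower rank means a more frequent phrase.
    """
    groups = defaultdict(list)
    for item in phrases:
        groups[(item["level"], item["theme"])].append(item)

    lessons = {}
    # Visit groups by level first, then alphabetically by theme.
    for key in sorted(groups, key=lambda k: (CEFR_ORDER[k[0]], k[1])):
        lessons[key] = sorted(groups[key], key=lambda item: item["rank"])
    return lessons


# Made-up sample data; the ranks are illustrative only.
sample = [
    {"phrase": "thank you", "level": "A1", "theme": "greetings", "rank": 12},
    {"phrase": "hello", "level": "A1", "theme": "greetings", "rank": 3},
    {"phrase": "invoice", "level": "B1", "theme": "work", "rank": 950},
    {"phrase": "water", "level": "A1", "theme": "food", "rank": 40},
]

lessons = build_lessons(sample)
for (level, theme), items in lessons.items():
    print(level, theme, [item["phrase"] for item in items])
```

In this sample, "hello" lands ahead of "thank you" in the A1 greetings group because its rank is lower, which is the frequency-first idea in miniature.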

Native-speaker audio

Every phrase is recorded, so you learn it by ear, the way you first picked up your own language. That matters for a habit built on listening: when a phrase plays as you plug in your charger, pronunciation is part of the lesson from the first exposure.

Sources and licensing

The proficiency levels follow the Common European Framework of Reference for Languages (CEFR), the A1 to C2 standard developed and maintained by the Council of Europe and used across European language education.

The English frequency backbone is the New General Service List (NGSL 1.2) by Charles Browne, Brent Culligan, and Joseph Phillips, derived from the Cambridge English Corpus and published under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. Vocabcord's derived vocabulary lists are shared under the same license. The audio recordings, the app, and its software are separate, original works under their own license.

Corrections

If you spot a phrase, translation, or level that looks off, tell us on the support page. Real corrections from learners make the corpus better for everyone.