The BNC/COCA family lists are based on large corpora, with families made as complete as possible so that every word of any text can be classified (in, e.g., VocabProfiles). But even K-1 to K-3 families may contain members that learners will never meet, or that appear mainly in specific text types (medicine, engineering). There is thus a case for reducing these lists to their essentials for both initial and specialist learning. Nuclear List Builder "crosses" family lists against word frequencies in a smaller corpus (1-4 million words) to obtain a list of just those family members that are frequent in that corpus. Why is this interesting? Read a paper about this, or its summary. (*A parallel French study is on the way, arriving 14 January 2026.*)
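The "crossing" operation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's actual code: the function name, the data shapes (a dict mapping each headword to its family members), and the frequency threshold are all assumptions made for the example.

```python
# Illustrative sketch of "crossing" a family list against a target corpus:
# keep only those family members frequent enough in the smaller corpus.
# Names and threshold are assumptions, not Nuclear List Builder's actual code.
from collections import Counter

def nuclear_list(families, corpus_tokens, min_freq=5):
    """families: dict mapping headword -> list of family members.
    corpus_tokens: iterable of lowercase tokens from the target corpus.
    Returns a dict keeping only members at or above min_freq occurrences."""
    freq = Counter(corpus_tokens)
    nuclear = {}
    for head, members in families.items():
        kept = [m for m in members if freq[m] >= min_freq]
        if kept:
            nuclear[head] = kept
    return nuclear

# Toy example: a family whose rarer members drop out in a small corpus.
families = {"nation": ["nation", "national", "nationhood", "denationalize"]}
corpus = ["nation"] * 12 + ["national"] * 8 + ["nationhood"]
print(nuclear_list(families, corpus))
# → {'nation': ['nation', 'national']}
```

With a real 1-4 million word corpus, the threshold would be tuned (or expressed per million words) so that only genuinely common members of each family survive.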