Last week I discussed the impossibility of even estimating the number of words in a language. Today I discovered an old (1998) CBC News article on the longest dictionary ever written. Consisting of 40 volumes, it took 147 years to compile, edit, proofread, and publish. It documents Dutch and Flemish (a dialect of Dutch) words dating back to 1500. By the time it was finished, it was already 29 years out of date.
This is an excellent example of the failure of dictionaries to account for all the words in a language. A dictionary is only someone’s sample of the words in a language. No matter how many people you put on the committee to compile a dictionary, you will only get those words the members of the committee have heard or read.
So why not create a Wiktionary, like Wikipedia, and allow everyone who speaks the language to put in whatever they think is a good word, along with their opinion of its forms, part of speech, definitions, usage, etc.? Forget editing and proofreading.
The result is then that of the Urban Dictionary, with dozens of definitions for each word and the compilers arguing among themselves as to which is the correct one. Can you vote on which is correct? If the majority say that “ain’t” is a good word, is it then?
The best approach is to enjoy languages and the words in them, appreciate the creativity that brings more new words to the surface each day than any one person can master, and forget statistics. Language and statistics get along like oil and water.