Genealogy: Cracking the Soundex Code
A phonetic system
Soundex is an indexing system based on the phonetic sound of the consonants in the surname. Each name is assigned a letter and three numbers. The letter is always the first letter of the surname. The Miracode for the 1910 census uses the same sound system but arranges the resulting lists by the visitation number assigned by the enumerator, rather than page numbers of the census schedule as in the Soundex.
Federal indexing for the 1880, 1900, 1910, 1920, and 1930 censuses is based on a phonetic system called the Soundex (or a similar one called Miracode). It was devised to overcome the vagaries of spelling by grouping together surnames that sound alike, but are spelled differently. In the Soundex, Bream, Breem, or Briem will be found in indexes under B650; Wier, Weer, Wiere, and Ware would be W600.
The 1880 census was Soundexed only for households having children under the age of 10. (These children were the first to become eligible for old-age benefits in 1935.) This does not mean that other households were not enumerated. It does mean that if your ancestor was not in a household with children under the age of 10, he or she will not be on the Soundex indexes, but will be in the census itself.
This limitation applies only to the 1880 Soundex. The Soundex for the 1900 through 1930 censuses includes all heads of households. In 1910, only 21 states had a Soundex or Miracode; in 1930, only 10 states and portions of two others were Soundexed. The rest of the 1910 and 1930 censuses must be searched using other methods, discussed in Genealogy: The Evolution of the Census.
Those 1880 Children Retiring
With the passing of the Social Security Act in 1935, the government had to determine who was eligible for benefits. Those eligible needed to prove their ages. If they had no birth record, they could help substantiate the birth by a census record. The government, realizing that the first group of applicants was born before there was statewide registration of births, needed an efficient way to locate individuals on the census for verification of the birth. The only ones in the household that were important to the government in this initial Soundex were those who were under 10 in 1880. It was that group who would be applying for Social Security.
Learning the Soundex System
Applying the Soundex code to the surnames on your list before you go to the repository will speed up your census research when you get there. The table below explains the code.
|1||B P F V|
|2||C S K G J Q X Z|
|3||D T|
|4||L|
|5||M N|
|6||R|
The letters A, E, I, O, U, W, Y, and H are disregarded. To apply the Soundex to your surnames, follow these steps:
- Use the first letter of the surname to begin the code.
- Cross out all the vowels and the letters W, Y, and H that follow the first letter.
- Using the table, assign the equivalent number to each of the first three letters that remain after the initial.
- Disregard any remaining letters in the surname.
If the surname has fewer than three letters left, assign zeros to those places.
Further refinements include the following:
- Double letters are treated as one and coded with one number; thus two Ls are coded 4, not 44.
- Two or more letters with the same code number that appear in sequence in a surname are assigned one number; thus CK in Dickson is coded as 2, not 22, and Szalay is S400.
- Two consonants with the same code number separated by a vowel are both coded; thus Svoboda is coded S113, not S130.
- A name that yields no codable letters after the initial is assigned three zeros; thus Chu becomes C000.
- Code names with prefixes, such as Van, Von, De, Le, with and without the prefix. You may find it either way in the Soundex.
- When two consonants with the same code number are separated by either an H or W, only the first is coded and the second is ignored; thus Burroughs is coded B620, not B622.
The last refinement in the Soundex coding rules is the “lost H and W rule,” which is very important for researchers looking for surnames with either an H or W in them. For a period of time, those preparing Soundex coding erroneously treated H and W as they did vowels. That is, they used them as separators, so letters having the same code value on either side of an H or W were each coded instead of being coded as one letter. When the “lost H and W rule,” also called the “Ashcraft rule,” was rediscovered, coding reverted to its original rules. Researchers tracking surnames with the letters H or W in them should look for them under both codes. For example, if the surname is Ashcraft, look for it under A226 as well as A261.
Examples using these rules are shown in the table below:
|Li, Lee, Law||L000|
|Von Kemp||V525 or K510|
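The coding rules described in this article can be sketched in a few lines of Python, which may be handy for coding a whole list of surnames before a trip to the repository. This is only a minimal sketch, not an official implementation: the names `soundex`, `soundex_variants`, `hw_as_vowels`, and `PREFIXES` are made up for illustration, the letter groups are the standard Soundex values for codes 1 through 6, and the prefix handling assumes the prefix is written as a separate word.

```python
# Standard Soundex letter groups (codes 1 through 6).
GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

# Hypothetical prefix list: only the prefixes named in the article.
PREFIXES = ("VAN", "VON", "DE", "LE")


def soundex(surname, hw_as_vowels=False):
    """Return the letter-plus-three-digits code for a surname.

    hw_as_vowels=True mimics the erroneous coding in which H and W
    were treated as separators, the way vowels are.
    """
    letters = [c for c in surname.upper() if c.isalpha()]
    if not letters:
        raise ValueError("surname has no letters")
    code = letters[0]
    # Code of the most recent consonant; "" means a separator intervened.
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit:
            if digit != prev:  # same code in sequence is coded only once
                code += digit
            prev = digit
        elif ch not in "HW" or hw_as_vowels:
            prev = ""  # a vowel (or Y) separates; the next consonant is coded
        # otherwise H or W is skipped without separating its neighbors
    return (code + "000")[:4]  # pad with zeros, keep letter plus three digits


def soundex_variants(surname, hw_as_vowels=False):
    """Codes to check: with the prefix and, if one is present, without it."""
    words = surname.split()
    variants = [soundex(surname, hw_as_vowels)]
    if len(words) > 1 and words[0].upper() in PREFIXES:
        variants.append(soundex("".join(words[1:]), hw_as_vowels))
    return variants


for name in ("Bream", "Ware", "Szalay", "Svoboda", "Chu", "Burroughs"):
    print(name, soundex(name), soundex(name, hw_as_vowels=True))
print("Von Kemp", soundex_variants("Von Kemp"))
```

Calling `soundex` twice, once with `hw_as_vowels=False` and once with `hw_as_vowels=True`, gives both codes to search when a surname contains an H or W; for Burroughs these are B620 and B622. For most surnames the two codes are identical, so only one lookup is needed.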