This Phase I proposal involves the development of a race predictor system for health surveillance of people from the Indian subcontinent. People from the Indian subcontinent, termed South Asians, are among the fastest-growing segments of the Asian population in the United States. The risk of developing cancer and other diseases is known to vary among races and geographical areas, and to increase with migration from regions of low incidence to regions of high incidence. Accurate assessments of cancer risk in the South Asian community in the United States are not currently available. One reason is the lack of tools to clearly establish the racial identity of these individuals in population-based cancer registries. Surnames have been used as a reliable predictor of ethnicity in calculating cancer parameters. A prototype database of forename and surname listings from specific regions of at least two South Asian countries will be developed and internally evaluated during Phase I. Additional validation, involving interviews with subjects and comparison with more robust and reliable external databases containing self-reported name and ethnicity information, will be carried out in Phase II. The result will be a fully functional system and an effective tool for determining relative rates of cancer and other important diseases in South Asians.
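The name-list approach and its validation against self-reported ethnicity can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny name sets are placeholders and not the proposal's database, the function names `predict_south_asian` and `evaluate` are hypothetical, and the rule of flagging a record when either the forename or the surname matches is one possible choice that the proposal does not specify.

```python
# Toy placeholder lists for illustration only; NOT the proposal's database.
SURNAMES = {"patel", "singh", "chowdhury"}
FORENAMES = {"priya", "rahul", "ayesha"}


def predict_south_asian(forename, surname):
    """Flag a record if either name appears in the lists (assumed rule)."""
    return (
        surname.strip().lower() in SURNAMES
        or forename.strip().lower() in FORENAMES
    )


def evaluate(records):
    """Compare predictions with self-reported ethnicity.

    records: iterable of (forename, surname, self_reported_south_asian).
    Returns sensitivity, specificity, and positive predictive value,
    or None for a measure whose denominator is zero.
    """
    tp = fp = fn = tn = 0
    for forename, surname, self_reported in records:
        predicted = predict_south_asian(forename, surname)
        if predicted and self_reported:
            tp += 1
        elif predicted and not self_reported:
            fp += 1
        elif not predicted and self_reported:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "ppv": tp / (tp + fp) if (tp + fp) else None,
    }


if __name__ == "__main__":
    # Made-up records, used only to exercise the functions.
    sample = [
        ("Priya", "Smith", True),
        ("John", "Patel", True),
        ("Anna", "Jones", False),
    ]
    print(evaluate(sample))
```

In a registry setting, the same comparison would be run against an external file with self-reported ethnicity, and the resulting sensitivity and positive predictive value would indicate how far name-based classification can be trusted when computing disease rates.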