Educational and psychological investigators are frequently interested in determining whether two or more tests measure the same underlying attribute. When two tests measure the same trait except for errors of measurement, their correlation corrected for attenuation is 1.0, and they are said to be equivalent measures. The need for a significance test of the hypothesis that a perfect linear relation exists between the true scores of two tests is evident in many studies of the reliability and construct validity of tests. In recent years, as the use of multiple-indicator models to test theories has grown, the need for such a hypothesis test has been felt beyond the boundaries of psychometrics and educational measurement. Statistical tests of the hypothesis that two measures are equivalent have been proposed by Wilks (1946), Votaw (1948), Lord (1957, 1973), McNemar (1958), Forsyth and Feldt (1969, 1970), Joreskog (1971), and Kristof (1973). In general, these tests are based on different sets of assumptions and are derived from different hypothesis-testing techniques. They are so difficult to compare analytically that no such comparison can be found in the published literature. Researchers therefore have no guidance concerning which test procedure is optimal for their purposes. In this proposal we review each of these statistical methods and propose a series of Monte Carlo studies that would furnish the data for an empirical comparison of the procedures. We will investigate the effects of sample size, nonnormal score distributions, test score model, and the magnitude and equality of reliabilities on the relative performance of the tests with simulated data. We will ascertain the actual frequencies of Type I errors under a true null hypothesis and of Type II errors under a false null hypothesis. The results of this research should identify the best significance test and specify the conditions that adversely affect its performance.
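To make the notion of equivalence concrete, the following minimal sketch simulates two tests under a classical true-score model and applies the usual correction for attenuation, dividing the observed correlation by the square root of the product of the two reliabilities. It is an illustration only, not one of the procedures under review: the function names, the normal score distributions, and the use of known population reliabilities (rather than sample estimates) are all assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_disattenuated_r(n, rel_x, rel_y, true_r=1.0, seed=0):
    """Simulate n examinees on two tests X and Y (hypothetical helper).

    True scores have unit variance and correlate true_r. Error variances are
    chosen so that each test has the stated reliability. Returns the observed
    correlation and the correlation corrected for attenuation, using the
    known reliabilities (an assumption; in practice they are estimated).
    """
    rng = random.Random(seed)
    # reliability = var(T) / (var(T) + var(E)) with var(T) = 1
    sd_ex = math.sqrt((1.0 - rel_x) / rel_x)
    sd_ey = math.sqrt((1.0 - rel_y) / rel_y)
    xs, ys = [], []
    for _ in range(n):
        tx = rng.gauss(0.0, 1.0)
        # True score on Y: equal to tx when true_r is 1.0 (equivalent measures)
        ty = true_r * tx + math.sqrt(1.0 - true_r ** 2) * rng.gauss(0.0, 1.0)
        xs.append(tx + rng.gauss(0.0, sd_ex))
        ys.append(ty + rng.gauss(0.0, sd_ey))
    r_obs = pearson(xs, ys)
    r_corrected = r_obs / math.sqrt(rel_x * rel_y)
    return r_obs, r_corrected


if __name__ == "__main__":
    r_obs, r_corrected = simulate_disattenuated_r(20000, rel_x=0.8, rel_y=0.7)
    print(f"observed r = {r_obs:.3f}, corrected r = {r_corrected:.3f}")
```

With equivalent measures (true_r = 1.0) the observed correlation falls well below 1.0 because of measurement error, while the corrected value lies close to 1.0 in large samples. In small samples the corrected value fluctuates and can exceed 1.0, which is one reason a formal significance test is needed.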