Comparing disease prevalence in two groups is an important topic in medical research, and prevalence rates are obtained by classifying subjects according to whether they have the disease. Both high-cost infallible gold-standard classifiers or low-cost fallible classifiers can be used to classify subjects. However, statistical analysis that is based on data sets with misclassifications leads to biased results. As a compromise between the two classification approaches, partially validated sets are often used in which all individuals are classified by fallible classifiers, and some of the individuals are validated by the accurate gold-standard classifiers. In this article, we develop several reliable test procedures and approximate sample size formulas for disease prevalence studies based on the difference between two disease prevalence rates with two independent partially validated series. Empirical studies show that (i) the Score test produces close-to-nominal level and is preferred in practice; and (ii) the sample size formula based on the Score test is also fairly accurate in terms of the empirical power and type I error rate, and is hence recommended. A real example from an aplastic anemia study is used to illustrate the proposed methodologies.