I'm running logistic regression models with binary response data in SPSS, and am requesting the Hosmer-Lemeshow goodness-of-fit test. I've discovered that in some situations, the results of the HL test depend upon which category of the dependent variable is chosen as the target category. That is, if I reverse a 0-1 coding, or recode the 0 values to 2, so that the logits are formed differently, transforming my predicted probabilities from p to 1-p, the HL chi-square and significance can differ. Is this a bug in SPSS?
Resolving the problem
Assuming that there are sufficient numbers of distinct predicted probabilities, the Hosmer-Lemeshow test is computed based on a grouping of the ordered predicted probabilities into ten groups, or "deciles of risk," attempting to place 10% of the cases in each decile. Unless the number of cases is an exact multiple of 10 and there are no ties among cases at the grouping cut points, there is no unique way to assign cases to groups. This leads to HL statistics that differ among software packages even on the same data with the same target category of the dependent variable. Most algorithms for choosing cut points (including the one used in the SPSS LOGISTIC REGRESSION procedure) begin with the smallest predicted probability, and will not in general produce identical results when you reverse these predicted probabilities.
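To see why this happens, the grouping logic can be sketched in Python. This is a simplified illustration of the general deciles-of-risk approach, not SPSS's exact algorithm: cases are sorted by predicted probability starting from the smallest, cut into ten nominal groups, and each cut point is pushed forward to keep tied probabilities together. When ties fall at a cut point, sorting by 1-p instead of p can place the tied cases into different groups, so the chi-square can change.

```python
import numpy as np

def hosmer_lemeshow(y, p, g=10):
    """Hosmer-Lemeshow chi-square using 'deciles of risk'.

    Cases are ordered by predicted probability (smallest first) and cut
    into g groups of roughly equal size. Ties at a cut point are kept in
    the same group, so group sizes need not be exactly n/g. This is a
    sketch of the general approach, not SPSS's exact implementation.
    """
    order = np.argsort(p, kind="stable")
    y, p = np.asarray(y)[order], np.asarray(p)[order]
    n = len(p)
    # Nominal cut points after each 10% of cases.
    edges = [int(round(n * k / g)) for k in range(1, g + 1)]
    chi2, start = 0.0, 0
    for e in edges:
        # Extend the group forward so tied probabilities stay together.
        while e < n and p[e] == p[e - 1]:
            e += 1
        if e <= start:
            continue  # the previous group already absorbed these cases
        m = e - start
        obs1 = y[start:e].sum()   # observed events in this group
        exp1 = p[start:e].sum()   # expected events in this group
        exp0 = m - exp1           # expected non-events
        chi2 += (obs1 - exp1) ** 2 / exp1 + ((m - obs1) - exp0) ** 2 / exp0
        start = e
    return chi2

# Heavily tied probabilities: reversing the target category can regroup
# the tied cases at cut points, so the two statistics may disagree.
rng = np.random.default_rng(0)
p = rng.integers(1, 10, size=57) / 10.0
y = (rng.random(57) < p).astype(int)
print(hosmer_lemeshow(y, p))          # original coding
print(hosmer_lemeshow(1 - y, 1 - p))  # reversed coding; may differ
```

With no ties and n an exact multiple of 10, the groups formed from the smallest 1-p values are the same sets of cases as the groups formed from the largest p values, and each group contributes the same term to the chi-square, so the reversed statistic matches exactly; it is precisely the tie-breaking and unequal group sizes that break this symmetry.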