The representativeness of the norm sample is crucial for estimating valid norm scores. Typically, random sampling is employed to achieve this. However, even without systematic biases in data collection, the resulting sample may still deviate from the population's actual composition. cNORM offers functionality to incorporate sampling weights into the norming process, thereby mitigating negative effects of non-representative norm samples on norm score quality.
To accomplish this, cNORM integrates a technique known as raking, which is an iterative proportional fitting procedure. This method allows for post-stratification of the norm sample with respect to one or more stratification variables (SVs), based on given population marginals of these SVs. Individual cases are weighted so that the composition of the weighted dataset aligns with the representative population.
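To make the idea of raking concrete, here is a minimal base R sketch of iterative proportional fitting. It is not cNORM's implementation: the function 'rake_weights', the toy data set, and the choice to rescale the weights to a mean of 1 are assumptions made only for illustration. Starting from equal weights, the weights of each level are repeatedly scaled by the ratio of target share to current weighted share, one stratification variable at a time, until all marginals are matched.

```r
# Hypothetical helper (not part of cNORM): rake case weights so that the
# weighted shares of each stratification variable match the target marginals.
rake_weights <- function(data, marginals, max_iter = 100, tol = 1e-8) {
  w <- rep(1, nrow(data))                      # start with equal weights
  vars <- unique(as.character(marginals$var))
  for (iter in seq_len(max_iter)) {
    max_dev <- 0
    for (v in vars) {
      target <- marginals[marginals$var == v, ]
      idx <- match(data[[v]], target$level)    # row of 'target' for each case
      if (anyNA(idx)) stop("Level in data without population marginal: ", v)
      # Current weighted share of every level of v
      current <- vapply(seq_len(nrow(target)),
                        function(k) sum(w[idx == k]), numeric(1)) / sum(w)
      if (any(current == 0)) stop("Level without cases in data: ", v)
      max_dev <- max(max_dev, abs(current - target$prop))
      # Scale the weights of each level by target share / current share
      w <- w * (target$prop / current)[idx]
    }
    if (max_dev < tol) break                   # all marginals matched
  }
  w / mean(w)                                  # rescale to a mean weight of 1
}

# Toy sample (made up): 5 of 8 cases have sex = 1, 2 of 8 have migration = 1
toy <- data.frame(sex = c(1, 1, 1, 2, 2, 1, 1, 2),
                  migration = c(0, 0, 1, 0, 1, 0, 0, 0))
marginals <- data.frame(var = c("sex", "sex", "migration", "migration"),
                        level = c(1, 2, 0, 1),
                        prop = c(0.51, 0.49, 0.65, 0.35))
w <- rake_weights(toy, marginals)

# Weighted shares after raking; they should be close to the 'prop' column
share_sex1 <- sum(w[toy$sex == 1]) / sum(w)
share_mig1 <- sum(w[toy$migration == 1]) / sum(w)
```

The sketch only adjusts the marginal distributions, not the joint distribution of the stratification variables, which is exactly the information the population marginals provide.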
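To compute the weights, you need to provide a data frame with three columns that specify the population marginals: the name of the stratification variable, the factor level, and the proportion of that level in the population.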
In the following example, there are two stratification variables ('sex' and 'migration'), each with two factor levels, which are coded 1 and 2 in the case of sex and 0 and 1 in the case of migration. The weights are calculated for the ppvt dataset, which contains both stratification variables.
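library(cNORM)
# Population marginals: one row per level of each stratification variable
marginals <- data.frame(var = c("sex", "sex", "migration", "migration"),
                        level = c(1, 2, 0, 1),
                        prop = c(0.51, 0.49, 0.65, 0.35))
# Raking weights, one per case in the ppvt dataset
weights <- computeWeights(data = ppvt, population.margins = marginals)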
Passing the weights to either the 'cnorm()' or the 'cnorm.betabinomial()' function via the 'weights' parameter automatically incorporates them into the subsequent norming process.
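As a brief sketch of this step, the following lines recompute the weights and pass them to 'cnorm()'. It assumes that the cNORM package is installed and that the raw scores and grouping variable of the ppvt dataset are stored in the columns 'raw' and 'group'. The object name 'model' is chosen only for this illustration.

```r
library(cNORM)

# Population marginals and raking weights, as in the example above
marginals <- data.frame(var = c("sex", "sex", "migration", "migration"),
                        level = c(1, 2, 0, 1),
                        prop = c(0.51, 0.49, 0.65, 0.35))
weights <- computeWeights(data = ppvt, population.margins = marginals)

# Weighted continuous norming: each case enters the norming process
# with its raking weight
model <- cnorm(raw = ppvt$raw, group = ppvt$group, weights = weights)
```

Fitting the same model once with and once without the 'weights' argument is a simple way to see how much the weighting changes the resulting norm scores.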
Certain deviations from representativeness are already corrected or mitigated by continuous norming, even without using weights. This is, for example, the case when the deviation from representativeness only occurs in individual age groups. In various simulation studies, we investigated whether the additional use of weighting improves the resulting norm scores when deviations from representativeness occur not just in individual age groups, but throughout the entire sample. For this purpose, we simulated samples that deviated from representativeness to varying degrees and in different ways. We found that weighted norming works very well in most, but not all, use cases. Please note the following points: