The representativeness of the norm sample is crucial for estimating valid norm scores. Typically, random sampling is employed to achieve this. However, even without systematic biases in data collection, the resulting sample may still deviate from the population's actual composition. cNORM offers functionality to incorporate sampling weights into the norming process, thereby mitigating negative effects of non-representative norm samples on norm score quality.

To accomplish this, cNORM integrates a technique known as raking, which is an iterative proportional fitting procedure. This method allows for post-stratification of the norm sample with respect to one or more stratification variables (SVs), based on given population marginals of these SVs. Individual cases are weighted so that the composition of the weighted dataset aligns with the representative population.
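To make the idea of raking concrete, here is a minimal base-R sketch of iterative proportional fitting on a toy dataset. It is not cNORM's internal implementation: the helper `rake_weights()`, the toy data, and the convergence settings are assumptions made for illustration only.

```r
# Hypothetical helper illustrating raking; not part of cNORM.
rake_weights <- function(data, marginals, max_iter = 100, tol = 1e-8) {
  w <- rep(1, nrow(data))                      # start with equal weights
  vars <- unique(marginals$var)
  for (iter in seq_len(max_iter)) {
    w_old <- w
    for (v in vars) {
      m <- marginals[marginals$var == v, ]
      lev <- as.character(data[[v]])
      # Current weighted share of each level of this variable
      current <- tapply(w, lev, sum) / sum(w)
      target <- setNames(m$prop, as.character(m$level))
      # Rescale cases so that this variable's weighted margins hit the target
      ratio <- as.numeric(target[lev]) / as.numeric(current[lev])
      w <- w * ratio
    }
    # Stop once a full pass over all variables no longer changes the weights
    if (max(abs(w - w_old)) < tol) break
  }
  w / min(w)                                   # scale so the smallest weight is 1
}

# Toy sample (made-up counts) that over-represents sex = 1 and migration = 0
toy <- data.frame(
  sex = rep(c(1, 2), times = c(120, 80)),
  migration = c(rep(c(0, 1), times = c(100, 20)),
                rep(c(0, 1), times = c(60, 20)))
)

marginals <- data.frame(var = c("sex", "sex", "migration", "migration"),
                        level = c(1, 2, 0, 1),
                        prop = c(0.51, 0.49, 0.65, 0.35))

w_toy <- rake_weights(toy, marginals)

# Weighted shares after raking, to compare with the target proportions
tapply(w_toy, toy$sex, sum) / sum(w_toy)
tapply(w_toy, toy$migration, sum) / sum(w_toy)
```

Each pass adjusts the weights for one stratification variable at a time, which slightly disturbs the margins of the others; repeating the passes lets all margins settle on the targets simultaneously.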

To compute the weights, you need to provide a data frame with three columns that specify the population marginals:

- The first column designates the stratification variables.
- The second column lists the factor levels of these stratification variables.
- The third column indicates the proportion of each respective stratum in the representative population.

In the following example, there are two stratification variables ('sex' and 'migration'), each with two factor levels, which are coded 1 and 2 in the case of sex and 0 and 1 in the case of migration. The weights are calculated for the ppvt dataset, which contains both stratification variables.

    marginals <- data.frame(var = c("sex", "sex", "migration", "migration"),
                            level = c(1, 2, 0, 1),
                            prop = c(0.51, 0.49, 0.65, 0.35))

    weights <- computeWeights(data = ppvt, population.margins = marginals)

Passing the weights to either the 'cnorm()' or the 'cnorm.betabinomial()' function via the 'weights' parameter automatically incorporates them into the subsequent norming process.
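As a rough sketch of the full workflow, the following assumes that the cNORM package is installed and that the ppvt dataset provides the columns 'raw' and 'group'. The guard around the code and the final check of the mean weight are additions for illustration; the check further assumes that 'computeWeights()' returns standardized weights by default.

```r
# Run only if cNORM is available (assumption: package installed from CRAN)
if (requireNamespace("cNORM", quietly = TRUE)) {
  library(cNORM)

  marginals <- data.frame(var = c("sex", "sex", "migration", "migration"),
                          level = c(1, 2, 0, 1),
                          prop = c(0.51, 0.49, 0.65, 0.35))

  # One weight per case in ppvt
  weights <- computeWeights(data = ppvt, population.margins = marginals)

  # Weighted continuous norming via the 'weights' parameter
  model <- cnorm(raw = ppvt$raw, group = ppvt$group, weights = weights)

  # Inspect the average weight (assumed here to be on the standardized scale)
  mean(weights)
}
```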

Certain deviations from representativeness are already corrected or mitigated by continuous norming, even without using weights. This is, for example, the case when the deviation from representativeness occurs only in individual age groups. In various simulation studies, we investigated whether the additional use of weighting improves the resulting norm scores when deviations from representativeness occur not just in individual age groups, but throughout the entire sample. For this purpose, we simulated samples that deviated from representativeness to varying degrees and in different ways. We found that weighted norming works very well in most, but not all, use cases. Please note the following points:

- In most cases, lack of representativeness leads to at least a slightly increased error in the resulting norm scores, even with weighting. Therefore, it is always better to ensure the highest possible degree of representativeness during data collection itself.
- The data collection should be as random as possible.
- In most - but not all - cases, weighting reduces the negative effects of non-representative norm samples. Persistent biases occur primarily when the variance within the sample is significantly lower than the variance within the population.
- Generally, the degree of non-representativeness in the sample should not be too large. If the mean of the standardized weights exceeds a value of *m* = 2, this indicates that the dataset deviates too strongly from the reference population. In this case, you should collect additional cases rather than weighting the data.
- When only small deviations from representativeness exist in individual age groups, weighting is unnecessary if continuous norming is used.
- Use weighting only for stratification variables that have a substantial influence on the dependent variable.
- Avoid using too many stratification variables with many finely graduated levels, as this can lead to high weights. Instead, you can combine different levels of the stratification variables in case the corresponding subgroups do not differ strongly in terms of test results.
- If available, joint probabilities of different stratification variables (e.g., sex × parental education) can also be used. In this case, recode the variables into a single stratification variable and directly provide the joint probabilities (e.g., proportion of males with high parental education).
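The recoding into a single joint stratification variable mentioned in the last point could look like the following base-R sketch. The toy data, the variable name `sex_edu`, and the joint proportions are placeholders invented for illustration, not real population figures.

```r
# Toy data with two stratification variables (made-up values)
toy <- data.frame(
  sex = c(1, 1, 2, 2, 1, 2),
  education = c("low", "high", "low", "high", "high", "low")
)

# Recode both variables into one joint stratification variable
toy$sex_edu <- paste(toy$sex, toy$education, sep = "_")

# Joint population proportions in the three-column layout described above
# (placeholder numbers; replace with figures for your reference population)
joint_marginals <- data.frame(
  var = "sex_edu",
  level = c("1_low", "1_high", "2_low", "2_high"),
  prop = c(0.30, 0.21, 0.28, 0.21)
)

# Sanity checks: proportions sum to 1 and every observed stratum has a target
stopifnot(abs(sum(joint_marginals$prop) - 1) < 1e-8)
stopifnot(all(toy$sex_edu %in% joint_marginals$level))
```

Because the joint variable has one level per combination of the original variables, the number of strata grows quickly, so the earlier advice about avoiding many finely graduated levels applies here as well.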
