*Wolfgang Lenhard, Alexandra Lenhard & Sebastian Gary*

cNORM is a package for the R environment for statistical computing that generates continuous test norms in psychometrics and biometrics and analyzes the model fit. It is based on the approach of A. Lenhard, Lenhard, Suggate and Segerer (2016).

The method stems from psychometric test construction and was developed to create continuous norms for age or grade in performance assessment (e.g., vocabulary development, A. Lenhard, Lenhard, Segerer & Suggate, 2015; reading and writing development, W. Lenhard, Lenhard & Schneider, 2017). It can, however, be applied wherever test data such as psychological (e.g., intelligence), physiological (e.g., weight) or other measures depend on continuous (e.g., age) or discrete (e.g., sex or test mode) explanatory variables.

The package estimates percentile curves as a function of the explanatory variable (e.g., schooling duration or age) via Taylor polynomials, which offers several advantages:

- By optimizing the model on the basis of the total sample, small deviations from the representativeness of individual subsamples, for example due to incomplete data stratification, are minimized.
- Gaps between different discrete levels of the explanatory variable are closed. For example, in school performance tests, norm tables can be created not only for the discrete measurement point of the norm sample collection (e.g. midyear or end of the year), but also at any time of the school year with the desired accuracy.
- The total sample size for the norm data collection is reduced because all norm tables are determined on the basis of the entire sample.
- The limits of the model fit can be evaluated graphically and analytically. For example, it is possible to determine where the model deviates strongly from the manifest data or where strong floor or ceiling effects occur. This makes it possible to specify at which points the test scores can no longer be interpreted in a meaningful way.
- cNORM does not require any distributional assumptions. If floor or ceiling effects occur, the data can therefore often be modeled much more precisely than with parametric methods. This holds particularly for score ranges that deviate strongly from the population average, which are often precisely the ranges most relevant in diagnostic practice. The following figure illustrates this:

Figure: Modeling of the manifest data (black dots) of the ELFE reading comprehension test at the beginning of the fifth grade with cNORM (green line) and a parametric method (modeling of location, dispersion and skewness with Box-Cox transformation; red line). In this case, parametric modeling clearly overestimates the probability densities in the lower performance range. As a consequence, the norm scores of low-performing children would also be overestimated, and these children would therefore be identified as dyslexic too rarely. By contrast, cNORM provides reliable data modeling across the entire performance range.

For the mathematical background, please have a look at the mathematical derivation of the method.
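As a rough orientation, and assuming the formulation in A. Lenhard et al. (2016), the raw score $r$ can be written as a function of the person location $l$ (the norm score) and the explanatory variable $a$, approximated by a polynomial up to a maximal power $k$. The symbols $r$, $l$, $a$, $k$ and $c_{st}$ are chosen here for illustration and may differ from the notation in the derivation:

$$
r = f(l, a) \approx \sum_{s=0}^{k} \sum_{t=0}^{k} c_{st}\, l^{s} a^{t}
$$

The coefficients $c_{st}$ are estimated by regression on the total sample. Holding $l$ fixed then yields one percentile curve across the explanatory variable $a$.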

On the following pages, we demonstrate the steps required to apply the R package using real human performance data, namely the standardization sample of the sentence comprehension subtest of *ELFE 1-6*, a German-language reading comprehension test (W. Lenhard & Schneider, 2006). Essentially, there are five steps to complete:

- Installation of the R package
- Data Preparation
- Data Modeling
- Model Validation
- Generating Norm Tables
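
The five steps above can be sketched in a few lines of R. This is a minimal sketch, not the authors' reference workflow: it assumes a recent cNORM release from CRAN in which the convenience function `cnorm()`, the `plot()` method with type `"percentiles"`, the function `normTable()` and the bundled `elfe` data set (columns `raw` and `group`) are available. Function signatures may differ between package versions, so please check the package documentation.

```r
# Step 1: installation (run once; assumes the CRAN release)
# install.packages("cNORM")
library(cNORM)

# Step 2: data preparation
# 'elfe' is assumed to be the example data set shipped with the package
str(elfe)

# Step 3: data modeling
# cnorm() is assumed to rank the data, compute the powers and fit the model
model <- cnorm(raw = elfe$raw, group = elfe$group)

# Step 4: model validation
# compare observed and fitted percentile curves
plot(model, "percentiles")

# Step 5: norm table, here for a value of 3 on the grade variable
tab <- normTable(3, model)
head(tab)
```

The value `3` in `normTable()` is only an example; any value within the range of the grouping variable covered by the norm sample could be used instead.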

Installation

cNORM is licensed under the GNU Affero General Public License v3 (AGPL-3.0). This means that copyrighted parts of cNORM may only be used free of charge in commercial and non-commercial projects that run under this same license, retain the copyright notice, provide their source code and correctly cite cNORM. Copyright protection includes, for example, the reproduction and distribution of source code or parts of the source code of cNORM or of graphics created with cNORM. The integration of the package into a server environment in order to access the functionality of the software (e.g., for online delivery of norm scores) is also subject to this license. However, a regression function determined with cNORM is not subject to copyright protection and may be used freely for commercial or non-commercial projects. If you want to apply cNORM in a way that is not compatible with the terms of the AGPL-3.0 license, please do not hesitate to contact us to negotiate individual conditions.

If you use cNORM for scientific publications, we also ask you to cite the source.

CDC (2012). National Health and Nutrition Examination Survey: Questionnaires, Datasets and Related Documentation. Available: https://wwwn.cdc.gov/nchs/nhanes/OtherNhanesData.aspx (date of retrieval: 25/08/2018)

Lenhard, A., Lenhard, W., Segerer, R. & Suggate, S. (2015). Peabody Picture Vocabulary Test - Revision IV (Deutsche Adaption). Frankfurt a. M.: Pearson Assessment.

Lenhard, A., Lenhard, W., Suggate, S. & Segerer, R. (2016). A continuous solution to the norming problem. Assessment, Online first, 1-14. doi: 10.1177/1073191116656437

Lenhard, W., Lenhard, A. & Schneider, W. (2017). ELFE II - Ein Leseverständnistest für Erst- bis Siebtklässler. Göttingen: Hogrefe.

Lenhard, W. & Schneider, W. (2006). ELFE 1-6 - Ein Leseverständnistest für Erst- bis Sechstklässler. Göttingen: Hogrefe.

The World Bank (2018). Mortality rate, infant (per 1,000 live births). Available: https://data.worldbank.org/indicator/SP.DYN.IMRT.IN (date of retrieval: 02/09/2018)

The World Bank (2018). Life expectancy at birth, total (years). World Development Indicators. Available: https://data.worldbank.org/indicator/sp.dyn.le00.in (date of retrieval: 01/09/2018)