`getReferenceW()` generates multiple reference datasets and calculates the within-cluster dispersion \(W_b\) for each, across 1 to Kmax clusters.
Reference datasets can be generated in one of two ways:

- Uniformly within the original data range for each variable (`ref.gen="uniform"`)
- Using a principal components transformation to preserve variance structure (`ref.gen="PC"`)
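The two generation schemes can be illustrated with a minimal sketch. The helper names `gen_uniform_ref()` and `gen_pc_ref()` are hypothetical and not part of the package; the PC variant assumes the common approach of sampling uniformly in principal-component coordinates and rotating back, which may differ in detail from what `getReferenceW()` does internally.

```r
# Hypothetical helpers for illustration only; not the package's own code.

# Uniform reference: draw each variable uniformly within its observed range.
gen_uniform_ref <- function(X) {
  X <- as.matrix(X)
  apply(X, 2, function(col) runif(length(col), min(col), max(col)))
}

# PC reference (assumed scheme): sample uniformly in the principal-component
# coordinate system, then rotate back to the original variables.
gen_pc_ref <- function(X) {
  X <- as.matrix(X)
  centre <- colMeans(X)
  Xc <- sweep(X, 2, centre)             # centre each column
  V <- svd(Xc)$v                        # principal axes (right singular vectors)
  Z <- Xc %*% V                         # data in principal-component coordinates
  Zref <- gen_uniform_ref(Z)            # uniform box aligned with those axes
  sweep(Zref %*% t(V), 2, centre, "+")  # rotate back and restore the centre
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
ref_u <- gen_uniform_ref(X)
ref_p <- gen_pc_ref(X)
dim(ref_u)  # same shape as X
```

Sampling in the rotated coordinates lets the reference box follow the main directions of spread in the data instead of the original axes, which is the sense in which the PC option preserves variance structure.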
This function is used in the computation of the Gap statistic to compare the observed clustering dispersion to what is expected under a null reference.
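For context, the Gap statistic is usually defined as below, where \(W(k)\) is the observed within-cluster dispersion for \(k\) clusters and \(W_b(k)\) is the dispersion of the \(b\)-th reference dataset. The notation \(W(k)\) and \(W_b(k)\) is introduced here for illustration; this page does not state whether the package applies exactly this form.

\[
\mathrm{Gap}(k) = \frac{1}{B} \sum_{b=1}^{B} \log W_b(k) - \log W(k), \qquad k = 1, \dots, K_{\max}
\]

A large value of \(\mathrm{Gap}(k)\) means the observed data are clustered more tightly at \(k\) clusters than the null reference would suggest.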
`getReferenceW(X, Kmax, B, ref.gen, ...)`

- `X`: Numeric data matrix (observations × variables) to generate reference datasets from.
- `Kmax`: Maximum number of clusters to compute W for.
- `B`: Number of reference datasets to generate.
- `ref.gen`: Reference generation method:
  - `"PC"`: uses a PCA-based transformation to preserve variance structure
  - any other value: generates uniform reference data per variable
- `...`: Additional arguments passed on to other functions (e.g., `dist.method` for distance calculation, `cl.method` for clustering method, `linkage`, `cor.method`, `nstart`).
Returns the calculated within-cluster dispersion \(W_b\) for the reference datasets.
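The overall procedure can be sketched in base R as follows. This is an illustrative stand-in under stated assumptions, not the package's implementation: `reference_w_sketch()` is a hypothetical name, it always uses uniform reference data and `kmeans()`, and the B × Kmax matrix it returns is an assumed layout, since the shape of the real return value is not documented here.

```r
# Hypothetical sketch; the real getReferenceW() may cluster and return
# results differently (e.g. via cl.method, dist.method, linkage).
reference_w_sketch <- function(X, Kmax, B, nstart = 10) {
  X <- as.matrix(X)
  # Assumed layout: one row per reference dataset, one column per k.
  W <- matrix(NA_real_, nrow = B, ncol = Kmax)
  for (b in seq_len(B)) {
    # Uniform reference data within each variable's observed range.
    Xb <- apply(X, 2, function(col) runif(length(col), min(col), max(col)))
    for (k in seq_len(Kmax)) {
      # Total within-cluster sum of squares as the dispersion for k clusters.
      W[b, k] <- kmeans(Xb, centers = k, nstart = nstart)$tot.withinss
    }
  }
  W
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
W_ref <- reference_w_sketch(X, Kmax = 4, B = 5)
dim(W_ref)  # B rows, Kmax columns
```

With squared Euclidean distance, the total within-cluster sum of squares reported by `kmeans()` coincides with the pooled dispersion used in the original Gap statistic formulation, which is why it serves as the dispersion measure in this sketch.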