A non-parametric method to estimate the number of clusters

“An important and yet unsolved problem in unsupervised data clustering is how to determine the number of clusters. The proposed slope statistic is a non-parametric and data driven approach for estimating the number of clusters in a dataset. This technique uses the output of any clustering algorithm and identifies the maximum number of groups that breaks down the structure of the dataset. Intensive Monte Carlo simulation studies show that the slope statistic outperforms (for the considered examples) some popular methods that have been proposed in the literature. Applications in graph clustering, in iris and breast cancer datasets are shown.” (Fujita et al., 2014)

Fujita et al. (2014). A non-parametric method to estimate the number of clusters, Computational Statistics and Data Analysis, 73, 27-39.

Link to the paper
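
The abstract does not give the formula, so the following is only a minimal sketch of how such a statistic could be computed. It assumes the slope statistic is built on the average silhouette width s(k) of a k-cluster partition and takes the form k_hat = argmax over k of -[s(k+1) - s(k)] * s(k)^p, with p a positive tuning parameter. The function name, the default p = 1, and the silhouette values are hypothetical illustrations, not taken from the paper; check the paper for the exact definition.

```python
def slope_statistic(sil, p=1):
    """Pick k by an assumed silhouette-based slope rule.

    sil: dict mapping k -> average silhouette width s(k), as produced by
         any clustering algorithm run for consecutive values of k.
    p:   positive tuning exponent (assumed form; see the paper).
    Returns (best_k, scores), where scores[k] = -(s(k+1) - s(k)) * s(k)**p.
    """
    scores = {}
    for k in sorted(sil):
        if k + 1 in sil:
            # Large when s(k) is high and drops sharply at k + 1.
            scores[k] = -(sil[k + 1] - sil[k]) * sil[k] ** p
    if not scores:
        raise ValueError("need s(k) for at least two consecutive values of k")
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Made-up silhouette values for illustration only.
sil = {2: 0.55, 3: 0.70, 4: 0.45, 5: 0.40, 6: 0.38}
best_k, scores = slope_statistic(sil, p=1)
print(best_k)
```

With these illustrative values the rule selects the k after which the silhouette falls most steeply, weighted by how good the partition at k already was.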


Committee on Publication Ethics

“The COPE flowcharts have been written and designed as
a practical step-by-step guide for journal editors to
deal with the most common breaches of publication
ethics that crop up repeatedly in scientific and
biomedical journals before and after publication.

The 14 flowcharts have been informed by the hundreds
of cases from around the world on which COPE has
advised since its foundation in 1997. These breaches
range from duplicate (redundant) publication through
to copying other researchers’ work (plagiarism) to out
and out fraud.”

For more information see here and the flowcharts here.