Sample Data Sets For Cluster Analysis

SUMMARY:

These data sets are included to illustrate some of the cluster analysis methods in S-PLUS.

ARGUMENTS:

animals
Six binary attributes for 20 animals. The codes are 1 = absent, 2 = present. There are missing values in the data. The 20 by 6 data frame contains columns:
war
is the animal warm blooded?

fly
does the animal fly?
ver
is the animal a vertebrate?
end
is the animal endangered?
gro
does the animal live in social groups?
hai
is the animal hairy? Note that bees, caterpillars, and spiders are considered hairy.

The row names are names or abbreviations for the animals: ant - ant, bee - bee, cat - cat, cpl - caterpillar, chi - chimpanzee, cow - cow, duc - duck, eag - eagle, ele - elephant, fly - fly, fro - frog, her - hermit crab, lio - lion, liz - lizard, lob - lobster, man - man, rab - rabbit, sal - salamander, spi - spider, wha - whale.
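The 1/2 coding and the missing values usually need handling before the attributes can be treated as true binary variables. The following is a minimal sketch in Python (not S-PLUS code) of one way to recode them, under the assumption that missing values are represented as None. The function names recode and recode_rows and the two example rows are hypothetical; the rows are placeholders, not values from the animals data.

```python
from typing import Dict, List, Optional

# Column abbreviations as listed in the description of the animals data.
COLUMNS = ["war", "fly", "ver", "end", "gro", "hai"]


def recode(value: Optional[int]) -> Optional[bool]:
    """Map the codes 1 = absent, 2 = present to False/True.

    Missing values (assumed here to be None) are passed through unchanged.
    """
    if value is None:
        return None
    if value not in (1, 2):
        raise ValueError(f"unexpected code: {value!r}")
    return value == 2


def recode_rows(
    rows: Dict[str, List[Optional[int]]]
) -> Dict[str, Dict[str, Optional[bool]]]:
    """Recode every row and label each value with its column name."""
    return {
        name: dict(zip(COLUMNS, (recode(v) for v in codes)))
        for name, codes in rows.items()
    }


# Hypothetical placeholder rows, NOT values from the actual data set.
example = {
    "row_a": [1, 2, 1, None, 2, 2],
    "row_b": [2, 1, 2, 1, None, 1],
}

recoded = recode_rows(example)
print(recoded["row_a"])
```

Keeping missing values as a separate state, rather than folding them into "absent", leaves the choice of how to treat them to the clustering step.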

euro
The gross national product in 1992 and the percentage of the gross national product due to agriculture for 12 countries in Europe. The data frame has 12 rows and 2 columns:
landbouw
percentage of the gross national product due to agriculture.

bbp
the gross national product.

The row names are abbreviations for the countries: B - Belgium, D - Germany, DK - Denmark, E - Spain, F - France, GR - Greece, I - Italy, IRL - Ireland, L - Luxembourg, NL - Netherlands, P - Portugal, UK - United Kingdom.

The data are an extract from the brochure Cijfers en feiten: Een statistisch portret van de Europese Unie published by Eurostat, the European agency for statistics.

ruspini
Ruspini data. An artificial data set with four groups, created by Ruspini (1970) to illustrate clustering methods. The data frame has 75 rows and 2 columns.

REFERENCE:

Ruspini, E. H. (1970). Numerical methods for fuzzy clustering. Inform. Sci. 2, 319-350.