Hyperoverlap can be used to detect and visualise overlap in n-dimensional space.
To explore the functions in hyperoverlap, we’ll use the iris dataset. This dataset contains 150 observations of three species of iris (“setosa”, “versicolor” and “virginica”). These data are four-dimensional (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) and are documented in ?iris. We’ll set up five test datasets to explore the different functions:

1. test1: two entities (setosa, virginica); three dimensions (Sepal.Length, Sepal.Width, Petal.Length)
2. test2: two entities (versicolor, virginica); three dimensions (as above)
3. test3: two entities (setosa, virginica); four dimensions
4. test4: two entities (versicolor, virginica); four dimensions
5. test5: all entities, all dimensions
test1 <- iris[which(iris$Species!="versicolor"),c(1:3,5)]
test2 <- iris[which(iris$Species!="setosa"),c(1:3,5)]
test3 <- iris[which(iris$Species!="versicolor"),]
test4 <- iris[which(iris$Species!="setosa"),]
test5 <- iris
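As a quick check on this kind of subsetting, the following base-R sketch rebuilds two of the test datasets and inspects their size and the species they contain. The droplevels() call is an addition for tidier output and is not something the text above requires.

```r
# Rebuild two of the test datasets from the built-in iris data
test1 <- iris[which(iris$Species != "versicolor"), c(1:3, 5)]
test4 <- iris[which(iris$Species != "setosa"), ]

# Each species has 50 observations, so a two-species subset has 100 rows
dim(test1)   # rows, then columns (three measurements plus Species)
dim(test4)   # rows, then columns (four measurements plus Species)

# Species is a factor, so the excluded species remains as an empty level
table(test1$Species)

# droplevels() removes the unused level (optional; shown for illustration)
table(droplevels(test1$Species))
```

The empty factor level is harmless for inspection, but it is worth knowing about when tabulating or plotting the subsets.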
Note that entities may be species, genera, populations etc.
To plot the decision boundary using hyperoverlap_plot, the data cannot exceed three dimensions. For high-dimensional visualisation, see hyperoverlap_lda.
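The three-dimension limit can be checked before plotting. This is a minimal sketch, assuming that the measurement columns are exactly the numeric columns of the data frame; n_dims is a hypothetical helper written for this example and is not part of hyperoverlap.

```r
# Hypothetical helper: count the numeric (measurement) columns of a dataset
n_dims <- function(data) sum(vapply(data, is.numeric, logical(1)))

# Subsets built the same way as test1 (three dimensions) and test3 (four)
three_d <- iris[iris$Species != "versicolor", c(1:3, 5)]
four_d  <- iris[iris$Species != "versicolor", ]

n_dims(three_d) <= 3  # TRUE: within the limit for hyperoverlap_plot
n_dims(four_d) <= 3   # FALSE: too many dimensions, see hyperoverlap_lda
```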
library(hyperoverlap)
setosa_virginica3d <- hyperoverlap_detect(test1[,1:3], test1$Species)
versicolor_virginica3d <- hyperoverlap_detect(test2[,1:3], test2$Species)
To examine the result:
setosa_virginica3d@result #gives us the result: overlap or non-overlap?
#> [1] "non-overlap"
versicolor_virginica3d@result
#> [1] "overlap"
setosa_virginica3d@shape #for the non-overlapping pair, was the decision boundary linear or curvilinear?
#> [1] "linear"
hyperoverlap_plot(setosa_virginica3d) #plot the data and the decision boundary in 3d
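A "non-overlap" result with a linear boundary is plausible for setosa and virginica because one of the three dimensions already separates them on its own. The base-R sketch below illustrates this using Petal.Length. It is independent of hyperoverlap: it shows why a separating plane exists for this pair, not how hyperoverlap_detect finds its decision boundary.

```r
# Petal.Length values for each species, from the built-in iris data
petal <- split(iris$Petal.Length, iris$Species)

# Range of Petal.Length per species (rows: min, max)
sapply(petal, range)

# setosa and virginica: the largest setosa value lies below the smallest
# virginica value, so a plane at a fixed Petal.Length separates them
max(petal$setosa) < min(petal$virginica)

# versicolor and virginica: the ranges overlap on this axis. This alone does
# not prove overlap in three dimensions; that is what hyperoverlap_detect tests.
max(petal$versicolor) < min(petal$virginica)
```

Overlapping ranges on a single axis are only a hint, since two entities can overlap on every individual axis and still be separable in the full space.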