hyperoverlap

Matilda Brown

2021-08-10

Hyperoverlap can be used to detect and visualise overlap in n-dimensional space.

Data: iris

To explore the functions in hyperoverlap, we’ll use the iris dataset. This dataset contains 150 observations of three species of iris (“setosa”, “versicolor” and “virginica”). These data are four-dimensional (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) and are documented in ?iris. We’ll set up five test datasets to explore the different functions:

1. test1: two entities (setosa, virginica); three dimensions (Sepal.Length, Sepal.Width, Petal.Length)
2. test2: two entities (versicolor, virginica); three dimensions (as above)
3. test3: two entities (setosa, virginica); four dimensions
4. test4: two entities (versicolor, virginica); four dimensions
5. test5: all entities, all dimensions

# columns 1:3 hold the three measurements used for the 3D tests; column 5 is Species
test1 <- iris[which(iris$Species!="versicolor"),c(1:3,5)]
test2 <- iris[which(iris$Species!="setosa"),c(1:3,5)]
test3 <- iris[which(iris$Species!="versicolor"),]
test4 <- iris[which(iris$Species!="setosa"),]
test5 <- iris

Note that entities may be species, genera, populations, etc.
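Before running any analysis, it can help to confirm that the subsets have the shape described in the list above. The following is a minimal sketch using only base R and the built-in iris data. The name species_counts is introduced here for illustration and is not part of hyperoverlap.

```r
# Rebuild two of the test datasets from the built-in iris data (base R only).
test1 <- iris[iris$Species != "versicolor", c(1:3, 5)]
test3 <- iris[iris$Species != "versicolor", ]

dim(test1)   # 100 rows; 3 measurement columns plus Species
dim(test3)   # 100 rows; 4 measurement columns plus Species

# Subsetting keeps the unused "versicolor" factor level with a count of 0;
# droplevels() removes it so that only the two entities are tabulated.
species_counts <- table(droplevels(test1$Species))
species_counts
```

Each remaining species should appear 50 times. droplevels() is used here only to tidy the table; the test datasets themselves are left unchanged.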

Examining overlap between two entities in 3D

hyperoverlap_plot can draw the decision boundary only when the data have no more than three dimensions. For higher-dimensional visualisation, see hyperoverlap_lda.

library(hyperoverlap)
setosa_virginica3d <- hyperoverlap_detect(test1[,1:3], test1$Species)
versicolor_virginica3d <- hyperoverlap_detect(test2[,1:3], test2$Species)

To examine the result:

setosa_virginica3d@result             #gives us the result: overlap or non-overlap?
#> [1] "non-overlap"
versicolor_virginica3d@result
#> [1] "overlap"

setosa_virginica3d@shape              #for the non-overlapping pair, was the decision boundary linear or curvilinear? 
#> [1] "linear"


hyperoverlap_plot(setosa_virginica3d) #plot the data and the decision boundary in 3d
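As a rough sanity check on the results above, we can compare the range of a single variable between species using base R. This sketch is not how hyperoverlap reaches its decision; the helper petal_range and the variables gap_sv and gap_vv are hypothetical names introduced only for this illustration.

```r
# Range (min, max) of Petal.Length for one species of the built-in iris data.
petal_range <- function(species) {
  range(iris$Petal.Length[iris$Species == species])
}

setosa_range     <- petal_range("setosa")
versicolor_range <- petal_range("versicolor")
virginica_range  <- petal_range("virginica")

# Distance between the smallest virginica value and the largest value of the
# other species: positive means the ranges are disjoint on this axis,
# negative means they overlap on this axis.
gap_sv <- virginica_range[1] - setosa_range[2]
gap_vv <- virginica_range[1] - versicolor_range[2]

gap_sv > 0   # TRUE: setosa and virginica do not overlap on Petal.Length
gap_vv < 0   # TRUE: versicolor and virginica overlap on Petal.Length
```

A gap on one axis is enough for two groups to be linearly separable, which agrees with the "non-overlap" and "linear" results for setosa and virginica. The reverse does not hold: overlapping ranges on one variable do not show that two groups overlap in three dimensions, so the versicolor and virginica check is only suggestive.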