Identifies the optimal number of clusters by finding the "elbow" point in the Within-cluster Sum of Squares (WSS) curve. The elbow represents the k value where increasing k provides diminishing returns in reducing WSS.
Value
A list with three elements:
- optimal_k: Integer, the suggested optimal k value.
- wss_values: Named numeric vector of WSS for each k tested.
- knee_point: Integer, same as optimal_k (the detected elbow).
Details
The function uses the gradient method to detect the elbow:
1. Computes WSS for each k using k-means clustering.
2. Calculates the first derivative (the rate of WSS decrease).
3. Calculates the second derivative (the rate of change of the slope).
4. Identifies the k where the second derivative is at its maximum (the sharpest bend).
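The steps above can be sketched roughly as follows. This is an illustration, not the package's actual implementation: the function name `elbow_sketch` and its arguments `k_values` and `nstart` are hypothetical, and the derivatives are approximated with finite differences via `diff()`, which assumes consecutive integer k values.

```r
# Hypothetical helper for illustration only; not the documented function.
elbow_sketch <- function(x, k_values = 1:10, nstart = 10) {
  stopifnot(length(k_values) >= 3)

  # Step 1: total within-cluster sum of squares for each candidate k
  wss <- vapply(k_values, function(k) {
    stats::kmeans(x, centers = k, nstart = nstart)$tot.withinss
  }, numeric(1))
  names(wss) <- k_values

  # Step 2: first difference approximates the rate of WSS decrease
  d1 <- diff(wss)

  # Step 3: second difference approximates the change in slope
  d2 <- diff(d1)

  # Step 4: d2[i] is centred on k_values[i + 1], so shift the index by one
  knee <- as.integer(k_values[which.max(d2) + 1])

  list(optimal_k = knee, wss_values = wss, knee_point = knee)
}

# Usage on simulated data (three Gaussian blobs; an assumed toy example)
set.seed(42)
x <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 6), ncol = 2),
  matrix(rnorm(100, mean = 12), ncol = 2)
)
res <- elbow_sketch(x, k_values = 1:6)
res$optimal_k
```

Because the second difference is undefined at the first and last k tested, a sketch like this can only return an interior value of `k_values`.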
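Memory complexity: O(n*p), where n is the number of observations and p the number of variables. Memory use is linear in dataset size, which makes the function suitable for large data.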