kMean Blues

I have written a basic kMeans algorithm in Java that takes in a two-dimensional cluster field, expressed as a series of points in a CSV file. The algorithm places a given number of potential centers in the space and associates each point with its nearest potential center. The centers are then moved to the average location of their associated points, and the points are rechecked and re-associated with their new nearest centers. These two steps repeat until the associations stop changing.
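As a rough illustration of that loop, here is a minimal sketch. It is not my actual code: the class name `KMeansSketch`, the method names, the `maxIter` cap, and the choice to leave an empty cluster's center where it is are all assumptions made for the example. Points and centers are stored here as `{x, y}` pairs.

```java
import java.util.Arrays;

public class KMeansSketch {

    // Returns the index of the center closest to p (squared Euclidean distance).
    public static int nearest(double[] p, double[][] centers) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dx = p[0] - centers[c][0];
            double dy = p[1] - centers[c][1];
            double d = dx * dx + dy * dy;
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    // Alternates the assign and update steps until no point changes center
    // or maxIter rounds have run. centers is modified in place; the returned
    // array maps each point index to the index of its center.
    public static int[] cluster(double[][] points, double[][] centers, int maxIter) {
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);
        for (int iter = 0; iter < maxIter; iter++) {
            // Assign step: associate every point with its nearest center.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int c = nearest(points[i], centers);
                if (c != assignment[i]) {
                    assignment[i] = c;
                    changed = true;
                }
            }
            if (!changed) {
                break;
            }
            // Update step: move each center to the mean of its points.
            double[][] sum = new double[centers.length][2];
            int[] count = new int[centers.length];
            for (int i = 0; i < points.length; i++) {
                sum[assignment[i]][0] += points[i][0];
                sum[assignment[i]][1] += points[i][1];
                count[assignment[i]]++;
            }
            for (int c = 0; c < centers.length; c++) {
                if (count[c] > 0) { // assumption: an empty cluster keeps its old center
                    centers[c][0] = sum[c][0] / count[c];
                    centers[c][1] = sum[c][1] / count[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 2}, {10, 10}, {10, 12}};
        double[][] centers = {{1, 1}, {9, 9}};
        int[] assignment = cluster(points, centers, 100);
        System.out.println(Arrays.toString(assignment));
        System.out.println(Arrays.deepToString(centers));
    }
}
```

The starting centers in `main` are hand-picked for the demo. How the initial centers get placed is a separate decision that this sketch does not address.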


The catch is that the number of clusters the algorithm should look for must be known in advance. There are other clustering algorithms, but to my admittedly limited understanding, kMeans is by far the easiest to implement.

I am currently bouncing between two implementation methods in Java: one more functional, the other more object-oriented. The functional approach parses the CSV file into a two-dimensional array, the first index of which represents the coordinate and the second the point index. The relationships between the centers and the points are represented using an integer array: each index corresponds to a point index, and each value is the index of that point's center.
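To make the array layout concrete, here is a small sketch of it. The class and method names are made up for the example, and it assumes each CSV line holds exactly one `x,y` pair with no header row. It also assumes the centers use the same `[coordinate][index]` layout as the points.

```java
import java.util.Arrays;
import java.util.List;

public class ArrayLayoutSketch {

    // Parses lines of the form "x,y" into coords[coordinate][pointIndex],
    // so coords[0][i] is the x value of point i and coords[1][i] its y value.
    public static double[][] parse(List<String> lines) {
        double[][] coords = new double[2][lines.size()];
        for (int i = 0; i < lines.size(); i++) {
            String[] fields = lines.get(i).split(",");
            coords[0][i] = Double.parseDouble(fields[0].trim());
            coords[1][i] = Double.parseDouble(fields[1].trim());
        }
        return coords;
    }

    // Builds the relationship array: nearest[i] is the index of the center
    // closest to point i (squared Euclidean distance).
    public static int[] assign(double[][] coords, double[][] centers) {
        int pointCount = coords[0].length;
        int centerCount = centers[0].length;
        int[] nearest = new int[pointCount];
        for (int i = 0; i < pointCount; i++) {
            double best = Double.MAX_VALUE;
            for (int c = 0; c < centerCount; c++) {
                double dx = coords[0][i] - centers[0][c];
                double dy = coords[1][i] - centers[1][c];
                double d = dx * dx + dy * dy;
                if (d < best) {
                    best = d;
                    nearest[i] = c;
                }
            }
        }
        return nearest;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("0,0", "1,1", "9,9");
        double[][] coords = parse(lines);
        // Two centers at (0, 0) and (10, 10), stored column-wise like the points.
        double[][] centers = {{0, 10}, {0, 10}};
        System.out.println(Arrays.toString(assign(coords, centers)));
    }
}
```

With a real file, the list of lines could come from `java.nio.file.Files.readAllLines`. The sketch takes a `List<String>` so it runs without any file on disk.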

The object-oriented approach represents each point and each center as an instance of an object. Each point has an integer variable holding the index of its nearest center and a method for finding that center. As a result, the constructor for the point class takes in the point's coordinates and an ArrayList of centers.
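A minimal sketch of what those classes might look like follows. The names `Point`, `Center`, `nearestCenter`, and `findNearestCenter` are placeholders chosen for the example, and the constructor accepts a `List` so that an `ArrayList` can be passed in.

```java
import java.util.ArrayList;
import java.util.List;

public class PointCenterSketch {

    // A candidate cluster center; x and y are mutable so the center can move.
    public static class Center {
        public double x;
        public double y;

        public Center(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    // A data point that remembers the index of its nearest center.
    public static class Point {
        public final double x;
        public final double y;
        public int nearestCenter;

        public Point(double x, double y, List<Center> centers) {
            this.x = x;
            this.y = y;
            findNearestCenter(centers);
        }

        // Recomputes and returns the index of the closest center.
        // Call this again whenever the centers have moved.
        public int findNearestCenter(List<Center> centers) {
            double best = Double.MAX_VALUE;
            for (int c = 0; c < centers.size(); c++) {
                double dx = x - centers.get(c).x;
                double dy = y - centers.get(c).y;
                double d = dx * dx + dy * dy;
                if (d < best) {
                    best = d;
                    nearestCenter = c;
                }
            }
            return nearestCenter;
        }
    }

    public static void main(String[] args) {
        ArrayList<Center> centers = new ArrayList<>();
        centers.add(new Center(0, 0));
        centers.add(new Center(10, 10));
        Point p = new Point(8, 9, centers);
        System.out.println(p.nearestCenter);
    }
}
```

Storing an index rather than a reference to the `Center` keeps the point in step with the array version, at the cost of having to pass the same list back in on every recheck.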
