Stream Cluster

Our streamcluster kernel is adapted from the streamcluster benchmark in the PARSEC suite developed by Princeton University.

The following description of streamcluster is taken from the PARSEC technical report [1]:

“For a stream of input points, it finds a predetermined number of medians so that each point is assigned to its nearest center. The quality of the clustering is measured by the sum of squared distances (SSQ) metric.”
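To make the SSQ metric concrete, here is a minimal Python sketch that assigns each point to its nearest center and sums the squared distances. The function names and the sample data are illustrative assumptions, not taken from the benchmark source, and the sketch ignores the streaming aspect entirely.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign_and_score(points, centers):
    """Assign every point to its nearest center.

    Returns (assignment, ssq), where assignment[i] is the index of the
    center chosen for points[i] and ssq is the sum of squared distances.
    Hypothetical helper, written only to illustrate the metric.
    """
    assignment = []
    ssq = 0.0
    for p in points:
        dists = [squared_distance(p, c) for c in centers]
        nearest = min(range(len(centers)), key=dists.__getitem__)
        assignment.append(nearest)
        ssq += dists[nearest]
    return assignment, ssq


# Made-up data: two obvious groups on a line, one center in each group.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]
assignment, ssq = assign_and_score(points, centers)
```

A lower SSQ means the points sit closer to their assigned centers, so it can be used to compare two candidate sets of centers on the same data.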

Our CUDA version parallelizes the pgain function, which the report describes as follows: “Given a preliminary solution, the function computes how much cost can be saved by opening a new center. For every new point, it weighs the cost of making it a new center and reassigning some of the existing points to it against the savings caused by minimizing the distance between two points x and y for all points.”
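The trade-off in that description can be sketched serially in a few lines of Python. This is a simplified illustration under stated assumptions, not the benchmark's pgain: the name gain_of_opening, the fixed open_cost parameter, and the sample data are all hypothetical, and the sketch only considers moving points to the candidate center.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def gain_of_opening(points, current_cost, candidate, open_cost):
    """Estimate the gain from opening points[candidate] as a new center.

    current_cost[i] is what point i pays to its present center.
    open_cost is an assumed fixed price for opening a center.
    Returns (gain, switchers): the net cost saved and the indices of the
    points that would be reassigned to the candidate.
    """
    savings = 0.0
    switchers = []
    for i, p in enumerate(points):
        new_cost = squared_distance(p, points[candidate])
        if new_cost < current_cost[i]:
            # Point i is closer to the candidate, so it would switch.
            savings += current_cost[i] - new_cost
            switchers.append(i)
    return savings - open_cost, switchers


# Made-up preliminary solution: every point is served by points[0].
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
current_cost = [squared_distance(p, points[0]) for p in points]
gain, switchers = gain_of_opening(points, current_cost, candidate=2, open_cost=50.0)
```

A positive gain means opening the candidate pays for itself. Each iteration of the loop over points is independent apart from the running sum, which is the kind of structure that maps naturally onto one GPU thread per point; how the CUDA kernel actually partitions the work is not described here.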

[1] Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. The PARSEC Benchmark Suite: Characterization and Architectural Implications. Technical Report TR-811-08, Princeton University, January 2008.

  • Last modified: 2018/10/03 17:45