# Difference between revisions of "Streamcluster"


## Revision as of 02:00, 11 February 2009

Our streamcluster kernel is adapted from the streamcluster benchmark in the PARSEC suite developed at Princeton University.

The following description of streamcluster is taken from the PARSEC technical report [1]:

"For a stream of input points, it finds a predetermined number of medians so that each point is assigned to its nearest center. The quality of the clustering is measured by the sum of squared distances (SSQ) metric."
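To make the SSQ metric concrete, here is a minimal serial sketch in Python. It is not taken from the benchmark source: the function names `squared_distance` and `assign_and_ssq` are hypothetical, and it assumes unweighted points with squared Euclidean distance.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign_and_ssq(points, centers):
    """Assign each point to its nearest center and return (assignment, ssq).

    assignment[i] is the index of the center nearest to points[i];
    ssq is the sum of squared distances from each point to that center.
    """
    assignment = []
    ssq = 0.0
    for p in points:
        dists = [squared_distance(p, c) for c in centers]
        best = min(range(len(centers)), key=dists.__getitem__)
        assignment.append(best)
        ssq += dists[best]
    return assignment, ssq
```

For example, with points `(0, 0)`, `(1, 0)`, `(10, 0)` and centers `(0, 0)`, `(10, 0)`, the first two points go to the first center and the third to the second. A lower SSQ means a tighter clustering.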

Our CUDA version parallelizes the pgain function: "Given a preliminary solution, the function computes how much cost can be saved by opening a new center. For every new point, it weighs the cost of making it a new center and reassigning some of the existing points to it against the savings caused by minimizing the distance between two points x and y for all points..."
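The trade-off described in that quotation can be illustrated with a simplified serial sketch in Python. This is not the benchmark's pgain code: the name `gain_of_opening` and the `facility_cost` parameter are assumptions made for illustration, points are unweighted, and any further bookkeeping done by the real function is left out.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def gain_of_opening(points, current_cost, candidate, facility_cost):
    """Estimate the cost saved by opening `candidate` as a new center.

    current_cost[i] is what points[i] pays under the preliminary solution.
    facility_cost is an assumed fixed price for opening any new center.
    Returns (gain, switch), where switch lists the indices of the points
    that would be reassigned to the candidate.
    """
    switch = []
    savings = 0.0
    # Each iteration reads only its own point, so the iterations are
    # independent of one another apart from the final sum.
    for i, p in enumerate(points):
        new_cost = squared_distance(p, candidate)
        if new_cost < current_cost[i]:
            switch.append(i)
            savings += current_cost[i] - new_cost
    return savings - facility_cost, switch
```

A positive gain suggests that opening the candidate is worthwhile. Because each point is evaluated independently, a loop of this shape is a natural fit for one GPU thread per point followed by a reduction, although the page does not say how the actual kernel is organized.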

[1] Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh and Kai Li. The PARSEC Benchmark Suite: Characterization and Architectural Implications. Technical Report TR-811-08, Princeton University, January 2008.