VoltageClusterer (jung2 2.1 API)

java.lang.Object
- edu.uci.ics.jung.algorithms.cluster.VoltageClusterer<V,E>

```
public class VoltageClusterer<V,E>
extends Object
```
Clusters vertices of a Graph based on their ranks as calculated by VoltageScorer. This algorithm is based on, but not identical to, the method described in the Wu and Huberman paper listed under See Also. The primary difference is that Wu and Huberman assume a priori that the clusters are of approximately the same size, and therefore use a more complex method than k-means (which is used here) for determining cluster membership based on co-occurrence data.
The algorithm proceeds as follows:
- First, generate a set of candidate clusters:
  - pick a (widely separated) vertex pair and run VoltageScorer
  - group the vertices into two clusters according to their voltages
  - store the resulting candidate clusters
- Second, generate k-1 clusters:
  - pick a vertex v as a cluster 'seed'
    (Wu/Huberman: most frequent vertex in candidate clusters)
  - calculate the co-occurrence of v with each other vertex over all candidate clusters
  - separate the co-occurrence counts into high/low; the high vertices constitute a cluster
  - remove v's vertices from the candidate clusters; continue
- Finally, assign the remaining unassigned vertices to the kth ("garbage") cluster.
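The second phase above can be illustrated with a minimal plain-Java sketch. It does not use JUNG and is not the library's implementation: the class `CoOccurrenceSketch` and its methods are hypothetical names, and the high/low split uses a simple midpoint threshold as a stand-in for the k-means step the class actually relies on.

```java
import java.util.*;

class CoOccurrenceSketch {

    /** For every vertex seen in any candidate, counts the candidates it shares with the seed. */
    static <V> Map<V, Integer> coOccurrence(Collection<Set<V>> candidates, V seed) {
        Map<V, Integer> counts = new HashMap<>();
        for (Set<V> candidate : candidates) {
            boolean hasSeed = candidate.contains(seed);
            for (V v : candidate) {
                // Vertices that never share a candidate with the seed keep a count of 0.
                counts.merge(v, hasSeed ? 1 : 0, Integer::sum);
            }
        }
        return counts;
    }

    /**
     * Assumption: a midpoint threshold replaces the k-means high/low split.
     * Also assumes the seed appears in at least one candidate.
     */
    static <V> Set<V> highGroup(Map<V, Integer> counts) {
        if (counts.isEmpty()) {
            return new HashSet<>();
        }
        int min = Collections.min(counts.values());
        int max = Collections.max(counts.values());
        if (min == max) {
            return new HashSet<>(counts.keySet()); // no split possible
        }
        double threshold = (min + max) / 2.0;
        Set<V> high = new HashSet<>();
        for (Map.Entry<V, Integer> e : counts.entrySet()) {
            if (e.getValue() > threshold) {
                high.add(e.getKey());
            }
        }
        return high;
    }

    public static void main(String[] args) {
        List<Set<String>> candidates = new ArrayList<>();
        candidates.add(new HashSet<>(Arrays.asList("a", "b", "c")));
        candidates.add(new HashSet<>(Arrays.asList("a", "b")));
        candidates.add(new HashSet<>(Arrays.asList("a", "b", "d")));
        candidates.add(new HashSet<>(Arrays.asList("d", "e")));
        candidates.add(new HashSet<>(Arrays.asList("e", "f")));

        // Counts for seed "a": a=3, b=3, c=1, d=1, e=0, f=0; threshold is 1.5.
        Set<String> cluster = highGroup(coOccurrence(candidates, "a"));
        System.out.println(new TreeSet<>(cluster)); // prints [a, b]

        // "Remove v's vertices from the candidate clusters; continue".
        for (Set<String> candidate : candidates) {
            candidate.removeAll(cluster);
        }
    }
}
```

Repeating the seed, count, split, and remove steps until k-1 clusters exist, and then collecting whatever is left into one final set, mirrors the outline above.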
NOTE: Depending on how the co-occurrence data splits the vertices into clusters, this algorithm may return fewer clusters than requested. It never returns more clusters than requested.
Author:

Joshua O'Madadhain

See Also:

"'Finding communities in linear time: a physics approach', Fang Wu and Bernardo Huberman, http://www.hpl.hp.com/research/idl/papers/linear/", VoltageScorer, KMeansClusterer

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected class`	`VoltageClusterer.MapValueArrayComparator`

Field Summary

Fields
Modifier and Type	Field and Description
`protected Graph<V,E>`	`g`
`protected KMeansClusterer<V>`	`kmc`
`protected int`	`num_candidates`
`protected Random`	`rand`

Constructor Summary

Constructors
Constructor and Description
`VoltageClusterer(Graph<V,E> g, int num_candidates)` Creates an instance of a VoltageClusterer with the specified parameters.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`protected void`	`addOneCandidateCluster(LinkedList<Set<V>> candidates, Map<V,double[]> voltage_ranks)` Alternative to addTwoCandidateClusters(): cluster vertices by voltages into 2 clusters.
`protected void`	`addTwoCandidateClusters(LinkedList<Set<V>> candidates, Map<V,double[]> voltage_ranks)` Do k-means with three intervals and pick the smaller two clusters (presumed to be on the ends); this is closer to the Wu-Huberman method.
`protected Collection<Set<V>>`	`cluster_internal(V origin, int num_clusters)` Does the work of `getCommunity` and `cluster`.
`Collection<Set<V>>`	`cluster(int num_clusters)` Clusters the vertices of `g` into `num_clusters` clusters, based on their connectivity.
`Collection<Set<V>>`	`getCommunity(V v)`
`protected Map<V,double[]>`	`getObjectCounts(Collection<Set<V>> candidates, V seed)`
`protected List<V>`	`getSeedCandidates(Collection<Set<V>> candidates)` Returns a list of cluster seeds, ranked in decreasing order of number of appearances in the specified collection of candidate clusters.
`protected void`	`setRandomSeed(int random_seed)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - num_candidates
```
protected int num_candidates
```
  - kmc
```
protected KMeansClusterer<V> kmc
```
  - rand
```
protected Random rand
```
  - g
```
protected Graph<V,E> g
```
- Constructor Detail
  - VoltageClusterer
```
public VoltageClusterer(Graph<V,E> g,
                        int num_candidates)
```
    Creates an instance of a VoltageClusterer with the specified parameters. These are mostly parameters that are passed directly to VoltageScorer and KMeansClusterer.
    
    Parameters:
    
    g - the graph whose vertices are to be clustered
    
    num_candidates - the number of candidate clusters to create
- Method Detail
  - setRandomSeed
```
protected void setRandomSeed(int random_seed)
```
  - getCommunity
```
public Collection<Set<V>> getCommunity(V v)
```
    Parameters:
    
    v - the vertex whose community we wish to discover
    
    Returns:
    
    a community (cluster) centered around v.
  - cluster
```
public Collection<Set<V>> cluster(int num_clusters)
```
    Clusters the vertices of g into num_clusters clusters, based on their connectivity.
    
    Parameters:
    
    num_clusters - the number of clusters to identify
    
    Returns:
    
    a collection of clusters (sets of vertices)
  - cluster_internal
```
protected Collection<Set<V>> cluster_internal(V origin,
                                              int num_clusters)
```
    Does the work of getCommunity and cluster.
    
    Parameters:
    
    origin - the vertex around which clustering is to be done
    
    num_clusters - the (maximum) number of clusters to find
    
    Returns:
    
    a collection of clusters (sets of vertices)
  - addTwoCandidateClusters
```
protected void addTwoCandidateClusters(LinkedList<Set<V>> candidates,
                                       Map<V,double[]> voltage_ranks)
```
    Do k-means with three intervals and pick the smaller two clusters (presumed to be on the ends); this is closer to the Wu-Huberman method.
    
    Parameters:
    
    candidates - the list of clusters to populate
    
    voltage_ranks - the voltage values for each vertex
  - addOneCandidateCluster
```
protected void addOneCandidateCluster(LinkedList<Set<V>> candidates,
                                      Map<V,double[]> voltage_ranks)
```
    Alternative to addTwoCandidateClusters(): cluster vertices by voltages into 2 clusters. Only the smaller of the two clusters returned by k-means is treated as a 'true' cluster candidate; the other is a garbage cluster.
    
    Parameters:
    
    candidates - the list of clusters to populate
    
    voltage_ranks - the voltage values for each vertex
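
The selection rule shared by addTwoCandidateClusters and addOneCandidateCluster (run k-means on the voltages, keep the smaller clusters) can be sketched in plain Java. This is an illustration only, not JUNG's code: `VoltageSplitSketch` and its methods are hypothetical, voltages are passed as a `Map<V, Double>` rather than `Map<V,double[]>`, and the initial centers are spread evenly over the voltage range, which is an assumption made to keep the example deterministic.

```java
import java.util.*;

class VoltageSplitSketch {

    /** One-dimensional Lloyd's algorithm; returns the cluster index of each value. */
    static int[] kMeans1D(double[] values, int k, int maxIterations) {
        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();
        double[] centers = new double[k];
        for (int c = 0; c < k; c++) {
            // Assumption: evenly spaced initial centers (deterministic).
            centers[c] = min + (max - min) * (c + 0.5) / k;
        }
        int[] assignment = new int[values.length];
        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            for (int i = 0; i < values.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (Math.abs(values[i] - centers[c]) < Math.abs(values[i] - centers[best])) {
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            // Move each non-empty center to the mean of its members.
            double[] sum = new double[k];
            int[] n = new int[k];
            for (int i = 0; i < values.length; i++) {
                sum[assignment[i]] += values[i];
                n[assignment[i]]++;
            }
            for (int c = 0; c < k; c++) {
                if (n[c] > 0) {
                    centers[c] = sum[c] / n[c];
                }
            }
            if (!changed && iter > 0) {
                break; // assignments are stable
            }
        }
        return assignment;
    }

    /** Groups vertices by voltage into k clusters, ordered from smallest to largest. */
    static <V> List<Set<V>> clustersBySize(Map<V, Double> voltages, int k) {
        List<V> vertices = new ArrayList<>(voltages.keySet());
        double[] values = new double[vertices.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = voltages.get(vertices.get(i));
        }
        int[] assignment = kMeans1D(values, k, 100);
        List<Set<V>> clusters = new ArrayList<>();
        for (int c = 0; c < k; c++) {
            clusters.add(new HashSet<>());
        }
        for (int i = 0; i < values.length; i++) {
            clusters.get(assignment[i]).add(vertices.get(i));
        }
        clusters.sort((a, b) -> Integer.compare(a.size(), b.size()));
        return clusters;
    }

    public static void main(String[] args) {
        Map<String, Double> voltages = new HashMap<>();
        voltages.put("a", 1.0);  voltages.put("b", 0.95); voltages.put("c", 0.9);
        voltages.put("d", 0.5);  voltages.put("e", 0.55); voltages.put("f", 0.45);
        voltages.put("g", 0.5);  voltages.put("h", 0.05); voltages.put("i", 0.0);

        // Three intervals: the two smaller clusters become candidates.
        List<Set<String>> three = clustersBySize(voltages, 3);
        System.out.println(new TreeSet<>(three.get(0))); // prints [h, i]
        System.out.println(new TreeSet<>(three.get(1))); // prints [a, b, c]
    }
}
```

With `k = 2`, only `clustersBySize(voltages, 2).get(0)` would be kept, which corresponds to the single-candidate variant described above.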
  - getSeedCandidates
```
protected List<V> getSeedCandidates(Collection<Set<V>> candidates)
```
    Returns a list of cluster seeds, ranked in decreasing order of number of appearances in the specified collection of candidate clusters.
    
    Parameters:
    
    candidates - the set of candidate clusters
    
    Returns:
    
    a list of cluster seeds, most frequently appearing first
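
The ranking described for getSeedCandidates can be sketched in plain Java as follows. `SeedRankingSketch.rankSeeds` is a hypothetical helper written for illustration, not the library method; how ties are ordered here is an assumption, since the documentation above does not specify it.

```java
import java.util.*;

class SeedRankingSketch {

    /** Ranks vertices by the number of candidate clusters they appear in, highest first. */
    static <V> List<V> rankSeeds(Collection<Set<V>> candidates) {
        Map<V, Integer> appearances = new LinkedHashMap<>();
        for (Set<V> candidate : candidates) {
            for (V v : candidate) {
                appearances.merge(v, 1, Integer::sum);
            }
        }
        List<V> ranked = new ArrayList<>(appearances.keySet());
        // Stable sort: vertices with equal counts keep their first-seen order (an assumption).
        ranked.sort((a, b) -> Integer.compare(appearances.get(b), appearances.get(a)));
        return ranked;
    }

    public static void main(String[] args) {
        List<Set<String>> candidates = new ArrayList<>();
        candidates.add(new HashSet<>(Arrays.asList("a", "b", "c")));
        candidates.add(new HashSet<>(Arrays.asList("a", "b")));
        candidates.add(new HashSet<>(Arrays.asList("a", "d")));

        // "a" appears three times and "b" twice, so they lead the ranking.
        List<String> ranked = rankSeeds(candidates);
        System.out.println(ranked.get(0) + ", " + ranked.get(1)); // prints a, b
    }
}
```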
  - getObjectCounts
```
protected Map<V,double[]> getObjectCounts(Collection<Set<V>> candidates,
                                          V seed)
```
