python - Comparing scikit-learn clusterings using a decision tree


I am doing a project for a class in which I take data in libsvm format and run it through two different clustering algorithms. I have k-means generating 8 clusters, while agglomerative clustering groups the data into 3 clusters.

Now I'm trying to tell whether the cluster labels generated by k-means can be used to predict the cluster labels generated by agglomerative clustering, e.g. whether instances in k-means cluster #6 map to cluster #1 in the agglomerative clustering.

My professor has advised the use of a decision tree classifier, but I'm not quite sure how to do this. I know I should take the agglomerative clustering labels as class labels, input the data, and see how it is classified. That is where my questions lie, and I have several:

1) What does the scikit-learn decision tree classifier output? A list of probabilities for what each instance might be classified as? Or does it explicitly classify each instance?

2) After I input the data and each instance has been classified as one of the 3 clusters generated by agglomerative clustering, how do I go in and find out which cluster it belonged to in k-means?

3) Is there a better way to do this? I need to "compare clusters produced by different methods in a quantitative way", so I don't have to use decision tree classifiers, but I'm not sure what the alternative would be. I've considered the Rand index and the adjusted Rand index, but they don't seem to be what I'm looking for.

Any help is appreciated! Thanks in advance!

Let me answer 3) first. Yes! See the documentation for sklearn.metrics.cluster. It is written from the viewpoint of a "true reference" labeling, but that is not necessary: the adjusted Rand index and normalized mutual information are great for comparing how similar two clusterings are, and each produces a meaningful number.
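As a rough sketch of that comparison, the snippet below scores a k-means labeling against an agglomerative one. The data here is synthetic (make_blobs stands in for the libsvm data, which is an assumption for illustration), and the variable names are made up; only the cluster counts 8 and 3 come from the question.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# Synthetic stand-in for the real data (assumption, not the asker's dataset).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Two clusterings of the same instances, with the cluster counts from the question.
kmeans_labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)
agg_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)

# Neither labeling needs to be the "truth"; both scores only compare
# how the two labelings partition the same instances.
ari = adjusted_rand_score(agg_labels, kmeans_labels)
nmi = normalized_mutual_info_score(agg_labels, kmeans_labels)

print(f"adjusted Rand index:           {ari:.3f}")
print(f"normalized mutual information: {nmi:.3f}")
```

Both scores are unaffected by how the cluster ids are numbered, and both reach 1.0 when the two clusterings are identical. The adjusted Rand index is close to 0 for unrelated labelings and can go slightly negative.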

1) Either. predict gives the classes, predict_proba gives the probabilities.
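To make the difference concrete, here is a minimal sketch under one possible reading of the professor's suggestion: the k-means labels (one-hot encoded) are the only features, and the agglomerative labels are the classes. The synthetic data and all variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption, not the asker's dataset).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
kmeans_labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)
agg_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)

# One-hot encode the k-means cluster ids so the tree treats them as
# categories rather than as ordered numbers.
features = np.eye(8)[kmeans_labels]            # shape (300, 8)

tree = DecisionTreeClassifier(random_state=0).fit(features, agg_labels)

hard = tree.predict(features)                  # one class label per instance
proba = tree.predict_proba(features)           # one row of class probabilities per instance

print(hard[:5])
print(proba[:5])
# Fraction of instances whose agglomerative cluster is recovered
# from the k-means cluster alone (training accuracy).
print(tree.score(features, agg_labels))
```

The columns of the predict_proba output follow the order of tree.classes_, and predict returns the class with the highest probability in each row.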

2) I don't understand the question.

