Graph of scikit-learn ExtraTreeClassifier and RandomForestClassifier
I am trying to make graphs that illustrate the difference between RandomForestClassifier and ExtraTreeClassifier in scikit-learn. I think I might have figured it out, but I am unsure. Here is the code to fit and graph the iris dataset for both:
    import numpy as np
    from io import StringIO  # stands in for sklearn.externals.six.StringIO

    import pydot
    from sklearn import tree
    from sklearn.datasets import load_iris

    iris = load_iris()
    X = iris.data
    y = iris.target

    # Fit a single extremely randomized tree and write it to a PDF
    clf = tree.ExtraTreeClassifier()
    clf = clf.fit(iris.data, iris.target)
    dot_data = StringIO()
    tree.export_graphviz(clf, out_file=dot_data)
    # Newer pydot releases return a list of graphs here; use the first element there
    graph = pydot.graph_from_dot_data(dot_data.getvalue())
    file_name = "et_iris.pdf"
    graph.write_pdf(file_name)

    # Fit a single decision tree and write it to a PDF
    clf = tree.DecisionTreeClassifier()
    clf = clf.fit(iris.data, iris.target)
    dot_data = StringIO()
    tree.export_graphviz(clf, out_file=dot_data)
    graph = pydot.graph_from_dot_data(dot_data.getvalue())
    file_name = "rdf_iris.pdf"
    graph.write_pdf(file_name)
The graphs it produces seem correct; the ET graph is "bushier" than the decision tree graph.
Am I correct that DecisionTreeClassifier is the same as a single tree in RandomForestClassifier, and ExtraTreeClassifier is the same as a single tree in ExtraTreesClassifier?
Is there a way to get the trees in an actual RDF or ET classifier? I tried using .estimators_, but the trees in the forests do not seem to have an export method.
export_graphviz is not a method, it is a function. None of the trees "has" it. You can use it on the trees in estimators_. You are right about ExtraTreeClassifier being a single tree in ExtraTreesClassifier, and DecisionTreeClassifier being a single tree in RandomForestClassifier. However, that doesn't cover it entirely, because:
- RandomForestClassifier bootstraps the data set separately for each tree; ExtraTreesClassifier does not bootstrap (by default).
- max_features=n_features in the single trees by default, so all features can be used in every split.
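To make the points above concrete, here is a minimal sketch of calling export_graphviz on each tree in estimators_. It assumes a scikit-learn version where export_graphviz returns the DOT source as a string when out_file=None. The forest size (n_estimators=3), the random_state value and the dot_sources dictionary are arbitrary choices for illustration, not something from the question.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.tree import export_graphviz

iris = load_iris()

# Small forests so only a few trees get exported (n_estimators=3 is arbitrary).
forests = {
    "rf": RandomForestClassifier(n_estimators=3, random_state=0),
    "et": ExtraTreesClassifier(n_estimators=3, random_state=0),
}

dot_sources = {}  # hypothetical container: "<forest>_<index>" -> DOT source
for name, forest in forests.items():
    forest.fit(iris.data, iris.target)
    # The bootstrap default is one of the differences between the two ensembles.
    print(name, "bootstrap =", forest.bootstrap)
    for i, single_tree in enumerate(forest.estimators_):
        # export_graphviz is a function: pass it the fitted tree.
        # With out_file=None it returns the DOT source as a string.
        dot_sources[f"{name}_{i}"] = export_graphviz(single_tree, out_file=None)

# Show which single-tree class each forest is built from.
print(type(forests["rf"].estimators_[0]).__name__)
print(type(forests["et"].estimators_[0]).__name__)
```

Each string in dot_sources can then be handed to pydot.graph_from_dot_data, as in the question, to render one PDF per tree.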