Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission. https://cloud.tencent.com/developer/article/1338368
This post uses Python's sklearn to build a small decision tree demo, visualizes the fitted tree with graphviz, and shows the resulting decision tree.
from sklearn.datasets import load_iris
from sklearn import tree
from sklearn.tree import export_graphviz
import subprocess
import sys


def visualize_tree(clf, feature_names, dot_file):
    """Create a PNG of the tree using graphviz.

    clf -- fitted scikit-learn DecisionTreeClassifier.
    feature_names -- list of feature names.
    dot_file -- dot file name and path.
    """
    # Write the tree in dot format to the path passed in by the caller.
    with open(dot_file, 'w') as f:
        export_graphviz(clf, out_file=f,
                        feature_names=feature_names)
    dt_png = "dt.png"
    # Requires the graphviz "dot" executable on PATH.
    command = ["dot", "-Tpng", dot_file, "-o", dt_png]
    try:
        subprocess.check_call(command)
    except Exception as e:
        print(e)
        sys.exit("Could not run dot, i.e. graphviz, to "
                 "produce visualization")


def iris_demo():
    clf = tree.DecisionTreeClassifier()
    iris = load_iris()
    # iris.data is 150 x 4 (attributes); iris.target holds the 150 class labels, encoded as 0, 1, 2
    clf = clf.fit(iris.data, iris.target)
    dot_file = 'tree.dot'
    visualize_tree(clf, iris.feature_names, dot_file)
    # (graph,) = pydot.graph_from_dot_file('tree.dot')
    # graph.write_png('somefile.png')


if __name__ == '__main__':
    iris_demo()
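The demo above depends on the graphviz `dot` executable being installed. As a minimal sketch (not part of the original demo), the snippet below checks for `dot` with `shutil.which` before shelling out, and uses `out_file=None` so that `export_graphviz` returns the dot source as a string instead of writing a file. The variable names `dot_source` and `dot_path` are only illustrative.

```python
import shutil

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
clf = DecisionTreeClassifier().fit(iris.data, iris.target)

# With out_file=None, export_graphviz returns the dot source as a string.
dot_source = export_graphviz(clf, out_file=None,
                             feature_names=iris.feature_names)
print(dot_source[:200])

# shutil.which returns None when "dot" is not on PATH.
dot_path = shutil.which("dot")
if dot_path is None:
    print("graphviz 'dot' not found; install graphviz to render the PNG")
else:
    print("dot found at", dot_path)
```

Checking up front gives a clearer message than waiting for `subprocess.check_call` to fail.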
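Dataset
1. The four attributes of each flower, 150 samples
2. The flower classes: three classes in total, encoded as 0, 1, 2
3. Description of the four flower attributes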
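The three items above can be checked directly on the object returned by `load_iris`. The following is a small sketch for inspection only; it assumes item 3 corresponds to `iris.feature_names`, which is the attribute the demo passes to graphviz.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# 1. Attribute matrix: one row per sample, one column per attribute.
print(iris.data.shape)

# 2. Class labels: the distinct values found in iris.target.
labels = sorted(set(iris.target.tolist()))
print(labels)

# 3. Names of the four attributes (assumed to be what item 3 refers to).
print(iris.feature_names)

# Human-readable names of the classes.
print(iris.target_names)
```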
The final generated result (dt.png):
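When the PNG cannot be viewed, the same fitted tree can be inspected as plain text. This is a hedged alternative to the graphviz route used in the demo, not the author's method: it relies on `sklearn.tree.export_text`, which is available in scikit-learn 0.21 and later. The `random_state=0` argument is an assumption added here only to make the tree reproducible between runs.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
# random_state is fixed only so repeated runs give the same tree.
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# export_text renders the split rules as an indented text outline.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each indented line is a split on one attribute, and each leaf line reports the predicted class (0, 1, or 2).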
For installing pydot, see this other blog post:
http://blog.csdn.net/haluoluo211/article/details/78200078
If you repost this article, please credit the source and leave a comment below.
References
http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html
http://www.kdnuggets.com/2017/05/simplifying-decision-tree-interpretation-decision-rules-python.html