## February 20, 2017: Random Forest Classifier

airxiechao

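A random forest builds many decision trees, each trained on a bootstrap sample of the data and a randomly selected subset of the features. Predictions are then made by aggregating the trees' outputs: averaging for regression, majority vote for classification (the approach taken here).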
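To make the aggregation step concrete, here is a minimal sketch of both rules. The prediction arrays are hypothetical values made up for illustration, and `majority_vote` is a helper name introduced here, not something from the original post.

```python
import numpy as np

# Hypothetical class predictions from 3 trees for 4 samples (one row per tree).
class_preds = np.array([
    [0, 1, 1, 2],
    [0, 1, 0, 2],
    [1, 1, 0, 2],
])

# Hypothetical regression outputs from 3 trees for 2 samples (one row per tree).
reg_preds = np.array([
    [2.0, 3.0],
    [4.0, 3.0],
    [3.0, 6.0],
])


def majority_vote(preds):
    """Return the most frequent label in each column (one column per sample)."""
    votes = []
    for column in preds.T:
        labels, counts = np.unique(column, return_counts=True)
        votes.append(labels[np.argmax(counts)])
    return np.array(votes)


class_result = majority_vote(class_preds)  # classification: majority vote
reg_result = reg_preds.mean(axis=0)        # regression: plain average
print(class_result)
print(reg_result)
```

With an even number of trees, ties are possible. In this sketch `np.argmax` resolves a tie in favor of the smallest label, which is an arbitrary choice rather than part of the algorithm.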

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# `data` is assumed to be a pandas DataFrame loaded beforehand, with the
# features in every column but the last and the class label in the last one.
X = data[data.columns[:-1]].to_numpy()
y = data[data.columns[-1]].to_numpy()


class myRandomForest:

    def fit(self, X, y, n_estimators, max_depth):
        self.learners = []
        for _ in range(n_estimators):
            # Bootstrap sample: draw as many rows as the data has, with replacement.
            idx_data = np.random.choice(X.shape[0], X.shape[0], replace=True)
            # Random feature subset: sqrt(n_features) columns, without replacement.
            idx_feat = np.random.choice(X.shape[1], int(np.sqrt(X.shape[1])), replace=False)
            X_sub = X[idx_data][:, idx_feat]
            y_sub = y[idx_data]

            clf = DecisionTreeClassifier(max_depth=max_depth)
            clf.fit(X_sub, y_sub)

            # Remember which features this tree was trained on.
            self.learners.append({
                'tree': clf,
                'idx_feat': idx_feat
            })

    def predict(self, X):
        ps = []
        for learner in self.learners:
            clf = learner['tree']
            idx_feat = learner['idx_feat']
            ps.append(clf.predict(X[:, idx_feat]))

        # Transpose to one row per sample, then take the majority vote.
        ps = np.transpose(ps).tolist()
        return np.array([max(p, key=p.count) for p in ps])

    def score(self, X, y):
        # Accuracy: fraction of predictions that match the labels.
        return np.count_nonzero(self.predict(X) == y) / len(y)


rf = myRandomForest()
rf.fit(X, y, 50, 3)
print('score:', rf.score(X, y))
# score: 0.942307692308 (author's run; varies with the random draws)

dt = DecisionTreeClassifier(max_depth=3)
dt.fit(X, y)
print('decision tree score:', np.count_nonzero(dt.predict(X) == y) / len(y))
# decision tree score: 0.884615384615
```
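Both scores above are measured on the same data the models were trained on, so they may overstate how well either model generalizes. A rough sketch of a held-out comparison follows. It assumes a synthetic dataset from `make_classification` as a stand-in for the post's `data` (which is not available here), and it uses scikit-learn's own `RandomForestClassifier` rather than the hand-written class; the sample sizes, split ratio, and seeds are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the post's dataset (an assumption, not the original data).
X_demo, y_demo = make_classification(
    n_samples=500, n_features=16, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0
)

# Same tree depth and ensemble size as in the post.
models = {
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "random forest": RandomForestClassifier(
        n_estimators=50, max_depth=3, max_features="sqrt", random_state=0
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Store (training accuracy, held-out accuracy) for each model.
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name}: train={scores[name][0]:.3f}, test={scores[name][1]:.3f}")
```

One difference worth keeping in mind: scikit-learn's forest draws a fresh random feature subset at every split, whereas the class above draws a single subset per tree, so the two are related but not identical procedures.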
