
Decision Tree Classifier

hblt-j · Published 2017/08/29 11:28
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.Row
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.Column
import org.apache.spark.sql.DataFrameReader
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.DataFrameStatFunctions
import org.apache.spark.sql.functions._

import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.DecisionTreeClassificationModel
import org.apache.spark.ml.classification.DecisionTreeClassifier
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.{ VectorAssembler, IndexToString, StringIndexer, VectorIndexer }


val spark = SparkSession.builder().appName("Spark decision tree classifier").config("spark.some.config.option", "some-value").getOrCreate()

// For implicit conversions like converting RDDs to DataFrames
import spark.implicits._

// Sample data only; for the complete dataset see http://blog.csdn.net/hadoop_spark_storm/article/details/53412598
val dataList: List[(Double, String, Double, Double, String, Double, Double, Double, Double)] = List(
      (0, "male", 37, 10, "no", 3, 18, 7, 4),
      (0, "female", 27, 4, "no", 4, 14, 6, 4),
      (0, "female", 32, 15, "yes", 1, 12, 1, 4),
      (0, "male", 57, 15, "yes", 5, 18, 6, 5),
      (0, "male", 22, 0.75, "no", 2, 17, 6, 3),
      (0, "female", 32, 1.5, "no", 2, 17, 5, 5))
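
The inline list above covers only a handful of rows. As a sketch, the full dataset could be loaded from a file instead — the filename `affairs.csv` and the presence of a header row are assumptions, not part of the original post:

```scala
// Hypothetical: load the full dataset from a CSV export instead of the inline sample.
// "affairs.csv" and the header option are assumptions about how the data was saved.
val csvData = spark.read
  .option("header", "true")      // first line contains column names
  .option("inferSchema", "true") // let Spark infer numeric vs. string columns
  .csv("affairs.csv")
```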


val data = dataList.toDF("affairs", "gender", "age", "yearsmarried", "children", "religiousness", "education", "occupation", "rating") 
data: org.apache.spark.sql.DataFrame = [affairs: double, gender: string ... 7 more fields]

data.printSchema() 
root 
 |-- affairs: double (nullable = false) 
 |-- gender: string (nullable = true) 
 |-- age: double (nullable = false) 
 |-- yearsmarried: double (nullable = false) 
 |-- children: string (nullable = true) 
 |-- religiousness: double (nullable = false) 
 |-- education: double (nullable = false) 
 |-- occupation: double (nullable = false) 
 |-- rating: double (nullable = false) 


data.show(10,truncate=false)
+-------+------+----+------------+--------+-------------+---------+----------+------+
|affairs|gender|age |yearsmarried|children|religiousness|education|occupation|rating|
+-------+------+----+------------+--------+-------------+---------+----------+------+
|0.0    |male  |37.0|10.0        |no      |3.0          |18.0     |7.0       |4.0   |
|0.0    |female|27.0|4.0         |no      |4.0          |14.0     |6.0       |4.0   |
|0.0    |female|32.0|15.0        |yes     |1.0          |12.0     |1.0       |4.0   |
|0.0    |male  |57.0|15.0        |yes     |5.0          |18.0     |6.0       |5.0   |
|0.0    |male  |22.0|0.75        |no      |2.0          |17.0     |6.0       |3.0   |
|0.0    |female|32.0|1.5         |no      |2.0          |17.0     |5.0       |5.0   |
|0.0    |female|22.0|0.75        |no      |2.0          |12.0     |1.0       |3.0   |
|0.0    |male  |57.0|15.0        |yes     |2.0          |14.0     |4.0       |4.0   |
|0.0    |female|32.0|15.0        |yes     |4.0          |16.0     |1.0       |2.0   |
|0.0    |male  |22.0|1.5         |no      |4.0          |14.0     |4.0       |5.0   |
+-------+------+----+------------+--------+-------------+---------+----------+------+
only showing top 10 rows

// Inspect the distribution of the data
data.describe("affairs", "gender", "age", "yearsmarried", "children", "religiousness", "education", "occupation", "rating").show(10,truncate=false)
+-------+------------------+------+-----------------+-----------------+--------+------------------+-----------------+-----------------+------------------+
|summary|affairs           |gender|age              |yearsmarried     |children|religiousness     |education        |occupation       |rating            |
+-------+------------------+------+-----------------+-----------------+--------+------------------+-----------------+-----------------+------------------+
|count  |601               |601   |601              |601              |601     |601               |601              |601              |601               |
|mean   |1.4559068219633944|null  |32.48752079866888|8.17769550748752 |null    |3.1164725457570714|16.16638935108153|4.194675540765391|3.9317803660565724|
|stddev |3.298757728494681 |null  |9.28876170487667 |5.571303149963791|null    |1.1675094016730692|2.402554565766698|1.819442662708579|1.1031794920503795|
|min    |0.0               |female|17.5             |0.125            |no      |1.0               |9.0              |1.0              |1.0               |
|max    |12.0              |male  |57.0             |15.0             |yes     |5.0               |20.0             |7.0              |5.0               |
+-------+------------------+------+-----------------+-----------------+--------+------------------+-----------------+-----------------+------------------+

data.createOrReplaceTempView("data")

// Convert string columns to numeric values
val labelWhere = "case when affairs=0 then 0 else cast(1 as double) end as label"
labelWhere: String = case when affairs=0 then 0 else cast(1 as double) end as label

val genderWhere = "case when gender='female' then 0 else cast(1 as double) end as gender"
genderWhere: String = case when gender='female' then 0 else cast(1 as double) end as gender

val childrenWhere = "case when children='no' then 0 else cast(1 as double) end as children"
childrenWhere: String = case when children='no' then 0 else cast(1 as double) end as children

val dataLabelDF = spark.sql(s"select $labelWhere, $genderWhere,age,yearsmarried,$childrenWhere,religiousness,education,occupation,rating from data")
dataLabelDF: org.apache.spark.sql.DataFrame = [label: double, gender: double ... 7 more fields]

val featuresArray = Array("gender", "age", "yearsmarried", "children", "religiousness", "education", "occupation", "rating")
featuresArray: Array[String] = Array(gender, age, yearsmarried, children, religiousness, education, occupation, rating)

// Assemble the columns into a single feature vector
val assembler = new VectorAssembler().setInputCols(featuresArray).setOutputCol("features")
assembler: org.apache.spark.ml.feature.VectorAssembler = vecAssembler_6e2c6bdd631e

val vecDF: DataFrame = assembler.transform(dataLabelDF)
vecDF: org.apache.spark.sql.DataFrame = [label: double, gender: double ... 8 more fields]

vecDF.show(10,truncate=false)
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+
|label|gender|age |yearsmarried|children|religiousness|education|occupation|rating|features                            |
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+
|0.0  |1.0   |37.0|10.0        |0.0     |3.0          |18.0     |7.0       |4.0   |[1.0,37.0,10.0,0.0,3.0,18.0,7.0,4.0]|
|0.0  |0.0   |27.0|4.0         |0.0     |4.0          |14.0     |6.0       |4.0   |[0.0,27.0,4.0,0.0,4.0,14.0,6.0,4.0] |
|0.0  |0.0   |32.0|15.0        |1.0     |1.0          |12.0     |1.0       |4.0   |[0.0,32.0,15.0,1.0,1.0,12.0,1.0,4.0]|
|0.0  |1.0   |57.0|15.0        |1.0     |5.0          |18.0     |6.0       |5.0   |[1.0,57.0,15.0,1.0,5.0,18.0,6.0,5.0]|
|0.0  |1.0   |22.0|0.75        |0.0     |2.0          |17.0     |6.0       |3.0   |[1.0,22.0,0.75,0.0,2.0,17.0,6.0,3.0]|
|0.0  |0.0   |32.0|1.5         |0.0     |2.0          |17.0     |5.0       |5.0   |[0.0,32.0,1.5,0.0,2.0,17.0,5.0,5.0] |
|0.0  |0.0   |22.0|0.75        |0.0     |2.0          |12.0     |1.0       |3.0   |[0.0,22.0,0.75,0.0,2.0,12.0,1.0,3.0]|
|0.0  |1.0   |57.0|15.0        |1.0     |2.0          |14.0     |4.0       |4.0   |[1.0,57.0,15.0,1.0,2.0,14.0,4.0,4.0]|
|0.0  |0.0   |32.0|15.0        |1.0     |4.0          |16.0     |1.0       |2.0   |[0.0,32.0,15.0,1.0,4.0,16.0,1.0,2.0]|
|0.0  |1.0   |22.0|1.5         |0.0     |4.0          |14.0     |4.0       |5.0   |[1.0,22.0,1.5,0.0,4.0,14.0,4.0,5.0] |
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+
only showing top 10 rows


// Index the labels, adding metadata to the label column
val labelIndexer = new StringIndexer().setInputCol("label").setOutputCol("indexedLabel").fit(vecDF)
labelIndexer: org.apache.spark.ml.feature.StringIndexerModel = strIdx_d00cad619cd5

labelIndexer.transform(vecDF).show(10,truncate=false)
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+------------+
|label|gender|age |yearsmarried|children|religiousness|education|occupation|rating|features                            |indexedLabel|
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+------------+
|0.0  |1.0   |37.0|10.0        |0.0     |3.0          |18.0     |7.0       |4.0   |[1.0,37.0,10.0,0.0,3.0,18.0,7.0,4.0]|0.0         |
|0.0  |0.0   |27.0|4.0         |0.0     |4.0          |14.0     |6.0       |4.0   |[0.0,27.0,4.0,0.0,4.0,14.0,6.0,4.0] |0.0         |
|0.0  |0.0   |32.0|15.0        |1.0     |1.0          |12.0     |1.0       |4.0   |[0.0,32.0,15.0,1.0,1.0,12.0,1.0,4.0]|0.0         |
|0.0  |1.0   |57.0|15.0        |1.0     |5.0          |18.0     |6.0       |5.0   |[1.0,57.0,15.0,1.0,5.0,18.0,6.0,5.0]|0.0         |
|0.0  |1.0   |22.0|0.75        |0.0     |2.0          |17.0     |6.0       |3.0   |[1.0,22.0,0.75,0.0,2.0,17.0,6.0,3.0]|0.0         |
|0.0  |0.0   |32.0|1.5         |0.0     |2.0          |17.0     |5.0       |5.0   |[0.0,32.0,1.5,0.0,2.0,17.0,5.0,5.0] |0.0         |
|0.0  |0.0   |22.0|0.75        |0.0     |2.0          |12.0     |1.0       |3.0   |[0.0,22.0,0.75,0.0,2.0,12.0,1.0,3.0]|0.0         |
|0.0  |1.0   |57.0|15.0        |1.0     |2.0          |14.0     |4.0       |4.0   |[1.0,57.0,15.0,1.0,2.0,14.0,4.0,4.0]|0.0         |
|0.0  |0.0   |32.0|15.0        |1.0     |4.0          |16.0     |1.0       |2.0   |[0.0,32.0,15.0,1.0,4.0,16.0,1.0,2.0]|0.0         |
|0.0  |1.0   |22.0|1.5         |0.0     |4.0          |14.0     |4.0       |5.0   |[1.0,22.0,1.5,0.0,4.0,14.0,4.0,5.0] |0.0         |
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+------------+
only showing top 10 rows

// Automatically identify categorical features and index them.
// Features with more than 8 distinct values are treated as continuous.
val featureIndexer = new VectorIndexer().setInputCol("features").setOutputCol("indexedFeatures").setMaxCategories(8).fit(vecDF)
featureIndexer: org.apache.spark.ml.feature.VectorIndexerModel = vecIdx_8fbcad97fb60

featureIndexer.transform(vecDF).show(10,truncate=false)
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+----------------------------------+
|label|gender|age |yearsmarried|children|religiousness|education|occupation|rating|features                            |indexedFeatures                   |
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+----------------------------------+
|0.0  |1.0   |37.0|10.0        |0.0     |3.0          |18.0     |7.0       |4.0   |[1.0,37.0,10.0,0.0,3.0,18.0,7.0,4.0]|[1.0,37.0,6.0,0.0,2.0,5.0,6.0,3.0]|
|0.0  |0.0   |27.0|4.0         |0.0     |4.0          |14.0     |6.0       |4.0   |[0.0,27.0,4.0,0.0,4.0,14.0,6.0,4.0] |[0.0,27.0,4.0,0.0,3.0,2.0,5.0,3.0]|
|0.0  |0.0   |32.0|15.0        |1.0     |1.0          |12.0     |1.0       |4.0   |[0.0,32.0,15.0,1.0,1.0,12.0,1.0,4.0]|[0.0,32.0,7.0,1.0,0.0,1.0,0.0,3.0]|
|0.0  |1.0   |57.0|15.0        |1.0     |5.0          |18.0     |6.0       |5.0   |[1.0,57.0,15.0,1.0,5.0,18.0,6.0,5.0]|[1.0,57.0,7.0,1.0,4.0,5.0,5.0,4.0]|
|0.0  |1.0   |22.0|0.75        |0.0     |2.0          |17.0     |6.0       |3.0   |[1.0,22.0,0.75,0.0,2.0,17.0,6.0,3.0]|[1.0,22.0,2.0,0.0,1.0,4.0,5.0,2.0]|
|0.0  |0.0   |32.0|1.5         |0.0     |2.0          |17.0     |5.0       |5.0   |[0.0,32.0,1.5,0.0,2.0,17.0,5.0,5.0] |[0.0,32.0,3.0,0.0,1.0,4.0,4.0,4.0]|
|0.0  |0.0   |22.0|0.75        |0.0     |2.0          |12.0     |1.0       |3.0   |[0.0,22.0,0.75,0.0,2.0,12.0,1.0,3.0]|[0.0,22.0,2.0,0.0,1.0,1.0,0.0,2.0]|
|0.0  |1.0   |57.0|15.0        |1.0     |2.0          |14.0     |4.0       |4.0   |[1.0,57.0,15.0,1.0,2.0,14.0,4.0,4.0]|[1.0,57.0,7.0,1.0,1.0,2.0,3.0,3.0]|
|0.0  |0.0   |32.0|15.0        |1.0     |4.0          |16.0     |1.0       |2.0   |[0.0,32.0,15.0,1.0,4.0,16.0,1.0,2.0]|[0.0,32.0,7.0,1.0,3.0,3.0,0.0,1.0]|
|0.0  |1.0   |22.0|1.5         |0.0     |4.0          |14.0     |4.0       |5.0   |[1.0,22.0,1.5,0.0,4.0,14.0,4.0,5.0] |[1.0,22.0,3.0,0.0,3.0,2.0,3.0,4.0]|
+-----+------+----+------------+--------+-------------+---------+----------+------+------------------------------------+----------------------------------+
only showing top 10 rows

// Split the data into training and test sets (30% held out for testing)
val Array(trainingData, testData) = vecDF.randomSplit(Array(0.7, 0.3))
trainingData: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [label: double, gender: double ... 8 more fields]
testData: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [label: double, gender: double ... 8 more fields]

// Train a decision tree model
val dt = new DecisionTreeClassifier()
.setLabelCol("indexedLabel")
.setFeaturesCol("indexedFeatures")
.setImpurity("entropy") // impurity measure used when choosing splits
.setMaxBins(100) // maximum number of bins when discretizing continuous features
.setMaxDepth(5) // maximum depth of the tree
.setMinInfoGain(0.01) // minimum information gain required for a split, in [0, 1]
.setMinInstancesPerNode(10) // minimum number of instances each child must have after a split
.setSeed(123456)

// Convert the indexed labels back to the original labels
val labelConverter = new IndexToString().setInputCol("prediction").setOutputCol("predictedLabel").setLabels(labelIndexer.labels)
labelConverter: org.apache.spark.ml.feature.IndexToString = idxToStr_2598e79a1d08

// Chain indexers and tree in a Pipeline.
val pipeline = new Pipeline().setStages(Array(labelIndexer, featureIndexer, dt, labelConverter))

// Train model. This also runs the indexers.
val model = pipeline.fit(trainingData)


// Make predictions
val predictions = model.transform(testData)
predictions: org.apache.spark.sql.DataFrame = [label: double, gender: double ... 14 more fields]


// Select a few example rows to display
predictions.select("predictedLabel", "label", "features").show(10,truncate=false)
+--------------+-----+-------------------------------------+
|predictedLabel|label|features                             |
+--------------+-----+-------------------------------------+
|0.0           |0.0  |[0.0,22.0,0.125,0.0,2.0,14.0,4.0,5.0]|
|0.0           |0.0  |[0.0,22.0,0.125,0.0,2.0,16.0,6.0,3.0]|
|0.0           |0.0  |[0.0,22.0,0.125,0.0,4.0,12.0,4.0,5.0]|
|0.0           |0.0  |[0.0,22.0,0.417,0.0,1.0,17.0,6.0,4.0]|
|0.0           |0.0  |[0.0,22.0,0.75,0.0,2.0,16.0,5.0,5.0] |
|0.0           |0.0  |[0.0,22.0,1.5,0.0,1.0,14.0,1.0,5.0]  |
|0.0           |0.0  |[0.0,22.0,1.5,0.0,2.0,14.0,5.0,4.0]  |
|0.0           |0.0  |[0.0,22.0,1.5,0.0,2.0,16.0,5.0,5.0]  |
|0.0           |0.0  |[0.0,22.0,1.5,0.0,3.0,16.0,6.0,5.0]  |
|0.0           |0.0  |[0.0,22.0,1.5,0.0,4.0,17.0,5.0,5.0]  |
+--------------+-----+-------------------------------------+
only showing top 10 rows


// Select (prediction, true label) and compute the test error
val evaluator = new MulticlassClassificationEvaluator().setLabelCol("indexedLabel").setPredictionCol("prediction").setMetricName("accuracy")

val accuracy = evaluator.evaluate(predictions)
accuracy: Double = 0.7032967032967034

println("Test Error = " + (1.0 - accuracy))
Test Error = 0.29670329670329665
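
Accuracy alone can be misleading when, as here, one label dominates the data. A small sketch reusing the same evaluator with the other metric names that `MulticlassClassificationEvaluator` supports:

```scala
// Reuse the evaluator with other supported multiclass metrics.
val f1        = evaluator.setMetricName("f1").evaluate(predictions)
val precision = evaluator.setMetricName("weightedPrecision").evaluate(predictions)
val recall    = evaluator.setMetricName("weightedRecall").evaluate(predictions)
println(s"f1 = $f1, weightedPrecision = $precision, weightedRecall = $recall")
```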


// The "2" in stages(2) corresponds to "dt" in the pipeline; cast that stage to DecisionTreeClassificationModel
val treeModel = model.stages(2).asInstanceOf[DecisionTreeClassificationModel]
treeModel: org.apache.spark.ml.classification.DecisionTreeClassificationModel = DecisionTreeClassificationModel (uid=dtc_7a8baf97abe7) of depth 5 with 33 nodes

treeModel.getLabelCol
res53: String = indexedLabel

treeModel.getFeaturesCol
res54: String = indexedFeatures

treeModel.featureImportances
res55: org.apache.spark.ml.linalg.Vector = (8,[0,2,3,4,5,6,7],[0.0640344247735859,0.1052957011097811,0.05343872372010684,0.17367191628391196,0.20372870264756315,0.2063093687074741,0.1935211627575769])
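
The importance vector is indexed by position in `featuresArray`, so the raw output is hard to read. A minimal sketch pairing each importance with its column name and sorting them:

```scala
// Pair each feature name with its importance score and print in descending order.
featuresArray.zip(treeModel.featureImportances.toArray)
  .sortBy(-_._2)
  .foreach { case (name, imp) => println(f"$name%-15s $imp%.4f") }
```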

treeModel.getPredictionCol
res56: String = prediction

treeModel.getProbabilityCol
res57: String = probability

treeModel.numClasses
res58: Int = 2

treeModel.numFeatures
res59: Int = 8

treeModel.depth
res60: Int = 5

treeModel.numNodes
res61: Int = 33

treeModel.getImpurity
res62: String = entropy

treeModel.getMaxBins
res63: Int = 100

treeModel.getMaxDepth
res64: Int = 5

treeModel.getMaxMemoryInMB
res65: Int = 256

treeModel.getMinInfoGain
res66: Double = 0.01

treeModel.getMinInstancesPerNode
res67: Int = 10

// Inspect the learned decision tree
println("Learned classification tree model:\n" + treeModel.toDebugString)
Learned classification tree model:
DecisionTreeClassificationModel (uid=dtc_7a8baf97abe7) of depth 5 with 33 nodes
  If (feature 2 in {0.0,1.0,2.0,3.0})
   If (feature 5 in {3.0,6.0})
    Predict: 0.0
   Else (feature 5 not in {3.0,6.0})
    If (feature 4 in {3.0})
     Predict: 0.0
    Else (feature 4 not in {3.0})
     If (feature 3 in {0.0})
      If (feature 6 in {0.0,4.0,5.0})
       Predict: 0.0
      Else (feature 6 not in {0.0,4.0,5.0})
       Predict: 0.0
     Else (feature 3 not in {0.0})
      Predict: 0.0
  Else (feature 2 not in {0.0,1.0,2.0,3.0})
   If (feature 4 in {0.0,1.0,3.0,4.0})
    If (feature 7 in {0.0,1.0,2.0})
     If (feature 6 in {0.0,1.0,6.0})
      If (feature 4 in {1.0,4.0})
       Predict: 0.0
      Else (feature 4 not in {1.0,4.0})
       Predict: 0.0
     Else (feature 6 not in {0.0,1.0,6.0})
      If (feature 7 in {0.0,2.0})
       Predict: 0.0
      Else (feature 7 not in {0.0,2.0})
       Predict: 1.0
    Else (feature 7 not in {0.0,1.0,2.0})
     If (feature 5 in {0.0,1.0})
      Predict: 0.0
     Else (feature 5 not in {0.0,1.0})
      If (feature 6 in {0.0,1.0,2.0,5.0,6.0})
       Predict: 0.0
      Else (feature 6 not in {0.0,1.0,2.0,5.0,6.0})
       Predict: 0.0
   Else (feature 4 not in {0.0,1.0,3.0,4.0})
    If (feature 5 in {0.0,1.0,2.0,3.0,5.0,6.0})
     If (feature 0 in {0.0})
      If (feature 7 in {3.0})
       Predict: 0.0
      Else (feature 7 not in {3.0})
       Predict: 0.0
     Else (feature 0 not in {0.0})
      If (feature 7 in {0.0,2.0,4.0})
       Predict: 0.0
      Else (feature 7 not in {0.0,2.0,4.0})
       Predict: 1.0
    Else (feature 5 not in {0.0,1.0,2.0,3.0,5.0,6.0})
     Predict: 1.0
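
The fitted pipeline can also be persisted and reloaded later without refitting — a sketch using Spark ML's built-in persistence; the output path is hypothetical:

```scala
// Persist the fitted pipeline and reload it later (the path is an example, not from the post).
import org.apache.spark.ml.PipelineModel
model.write.overwrite().save("/tmp/dt-affairs-model")
val reloaded = PipelineModel.load("/tmp/dt-affairs-model")
```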

Reprinted from: http://www.cnblogs.com/wwxbi/p/6114039.html
