Machine Learning with mlr3verse

R
Machine learning
Author

不止BI

Published

April 4, 2024

Modified

June 11, 2024

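mlr3 is an R package for machine learning. It is the successor to the mlr package and aims to provide more powerful and more flexible machine learning functionality. mlr3 offers a modular framework that lets users carry out a wide range of machine learning tasks with ease.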

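mlr3verse is a collection of R packages for machine learning that extends the mlr3 framework. The collection covers data preprocessing, feature selection, model tuning, and evaluation, giving users more tools for machine learning tasks. It includes mlr3, mlr3learners, mlr3pipelines, mlr3tuning, mlr3viz, and others. Each package serves a specific purpose, so users can pick the ones that fit their needs.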

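Creating a machine learning task

In mlr3, data must first be wrapped in a task object before any machine learning operation can be carried out.

Code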
library(mlr3verse)
tsk_penguins <- as_task_classif(palmerpenguins::penguins,
  target = "species",
  id = "penguins"
)

tsk_penguins
<TaskClassif:penguins> (344 x 8)
* Target: species
* Properties: multiclass
* Features (7):
  - int (3): body_mass_g, flipper_length_mm, year
  - dbl (2): bill_depth_mm, bill_length_mm
  - fct (2): island, sex

Common task methods

Code
# Inspect the column structure
tsk_penguins$col_info
Key: <id>
                  id    type                  levels  label fix_factor_levels
              <char>  <char>                  <list> <char>            <lgcl>
1:          ..row_id integer                           <NA>             FALSE
2:     bill_depth_mm numeric                           <NA>             FALSE
3:    bill_length_mm numeric                           <NA>             FALSE
4:       body_mass_g integer                           <NA>             FALSE
5: flipper_length_mm integer                           <NA>             FALSE
6:            island  factor  Biscoe,Dream,Torgersen   <NA>             FALSE
7:               sex  factor             female,male   <NA>             FALSE
8:           species  factor Adelie,Chinstrap,Gentoo   <NA>             FALSE
9:              year integer                           <NA>             FALSE
Code
# View the data
tsk_penguins$data(1:5)
   species bill_depth_mm bill_length_mm body_mass_g flipper_length_mm    island
    <fctr>         <num>          <num>       <int>             <int>    <fctr>
1:  Adelie          18.7           39.1        3750               181 Torgersen
2:  Adelie          17.4           39.5        3800               186 Torgersen
3:  Adelie          18.0           40.3        3250               195 Torgersen
4:  Adelie            NA             NA          NA                NA Torgersen
5:  Adelie          19.3           36.7        3450               193 Torgersen
      sex  year
   <fctr> <int>
1:   male  2007
2: female  2007
3: female  2007
4:   <NA>  2007
5: female  2007
Code
# Count missing values per column
tsk_penguins$missings()
          species     bill_depth_mm    bill_length_mm       body_mass_g 
                0                 2                 2                 2 
flipper_length_mm            island               sex              year 
                2                 0                11                 0 
Code
tsk_penguins$formula()
species ~ .
NULL
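
Beyond the methods shown above, a task exposes a few more accessors and in-place modifiers that are handy for a first look at the data. The following is a minimal sketch, assuming the palmerpenguins package is installed; the object name small is only illustrative.

```r
library(mlr3verse)

# Same task as above, built from palmerpenguins::penguins
task <- as_task_classif(palmerpenguins::penguins,
  target = "species",
  id = "penguins"
)

task$nrow           # number of rows
task$feature_names  # names of the feature columns
task$target_names   # name of the target column
task$class_names    # levels of the classification target

# Tasks are R6 objects with reference semantics, so clone before modifying
small <- task$clone()
small$select(c("bill_length_mm", "bill_depth_mm")) # keep two features
small$filter(1:100)                                # keep the first 100 rows
small
```

Because select() and filter() modify the task in place, working on a clone leaves the original task untouched.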

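Data preprocessing

Data preprocessing in mlr3 is implemented by mlr3pipelines. Every preprocessing step in mlr3pipelines is identified by a string key, which can be hard to remember. See the documentation on the official website, or inspect mlr_pipeops to list the available operations with their descriptions.

Code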
library(DT)
datatable(as.data.table(mlr_pipeops), options = list(
  language = list(url = "https://cdn.datatables.net/plug-ins/1.10.11/i18n/Chinese.json"),
  pageLength = 5
))
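
When an interactive table is not needed, the keys can also be listed directly from the dictionary. This is a small sketch; the regular expression "^impute" is just an example pattern.

```r
library(mlr3verse)

# All registered PipeOp keys as a character vector
all_keys <- mlr_pipeops$keys()
head(all_keys)

# Only the keys matching a regular expression, here the imputation steps
impute_keys <- mlr_pipeops$keys("^impute")
impute_keys
```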

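As the table above shows, mlr3verse offers a rich set of data processing methods. The examples below walk through some commonly used preprocessing steps, using tasks built into mlr3 such as iris.

Code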
task <- tsk("iris")
task
<TaskClassif:iris> (150 x 5): Iris Flowers
* Target: Species
* Properties: multiclass
* Features (4):
  - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width

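Centering and scaling

All preprocessing steps in mlr3pipelines are created with the po() function. Besides its own preprocessing parameters, each PipeOp can be given an id, a unique name for the step that makes it easy to locate later.

The affect_columns parameter controls which columns the step acts on. The selector_* family of functions offers a quick way to pick columns: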

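  1. selector_all(): selects all features.

  2. selector_none(): selects no features.

  3. selector_type(types): selects by feature type, e.g. character or numeric.

  4. selector_grep(pattern, ignore.case = FALSE, perl = FALSE, fixed = FALSE): selects features with a regular expression.

  5. selector_name(feature_names, assert_present = FALSE): selects features by name.

  6. selector_invert(selector): inverts a selection, i.e. selects the features the given selector leaves out.

  7. selector_intersect(selector_x, selector_y): selects the intersection of two selectors.

  8. selector_union(selector_x, selector_y): selects the union of two selectors.

  9. selector_setdiff(selector_x, selector_y): selects the set difference of two selectors.

  10. selector_missing(): selects features that contain missing values.

  11. selector_cardinality_greater_than(min_cardinality): selects categorical features whose cardinality (number of unique values) exceeds a given value.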
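
A selector is itself a function that takes a task and returns feature names, so it can be tried out on its own before being passed to affect_columns. The sketch below uses the built-in penguins task; the variable names are illustrative.

```r
library(mlr3verse)

task <- tsk("penguins")

# A selector maps a task to a character vector of feature names
sel_factor <- selector_type("factor")
sel_factor(task)

# Everything that is not a factor
sel_not_factor <- selector_invert(sel_factor)
sel_not_factor(task)

# Features whose name starts with "bill", plus the feature "year"
sel_bill_year <- selector_union(selector_grep("^bill"), selector_name("year"))
sel_bill_year(task)
```

Composing selectors this way keeps the column logic in one object that can be reused across several preprocessing steps.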

Code
# Choose the preprocessing step
pos <- po(
  "scale",
  center = TRUE, # centering
  scale = FALSE, # scaling to unit variance (disabled here)
  affect_columns = selector_name(c("Petal.Length", "Petal.Width", "Sepal.Length")), # columns to act on
  id = "scale_iris" # name for this step
)
pos
PipeOp: <scale_iris> (not trained)
values: <robust=FALSE, center=TRUE, scale=FALSE, affect_columns=<Selector>>
Input channels <name [train type, predict type]>:
  input [Task,Task]
Output channels <name [train type, predict type]>:
  output [Task,Task]
Code
# Run the step and extract the processed data
pos$train(list(task))[[1]]$data()
       Species Petal.Length Petal.Width Sepal.Length Sepal.Width
        <fctr>        <num>       <num>        <num>       <num>
  1:    setosa       -2.358  -0.9993333  -0.74333333         3.5
  2:    setosa       -2.358  -0.9993333  -0.94333333         3.0
  3:    setosa       -2.458  -0.9993333  -1.14333333         3.2
  4:    setosa       -2.258  -0.9993333  -1.24333333         3.1
  5:    setosa       -2.358  -0.9993333  -0.84333333         3.6
 ---                                                            
146: virginica        1.442   1.1006667   0.85666667         3.0
147: virginica        1.242   0.7006667   0.45666667         2.5
148: virginica        1.442   0.8006667   0.65666667         3.0
149: virginica        1.642   1.1006667   0.35666667         3.4
150: virginica        1.342   0.6006667   0.05666667         3.0
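
Training a PipeOp also stores what it learned, so the same transformation can later be applied to new data. The following sketch centers all four iris features and checks the result; it assumes nothing beyond the built-in iris task.

```r
library(mlr3verse)

task <- tsk("iris")
po_center <- po("scale", center = TRUE, scale = FALSE)

po_center$is_trained # FALSE before training
centered <- po_center$train(list(task))[[1]]
po_center$is_trained # TRUE after training

# After centering, every feature has (numerically) zero mean
col_means <- colMeans(centered$data(cols = task$feature_names))
round(col_means, 10)

# predict() reuses the stored training means instead of recomputing them
first_rows <- task$clone()$filter(1:10)
po_center$predict(list(first_rows))[[1]]$data()
```

Reusing the training statistics at prediction time is what keeps information from test data out of the preprocessing.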

Missing value handling

  • imputelearner: imputation by a learner (model-based)

  • imputemean: mean

  • imputemedian: median

  • imputeconstant: constant

  • imputehist: histogram

  • imputemode: mode

  • imputesample: random sampling
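
The imputation steps accept affect_columns as well, so imputation can be limited to selected features. A minimal sketch with median imputation on the pima task, restricted to two columns chosen only for illustration:

```r
library(mlr3verse)

task <- tsk("pima")

# Median imputation restricted to two columns via affect_columns
po_median <- po("imputemedian",
  affect_columns = selector_name(c("glucose", "mass"))
)
imputed <- po_median$train(list(task))[[1]]

# glucose and mass are filled in; the other columns keep their missing values
imputed$missings()
```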

Mean imputation

Code
task <- tsk("pima")
task$missings()
diabetes      age  glucose  insulin     mass pedigree pregnant pressure 
       0        0        5      374       11        0        0       35 
 triceps 
     227 
Code
# Mean imputation
po_mean <- po("imputemean")
new_task <- po_mean$train(list(task = task))[[1]]
new_task$missings()
diabetes      age pedigree pregnant  glucose  insulin     mass pressure 
       0        0        0        0        0        0        0        0 
 triceps 
       0 

Model-based imputation

Code
task <- tsk("pima")
task$missings()
diabetes      age  glucose  insulin     mass pedigree pregnant pressure 
       0        0        5      374       11        0        0       35 
 triceps 
     227 
Code
# Decision tree imputation
po_tree <- po("imputelearner", lrn("regr.rpart"))
new_task <- po_tree$train(list(task = task))[[1]]
new_task$missings()
diabetes      age pedigree pregnant  glucose  insulin     mass pressure 
       0        0        0        0        0        0        0        0 
 triceps 
       0 

Feature selection

Code
# task = mlr3::tsk("mtcars")
# filter = flt("find_correlation")
# filter$calculate(task)
# as.data.table(filter)
library(dplyr)
task <- tsk("mtcars")
pos <-
  # Remove highly correlated columns
  po("filter",
    filter = mlr3filters::flt("find_correlation"),
    filter.cutoff = 0.1
  ) %>>%
  # Remove constant features
  po("removeconstants") %>>%
  # Drop low-variance features (keep the top 50% by variance)
  po("filter",
    filter = mlr3filters::flt("variance"),
    filter.frac = 0.5
  )
pos$train(task)[[1]]$data()
      mpg  carb  disp    hp  qsec
    <num> <num> <num> <num> <num>
 1:  21.0     4 160.0   110 16.46
 2:  21.0     4 160.0   110 17.02
 3:  22.8     1 108.0    93 18.61
 4:  21.4     1 258.0   110 19.44
 5:  18.7     2 360.0   175 17.02
 6:  18.1     1 225.0   105 20.22
 7:  14.3     4 360.0   245 15.84
 8:  24.4     2 146.7    62 20.00
 9:  22.8     2 140.8    95 22.90
10:  19.2     4 167.6   123 18.30
11:  17.8     4 167.6   123 18.90
12:  16.4     3 275.8   180 17.40
13:  17.3     3 275.8   180 17.60
14:  15.2     3 275.8   180 18.00
15:  10.4     4 472.0   205 17.98
16:  10.4     4 460.0   215 17.82
17:  14.7     4 440.0   230 17.42
18:  32.4     1  78.7    66 19.47
19:  30.4     2  75.7    52 18.52
20:  33.9     1  71.1    65 19.90
21:  21.5     1 120.1    97 20.01
22:  15.5     2 318.0   150 16.87
23:  15.2     2 304.0   150 17.30
24:  13.3     4 350.0   245 15.41
25:  19.2     2 400.0   175 17.05
26:  27.3     1  79.0    66 18.90
27:  26.0     2 120.3    91 16.70
28:  30.4     2  95.1   113 16.90
29:  15.8     4 351.0   264 14.50
30:  19.7     6 145.0   175 15.50
31:  15.0     8 301.0   335 14.60
32:  21.4     2 121.0   109 18.60
      mpg  carb  disp    hp  qsec
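
To see why a filter keeps or drops a feature, the scores can be computed outside of a pipeline, along the lines of the commented-out snippet above. This sketch scores the mtcars features by variance:

```r
library(mlr3verse)

task <- tsk("mtcars")

# Score every feature by its variance
filter <- flt("variance")
filter$calculate(task)

# Features sorted by score, highest first
scores <- as.data.table(filter)
scores
```

Variance depends on the measurement scale of each feature, so features recorded in large units such as disp rank high. Scaling the features first may be worth considering before relying on this filter.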

Data encoding

One-hot encoding

Code
data <- data.table::data.table(x = factor(letters[1:3]), y = factor(letters[1:3]))
task <- as_task_classif(data, target = "y")

poe <- po("encode", method = "one-hot")

# "one-hot" is the default method
poe$train(list(task))[[1]]$data()
        y   x.a   x.b   x.c
   <fctr> <num> <num> <num>
1:      a     1     0     0
2:      b     0     1     0
3:      c     0     0     1

Treatment encoding:

Code
poe$param_set$values$method <- "treatment"
poe$train(list(task))[[1]]$data()
        y   x.b   x.c
   <fctr> <num> <num>
1:      a     0     0
2:      b     1     0
3:      c     0     1

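Other

Imbalanced data

  • ratio: the multiple of the reference class size to aim for;

  • reference: sets the reference class;

  • adjust: which classes to adjust, which determines whether the step oversamples or undersamples

  • shuffle: whether to shuffle the result; defaults to TRUE

Code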
data(hacide, package = "ROSE")

table(hacide.train$cls)

  0   1 
980  20 
Code
task <- as_task_classif(hacide.train, target = "cls")
pos <- po("classbalancing",
  ratio = 1,
  reference = "major",
  adjust = "all",
  shuffle = TRUE
)
balanced <- pos$train(list(task))[[1]]$data()
table(balanced$cls)

  0   1 
980 980 
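
The example above oversamples the minority class. The opposite direction, undersampling the majority class, can be sketched with the same PipeOp by changing reference and adjust. The built-in german_credit task is used here only so the sketch does not depend on the ROSE package.

```r
library(mlr3verse)

task <- tsk("german_credit")
before <- table(task$truth())
before

# Shrink the majority class to the size of the minority class
po_under <- po("classbalancing",
  reference = "minor",
  adjust = "major",
  ratio = 1,
  shuffle = FALSE
)
after <- table(po_under$train(list(task))[[1]]$truth())
after
```

Undersampling discards rows, so it tends to suit larger datasets where losing majority-class observations is acceptable.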

Discretization

Code
task <- tsk("mtcars")
pos <- po("quantilebin", numsplits = 10, affect_columns = selector_name(c("disp", "hp")))

pos$train(list(task))[[1]]$data() %>% head()
     mpg       disp         hp    am  carb   cyl  drat  gear  qsec    vs    wt
   <num>      <ord>      <ord> <num> <num> <num> <num> <num> <num> <num> <num>
1:  21.0  (142,160]  (106,110]     1     4     6  3.90     4 16.46     0 2.620
2:  21.0  (142,160]  (106,110]     1     4     6  3.90     4 17.02     0 2.875
3:  22.8 (80.6,120]  (66,93.4]     1     1     4  3.85     4 18.61     1 2.320
4:  21.4  (196,276]  (106,110]     0     1     6  3.08     3 19.44     1 3.215
5:  18.7  (351,396]  (165,178]     0     2     8  3.15     3 17.02     0 3.440
6:  18.1  (196,276] (93.4,106]     0     1     6  2.76     3 20.22     1 3.460

General workflow

Building a learner

List the algorithms mlr3 supports

Code
mlr_learners
<DictionaryLearner> with 49 stored values
Keys: classif.cv_glmnet, classif.debug, classif.featureless,
  classif.glmnet, classif.kknn, classif.lda, classif.log_reg,
  classif.multinom, classif.naive_bayes, classif.nnet, classif.qda,
  classif.ranger, classif.rpart, classif.svm, classif.xgboost,
  clust.agnes, clust.ap, clust.cmeans, clust.cobweb, clust.dbscan,
  clust.dbscan_fpc, clust.diana, clust.em, clust.fanny,
  clust.featureless, clust.ff, clust.hclust, clust.hdbscan,
  clust.kkmeans, clust.kmeans, clust.MBatchKMeans, clust.mclust,
  clust.meanshift, clust.optics, clust.pam, clust.SimpleKMeans,
  clust.xmeans, regr.cv_glmnet, regr.debug, regr.featureless,
  regr.glmnet, regr.kknn, regr.km, regr.lm, regr.nnet, regr.ranger,
  regr.rpart, regr.svm, regr.xgboost

Create a learner

Code
learner <- lrn("classif.rpart")
learner
<LearnerClassifRpart:classif.rpart>: Classification Tree
* Model: -
* Parameters: xval=0
* Packages: mlr3, rpart
* Predict Types:  [response], prob
* Feature Types: logical, integer, numeric, factor, ordered
* Properties: importance, missings, multiclass, selected_features,
  twoclass, weights

View the hyperparameters the learner supports

Code
learner$param_set
<ParamSet>
                id    class lower upper nlevels
            <char>   <char> <num> <num>   <num>
 1:             cp ParamDbl     0     1     Inf
 2:     keep_model ParamLgl    NA    NA       2
 3:     maxcompete ParamInt     0   Inf     Inf
 4:       maxdepth ParamInt     1    30      30
 5:   maxsurrogate ParamInt     0   Inf     Inf
 6:      minbucket ParamInt     1   Inf     Inf
 7:       minsplit ParamInt     1   Inf     Inf
 8: surrogatestyle ParamInt     0     1       2
 9:   usesurrogate ParamInt     0     2       3
10:           xval ParamInt     0   Inf     Inf
                                                                                      default
                                                                                       <list>
 1:                                                                                      0.01
 2:                                                                                     FALSE
 3:                                                                                         4
 4:                                                                                        30
 5:                                                                                         5
 6: <NoDefault>\n  Public:\n    clone: function (deep = FALSE) \n    initialize: function () 
 7:                                                                                        20
 8:                                                                                         0
 9:                                                                                         2
10:                                                                                        10
     value
    <list>
 1:       
 2:       
 3:       
 4:       
 5:       
 6:       
 7:       
 8:       
 9:       
10:      0

Set the learner's hyperparameters

Code
learner <- lrn("classif.rpart", xval = 0, cp = 0.001)
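
Hyperparameters can also be changed after the learner has been created, through the values list of its parameter set. A brief sketch; the value chosen for maxdepth is arbitrary.

```r
library(mlr3verse)

learner <- lrn("classif.rpart", cp = 0.001)

# Current non-default settings
learner$param_set$values

# Change one hyperparameter after construction
learner$param_set$values$maxdepth <- 5
learner$param_set$values
```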

Split the dataset

Setting stratify enables stratified sampling

Code
task <- tsk("penguins") # use the built-in dataset
split <- partition(task, ratio = 0.6, stratify = TRUE)
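
The object returned by partition() is a list of row ids, which can be inspected before training. In this sketch the stratify argument is left out and the seed is arbitrary; both are assumptions made to keep the example small.

```r
library(mlr3verse)

task <- tsk("penguins")
set.seed(1) # arbitrary seed so the split is reproducible
split <- partition(task, ratio = 0.6)

# Sizes of the two parts
lengths(split[c("train", "test")])

# Class proportions in the full task and in the training part
prop.table(table(task$truth()))
prop.table(table(task$truth(split$train)))
```

Comparing the two proportion tables is a quick way to check how well the class distribution is preserved in the training part.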

Train the model

Code
learner$train(task, row_ids = split$train)

Predict

Code
prediction <- learner$predict(task, row_ids = split$test)
print(prediction)
<PredictionClassif> for 138 observations:
    row_ids     truth  response
          1    Adelie    Adelie
          3    Adelie    Adelie
          5    Adelie    Adelie
---                            
        339 Chinstrap Chinstrap
        340 Chinstrap    Gentoo
        341 Chinstrap    Adelie
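
The learner printout lists prob as a second predict type. The following sketch repeats the train and predict steps with class probabilities; the seed and the plain partition() call are assumptions of this example.

```r
library(mlr3verse)

task <- tsk("penguins")
set.seed(1) # arbitrary seed for a reproducible split
split <- partition(task, ratio = 0.6)

learner <- lrn("classif.rpart", predict_type = "prob")
learner$train(task, row_ids = split$train)
prediction <- learner$predict(task, row_ids = split$test)

# One column of class probabilities per species
head(prediction$prob)

# The predicted class is still available alongside the probabilities
head(prediction$response)
```

Probabilities are required by measures such as AUC or log loss, which is why the benchmark example sets predict_type = "prob" for its learners.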

Evaluate the model

Code
prediction$confusion
           truth
response    Adelie Chinstrap Gentoo
  Adelie        60         3      0
  Chinstrap      1        20      1
  Gentoo         0         4     49
Code
autoplot(prediction)

Code
# List the supported measures
# msrs()
measures <- msrs(c("classif.acc", "classif.ce"))
prediction$score(measures)
classif.acc  classif.ce 
 0.93478261  0.06521739 
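
Both scores follow directly from the confusion matrix: accuracy is the share of the diagonal, and classification error is its complement. The base R sketch below reproduces the arithmetic with the counts copied from the confusion matrix printed above.

```r
# Confusion matrix copied from the output above (rows: response, columns: truth)
confusion <- matrix(
  c(60, 3, 0,
    1, 20, 1,
    0, 4, 49),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    response = c("Adelie", "Chinstrap", "Gentoo"),
    truth = c("Adelie", "Chinstrap", "Gentoo")
  )
)

correct <- sum(diag(confusion)) # 60 + 20 + 49 = 129
total <- sum(confusion)         # 138 test observations
acc <- correct / total          # 129 / 138
ce <- 1 - acc

c(classif.acc = acc, classif.ce = ce)
```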

Advanced usage

Model comparison

benchmark() can compare several tasks, several learners, and several resampling strategies in a single run

Code
design <- benchmark_grid(
  tasks = tsks(c("spam", "german_credit", "sonar")),
  learners = lrns(c("classif.ranger", "classif.rpart", "classif.featureless"), predict_type = "prob"),
  resamplings = rsmps(c("holdout", "cv"))
)
print(design)
             task             learner resampling
           <char>              <char>     <char>
 1:          spam      classif.ranger    holdout
 2:          spam      classif.ranger         cv
 3:          spam       classif.rpart    holdout
 4:          spam       classif.rpart         cv
 5:          spam classif.featureless    holdout
 6:          spam classif.featureless         cv
 7: german_credit      classif.ranger    holdout
 8: german_credit      classif.ranger         cv
 9: german_credit       classif.rpart    holdout
10: german_credit       classif.rpart         cv
11: german_credit classif.featureless    holdout
12: german_credit classif.featureless         cv
13:         sonar      classif.ranger    holdout
14:         sonar      classif.ranger         cv
15:         sonar       classif.rpart    holdout
16:         sonar       classif.rpart         cv
17:         sonar classif.featureless    holdout
18:         sonar classif.featureless         cv

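Holdout and CV (cross-validation) are both methods for estimating the performance of a machine learning model, but they differ in some key ways.

Holdout

The holdout method splits the dataset into a training set and a test set: the training set is used to fit the model and the test set to evaluate it. The split proportions are usually fixed, for example 70% training and 30% test.

The holdout method is simple to use but has the following drawbacks:

  • The way the data is split can affect the performance estimate. For example, if the training and test sets follow different distributions, the estimate may be inaccurate.

  • Holdout trains on only part of the data and evaluates on a single split, so the performance estimate may not be reliable.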

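CV

CV splits the dataset into several subsets. Each subset serves as the test set in turn while the remaining subsets form the training set. Every data point is therefore used for both training and testing, which makes the performance estimate more reliable.

Common CV variants include k-fold cross-validation and leave-one-out cross-validation. k-fold cross-validation splits the dataset into k subsets; each subset serves as the test set in turn, with the other k-1 subsets as the training set. Leave-one-out cross-validation splits the dataset into n subsets of one data point each: every data point serves as the test set once, with the other n-1 data points as the training set.

CV is more complex and more expensive to run than holdout, but has the following advantages:

  • CV makes full use of the data, which improves the reliability of the performance estimate.

  • CV can be used to select the best hyperparameters.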
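
The difference between the two strategies can be made concrete by instantiating them on a task and looking at the resulting row ids. This is a sketch on the built-in sonar task; the 70% ratio, the five folds, and the seed are illustrative choices rather than the defaults used in the benchmark.

```r
library(mlr3verse)

task <- tsk("sonar")
set.seed(1) # arbitrary seed for reproducible splits

# Holdout: a single split, here 70% training and 30% test
holdout <- rsmp("holdout", ratio = 0.7)
holdout$instantiate(task)
length(holdout$train_set(1))
length(holdout$test_set(1))

# 5-fold cross-validation: five splits, one per fold
cv5 <- rsmp("cv", folds = 5)
cv5$instantiate(task)
cv5$iters

# Collect the test rows of all folds; each row id shows up exactly once
test_rows <- unlist(lapply(seq_len(cv5$iters), function(i) cv5$test_set(i)))
all(sort(test_rows) == sort(task$row_ids))
```

With holdout only the rows in the single test set contribute to the estimate, whereas with CV every row is tested once.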

Code
bmr <- benchmark(design, store_models = TRUE)
INFO  [23:43:25.808] [mlr3] Running benchmark with 99 resampling iterations
INFO  [23:43:25.882] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 1/1)
INFO  [23:43:27.498] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 1/10)
INFO  [23:43:28.601] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 2/10)
INFO  [23:43:30.403] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 3/10)
INFO  [23:43:31.511] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 4/10)
INFO  [23:43:32.881] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 5/10)
INFO  [23:43:33.969] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 6/10)
INFO  [23:43:35.063] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 7/10)
INFO  [23:43:36.434] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 8/10)
INFO  [23:43:37.570] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 9/10)
INFO  [23:43:38.624] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 10/10)
INFO  [23:43:39.728] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 1/1)
INFO  [23:43:40.070] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 1/10)
INFO  [23:43:40.128] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 2/10)
INFO  [23:43:40.186] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 3/10)
INFO  [23:43:40.248] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 4/10)
INFO  [23:43:40.305] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 5/10)
INFO  [23:43:40.361] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 6/10)
INFO  [23:43:40.425] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 7/10)
INFO  [23:43:40.485] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 8/10)
INFO  [23:43:40.546] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 9/10)
INFO  [23:43:40.606] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 10/10)
INFO  [23:43:40.670] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 1/1)
INFO  [23:43:40.681] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 1/10)
INFO  [23:43:40.695] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 2/10)
INFO  [23:43:40.706] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 3/10)
INFO  [23:43:40.716] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 4/10)
INFO  [23:43:40.726] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 5/10)
INFO  [23:43:40.737] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 6/10)
INFO  [23:43:40.747] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 7/10)
INFO  [23:43:40.758] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 8/10)
INFO  [23:43:40.769] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 9/10)
INFO  [23:43:40.781] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 10/10)
INFO  [23:43:40.794] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 1/1)
INFO  [23:43:40.936] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 1/10)
INFO  [23:43:41.109] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 2/10)
INFO  [23:43:41.286] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 3/10)
INFO  [23:43:41.453] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 4/10)
INFO  [23:43:41.623] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 5/10)
INFO  [23:43:41.791] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 6/10)
INFO  [23:43:42.276] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 7/10)
INFO  [23:43:42.441] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 8/10)
INFO  [23:43:42.608] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 9/10)
INFO  [23:43:42.776] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 10/10)
INFO  [23:43:42.942] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 1/1)
INFO  [23:43:42.959] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 1/10)
INFO  [23:43:42.978] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 2/10)
INFO  [23:43:42.997] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 3/10)
INFO  [23:43:43.016] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 4/10)
INFO  [23:43:43.035] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 5/10)
INFO  [23:43:43.055] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 6/10)
INFO  [23:43:43.074] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 7/10)
INFO  [23:43:43.094] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 8/10)
INFO  [23:43:43.114] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 9/10)
INFO  [23:43:43.135] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 10/10)
INFO  [23:43:43.165] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 1/1)
INFO  [23:43:43.174] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 1/10)
INFO  [23:43:43.185] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 2/10)
INFO  [23:43:43.194] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 3/10)
INFO  [23:43:43.204] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 4/10)
INFO  [23:43:43.213] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 5/10)
INFO  [23:43:43.222] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 6/10)
INFO  [23:43:43.230] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 7/10)
INFO  [23:43:43.239] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 8/10)
INFO  [23:43:43.248] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 9/10)
INFO  [23:43:43.256] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 10/10)
INFO  [23:43:43.265] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 1/1)
INFO  [23:43:43.317] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 1/10)
INFO  [23:43:43.382] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 2/10)
INFO  [23:43:43.447] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 3/10)
INFO  [23:43:43.513] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 4/10)
INFO  [23:43:43.578] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 5/10)
INFO  [23:43:43.644] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 6/10)
INFO  [23:43:43.730] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 7/10)
INFO  [23:43:43.796] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 8/10)
INFO  [23:43:43.861] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 9/10)
INFO  [23:43:43.925] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 10/10)
INFO  [23:43:43.990] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 1/1)
INFO  [23:43:44.006] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 1/10)
INFO  [23:43:44.024] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 2/10)
INFO  [23:43:44.041] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 3/10)
INFO  [23:43:44.059] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 4/10)
INFO  [23:43:44.076] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 5/10)
INFO  [23:43:44.433] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 6/10)
INFO  [23:43:44.451] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 7/10)
INFO  [23:43:44.468] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 8/10)
INFO  [23:43:44.487] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 9/10)
INFO  [23:43:44.504] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 10/10)
INFO  [23:43:44.521] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 1/1)
INFO  [23:43:44.529] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 1/10)
INFO  [23:43:44.538] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 2/10)
INFO  [23:43:44.546] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 3/10)
INFO  [23:43:44.554] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 4/10)
INFO  [23:43:44.563] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 5/10)
INFO  [23:43:44.572] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 6/10)
INFO  [23:43:44.580] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 7/10)
INFO  [23:43:44.588] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 8/10)
INFO  [23:43:44.597] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 9/10)
INFO  [23:43:44.605] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 10/10)
INFO  [23:43:44.653] [mlr3] Finished benchmark
Code
measures <- msrs(c("classif.acc", "classif.mcc"))

bmr$aggregate(measures)
       nr       task_id          learner_id resampling_id iters classif.acc
    <int>        <char>              <char>        <char> <int>       <num>
 1:     1          spam      classif.ranger       holdout     1   0.9556714
 2:     2          spam      classif.ranger            cv    10   0.9513109
 3:     3          spam       classif.rpart       holdout     1   0.8794003
 4:     4          spam       classif.rpart            cv    10   0.8934962
 5:     5          spam classif.featureless       holdout     1   0.6075619
 6:     6          spam classif.featureless            cv    10   0.6059610
 7:     7 german_credit      classif.ranger       holdout     1   0.7357357
 8:     8 german_credit      classif.ranger            cv    10   0.7670000
 9:     9 german_credit       classif.rpart       holdout     1   0.7057057
10:    10 german_credit       classif.rpart            cv    10   0.7410000
11:    11 german_credit classif.featureless       holdout     1   0.6966967
12:    12 german_credit classif.featureless            cv    10   0.7000000
13:    13         sonar      classif.ranger       holdout     1   0.8260870
14:    14         sonar      classif.ranger            cv    10   0.8269048
15:    15         sonar       classif.rpart       holdout     1   0.6521739
16:    16         sonar       classif.rpart            cv    10   0.7452381
17:    17         sonar classif.featureless       holdout     1   0.5942029
18:    18         sonar classif.featureless            cv    10   0.5333333
    classif.mcc
          <num>
 1:   0.9068346
 2:   0.8973169
 3:   0.7491612
 4:   0.7760810
 5:   0.0000000
 6:   0.0000000
 7:   0.3235231
 8:   0.3975598
 9:   0.2541632
10:   0.3281152
11:   0.0000000
12:   0.0000000
13:   0.6379015
14:   0.6556091
15:   0.2709858
16:   0.4769956
17:   0.0000000
18:   0.0000000
Hidden columns: resample_result
Code
library(ggplot2)
autoplot(bmr)

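Hyperparameter tuning

In practice, machine learning models often perform worse than hoped. Every model comes with default hyperparameters, but the defaults are not necessarily the best fit for your problem, and in that case hyperparameter tuning is needed.

mlr3 includes strategies for automated tuning. Automated tuning requires the following to be specified:

  • Search space: the range of values each hyperparameter may take.

  • Optimization algorithm: the algorithm used to search for the best configuration.

  • Evaluation method: the method used to estimate model performance.

  • Performance measure: the metric used to quantify model performance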
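
Each of these ingredients corresponds to an object that can be created on its own. The sketch below is one possible combination; the choice of random search, three folds, and a budget of five evaluations is purely illustrative, and the terminator is added here as the object that limits the budget.

```r
library(mlr3verse)
library(mlr3tuning)

# Search space: range of values for each tuned hyperparameter
search_space <- ps(cp = p_dbl(lower = 0.001, upper = 0.1))

# Optimization algorithm: random search
tuner <- tnr("random_search")

# Evaluation method: 3-fold cross-validation
resampling <- rsmp("cv", folds = 3)

# Performance measure: classification error (lower is better)
measure <- msr("classif.ce")
measure$minimize

# Budget: stop after 5 evaluated configurations
terminator <- trm("evals", n_evals = 5)
```

Keeping the pieces separate makes it easy to swap one of them, for example a different tuner, without touching the rest.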

Tuning example

Code
library(mlr3verse)
task <- tsk("pima")
learner <- lrn("classif.rpart")
# View the hyperparameters the algorithm supports
learner$param_set
<ParamSet>
                id    class lower upper nlevels
            <char>   <char> <num> <num>   <num>
 1:             cp ParamDbl     0     1     Inf
 2:     keep_model ParamLgl    NA    NA       2
 3:     maxcompete ParamInt     0   Inf     Inf
 4:       maxdepth ParamInt     1    30      30
 5:   maxsurrogate ParamInt     0   Inf     Inf
 6:      minbucket ParamInt     1   Inf     Inf
 7:       minsplit ParamInt     1   Inf     Inf
 8: surrogatestyle ParamInt     0     1       2
 9:   usesurrogate ParamInt     0     2       3
10:           xval ParamInt     0   Inf     Inf
                                                                                      default
                                                                                       <list>
 1:                                                                                      0.01
 2:                                                                                     FALSE
 3:                                                                                         4
 4:                                                                                        30
 5:                                                                                         5
 6: <NoDefault>\n  Public:\n    clone: function (deep = FALSE) \n    initialize: function () 
 7:                                                                                        20
 8:                                                                                         0
 9:                                                                                         2
10:                                                                                        10
     value
    <list>
 1:       
 2:       
 3:       
 4:       
 5:       
 6:       
 7:       
 8:       
 9:       
10:      0

Define the search space

Code
search_space <- ps(
  cp = p_dbl(lower = 0.001, upper = 0.1), # complexity parameter
  minsplit = p_int(lower = 1, upper = 10)
)
search_space
<ParamSet>
         id    class lower upper nlevels
     <char>   <char> <num> <num>   <num>
1:       cp ParamDbl 0.001   0.1     Inf
2: minsplit ParamInt 1.000  10.0      10
                                                                                     default
                                                                                      <list>
1: <NoDefault>\n  Public:\n    clone: function (deep = FALSE) \n    initialize: function () 
2: <NoDefault>\n  Public:\n    clone: function (deep = FALSE) \n    initialize: function () 
    value
   <list>
1:       
2:       
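
A grid search lays a grid over this search space, and the resolution sets how many values are taken per hyperparameter. As a rough sketch of what such a grid looks like, paradox::generate_design_grid() can build one directly; using this helper here is an assumption about how the grid is formed, not a statement of what the tuner does internally.

```r
library(mlr3verse)

search_space <- ps(
  cp = p_dbl(lower = 0.001, upper = 0.1),
  minsplit = p_int(lower = 1, upper = 10)
)

# Grid with up to 5 values per hyperparameter, so up to 5 x 5 points
design <- paradox::generate_design_grid(search_space, resolution = 5)
design$data
nrow(design$data)
```

If the grid holds more points than the evaluation budget allows, only part of it gets evaluated, which is worth keeping in mind when choosing the resolution and the terminator together.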

Set the resampling strategy and performance measures

Code
cv <- rsmp("cv")
measures <- msrs(c("classif.ce", "time_train", "classif.acc"))

Code
library(mlr3tuning)
# Stop after 10 evaluations. mlr_terminators lists the other supported terminators; run_time, for example, stops after a time budget

instance <- tune(
  tuner = tnr("grid_search", resolution = 5, batch_size = 2),
  task = task,
  learner = learner,
  resampling = cv,
  measures = measures,
  search_space = search_space,
  term_evals = 10
)
INFO  [23:43:46.531] [bbotk] Starting to optimize 2 parameter(s) with '<TunerGridSearch>' and '<TerminatorEvals> [n_evals=10, k=0]'
INFO  [23:43:46.533] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:46.540] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:46.545] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:46.563] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:46.581] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:46.598] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:46.620] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:46.642] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:46.668] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:46.684] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:46.700] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:46.715] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:46.730] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:46.744] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:46.758] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:46.773] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:46.787] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:46.803] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:46.817] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:46.832] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:46.846] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:46.860] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:46.875] [mlr3] Finished benchmark
INFO  [23:43:47.015] [bbotk] Result of batch 1:
INFO  [23:43:47.017] [bbotk]     cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [23:43:47.017] [bbotk]  0.001        3  0.2955229      0.005   0.7044771        0      0
INFO  [23:43:47.017] [bbotk]  0.001       10  0.2773069      0.003   0.7226931        0      0
INFO  [23:43:47.017] [bbotk]  runtime_learners                                uhash
INFO  [23:43:47.017] [bbotk]              0.08 dbef4b88-4e79-45a3-967a-6227d2bb6546
INFO  [23:43:47.017] [bbotk]              0.05 7a374b06-7f67-4405-9330-fe0bc4352c36
INFO  [23:43:47.018] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:47.023] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:47.027] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:47.041] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:47.057] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:47.071] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:47.086] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:47.101] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:47.116] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:47.141] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:47.157] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:47.172] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:47.187] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:47.202] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:47.217] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:47.231] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:47.245] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:47.258] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:47.272] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:47.285] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:47.301] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:47.316] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:47.330] [mlr3] Finished benchmark
INFO  [23:43:47.468] [bbotk] Result of batch 2:
INFO  [23:43:47.469] [bbotk]      cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [23:43:47.469] [bbotk]  0.0505        5  0.2408578      0.000   0.7591422        0      0
INFO  [23:43:47.469] [bbotk]  0.1000        8  0.2733766      0.004   0.7266234        0      0
INFO  [23:43:47.469] [bbotk]  runtime_learners                                uhash
INFO  [23:43:47.469] [bbotk]              0.07 12b3953e-869c-4606-9718-ed7be80de174
INFO  [23:43:47.469] [bbotk]              0.07 a2a61b3a-e33b-49ab-b052-022f7e524d7e
INFO  [23:43:47.470] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:47.475] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:47.479] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:47.493] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:47.508] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:47.522] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:47.536] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:47.549] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:47.563] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:47.577] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:47.592] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:47.606] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:47.621] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:47.635] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:47.649] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:47.664] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:47.680] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:47.696] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:47.713] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:47.727] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:47.742] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:47.764] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:47.779] [mlr3] Finished benchmark
INFO  [23:43:47.919] [bbotk] Result of batch 3:
INFO  [23:43:47.920] [bbotk]       cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [23:43:47.920] [bbotk]  0.05050        1  0.2408578      0.003   0.7591422        0      0
INFO  [23:43:47.920] [bbotk]  0.07525        8  0.2408578      0.006   0.7591422        0      0
INFO  [23:43:47.920] [bbotk]  runtime_learners                                uhash
INFO  [23:43:47.920] [bbotk]              0.06 45437bef-7faa-4b1a-8bd1-285d4bc84ca1
INFO  [23:43:47.920] [bbotk]              0.12 e9495aa2-6dfa-4f21-a33a-0f37312f2486
INFO  [23:43:47.921] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:47.926] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:47.930] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:47.944] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:47.959] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:47.973] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:47.988] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:48.002] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:48.018] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:48.033] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:48.047] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:48.061] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:48.075] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:48.089] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:48.102] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:48.115] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:48.129] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:48.146] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:48.162] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:48.178] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:48.192] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:48.207] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:48.222] [mlr3] Finished benchmark
INFO  [23:43:48.365] [bbotk] Result of batch 4:
INFO  [23:43:48.367] [bbotk]       cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [23:43:48.367] [bbotk]  0.02575        1  0.2447539      0.003   0.7552461        0      0
INFO  [23:43:48.367] [bbotk]  0.07525        1  0.2408578      0.000   0.7591422        0      0
INFO  [23:43:48.367] [bbotk]  runtime_learners                                uhash
INFO  [23:43:48.367] [bbotk]              0.03 3ad9ea32-6869-4746-9c7e-06d21c6fd27c
INFO  [23:43:48.367] [bbotk]              0.05 4db13743-13b7-4d5b-8dc6-8cd439f873e7
INFO  [23:43:48.368] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:48.373] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:48.377] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:48.391] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:48.406] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:48.423] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:48.439] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:48.453] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:48.468] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:48.483] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:48.498] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:48.513] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:48.527] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:48.541] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:48.554] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:48.568] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:48.581] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:48.595] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:48.609] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:48.624] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:48.638] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:48.654] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:48.669] [mlr3] Finished benchmark
INFO  [23:43:48.810] [bbotk] Result of batch 5:
INFO  [23:43:48.812] [bbotk]      cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [23:43:48.812] [bbotk]  0.0010        5  0.2798360      0.002   0.7201640        0      0
INFO  [23:43:48.812] [bbotk]  0.0505        3  0.2408578      0.006   0.7591422        0      0
INFO  [23:43:48.812] [bbotk]  runtime_learners                                uhash
INFO  [23:43:48.812] [bbotk]              0.09 96e654eb-1406-437a-8c5d-c32aa7eddd4c
INFO  [23:43:48.812] [bbotk]              0.08 97f453d6-c154-47ca-8bf7-08b70bad9220
INFO  [23:43:48.815] [bbotk] Finished optimizing after 10 evaluation(s)
INFO  [23:43:48.815] [bbotk] Result:
INFO  [23:43:48.817] [bbotk]       cp minsplit learner_param_vals  x_domain classif.ce time_train
INFO  [23:43:48.817] [bbotk]    <num>    <int>             <list>    <list>      <num>      <num>
INFO  [23:43:48.817] [bbotk]  0.05050        5          <list[3]> <list[2]>  0.2408578          0
INFO  [23:43:48.817] [bbotk]  0.07525        1          <list[3]> <list[2]>  0.2408578          0
INFO  [23:43:48.817] [bbotk]  classif.acc
INFO  [23:43:48.817] [bbotk]        <num>
INFO  [23:43:48.817] [bbotk]    0.7591422
INFO  [23:43:48.817] [bbotk]    0.7591422
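
The comment in the tuning call mentions that other termination criteria are available through mlr_terminators. As a rough sketch (assuming the bbotk package, which provides the terminators used by mlr3tuning), they can be listed and constructed like this; the 30-second limit is an arbitrary value chosen for illustration.

```r
library(bbotk)

# Keys of all registered terminators
mlr_terminators$keys()

# Stop after 10 evaluations (the criterion that term_evals = 10 sets up)
term_evals <- trm("evals", n_evals = 10)

# Stop after 30 seconds of run time instead (30 is an illustrative value)
term_time <- trm("run_time", secs = 30)

# Stop as soon as either criterion is met
term_any <- trm("combo", list(term_evals, term_time), any = TRUE)
term_any
```

Recent versions of mlr3tuning should accept such an object through the terminator argument of tune(); check ?tune for the installed version before relying on it.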
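Code
# Search method used above: grid_search is grid search, random_search is random search.
# Note that resolution = 5 builds a uniform 5 x 5 grid over cp and minsplit, i.e. 25 parameter combinations; because the budget above was capped at 10 evaluations, the search stopped after 10.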
instance
<TuningInstanceMultiCrit>
* State:  Optimized
* Objective: <ObjectiveTuning:classif.rpart_on_pima>
* Search Space:
         id    class lower upper nlevels
     <char>   <char> <num> <num>   <num>
1:       cp ParamDbl 0.001   0.1     Inf
2: minsplit ParamInt 1.000  10.0      10
* Terminator: <TerminatorEvals>
* Result:
        cp minsplit classif.ce time_train classif.acc
     <num>    <int>      <num>      <num>       <num>
1: 0.05050        5  0.2408578          0   0.7591422
2: 0.07525        1  0.2408578          0   0.7591422
* Archive:
         cp minsplit classif.ce time_train classif.acc
      <num>    <int>      <num>      <num>       <num>
 1: 0.00100        3  0.2955229      0.005   0.7044771
 2: 0.00100       10  0.2773069      0.003   0.7226931
 3: 0.05050        5  0.2408578      0.000   0.7591422
 4: 0.10000        8  0.2733766      0.004   0.7266234
 5: 0.05050        1  0.2408578      0.003   0.7591422
 6: 0.07525        8  0.2408578      0.006   0.7591422
 7: 0.02575        1  0.2447539      0.003   0.7552461
 8: 0.07525        1  0.2408578      0.000   0.7591422
 9: 0.00100        5  0.2798360      0.002   0.7201640
10: 0.05050        3  0.2408578      0.006   0.7591422
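
To make the 5 x 5 grid concrete, the sketch below generates the full grid with paradox, the package mlr3 uses for search spaces. It redefines the same search space so that it runs on its own; the tuner builds its grid internally, so this is only an illustration of the candidate set, not the tuner's own code path.

```r
library(paradox)

# Same search space as in the tuning example
search_space <- ps(
  cp = p_dbl(0.001, 0.1),
  minsplit = p_int(1, 10)
)

# Full grid with 5 levels per parameter
grid <- generate_design_grid(search_space, resolution = 5)

# Number of candidate configurations
nrow(grid$data)

# Distinct levels per parameter
sort(unique(grid$data$cp))
sort(unique(grid$data$minsplit))
```

With a budget of 10 evaluations only 10 of these candidates are tried, which is why the archive contains 10 rows.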

View the tuning results

Code
instance
<TuningInstanceMultiCrit>
* State:  Optimized
* Objective: <ObjectiveTuning:classif.rpart_on_pima>
* Search Space:
         id    class lower upper nlevels
     <char>   <char> <num> <num>   <num>
1:       cp ParamDbl 0.001   0.1     Inf
2: minsplit ParamInt 1.000  10.0      10
* Terminator: <TerminatorEvals>
* Result:
        cp minsplit classif.ce time_train classif.acc
     <num>    <int>      <num>      <num>       <num>
1: 0.05050        5  0.2408578          0   0.7591422
2: 0.07525        1  0.2408578          0   0.7591422
* Archive:
         cp minsplit classif.ce time_train classif.acc
      <num>    <int>      <num>      <num>       <num>
 1: 0.00100        3  0.2955229      0.005   0.7044771
 2: 0.00100       10  0.2773069      0.003   0.7226931
 3: 0.05050        5  0.2408578      0.000   0.7591422
 4: 0.10000        8  0.2733766      0.004   0.7266234
 5: 0.05050        1  0.2408578      0.003   0.7591422
 6: 0.07525        8  0.2408578      0.006   0.7591422
 7: 0.02575        1  0.2447539      0.003   0.7552461
 8: 0.07525        1  0.2408578      0.000   0.7591422
 9: 0.00100        5  0.2798360      0.002   0.7201640
10: 0.05050        3  0.2408578      0.006   0.7591422

Apply the tuned parameters to the model and retrain

View the tuned parameters

Code
instance$result_learner_param_vals
[[1]]
[[1]]$xval
[1] 0

[[1]]$cp
[1] 0.0505

[[1]]$minsplit
[1] 5


[[2]]
[[2]]$xval
[1] 0

[[2]]$cp
[1] 0.07525

[[2]]$minsplit
[1] 1

Model performance

Code
instance$result_y
   classif.ce time_train classif.acc
        <num>      <num>       <num>
1:  0.2408578          0   0.7591422
2:  0.2408578          0   0.7591422
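
Two rows are returned because several measures were tuned at once: with multiple criteria there is usually no single best configuration, only a set of configurations that no other one beats on every measure (the Pareto front). The base R sketch below recomputes that set from the archive values printed earlier. It is an illustration of the idea, not the code mlr3tuning runs, and it drops classif.acc because it equals 1 - classif.ce here.

```r
# Archive values copied from the tuning output
archive <- data.frame(
  cp = c(0.001, 0.001, 0.0505, 0.1, 0.0505, 0.07525, 0.02575, 0.07525, 0.001, 0.0505),
  minsplit = c(3, 10, 5, 8, 1, 8, 1, 1, 5, 3),
  classif.ce = c(0.2955229, 0.2773069, 0.2408578, 0.2733766, 0.2408578,
                 0.2408578, 0.2447539, 0.2408578, 0.2798360, 0.2408578),
  time_train = c(0.005, 0.003, 0.000, 0.004, 0.003, 0.006, 0.003, 0.000, 0.002, 0.006)
)

# Both remaining measures are minimized
costs <- as.matrix(archive[, c("classif.ce", "time_train")])

# Row i is dominated if some row j is no worse on every measure
# and strictly better on at least one
is_dominated <- function(i, costs) {
  any(vapply(seq_len(nrow(costs)), function(j) {
    all(costs[j, ] <= costs[i, ]) && any(costs[j, ] < costs[i, ])
  }, logical(1)))
}

dominated <- vapply(seq_len(nrow(costs)), is_dominated, logical(1), costs = costs)
pareto <- archive[!dominated, ]
pareto
```

The two surviving rows are the same configurations that the tuning instance reports as its result.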

Apply the selected parameters back to the learner

Code
learner$param_set$values <- instance$result_learner_param_vals[[1]]
learner$train(task)

pred <- learner$predict(task)

pred$confusion
        truth
response pos neg
     pos 150  58
     neg 118 442
Code
pred$score(msr("classif.acc"))
classif.acc 
  0.7708333 
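
The accuracy can be checked by hand from the confusion matrix: it is the share of observations on the diagonal. A small base R sketch using the counts printed above:

```r
# Confusion matrix copied from the output (rows: response, columns: truth)
conf <- matrix(
  c(150, 118, 58, 442),
  nrow = 2,
  dimnames = list(response = c("pos", "neg"), truth = c("pos", "neg"))
)

# Accuracy: correctly classified observations divided by all observations
acc <- sum(diag(conf)) / sum(conf)
acc

# Share of truly positive cases that were predicted as positive
sens <- conf["pos", "pos"] / sum(conf[, "pos"])
sens
```

Note that these predictions were made on the same data the model was trained on, so the figure is likely more optimistic than the cross-validated accuracy reported during tuning.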

auto_tuner wraps tuning in a learner and directly returns the best model

Code
task <- tsk("pima")
learner <- lrn("classif.rpart")
search_space <- ps(
  cp = p_dbl(0.001, 0.1),
  minsplit = p_int(1, 10)
)
cv <- rsmp("cv")
measures <- msr("classif.acc")

auto_learner <- auto_tuner(
  tuner = tnr("random_search", batch_size = 2),
  learner = learner,
  resampling = cv,
  measure = measures,
  search_space = search_space,
  term_evals = 10
)
auto_learner$train(task)
INFO  [23:43:49.021] [bbotk] Starting to optimize 2 parameter(s) with '<OptimizerRandomSearch>' and '<TerminatorEvals> [n_evals=10, k=0]'
INFO  [23:43:49.029] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:49.034] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:49.038] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:49.053] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:49.068] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:49.084] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:49.098] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:49.113] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:49.127] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:49.142] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:49.156] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:49.172] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:49.186] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:49.201] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:49.215] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:49.229] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:49.243] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:49.257] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:49.270] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:49.284] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:49.297] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:49.311] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:49.326] [mlr3] Finished benchmark
INFO  [23:43:49.357] [bbotk] Result of batch 1:
INFO  [23:43:49.359] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [23:43:49.359] [bbotk]  0.09716243       10   0.7136364        0      0             0.12
INFO  [23:43:49.359] [bbotk]  0.06401417        1   0.7370984        0      0             0.05
INFO  [23:43:49.359] [bbotk]                                 uhash
INFO  [23:43:49.359] [bbotk]  4494a6a2-8fc1-4c67-953d-d8533a4f0629
INFO  [23:43:49.359] [bbotk]  51f45167-f6b1-4a8c-a17a-f700972fda6f
INFO  [23:43:49.361] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:49.368] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:49.372] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:49.387] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:49.401] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:49.417] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:49.431] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:49.445] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:49.460] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:49.474] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:49.489] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:49.503] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:49.517] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:49.531] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:49.545] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:49.559] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:49.573] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:49.588] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:49.612] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:49.626] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:49.640] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:49.654] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:49.669] [mlr3] Finished benchmark
INFO  [23:43:49.702] [bbotk] Result of batch 2:
INFO  [23:43:49.704] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [23:43:49.704] [bbotk]  0.06292075        2   0.7370984        0      0             0.10
INFO  [23:43:49.704] [bbotk]  0.06739708       10   0.7370984        0      0             0.08
INFO  [23:43:49.704] [bbotk]                                 uhash
INFO  [23:43:49.704] [bbotk]  c79ef995-762b-445a-8ed2-ba57e6bf0817
INFO  [23:43:49.704] [bbotk]  bf756d50-e3d4-4749-903d-5989fe28ee2e
INFO  [23:43:49.706] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:49.712] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:49.716] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:49.730] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:49.745] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:49.758] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:49.773] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:49.789] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:49.804] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:49.818] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:49.832] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:49.847] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:49.861] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:49.876] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:49.890] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:49.904] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:49.918] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:49.933] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:49.947] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:49.961] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:49.976] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:49.990] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:50.005] [mlr3] Finished benchmark
INFO  [23:43:50.038] [bbotk] Result of batch 3:
INFO  [23:43:50.040] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [23:43:50.040] [bbotk]  0.03809861        8   0.7370984        0      0             0.07
INFO  [23:43:50.040] [bbotk]  0.01926267        7   0.7474710        0      0             0.06
INFO  [23:43:50.040] [bbotk]                                 uhash
INFO  [23:43:50.040] [bbotk]  e843b5e0-7a61-4339-8971-23651abe1a04
INFO  [23:43:50.040] [bbotk]  9b205bc9-44b2-4d2f-b3df-c5fd7e590b56
INFO  [23:43:50.042] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:50.047] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:50.051] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:50.066] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:50.080] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:50.095] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:50.110] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:50.125] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:50.140] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:50.157] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:50.173] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:50.189] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:50.203] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:50.217] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:50.231] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:50.245] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:50.260] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:50.274] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:50.297] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:50.311] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:50.325] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:50.340] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:50.355] [mlr3] Finished benchmark
INFO  [23:43:50.388] [bbotk] Result of batch 4:
INFO  [23:43:50.390] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [23:43:50.390] [bbotk]  0.03296924        3   0.7370984        0      0             0.09
INFO  [23:43:50.390] [bbotk]  0.06494301        2   0.7370984        0      0             0.10
INFO  [23:43:50.390] [bbotk]                                 uhash
INFO  [23:43:50.390] [bbotk]  a0089066-8b9f-4651-a5f0-52e8818cc023
INFO  [23:43:50.390] [bbotk]  957eee8c-fd20-4d66-85e4-9a94bed29ee0
INFO  [23:43:50.392] [bbotk] Evaluating 2 configuration(s)
INFO  [23:43:50.397] [mlr3] Running benchmark with 20 resampling iterations
INFO  [23:43:50.401] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:50.416] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:50.430] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:50.444] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:50.458] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:50.473] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:50.487] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:50.502] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:50.516] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:50.529] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:50.543] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [23:43:50.557] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [23:43:50.571] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [23:43:50.584] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [23:43:50.598] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [23:43:50.613] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [23:43:50.629] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [23:43:50.643] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [23:43:50.657] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [23:43:50.671] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [23:43:50.685] [mlr3] Finished benchmark
INFO  [23:43:50.721] [bbotk] Result of batch 5:
INFO  [23:43:50.722] [bbotk]         cp minsplit classif.acc warnings errors runtime_learners
INFO  [23:43:50.722] [bbotk]  0.0731832        6   0.7370984        0      0             0.06
INFO  [23:43:50.722] [bbotk]  0.0917563        7   0.7188312        0      0             0.07
INFO  [23:43:50.722] [bbotk]                                 uhash
INFO  [23:43:50.722] [bbotk]  1c5de2e6-289a-4229-ba7a-10249400bf41
INFO  [23:43:50.722] [bbotk]  87baed5c-6692-4131-800d-470425aa81af
INFO  [23:43:50.727] [bbotk] Finished optimizing after 10 evaluation(s)
INFO  [23:43:50.727] [bbotk] Result:
INFO  [23:43:50.728] [bbotk]          cp minsplit learner_param_vals  x_domain classif.acc
INFO  [23:43:50.728] [bbotk]       <num>    <int>             <list>    <list>       <num>
INFO  [23:43:50.728] [bbotk]  0.01926267        7          <list[3]> <list[2]>    0.747471

The learner tuned by auto_tuner can be used directly to predict on new data

Code
auto_learner$predict(task)
<PredictionClassif> for 768 observations:
    row_ids truth response
          1   pos      pos
          2   neg      neg
          3   pos      neg
---                       
        766   neg      neg
        767   pos      neg
        768   neg      neg
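
The call above predicts on the same rows that were used for training. To mimic genuinely new data, one option is to hold out part of the task, as in the sketch below. The 80/20 split, the 3 folds, the budget of 4 evaluations and the seed are arbitrary choices made to keep the example quick, and the two lgr calls only silence the INFO log lines.

```r
library(mlr3verse)

# Reduce logging so only warnings and errors are printed
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

set.seed(1)
task <- tsk("pima")

# Hold out 20% of the rows as "new" data (illustrative ratio)
split <- partition(task, ratio = 0.8)

at <- auto_tuner(
  tuner = tnr("random_search", batch_size = 2),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.acc"),
  search_space = ps(
    cp = p_dbl(0.001, 0.1),
    minsplit = p_int(1, 10)
  ),
  term_evals = 4
)

# Tune and fit on the training rows only
at$train(task, row_ids = split$train)

# Predict and score on the held-out rows
pred <- at$predict(task, row_ids = split$test)
pred$score(msr("classif.acc"))
```

For data that is not part of the task at all, the learner's predict_newdata() method takes a data frame with the same feature columns.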

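Feature selection

When building a model on a dataset, many variables carry no useful information about the target. Including them in the model only adds noise and degrades its performance.

The process of removing irrelevant or redundant variables and keeping the suitable ones is called feature selection.

mlr3 can compute a score for every predictor using one of several methods, and then rank and filter the variables by that score.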
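
As a minimal sketch of this score-then-rank idea, the example below applies the anova filter from mlr3filters to the built-in sonar task. The task, the filter and the cutoff of 5 features are choices made for illustration only; sonar is used because it has no missing values.

```r
library(mlr3)
library(mlr3filters)

task <- tsk("sonar")

# Score every feature with an ANOVA F-test against the target
filter <- flt("anova")
filter$calculate(task)

# Scores are stored sorted from most to least relevant
head(as.data.table(filter), 5)

# Keep only the 5 highest-scoring features (illustrative cutoff)
top <- head(names(filter$scores), 5)
task_small <- task$clone()$select(top)
task_small$feature_names
```

Any other filter that supports the task type can be substituted by changing the key passed to flt().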

View the supported scoring methods

Code
as.data.table(mlr_filters)
Key: <key>
                  key                                                    label
               <char>                                                   <char>
 1:             anova                                             ANOVA F-Test
 2:               auc                           Area Under the ROC Curve Score
 3:            boruta                                                   Burota
 4:          carscore                   Correlation-Adjusted coRrelation Score
 5:      carsurvscore          Correlation-Adjusted coRrelation Survival Score
 6:              cmim      Minimal Conditional Mutual Information Maximization
 7:       correlation                                              Correlation
 8:              disr                       Double Input Symmetrical Relevance
 9:  find_correlation                                  Correlation-based Score
10:        importance                                         Importance Score
11:  information_gain                                         Information Gain
12:               jmi                                 Joint Mutual Information
13:              jmim            Minimal Joint Mutual Information Maximization
14:      kruskal_test                                      Kruskal-Wallis Test
15:               mim                          Mutual Information Maximization
16:              mrmr                     Minimum Redundancy Maximal Relevancy
17:             njmim Minimal Normalised Joint Mutual Information Maximization
18:       performance                                   Predictive Performance
19:       permutation                                        Permutation Score
20:            relief                                                   RELIEF
21: selected_features                               Embedded Feature Selection
22:    univariate_cox                            Univariate Cox Survival Score
23:          variance                                                 Variance
                  key                                                    label
      task_types task_properties
          <list>          <list>
 1:      classif                
 2:      classif        twoclass
 3: regr,classif                
 4:         regr                
 5:         surv                
 6: classif,regr                
 7:         regr                
 8: classif,regr                
 9:           NA                
10:      classif                
11: classif,regr                
12: classif,regr                
13: classif,regr                
14:      classif                
15: classif,regr                
16: classif,regr                
17: classif,regr                
18:      classif                
19:      classif                
20: classif,regr                
21:      classif                
22:         surv                
23:           NA                
      task_types task_properties
                                                 params
                                                 <list>
 1:                                                    
 2:                                                    
 3: pValue,mcAdj,maxRuns,doTrace,holdHistory,getImp,...
 4:                             lambda,diagonal,verbose
 5:                                  maxIPCweight,denom
 6:                                             threads
 7:                                          use,method
 8:                                             threads
 9:                                          use,method
10:                                              method
11:                     type,equal,discIntegers,threads
12:                                             threads
13:                                             threads
14:                                           na.action
15:                                             threads
16:                                             threads
17:                                             threads
18:                                              method
19:                                     standardize,nmc
20:                          neighboursCount,sampleSize
21:                                              method
22:                                                    
23:                                               na.rm
                                                 params
                                           feature_types          packages
                                                  <list>            <list>
 1:                                      integer,numeric             stats
 2:                                      integer,numeric      mlr3measures
 3:                                      integer,numeric            Boruta
 4:                              logical,integer,numeric              care
 5:                                      integer,numeric carSurv,mlr3proba
 6:                       integer,numeric,factor,ordered           praznik
 7:                                      integer,numeric             stats
 8:                       integer,numeric,factor,ordered           praznik
 9:                                      integer,numeric             stats
10: logical,integer,numeric,character,factor,ordered,...              mlr3
11:                       integer,numeric,factor,ordered     FSelectorRcpp
12:                       integer,numeric,factor,ordered           praznik
13:                       integer,numeric,factor,ordered           praznik
14:                                      integer,numeric             stats
15:                       integer,numeric,factor,ordered           praznik
16:                       integer,numeric,factor,ordered           praznik
17:                       integer,numeric,factor,ordered           praznik
18: logical,integer,numeric,character,factor,ordered,... mlr3,mlr3measures
19: logical,integer,numeric,character,factor,ordered,... mlr3,mlr3measures
20:                       integer,numeric,factor,ordered     FSelectorRcpp
21: logical,integer,numeric,character,factor,ordered,...              mlr3
22:                              integer,numeric,logical          survival
23:                                      integer,numeric             stats
                                           feature_types          packages

Calculating scores

Code
filter <- flt("jmim")

task <- tsk("iris")

filter$calculate(task)

filter
<FilterJMIM:jmim>: Minimal Joint Mutual Information Maximization
Task Types: classif, regr
Properties: -
Task Properties: -
Packages: praznik
Feature types: integer, numeric, factor, ordered
        feature     score
1:  Petal.Width 1.0000000
2: Sepal.Length 0.6666667
3: Petal.Length 0.3333333
4:  Sepal.Width 0.0000000
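
Once the scores are calculated, a typical next step is to keep only the best-ranked features. The sketch below assumes that `filter$scores` is a named numeric vector sorted from highest to lowest score, and that `task$select()` restricts the task in place; the choice of keeping two features is arbitrary and only for illustration.

```r
library(mlr3)
library(mlr3filters)

task <- tsk("iris")
filter <- flt("jmim") # requires the praznik package
filter$calculate(task)

# Names of the two highest-scoring features (scores are sorted decreasingly)
top2 <- names(head(filter$scores, 2))

# Restrict the task to these features; this modifies the task in place
task$select(top2)
task$feature_names
```

Because `task$select()` changes the task itself, clone the task first (`task$clone()`) if the full feature set is still needed later.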

Filtering by correlation

Code
task <- tsk("mtcars")

filter_cor <- flt("correlation")


filter_cor$param_set
<ParamSet>
       id    class lower upper nlevels    default  value
   <char>   <char> <num> <num>   <int>     <list> <list>
1:    use ParamFct    NA    NA       5 everything       
2: method ParamFct    NA    NA       3    pearson       
Code
filter_cor$param_set$values <- list(method = "spearman")
filter_cor$param_set
<ParamSet>
       id    class lower upper nlevels    default    value
   <char>   <char> <num> <num>   <int>     <list>   <list>
1:    use ParamFct    NA    NA       5 everything         
2: method ParamFct    NA    NA       3    pearson spearman
Code
filter_cor$calculate(task)

filter_cor
<FilterCorrelation:correlation>: Correlation
Task Types: regr
Properties: missings
Task Properties: -
Packages: stats
Feature types: integer, numeric
    feature     score
 1:     cyl 0.9108013
 2:    disp 0.9088824
 3:      hp 0.8946646
 4:      wt 0.8864220
 5:      vs 0.7065968
 6:    carb 0.6574976
 7:    drat 0.6514555
 8:      am 0.5620057
 9:    gear 0.5427816
10:    qsec 0.4669358
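
The scores above can be cross-checked with base R. This sketch assumes the correlation filter scores each feature by the absolute value of its correlation with the target (`mpg` for the `mtcars` task), which is consistent with the printed values; it uses only `stats::cor()` and the built-in `mtcars` data.

```r
# All columns except the target are candidate features
features <- setdiff(names(mtcars), "mpg")

# Absolute Spearman correlation between each feature and the target
scores <- sapply(features, function(f) {
  abs(cor(mtcars[[f]], mtcars$mpg, method = "spearman"))
})

# Rank features from most to least correlated with mpg
ranked <- sort(scores, decreasing = TRUE)
ranked
```

Taking the absolute value means a strongly negative relationship (such as `cyl` versus `mpg`) ranks as high as a strongly positive one.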

Variable importance

Code
lrn <- lrn("classif.ranger", importance = "impurity")

task <- tsk("iris")
filter <- flt("importance", learner = lrn)
filter$calculate(task)
filter
<FilterImportance:importance>: Importance Score
Task Types: classif
Properties: -
Task Properties: -
Packages: mlr3, mlr3learners, ranger
Feature types: logical, integer, numeric, character, factor, ordered
        feature     score
1:  Petal.Width 44.005410
2: Petal.Length 43.545815
3: Sepal.Length  9.573771
4:  Sepal.Width  2.109720

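Wrapper methods

Wrapper methods work much like hyperparameter tuning: models are fitted on different feature subsets, and the subset is chosen by model performance.

Code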
library(mlr3fselect)

task <- tsk("pima")
learner <- lrn("classif.rpart")
hout <- rsmp("holdout")
measure <- msr("classif.ce")

# Stop after 20 evaluated feature subsets
evals20 <- trm("evals", n_evals = 20)

instance <- FSelectInstanceSingleCrit$new(
  task = task,
  learner = learner,
  resampling = hout,
  measure = measure,
  terminator = evals20
)
instance
<FSelectInstanceSingleCrit>
* State:  Not optimized
* Objective: <ObjectiveFSelect:classif.rpart_on_pima>
* Terminator: <TerminatorEvals>
Code
fselector <- fs("random_search")

lgr::get_logger("bbotk")$set_threshold("warn")

fselector$optimize(instance)
INFO  [23:43:51.198] [mlr3] Running benchmark with 10 resampling iterations
INFO  [23:43:51.203] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.215] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.226] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.237] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.249] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.259] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.270] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.281] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.291] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.303] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.314] [mlr3] Finished benchmark
INFO  [23:43:51.404] [mlr3] Running benchmark with 10 resampling iterations
INFO  [23:43:51.408] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.420] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.432] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.444] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.458] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.470] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.481] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.492] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.503] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.514] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [23:43:51.525] [mlr3] Finished benchmark
      age glucose insulin   mass pedigree pregnant pressure triceps
   <lgcl>  <lgcl>  <lgcl> <lgcl>   <lgcl>   <lgcl>   <lgcl>  <lgcl>
1:   TRUE    TRUE    TRUE   TRUE     TRUE     TRUE    FALSE    TRUE
                                         features n_features classif.ce
                                           <list>      <int>      <num>
1: age,glucose,insulin,mass,pedigree,pregnant,...          7  0.2382812
Code
# View the selected features
instance$result_feature_set
[1] "age"      "glucose"  "insulin"  "mass"     "pedigree" "pregnant" "triceps" 
Code
# View the performance of the selected feature set
instance$result_y
classif.ce 
 0.2382812 
Code
# View the full search history
as.data.table(instance$archive)
       age glucose insulin   mass pedigree pregnant pressure triceps classif.ce
    <lgcl>  <lgcl>  <lgcl> <lgcl>   <lgcl>   <lgcl>   <lgcl>  <lgcl>      <num>
 1:   TRUE    TRUE    TRUE   TRUE     TRUE     TRUE     TRUE    TRUE  0.2382812
 2:  FALSE    TRUE   FALSE  FALSE    FALSE    FALSE    FALSE    TRUE  0.3242188
 3:   TRUE    TRUE    TRUE  FALSE     TRUE    FALSE     TRUE    TRUE  0.3203125
 4:  FALSE   FALSE    TRUE   TRUE     TRUE    FALSE    FALSE    TRUE  0.2968750
 5:  FALSE    TRUE   FALSE  FALSE    FALSE    FALSE     TRUE   FALSE  0.3007812
 6:   TRUE    TRUE   FALSE   TRUE    FALSE    FALSE    FALSE    TRUE  0.2460938
 7:  FALSE    TRUE   FALSE   TRUE    FALSE    FALSE    FALSE   FALSE  0.2460938
 8:   TRUE    TRUE   FALSE  FALSE    FALSE    FALSE    FALSE   FALSE  0.3554688
 9:   TRUE    TRUE    TRUE   TRUE     TRUE     TRUE    FALSE    TRUE  0.2382812
10:   TRUE    TRUE    TRUE   TRUE     TRUE     TRUE     TRUE    TRUE  0.2382812
11:  FALSE    TRUE    TRUE  FALSE    FALSE    FALSE    FALSE   FALSE  0.3007812
12:   TRUE   FALSE   FALSE  FALSE    FALSE    FALSE    FALSE    TRUE  0.3906250
13:   TRUE    TRUE    TRUE  FALSE     TRUE    FALSE     TRUE    TRUE  0.3203125
14:   TRUE   FALSE    TRUE   TRUE     TRUE    FALSE     TRUE    TRUE  0.2851562
15:  FALSE    TRUE    TRUE  FALSE     TRUE    FALSE    FALSE   FALSE  0.2812500
16:  FALSE   FALSE   FALSE  FALSE    FALSE     TRUE    FALSE   FALSE  0.3632812
17:   TRUE    TRUE   FALSE  FALSE     TRUE    FALSE     TRUE   FALSE  0.3281250
18:  FALSE   FALSE   FALSE  FALSE     TRUE    FALSE    FALSE   FALSE  0.3359375
19:  FALSE   FALSE    TRUE   TRUE    FALSE    FALSE    FALSE   FALSE  0.3281250
20:   TRUE    TRUE   FALSE  FALSE    FALSE    FALSE    FALSE   FALSE  0.3554688
       age glucose insulin   mass pedigree pregnant pressure triceps classif.ce
    runtime_learners           timestamp batch_nr warnings errors
               <num>              <POSc>    <int>    <int>  <int>
 1:             0.00 2024-06-25 23:43:51        1        0      0
 2:             0.00 2024-06-25 23:43:51        1        0      0
 3:             0.01 2024-06-25 23:43:51        1        0      0
 4:             0.00 2024-06-25 23:43:51        1        0      0
 5:             0.00 2024-06-25 23:43:51        1        0      0
 6:             0.02 2024-06-25 23:43:51        1        0      0
 7:             0.00 2024-06-25 23:43:51        1        0      0
 8:             0.02 2024-06-25 23:43:51        1        0      0
 9:             0.01 2024-06-25 23:43:51        1        0      0
10:             0.00 2024-06-25 23:43:51        1        0      0
11:             0.00 2024-06-25 23:43:51        2        0      0
12:             0.00 2024-06-25 23:43:51        2        0      0
13:             0.01 2024-06-25 23:43:51        2        0      0
14:             0.00 2024-06-25 23:43:51        2        0      0
15:             0.00 2024-06-25 23:43:51        2        0      0
16:             0.01 2024-06-25 23:43:51        2        0      0
17:             0.02 2024-06-25 23:43:51        2        0      0
18:             0.00 2024-06-25 23:43:51        2        0      0
19:             0.00 2024-06-25 23:43:51        2        0      0
20:             0.01 2024-06-25 23:43:51        2        0      0
    runtime_learners           timestamp batch_nr warnings errors
                                          features n_features  resample_result
                                            <list>     <list>           <list>
 1: age,glucose,insulin,mass,pedigree,pregnant,...          8 <ResampleResult>
 2:                                glucose,triceps          2 <ResampleResult>
 3:  age,glucose,insulin,pedigree,pressure,triceps          6 <ResampleResult>
 4:                  insulin,mass,pedigree,triceps          4 <ResampleResult>
 5:                               glucose,pressure          2 <ResampleResult>
 6:                       age,glucose,mass,triceps          4 <ResampleResult>
 7:                                   glucose,mass          2 <ResampleResult>
 8:                                    age,glucose          2 <ResampleResult>
 9: age,glucose,insulin,mass,pedigree,pregnant,...          7 <ResampleResult>
10: age,glucose,insulin,mass,pedigree,pregnant,...          8 <ResampleResult>
11:                                glucose,insulin          2 <ResampleResult>
12:                                    age,triceps          2 <ResampleResult>
13:  age,glucose,insulin,pedigree,pressure,triceps          6 <ResampleResult>
14:     age,insulin,mass,pedigree,pressure,triceps          6 <ResampleResult>
15:                       glucose,insulin,pedigree          3 <ResampleResult>
16:                                       pregnant          1 <ResampleResult>
17:                  age,glucose,pedigree,pressure          4 <ResampleResult>
18:                                       pedigree          1 <ResampleResult>
19:                                   insulin,mass          2 <ResampleResult>
20:                                    age,glucose          2 <ResampleResult>
                                          features n_features  resample_result
Code
# Apply the selected features to the model
task$select(instance$result_feature_set) # keep only the selected features
learner$train(task)
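
To make the wrapper idea concrete, here is a simplified base R sketch of random search over feature subsets. It is not how mlr3fselect is implemented; it only mirrors the logic of the example above (random subsets, a single holdout split, classification error as the measure) using `rpart` and the built-in `iris` data. The helper `holdout_ce()` and the `archive` data frame are hypothetical names introduced for illustration.

```r
library(rpart)

set.seed(1)
features <- setdiff(names(iris), "Species")
train_idx <- sample(nrow(iris), size = round(0.67 * nrow(iris)))

# Holdout classification error of an rpart model using only `feats`
holdout_ce <- function(feats) {
  form <- reformulate(feats, response = "Species")
  fit <- rpart(form, data = iris[train_idx, ])
  pred <- predict(fit, newdata = iris[-train_idx, ], type = "class")
  mean(pred != iris$Species[-train_idx])
}

# Draw random non-empty feature subsets and record the error of each
n_evals <- 10
archive <- do.call(rbind, lapply(seq_len(n_evals), function(i) {
  keep <- sample(c(TRUE, FALSE), length(features), replace = TRUE)
  if (!any(keep)) keep[sample(length(features), 1)] <- TRUE
  feats <- features[keep]
  data.frame(
    features = paste(feats, collapse = ","),
    n_features = length(feats),
    ce = holdout_ce(feats)
  )
}))

# The selected subset is the one with the lowest holdout error
best <- archive[which.min(archive$ce), ]
best
```

The `archive` here plays the same role as `instance$archive` above: one row per evaluated subset, with the winning row reported as the result.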

Automatic feature selection

Code
task <- tsk("penguins")
split <- partition(task, ratio = 0.8)

afs <- auto_fselector(
  fselector = fs("random_search"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4
)


afs$train(task, row_ids = split$train)
INFO  [23:43:51.746] [mlr3] Running benchmark with 10 resampling iterations
INFO  [23:43:51.750] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.759] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.768] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.776] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.784] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.793] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.802] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.810] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.818] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.826] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [23:43:51.834] [mlr3] Finished benchmark
Code
afs$predict(task, row_ids = split$test)
<PredictionClassif> for 69 observations:
    row_ids     truth  response
          5    Adelie    Adelie
          9    Adelie    Adelie
         13    Adelie    Adelie
---                            
        327 Chinstrap Chinstrap
        335 Chinstrap Chinstrap
        341 Chinstrap Chinstrap

pipelines

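mlr3pipelines lets you combine data preprocessing, modelling, model comparison, and ensemble learning in a single workflow.

Basic usage

Combining preprocessing and feature selection

Code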
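graph <-
  po("imputehist", # impute missing values by sampling from a histogram
    id = "impute_num", # give this step its own id
    # affect_columns expects a Selector (a function of the task), not is.numeric
    affect_columns = selector_type(c("integer", "numeric"))
  ) %>>%
  po("imputeoor", id = "impute_fct", affect_columns = selector_type("factor")) %>>% # impute factors
  po("filter", mlr3filters::flt("information_gain"),
    filter.frac = 0.95
  ) %>>%
  po("encode", method = "one-hot") %>>%
  po("learner", lrn("classif.rpart"))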
graph$plot()

Code
task <- tsk("pima")
lrn_graph <- as_learner(graph)
lrn_graph$train(task)
pred <- lrn_graph$predict(task)
pred$confusion
        truth
response pos neg
     pos 207  73
     neg  61 427
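
The confusion matrix above is computed on the same rows the graph was trained on, so it is likely to look better than performance on unseen data. A minimal sketch of a less optimistic estimate, assuming a reduced graph (histogram imputation followed by `classif.rpart`) and 3-fold cross-validation; the number of folds and the seed are arbitrary choices for illustration.

```r
library(mlr3)
library(mlr3pipelines)

task <- tsk("pima")

# A reduced version of the graph: imputation followed by a decision tree
graph <- po("imputehist") %>>% po("learner", lrn("classif.rpart"))
lrn_graph <- as_learner(graph)

# Cross-validate the whole graph so imputation is refitted on each training fold
set.seed(1)
rr <- resample(task, lrn_graph, rsmp("cv", folds = 3))
ce <- rr$aggregate(msr("classif.ce"))
ce
```

Because the graph is wrapped as a learner, preprocessing is learned only from the training part of each fold, which avoids leaking information from the test rows.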

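Chunked training

For large datasets, the data can be split into chunks, a model trained on each chunk, and the per-chunk predictions combined at the end.

Code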
chks <- po("chunk", 4)
lrns <- ppl("greplicate", po("learner", lrn("classif.rpart")), 4)

mjv <- po("classifavg", 4)

pipeline <- chks %>>% lrns %>>% mjv
pipeline$plot(html = FALSE)

Code
task <- tsk("iris")
split <- partition(task, ratio = 0.7, stratify = TRUE)

pipelrn <- as_learner(pipeline)
pipelrn$train(task, split$train)$
  predict(task, split$test)$
  score(msr("classif.acc"))
classif.acc 
  0.7777778 

Bagging

Code
single_pred <- po("subsample", frac = 0.7) %>>%
  po("learner", lrn("classif.rpart")) # one model on a 70% subsample

pred_set <- ppl("greplicate", single_pred, 10L) # replicate it 10 times

bagging <- pred_set %>>%
  po("classifavg", innum = 10)

bagging$plot(html = FALSE)

Code
task <- tsk("iris")
split <- partition(task, ratio = 0.7, stratify = TRUE)


baglrn <- as_learner(bagging)
baglrn$train(task, row_ids = split$train)
baglrn$predict(task, row_ids = split$test)$
  score(msr("classif.acc"))
classif.acc 
  0.9555556 
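
For intuition, the same bagging scheme can be sketched in base R with `rpart`. This is an illustrative stand-in rather than what the pipeline does internally: it averages class probabilities across trees, whereas the graph above combines the learners' predictions through `classifavg`. The split, seed, and variable names are assumptions made for this example.

```r
library(rpart)

set.seed(1)
train_idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train <- iris[train_idx, ]
test <- iris[-train_idx, ]

# Fit 10 trees, each on a 70% subsample (without replacement) of the training rows
probs <- lapply(seq_len(10), function(i) {
  rows <- sample(nrow(train), size = round(0.7 * nrow(train)))
  fit <- rpart(Species ~ ., data = train[rows, ])
  predict(fit, newdata = test, type = "prob")
})

# Average the class probabilities over all trees, then pick the most likely class
avg_prob <- Reduce(`+`, probs) / length(probs)
pred <- factor(colnames(avg_prob)[max.col(avg_prob)], levels = levels(iris$Species))

acc <- mean(pred == test$Species)
acc
```

Each tree sees a different subsample, so the trees' individual errors partly cancel out when their predictions are averaged.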

Stacking

Code
lrn <- lrn("classif.rpart")
lrn_0 <- po("learner_cv", lrn$clone())
lrn_0$id <- "rpart_cv"
level_0 <- gunion(list(lrn_0, po("nop")))
combined <- level_0 %>>% po("featureunion", 2)
stack <- combined %>>% po("learner", lrn$clone())
stack$plot(html = FALSE)

Code
stacklrn <- as_learner(stack)
stacklrn$train(task, split$train)
INFO  [23:43:53.801] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
INFO  [23:43:53.812] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
INFO  [23:43:53.823] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
Code
stacklrn$predict(task, split$test)$
  score(msr("classif.acc"))
classif.acc 
  0.9555556 

A more complex example

Code
library("magrittr")
library("mlr3learners")

rprt <- lrn("classif.rpart", predict_type = "prob")
glmn <- lrn("classif.glmnet", predict_type = "prob")

# Create the base learners
lrn_0 <- po("learner_cv", rprt, id = "rpart_cv_1")
lrn_0$param_set$values$maxdepth <- 5L
lrn_1 <- po("pca", id = "pca1") %>>% po("learner_cv", rprt, id = "rpart_cv_2")
lrn_1$param_set$values$rpart_cv_2.maxdepth <- 1L
lrn_2 <- po("pca", id = "pca2") %>>% po("learner_cv", glmn)

# Level 0
level_0 <- gunion(list(lrn_0, lrn_1, lrn_2, po("nop", id = "NOP1")))

# Level 1
level_1 <- level_0 %>>%
  po("featureunion", 4) %>>%
  po("copy", 3) %>>%
  gunion(list(
    po("learner_cv", rprt, id = "rpart_cv_l1"),
    po("learner_cv", glmn, id = "glmnt_cv_l1"),
    po("nop", id = "NOP_l1")
  ))

# Level 2
level_2 <- level_1 %>>%
  po("featureunion", 3, id = "u2") %>>%
  po("learner", rprt, id = "rpart_l2")


level_2$plot(html = FALSE)

Code
task <- tsk("iris")
lrn <- as_learner(level_2)

lrn$train(task, split$train)$
  predict(task, split$test)$
  score(msr("classif.acc"))
INFO  [23:43:54.627] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
INFO  [23:43:54.638] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
INFO  [23:43:54.649] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
INFO  [23:43:54.730] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
INFO  [23:43:54.758] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
INFO  [23:43:54.775] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
INFO  [23:43:55.029] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 1/3)
INFO  [23:43:55.066] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 2/3)
INFO  [23:43:55.094] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 3/3)
INFO  [23:43:55.192] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
INFO  [23:43:55.218] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
INFO  [23:43:55.244] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
INFO  [23:43:55.351] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 1/3)
INFO  [23:43:55.391] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 2/3)
INFO  [23:43:55.430] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 3/3)
classif.acc 
  0.9555556 