Machine Learning with mlr3verse

R
Machine learning
Author

不止BI

Published

April 4, 2024

Modified

June 11, 2024

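mlr3 is an R package for machine learning. It is the successor to the mlr package and aims to provide more powerful and more flexible machine learning functionality. mlr3 offers a modular framework that makes it easy to carry out a wide range of machine learning tasks.

mlr3verse is a collection of R packages that extends the mlr3 framework. It bundles packages for data preprocessing, feature selection, model tuning, and evaluation. The collection includes mlr3, mlr3learners, mlr3pipelines, mlr3tuning, mlr3viz, and others; each package has a specific purpose, so you can pick the ones that fit your needs.

Creating a machine learning task

In mlr3, data must first be wrapped in a task object; all subsequent machine learning operations work on that task.

Code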
library(mlr3verse)
tsk_penguins <- as_task_classif(palmerpenguins::penguins,
  target = "species",
  id = "penguins"
)

tsk_penguins
<TaskClassif:penguins> (344 x 8)
* Target: species
* Properties: multiclass
* Features (7):
  - int (3): body_mass_g, flipper_length_mm, year
  - dbl (2): bill_depth_mm, bill_length_mm
  - fct (2): island, sex
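
The same pattern applies to other task types. As a minimal sketch (not part of the original walkthrough), the block below wraps the built-in mtcars data frame in a regression task with as_task_regr(); the object name tsk_cars and the id "cars" are made up for illustration.

```r
library(mlr3)

# Hypothetical example: predict fuel consumption (mpg) from the built-in mtcars data
tsk_cars <- as_task_regr(mtcars, target = "mpg", id = "cars")

tsk_cars$task_type      # type of the task
tsk_cars$target_names   # name of the target column
tsk_cars$feature_names  # all remaining columns are treated as features
```

Choosing as_task_regr() or as_task_classif() tells mlr3 which learners and measures are applicable to the task.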

Common task methods

Code
# Inspect the column structure
tsk_penguins$col_info
Key: <id>
                  id    type                  levels  label fix_factor_levels
              <char>  <char>                  <list> <char>            <lgcl>
1:          ..row_id integer                           <NA>             FALSE
2:     bill_depth_mm numeric                           <NA>             FALSE
3:    bill_length_mm numeric                           <NA>             FALSE
4:       body_mass_g integer                           <NA>             FALSE
5: flipper_length_mm integer                           <NA>             FALSE
6:            island  factor  Biscoe,Dream,Torgersen   <NA>             FALSE
7:               sex  factor             female,male   <NA>             FALSE
8:           species  factor Adelie,Chinstrap,Gentoo   <NA>             FALSE
9:              year integer                           <NA>             FALSE
Code
# View the data
tsk_penguins$data(1:5)
   species bill_depth_mm bill_length_mm body_mass_g flipper_length_mm    island
    <fctr>         <num>          <num>       <int>             <int>    <fctr>
1:  Adelie          18.7           39.1        3750               181 Torgersen
2:  Adelie          17.4           39.5        3800               186 Torgersen
3:  Adelie          18.0           40.3        3250               195 Torgersen
4:  Adelie            NA             NA          NA                NA Torgersen
5:  Adelie          19.3           36.7        3450               193 Torgersen
      sex  year
   <fctr> <int>
1:   male  2007
2: female  2007
3: female  2007
4:   <NA>  2007
5: female  2007
Code
# Count missing values per column
tsk_penguins$missings()
          species     bill_depth_mm    bill_length_mm       body_mass_g 
                0                 2                 2                 2 
flipper_length_mm            island               sex              year 
                2                 0                11                 0 
Code
tsk_penguins$formula()
species ~ .
NULL

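Data preprocessing

Data preprocessing in mlr3 is implemented by mlr3pipelines. Every preprocessing step in mlr3pipelines is identified by a string key, which can be hard to remember; you can look the keys up in the official documentation, or inspect mlr_pipeops to see the available operations and their descriptions.

Code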
library(DT)
datatable(as.data.table(mlr_pipeops), options = list(
  language = list(url = "https://cdn.datatables.net/plug-ins/1.10.11/i18n/Chinese.json"),
  pageLength = 5
))

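As the table above shows, mlr3verse provides a rich set of data processing methods. Using tasks built into mlr3, such as iris, the examples below walk through some commonly used preprocessing steps.

Code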
task <- tsk("iris")
task
<TaskClassif:iris> (150 x 5): Iris Flowers
* Target: Species
* Properties: multiclass
* Features (4):
  - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width

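Centering and scaling

Every preprocessing step in mlr3pipelines is created with the po() function. In addition to its own preprocessing parameters, each po accepts an id that serves as the unique name of the step, which makes the step easy to locate later.

The affect_columns parameter controls which columns a step applies to; the selector_* family of functions offers a convenient way to choose columns: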

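  1. selector_all(): selects all features.

  2. selector_none(): selects no features.

  3. selector_type(types): selects features by type, e.g. character or numeric.

  4. selector_grep(pattern, ignore.case = FALSE, perl = FALSE, fixed = FALSE): selects features whose names match a regular expression.

  5. selector_name(feature_names, assert_present = FALSE): selects features by name.

  6. selector_invert(selector): inverts a selection, i.e. selects every feature that the given selector does not.

  7. selector_intersect(selector_x, selector_y): selects the intersection of two selectors.

  8. selector_union(selector_x, selector_y): selects the union of two selectors.

  9. selector_setdiff(selector_x, selector_y): selects the set difference of two selectors.

  10. selector_missing(): selects features that contain missing values.

  11. selector_cardinality_greater_than(min_cardinality): selects categorical features whose cardinality (number of unique values) exceeds the given value.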
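
Selectors are ordinary functions that take a task and return feature names, so they can be tried out directly before being passed to affect_columns. The following is a small sketch on the built-in iris task; the variable names (sel_petal, sel_combined, sel_rest) are made up for illustration.

```r
library(mlr3)
library(mlr3pipelines)

task <- tsk("iris")

# Features whose names start with "Petal"
sel_petal <- selector_grep("^Petal")
sel_petal(task)

# Union of the regex selection and one explicitly named feature
sel_combined <- selector_union(sel_petal, selector_name("Sepal.Length"))
sel_combined(task)

# Everything that sel_combined does not select
sel_rest <- selector_invert(sel_combined)
sel_rest(task)
```

Calling a selector on a task like this is a quick way to check which columns a step would touch.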

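Code
# Define the preprocessing step
pos <- po(
  "scale",
  center = TRUE, # center each column (subtract its mean)
  scale = FALSE, # do not divide by the standard deviation
  affect_columns = selector_name(c("Petal.Length", "Petal.Width", "Sepal.Length")), # columns to transform
  id = "scale_iris" # name of this step
)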
pos
PipeOp: <scale_iris> (not trained)
values: <robust=FALSE, center=TRUE, scale=FALSE, affect_columns=<Selector>>
Input channels <name [train type, predict type]>:
  input [Task,Task]
Output channels <name [train type, predict type]>:
  output [Task,Task]
Code
# Train the step and extract the transformed data
pos$train(list(task))[[1]]$data()
       Species Petal.Length Petal.Width Sepal.Length Sepal.Width
        <fctr>        <num>       <num>        <num>       <num>
  1:    setosa       -2.358  -0.9993333  -0.74333333         3.5
  2:    setosa       -2.358  -0.9993333  -0.94333333         3.0
  3:    setosa       -2.458  -0.9993333  -1.14333333         3.2
  4:    setosa       -2.258  -0.9993333  -1.24333333         3.1
  5:    setosa       -2.358  -0.9993333  -0.84333333         3.6
 ---                                                            
146: virginica        1.442   1.1006667   0.85666667         3.0
147: virginica        1.242   0.7006667   0.45666667         2.5
148: virginica        1.442   0.8006667   0.65666667         3.0
149: virginica        1.642   1.1006667   0.35666667         3.4
150: virginica        1.342   0.6006667   0.05666667         3.0

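Handling missing values

  • imputelearner: model-based imputation

  • imputemean: mean imputation

  • imputemedian: median imputation

  • imputeconstant: imputation with a constant

  • imputehist: imputation by sampling from a histogram

  • imputemode: mode imputation

  • imputesample: imputation by random sampling from the observed values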
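
The exact keys are easy to mistype, so it can help to list them from the dictionary itself. The sketch below filters the keys of mlr_pipeops with a regular expression; the variable name impute_keys is made up, and the set of keys returned depends on the installed mlr3pipelines version.

```r
library(mlr3pipelines)

# List every registered PipeOp whose key starts with "impute"
impute_keys <- mlr_pipeops$keys("^impute")
impute_keys
```

Any key from this vector can be passed to po(), for example po("imputemedian").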

Mean imputation

Code
task <- tsk("pima")
task$missings()
diabetes      age  glucose  insulin     mass pedigree pregnant pressure 
       0        0        5      374       11        0        0       35 
 triceps 
     227 
Code
# Mean imputation (the object is named po_mean so that it does not shadow the po() function)
po_mean <- po("imputemean")
new_task <- po_mean$train(list(task = task))[[1]]
new_task$missings()
diabetes      age pedigree pregnant  glucose  insulin     mass pressure 
       0        0        0        0        0        0        0        0 
 triceps 
       0 

Model-based imputation

Code
task <- tsk("pima")
task$missings()
diabetes      age  glucose  insulin     mass pedigree pregnant pressure 
       0        0        5      374       11        0        0       35 
 triceps 
     227 
Code
# Imputation with a decision tree (regr.rpart predicts each missing value)
po_tree <- po("imputelearner", lrn("regr.rpart"))
new_task <- po_tree$train(list(task = task))[[1]]
new_task$missings()
diabetes      age pedigree pregnant  glucose  insulin     mass pressure 
       0        0        0        0        0        0        0        0 
 triceps 
       0 

Feature selection

Code
# task = mlr3::tsk("mtcars")
# filter = flt("find_correlation")
# filter$calculate(task)
# as.data.table(filter)
library(dplyr) # provides %>%, used in the discretization example
task <- tsk("mtcars")
pos <-
  # drop highly correlated columns
  po("filter",
    filter = mlr3filters::flt("find_correlation"),
    filter.cutoff = 0.1
  ) %>>%
  # drop constant columns
  po("removeconstants") %>>%
  # keep the half of the remaining features with the highest variance
  po("filter",
    filter = mlr3filters::flt("variance"),
    filter.frac = 0.5
  )
pos$train(task)[[1]]$data()
      mpg  carb  disp    hp  qsec
    <num> <num> <num> <num> <num>
 1:  21.0     4 160.0   110 16.46
 2:  21.0     4 160.0   110 17.02
 3:  22.8     1 108.0    93 18.61
 4:  21.4     1 258.0   110 19.44
 5:  18.7     2 360.0   175 17.02
 6:  18.1     1 225.0   105 20.22
 7:  14.3     4 360.0   245 15.84
 8:  24.4     2 146.7    62 20.00
 9:  22.8     2 140.8    95 22.90
10:  19.2     4 167.6   123 18.30
11:  17.8     4 167.6   123 18.90
12:  16.4     3 275.8   180 17.40
13:  17.3     3 275.8   180 17.60
14:  15.2     3 275.8   180 18.00
15:  10.4     4 472.0   205 17.98
16:  10.4     4 460.0   215 17.82
17:  14.7     4 440.0   230 17.42
18:  32.4     1  78.7    66 19.47
19:  30.4     2  75.7    52 18.52
20:  33.9     1  71.1    65 19.90
21:  21.5     1 120.1    97 20.01
22:  15.5     2 318.0   150 16.87
23:  15.2     2 304.0   150 17.30
24:  13.3     4 350.0   245 15.41
25:  19.2     2 400.0   175 17.05
26:  27.3     1  79.0    66 18.90
27:  26.0     2 120.3    91 16.70
28:  30.4     2  95.1   113 16.90
29:  15.8     4 351.0   264 14.50
30:  19.7     6 145.0   175 15.50
31:  15.0     8 301.0   335 14.60
32:  21.4     2 121.0   109 18.60
      mpg  carb  disp    hp  qsec

Encoding

One-hot encoding

Code
data <- data.table::data.table(x = factor(letters[1:3]), y = factor(letters[1:3]))
task <- as_task_classif(data, target = "y")

poe <- po("encode", method = "one-hot")

# "one-hot" is the default method
poe$train(list(task))[[1]]$data()
        y   x.a   x.b   x.c
   <fctr> <num> <num> <num>
1:      a     1     0     0
2:      b     0     1     0
3:      c     0     0     1

Treatment encoding:

Code
poe$param_set$values$method <- "treatment"
poe$train(list(task))[[1]]$data()
        y   x.b   x.c
   <fctr> <num> <num>
1:      a     0     0
2:      b     1     0
3:      c     0     1

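Other operations

Imbalanced data

  • ratio: target class size as a multiple of the reference class size;

  • reference: which class serves as the reference;

  • adjust: which classes to resample, which determines whether the step oversamples or undersamples;

  • shuffle: whether to shuffle the resulting rows (default TRUE)

Code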
data(hacide, package = "ROSE")

table(hacide.train$cls)

  0   1 
980  20 
Code
task <- as_task_classif(hacide.train, target = "cls")
pos <- po("classbalancing",
  ratio = 1,
  reference = "major",
  adjust = "all",
  shuffle = TRUE
)
balanced <- pos$train(list(task))[[1]]$data()
table(balanced$cls)

  0   1 
980 980 
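
The example above oversamples the minority class up to the size of the majority class. The opposite direction, undersampling the majority class, can be sketched with the same PipeOp by changing reference and adjust. The data here are synthetic and the names toy, task_toy, and po_down are made up for illustration; treat this as a sketch of the parameter semantics, not a recommendation.

```r
library(mlr3)
library(mlr3pipelines)

set.seed(1)

# Synthetic, hypothetical data: 90 rows of class "0" and 10 rows of class "1"
toy <- data.frame(
  x = rnorm(100),
  cls = factor(rep(c("0", "1"), times = c(90, 10)))
)
task_toy <- as_task_classif(toy, target = "cls")

# Shrink only the majority class to 1x the size of the minority class
po_down <- po("classbalancing",
  ratio = 1,
  reference = "minor",
  adjust = "major",
  shuffle = FALSE
)
downsampled <- po_down$train(list(task_toy))[[1]]$data()
table(downsampled$cls)
```

Undersampling discards data, so it tends to make sense only when the majority class is large enough to spare rows.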

Discretization

Code
task <- tsk("mtcars")
pos <- po("quantilebin", numsplits = 10, affect_columns = selector_name(c("disp", "hp")))

pos$train(list(task))[[1]]$data() %>% head()
     mpg       disp         hp    am  carb   cyl  drat  gear  qsec    vs    wt
   <num>      <ord>      <ord> <num> <num> <num> <num> <num> <num> <num> <num>
1:  21.0  (142,160]  (106,110]     1     4     6  3.90     4 16.46     0 2.620
2:  21.0  (142,160]  (106,110]     1     4     6  3.90     4 17.02     0 2.875
3:  22.8 (80.6,120]  (66,93.4]     1     1     4  3.85     4 18.61     1 2.320
4:  21.4  (196,276]  (106,110]     0     1     6  3.08     3 19.44     1 3.215
5:  18.7  (351,396]  (165,178]     0     2     8  3.15     3 17.02     0 3.440
6:  18.1  (196,276] (93.4,106]     0     1     6  2.76     3 20.22     1 3.460

General workflow

Building a learner

List the algorithms supported by mlr3

Code
mlr_learners
<DictionaryLearner> with 49 stored values
Keys: classif.cv_glmnet, classif.debug, classif.featureless,
  classif.glmnet, classif.kknn, classif.lda, classif.log_reg,
  classif.multinom, classif.naive_bayes, classif.nnet, classif.qda,
  classif.ranger, classif.rpart, classif.svm, classif.xgboost,
  clust.agnes, clust.ap, clust.cmeans, clust.cobweb, clust.dbscan,
  clust.dbscan_fpc, clust.diana, clust.em, clust.fanny,
  clust.featureless, clust.ff, clust.hclust, clust.hdbscan,
  clust.kkmeans, clust.kmeans, clust.MBatchKMeans, clust.mclust,
  clust.meanshift, clust.optics, clust.pam, clust.SimpleKMeans,
  clust.xmeans, regr.cv_glmnet, regr.debug, regr.featureless,
  regr.glmnet, regr.kknn, regr.km, regr.lm, regr.nnet, regr.ranger,
  regr.rpart, regr.svm, regr.xgboost

Create a learner

Code
learner <- lrn("classif.rpart")
learner
<LearnerClassifRpart:classif.rpart>: Classification Tree
* Model: -
* Parameters: xval=0
* Packages: mlr3, rpart
* Predict Types:  [response], prob
* Feature Types: logical, integer, numeric, factor, ordered
* Properties: importance, missings, multiclass, selected_features,
  twoclass, weights

Inspect the hyperparameters supported by the learner

Code
learner$param_set
<ParamSet>
                id    class lower upper nlevels
            <char>   <char> <num> <num>   <num>
 1:             cp ParamDbl     0     1     Inf
 2:     keep_model ParamLgl    NA    NA       2
 3:     maxcompete ParamInt     0   Inf     Inf
 4:       maxdepth ParamInt     1    30      30
 5:   maxsurrogate ParamInt     0   Inf     Inf
 6:      minbucket ParamInt     1   Inf     Inf
 7:       minsplit ParamInt     1   Inf     Inf
 8: surrogatestyle ParamInt     0     1       2
 9:   usesurrogate ParamInt     0     2       3
10:           xval ParamInt     0   Inf     Inf
                                                                                      default
                                                                                       <list>
 1:                                                                                      0.01
 2:                                                                                     FALSE
 3:                                                                                         4
 4:                                                                                        30
 5:                                                                                         5
 6: <NoDefault>\n  Public:\n    clone: function (deep = FALSE) \n    initialize: function () 
 7:                                                                                        20
 8:                                                                                         0
 9:                                                                                         2
10:                                                                                        10
     value
    <list>
 1:       
 2:       
 3:       
 4:       
 5:       
 6:       
 7:       
 8:       
 9:       
10:      0

Set learner parameters

Code
learner <- lrn("classif.rpart", xval = 0, cp = 0.001)
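
Hyperparameters can also be changed after the learner has been created, through the values field of its parameter set. A minimal sketch follows; the specific values are arbitrary and chosen only for illustration.

```r
library(mlr3)

learner <- lrn("classif.rpart")

# Update individual hyperparameters on the existing learner
learner$param_set$values$cp <- 0.01
learner$param_set$values$maxdepth <- 5L

# Show every value that is currently set explicitly
learner$param_set$values
```

Values outside the bounds listed in param_set (for example a maxdepth above 30) are rejected when they are assigned.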

Splitting the dataset

Setting stratify enables stratified sampling on the target

Code
task <- tsk("penguins") # use a built-in dataset
split <- partition(task, ratio = 0.6, stratify = TRUE)

Training the model

Code
learner$train(task, row_ids = split$train)

Prediction

Code
prediction <- learner$predict(task, row_ids = split$test)
print(prediction)
<PredictionClassif> for 138 observations:
    row_ids     truth  response
          1    Adelie    Adelie
          3    Adelie    Adelie
          4    Adelie    Adelie
---                            
        337 Chinstrap Chinstrap
        338 Chinstrap Chinstrap
        344 Chinstrap Chinstrap

Evaluating the model

Code
prediction$confusion
           truth
response    Adelie Chinstrap Gentoo
  Adelie        61         3      0
  Chinstrap      0        24      2
  Gentoo         0         0     48
Code
autoplot(prediction)

Code
# List the supported measures
# msrs()
measures <- msrs(c("classif.acc", "classif.ce"))
prediction$score(measures)
classif.acc  classif.ce 
 0.96376812  0.03623188 

Advanced usage

Model comparison

benchmark() compares multiple learners across multiple tasks and multiple resampling strategies in a single run.

Code
design <- benchmark_grid(
  tasks = tsks(c("spam", "german_credit", "sonar")),
  learners = lrns(c("classif.ranger", "classif.rpart", "classif.featureless"), predict_type = "prob"),
  resamplings = rsmps(c("holdout", "cv"))
)
print(design)
             task             learner resampling
           <char>              <char>     <char>
 1:          spam      classif.ranger    holdout
 2:          spam      classif.ranger         cv
 3:          spam       classif.rpart    holdout
 4:          spam       classif.rpart         cv
 5:          spam classif.featureless    holdout
 6:          spam classif.featureless         cv
 7: german_credit      classif.ranger    holdout
 8: german_credit      classif.ranger         cv
 9: german_credit       classif.rpart    holdout
10: german_credit       classif.rpart         cv
11: german_credit classif.featureless    holdout
12: german_credit classif.featureless         cv
13:         sonar      classif.ranger    holdout
14:         sonar      classif.ranger         cv
15:         sonar       classif.rpart    holdout
16:         sonar       classif.rpart         cv
17:         sonar classif.featureless    holdout
18:         sonar classif.featureless         cv

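Holdout and CV (cross-validation) are both methods for estimating the performance of a machine learning model, but they differ in some key ways.

Holdout

The holdout method splits the dataset once into a training set, used to fit the model, and a test set, used to evaluate it. The split proportion is usually fixed, for example 70% training and 30% test.

Holdout is simple to use, but it has the following drawbacks:

  • The estimate depends on how the data happen to be split. If the training and test sets follow different distributions, for example, the estimate may be inaccurate.

  • The model is trained on only part of the data and evaluated on a single test set, so the performance estimate can be unreliable.

CV

CV splits the dataset into several subsets (folds) and lets each one take a turn as the test set while the remaining subsets form the training set. Every data point is therefore used for both training and testing, which makes the performance estimate more reliable.

Common variants are k-fold cross-validation and leave-one-out cross-validation. k-fold cross-validation splits the data into k subsets; each subset serves as the test set once while the other k-1 subsets form the training set. Leave-one-out cross-validation splits the data into n subsets of one data point each; every data point serves as the test set once while the other n-1 points form the training set.

CV is more complex than holdout and more expensive, since the model is trained once per fold, but it has the following advantages:

  • CV makes full use of the data and yields a more reliable performance estimate.

  • CV can be used to select the best hyperparameters.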
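
To make the difference concrete, the sketch below instantiates both resampling strategies on the built-in iris task and inspects the resulting splits. The ratio of 0.7 and the 5 folds are illustrative choices, not the settings used in the benchmark above.

```r
library(mlr3)

task <- tsk("iris") # 150 rows

# Holdout: one split into training and test rows
holdout <- rsmp("holdout", ratio = 0.7)
holdout$instantiate(task)
length(holdout$train_set(1))
length(holdout$test_set(1))

# 5-fold CV: five splits, each fold is the test set exactly once
cv5 <- rsmp("cv", folds = 5)
cv5$instantiate(task)
cv5$iters
test_sizes <- sapply(seq_len(cv5$iters), function(i) length(cv5$test_set(i)))
test_sizes

# Across all folds, every row appears in a test set exactly once
all_test_rows <- unlist(lapply(seq_len(cv5$iters), function(i) cv5$test_set(i)))
```

The same resampling objects can be passed to resample() or benchmark_grid() to evaluate a learner.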

Code
bmr <- benchmark(design, store_models = TRUE)
INFO  [11:58:33.785] [mlr3] Running benchmark with 99 resampling iterations
INFO  [11:58:33.829] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 1/1)
INFO  [11:58:35.345] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 1/10)
INFO  [11:58:36.419] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 2/10)
INFO  [11:58:37.694] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 3/10)
INFO  [11:58:38.746] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 4/10)
INFO  [11:58:40.047] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 5/10)
INFO  [11:58:41.111] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 6/10)
INFO  [11:58:42.148] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 7/10)
INFO  [11:58:43.456] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 8/10)
INFO  [11:58:44.498] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 9/10)
INFO  [11:58:45.532] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 10/10)
INFO  [11:58:46.569] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 1/1)
INFO  [11:58:46.922] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 1/10)
INFO  [11:58:46.977] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 2/10)
INFO  [11:58:47.033] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 3/10)
INFO  [11:58:47.094] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 4/10)
INFO  [11:58:47.151] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 5/10)
INFO  [11:58:47.207] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 6/10)
INFO  [11:58:47.268] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 7/10)
INFO  [11:58:47.324] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 8/10)
INFO  [11:58:47.379] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 9/10)
INFO  [11:58:47.440] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 10/10)
INFO  [11:58:47.497] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 1/1)
INFO  [11:58:47.508] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 1/10)
INFO  [11:58:47.519] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 2/10)
INFO  [11:58:47.530] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 3/10)
INFO  [11:58:47.540] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 4/10)
INFO  [11:58:47.551] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 5/10)
INFO  [11:58:47.562] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 6/10)
INFO  [11:58:47.573] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 7/10)
INFO  [11:58:47.585] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 8/10)
INFO  [11:58:47.596] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 9/10)
INFO  [11:58:47.607] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 10/10)
INFO  [11:58:47.619] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 1/1)
INFO  [11:58:47.764] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 1/10)
INFO  [11:58:47.937] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 2/10)
INFO  [11:58:48.113] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 3/10)
INFO  [11:58:48.284] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 4/10)
INFO  [11:58:48.452] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 5/10)
INFO  [11:58:48.620] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 6/10)
INFO  [11:58:49.100] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 7/10)
INFO  [11:58:49.274] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 8/10)
INFO  [11:58:49.447] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 9/10)
INFO  [11:58:49.617] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 10/10)
INFO  [11:58:49.786] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 1/1)
INFO  [11:58:49.802] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 1/10)
INFO  [11:58:49.819] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 2/10)
INFO  [11:58:49.835] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 3/10)
INFO  [11:58:49.855] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 4/10)
INFO  [11:58:49.876] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 5/10)
INFO  [11:58:49.896] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 6/10)
INFO  [11:58:49.916] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 7/10)
INFO  [11:58:49.936] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 8/10)
INFO  [11:58:49.955] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 9/10)
INFO  [11:58:49.975] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 10/10)
INFO  [11:58:50.003] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 1/1)
INFO  [11:58:50.011] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 1/10)
INFO  [11:58:50.020] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 2/10)
INFO  [11:58:50.028] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 3/10)
INFO  [11:58:50.036] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 4/10)
INFO  [11:58:50.045] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 5/10)
INFO  [11:58:50.053] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 6/10)
INFO  [11:58:50.061] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 7/10)
INFO  [11:58:50.069] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 8/10)
INFO  [11:58:50.078] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 9/10)
INFO  [11:58:50.086] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 10/10)
INFO  [11:58:50.095] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 1/1)
INFO  [11:58:50.148] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 1/10)
INFO  [11:58:50.216] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 2/10)
INFO  [11:58:50.279] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 3/10)
INFO  [11:58:50.345] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 4/10)
INFO  [11:58:50.410] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 5/10)
INFO  [11:58:50.478] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 6/10)
INFO  [11:58:50.562] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 7/10)
INFO  [11:58:50.629] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 8/10)
INFO  [11:58:50.694] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 9/10)
INFO  [11:58:50.758] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 10/10)
INFO  [11:58:50.823] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 1/1)
INFO  [11:58:50.838] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 1/10)
INFO  [11:58:50.856] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 2/10)
INFO  [11:58:50.874] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 3/10)
INFO  [11:58:50.890] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 4/10)
INFO  [11:58:50.907] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 5/10)
INFO  [11:58:51.258] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 6/10)
INFO  [11:58:51.275] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 7/10)
INFO  [11:58:51.291] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 8/10)
INFO  [11:58:51.307] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 9/10)
INFO  [11:58:51.323] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 10/10)
INFO  [11:58:51.339] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 1/1)
INFO  [11:58:51.346] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 1/10)
INFO  [11:58:51.355] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 2/10)
INFO  [11:58:51.363] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 3/10)
INFO  [11:58:51.371] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 4/10)
INFO  [11:58:51.379] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 5/10)
INFO  [11:58:51.387] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 6/10)
INFO  [11:58:51.395] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 7/10)
INFO  [11:58:51.403] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 8/10)
INFO  [11:58:51.411] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 9/10)
INFO  [11:58:51.419] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 10/10)
INFO  [11:58:51.466] [mlr3] Finished benchmark
Code
measures <- msrs(c("classif.acc", "classif.mcc"))

bmr$aggregate(measures)
       nr       task_id          learner_id resampling_id iters classif.acc
    <int>        <char>              <char>        <char> <int>       <num>
 1:     1          spam      classif.ranger       holdout     1   0.9498044
 2:     2          spam      classif.ranger            cv    10   0.9517467
 3:     3          spam       classif.rpart       holdout     1   0.9087353
 4:     4          spam       classif.rpart            cv    10   0.8902367
 5:     5          spam classif.featureless       holdout     1   0.5906128
 6:     6          spam classif.featureless            cv    10   0.6059483
 7:     7 german_credit      classif.ranger       holdout     1   0.7477477
 8:     8 german_credit      classif.ranger            cv    10   0.7660000
 9:     9 german_credit       classif.rpart       holdout     1   0.7027027
10:    10 german_credit       classif.rpart            cv    10   0.7440000
11:    11 german_credit classif.featureless       holdout     1   0.7087087
12:    12 german_credit classif.featureless            cv    10   0.7000000
13:    13         sonar      classif.ranger       holdout     1   0.7971014
14:    14         sonar      classif.ranger            cv    10   0.8409524
15:    15         sonar       classif.rpart       holdout     1   0.7681159
16:    16         sonar       classif.rpart            cv    10   0.7409524
17:    17         sonar classif.featureless       holdout     1   0.4927536
18:    18         sonar classif.featureless            cv    10   0.5335714
    classif.mcc
          <num>
 1:   0.8961122
 2:   0.8989061
 3:   0.8110336
 4:   0.7693295
 5:   0.0000000
 6:   0.0000000
 7:   0.3376976
 8:   0.3921118
 9:   0.1888653
10:   0.3472153
11:   0.0000000
12:   0.0000000
13:   0.5941176
14:   0.6989740
15:   0.5620063
16:   0.4877863
17:   0.0000000
18:   0.0000000
Hidden columns: resample_result
Code
library(ggplot2)
autoplot(bmr)

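Hyperparameter tuning

In practice, machine learning models often underperform. Every model ships with default hyperparameters, but the defaults are not necessarily the best choice for your data; when that is the case, hyperparameter tuning is needed.

mlr3 includes automatic tuning strategies. Automatic tuning requires the following to be specified:

  • Search space: the range of values each hyperparameter may take.

  • Optimization algorithm: the algorithm used to search for the best configuration.

  • Evaluation method: the resampling method used to estimate model performance.

  • Performance measure: the metric used to quantify model performance.

Tuning example

Code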
library(mlr3verse)
task = tsk('pima')
learner = lrn('classif.rpart')
# Inspect the hyperparameters supported by the algorithm
learner$param_set
<ParamSet>
                id    class lower upper nlevels
            <char>   <char> <num> <num>   <num>
 1:             cp ParamDbl     0     1     Inf
 2:     keep_model ParamLgl    NA    NA       2
 3:     maxcompete ParamInt     0   Inf     Inf
 4:       maxdepth ParamInt     1    30      30
 5:   maxsurrogate ParamInt     0   Inf     Inf
 6:      minbucket ParamInt     1   Inf     Inf
 7:       minsplit ParamInt     1   Inf     Inf
 8: surrogatestyle ParamInt     0     1       2
 9:   usesurrogate ParamInt     0     2       3
10:           xval ParamInt     0   Inf     Inf
                                                                                      default
                                                                                       <list>
 1:                                                                                      0.01
 2:                                                                                     FALSE
 3:                                                                                         4
 4:                                                                                        30
 5:                                                                                         5
 6: <NoDefault>\n  Public:\n    clone: function (deep = FALSE) \n    initialize: function () 
 7:                                                                                        20
 8:                                                                                         0
 9:                                                                                         2
10:                                                                                        10
     value
    <list>
 1:       
 2:       
 3:       
 4:       
 5:       
 6:       
 7:       
 8:       
 9:       
10:      0

Define the search space

Code
search_space = ps(
  cp = p_dbl(lower = 0.001, upper = 0.1), # complexity parameter
  minsplit = p_int(lower = 1,upper = 10)
)
search_space
<ParamSet>
         id    class lower upper nlevels
     <char>   <char> <num> <num>   <num>
1:       cp ParamDbl 0.001   0.1     Inf
2: minsplit ParamInt 1.000  10.0      10
                                                                                     default
                                                                                      <list>
1: <NoDefault>\n  Public:\n    clone: function (deep = FALSE) \n    initialize: function () 
2: <NoDefault>\n  Public:\n    clone: function (deep = FALSE) \n    initialize: function () 
    value
   <list>
1:       
2:       
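
To see which configurations a grid search would draw from this search space, the grid can be generated directly with paradox. This is a sketch for inspection only; resolution = 5 mirrors the tuner setting used in the tuning call of this walkthrough, and the variable name grid is made up.

```r
library(paradox)

search_space <- ps(
  cp = p_dbl(lower = 0.001, upper = 0.1),
  minsplit = p_int(lower = 1, upper = 10)
)

# Five values per hyperparameter, fully crossed
grid <- generate_design_grid(search_space, resolution = 5)
nrow(grid$data)
head(grid$data)
```

With a budget of term_evals = 10, as far as I understand only a subset of this grid is evaluated before the terminator stops the search, so a larger budget is needed to cover the full grid.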

Set the resampling method and performance measures

Code
cv = rsmp('cv')
measures = msrs(c("classif.ce","time_train",'classif.acc'))

Code
library(mlr3tuning)
# Stop after 10 evaluations; mlr_terminators lists the other supported termination criteria, e.g. run_time limits the running time

instance <- tune(
  tuner = tnr("grid_search",  resolution = 5, batch_size = 2),
  task = task,
  learner = learner,
  resampling = cv,
  measures = measures,
  search_space = search_space,
  term_evals = 10
)
INFO  [11:58:53.292] [bbotk] Starting to optimize 2 parameter(s) with '<TunerGridSearch>' and '<TerminatorEvals> [n_evals=10, k=0]'
INFO  [11:58:53.295] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:53.302] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:53.306] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:53.322] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:53.337] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:53.351] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:53.368] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:53.387] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:53.404] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:53.420] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:53.438] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:53.455] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:53.472] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:53.488] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:53.504] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:53.521] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:53.537] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:53.554] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:53.570] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:53.586] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:53.602] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:53.618] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:53.635] [mlr3] Finished benchmark
INFO  [11:58:53.773] [bbotk] Result of batch 1:
INFO  [11:58:53.775] [bbotk]     cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [11:58:53.775] [bbotk]  0.001        5  0.2878332      0.000   0.7121668        0      0
INFO  [11:58:53.775] [bbotk]  0.100        5  0.2643882      0.006   0.7356118        0      0
INFO  [11:58:53.775] [bbotk]  runtime_learners                                uhash
INFO  [11:58:53.775] [bbotk]              0.08 4dfa04c1-efa6-43e9-a7bb-d78c9f2983c2
INFO  [11:58:53.775] [bbotk]              0.11 091311c6-610c-480f-8c24-72e5abdff0dd
INFO  [11:58:53.776] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:53.781] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:53.785] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:53.798] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:53.812] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:53.825] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:53.839] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:53.855] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:53.872] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:53.895] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:53.909] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:53.923] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:53.937] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:53.950] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:53.963] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:53.976] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:53.990] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:54.003] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:54.016] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:54.029] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:54.042] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:54.055] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:54.071] [mlr3] Finished benchmark
INFO  [11:58:54.210] [bbotk] Result of batch 2:
INFO  [11:58:54.212] [bbotk]       cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [11:58:54.212] [bbotk]  0.02575        3  0.2540157      0.003   0.7459843        0      0
INFO  [11:58:54.212] [bbotk]  0.10000        1  0.2643882      0.002   0.7356118        0      0
INFO  [11:58:54.212] [bbotk]  runtime_learners                                uhash
INFO  [11:58:54.212] [bbotk]              0.06 19c97e3a-d62a-4734-8f41-2de444f8d075
INFO  [11:58:54.212] [bbotk]              0.06 92ad6388-43ef-461d-9c16-f22fad542e25
INFO  [11:58:54.213] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:54.220] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:54.224] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:54.239] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:54.253] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:54.267] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:54.282] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:54.296] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:54.310] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:54.323] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:54.337] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:54.350] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:54.364] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:54.377] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:54.390] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:54.403] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:54.416] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:54.430] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:54.443] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:54.456] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:54.477] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:54.491] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:54.506] [mlr3] Finished benchmark
INFO  [11:58:54.643] [bbotk] Result of batch 3:
INFO  [11:58:54.644] [bbotk]       cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [11:58:54.644] [bbotk]  0.02575        5  0.2540157      0.004   0.7459843        0      0
INFO  [11:58:54.644] [bbotk]  0.10000        3  0.2643882      0.003   0.7356118        0      0
INFO  [11:58:54.644] [bbotk]  runtime_learners                                uhash
INFO  [11:58:54.644] [bbotk]              0.07 9234cfca-25f4-45bc-aa8c-fdd686482ffe
INFO  [11:58:54.644] [bbotk]              0.10 6753a206-77fd-48d9-9779-0ef3097834f0
INFO  [11:58:54.645] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:54.650] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:54.654] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:54.669] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:54.683] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:54.697] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:54.712] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:54.726] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:54.740] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:54.753] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:54.766] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:54.780] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:54.794] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:54.808] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:54.822] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:54.836] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:54.852] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:54.867] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:54.881] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:54.895] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:54.909] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:54.925] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:54.940] [mlr3] Finished benchmark
INFO  [11:58:55.086] [bbotk] Result of batch 4:
INFO  [11:58:55.088] [bbotk]       cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [11:58:55.088] [bbotk]  0.07525        5  0.2514012      0.008   0.7485988        0      0
INFO  [11:58:55.088] [bbotk]  0.07525       10  0.2514012      0.000   0.7485988        0      0
INFO  [11:58:55.088] [bbotk]  runtime_learners                                uhash
INFO  [11:58:55.088] [bbotk]              0.08 906c74c7-8bef-4b8e-8b54-12af0a6a3b7f
INFO  [11:58:55.088] [bbotk]              0.04 d52ce222-14cc-4c65-9202-cf8d15ec6547
INFO  [11:58:55.088] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:55.093] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:55.097] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:55.112] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:55.126] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:55.141] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:55.155] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:55.169] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:55.183] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:55.200] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:55.216] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:55.231] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:55.245] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:55.258] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:55.271] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:55.284] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:55.297] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:55.310] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:55.323] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:55.336] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:55.350] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:55.363] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:55.377] [mlr3] Finished benchmark
INFO  [11:58:55.517] [bbotk] Result of batch 5:
INFO  [11:58:55.518] [bbotk]     cp minsplit classif.ce time_train classif.acc warnings errors
INFO  [11:58:55.518] [bbotk]  0.001        1  0.3008202      0.005   0.6991798        0      0
INFO  [11:58:55.518] [bbotk]  0.100        8  0.2643882      0.005   0.7356118        0      0
INFO  [11:58:55.518] [bbotk]  runtime_learners                                uhash
INFO  [11:58:55.518] [bbotk]              0.11 9a5f38ba-76e6-4276-bfe7-d13c59189357
INFO  [11:58:55.518] [bbotk]              0.06 3b580416-a657-4364-bde8-cfa02c05383b
INFO  [11:58:55.522] [bbotk] Finished optimizing after 10 evaluation(s)
INFO  [11:58:55.522] [bbotk] Result:
INFO  [11:58:55.524] [bbotk]       cp minsplit learner_param_vals  x_domain classif.ce time_train
INFO  [11:58:55.524] [bbotk]    <num>    <int>             <list>    <list>      <num>      <num>
INFO  [11:58:55.524] [bbotk]  0.07525       10          <list[3]> <list[2]>  0.2514012          0
INFO  [11:58:55.524] [bbotk]  classif.acc
INFO  [11:58:55.524] [bbotk]        <num>
INFO  [11:58:55.524] [bbotk]    0.7485988
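Code
# Search method: grid_search is grid search, random_search is random search
# Note: resolution = 5 builds a uniform 5 x 5 grid over cp and minsplit, i.e. 25 candidate configurations. Because the budget was capped at 10 evaluations above, the search stops after 10.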
instance
<TuningInstanceMultiCrit>
* State:  Optimized
* Objective: <ObjectiveTuning:classif.rpart_on_pima>
* Search Space:
         id    class lower upper nlevels
     <char>   <char> <num> <num>   <num>
1:       cp ParamDbl 0.001   0.1     Inf
2: minsplit ParamInt 1.000  10.0      10
* Terminator: <TerminatorEvals>
* Result:
        cp minsplit classif.ce time_train classif.acc
     <num>    <int>      <num>      <num>       <num>
1: 0.07525       10  0.2514012          0   0.7485988
* Archive:
         cp minsplit classif.ce time_train classif.acc
      <num>    <int>      <num>      <num>       <num>
 1: 0.00100        5  0.2878332      0.000   0.7121668
 2: 0.10000        5  0.2643882      0.006   0.7356118
 3: 0.02575        3  0.2540157      0.003   0.7459843
 4: 0.10000        1  0.2643882      0.002   0.7356118
 5: 0.02575        5  0.2540157      0.004   0.7459843
 6: 0.10000        3  0.2643882      0.003   0.7356118
 7: 0.07525        5  0.2514012      0.008   0.7485988
 8: 0.07525       10  0.2514012      0.000   0.7485988
 9: 0.00100        1  0.3008202      0.005   0.6991798
10: 0.10000        8  0.2643882      0.005   0.7356118
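
The 5 x 5 grid mentioned in the comment can be inspected without running any tuning. This is a sketch that assumes the same search space as in this section and uses generate_design_grid() from paradox; the variable name `design` is illustrative.

```r
library(paradox)

# Same search space as used for tuning in this section
search_space <- ps(
  cp = p_dbl(0.001, 0.1),
  minsplit = p_int(1, 10)
)

# resolution = 5 places 5 levels on each parameter
design <- generate_design_grid(search_space, resolution = 5)

print(nrow(design$data))             # number of candidate configurations
print(sort(unique(design$data$cp)))  # the cp levels on the grid
```

Only 10 of these candidates are evaluated because of term_evals = 10, and the grid points are visited in random order, which is why the archive does not look sorted.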

View the tuning results

Code
instance
<TuningInstanceMultiCrit>
* State:  Optimized
* Objective: <ObjectiveTuning:classif.rpart_on_pima>
* Search Space:
         id    class lower upper nlevels
     <char>   <char> <num> <num>   <num>
1:       cp ParamDbl 0.001   0.1     Inf
2: minsplit ParamInt 1.000  10.0      10
* Terminator: <TerminatorEvals>
* Result:
        cp minsplit classif.ce time_train classif.acc
     <num>    <int>      <num>      <num>       <num>
1: 0.07525       10  0.2514012          0   0.7485988
* Archive:
         cp minsplit classif.ce time_train classif.acc
      <num>    <int>      <num>      <num>       <num>
 1: 0.00100        5  0.2878332      0.000   0.7121668
 2: 0.10000        5  0.2643882      0.006   0.7356118
 3: 0.02575        3  0.2540157      0.003   0.7459843
 4: 0.10000        1  0.2643882      0.002   0.7356118
 5: 0.02575        5  0.2540157      0.004   0.7459843
 6: 0.10000        3  0.2643882      0.003   0.7356118
 7: 0.07525        5  0.2514012      0.008   0.7485988
 8: 0.07525       10  0.2514012      0.000   0.7485988
 9: 0.00100        1  0.3008202      0.005   0.6991798
10: 0.10000        8  0.2643882      0.005   0.7356118

Apply the tuned hyperparameters to the model and retrain

View the tuned hyperparameters

Code
instance$result_learner_param_vals
[[1]]
[[1]]$xval
[1] 0

[[1]]$cp
[1] 0.07525

[[1]]$minsplit
[1] 10
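
The result above is wrapped in an outer list ([[1]]) because three measures were supplied, which makes the instance multi-criteria and lets it return several configurations. As a hedged sketch for comparison, tuning on a single measure returns one configuration as a plain named list. The smaller budget (3 folds, 4 evaluations) and the name `instance_single` are choices made here to keep the example fast, not settings from the original run.

```r
library(mlr3verse)

# Optional: silence the per-fold log lines
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

set.seed(1)
instance_single <- tune(
  tuner = tnr("grid_search", resolution = 5, batch_size = 2),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),  # one measure: single-criterion tuning
  search_space = ps(cp = p_dbl(0.001, 0.1), minsplit = p_int(1, 10)),
  term_evals = 4
)

# One best configuration, stored as a named list (no outer [[1]] needed)
best <- instance_single$result_learner_param_vals
str(best)
```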

Model performance

Code
instance$result_y
   classif.ce time_train classif.acc
        <num>      <num>       <num>
1:  0.2514012          0   0.7485988

Set the selected hyperparameters on the learner

Code
learner$param_set$values <- instance$result_learner_param_vals[[1]]
learner$train(task)

pred = learner$predict(task)

pred$confusion
        truth
response pos neg
     pos 150  58
     neg 118 442
Code
pred$score(msr('classif.acc'))
classif.acc 
  0.7708333 
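
The accuracy above is computed on the same rows the model was trained on, so it is likely to be optimistic. A minimal sketch of a holdout check follows, assuming the tuned values cp = 0.07525 and minsplit = 10 from the result above. The 70/30 ratio, the seed, and the names `split` and `pred_test` are illustrative choices.

```r
library(mlr3verse)

task <- tsk("pima")
learner <- lrn("classif.rpart", cp = 0.07525, minsplit = 10)

# Random train/test split of the row ids
set.seed(1)
split <- partition(task, ratio = 0.7)

# Fit on the training rows only, then score on the held-out rows
learner$train(task, row_ids = split$train)
pred_test <- learner$predict(task, row_ids = split$test)
acc_test <- pred_test$score(msr("classif.acc"))
print(acc_test)
```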

auto_tuner wraps the tuning in a learner (auto_learner below) that returns the best model directly

Code
task <- tsk("pima") 
learner <- lrn("classif.rpart") 
search_space <- ps(
  cp = p_dbl(0.001, 0.1),
  minsplit = p_int(1,10)
) 
cv = rsmp('cv')
measures = msr('classif.acc')

auto_learner <- auto_tuner(
  tuner = tnr("random_search", batch_size = 2),
  learner = learner,
  resampling = cv,
  measure = measures,
  search_space = search_space,
  term_evals = 10
)
auto_learner$train(task)
INFO  [11:58:55.720] [bbotk] Starting to optimize 2 parameter(s) with '<OptimizerRandomSearch>' and '<TerminatorEvals> [n_evals=10, k=0]'
INFO  [11:58:55.729] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:55.734] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:55.738] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:55.752] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:55.766] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:55.779] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:55.792] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:55.805] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:55.819] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:55.832] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:55.846] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:55.861] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:55.875] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:55.889] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:55.902] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:55.914] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:55.928] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:55.941] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:55.954] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:55.967] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:55.979] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:55.992] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:56.006] [mlr3] Finished benchmark
INFO  [11:58:56.036] [bbotk] Result of batch 1:
INFO  [11:58:56.038] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [11:58:56.038] [bbotk]  0.04293148        3   0.7434381        0      0             0.07
INFO  [11:58:56.038] [bbotk]  0.09595728        7   0.7265550        0      0             0.04
INFO  [11:58:56.038] [bbotk]                                 uhash
INFO  [11:58:56.038] [bbotk]  123e5148-854f-4375-aff2-c6a0eb48bb69
INFO  [11:58:56.038] [bbotk]  ba1b0a22-9ac6-436b-9a95-91f4f072cd41
INFO  [11:58:56.040] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:56.045] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:56.049] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:56.063] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:56.077] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:56.090] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:56.103] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:56.117] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:56.130] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:56.145] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:56.158] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:56.172] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:56.186] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:56.200] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:56.216] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:56.230] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:56.244] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:56.267] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:56.281] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:56.295] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:56.308] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:56.321] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:56.335] [mlr3] Finished benchmark
INFO  [11:58:56.369] [bbotk] Result of batch 2:
INFO  [11:58:56.370] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [11:58:56.370] [bbotk]  0.08062288        2    0.739542        0      0             0.11
INFO  [11:58:56.370] [bbotk]  0.08577322        9    0.739542        0      0             0.04
INFO  [11:58:56.370] [bbotk]                                 uhash
INFO  [11:58:56.370] [bbotk]  65cee486-1b24-4bd2-93fe-b3f2ae83c0f2
INFO  [11:58:56.370] [bbotk]  199449db-2685-46e4-8374-d26a78d36aee
INFO  [11:58:56.372] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:56.377] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:56.382] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:56.396] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:56.409] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:56.422] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:56.436] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:56.449] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:56.462] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:56.477] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:56.491] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:56.504] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:56.518] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:56.532] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:56.546] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:56.560] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:56.574] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:56.589] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:56.603] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:56.617] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:56.630] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:56.643] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:56.658] [mlr3] Finished benchmark
INFO  [11:58:56.690] [bbotk] Result of batch 3:
INFO  [11:58:56.691] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [11:58:56.691] [bbotk]  0.03510606       10   0.7434381        0      0             0.09
INFO  [11:58:56.691] [bbotk]  0.06361289       10   0.7395420        0      0             0.08
INFO  [11:58:56.691] [bbotk]                                 uhash
INFO  [11:58:56.691] [bbotk]  6ef18e95-cae4-41bf-b8e9-84dbcecdfed1
INFO  [11:58:56.691] [bbotk]  7d28e3a7-fa7a-43d8-8f93-bdc3e2f317e9
INFO  [11:58:56.693] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:56.699] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:56.703] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:56.717] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:56.730] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:56.743] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:56.757] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:56.770] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:56.783] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:56.796] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:56.810] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:56.823] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:56.836] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:56.851] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:56.866] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:56.880] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:56.894] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:56.916] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:56.929] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:56.943] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:56.956] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:56.969] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:56.983] [mlr3] Finished benchmark
INFO  [11:58:57.014] [bbotk] Result of batch 4:
INFO  [11:58:57.015] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [11:58:57.015] [bbotk]  0.07434293        8    0.739542        0      0             0.07
INFO  [11:58:57.015] [bbotk]  0.07073870        7    0.739542        0      0             0.05
INFO  [11:58:57.015] [bbotk]                                 uhash
INFO  [11:58:57.015] [bbotk]  9e3c62ee-a9a4-4b58-80ca-edb7305c1b48
INFO  [11:58:57.015] [bbotk]  6d75f6eb-14cf-4b2b-b7e3-3c49ced4f650
INFO  [11:58:57.017] [bbotk] Evaluating 2 configuration(s)
INFO  [11:58:57.022] [mlr3] Running benchmark with 20 resampling iterations
INFO  [11:58:57.026] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:57.040] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:57.054] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:57.067] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:57.081] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:57.094] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:57.106] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:57.119] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:57.133] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:57.145] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:57.159] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/10)
INFO  [11:58:57.172] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 2/10)
INFO  [11:58:57.187] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 3/10)
INFO  [11:58:57.202] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 4/10)
INFO  [11:58:57.216] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 5/10)
INFO  [11:58:57.230] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 6/10)
INFO  [11:58:57.244] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 7/10)
INFO  [11:58:57.257] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 8/10)
INFO  [11:58:57.270] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 9/10)
INFO  [11:58:57.284] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 10/10)
INFO  [11:58:57.299] [mlr3] Finished benchmark
INFO  [11:58:57.331] [bbotk] Result of batch 5:
INFO  [11:58:57.333] [bbotk]          cp minsplit classif.acc warnings errors runtime_learners
INFO  [11:58:57.333] [bbotk]  0.08316705        3   0.7395420        0      0             0.08
INFO  [11:58:57.333] [bbotk]  0.03915675        5   0.7434381        0      0             0.04
INFO  [11:58:57.333] [bbotk]                                 uhash
INFO  [11:58:57.333] [bbotk]  20dd0911-42d0-4a26-9033-052d6a850c81
INFO  [11:58:57.333] [bbotk]  aecbc8e8-ec86-47c5-9ec4-ea3f6531ee2e
INFO  [11:58:57.337] [bbotk] Finished optimizing after 10 evaluation(s)
INFO  [11:58:57.337] [bbotk] Result:
INFO  [11:58:57.338] [bbotk]          cp minsplit learner_param_vals  x_domain classif.acc
INFO  [11:58:57.338] [bbotk]       <num>    <int>             <list>    <list>       <num>
INFO  [11:58:57.338] [bbotk]  0.04293148        3          <list[3]> <list[2]>   0.7434381

The model tuned by auto_tuner can be used directly to predict new data

Code
auto_learner$predict(task)
<PredictionClassif> for 768 observations:
    row_ids truth response
          1   pos      pos
          2   neg      neg
          3   pos      neg
---                       
        766   neg      neg
        767   pos      neg
        768   neg      neg
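
After training, the chosen configuration can be read back from the auto-tuned learner. The sketch below is a reduced version of the setup above (3 folds and 4 evaluations, chosen here only for speed) and also lowers the log threshold so the per-fold INFO lines are not printed. The variable name `at` is illustrative.

```r
library(mlr3verse)

# Only show warnings and errors from mlr3 and bbotk
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

at <- auto_tuner(
  tuner = tnr("random_search", batch_size = 2),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.acc"),
  search_space = ps(cp = p_dbl(0.001, 0.1), minsplit = p_int(1, 10)),
  term_evals = 4
)

set.seed(1)
at$train(tsk("pima"))

print(at$tuning_result)             # best configuration and its CV accuracy
print(at$learner$param_set$values)  # values used for the final fit
```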

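Feature selection

When we build a model from a dataset, many variables carry no useful information about the target. Including them in the model only adds noise and degrades performance.

Feature selection is the process of removing uninformative or redundant variables and keeping the relevant ones.

mlr3 can compute a score for every predictor with a chosen method, then rank and filter the predictors by that score.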
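
To make the score-then-rank idea concrete, here is a small base R sketch that does not use mlr3 at all. It scores each predictor by its absolute Pearson correlation with the target, which is the same idea as a correlation-based filter. The dataset (mtcars), the target (mpg), and the cutoff of three features are assumptions made only for illustration.

```r
# Hypothetical illustration on the built-in mtcars data: predict mpg
target <- "mpg"
features <- setdiff(names(mtcars), target)

# Score each feature by its absolute correlation with the target
scores <- sapply(features, function(f) abs(cor(mtcars[[f]], mtcars[[target]])))

# Rank from most to least relevant, then keep the top 3
ranked <- sort(scores, decreasing = TRUE)
selected <- names(ranked)[1:3]

print(ranked)
print(selected)
```

A filter in mlr3 follows the same pattern: compute one score per feature, sort, and keep the top-ranked features or those above a threshold.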

List the supported scoring methods

Code
as.data.table(mlr_filters)
Key: <key>
                  key                                                    label
               <char>                                                   <char>
 1:             anova                                             ANOVA F-Test
 2:               auc                           Area Under the ROC Curve Score
 3:            boruta                                                   Burota
 4:          carscore                   Correlation-Adjusted coRrelation Score
 5:      carsurvscore          Correlation-Adjusted coRrelation Survival Score
 6:              cmim      Minimal Conditional Mutual Information Maximization
 7:       correlation                                              Correlation
 8:              disr                       Double Input Symmetrical Relevance
 9:  find_correlation                                  Correlation-based Score
10:        importance                                         Importance Score
11:  information_gain                                         Information Gain
12:               jmi                                 Joint Mutual Information
13:              jmim            Minimal Joint Mutual Information Maximization
14:      kruskal_test                                      Kruskal-Wallis Test
15:               mim                          Mutual Information Maximization
16:              mrmr                     Minimum Redundancy Maximal Relevancy
17:             njmim Minimal Normalised Joint Mutual Information Maximization
18:       performance                                   Predictive Performance
19:       permutation                                        Permutation Score
20:            relief                                                   RELIEF
21: selected_features                               Embedded Feature Selection
22:    univariate_cox                            Univariate Cox Survival Score
23:          variance                                                 Variance
                  key                                                    label
      task_types task_properties
          <list>          <list>
 1:      classif                
 2:      classif        twoclass
 3: regr,classif                
 4:         regr                
 5:         surv                
 6: classif,regr                
 7:         regr                
 8: classif,regr                
 9:           NA                
10:      classif                
11: classif,regr                
12: classif,regr                
13: classif,regr                
14:      classif                
15: classif,regr                
16: classif,regr                
17: classif,regr                
18:      classif                
19:      classif                
20: classif,regr                
21:      classif                
22:         surv                
23:           NA                
      task_types task_properties
                                                 params
                                                 <list>
 1:                                                    
 2:                                                    
 3: pValue,mcAdj,maxRuns,doTrace,holdHistory,getImp,...
 4:                             lambda,diagonal,verbose
 5:                                  maxIPCweight,denom
 6:                                             threads
 7:                                          use,method
 8:                                             threads
 9:                                          use,method
10:                                              method
11:                     type,equal,discIntegers,threads
12:                                             threads
13:                                             threads
14:                                           na.action
15:                                             threads
16:                                             threads
17:                                             threads
18:                                              method
19:                                     standardize,nmc
20:                          neighboursCount,sampleSize
21:                                              method
22:                                                    
23:                                               na.rm
                                                 params
                                           feature_types          packages
                                                  <list>            <list>
 1:                                      integer,numeric             stats
 2:                                      integer,numeric      mlr3measures
 3:                                      integer,numeric            Boruta
 4:                              logical,integer,numeric              care
 5:                                      integer,numeric carSurv,mlr3proba
 6:                       integer,numeric,factor,ordered           praznik
 7:                                      integer,numeric             stats
 8:                       integer,numeric,factor,ordered           praznik
 9:                                      integer,numeric             stats
10: logical,integer,numeric,character,factor,ordered,...              mlr3
11:                       integer,numeric,factor,ordered     FSelectorRcpp
12:                       integer,numeric,factor,ordered           praznik
13:                       integer,numeric,factor,ordered           praznik
14:                                      integer,numeric             stats
15:                       integer,numeric,factor,ordered           praznik
16:                       integer,numeric,factor,ordered           praznik
17:                       integer,numeric,factor,ordered           praznik
18: logical,integer,numeric,character,factor,ordered,... mlr3,mlr3measures
19: logical,integer,numeric,character,factor,ordered,... mlr3,mlr3measures
20:                       integer,numeric,factor,ordered     FSelectorRcpp
21: logical,integer,numeric,character,factor,ordered,...              mlr3
22:                              integer,numeric,logical          survival
23:                                      integer,numeric             stats
                                           feature_types          packages

Computing scores

Code
filter = flt('jmim')

task = tsk('iris')

filter$calculate(task)

filter
<FilterJMIM:jmim>: Minimal Joint Mutual Information Maximization
Task Types: classif, regr
Properties: -
Task Properties: -
Packages: praznik
Feature types: integer, numeric, factor, ordered
        feature     score
1:  Petal.Width 1.0000000
2: Sepal.Length 0.6666667
3: Petal.Length 0.3333333
4:  Sepal.Width 0.0000000

Filtering by correlation

Code
task = tsk('mtcars')

filter_cor <- flt("correlation")


filter_cor$param_set
<ParamSet>
       id    class lower upper nlevels    default  value
   <char>   <char> <num> <num>   <int>     <list> <list>
1:    use ParamFct    NA    NA       5 everything       
2: method ParamFct    NA    NA       3    pearson       
Code
filter_cor$param_set$values$method <- "spearman" # set one value without overwriting the others
filter_cor$param_set
<ParamSet>
       id    class lower upper nlevels    default    value
   <char>   <char> <num> <num>   <int>     <list>   <list>
1:    use ParamFct    NA    NA       5 everything         
2: method ParamFct    NA    NA       3    pearson spearman
Code
filter_cor$calculate(task)

filter_cor
<FilterCorrelation:correlation>: Correlation
Task Types: regr
Properties: missings
Task Properties: -
Packages: stats
Feature types: integer, numeric
    feature     score
 1:     cyl 0.9108013
 2:    disp 0.9088824
 3:      hp 0.8946646
 4:      wt 0.8864220
 5:      vs 0.7065968
 6:    carb 0.6574976
 7:    drat 0.6514555
 8:      am 0.5620057
 9:    gear 0.5427816
10:    qsec 0.4669358
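
The scores computed above can be used directly to subset a task. The following is a minimal sketch, assuming the same `mtcars` task and Spearman correlation filter. Keeping the top three features is an arbitrary choice for illustration, and `top3` and `task_top` are names introduced here, not part of the original example.

```r
library(mlr3)
library(mlr3filters)

task <- tsk("mtcars")
filter_cor <- flt("correlation", method = "spearman")
filter_cor$calculate(task)

# $scores is a named numeric vector, sorted in decreasing order
top3 <- names(head(filter_cor$scores, 3))

# Work on a clone so the original task keeps all of its features
task_top <- task$clone()
task_top$select(top3)
task_top$feature_names
```

Cloning first matters because `$select()` changes the task it is called on.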

Computing variable importance

Code
# Use a distinct variable name so the lrn() function is not shadowed
lrn_ranger <- lrn("classif.ranger", importance = "impurity")

task <- tsk("iris")
filter <- flt("importance", learner = lrn_ranger)
filter$calculate(task)
filter
<FilterImportance:importance>: Importance Score
Task Types: classif
Properties: -
Task Properties: -
Packages: mlr3, mlr3learners, ranger
Feature types: logical, integer, numeric, character, factor, ordered
        feature     score
1: Petal.Length 46.696384
2:  Petal.Width 41.213500
3: Sepal.Length  9.011293
4:  Sepal.Width  2.300987

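Wrapper (feature-combination) methods

As in hyperparameter tuning, models are fitted on different feature subsets, and the subset with the best model performance is selected.

Code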
library(mlr3fselect)

task <- tsk("pima")
learner <- lrn("classif.rpart")
hout <- rsmp("holdout")
measure <- msr("classif.ce")

evals20 <- trm("evals", n_evals = 20) # stop after 20 evaluated feature subsets

instance <- FSelectInstanceSingleCrit$new(
  task = task,
  learner = learner,
  resampling = hout,
  measure = measure,
  terminator = evals20
)
instance
<FSelectInstanceSingleCrit>
* State:  Not optimized
* Objective: <ObjectiveFSelect:classif.rpart_on_pima>
* Terminator: <TerminatorEvals>
Code
fselector <- fs("random_search")

lgr::get_logger("bbotk")$set_threshold("warn")

fselector$optimize(instance)
INFO  [11:58:57.797] [mlr3] Running benchmark with 10 resampling iterations
INFO  [11:58:57.801] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.813] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.823] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.834] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.845] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.858] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.869] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.880] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.891] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.902] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:57.913] [mlr3] Finished benchmark
INFO  [11:58:58.000] [mlr3] Running benchmark with 10 resampling iterations
INFO  [11:58:58.004] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.016] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.028] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.038] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.050] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.061] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.071] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.083] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.094] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.105] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1)
INFO  [11:58:58.116] [mlr3] Finished benchmark
      age glucose insulin   mass pedigree pregnant pressure triceps
   <lgcl>  <lgcl>  <lgcl> <lgcl>   <lgcl>   <lgcl>   <lgcl>  <lgcl>
1:   TRUE    TRUE    TRUE  FALSE    FALSE    FALSE     TRUE    TRUE
                               features n_features classif.ce
                                 <list>      <int>      <num>
1: age,glucose,insulin,pressure,triceps          5  0.2539062
Code
# Show the selected features
instance$result_feature_set
[1] "age"      "glucose"  "insulin"  "pressure" "triceps" 
Code
# Show the performance of the selected subset
instance$result_y
classif.ce 
 0.2539062 
Code
# Show the full search history
as.data.table(instance$archive)
       age glucose insulin   mass pedigree pregnant pressure triceps classif.ce
    <lgcl>  <lgcl>  <lgcl> <lgcl>   <lgcl>   <lgcl>   <lgcl>  <lgcl>      <num>
 1:  FALSE    TRUE    TRUE  FALSE    FALSE    FALSE    FALSE   FALSE  0.2656250
 2:   TRUE   FALSE    TRUE   TRUE    FALSE     TRUE    FALSE   FALSE  0.2773438
 3:   TRUE   FALSE   FALSE   TRUE     TRUE    FALSE    FALSE    TRUE  0.3046875
 4:   TRUE    TRUE    TRUE   TRUE    FALSE     TRUE     TRUE    TRUE  0.2578125
 5:  FALSE    TRUE    TRUE   TRUE     TRUE    FALSE    FALSE    TRUE  0.2539062
 6:   TRUE   FALSE    TRUE  FALSE    FALSE     TRUE     TRUE   FALSE  0.3359375
 7:   TRUE   FALSE   FALSE   TRUE    FALSE    FALSE    FALSE   FALSE  0.3242188
 8:   TRUE    TRUE    TRUE   TRUE     TRUE     TRUE     TRUE    TRUE  0.2578125
 9:   TRUE    TRUE    TRUE  FALSE     TRUE     TRUE    FALSE    TRUE  0.2812500
10:  FALSE    TRUE    TRUE   TRUE     TRUE     TRUE    FALSE    TRUE  0.2773438
11:   TRUE    TRUE    TRUE   TRUE     TRUE     TRUE     TRUE    TRUE  0.2578125
12:   TRUE    TRUE    TRUE   TRUE     TRUE     TRUE     TRUE    TRUE  0.2578125
13:   TRUE   FALSE   FALSE   TRUE    FALSE    FALSE    FALSE    TRUE  0.3242188
14:   TRUE    TRUE    TRUE  FALSE    FALSE    FALSE     TRUE    TRUE  0.2539062
15:  FALSE    TRUE    TRUE   TRUE    FALSE    FALSE    FALSE    TRUE  0.2773438
16:  FALSE   FALSE   FALSE  FALSE    FALSE     TRUE    FALSE   FALSE  0.3320312
17:   TRUE    TRUE    TRUE   TRUE     TRUE     TRUE     TRUE    TRUE  0.2578125
18:   TRUE    TRUE    TRUE   TRUE    FALSE    FALSE    FALSE    TRUE  0.2734375
19:   TRUE    TRUE    TRUE  FALSE    FALSE     TRUE     TRUE   FALSE  0.2812500
20:   TRUE   FALSE   FALSE   TRUE    FALSE     TRUE     TRUE    TRUE  0.3203125
       age glucose insulin   mass pedigree pregnant pressure triceps classif.ce
    runtime_learners           timestamp batch_nr warnings errors
               <num>              <POSc>    <int>    <int>  <int>
 1:             0.01 2024-06-22 11:58:57        1        0      0
 2:             0.00 2024-06-22 11:58:57        1        0      0
 3:             0.00 2024-06-22 11:58:57        1        0      0
 4:             0.01 2024-06-22 11:58:57        1        0      0
 5:             0.02 2024-06-22 11:58:57        1        0      0
 6:             0.00 2024-06-22 11:58:57        1        0      0
 7:             0.00 2024-06-22 11:58:57        1        0      0
 8:             0.02 2024-06-22 11:58:57        1        0      0
 9:             0.00 2024-06-22 11:58:57        1        0      0
10:             0.00 2024-06-22 11:58:57        1        0      0
11:             0.02 2024-06-22 11:58:58        2        0      0
12:             0.00 2024-06-22 11:58:58        2        0      0
13:             0.00 2024-06-22 11:58:58        2        0      0
14:             0.02 2024-06-22 11:58:58        2        0      0
15:             0.01 2024-06-22 11:58:58        2        0      0
16:             0.00 2024-06-22 11:58:58        2        0      0
17:             0.02 2024-06-22 11:58:58        2        0      0
18:             0.01 2024-06-22 11:58:58        2        0      0
19:             0.00 2024-06-22 11:58:58        2        0      0
20:             0.00 2024-06-22 11:58:58        2        0      0
    runtime_learners           timestamp batch_nr warnings errors
                                          features n_features  resample_result
                                            <list>     <list>           <list>
 1:                                glucose,insulin          2 <ResampleResult>
 2:                      age,insulin,mass,pregnant          4 <ResampleResult>
 3:                      age,mass,pedigree,triceps          4 <ResampleResult>
 4: age,glucose,insulin,mass,pregnant,pressure,...          7 <ResampleResult>
 5:          glucose,insulin,mass,pedigree,triceps          5 <ResampleResult>
 6:                  age,insulin,pregnant,pressure          4 <ResampleResult>
 7:                                       age,mass          2 <ResampleResult>
 8: age,glucose,insulin,mass,pedigree,pregnant,...          8 <ResampleResult>
 9:  age,glucose,insulin,pedigree,pregnant,triceps          6 <ResampleResult>
10: glucose,insulin,mass,pedigree,pregnant,triceps          6 <ResampleResult>
11: age,glucose,insulin,mass,pedigree,pregnant,...          8 <ResampleResult>
12: age,glucose,insulin,mass,pedigree,pregnant,...          8 <ResampleResult>
13:                               age,mass,triceps          3 <ResampleResult>
14:           age,glucose,insulin,pressure,triceps          5 <ResampleResult>
15:                   glucose,insulin,mass,triceps          4 <ResampleResult>
16:                                       pregnant          1 <ResampleResult>
17: age,glucose,insulin,mass,pedigree,pregnant,...          8 <ResampleResult>
18:               age,glucose,insulin,mass,triceps          5 <ResampleResult>
19:          age,glucose,insulin,pregnant,pressure          5 <ResampleResult>
20:             age,mass,pregnant,pressure,triceps          5 <ResampleResult>
                                          features n_features  resample_result
Code
# Apply the selected features to the model
task$select(instance$result_feature_set) # keep only the selected features (modifies the task in place)
learner$train(task)
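
Because `$select()` modifies the task in place, the original `pima` task loses the dropped columns for the rest of the session. A minimal sketch of a non-destructive variant follows. It assumes the feature subset reported in the run above, and `selected` and `task_selected` are names introduced here for illustration.

```r
library(mlr3)

task <- tsk("pima")

# Feature subset taken from the random-search result shown above
selected <- c("age", "glucose", "insulin", "pressure", "triceps")

# Clone first, then restrict the clone to the selected features
task_selected <- task$clone()
task_selected$select(selected)

learner <- lrn("classif.rpart")
learner$train(task_selected)

length(task$feature_names)          # the original task is unchanged
length(task_selected$feature_names) # the clone holds only the subset
```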

Automatic selection

Code
task = tsk("penguins")
split = partition(task, ratio = 0.8)

afs = auto_fselector(
  fselector = fs("random_search"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)


afs$train(task, row_ids = split$train)
INFO  [11:58:58.335] [mlr3] Running benchmark with 10 resampling iterations
INFO  [11:58:58.340] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.348] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.357] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.366] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.374] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.382] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.391] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.399] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.407] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.416] [mlr3] Applying learner 'classif.rpart' on task 'penguins' (iter 1/1)
INFO  [11:58:58.423] [mlr3] Finished benchmark
Code
afs$predict(task, row_ids = split$test)
<PredictionClassif> for 69 observations:
    row_ids     truth  response
         14    Adelie    Adelie
         16    Adelie    Adelie
         24    Adelie    Adelie
---                            
        327 Chinstrap Chinstrap
        329 Chinstrap Chinstrap
        344 Chinstrap Chinstrap

pipelines

mlr3pipelines makes it possible to combine data preprocessing, modelling, model comparison, and ensemble learning in a single workflow.
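
The building block of such a workflow is a single `PipeOp`, which has its own train and predict steps. A minimal sketch using `po("scale")` on the `iris` task is given below. The choice of operator and task is an assumption made only for illustration.

```r
library(mlr3)
library(mlr3pipelines)

task <- tsk("iris")
po_scale <- po("scale")

# train() takes a list of inputs and returns a list of outputs
scaled <- po_scale$train(list(task))[[1]]

# After scaling, every feature is centred around zero
col_means <- colMeans(scaled$data(cols = scaled$feature_names))
round(col_means, 10)

# predict() reuses the centres and scales stored in $state during training
scaled_new <- po_scale$predict(list(task))[[1]]
```

Operators like this one are chained with `%>>%` to form a graph.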

Basic usage

Combining data preprocessing with feature selection

Code
graph =
  po("imputehist", # impute missing values
     id = "impute_num", # rename this step
     # affect_columns expects a Selector, i.e. a function of the task
     affect_columns = selector_type(c("integer", "numeric"))
     ) %>>%
  po("imputeoor", id = "impute_fct", affect_columns = selector_type("factor")) %>>% # impute factors
  po("filter", mlr3filters::flt("information_gain"),
     filter.frac = 0.95) %>>%
  po("encode", method = "one-hot") %>>%
  po("learner", lrn("classif.rpart"))
graph$plot()

Code
task = tsk('pima')
lrn_graph = as_learner(graph)
lrn_graph$train(task)
pred = lrn_graph$predict(task) # predictions on the training data itself
pred$confusion
        truth
response pos neg
     pos 207  73
     neg  61 427
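
The confusion matrix above is computed on the same rows the learner was trained on, so it gives an optimistic picture. Since a graph learner behaves like any other learner, it can be passed to `resample()` for an out-of-sample estimate. This is a minimal sketch that assumes a simplified graph (imputation plus `rpart` only) and 3-fold cross-validation, both chosen here for illustration.

```r
library(mlr3)
library(mlr3pipelines)

set.seed(1)
task <- tsk("pima")

# Simplified graph: histogram imputation followed by a decision tree
graph <- po("imputehist") %>>% po("learner", lrn("classif.rpart"))
lrn_graph <- as_learner(graph)

# Preprocessing is refitted inside every fold, which avoids leakage
rr <- resample(task, lrn_graph, rsmp("cv", folds = 3))
cv_error <- rr$aggregate(msr("classif.ce"))
cv_error
```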

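Chunked training

When the dataset is large, it can be split into chunks, with one model trained per chunk and the per-chunk predictions combined at the end.

Code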
chks = po("chunk", 4)
lrns = ppl("greplicate", po("learner", lrn("classif.rpart")), 4)

mjv = po("classifavg", 4)

pipeline = chks %>>% lrns %>>% mjv
pipeline$plot(html = FALSE)

Code
task = tsk("iris")
split = partition(task, ratio = 0.7, stratify = TRUE)

pipelrn = as_learner(pipeline)
pipelrn$train(task, split$train)$
    predict(task, split$test)$
    score(msr('classif.acc'))
classif.acc 
  0.9333333 
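
To see what the chunking step does on its own, the operator can be trained in isolation. A minimal sketch, assuming the `iris` task and four chunks as above. `chunks` and `chunk_sizes` are names introduced here.

```r
library(mlr3)
library(mlr3pipelines)

task <- tsk("iris")

# During training, po("chunk") splits the rows into `outnum` separate tasks
chunks <- po("chunk", outnum = 4)$train(list(task))

length(chunks)
chunk_sizes <- vapply(chunks, function(x) x$nrow, numeric(1))
chunk_sizes
sum(chunk_sizes) # the chunks together cover every row of the task
```

Each of the four learners in the pipeline therefore sees only about a quarter of the training rows.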

Bagging

Code
single_pred <- po("subsample", frac = 0.7) %>>%
  po("learner", lrn("classif.rpart")) # build a single model

pred_set <- ppl("greplicate", single_pred, 10L) # replicate it 10 times

bagging <- pred_set %>>%
  po("classifavg", innum = 10)

bagging$plot(html = FALSE)

Code
task <- tsk("iris")
split <- partition(task, ratio = 0.7, stratify = TRUE)


baglrn <- as_learner(bagging)
baglrn$train(task, row_ids = split$train)
baglrn$predict(task, row_ids = split$test)$
    score(msr('classif.acc'))
classif.acc 
  0.9111111 

Stacking

Code
lrn_rpart <- lrn("classif.rpart") # distinct name, so the lrn() function is not shadowed
lrn_0 <- po("learner_cv", lrn_rpart$clone())
lrn_0$id <- "rpart_cv"
level_0 <- gunion(list(lrn_0, po("nop")))
combined <- level_0 %>>% po("featureunion", 2)
stack <- combined %>>% po("learner", lrn_rpart$clone())
stack$plot(html = FALSE)

Code
stacklrn <- as_learner(stack)
stacklrn$train(task, split$train)
INFO  [11:59:00.275] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
INFO  [11:59:00.287] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
INFO  [11:59:00.297] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
Code
stacklrn$predict(task, split$test)$
    score(msr('classif.acc'))
classif.acc 
  0.9111111 
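
The key piece of the stack is `po("learner_cv")`: during training it replaces the input features with cross-validated predictions of the wrapped learner, which the next level then uses as features. The "iter 1/3" to "iter 3/3" log lines above come from this internal cross-validation. A minimal sketch of the operator on its own, assuming the `iris` task. `po_cv` and `cv_task` are names introduced here.

```r
library(mlr3)
library(mlr3pipelines)

task <- tsk("iris")
po_cv <- po("learner_cv", lrn("classif.rpart"))

# The output is a task whose only feature is the out-of-fold prediction
cv_task <- po_cv$train(list(task))[[1]]
cv_task$feature_names
cv_task$nrow
```

In the stack above, `po("nop")` passes the original features through unchanged, and `po("featureunion")` joins them with this prediction column.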

A more complex example

Code
library("magrittr")
library("mlr3learners") 

rprt = lrn("classif.rpart", predict_type = "prob")
glmn = lrn("classif.glmnet", predict_type = "prob")

# Create the base learners
lrn_0 = po("learner_cv", rprt, id = "rpart_cv_1")
lrn_0$param_set$values$maxdepth = 5L
lrn_1 = po("pca", id = "pca1") %>>% po("learner_cv", rprt, id = "rpart_cv_2")
lrn_1$param_set$values$rpart_cv_2.maxdepth = 1L
lrn_2 = po("pca", id = "pca2") %>>% po("learner_cv", glmn)

# Level 0
level_0 = gunion(list(lrn_0, lrn_1, lrn_2, po("nop", id = "NOP1")))

# Level 1
level_1 = level_0 %>>%
  po("featureunion", 4) %>>%
  po("copy", 3) %>>%
  gunion(list(
    po("learner_cv", rprt, id = "rpart_cv_l1"),
    po("learner_cv", glmn, id = "glmnt_cv_l1"),
    po("nop", id = "NOP_l1")
  ))

# Level 2
level_2 = level_1 %>>%
  po("featureunion", 3, id = "u2") %>>%
  po("learner", rprt, id = "rpart_l2")


level_2$plot(html = FALSE)

Code
task = tsk("iris")
lrn_stack = as_learner(level_2) # distinct name, so the lrn() function is not shadowed

lrn_stack$train(task, split$train)$
  predict(task, split$test)$
  score(msr('classif.acc'))
INFO  [11:59:01.074] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
INFO  [11:59:01.086] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
INFO  [11:59:01.096] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
INFO  [11:59:01.184] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
INFO  [11:59:01.200] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
INFO  [11:59:01.216] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
INFO  [11:59:01.400] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 1/3)
INFO  [11:59:01.450] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 2/3)
INFO  [11:59:01.477] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 3/3)
INFO  [11:59:01.568] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
INFO  [11:59:01.592] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
INFO  [11:59:01.617] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
INFO  [11:59:01.713] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 1/3)
INFO  [11:59:01.753] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 2/3)
INFO  [11:59:01.789] [mlr3] Applying learner 'classif.glmnet' on task 'iris' (iter 3/3)
classif.acc 
  0.9111111 