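I am running HalvingGridSearchCV with factor=1.5 and min_resources=500 on a dataset of 1400 samples, with a parameter grid of 25 candidates. In theory I would expect 3 iterations, but the actual output shows only 2. Why is that?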
500*1.5**1=750
500*1.5**2=1125
Since 1125 is still below 1400, I would expect one more iteration with n_resources of 1125.
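I then tried limiting the dataset to 1000 samples with everything else unchanged, and it also ran 2 iterations. That result seems reasonable to me, because 1125 would exceed 1000.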
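To make the arithmetic concrete, here is a minimal sketch of how the iteration counts might be derived. It assumes that scikit-learn computes n_possible_iterations as 1 + floor(log(max_resources_ // min_resources_, factor)), with an integer (floor) division before the logarithm. That is my reading of the implementation and should be checked against the installed version. The function name `halving_schedule` is hypothetical and not part of scikit-learn.

```python
from math import ceil, floor, log


def halving_schedule(n_candidates, factor, min_resources, max_resources):
    """Sketch of the successive-halving schedule (aggressive_elimination=False).

    Assumption: n_possible_iterations uses floor division of
    max_resources by min_resources before taking the logarithm.
    """
    # Iterations needed to shrink the candidate set down to about `factor` or fewer.
    n_required = 1 + floor(log(n_candidates, factor))
    # Iterations that the available resources allow (note the `//`).
    n_possible = 1 + floor(log(max_resources // min_resources, factor))
    n_iterations = min(n_required, n_possible)

    schedule = []
    for itr in range(n_iterations):
        # Resources grow by `factor` each round, capped at max_resources.
        n_resources = min(int(factor**itr * min_resources), max_resources)
        schedule.append((itr, n_candidates, n_resources))
        # Keep roughly the best 1/factor of the candidates for the next round.
        n_candidates = ceil(n_candidates / factor)
    return n_required, n_possible, schedule


if __name__ == "__main__":
    for max_resources in (1400, 1000):
        print(max_resources, halving_schedule(25, 1.5, 500, max_resources))
```

Under that assumption, 1400 // 500 is 2 rather than 2.8, and log(2, 1.5) is about 1.71, which gives 2 possible iterations. With true division, log(2.8, 1.5) is about 2.54, which would give the 3 iterations I expected. For 1000 samples both variants agree on 2.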
With 1400 samples
from sklearn.ensemble import RandomForestRegressor
# The experimental import must come before HalvingGridSearchCV is imported.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingGridSearchCV, KFold, GridSearchCV, cross_validate
import numpy as np

# 5 x 5 = 25 candidate parameter combinations
param_grid_simple = {
    'n_estimators': [*range(50, 100, 10)],
    'max_depth': [*range(15, 25, 2)],
}
reg = RandomForestRegressor(random_state=110, n_jobs=8, verbose=True)
cv = KFold(random_state=110, shuffle=True)  # n_splits defaults to 5
search = HalvingGridSearchCV(
    estimator=reg,
    param_grid=param_grid_simple,
    factor=1.5,
    min_resources=500,
    verbose=True,
    random_state=110,
    cv=cv,
    n_jobs=8,
)
# X and y are my feature matrix and target (loading code not shown here).
search.fit(X[:1400, :], y[:1400])
Output
n_iterations: 2
n_required_iterations: 8
n_possible_iterations: 2
min_resources_: 500
max_resources_: 1400
aggressive_elimination: False
factor: 1.5
----------
iter: 0
n_candidates: 25
n_resources: 500
Fitting 5 folds for each of 25 candidates, totalling 125 fits
----------
iter: 1
n_candidates: 17
n_resources: 750
Fitting 5 folds for each of 17 candidates, totalling 85 fits
[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done 34 tasks | elapsed: 0.0s
[Parallel(n_jobs=8)]: Done 60 out of 60 | elapsed: 0.1s finished
HalvingGridSearchCV(cv=KFold(n_splits=5, random_state=110, shuffle=True),
estimator=RandomForestRegressor(n_jobs=8, random_state=110,
verbose=True),
factor=1.5, min_resources=500, n_jobs=8,
param_grid={'max_depth': [15, 17, 19, 21, 23],
'n_estimators': [50, 60, 70, 80, 90]},
random_state=110, verbose=True)
With 1000 samples
search.fit(X[:1000, :], y[:1000])
Output
n_iterations: 2
n_required_iterations: 8
n_possible_iterations: 2
min_resources_: 500
max_resources_: 1000
aggressive_elimination: False
factor: 1.5
----------
iter: 0
n_candidates: 25
n_resources: 500
Fitting 5 folds for each of 25 candidates, totalling 125 fits
----------
iter: 1
n_candidates: 17
n_resources: 750
Fitting 5 folds for each of 17 candidates, totalling 85 fits
[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done 34 tasks | elapsed: 0.0s
[Parallel(n_jobs=8)]: Done 90 out of 90 | elapsed: 0.1s finished
HalvingGridSearchCV(cv=KFold(n_splits=5, random_state=110, shuffle=True),
estimator=RandomForestRegressor(n_jobs=8, random_state=110,
verbose=True),
factor=1.5, min_resources=500, n_jobs=8,
param_grid={'max_depth': [15, 17, 19, 21, 23],
'n_estimators': [50, 60, 70, 80, 90]},
random_state=110, verbose=True)