sklearn.tree._classes.BaseDecisionTree#fit requires y to be at least 1-dimensional (which means a single tree can also handle multilabel targets).
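A quick check of the point above: passing a 2-D y of shape (n_samples, n_outputs) to a decision tree works directly.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0.0], [1.0], [2.0], [3.0]])
# Two binary labels per sample, i.e. a multilabel / multi-output target.
Y = np.array([[0, 1], [0, 1], [1, 0], [1, 1]])
clf = DecisionTreeClassifier(random_state=0).fit(X, Y)
pred = clf.predict(X)
print(pred.shape)  # (4, 2): one column of predictions per label
```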
models.search_cnn.SearchCNNController: n_ops = 8, n_nodes = 4. The `i + 2` appears because every cell has two fixed predecessor (input) nodes, so node i can connect to i + 2 earlier nodes:

for i in range(n_nodes):
    self.alpha_normal.append(nn.Parameter(1e-3 * torch.randn(i + 2, n_ops)))
    self.alpha_reduce.append(nn.Parameter(1e-3 * torch.randn(i + 2, n_ops)))
This month I reviewed the AutoML projects I worked on in 2020 and, drawing on several excellent open-source projects, wrote a hyperparameter optimization library: UltraOpt. It includes a Bayesian optimization algorithm of my own design, ETPE, which outperforms HyperOpt's TPE algorithm in benchmarks. UltraOpt adapts better to distributed computing, supporting both MapReduce and asynchronous-communication parallel strategies, and can be extended to a variety of computing environments. On top of that, UltraOpt is especially friendly to beginners: I spent three weeks writing Chinese documentation precisely so that complete novices can understand, from zero background, what AutoML (automated machine learning) is doing.
https://github.com/carpedm20/ENAS-pytorch — although the project claims to support NAS for both CNNs (cifar, mnist) and RNNs (ptb, wikitext), in practice the CNN part of the code was never finished; you probably still have to read the original TensorFlow implementation. From the paper: "In ENAS, there are two sets of learnable parameters: the parameters of the controller LSTM, denoted by θ, and t..."
from hyperopt import hp, STATUS_OK, Trials, fmin, tpe
import hyperopt
from sklearn.model_selection import cross_val_score
from sklearn import svm
from sklearn.datasets import load_iris
import numpy as...
In hyperopt, TPE's first 20 trials are plain random search. The place where the objective function actually runs is hyperopt.fmin.FMinIter#serial_evaluate.
smac.optimizer.smbo.SMBO#run calls self.start(); then at smac/optimizer/smbo.py:156: self.incumbent = self.initial_design.run(). The "incumbent" is the current title-holder, i.e. the best configuration found so far. From there we step into initial_design.py, smac/initial_design/initial_design.py:116...
smac/facade/smac_ac_facade.py:411, original code:

    # initial design
    if initial_design is not None and initial_configurations is not None:
        raise ValueError(
            "Either use ...
So the package-discovery code was here all along, and I had stupidly hand-rolled my own: autosklearn.pipeline.components.base.find_components

def find_components(package, directory, base_class):
    components = OrderedDict()
    for module_loader, module_name, ispkg ...
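For context, a hedged sketch of how pkgutil-based component discovery of this kind generally works (this mirrors the spirit of auto-sklearn's find_components; the demo package and class names are made up):

```python
import importlib
import inspect
import os
import pkgutil
import sys
import tempfile
from collections import OrderedDict

def find_components(package, directory, base_class):
    """Collect all subclasses of base_class defined in modules under directory."""
    components = OrderedDict()
    for module_loader, module_name, ispkg in pkgutil.iter_modules([directory]):
        if ispkg:
            continue
        module = importlib.import_module(package + "." + module_name)
        for name, obj in inspect.getmembers(module, inspect.isclass):
            if issubclass(obj, base_class) and obj is not base_class:
                components[module_name] = obj
    return components

# Demo: build a throwaway package with one component module and discover it.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "mycomps")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("class BaseComponent:\n    pass\n")
with open(os.path.join(pkg_dir, "foo.py"), "w") as f:
    f.write("from mycomps import BaseComponent\n"
            "class FooComponent(BaseComponent):\n    pass\n")
sys.path.insert(0, root)
from mycomps import BaseComponent

components = find_components("mycomps", pkg_dir, BaseComponent)
print(list(components))  # ['foo']
```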
Took a look at MLxtend: it has built a remarkable number of wheels, and many of the wheels I built myself before turn out to already exist there. Evaluation: sampling-based evaluation, bootstrap: http://rasbt.github.io/mlxtend/user_guide/evaluate/BootstrapOutOfBag/ ; user-defined validation sets: http://rasbt.github.io/mlxtend/api_subpackages/mlxtend.e...
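To make the bootstrap-out-of-bag idea concrete, here is a minimal numpy-only sketch of what such a splitter does (not MLxtend's actual implementation): each round draws a bootstrap sample as the training set and uses the never-drawn ("out of bag") indices for evaluation.

```python
import numpy as np

def bootstrap_oob_splits(n_samples, n_splits, seed=0):
    rng = np.random.RandomState(seed)
    for _ in range(n_splits):
        # Sample n_samples indices with replacement -> the bootstrap train set.
        train = rng.randint(0, n_samples, size=n_samples)
        # Indices never drawn form the out-of-bag evaluation set.
        oob = np.setdiff1d(np.arange(n_samples), train)
        yield train, oob

splits = list(bootstrap_oob_splits(10, 3))
```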
dsmac/tae/execute_func.py:160 — add a try/except here, or alternatively put the try/except inside the evaluate function itself.
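The second option (wrapping evaluate) can be done without touching dsmac at all; a hedged sketch, where the wrapper name and penalty value are my own choices: a crashed configuration returns a large penalty cost instead of killing the optimization loop.

```python
import traceback

def make_safe_evaluate(evaluate, penalty_cost=1e10):
    """Wrap an objective so exceptions become a penalty cost, not a crash."""
    def safe_evaluate(config):
        try:
            return evaluate(config)
        except Exception:
            traceback.print_exc()   # keep the error visible in the logs
            return penalty_cost
    return safe_evaluate

safe = make_safe_evaluate(lambda cfg: 1.0 / cfg["x"])
print(safe({"x": 2}))  # 0.5
print(safe({"x": 0}))  # 1e10 (ZeroDivisionError caught and penalized)
```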
def setup_logger(output_file=None, logging_config=None):
    # logging_config must be a dictionary object specifying the configuration
    # for the loggers to be used in auto-sklearn.
    if logging_...
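A self-contained sketch of what a setup_logger of this shape typically does (the default config dict below is my assumption, not auto-sklearn's actual default): apply a dictConfig-style dictionary, then optionally attach a file handler.

```python
import logging
import logging.config

def setup_logger(output_file=None, logging_config=None):
    # logging_config is a dict in logging.config.dictConfig format.
    if logging_config is None:
        logging_config = {
            "version": 1,
            "disable_existing_loggers": False,
            "formatters": {"plain": {"format": "[%(levelname)s] %(message)s"}},
            "handlers": {"console": {"class": "logging.StreamHandler",
                                     "formatter": "plain"}},
            "root": {"level": "INFO", "handlers": ["console"]},
        }
    logging.config.dictConfig(logging_config)
    logger = logging.getLogger()
    if output_file is not None:
        # Mirror everything to a file as well.
        logger.addHandler(logging.FileHandler(output_file))
    return logger

log = setup_logger()
log.info("logger configured")
```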
Stepping into the relevant area, smac.optimizer.smbo.SMBO#run:

self.aggregate_func
Out[9]: <function smac.optimizer.objective.average_cost(config, run_history, instance_seed_pairs=None)>

That function turns out to be a one-liner: return np.mean(_c...
One promising approach constructs explicit regression models to describe the dependence of target algorithm performance on parameter settings; however, this approach has so far been limited to the opti...
The original code manipulated the database with raw SQL; it is now rewritten to use peewee as an ORM.

def init_db(self):
    conn = sqlite3.connect(self.db_path)
    cur = conn.cursor()
    cur.execute(
        "create table if not exists record(tr...
import multiprocessing as mp
from copy import deepcopy
# import ray
from frozendict import frozendict
from joblib import parallel_backend, delayed, Parallel
from dsmac.runhistory.runhistory import R...
In the previous implementation, I dumped the database to CSV and saved model files to the filesystem. In the current implementation, the database is no longer dumped in real time, and model files may be stored directly inside database records.

from typing import List, Union, Dict
import numpy as np
import pandas as pd
from joblib import load
from pandas import DataFram...
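Storing a model inside a database record amounts to serializing it into a BLOB column; a minimal stdlib sketch (in-memory sqlite, a dict standing in for a fitted estimator, table and column names made up):

```python
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table record(trial_id integer primary key, model blob)")

model = {"coef": [0.5, -1.2]}  # stand-in for a fitted estimator
conn.execute("insert into record values (?, ?)",
             (1, sqlite3.Binary(pickle.dumps(model))))

# Later: pull the blob back out of the record and deserialize it.
blob, = conn.execute("select model from record where trial_id=1").fetchone()
restored = pickle.loads(blob)
print(restored)  # {'coef': [0.5, -1.2]}
```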
Contents: transforming the sample space before training; Integer, Real; how the random forest predicts a standard deviation; how the acquisition functions are computed: EI, PI, LCB.

Transforming the sample space before training: skopt.utils.cook_estimator is where the surrogate model is built; skopt.optimizer.optimizer.Optimizer#_tell is where the surrogate model is trained.

Integer, Real:
self.Xi
Out[4]: [[1, 0.01032326035197658, 4, 11, 84]]
self.space
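On the acquisition-function side, the EI computation boils down to a closed-form expression over the surrogate's predicted mean and standard deviation. A hedged numpy/scipy sketch for the minimization convention (not skopt's actual code; `xi` is the usual exploration margin):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best, xi=0.0):
    """EI for minimization: E[max(best - f(x) - xi, 0)] under a Gaussian surrogate."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (best - mu - xi) / sigma
        ei = (best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)
    return np.where(sigma > 0, ei, 0.0)  # zero EI where the model is certain

# Candidate with lower predicted mean gets higher EI, all else equal.
ei = expected_improvement(mu=[0.0, 1.0], sigma=[1.0, 1.0], best=0.5)
```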
Computing the acquisition function specific to GP-MCMC: robo.acquisition_functions.marginalization.MarginalizationGPMCMC#compute

self.estimators
Out[10]: [<robo.acquisition_functions.log_ei.LogEI at 0x7f3067cdeeb8>, <robo.acquisition_functions.log_ei.LogEI at 0x7f3067b0b320>, ...
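The list of LogEI objects above, one per MCMC hyperparameter sample, hints at what marginalization does: average the base acquisition over the samples. A toy sketch of that averaging step (the callables here are stand-ins, not RoBO's API):

```python
import numpy as np

def marginalized_acquisition(acq_per_sample, X):
    # One acquisition value per candidate, averaged over hyperparameter samples.
    return np.mean([acq(X) for acq in acq_per_sample], axis=0)

# Two stand-in acquisitions peaking at x=1 and x=-1 respectively.
acqs = [lambda X: -((X - 1) ** 2), lambda X: -((X + 1) ** 2)]
vals = marginalized_acquisition(acqs, np.array([0.0, 1.0]))
print(vals)  # [-1. -2.]
```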
GBDT docs: "Early stopping of Gradient Boosting" — a comparison with and without early stopping:

gbes = ensemble.GradientBoostingClassifier(n_estimators=n_estimators,
                                           validation_fraction=0.2,
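A runnable version of that comparison's early-stopping side (synthetic data; parameter values are illustrative): sklearn holds out `validation_fraction` of the training data and stops adding trees once the validation score has not improved by `tol` for `n_iter_no_change` rounds.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, random_state=0)
gbes = GradientBoostingClassifier(n_estimators=1000,
                                  validation_fraction=0.2,
                                  n_iter_no_change=5, tol=1e-4,
                                  random_state=0)
gbes.fit(X, y)
# n_estimators_ is the number of trees actually fitted before stopping.
print(gbes.n_estimators_)
```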
Contents: preface; the 4 kinds of features in tabular machine learning; the text feature group; data processing (loading data, data cleaning, tokenization, dropping low-frequency words); modeling: sklearn (TF-IDF, NMF, TruncatedSVD), gensim (LDA, LSI, RP, HDP).

Preface — the 4 kinds of features in tabular machine learning: lately I have been thinking about the workflow for supervised machine learning on tabular (structured) data. In most scenarios there are roughly 4 kinds of features: categorical, numerical, date, text...
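For the text feature group, the sklearn route named in the contents (TF-IDF followed by NMF) can be sketched in a few lines (toy corpus and parameter values are mine):

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["machine learning on tabular data",
        "deep learning for text data",
        "tabular machine learning features"]

tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)            # sparse doc-term TF-IDF matrix

nmf = NMF(n_components=2, init="nndsvda", random_state=0)
W = nmf.fit_transform(X)                 # document-topic weights
H = nmf.components_                      # topic-term weights
```

TruncatedSVD can be dropped in the same place as NMF when negative components are acceptable.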
scripts/2015_nips_paper/run/run_auto_sklearn.py — leave-one-out validation for metalearning:

if use_metalearning is True:
    # path to the original metadata directory.
    metadata_directory = os.path.abspath(os.path.dirname(__file__))
    metadata_directory = os.
Paper: (ICML 2018) https://ml.informatik.uni-freiburg.de/papers/18-AUTOML-AutoChallenge.pdf
Code: http://ml.informatik.uni-freiburg.de/downloads/automl_competition_2018.zip
Data: (on the CodaLab platform, registration required) https://competitions.codalab.org/competitions/17767#participate-get_da
hpbandster.core.result.Result#__init__

self.data[0].keys()
Out[32]: dict_keys([(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 0, 4), (0, 0, 5), (0, 0, 6), (0, 0, 7), (0, 0, 8), (0, 0, 9), (0, 0, 10), (0, 0, 11), (0, 0, 12), (0, 0, 13), (0, 0, 14), (0, 0,
example/example.py:

from atm import ATM
atm = ATM()
results = atm.run(train_path="/home/tqc/PycharmProjects/automl/ATM/demos/pollution_1.csv")
results.describe()

atm.worker.Worker#select_hyperpartition — the information printed while debugging matches the paper's description: a hyperpartition represents a conditional parameter tree (conditiona
Concatenating two DataFrames along the row axis: works on pandas==1.0.1, does not work on pandas==0.25.3.

df = pd.concat(Xs, axis=0)
df.sort_index(inplace=True)

df = pd.concat(Xs, axis=0, sort=False)
df.sort_index(inplace=True)
pmf-automl is a new AutoML method proposed in a NeurIPS 2018 paper: it constructs a discrete pipeline space and uses probabilistic matrix factorization as the probabilistic model driving Bayesian optimization.
Instantiating BOHB (or any other Master) requires supplying a ConfigSpace:

class BOHB(Master):
    def __init__(self, configspace=None, eta=3, min_budget=0.01, max_budget=1,
                 min_points_in_model=None, top_n_percent=