
























Abstract: Feature Engineering (FE) is pivotal in automated machine learning (AutoML) but remains a bottleneck for traditional methods, which operate within rigid search spaces and lack domain awareness. Large Language Models (LLMs) offer a promising alternative, generating unbounded operators through semantic reasoning, but existing LLM-based methods focus on isolated subtasks such as feature generation and fall short of free-form FE pipelines. Moreover, they are rarely coupled with hyperparameter optimization (HPO) of the downstream ML model, which leads to greedy "FE-then-HPO" workflows that cannot capture strong FE-HPO interactions. In this paper, we present CoFEH, a collaborative framework that interleaves LLM-based FE and Bayesian HPO for robust end-to-end AutoML. CoFEH uses an LLM-driven FE optimizer powered by Tree of Thought (TOT) to explore flexible FE pipelines, a Bayesian optimization (BO) module to solve HPO, and a dynamic optimizer selector that adaptively interleaves FE and HPO steps. Crucially, we introduce a mutual conditioning mechanism that shares context between the LLM and BO, enabling mutually informed decisions. Experiments show that CoFEH outperforms traditional and LLM-based baselines in both standalone FE and joint FE+HPO settings.
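To make the interleaving idea concrete, here is a minimal toy sketch; it is not the CoFEH implementation. Everything in it is an assumption introduced for illustration: the objective `score`, the three pipeline names, the random proposal functions `propose_fe` and `propose_hp` (stand-ins for the LLM-driven FE optimizer and the BO module), and the epsilon-greedy selector, which is one plausible way to choose adaptively between FE and HPO steps. The abstract does not specify how the actual selector works.

```python
import random

# Hypothetical FE pipelines; the names are illustrative only.
PIPELINES = ["raw", "log", "log+ratio"]


def score(fe, hp):
    """Toy objective with an FE-HPO interaction: the best hp depends on fe."""
    target = {"raw": 0.2, "log": 0.5, "log+ratio": 0.8}[fe]
    bonus = {"raw": 0.0, "log": 0.1, "log+ratio": 0.2}[fe]
    return bonus + 1.0 - abs(hp - target)


def propose_fe(rng, fe, hp):
    """Stand-in for an LLM-driven FE step (here: a random pipeline).

    It receives the current hp so that, in a real system, the FE proposer
    could be conditioned on the HPO state.
    """
    return rng.choice(PIPELINES)


def propose_hp(rng, fe, hp):
    """Stand-in for a BO step (here: a random local perturbation in [0, 1])."""
    return min(1.0, max(0.0, hp + rng.uniform(-0.2, 0.2)))


def interleaved_search(budget=200, seed=0, epsilon=0.2):
    rng = random.Random(seed)
    fe, hp = "raw", 0.5
    best = score(fe, hp)
    # Running credit per optimizer, initialised optimistically.
    gains = {"fe": 1.0, "hpo": 1.0}
    history = [best]
    for _ in range(budget):
        # Selector: mostly pick the optimizer with the larger recent gain.
        if rng.random() < epsilon:
            which = rng.choice(["fe", "hpo"])
        else:
            which = max(gains, key=gains.get)
        if which == "fe":
            cand_fe, cand_hp = propose_fe(rng, fe, hp), hp
        else:
            cand_fe, cand_hp = fe, propose_hp(rng, fe, hp)
        s = score(cand_fe, cand_hp)
        # Exponentially decayed improvement acts as the optimizer's credit.
        gains[which] = 0.7 * gains[which] + 0.3 * max(0.0, s - best)
        if s > best:
            fe, hp, best = cand_fe, cand_hp, s
        history.append(best)
    return fe, hp, best, history


if __name__ == "__main__":
    fe, hp, best, _ = interleaved_search()
    print(fe, round(hp, 3), round(best, 3))
```

Because the best hyperparameter in this toy objective shifts with the chosen pipeline, a search that alternates between the two kinds of steps can re-tune `hp` after each pipeline change, which a one-shot "FE-then-HPO" pass cannot do.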
| Comments: | Accepted at KDD 2026. Extended version with full appendices |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2602.09851 [cs.LG] |
| | (or arXiv:2602.09851v2 [cs.LG] for this version) |
| | https://doi.org/10.48550/arXiv.2602.09851 (arXiv-issued DOI via DataCite) |
| Related DOI: | https://doi.org/10.1145/3770855.3817664 |
From: Beicheng Xu
[v1] Tue, 10 Feb 2026 14:54:17 UTC (3,925 KB)
[v2] Thu, 21 May 2026 15:03:06 UTC (3,926 KB)