AutoML Guide
Luis M · 2026-05-24 · via DEV Community


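SynapCores AutoML Guide

Build powerful machine learning models directly in SQL, without writing any Python code.

Overview

SynapCores AutoML exposes a comprehensive set of options for creating machine learning experiments through SQL syntax. You use familiar database commands to train, tune, and deploy production models for real-time use.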

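Task Types

Task type                  Description                  Default metric
regression                 Continuous value prediction  R-squared
binary_classification      Binary classification        AUC
classification/multiclass  Multi-class classification   Accuracy
clustering                 Unsupervised clustering      Silhouette score
anomaly                    Anomaly detection            F1 score
time_series                Time series forecasting      MAPE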
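To make three of the default metrics concrete, here is a minimal Python sketch of their textbook definitions. It illustrates what the metric names mean and is not SynapCores code; the function names are hypothetical, and the MAPE helper assumes no target value is zero.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    return 100.0 * mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))


if __name__ == "__main__":
    print(r_squared([1, 2, 3], [1, 2, 3]))
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    print(mape([100, 200], [110, 180]))
```

Higher is better for R-squared and accuracy, while lower is better for MAPE, which matters when comparing scores across task types.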

Creating AutoML Experiments

Basic Syntax

Option 1: AS syntax

CREATE EXPERIMENT <experiment_name> AS
<SELECT_query>
WITH (<options>)


Option 2: USING syntax

CREATE EXPERIMENT <experiment_name>
USING (<SELECT_query>)
TARGET <target_column>
OPTIONS (<options>)
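If experiments are generated from application code, the AS form can be assembled from a dictionary of options. The sketch below is a hypothetical helper, not part of SynapCores: the function names are made up, the literal formatting mirrors the examples in this guide, and escaping single quotes by doubling them is an assumption borrowed from standard SQL.

```python
def sql_literal(value):
    """Render a Python value in the literal style used by this guide's examples."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(sql_literal(v) for v in value) + "]"
    # Assumption: strings are single-quoted and embedded quotes are doubled.
    return "'" + str(value).replace("'", "''") + "'"


def create_experiment_sql(name, select_query, options):
    """Build a CREATE EXPERIMENT ... AS ... WITH (...) statement as a string."""
    opts = ",\n".join(f"  {key}={sql_literal(val)}" for key, val in options.items())
    return f"CREATE EXPERIMENT {name} AS\n{select_query}\nWITH (\n{opts}\n);"


if __name__ == "__main__":
    print(create_experiment_sql(
        "churn_prediction",
        "SELECT * FROM customers",
        {
            "task_type": "binary_classification",
            "target_column": "churned",
            "max_trials": 50,
            "ensemble": True,
            "algorithms": ["xgboost", "random_forest"],
        },
    ))
```

The helper only formats text. It does not validate option names or protect the experiment name and query against injection, so those should come from trusted input.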


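Configuration Options

General Options

Option                   Type     Default                  Description
task_type                string   'binary_classification'  Machine learning task type
target_column            string   required                 Column to predict
max_trials               integer  100                      Maximum number of training trials
time_budget_minutes      integer  60                       Maximum time budget, in minutes
validation_split         float    0.2                      Fraction of the data held out for validation
cv_folds                 integer  5                        Number of cross-validation folds
optimization_metric      string   task-dependent           Metric to optimize
ensemble                 boolean  true                     Build an ensemble model
early_stopping_patience  integer  10                       Number of trials without improvement
random_seed              integer  42                       Random seed for reproducibility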
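The following Python sketch shows one common reading of validation_split, cv_folds, and random_seed: a seeded hold-out split and contiguous k-fold index sets. The guide does not describe how SynapCores splits data internally, so treat this as an illustration of the concepts with hypothetical function names.

```python
import random


def holdout_split(n_rows, validation_split=0.2, random_seed=42):
    """Return (train_indices, validation_indices) after a seeded shuffle."""
    indices = list(range(n_rows))
    random.Random(random_seed).shuffle(indices)
    n_val = int(round(n_rows * validation_split))
    return indices[n_val:], indices[:n_val]


def kfold_indices(n_rows, cv_folds=5):
    """Yield (train_indices, validation_indices) for each contiguous fold."""
    base, extra = divmod(n_rows, cv_folds)
    start = 0
    for fold in range(cv_folds):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, val
        start += size


if __name__ == "__main__":
    train, val = holdout_split(100)
    print(len(train), len(val))
    for train_idx, val_idx in kfold_indices(10, cv_folds=5):
        print(val_idx)
```

Reusing the same seed reproduces the same split, which is the behavior random_seed is meant to provide.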

Available Algorithms

  • 'linear_regression' - Linear regression
  • 'logistic_regression' - Logistic regression
  • 'decision_tree' - Decision tree
  • 'random_forest' - Random forest
  • 'gradient_boosting' - Gradient boosting
  • 'xgboost' - XGBoost
  • 'neural_network' - Neural network
  • 'knn' - K-nearest neighbors
  • 'naive_bayes' - Naive Bayes
  • 'svm' - Support vector machine

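Algorithm Selection Strategies

  • 'all' - Try every available algorithm
  • 'fast' - Fast algorithms only (linear models, decision tree, naive Bayes, knn)
  • 'accurate' - High-accuracy algorithms only (random forest, gradient boosting, xgboost, neural network)
  • 'interpretable' - Interpretable algorithms only (linear regression, logistic regression, decision tree)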
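The strategies above amount to named subsets of the algorithm list. A small Python lookup makes the mapping explicit. Reading "linear models" as linear_regression plus logistic_regression is an assumption, and the names STRATEGIES and resolve_algorithms are hypothetical.

```python
ALL_ALGORITHMS = [
    "linear_regression", "logistic_regression", "decision_tree",
    "random_forest", "gradient_boosting", "xgboost",
    "neural_network", "knn", "naive_bayes", "svm",
]

STRATEGIES = {
    "all": ALL_ALGORITHMS,
    # Assumption: "linear models" means the two linear algorithms listed above.
    "fast": ["linear_regression", "logistic_regression",
             "decision_tree", "naive_bayes", "knn"],
    "accurate": ["random_forest", "gradient_boosting",
                 "xgboost", "neural_network"],
    "interpretable": ["linear_regression", "logistic_regression",
                      "decision_tree"],
}


def resolve_algorithms(strategy):
    """Return the algorithm names a strategy expands to."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return list(STRATEGIES[strategy])


if __name__ == "__main__":
    print(resolve_algorithms("interpretable"))
```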

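Algorithm-Specific Options

Random Forest

Hyperparameter     Type          Default       Description
n_estimators       integer       100           Number of trees
max_depth          integer       (not listed)  Maximum tree depth
min_samples_split  integer       2             Minimum number of samples required to split
max_features       string/float  'sqrt'        Features to consider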
WITH (
  task_type='classification',
  algorithms=['random_forest'],
  n_estimators=200,
  max_depth=10
)
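The max_features option accepts either a string or a float. The guide does not define how each form is interpreted, so the Python sketch below assumes the convention used by common random forest implementations: 'sqrt' means the square root of the feature count, and a float means a fraction of it. The function name is hypothetical.

```python
import math


def features_per_split(n_features, max_features="sqrt"):
    """Number of candidate features examined at each split (assumed convention)."""
    if max_features == "sqrt":
        return max(1, int(math.sqrt(n_features)))
    if isinstance(max_features, float):
        # Assumption: a float is a fraction of the total feature count.
        return max(1, int(max_features * n_features))
    raise ValueError(f"unsupported max_features: {max_features!r}")


if __name__ == "__main__":
    print(features_per_split(16))
    print(features_per_split(20, 0.5))
```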


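Neural Network

Hyperparameter  Type     Default  Description
hidden_layers   array    [100]    Hidden layer sizes
learning_rate   float    0.001    Initial learning rate
batch_size      integer  32       Mini-batch size
n_epochs        integer  100      Maximum number of epochs
activation      string   'relu'   Activation function
dropout_rate    float    0.0      Dropout rate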
WITH (
  task_type='classification',
  algorithms=['neural_network'],
  hidden_layers=[128, 64, 32],
  dropout_rate=0.2
)
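To get a feel for how hidden_layers affects model size, the Python sketch below counts trainable parameters. It assumes a plain fully connected network with one bias per unit, which the guide does not confirm; n_inputs and n_outputs are hypothetical arguments that depend on your data.

```python
def count_parameters(n_inputs, hidden_layers, n_outputs):
    """Weights plus biases of a fully connected network (assumed architecture)."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


if __name__ == "__main__":
    # Example: 10 input features, the layers from the snippet above, 1 output.
    print(count_parameters(10, [128, 64, 32], 1))
```

Under these assumptions, widening the first hidden layer has the largest effect when the input has many features, because that layer's weight count scales with the feature count.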


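Gradient Boosting / XGBoost

Hyperparameter  Type     Default  Description
n_estimators    integer  100      Number of boosting stages
learning_rate   float    0.1      Learning rate
max_depth       integer  3        Maximum tree depth
subsample       float    1.0      Fraction of samples used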

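Feature Engineering Options

Option                Type     Default     Description
auto_features         boolean  true        Generate features automatically
polynomial_degree     integer  2           Degree of polynomial features
interaction_features  boolean  false       Generate interaction features
scaling               string   'standard'  Feature scaling method
missing_values        string   'mean'      Missing-value handling
categorical_encoding  string   'onehot'    Categorical encoding method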
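As an illustration of what polynomial_degree and interaction_features usually produce, here is a minimal Python sketch using the standard definitions: all monomials up to a given degree, and pairwise products of distinct columns. How SynapCores generates these features is not described in the guide, and the function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def polynomial_features(row, degree=2):
    """All monomials of the input values from degree 1 up to `degree`."""
    features = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(row, d):
            features.append(prod(combo))
    return features


def interaction_features(row):
    """Pairwise products of distinct input values."""
    return [a * b for a, b in combinations(row, 2)]


if __name__ == "__main__":
    print(polynomial_features([2, 3]))      # x, y, x*x, x*y, y*y
    print(interaction_features([2, 3, 5]))  # x*y, x*z, y*z
```

The number of generated columns grows quickly with the degree and the column count, which is worth keeping in mind before raising polynomial_degree on wide tables.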

Scaling Methods

  • 'standard' - Standardization (zero mean, unit variance)
  • 'minmax' - Min-max scaling to [0, 1]
  • 'robust' - Robust scaling using the median and interquartile range
  • 'none' - No scaling
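The three scaling methods can be sketched in plain Python as follows. This uses the textbook formulas with hypothetical function names; the exact quantile method SynapCores uses for 'robust' is not stated, so the inclusive method here is an assumption. Constant columns (zero spread) are not handled.

```python
from statistics import mean, median, pstdev, quantiles


def standard_scale(xs):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def minmax_scale(xs):
    """Map the minimum to 0 and the maximum to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def robust_scale(xs):
    """Subtract the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    med = median(xs)
    return [(x - med) / (q3 - q1) for x in xs]


if __name__ == "__main__":
    values = [1, 2, 3, 4, 100]  # one large outlier
    print(minmax_scale(values))
    print(robust_scale(values))
```

With an outlier such as 100, min-max scaling squeezes the other values close to 0, while robust scaling keeps them spread around 0. That is the usual reason to pick 'robust' for data like transaction amounts.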

Categorical Encoding

  • 'onehot' - One-hot encoding
  • 'label' - Label encoding
  • 'target' - Target encoding
  • 'ordinal' - Ordinal encoding
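The Python sketch below shows simplified versions of three of these encodings. The function names are hypothetical, categories are ordered alphabetically as an assumption, and the target encoder is the naive per-category mean without the smoothing or out-of-fold computation that production systems typically add. Ordinal encoding is omitted because it needs a category order that the guide does not specify.

```python
def label_encode(values):
    """Map each category to an integer id (alphabetical order, by assumption)."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


def onehot_encode(values):
    """One 0/1 column per category, in alphabetical category order."""
    categories = sorted(set(values))
    return [[1 if v == cat else 0 for cat in categories] for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target observed for it."""
    sums, counts = {}, {}
    for v, t in zip(values, targets):
        sums[v] = sums.get(v, 0.0) + t
        counts[v] = counts.get(v, 0) + 1
    return [sums[v] / counts[v] for v in values]


if __name__ == "__main__":
    colors = ["red", "blue", "red", "green"]
    print(label_encode(colors))
    print(onehot_encode(colors))
    print(target_encode(colors, [1, 0, 0, 1]))
```

One-hot encoding adds a column per category, so high-cardinality columns are often better served by target encoding, as in the fraud detection example in this guide.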

Complete Examples

Customer Churn Prediction

CREATE EXPERIMENT churn_prediction AS
SELECT customer_id, age, tenure, monthly_charges, total_charges, churned
FROM customers
WITH (
  task_type='binary_classification',
  target_column='churned',
  max_trials=50,
  validation_split=0.2
);


House Price Regression

CREATE EXPERIMENT house_price_model AS
SELECT * FROM housing_data
WITH (
  task_type='regression',
  target_column='price',
  algorithms=['random_forest', 'xgboost', 'gradient_boosting'],
  max_trials=100,
  n_estimators=200
);


Fraud Detection with Feature Engineering

CREATE EXPERIMENT fraud_detection AS
SELECT * FROM transactions
WITH (
  task_type='binary_classification',
  target_column='is_fraud',
  algorithms=['xgboost', 'neural_network'],
  auto_features=true,
  polynomial_degree=2,
  interaction_features=true,
  scaling='robust',
  categorical_encoding='target',
  max_trials=150
);


Time Series Forecasting

CREATE EXPERIMENT sales_forecast AS
SELECT date, product_id, sales, promotions, holidays
FROM sales_data
WITH (
  task_type='time_series',
  target_column='sales',
  algorithms=['gradient_boosting', 'neural_network'],
  cv_folds=5
);


Interpretable Model for Compliance

CREATE EXPERIMENT loan_approval AS
SELECT * FROM loan_applications
WITH (
  task_type='binary_classification',
  target_column='approved',
  algorithms=['logistic_regression', 'decision_tree'],
  max_depth=5
);


Model Operations

Show All Experiments

SHOW MODELS;


Deploy a Model

DEPLOY MODEL best_model FROM EXPERIMENT churn_prediction
WITH (replicas=3, memory='2Gi');


Make Predictions

PREDICT churn_probability, risk_score
USING churn_model
AS SELECT customer_id, age, tenure FROM new_customers;


Describe a Model

DESCRIBE MODEL churn_model;


Drop a Model

DROP MODEL old_model;


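Best Practices

  1. Hyperparameter tuning: algorithm-specific options are applied to every selected algorithm that supports them.

  2. Defaults: every option has a sensible default. Specify only the options you want to change.

  3. Resource limits: an experiment respects both max_trials and time_budget_minutes, and stops as soon as either limit is reached.

  4. Reproducibility: set random_seed to get consistent results across runs.

  5. Algorithm compatibility: the system automatically filters out algorithms that are incompatible with the chosen task type.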
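The resource-limit rule in item 3, combined with early_stopping_patience, can be sketched as a simple search loop in Python. This is a hypothetical illustration of the stopping logic, not the SynapCores scheduler. The run_search and evaluate_trial names are made up, and it assumes a higher score is better.

```python
import time


def run_search(evaluate_trial, max_trials=100, time_budget_minutes=60,
               early_stopping_patience=10, clock=time.monotonic):
    """Run trials until a trial limit, time limit, or patience limit is hit.

    evaluate_trial(i) returns the score of trial i (higher is better).
    Returns (best_score, trials_run).
    """
    deadline = clock() + time_budget_minutes * 60
    best_score = float("-inf")
    since_improvement = 0
    trials_run = 0
    # Stop as soon as EITHER the trial count or the time budget is exhausted.
    while trials_run < max_trials and clock() < deadline:
        score = evaluate_trial(trials_run)
        trials_run += 1
        if score > best_score:
            best_score, since_improvement = score, 0
        else:
            since_improvement += 1
            if since_improvement >= early_stopping_patience:
                break  # no improvement for `patience` consecutive trials
    return best_score, trials_run


if __name__ == "__main__":
    scores = [0.5, 0.6, 0.7] + [0.1] * 50
    print(run_search(lambda i: scores[i]))
```

In this sketch the time budget is checked only between trials, so a single long trial can overrun the deadline. Whether SynapCores interrupts a running trial is not stated in the guide.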


Document version: 1.0
Last updated: December 2025
Website: https://synapcores.com


Originally published at synapcores.com. SynapCores is a free, single-binary, AI-native database (vector + graph + SQL + LLM).