惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

人人都是产品经理
人人都是产品经理
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
P
Proofpoint News Feed
T
Tailwind CSS Blog
Hacker News - Newest:
Hacker News - Newest: "LLM"
G
GRAHAM CLULEY
Engineering at Meta
Engineering at Meta
Blog — PlanetScale
Blog — PlanetScale
量子位
GbyAI
GbyAI
C
Cybersecurity and Infrastructure Security Agency CISA
Know Your Adversary
Know Your Adversary
阮一峰的网络日志
阮一峰的网络日志
P
Privacy International News Feed
T
Tenable Blog
Cisco Talos Blog
Cisco Talos Blog
P
Privacy & Cybersecurity Law Blog
T
Tor Project blog
L
Lohrmann on Cybersecurity
S
Secure Thoughts
Y
Y Combinator Blog
S
Securelist
H
Hackread – Cybersecurity News, Data Breaches, AI and More
有赞技术团队
有赞技术团队
月光博客
月光博客
Cyberwarzone
Cyberwarzone
H
Heimdal Security Blog
博客园 - 聂微东
Latest news
Latest news
The Hacker News
The Hacker News
小众软件
小众软件
T
Troy Hunt's Blog
Google Online Security Blog
Google Online Security Blog
D
DataBreaches.Net
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
Martin Fowler
Martin Fowler
罗磊的独立博客
www.infosecurity-magazine.com
www.infosecurity-magazine.com
U
Unit 42
Vercel News
Vercel News
T
The Blog of Author Tim Ferriss
F
Fortinet All Blogs
SecWiki News
SecWiki News
MongoDB | Blog
MongoDB | Blog
C
Check Point Blog
aimingoo的专栏
aimingoo的专栏
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
Stack Overflow Blog
Stack Overflow Blog
WordPress大学
WordPress大学

Towards AI

The Verified Identity Agent Bridge | Towards AI You Can’t Prompt Your Away Your LLM Problems | Towards AI The Free Agent Trap | Towards AI Your Agentic Loop Will Drift. Here Is the KL Divergence Equation That Measures How Far It Has Wandered From Its Original Instruction. | Towards AI Beyond Chat: Processing Images, PDFs, and Documents with the OpenAI Adapter in Oracle Integration Cloud | Towards AI Building AI Agents in Rust — part 3 | Towards AI Self-Hosting Airflow at Home: Automating Stock Price Data Collection | Towards AI The 76-Hour Frontier: How the Takedown of Claude Fable 5 Birthed the Military-Industrial-AI Complex | Towards AI We Replaced ChatGPT With a Local AI Server. Six Months of Honest Data. | Towards AI What Really Makes Cars Pollute? A Data Science Deep Dive into CO₂ Emissions | Towards AI Training GPT-2 From Scratch on a GTX1050 | Towards AI Principal Component Analysis (PCA): Theory, Mathematics, and Applications Build a Zero-Cost Web Automation Pipeline With OpenRouter, OpenClaw, and MediaUse I Gave Qwen3.7-Plus a Screenshot and It Found the Exact Pixel to Click for $0.40 Beyond the Prompt: Why Autonomous AI Agents Are Replacing the Chatbot Moonshot Cracked Claude Code’s Playbook with an MIT Terminal Agent and a $0.60 Model Connections, Roles, and Warehouses: Getting CoCo Desktop Production-Ready from Day One My First $5,000 Month Writing About AI Engineering on Medium Google Shrank Gemma 4 by 72% and Unsloth Fixed the 4-Bit Bug Nobody Else Caught on One 4090, and 4-Bit Shouldn’t Be This Good LangChain Explained: Understanding Models, Prompts, Chains, Memory, Indexes, and Agents TOON: Beyond JSON for LLMs Claude Code Casual, Pro, Elite: The Three Working Personas of Claude Code Mastery MiniMax M3 Decodes 1M Tokens 15x Faster — and It Shouldn’t Be This Cheap Using Amazon SQS for AI Agent Orchestration I Ran a 1.5B-Active Model on My Laptop That Embarrassed a 26B by 46 Points How to Build a Self-Improving Company with AI Part 3 — Implementation/Engine-Level: Choosing the Runtime That Gives You These for Free Part 2 — Serve-Level Speed: System Design That Stabilizes P95/P99 3-Part Series: LLM Latency in Production (Part 1) Claude Code: The AI Coding Partner Changing How Developers Build Software Claude Code Pitfalls: Claude Code Won’t Do What You Told It: A Troubleshooting Catalog Full-Stack Data Scientists for the Agentic Coding World Building Production-Grade AI Skills with Snowflake Cortex AI Function Studio I Tried 10 AI Agent Frameworks in 2026 — Here’s the Honest Guide I Wish I Had Earlier How One Spring Boot Optimization Saved Our Startup $30,000 a Year Inside Palantir AIP: How the World’s Most Controversial AI Platform Actually Works What Is a Reverse Proxy? (And Why Every Backend Developer Should Care) What Claude Opus 4.8 Actually Changes If You’re Building Agents QWEN 3.7 Max Worked For 35 Hrs Straight And The Results Were Mind-blowing When LLMs Meet Knowledge Graphs on the Battlefield Fine-Tuning is Dead: Why Context Orchestration Won in 2026 5 Things Broke When I Shipped a RAG + MCP Agent to Production. Google Co-Scientist: Hyper Scaling Research and Discovery Microsoft Just Embarrassed Browser Web Agents — 1,000 Lines Made GPT-5.4 Beat Opus 4.6 on 200 Web Tasks The Modern Data Stack Is Broken — Here’s How to Fix It With AI, Governance, and Real Architecture Building Production MCP Servers: What the Spec Won’t Tell You When Should an Agent Stop? The Anatomy of Termination Harness Engineering: The Layer That Matters More Than the Model AI Engineers Who Can’t Debug Are Getting Fired (Here’s How I Debug with Claude Code) Claude Code Memory: Why You Keep Explaining the Same Thing to Claude (and the Five Layers That Fix It) Claude Code Subagents: The Claude Code Feature You Skip Every Day (And Why It Quietly Wrecks Your Sessions) Agentic AI and the SMB Banking Advantage Claude Code: Spec-Driven Development — Why Your AI Coding Sessions Fall Apart at Hour Three The Real Cost of Agentic AI Nobody Budgets For SVM : 40 must visit Interview Questions (Part 2) Your AI Agent Works Perfectly in the Demo. Here Are the 6 Ways It Dies in Production. Unleashing the Power of ONNX for Speedier SBERT Inference Terraform vs CI/CD for Serverless Deployments Merve Noyan Stopped Writing Training Scripts — Her Agent Just Fine-Tuned 18 Models Solo for $11.40 Why Your Sales Forecast Is Always 20% Wrong (And How To Make It 12% Wrong) Genetic Cubic n{C/A} Ratios For Elementary Robotics Design Top 20 AdaBoost Interview Questions & Answers (Part 2 of 2) Agentic AI Vs AI Agents — What Are the Key Differences? LAI #127: The Infrastructure Layer of AI Is Becoming the Product Anthropic Caught Its Own AI Planning to Blackmail Engineers RNNs Cannot Think What Transformers Think Cheaply. ICLR 2026 Proved the Gap Is Exponential. Time Series Made So Easy My Aunt Got It on the Second Read Claude Cowork 101 | Towards AI Is 3-Bit KV Cache the Holy Grail? A Reality Check on Google’s TurboQuant LangGraph Multi-Agent Architecture: Building a Self-Critiquing AI Debate System AutoML on Autopilot | Towards AI I Ran This Open-Source AI Tool on a Messy Codebase and Got 71x Fewer Tokens — Here Is Exactly What Happened Month in 4 Papers (April 2026) AI Kept Forgetting My Notes. Fixing That Taught Me How It Actually Works. How ChatGPT Makes You Addicted Crack ML Interviews with Confidence: K-Nearest Neighbors (KNN 20 Q&A) The Event-Driven Blueprint: How I Scaled a Spring Boot System to 10 Million Kafka Messages/Day Building Vector Search? Why FAISS Alone Isn’t Enough TAI #202: GPT-5.5 Moves Codex Into Real Work Machine Learning System Design -The Model Serving Triangle, With One Forward Pass Flowing Through Every Trade-off (Part3) AI Orchestration in Action: How MuleSoft and LLMs Fuel the Future of Enterprise AI GPT-4 Has 1.8 Trillion Parameters. It Uses 2% of Them Per Token. Part 20: Data Manipulation in Multi-Dimensional Aggregation A Fundamental Introduction to Genetic Algorithm -Part Two TAI #200: Anthropic’s Mythos Capability Step Change and Gated Release From Notebook to Production: Running ML in the Real World (Part 4) Sqribble’s Template‑Driven Document Automation Anthropic Just Shipped the Layer That’s Already Going to Zero Long-Term vs Short-Term Memory for AI Agents: A Practical Guide Without the Hype The L1 Loss Gradient, Explained From Scratch Your Postcode Is Deciding Your Care. I Built a Pipeline to Prove It. I Directed AI Agents to Build a Tool That Stress-Tests Incentive Designs. Here’s What It Found. Your System Prompt Is the Product — Not the Feature The LLM Wiki Trend Has a Retention Problem Nobody Mentions Top 20 Data Preparation Interview Questions and Answers (Part 2 of 2) LAI #122: Word Embeddings Started in 1948, Not With Word2Vec Top 15 Computer Vision Datasets [2026] 40 Generative AI Interview Questions That Actually Get Asked in 2026 (With Answers)
I Trained a Markdown File to Boost GPT-5.5 by 23 Points — It Shouldn't Work | Towards AI
Chew Loong Nian - AI ENGINEER · 2026-06-18 · via Towards AI

Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Read by thought-leaders and decision-makers around the world. Phone Number: +1-650-246-9381 Email: pub@towardsai.net

228 Park Avenue South New York, NY 10003 United States

Website: https://towardsai.net/ Publisher: https://towardsai.net/#publisher Diversity Policy: https://towardsai.net/about Ethics Policy: https://towardsai.net/about Masthead: https://towardsai.net/about

Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Founders: Roberto Iriondo, Website, Job Title: Co-founder and Advisor Works for: Towards AI, Inc. Follow Roberto: X, LinkedIn, GitHub, Google Scholar, Towards AI Profile, Medium, ML@CMU, FreeCodeCamp, Crunchbase, Bloomberg, Roberto Iriondo, Generative AI Lab, Generative AI Lab VeloxTrend Ultrarix Capital Partners Denis Piffaretti, Job Title: Co-founder Works for: Towards AI, Inc. Louie Peters, Job Title: Co-founder Works for: Towards AI, Inc. Louis-François Bouchard, Job Title: Co-founder Works for: Towards AI, Inc.

Cover:

Towards AI Cover

Logo:

Towards AI Logo

Areas Served: Worldwide Alternate Name: Towards AI, Inc. Alternate Name: Towards AI Co. Alternate Name: towards ai Alternate Name: towardsai Alternate Name: towards.ai Alternate Name: tai Alternate Name: toward ai Alternate Name: toward.ai Alternate Name: Towards AI, Inc. Alternate Name: towardsai.net Alternate Name: pub.towardsai.net

5 stars – based on 497 reviews

Frequently Used, Contextual References

TODO: Remember to copy unique IDs whenever it needs used. i.e., URL: 304b2e42315e

Resources

Free: 6-day Agentic AI Engineering Email Guide.
Learnings from Towards AI's hands-on work with real clients.

Last Updated on June 18, 2026 by

Author(s): Chew Loong Nian – AI ENGINEER

Originally published on Towards AI.

I did not fine-tune anything. I did not touch a single weight. I ran a training loop whose only output was a 1,400-token Markdown file, dropped that file into the context window of a frozen GPT-5.5, and watched its six-benchmark average jump from 58.8 to 82.3. That is +23.5 points from a text file you can open in any editor.

Image

After the intro, the article explains the core idea behind SkillOpt: treat the skill document (a Markdown “skill file”) like trainable state while keeping the target model frozen, and use a stronger optimizer model during training to propose bounded edits (add/delete/replace) that are only accepted if they strictly improve a held-out validation score—mirroring concepts from gradient descent stability in text space. It then summarizes the reported results across 52 (model, benchmark, harness) combinations, highlighting that SkillOpt is best-or-tied-best everywhere and that GPT-5.5 in direct chat rises from 58.8 to 82.3 (+23.5), with especially large gains on procedural, format-checked tasks like SpreadsheetBench. The author shows what the resulting “trained” skills look like—rules for checking structure, writing explicitly evaluated values, tracking state in embodied navigation, and anchoring answers to the correct table row—and notes that improvements can come from just a few accepted edits and a small artifact size. It provides a practical setup for reproducing the workflow quickly (installing SkillOpt, configuring backends, running the loop, and deploying by prepending the learned Markdown to the model’s context). The piece also covers SkillOpt-Sleep, a plugin-style extension that learns from a user’s own past transcripts with a review-and-adopt, validation-gated offline consolidation loop. Finally, it addresses two limitations—reliance on automatic scoring judges and the fact it optimizes one document at a time—before concluding that for procedural, checkable agent tasks, training the document instead of the model is a more reliable and cheaper optimization than fine-tuning.

Read the full blog for free on Medium.

Published via Towards AI


Towards AI Academy

We Build Enterprise-Grade AI. We'll Teach You to Master It Too.

15 engineers. 100,000+ students. Towards AI Academy teaches what actually survives production.

Start free — no commitment:

6-Day Agentic AI Engineering Email Guide — one practical lesson per day

Agents Architecture Cheatsheet — 3 years of architecture decisions in 6 pages

Our courses:

AI Engineering Certification — 90+ lessons from project selection to deployed product. The most comprehensive practical LLM course out there.

Agent Engineering Course — Hands on with production agent architectures, memory, routing, and eval frameworks — built from real enterprise engagements.

AI for Work — Understand, evaluate, and apply AI for complex work tasks.

Note: Article content contains the views of the contributing authors and not Towards AI.