

Amazon SageMaker HyperPod now supports AMI-based node lifecycle configuration for Slurm clusters
aws@amazon.c · 2026-05-08 · via Recent Announcements

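Amazon SageMaker HyperPod now supports AMI-based configuration, which provisions Slurm cluster nodes with the software and configuration needed for a production-grade environment to run AI/ML training workloads. You no longer need to download and configure lifecycle configuration scripts or upload them to Amazon S3. Preparing a cluster takes fewer operational steps, and because no lifecycle configuration scripts run during node provisioning, cluster creation time is significantly reduced, so you can start running jobs sooner.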

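AMI-based configuration includes essential software such as Docker, Enroot, and Pyxis, along with configuration such as Slurm accounting, SSH key generation, Slurm log rotation, and user home directory setup. To enable AMI-based configuration, omit the LifeCycleConfig block from the instance group configuration when creating a cluster with the CreateCluster API, or select "None" under Lifecycle scripts in Custom setup when using the SageMaker AI console. To customize further on top of the AMI-based configuration baseline, you can provide an extension script, so you only need to focus on the capabilities and software you want to add, such as user configuration, observability, or LDAP integration.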
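As a rough illustration of the API path described above, the following Python sketch builds a CreateCluster request in which each instance group simply leaves out the LifeCycleConfig block. The cluster name, group names, instance types, and role ARN are hypothetical placeholders, and a real Slurm cluster may need additional fields that are not shown here.

```python
from typing import Any

# Hypothetical placeholder; substitute your own execution role ARN.
ROLE_ARN = "arn:aws:iam::111122223333:role/ExampleHyperPodRole"


def build_instance_group(name: str, instance_type: str, count: int,
                         execution_role_arn: str) -> dict[str, Any]:
    """Return one instance-group entry that relies on AMI-based configuration."""
    # There is deliberately no "LifeCycleConfig" key. Per the announcement,
    # omitting that block is what enables AMI-based configuration.
    return {
        "InstanceGroupName": name,
        "InstanceType": instance_type,
        "InstanceCount": count,
        "ExecutionRole": execution_role_arn,
    }


# Hypothetical cluster layout, for illustration only.
request: dict[str, Any] = {
    "ClusterName": "example-slurm-cluster",
    "InstanceGroups": [
        build_instance_group("controller", "ml.c5.xlarge", 1, ROLE_ARN),
        build_instance_group("workers", "ml.p5.48xlarge", 2, ROLE_ARN),
    ],
}
```

With boto3 installed and credentials configured, the dictionary could then be passed along as `boto3.client("sagemaker").create_cluster(**request)`; that call is left out of the block so the sketch runs without AWS access.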

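Extension scripts can be configured when creating clusters through both the API and the SageMaker AI console. With the CreateCluster API, specify the new OnInitComplete parameter together with SourceS3Uri in the LifeCycleConfig block. In the console, provide the S3 URI of the extension script in the "Extension script file in S3" field under Custom setup. For advanced use cases that require full control over provisioning, custom lifecycle configuration scripts remain fully supported in both the API and the SageMaker AI console.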
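A minimal Python sketch of the extension-script variant might look like the following. The announcement names only the OnInitComplete and SourceS3Uri keys; the assumption here, labeled in the code as well, is that OnInitComplete takes the script's file name located under the SourceS3Uri prefix. The bucket, script name, group name, and role ARN are hypothetical.

```python
from typing import Any


def build_extension_lifecycle_config(source_s3_uri: str,
                                     script_name: str) -> dict[str, str]:
    """Build a LifeCycleConfig block that points at an extension script."""
    if not source_s3_uri.startswith("s3://"):
        raise ValueError("source_s3_uri must be an s3:// URI")
    return {
        "SourceS3Uri": source_s3_uri,
        # ASSUMPTION: the value is a script file name under SourceS3Uri.
        # The announcement only states that the parameter exists.
        "OnInitComplete": script_name,
    }


# Hypothetical instance group that layers an extension script on top of
# the AMI-based baseline (for example, to add LDAP integration).
worker_group: dict[str, Any] = {
    "InstanceGroupName": "workers",
    "InstanceType": "ml.p5.48xlarge",
    "InstanceCount": 2,
    "ExecutionRole": "arn:aws:iam::111122223333:role/ExampleHyperPodRole",
    "LifeCycleConfig": build_extension_lifecycle_config(
        "s3://example-bucket/hyperpod/extensions/", "setup_ldap.sh"
    ),
}
```

Keeping the block in a small helper makes it easy to validate the S3 URI before the request is sent, and to leave the block out entirely for groups that need only the AMI-based baseline.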

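This capability is available in all AWS Regions where SageMaker HyperPod is available. To get started creating HyperPod Slurm clusters with AMI-based node lifecycle configuration, see "Getting started with SageMaker HyperPod using the AWS CLI" and "Getting started with SageMaker HyperPod using the SageMaker AI console" in the SageMaker AI Developer Guide.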