NVIDIA L40S GPUs are here – Replicate blog
2024-11-15 · via Replicate's blog

Posted November 15, 2024 by

Today we added NVIDIA L40S GPUs to our supported hardware types. These new GPUs are around 40% faster than A40 GPUs.

We’re also going to be removing support for A40 GPUs. We will begin migrating all existing models and deployments from A40 GPUs to L40S GPUs over the coming weeks. You’ll continue to pay the same price for your private models and deployments, but you might pay more if you’re using public models or training models on A40 GPUs.

You can now use L40S GPUs for any new models, existing models, or deployments. To learn how to change the hardware type for your models and deployments, check out the docs.

Starting today, you have the option to switch any of your existing models and deployments to L40S GPUs, but you are not required to do so. If you choose not to switch, your models and deployments will continue to run on A40 GPUs for another few weeks until we migrate them to L40S GPUs.

Migration timeline

To give you better performance per dollar, we will begin migrating all existing models and deployments from A40 GPUs to L40S GPUs over the coming weeks:

  • 2024-12-02: Public model migration. We will begin migrating all public models running on A40 GPUs, and will finish migrating all models by 2024-12-04.
  • 2024-12-09: Private model and deployment migration. We will begin migrating all private models and deployments running on A40 GPUs, and will finish migrating all models by 2024-12-11. A40 GPUs will no longer be available after this date.

Speed and cost

The L40S GPUs are more expensive per second than the A40 GPUs, but they are also faster.

Our benchmarks across the highest-usage A40 models today show a ~40% median speed improvement when migrating from A40 GPUs to L40S GPUs. For more detail, see our Observable notebook which covers our benchmarking methodology, collection methods, and results.

How much you’ll pay once your models are migrated to L40S GPUs depends on what you’re using them for:

  • Private models and deployments: If you’re currently using A40 GPUs for private models or deployments, this is all good news: you’ll continue to pay the same price you pay today after we’ve migrated them to L40S GPUs. Because these new GPUs are faster, your bill will probably go down, and at worst stay the same.
  • Public models and training: If you’re using public models or training models on A40 GPUs, you’ll pay the new price of $3.51 per hour for L40S GPUs.
    • If you’re using standard A40s, L40S GPUs cost 70% more per hour, but they’re about 40% faster (taken here as runs finishing in roughly 60% of the time), so you can expect your bill to stay roughly the same. (1.7 × 0.6 ≈ 1)
    • If you’re using large A40s, L40S GPUs cost 34% more per hour, but they’re about 40% faster (again roughly 60% of the run time), so you can expect your bill to go down by about 20%. (1.34 × 0.6 ≈ 0.8)
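
The arithmetic in the two bullets above can be reproduced with a few lines of Python. This is a minimal sketch, not an official calculator: the function name `relative_bill` is made up for illustration, and it assumes, as the bullets do, that "about 40% faster" means a run takes roughly 60% of its previous time.

```python
def relative_bill(price_increase: float, runtime_fraction: float) -> float:
    """Return the new bill as a multiple of the old bill.

    price_increase   -- fractional rise in price per hour (0.70 means 70% higher)
    runtime_fraction -- new run time as a fraction of the old run time
    """
    return (1 + price_increase) * runtime_fraction


# Assumption taken from the post's own arithmetic: runs take ~60% of the time.
RUNTIME_FRACTION = 0.6

standard = relative_bill(0.70, RUNTIME_FRACTION)  # standard A40 -> L40S
large = relative_bill(0.34, RUNTIME_FRACTION)     # large A40 -> L40S

print(f"standard A40 -> L40S: {standard:.2f}x the old bill")  # 1.02x
print(f"large A40 -> L40S:    {large:.2f}x the old bill")     # 0.80x
```

Your own ratio depends on how much faster your particular model runs, so it may be worth substituting a measured `runtime_fraction` for the 0.6 used here.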

To compare the performance and pricing of all our available hardware types, visit replicate.com/pricing.

Updating your deployments

If you have deployments that are using A40 GPUs, you will need to decrease their minimum instances to avoid being charged more.

For example, if your deployment is running 10 minimum instances on A40 GPUs, you should change your deployment configuration to 6 minimum instances when you switch to L40S GPUs, as they are approximately 40% faster.
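
As a rough sketch of that adjustment, the snippet below scales a minimum-instance count by the same 60% factor. The helper name `scaled_min_instances` is hypothetical. Rounding up and never dropping a non-zero count below 1 are conservative assumptions made here, not guidance from Replicate.

```python
def scaled_min_instances(current: int, runtime_percent: int = 60) -> int:
    """Scale a minimum-instance count for faster hardware.

    runtime_percent is the new run time as a percentage of the old one
    (60 corresponds to the "approximately 40% faster" figure above).
    Integer ceiling division avoids floating-point rounding surprises.
    """
    if current <= 0:
        return 0
    scaled = -(-current * runtime_percent // 100)  # ceil(current * percent / 100)
    return max(1, scaled)


print(scaled_min_instances(10))  # 6, matching the example above
```

If your traffic is bursty, you may prefer to lower the count gradually and watch queue times rather than apply the scaled value in one step.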

You can edit your deployment configuration on the web or use the HTTP API.
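
For the HTTP API route, the sketch below builds (but does not send) an update request using only the Python standard library. Several details are assumptions to verify against the current API reference: the `PATCH /v1/deployments/{owner}/{name}` path, the `hardware` and `min_instances` body fields, and especially the hardware identifier `"gpu-l40s"`, which is a placeholder. The owner, deployment name, and token are hypothetical values.

```python
import json
import urllib.request


def build_update_request(owner: str, name: str, token: str,
                         min_instances: int, hardware: str) -> urllib.request.Request:
    """Build a PATCH request that changes a deployment's hardware and minimum instances.

    The endpoint path and field names are assumptions; check the API docs.
    """
    url = f"https://api.replicate.com/v1/deployments/{owner}/{name}"
    body = json.dumps({"hardware": hardware, "min_instances": min_instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values for illustration only.
req = build_update_request("acme", "my-deployment", "YOUR_API_TOKEN",
                           min_instances=6, hardware="gpu-l40s")
print(req.get_method(), req.full_url)

# To actually send it (requires a real token and deployment):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it makes it easy to inspect the URL and payload before touching a live deployment.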

If you’re not sure how to best configure your deployments, email us at support@replicate.com.