Sustainable AI Starts with Efficient AI
Sara Han · 2026-05-26 · via DEV Community

AI is no longer a side experiment. It is already part of products, workflows, and day-to-day operations. So the question is not whether companies should use AI, but how to use it responsibly at scale. From our perspective, it starts with efficiency: getting the same or better results with less compute, less energy, and a lower environmental footprint.

The way models are chosen affects energy use and emissions, and as expectations around transparency grow, it's important to have a more solid way to explain those choices. The good news is that reducing AI’s footprint does not mean slowing innovation. It means running better systems: smaller models, faster inference, and clearer measurement.

How Big Is the AI Sustainability Impact?

Before we go further, let’s start with a few facts about AI and sustainability.

  • 3-40 Wh: Energy consumed by a single ChatGPT query, from short to long (Source, 2025)
  • 2 nuclear plants: Number of nuclear plants that would have to run continuously to supply enough energy if 80M people each generated 5 pages per day (Source, 2025)
  • 61,848x: Ratio between the highest and lowest energy use on an energy leaderboard for AI models (Source, 2025)
  • +160%: Expected increase in data center power consumption by 2030 (Source)

How Does AI Impact the Environment?

These figures are clear. Behind powerful AI tools, there is an environmental cost. AI systems require large amounts of natural resources and contribute to greenhouse gas emissions. Understanding how AI impacts the environment is important because the choices made now will shape whether AI becomes a tool for sustainability or a source of greater environmental pressure.

When discussing environmental impact, it is easy to overlook which parts of an AI model’s lifecycle are involved. The focus is usually placed on the impact of using AI tools, especially the energy required by large data centers. However, environmental effects occur at every stage of the AI lifecycle.

The impact arises not only during the training of AI models, but also during deployment and in the earlier stages required to make AI possible, such as material extraction, equipment manufacturing, cooling, networking, and storage.

The environmental costs of AI can take different forms, including the use of natural resources such as energy, water, and minerals, as well as greenhouse gas emissions.

Energy

Energy is one of the most visible resources used by AI systems. AI models require significant amounts of electricity not only during training, but also during deployment, when users interact with them in real time. Both stages run on physical hardware. The amount of hardware needed depends on the size and optimization of the model: some models can be deployed on a single GPU (Graphics Processing Unit), while others require multiple GPUs and, therefore, more energy. As a result, data centers are energy-intensive facilities. They concentrate powerful computing hardware, which can account for around 40–50% of a data center’s total energy use, as well as networking systems, storage, and cooling infrastructure, which can account for around 30–40%.

The environmental impact of this energy use depends largely on where the electricity comes from. If data centers are powered by fossil fuels, AI can contribute to higher greenhouse gas emissions. If they use renewable energy sources, the impact can be reduced. As AI becomes more widely used, improving energy efficiency, increasing transparency from large technology companies, and expanding the use of clean energy will be important for reducing its environmental footprint.

Water

One aspect mentioned earlier is the energy used by cooling infrastructure in data centers. However, cooling does not only require energy; it can also require large amounts of water. Depending on the data center infrastructure, water use can range from 0.18 to 1.1 liters per kWh of energy. Water is used to remove heat through cooling systems, and it often needs to be clean to prevent damage to cooling pipes and equipment. Moreover, a significant portion of this water can evaporate due to high temperatures, meaning it does not always return to the same cycle. Water is also used in other stages of AI hardware production, such as semiconductor manufacturing for GPU chips, where it is needed for cleaning and sterilization, and, to a lesser extent, for generating energy.

Figure: water usage. Source: https://arxiv.org/pdf/2304.03271
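To make the range above concrete, here is a minimal sketch that applies the 0.18 to 1.1 liters per kWh figures quoted in the text to a given amount of energy. The function name and the 1,000 kWh workload are hypothetical and chosen only for illustration.

```python
# Range quoted in the text: 0.18 to 1.1 liters of water per kWh of energy.
WATER_L_PER_KWH_LOW = 0.18
WATER_L_PER_KWH_HIGH = 1.1


def cooling_water_range_liters(energy_kwh: float) -> tuple[float, float]:
    """Return the (low, high) water estimate in liters for a given energy use."""
    if energy_kwh < 0:
        raise ValueError("energy_kwh must be non-negative")
    return energy_kwh * WATER_L_PER_KWH_LOW, energy_kwh * WATER_L_PER_KWH_HIGH


# Hypothetical workload: 1,000 kWh of data center energy.
low, high = cooling_water_range_liters(1_000)
print(f"{low:.0f} to {high:.0f} liters")
```

The spread between the two bounds is roughly sixfold, which is why the data center infrastructure matters as much as the workload itself.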

Minerals

To manufacture the chips and hardware on which AI models run, large amounts of metals are required, including aluminum, copper, tin, tantalum, lithium, gallium, germanium, palladium, cobalt, and tungsten. Extracting these materials can have significant environmental impacts, as mining often requires high energy use, large amounts of water, and the removal of soil and vegetation. It can also contribute to habitat disruption, pollution, and waste from mining processes. As demand for AI hardware grows, the need for these minerals may increase, making responsible sourcing, recycling, and more efficient hardware design important for reducing AI's environmental footprint.

Greenhouse Gas Emissions

One of the most common sources of greenhouse gas emissions is electricity generation, which, as mentioned earlier, is required across different stages of AI. Emissions can also be produced during the manufacturing of specific materials, such as concrete and metals used to build data centers and the hardware infrastructure that supports AI systems.

How to Measure the AI Sustainability Impact?

Now that we understand how AI can impact the environment, the next question is how we can measure that impact. However, to do so, we first need to understand that there is no single, fixed footprint for an “AI request.”

  • Input footprint varies: It is tempting to think of AI usage in simple units: one prompt in, one answer out, one measurable footprint. In reality, the same prompt can have very different impacts depending on model size, input and output length, inference settings, and the serving environment. Even the same model may behave differently across deployments, meaning its footprint can change and often be improved through better engineering decisions.
  • Not all AI workloads are equal: Text generation, image generation, and video generation sit at very different points on the compute spectrum. Video, for example, is typically far more compute-intensive than text. That means two teams may both be “using AI” while generating very different levels of environmental impact. Understanding those differences helps organizations identify and prioritize the areas where optimization can have the greatest effect.

Figures: model emissions and emissions by modality. Source: https://arxiv.org/pdf/2311.16863

  • Deployment shapes impact: Hardware type, region, serving setup, and runtime choices all affect the environment. That is exactly why efficiency matters: once companies understand what drives impact, they can start reducing it through smarter models, better optimization, and more efficient infrastructure.

Perfectly measuring every aspect of sustainability is not possible. Still, it is worth tracking what can be measured and making the necessary updates as better data becomes available. So, it is time to look at the numbers.

$$E_{i,\{\min,\max\}} = T_i \times \left(P_{\text{GPU}} \times U_{\text{GPU},\{\min,\max\}} + P_{\text{non-GPU}} \times U_{\text{non-GPU}}\right) \times \text{PUE}$$

This formula calculates the energy consumption of a query $i$ at the lower and upper utilization bounds. First, it calculates the total inference time in hours, denoted as $T_i$. Then, the formula multiplies this time by the effective power used by the hardware: $P_{\text{GPU}} \times U_{\text{GPU},\{\min,\max\}}$ represents the GPU power draw under the lower or upper utilization assumption, while $P_{\text{non-GPU}} \times U_{\text{non-GPU}}$ represents the power draw from non-GPU components such as CPU, memory, networking, and storage. Finally, the result is multiplied by $\text{PUE}$ (power usage effectiveness), which accounts for additional data center overhead such as cooling and power distribution.
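As a rough illustration of the energy formula described above, the sketch below computes the lower and upper bounds for one query. It assumes power is expressed in kW so that the result comes out in kWh; every numeric value is a hypothetical placeholder, not a measurement of any real model or data center.

```python
def query_energy_kwh(
    inference_time_s: float,
    gpu_power_kw: float,
    gpu_utilization: float,
    non_gpu_power_kw: float,
    non_gpu_utilization: float,
    pue: float,
) -> float:
    """Energy of one query in kWh: T_i * (P_GPU*U_GPU + P_nonGPU*U_nonGPU) * PUE."""
    inference_time_h = inference_time_s / 3600.0  # T_i, converted to hours
    effective_power_kw = (
        gpu_power_kw * gpu_utilization + non_gpu_power_kw * non_gpu_utilization
    )
    return inference_time_h * effective_power_kw * pue


# Hypothetical inputs shared by both bounds (assumptions for illustration only).
common = dict(
    inference_time_s=3.6,
    gpu_power_kw=5.0,
    non_gpu_power_kw=5.0,
    non_gpu_utilization=0.1,
    pue=1.2,
)
energy_min = query_energy_kwh(gpu_utilization=0.2, **common)  # lower bound
energy_max = query_energy_kwh(gpu_utilization=0.4, **common)  # upper bound
print(f"{energy_min * 1000:.2f} Wh to {energy_max * 1000:.2f} Wh")
```

With these placeholder inputs, only the GPU utilization assumption changes between the two calls, which is what produces the range rather than a single number.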

$$W_{\text{query}} = \frac{E_{\text{query}}}{\text{PUE}} \cdot \text{WUE}_{\text{site}} + E_{\text{query}} \cdot \text{WUE}_{\text{source}}$$

This formula calculates the water consumption of a query in liters by separating the impact into on-site and off-site components. The first term, $\frac{E_{\text{query}}}{\text{PUE}} \cdot \text{WUE}_{\text{site}}$, estimates the water used on-site at the data center, mainly for cooling: $\frac{E_{\text{query}}}{\text{PUE}}$ isolates the IT energy consumed by the computing equipment, and this value is multiplied by $\text{WUE}_{\text{site}}$, the data center’s on-site water usage effectiveness in liters per kilowatt-hour. The second term, $E_{\text{query}} \cdot \text{WUE}_{\text{source}}$, estimates the off-site water consumption associated with generating the electricity used by the query, where $\text{WUE}_{\text{source}}$ represents the water intensity of the electricity source. Adding both terms gives the total estimated water consumption for the query.
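The on-site and off-site split can be sketched in a few lines. The function name and all input values below are hypothetical; they only show how the two terms combine.

```python
def query_water_liters(
    energy_kwh: float,
    pue: float,
    wue_site_l_per_kwh: float,
    wue_source_l_per_kwh: float,
) -> float:
    """Water for one query in liters: on-site cooling plus off-site generation."""
    it_energy_kwh = energy_kwh / pue  # energy used by the computing equipment only
    on_site = it_energy_kwh * wue_site_l_per_kwh
    off_site = energy_kwh * wue_source_l_per_kwh
    return on_site + off_site


# Hypothetical inputs (assumptions for illustration only).
water_l = query_water_liters(
    energy_kwh=0.003,
    pue=1.2,
    wue_site_l_per_kwh=0.3,
    wue_source_l_per_kwh=3.0,
)
print(f"{water_l * 1000:.2f} mL per query")
```

In this made-up case the off-site term dominates, which shows why the electricity source can matter more for water than the cooling system itself.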

$$C_{\text{query}} = E_{\text{query}} \cdot \text{CIF}$$

This formula calculates the carbon emissions of a query in kilograms of carbon dioxide equivalent. $E_{\text{query}}$ represents the energy consumed by the query, and $\text{CIF}$ is the carbon intensity factor of the electricity supply, usually expressed in $\text{kgCO}_2\text{e}/\text{kWh}$. Multiplying the two estimates the amount of greenhouse gas emissions associated with running that query.
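Because the carbon estimate is a single multiplication, the same query can look very different depending on the grid. The sketch below compares two hypothetical carbon intensity factors; neither value refers to a real region or provider.

```python
def query_carbon_kg(energy_kwh: float, cif_kg_per_kwh: float) -> float:
    """Emissions of one query in kgCO2e: E_query * CIF."""
    return energy_kwh * cif_kg_per_kwh


# Hypothetical query energy and grid intensities (assumptions for illustration).
ENERGY_KWH = 0.003
low_carbon_grid = query_carbon_kg(ENERGY_KWH, cif_kg_per_kwh=0.05)
high_carbon_grid = query_carbon_kg(ENERGY_KWH, cif_kg_per_kwh=0.5)

print(f"low-carbon grid:  {low_carbon_grid * 1000:.2f} gCO2e")
print(f"high-carbon grid: {high_carbon_grid * 1000:.2f} gCO2e")
```

The energy term is identical in both calls, so the tenfold difference comes entirely from where the electricity is generated.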

Tools for individual testing:

How Can Efficiency Improve AI Sustainability?

The formulas above help quantify part of AI’s environmental impact, but they also raise a broader question: how does energy use affect the performance of AI systems themselves?

On the one hand, using more energy can improve the quality of AI outputs. This relationship has been widely studied through scaling laws, which show that increasing compute during training and, in some cases, during inference can lead to better model quality. Larger models, longer training runs, and more complex inference strategies can all improve the accuracy, reliability, or usefulness of predictions.

However, more energy does not automatically mean better performance. A system that produces high-quality results but requires more time, larger hardware, higher compute costs, and greater energy consumption may not be efficient overall. Higher energy use can also increase the environmental impact of AI by requiring more resources to build, run, and cool the servers that support these systems.

> Performance is the combination of quality and efficiency.

At a practical level, efficient AI means achieving the same results with fewer resources. Even when sustainability is not your main priority, optimizing energy use remains important because it directly affects overall system performance, including cost, speed, scalability, and hardware requirements.

Reducing environmental impact without requiring users or developers to do less shows that sustainable AI is not only about limiting usage, but also about designing systems that are faster, more scalable, and less resource-intensive. In this sense, improving efficiency can benefit both the environment and the performance of AI systems, making it a practical and necessary direction for the future.

This is one of our key motivations behind sustainable AI: aligning environmental goals with broader performance incentives so that better engineering choices lead to lower impact.

What Does Pruna Do for AI Sustainability?

There is no single, one-size-fits-all approach to reducing the environmental impact of AI models. At Pruna, we believe that sustainable AI starts with efficient AI, and we work across several areas to make this possible.

Performance Models

At Pruna, we offer highly optimized models through our P-models family. They are smaller, faster, and more energy-efficient than many other released models, while still maintaining strong quality. This includes P-Image, P-Image-Edit, and P-Video, among others, which are 3 to 6 times more energy efficient than other models for the same tasks.

In addition, we provide optimized endpoints through our API and through other vendors, making the models more lightweight and easier to integrate into different environments. This reduces hardware requirements and energy consumption without compromising usability. Examples include Wan 2.2 and Flux 2.

Check our P-models here.

Open Source AI Efficiency

If none of the provided models meet your needs, we also offer tools to make your preferred model smaller and more efficient. The OSS Pruna package is a model optimization framework that helps developers build faster and more efficient models with minimal overhead. It provides a comprehensive suite of compression techniques (caching, quantization, pruning, distillation, compilation, kernels, or recoverers) that can be easily combined without requiring complex manual integration.

Check the Pruna GitHub repository.

Events and Challenges

We also collaborate with different initiatives and communities to promote AI efficiency beyond our own work.

For instance, we have been running AI efficiency meetups and webinars where we discuss this topic with pruners, as well as with invited speakers from the broader AI and sustainability community.

In addition, we have collaborated with other organizations. For instance, we hosted a community event with CodeCarbon and EcoLogits, where participants could learn, exchange ideas, and discuss practical ways to measure and reduce the environmental impact of AI. We also supported the 1st International Challenge on Compression of AI Models, aiming to contribute to sustainable AI by encouraging participants to optimize models.

Figure: events

Our Metrics

To measure the environmental impact, we integrated our runs with CodeCarbon and used their dashboard to track the results. We also estimated the energy use and CO₂ emissions avoided by comparing our optimized models with their base versions: what would have been consumed without optimization versus what was actually required when using Pruna.

These are the results we achieved over the past year for a single provider.

Figure: our metrics
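The avoided-impact comparison described above boils down to subtracting the optimized run from the base run. Here is a minimal sketch of that bookkeeping; the `RunFootprint` class, the `avoided_footprint` function, and the numbers are hypothetical and are not Pruna's or CodeCarbon's actual data model or results.

```python
from dataclasses import dataclass


@dataclass
class RunFootprint:
    """Measured footprint of one workload (for example, as reported by a tracker)."""
    energy_kwh: float
    emissions_kg_co2e: float


def avoided_footprint(base: RunFootprint, optimized: RunFootprint) -> RunFootprint:
    """What the base model would have consumed minus what the optimized one did."""
    return RunFootprint(
        energy_kwh=base.energy_kwh - optimized.energy_kwh,
        emissions_kg_co2e=base.emissions_kg_co2e - optimized.emissions_kg_co2e,
    )


# Hypothetical measurements for the same workload (illustration only).
base_run = RunFootprint(energy_kwh=120.0, emissions_kg_co2e=48.0)
optimized_run = RunFootprint(energy_kwh=30.0, emissions_kg_co2e=12.0)

saved = avoided_footprint(base_run, optimized_run)
print(f"Avoided: {saved.energy_kwh:.1f} kWh, {saved.emissions_kg_co2e:.1f} kgCO2e")
```

The comparison is only meaningful when both runs serve the same workload, so the base and optimized measurements should cover the same requests.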

A quick disclaimer: making AI more efficient is only one part of sustainable AI. Efficiency improvements can sometimes lead to more overall usage, known as the rebound effect. We should also ask whether AI is needed for every task, because in many cases, simpler solutions may be enough.
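A small arithmetic sketch shows how the rebound effect can cancel an efficiency gain. The per-request energy and request counts are hypothetical and chosen only to make the effect visible.

```python
def total_energy_kwh(energy_per_request_kwh: float, requests: int) -> float:
    """Total energy for a workload: per-request energy times number of requests."""
    return energy_per_request_kwh * requests


# Hypothetical scenario: each request becomes 3x cheaper, but usage grows 4x.
before = total_energy_kwh(energy_per_request_kwh=0.003, requests=1_000_000)
after = total_energy_kwh(energy_per_request_kwh=0.001, requests=4_000_000)

change = (after - before) / before
print(f"before: {before:.0f} kWh, after: {after:.0f} kWh, change: {change:+.0%}")
```

In this made-up scenario, total energy rises by about a third even though every individual request is three times more efficient.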

Conclusions

In this blog post, we analyzed how AI impacts the environment, the stages where this impact occurs, and the main costs associated with it. We then explored how to measure this impact, showing that although results can vary depending on the prompt, task, deployment setup, and other factors, existing formulas can still provide useful estimates. Finally, we presented what we are doing at Pruna through our efficient models, open-source package, and community events, and shared some of the results we have achieved.

Make Your AI Workloads More Efficient and Sustainable!

  • Run our efficient models from the API. Sign up here!
  • Compress your own models with Pruna and give us a ⭐️ to bring you many more algos!
  • Stay up to date with the latest AI efficiency research on our blog, explore our materials collection, or dive into our courses.
  • Join the conversation and stay updated in our Discord community.

References

Falk, S., Ekchajzer, D., Pirson, T., Lees-Perasso, E., Wattiez, A., Biber-Freudenberger, L., Luccioni, S., & van Wynsberghe, A. (2025). More than Carbon: Cradle-to-Grave environmental impacts of GenAI training on the Nvidia A100 GPU. arXiv. https://doi.org/10.48550/arXiv.2509.00093

Jegham, N., Abdelatti, M., Koh, C. Y., Elmoubarki, L., & Hendawi, A. (2025). How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference. arXiv. https://doi.org/10.48550/arXiv.2505.09598

Luccioni, S., Trevelin, B., & Mitchell, M. (2024). The Environmental Impacts of AI — Policy Primer. Hugging Face Blog. https://doi.org/10.57967/hf/3004

Luccioni, S., Jernite, Y., & Strubell, E. (2023). Power Hungry Processing: Watts Driving the Cost of AI Deployment? arXiv. https://doi.org/10.48550/arXiv.2311.16863