4 Smart Ways to Manage Retries in Side Projects
Mustafa ERBA · 2026-05-25 · via DEV Community

Introduction: The Hidden Teachers of My Own Projects

Side projects, undertaken on my own initiative, have become the greatest learning grounds in my career. These are the moments when I can step away from the processes required by large corporate projects and engage directly with technology. However, this freedom comes with its own set of challenges. One of the most significant is managing retries systematically, especially when errors occur. Over the past decade, I've developed numerous side projects, from my personal financial calculator to a mobile spam blocker. From that experience, I've distilled four fundamental strategies for learning from mistakes and moving forward. In this post, I'll explain these strategies and how I apply them, with real-life examples.

These side projects are not just pieces of code for me; they are also a platform for continuous development. While I gain corporate experience working on production ERP systems or internal banking platforms, these personal projects allow me to gain different perspectives and adopt experimental approaches. For instance, the optimizations I make in my personal financial calculators running on my own VPS can sometimes shed light on database performance issues in main projects. In this article, I will share my thoughts on how to manage retries more intelligently, as taught to me by these "hidden teachers."

1. Automate Error Logging: Journald and Logging Strategies

One of the biggest problems I encountered in side projects was the lack of sufficient data to understand why errors were occurring. Especially in one-off scripts or transient services, understanding what was happening in the system at the moment an error occurred is critical. At this point, I started leveraging the power of journald on my Linux systems, which have long been running systemd.

Journald doesn't just collect logs; it also manages logging levels and rate limits. For my own side projects, I generally follow this strategy: for critical errors, I log with high detail to syslog or a dedicated file, while keeping routine operational logs more concise. This saves disk space while ensuring I have all the necessary information when an error occurs. For example, I had a one-time data extraction script. This script would occasionally receive HTTP 500 Internal Server Error while fetching data from a specific API. Initially, I tried to debug by adding print() statements within the script to understand the error. However, this was slow, and it was difficult to precisely capture the moment the error occurred.
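The split between detailed critical logs and concise routine logs can be sketched as follows. This is a minimal example, assuming the script runs as a systemd service whose stdout and stderr are captured by journald; in that setup a leading `<N>` on a line is read as its syslog priority (the default `SyslogLevelPrefix=` behaviour). The names `PRIORITY`, `formatJournalLine`, `logInfo`, and `logError` are hypothetical.

```javascript
// Syslog priorities that journald understands (lower number = more severe).
const PRIORITY = { error: 3, warning: 4, info: 6, debug: 7 };

// Build one journald-friendly line: "<priority>message {json context}".
// Newlines are escaped so a multi-line message stays a single log entry.
function formatJournalLine(level, message, context) {
  const priority = PRIORITY[level] ?? PRIORITY.info; // unknown levels fall back to info
  const detail = context ? ` ${JSON.stringify(context)}` : "";
  return `<${priority}>${message}${detail}`.replace(/\n/g, "\\n");
}

// Routine events stay short.
function logInfo(message) {
  process.stdout.write(formatJournalLine("info", message) + "\n");
}

// Errors carry the context needed to debug them later.
function logError(message, context) {
  process.stderr.write(formatJournalLine("error", message, context) + "\n");
}
```

With priorities attached, something like `journalctl -u my-script.service -p err` (the unit name is a placeholder) should show only the error entries.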

The rate limiting feature offered by journald also comes into play here. Especially in high-traffic services, logging the same error repeatedly can fill up disk space and make logs unreadable. By setting RateLimitIntervalSec and RateLimitBurst in journald.conf, or the per-service LogRateLimitIntervalSec and LogRateLimitBurst options in my systemd unit files, I can limit the number of log messages accepted from the same service within a specific time frame. This proved useful when analyzing 429 Too Many Requests errors in an API gateway that handled high traffic. The first step toward finding the root cause of an error is to record it automatically and efficiently; good logs help not only to detect the error but also to track down its cause faster.
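To make the interval-plus-burst idea concrete, here is a small in-process sketch of the same kind of limit. It is not journald's implementation, only an illustration of the behaviour; `createLogLimiter` and its options are hypothetical names.

```javascript
// Allow at most `burst` messages per `intervalMs` window; count the rest as suppressed.
function createLogLimiter({ intervalMs, burst }) {
  let windowStart = null;
  let seen = 0;
  let suppressed = 0;

  // Returns true if the message may be written. `now` is injectable for testing.
  function allow(now = Date.now()) {
    if (windowStart === null || now - windowStart >= intervalMs) {
      windowStart = now; // start a new window
      seen = 0;
    }
    seen += 1;
    if (seen <= burst) return true;
    suppressed += 1;
    return false;
  }

  // Number of messages dropped so far, so the loss can be reported.
  function suppressedCount() {
    return suppressed;
  }

  return { allow, suppressedCount };
}
```

A caller would check `limiter.allow()` before writing a log line and periodically report `limiter.suppressedCount()`, so that dropped messages are at least visible.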

ℹ️ Logging and Error Analysis

In my own projects, I monitor logs in real-time using journalctl -f for critical services, while examining more detailed error dumps with journalctl -xe for specific errors. This prevents me from overlooking potential issues.

2. Retry Mechanisms: Ideal Wait Time and Exponential Backoff

When an error occurs, one of the first things that comes to mind is "wait a bit and try again." However, the duration of this "bit" is critical for system stability and user experience. In situations like network errors or temporary service outages, it's important to follow a smart waiting strategy rather than retrying immediately. The concept of "exponential backoff" comes into play here.

In my side projects, I frequently use this mechanism, especially in code that accesses external APIs or services. For instance, in my personal financial calculator project, I occasionally experienced network connection issues when connecting to a data provider's API. When I received an error on the first attempt, instead of retrying immediately, I would wait for 1 second and try again. If I still received an error, I would double the waiting time to 2 seconds, then 4 seconds, 8 seconds, and so on. The biggest advantage of this strategy is that it doesn't overload the target service while giving a temporary issue a reasonable amount of time to resolve.
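The doubling schedule described above can be written down in a few lines. This is a minimal sketch; `backoffSchedule` is a hypothetical helper name, and the 1000 ms starting value mirrors the 1 second example.

```javascript
// Delays (in ms) before each retry: initialDelay, then doubled each time.
function backoffSchedule(initialDelay, retries) {
  return Array.from({ length: retries }, (_, i) => initialDelay * 2 ** i);
}

const schedule = backoffSchedule(1000, 4);
const totalWait = schedule.reduce((sum, ms) => sum + ms, 0);
console.log(schedule);  // [ 1000, 2000, 4000, 8000 ]
console.log(totalWait); // 15000
```

Four retries already add up to 15 seconds of waiting, which shows how quickly the total grows with each extra attempt.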

Another point I pay attention to when implementing this strategy is the maximum number of retries and the total waiting time. I don't want to end up in an infinite loop. Generally, 3 to 5 retries are sufficient. If I still receive an error after these attempts, I understand that it's no longer a temporary issue and requires manual investigation. Another important parameter in this regard is "jitter": instead of waiting for exactly the calculated duration, I add a small amount of randomness to it. This helps prevent multiple clients from retrying at the same time, thus avoiding the "thundering herd" problem.

💡 The Importance of Jitter

Adding jitter prevents multiple clients from retrying a service simultaneously, thus avoiding sudden load spikes on the service. In my own projects, I create this effect by adding a random duration of 10-20% to the calculated waiting time.
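As a rough sketch of that 10-20% rule: the helper below adds the random share on top of an already computed delay. The name `withJitter` is hypothetical, and the random source is passed in only so the behaviour can be checked deterministically.

```javascript
// Add 10-20% random jitter on top of a computed backoff delay (in ms).
// `random` defaults to Math.random and is injectable for testing.
function withJitter(delayMs, random = Math.random) {
  const fraction = 0.1 + random() * 0.1; // between 0.10 and 0.20
  return Math.round(delayMs * (1 + fraction));
}
```

For a calculated delay of 2000 ms, `withJitter(2000)` returns a value between 2200 and 2400, so two clients that failed at the same moment are unlikely to retry at the same instant.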

Here's an example of how I implemented this strategy with the fetch API in an Astro project:

// Resolve after `ms` milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchDataWithRetry(url, maxRetries = 3, initialDelay = 1000) {
  let delay = initialDelay;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    let failure;
    let retryable = true; // Network errors (fetch rejects) are retried
    try {
      const response = await fetch(url);
      if (response.ok) {
        return await response.json(); // Return data if successful
      }
      failure = new Error(`HTTP error! status: ${response.status}`);
      // Only server errors (5xx) are retried; client errors fail immediately
      retryable = response.status >= 500 && response.status < 600;
    } catch (error) {
      failure = error; // fetch() rejected, or the body was not valid JSON
    }

    if (!retryable || attempt === maxRetries) {
      throw failure; // Non-retryable error or last attempt: give up
    }

    // Add 10-20% jitter to this wait without compounding it into the base delay
    const wait = Math.round(delay * (1.1 + Math.random() * 0.1));
    console.warn(`Attempt ${attempt} failed: ${failure.message}. Retrying in ${wait}ms...`);
    await sleep(wait);
    delay *= 2; // Exponential backoff
  }
}

// Usage example:
// fetchDataWithRetry('https://api.example.com/data')
//   .then(data => console.log('Data received:', data))
//   .catch(error => console.error('Failed to fetch data after multiple retries:', error));


This code snippet includes exponential backoff and jitter mechanisms to handle potential network issues when fetching data from an external service. Such simple yet effective strategies ensure my side projects run more stably.

3. State Management and Idempotency: The Side Effects of Retries

One of the most important aspects to consider when implementing retry mechanisms is whether the operation being performed is "idempotent." Idempotency means that applying an operation multiple times has the same effect as applying it once. If your operation is not idempotent, retries can lead to unexpected and undesirable side effects.

For example, consider a function that sends an email to a user. If this function is not idempotent and is retried due to a network error, it might send the same email to the user twice. This is a highly annoying situation from a user experience perspective. In my own projects, especially for sensitive operations like financial transactions or data updates, I use various methods to ensure idempotency.

One method is to use a unique transaction ID for each operation. When I send a request, I include this ID with it. The server checks the ID to see whether the operation has already been performed and, if so, does not process it again. For instance, in my Android spam blocker app, when sending a command to block a number, I would generate a UUID for each blocking request and send it to the server. If the server had seen this UUID before, it would not process the command to block the same number again. This increased the app's stability and prevented situations like double blocking.
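The server-side check could look roughly like this. It is an in-memory sketch, not the app's actual code: `createBlockHandler`, `handleBlock`, and the stored shapes are hypothetical, and the ID is a fixed example value where a client would normally generate one (for example with `crypto.randomUUID()`).

```javascript
// Server-side sketch: remember which request IDs were already processed.
function createBlockHandler() {
  const processed = new Map(); // requestId -> result of the first execution
  const blockedNumbers = [];   // stand-in for the real block list or database

  return {
    // Handle a "block this number" command identified by a client-generated ID.
    handleBlock(requestId, phoneNumber) {
      if (processed.has(requestId)) {
        // A retry of a request we already handled: replay the stored result.
        return { ...processed.get(requestId), duplicate: true };
      }
      blockedNumbers.push(phoneNumber);
      const result = { status: "blocked", phoneNumber };
      processed.set(requestId, result);
      return { ...result, duplicate: false };
    },
    blockedCount: () => blockedNumbers.length,
  };
}

const handler = createBlockHandler();
const requestId = "0b5c1f0e-8a3d-4c1b-9f6e-2d7a4e5b6c7d"; // example value only
const first = handler.handleBlock(requestId, "+15550100");
const retry = handler.handleBlock(requestId, "+15550100");
console.log(first.duplicate, retry.duplicate, handler.blockedCount()); // false true 1
```

In a real service the lookup and the write would need to happen atomically, for instance through a unique constraint on the ID column, so that two concurrent retries cannot both pass the check.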

Another approach is to make the operation itself idempotent. For example, instead of incrementing a value, set it directly to a specific number (like a SET operation); repeating the assignment leaves the same result. Similarly, instead of physically deleting an item, mark it as "deleted." These approaches allow retries to be performed safely. Database features such as PostgreSQL's INSERT ... ON CONFLICT DO NOTHING, or its upsert form INSERT ... ON CONFLICT DO UPDATE, can also help here.
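The difference between incrementing and setting shows up as soon as an operation is applied twice, as a retry would do. A minimal sketch with hypothetical function names and made-up values:

```javascript
// Non-idempotent: every retry changes the result again.
function incrementBalance(account, amount) {
  return { ...account, balance: account.balance + amount };
}

// Idempotent: applying it once or many times yields the same state.
function setBalance(account, newBalance) {
  return { ...account, balance: newBalance };
}

// Idempotent soft delete: the first call records the deletion time, repeats change nothing.
function markDeleted(item, deletedAt) {
  return item.deletedAt ? item : { ...item, deletedAt };
}

const account = { id: 1, balance: 100 };
console.log(incrementBalance(incrementBalance(account, 50), 50).balance); // 200 (the retry doubled the effect)
console.log(setBalance(setBalance(account, 150), 150).balance);           // 150
```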

⚠️ The Danger of Retrying Without Idempotency

Retrying a non-idempotent operation can lead to data inconsistencies, duplicate records, or serious issues like spamming users. Always consider whether your operation is idempotent.

In this context, the "transactional outbox" pattern is also very useful. When saving an operation to the database, we also write the message to be sent for this operation into an "outbox" table, in the same database transaction. A separate service then reads messages from this outbox and sends them to external services. This way, if the database operation succeeds but the message never reaches the external service, the message can be resent. Because a message may be delivered more than once, the receiver still needs to handle duplicates, which is where idempotency comes back in. For side projects, this is a simpler alternative to two-phase commit.
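The flow of the outbox pattern can be sketched with an in-memory stand-in for the database. Everything here is illustrative: `createStore`, `saveOrderWithMessage`, `relayOutbox`, and the "orders" example are hypothetical, and a real version would wrap the two writes in an actual database transaction.

```javascript
// In-memory stand-in for a database with an `orders` table and an `outbox` table.
function createStore() {
  const orders = [];
  const outbox = [];
  return {
    orders,
    outbox,
    // Stand-in for one database transaction: the order row and the outbox row
    // are written together, so neither exists without the other.
    saveOrderWithMessage(order) {
      orders.push(order);
      outbox.push({
        id: outbox.length + 1,
        payload: { type: "order_created", orderId: order.id },
        sentAt: null,
      });
    },
  };
}

// Relay: try to deliver every unsent message; mark it sent only after `send` succeeds.
async function relayOutbox(store, send, now = () => new Date().toISOString()) {
  let delivered = 0;
  for (const message of store.outbox) {
    if (message.sentAt !== null) continue; // already delivered
    try {
      await send(message.payload);
      message.sentAt = now();
      delivered += 1;
    } catch (error) {
      // Leave the row unsent; the next relay run retries it.
    }
  }
  return delivered;
}

// Usage: the first relay run fails, the second one delivers the same message.
async function demo() {
  const store = createStore();
  store.saveOrderWithMessage({ id: 42, total: 99 });
  let calls = 0;
  const flakySend = async () => {
    calls += 1;
    if (calls === 1) throw new Error("network down");
  };
  const firstRun = await relayOutbox(store, flakySend);
  const secondRun = await relayOutbox(store, flakySend);
  return { firstRun, secondRun, sentAt: store.outbox[0].sentAt };
}
```

The relay marks a message as sent only after the send succeeds, so a crash between the two steps leads to a resend rather than a lost message.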

4. Create a Learning Loop: Learning from Errors and Documentation

One of the most valuable aspects of my side projects is the ability to learn from mistakes and make those lessons permanent. Simply fixing an error is not enough; understanding why it happened, taking the necessary steps to prevent similar errors in the future, and organizing this information makes a big difference in the long run. This is, in a way, performing my own personal "post-mortem" analysis.

For me, the first step in this learning loop is to set up detailed logging and retry mechanisms, as mentioned above. But the next step is to analyze these logs and error situations. In my own side projects, I often keep an "Error Log" using a simple Markdown file or a Notion page. In this log, I record the problem I encountered, when and where the error occurred, the steps I took to resolve it, and most importantly, the lessons I learned from this error.
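One way to keep such entries consistent is to generate them from a small template. The sketch below renders one entry as Markdown; the function name and field names are hypothetical, chosen to match the items listed above.

```javascript
// Render one error-log entry as Markdown; the field names are illustrative.
function formatErrorLogEntry({ date, title, where, symptom, resolution, lesson }) {
  return [
    `## ${date}: ${title}`,
    "",
    `- Where: ${where}`,
    `- What happened: ${symptom}`,
    `- How I fixed it: ${resolution}`,
    `- Lesson learned: ${lesson}`,
    "",
  ].join("\n");
}
```

The returned string can be appended to the log file, for example with Node's `fs.appendFileSync`, so that every entry ends up with the same structure.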

For example, when I ran into bloated PostgreSQL WAL (Write-Ahead Logging) files in my personal financial calculator project, I documented the situation in detail. I noted that the size of the pg_wal directory suddenly reached 10GB, filling up disk space. I recorded that I adjusted the wal_keep_size parameter and tuned the autovacuum settings to resolve the issue. These notes will guide me if I encounter a similar problem in the future. Such documentation is beneficial not only for me but also for anyone I might share the project with one day.

💡 The Importance of Documentation

Put your lessons learned from errors into writing. This will not only help you resolve the current problem but also help you identify and prevent similar issues faster in the future. The "Error Log" I keep in my own projects is one of my most valuable resources.

To complete this learning loop, I sometimes make improvements to my code. For instance, if an error consistently requires manual intervention, I try to make a small script or code change to automate this situation. This not only solves the current problem but also increases the overall reliability of the project. In my own side projects, this approach has allowed me to build more robust and less maintenance-intensive systems over time.

Conclusion: Errors Are the Fuel for Growth

Side projects offer the freedom to build and experiment on our own. However, encountering errors during these experiments is inevitable. What matters is how we deal with these errors. Systematic error logging, smart retry mechanisms, ensuring idempotency, and creating a loop for learning from errors make this process more efficient and educational. The experiences I've gained in my own projects have made me more prepared for the challenges I face not only in my personal projects but also in corporate projects. Errors are the fuel for growth, as long as we manage them correctly.

The strategies I've described in this article are approaches I've developed based on my personal experiences. Every project may have its own unique dynamics, but the fundamental principles generally remain the same. I hope this information helps you move forward with more solid steps in your own side projects. Remember, every error is a harbinger of a better next step.