
VTEX’s Tech Blog

Black Friday Tales: Stepping up and Modernizing Orphans Systems
VTEX Tech Blog · 2024-10-08 · via VTEX’s Tech Blog

One of the most rewarding experiences during my journey at VTEX has been witnessing the long-term impact of the changes we have implemented throughout the years. In 2017, our engineering team was considerably smaller, and we faced the challenge of having a single person responsible for managing multiple areas and systems. This experience highlighted our team’s resilience and adaptability and laid the groundwork for the innovative practices we employ today.

For our engineering team, November is a time of intense preparation because of the major sales events our clients host. This is when the platform's high availability and performance matter most. Our preparation involves rigorous load tests that simulate sales volumes beyond the predicted peak, so that we are ready for any situation. We focus on two main objectives: pushing the platform to its limits to identify opportunities for improving system resilience, and simulating heavy traffic loads to ensure flawless performance during Black Friday.

Before these crucial events, we held meetings with the entire engineering team to discuss concerns openly and plan preventive actions. Just before one of these meetings, someone left the team, leaving a critical system without an owner. It was time to decide who would step up and take on this responsibility.

In the run-up to Black Friday 2017, this system was critical and one of our platform's main scalability constraints. In our tests, it was often the first to exhibit performance issues that could impact our clients’ sales. During the engineering meeting, when the question of who would take on this responsibility came up, silence filled the room. Noticing that no one else was stepping forward, I decided to volunteer.

At that time, I was already managing two critical systems of the platform on my own. Despite the challenges, we successfully supported the event that year, overcoming the limitations. This particular system required little to no improvements or fixes throughout the year, demanding attention only during the preparation for the main event. In 2018, we followed the same strategy, leading to another successful and issue-free event.

However, the real challenge came in 2019. With the continued success of our business, we had experienced consistent year-over-year growth, and our platform needed to keep pace. In the first load test that year, the system was nowhere near capable of supporting the anticipated demand for the event. Although I was responsible for it, I didn’t have in-depth technical knowledge of the system. Up to that point, my work had involved only minimal maintenance. It was clear that this time, a more substantial effort was required.

With only two months left before the event and other critical systems under my responsibility, I needed to act decisively. I dived deep into the code and, over 21 days, made 48 substantial changes. Since I was also responsible for another service that depended on this system, I had enough knowledge of its API contracts to completely rewrite it, removing more than 340,000 lines of code. I simplified the business logic and improved the system's efficiency, more than doubling its capacity and significantly increasing its speed.

Given the critical nature of these changes, the team expressed concerns about the boldness of my approach. However, since we had a robust load-testing process, I was confident that the risk was acceptable, and I released the new version. The event was an absolute success, once again setting sales records on the VTEX platform.

On Black Friday 2023, this system supported a sales volume seven times greater than in 2017, using only one-third of the infrastructure that was needed in 2019. I'm proud to say that this impact was achieved by a team under my leadership, dedicated to maintaining and evolving this system. Unlike the early years, when I was the sole contributor, we now have several talented engineers focused on modernization to achieve these results.

Today, the system is no longer a bottleneck for the platform, and many hardly remember its existence, which is a source of great satisfaction for me and the team. It’s a testament to our collaborative efforts and relentless pursuit of excellence, paving the way for more innovation and success in the future.