TIL: Python Collections Optimization and Linux Display Management
2020-07-26 · via Stonecharioteer on Tech

Today I discovered practical optimizations in Python’s collections module and explored Linux display management tools that make working with multiple monitors much easier in tiling window managers.

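collections.defaultdict removes the explicit missing-key handling that a plain dict needs (an "if key not in d" check or a {}.get() call), and in grouping and counting loops it is often faster as well: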

from collections import defaultdict
import json

# Nested defaultdict for tree-like structures
def create_nested_defaultdict():
    """Create nested defaultdict for hierarchical data"""
    # Automatically creates nested structure
    tree = defaultdict(lambda: defaultdict(list))

    # Add data without checking if keys exist
    tree['animals']['mammals'].append('dog')
    tree['animals']['mammals'].append('cat')
    tree['animals']['birds'].append('eagle')
    tree['plants']['trees'].append('oak')
    tree['plants']['flowers'].append('rose')

    return tree

# Grouping data with defaultdict
def group_by_attribute(items, key_func):
    """Group items by a computed key function"""
    groups = defaultdict(list)
    for item in items:
        key = key_func(item)
        groups[key].append(item)
    return dict(groups)

# Example usage
people = [
    {'name': 'Alice', 'age': 25, 'city': 'New York'},
    {'name': 'Bob', 'age': 30, 'city': 'London'},
    {'name': 'Charlie', 'age': 25, 'city': 'New York'},
    {'name': 'Diana', 'age': 35, 'city': 'Paris'},
    {'name': 'Eve', 'age': 30, 'city': 'London'}
]

# Group by age
by_age = group_by_attribute(people, lambda p: p['age'])
print("Grouped by age:")
for age, group in by_age.items():
    print(f"  {age}: {[p['name'] for p in group]}")

# Group by city
by_city = group_by_attribute(people, lambda p: p['city'])
print("\nGrouped by city:")
for city, group in by_city.items():
    print(f"  {city}: {[p['name'] for p in group]}")

# Building indexes with defaultdict
def build_search_index(documents):
    """Build a search index from documents"""
    index = defaultdict(set)  # Use set to avoid duplicates

    for doc_id, content in enumerate(documents):
        words = content.lower().split()
        for word in words:
            # Clean word (remove punctuation)
            clean_word = ''.join(c for c in word if c.isalnum())
            if clean_word:
                index[clean_word].add(doc_id)

    return index

# Example document search
documents = [
    "Python is a programming language",
    "JavaScript is also a programming language",
    "Both Python and JavaScript are popular",
    "Learning programming languages takes time"
]

search_index = build_search_index(documents)

def search_documents(query, index, documents):
    """Search documents using the built index"""
    query_words = [word.lower() for word in query.split()]

    if not query_words:
        return []

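    # Find documents containing first word.
    # Use .get() rather than index[...]: indexing a defaultdict with a
    # missing word would insert an empty set and grow the index on every miss.
    result_docs = set(index.get(query_words[0], ()))

    # Intersect with documents containing other words
    for word in query_words[1:]:
        result_docs &= index.get(word, set())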

    return [(doc_id, documents[doc_id]) for doc_id in sorted(result_docs)]

# Test search
results = search_documents("Python programming", search_index, documents)
print(f"\nSearch results for 'Python programming': {len(results)} found")
for doc_id, content in results:
    print(f"  Doc {doc_id}: {content}")

# Counting with different data types
def advanced_counting_patterns():
    """Demonstrate various counting patterns with defaultdict"""

    # Count nested attributes
    transactions = [
        {'user': 'alice', 'category': 'food', 'amount': 25},
        {'user': 'bob', 'category': 'transport', 'amount': 15},
        {'user': 'alice', 'category': 'food', 'amount': 30},
        {'user': 'charlie', 'category': 'entertainment', 'amount': 50},
        {'user': 'bob', 'category': 'food', 'amount': 20},
    ]

    # Sum amounts per user and category (missing keys start at 0)
    user_category_counts = defaultdict(lambda: defaultdict(int))
    user_totals = defaultdict(int)
    category_totals = defaultdict(int)

    for transaction in transactions:
        user = transaction['user']
        category = transaction['category']
        amount = transaction['amount']

        user_category_counts[user][category] += amount
        user_totals[user] += amount
        category_totals[category] += amount

    print("\nTransaction analysis:")
    print("By user and category:")
    for user, categories in user_category_counts.items():
        print(f"  {user}: {dict(categories)}")

    print(f"\nUser totals: {dict(user_totals)}")
    print(f"Category totals: {dict(category_totals)}")

# Run examples
tree = create_nested_defaultdict()
print("Nested tree structure:")
print(json.dumps(tree, indent=2, default=list))

advanced_counting_patterns()
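
One caveat with the patterns above is worth a quick sketch. Reading a missing key from a defaultdict with square brackets calls the default factory and stores the result, so plain lookups can silently grow the dictionary. Membership tests and .get() do not trigger the factory. The index below is a stand-in for the one returned by build_search_index, and the variable names are illustrative only.

```python
from collections import defaultdict

index = defaultdict(set)
index["python"].add(0)

# Membership tests and .get() do not call the default factory
has_rust_before = "rust" in index
got = index.get("rust")
size_before = len(index)

# Square-bracket access on a missing key stores a new empty set
_ = index["rust"]
size_after = len(index)

# A plain dict copy raises KeyError again for unknown keys
frozen = dict(index)
try:
    frozen["go"]
    raised = False
except KeyError:
    raised = True

print(has_rust_before, got, size_before, size_after, raised)
```

Converting to a plain dict once the structure is built, as group_by_attribute does with dict(groups), is one way to avoid the accidental inserts.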

The built-in sorted() function accepts a reverse parameter, and multi-key sorts can mix ascending and descending order:

# Basic sorting with reverse parameter
numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5]

ascending = sorted(numbers)
descending = sorted(numbers, reverse=True)

print(f"Ascending:  {ascending}")   # [1, 1, 2, 3, 4, 5, 5, 6, 9]
print(f"Descending: {descending}")  # [9, 6, 5, 5, 4, 3, 2, 1, 1]

# Sorting complex objects
students = [
    {'name': 'Alice', 'grade': 85, 'age': 20},
    {'name': 'Bob', 'grade': 92, 'age': 19},
    {'name': 'Charlie', 'grade': 78, 'age': 21},
    {'name': 'Diana', 'grade': 96, 'age': 20}
]

# Sort by grade (highest first)
by_grade_desc = sorted(students, key=lambda s: s['grade'], reverse=True)
print("\nStudents by grade (highest first):")
for student in by_grade_desc:
    print(f"  {student['name']}: {student['grade']}")

# Multiple sort criteria with reverse
def multi_level_sort():
    """Demonstrate complex sorting with multiple criteria"""
    data = [
        ('Alice', 'Engineering', 85, 20),
        ('Bob', 'Arts', 92, 19),
        ('Charlie', 'Engineering', 78, 21),
        ('Diana', 'Arts', 96, 20),
        ('Eve', 'Engineering', 85, 19),
        ('Frank', 'Arts', 78, 22)
    ]

    # Sort by department (ascending), then grade (descending), then age (ascending)
    from operator import itemgetter

    # Method 1: Using multiple sorted() calls (applied in reverse order)
    result1 = sorted(data, key=itemgetter(3))       # Sort by age first
    result1 = sorted(result1, key=itemgetter(2), reverse=True)  # Then by grade (desc)
    result1 = sorted(result1, key=itemgetter(1))    # Finally by department

    # Method 2: Using tuple key with negation for reverse
    result2 = sorted(data, key=lambda x: (x[1], -x[2], x[3]))

    # Method 3: Named key function (same tuple key as Method 2, but easier to document)
    def sort_key(item):
        name, dept, grade, age = item
        return (dept, -grade, age)  # Negative grade for descending

    result3 = sorted(data, key=sort_key)

    print("Multi-level sort results:")
    print("(Name, Department, Grade, Age)")
    for item in result3:
        print(f"  {item}")

    # Verify all methods produce same result
    assert result1 == result2 == result3

# Advanced sorting patterns
def advanced_sorting_patterns():
    """Show advanced use cases for sorted() with reverse"""

    # Sort dictionary by values (descending)
    word_counts = {'apple': 23, 'banana': 45, 'cherry': 12, 'date': 67}

    # Get items sorted by count (highest first)
    sorted_items = sorted(word_counts.items(), key=lambda x: x[1], reverse=True)
    print("\nWords by frequency:")
    for word, count in sorted_items:
        print(f"  {word}: {count}")

    # Sort by string length (longest first), then alphabetically
    words = ['cat', 'elephant', 'dog', 'hippopotamus', 'ant', 'zebra']

    sorted_words = sorted(words, key=lambda w: (-len(w), w))
    print(f"\nWords by length (desc), then alphabetically: {sorted_words}")

    # Sorting custom objects by several attributes (a lower priority number sorts first)
    class Task:
        def __init__(self, name, priority, due_date):
            self.name = name
            self.priority = priority  # 1 = high, 2 = medium, 3 = low
            self.due_date = due_date

        def __repr__(self):
            return f"Task({self.name}, P{self.priority}, {self.due_date})"

    from datetime import date
    tasks = [
        Task("Fix bug", 1, date(2020, 8, 1)),
        Task("Write docs", 2, date(2020, 7, 30)),
        Task("Code review", 1, date(2020, 7, 29)),
        Task("Meeting", 3, date(2020, 7, 31)),
    ]

    # Sort by priority (high first), then due date (earliest first)
    sorted_tasks = sorted(tasks, key=lambda t: (t.priority, t.due_date))
    print("\nTasks by priority and due date:")
    for task in sorted_tasks:
        print(f"  {task}")

# Run examples
multi_level_sort()
advanced_sorting_patterns()

# Performance comparison: sorted() vs list.sort()
def sorting_performance_comparison():
    """Compare sorted() vs list.sort() performance"""
    import random
    import time

    # Generate test data
    data = [random.randint(1, 1000) for _ in range(100000)]

    # Test sorted() - creates new list
    data_copy1 = data.copy()
    start = time.perf_counter()  # monotonic, high-resolution clock for short intervals
    result = sorted(data_copy1, reverse=True)
    sorted_time = time.perf_counter() - start

    # Test list.sort() - modifies in place
    data_copy2 = data.copy()
    start = time.perf_counter()
    data_copy2.sort(reverse=True)
    sort_time = time.perf_counter() - start

    print("\nPerformance comparison (100K integers):")
    print(f"sorted():    {sorted_time:.4f}s")
    print(f"list.sort(): {sort_time:.4f}s")
    # Report the ratio neutrally: a single run is noisy, so either call may win
    print(f"sorted() took {sorted_time / sort_time:.2f}x the time of list.sort()")

    # Verify results are identical
    assert result == data_copy2

sorting_performance_comparison()
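
The negation trick in Methods 2 and 3 only works for numeric keys. When the descending key is a string, one option is to rely on the stability of Python's sort, as Method 1 does, and sort by the least significant key first. A minimal sketch, with made-up records and a hypothetical helper named sort_dept_asc_name_desc:

```python
from operator import itemgetter

def sort_dept_asc_name_desc(rows):
    """Sort (name, dept) rows by dept ascending, then name descending."""
    # Pass 1: secondary key, descending. Strings cannot be negated,
    # so reverse=True is used instead.
    by_name = sorted(rows, key=itemgetter(0), reverse=True)
    # Pass 2: primary key, ascending. sorted() is stable, so rows with
    # the same dept keep the order established in pass 1.
    return sorted(by_name, key=itemgetter(1))

rows = [
    ("Alice", "Engineering"),
    ("Bob", "Arts"),
    ("Charlie", "Engineering"),
    ("Diana", "Arts"),
]
result = sort_dept_asc_name_desc(rows)
print(result)
```

Within each department the names come out in reverse alphabetical order, while the departments themselves stay in ascending order.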

arandr provides a graphical front end to xrandr for arranging multiple monitors, which is handy in tiling window managers that ship without a display-settings panel.
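
arandr can save a layout as a shell script containing the equivalent xrandr command. A rough sketch of building such a command from Python follows. The output names eDP-1 and HDMI-1 and the helper side_by_side_command are assumptions for illustration; running xrandr --query shows the real output names on a given machine.

```python
import shlex

def side_by_side_command(primary, secondary):
    """Build an xrandr argv that places `secondary` to the right of `primary`."""
    return [
        "xrandr",
        "--output", primary, "--auto", "--primary",
        "--output", secondary, "--auto", "--right-of", primary,
    ]

# Output names are hypothetical; they differ between machines and drivers.
cmd = side_by_side_command("eDP-1", "HDMI-1")
print(shlex.join(cmd))

# To apply the layout, pass the list to subprocess.run(cmd, check=True).
```

The sketch only prints the command, so it is safe to run on a machine without an X session.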

Today’s exploration of collections.defaultdict demonstrates how choosing the right data structure can remove boilerplate and, in some loops, provide measurable performance improvements.
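
Whether the improvement is measurable depends on the workload, so it is worth timing. Below is a small sketch that uses timeit to compare three ways of grouping the same data. The function names and the synthetic word list are made up for illustration, and the timings will vary by machine and Python version.

```python
import timeit
from collections import defaultdict

# Synthetic input: 50,000 items spread over 500 distinct keys
words = [f"w{i % 500}" for i in range(50_000)]

def group_with_defaultdict():
    groups = defaultdict(list)
    for i, w in enumerate(words):
        groups[w].append(i)
    return dict(groups)

def group_with_get():
    groups = {}
    for i, w in enumerate(words):
        bucket = groups.get(w)
        if bucket is None:
            bucket = groups[w] = []
        bucket.append(i)
    return groups

def group_with_setdefault():
    groups = {}
    for i, w in enumerate(words):
        groups.setdefault(w, []).append(i)
    return groups

for fn in (group_with_defaultdict, group_with_get, group_with_setdefault):
    # Take the best of three repeats to reduce noise from other processes
    seconds = min(timeit.repeat(fn, number=5, repeat=3))
    print(f"{fn.__name__:24s} {seconds:.4f}s for 5 runs")
```

All three functions build the same mapping, so any difference in the printed timings comes from the missing-key handling alone.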

The combination of arandr and xrandr showcases how GUI tools can complement command-line utilities: arandr handles the visual arrangement, and xrandr makes the resulting layout scriptable.

Today’s learning reinforced that small optimizations in both code and environment setup can compound to create significant productivity improvements over time.