惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

Project Zero
Project Zero
F
Fortinet All Blogs
Recent Announcements
Recent Announcements
云风的 BLOG
云风的 BLOG
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
M
MIT News - Artificial intelligence
S
SegmentFault 最新的问题
Blog — PlanetScale
Blog — PlanetScale
T
Tailwind CSS Blog
WordPress大学
WordPress大学
Engineering at Meta
Engineering at Meta
S
Schneier on Security
N
News and Events Feed by Topic
N
News | PayPal Newsroom
H
Help Net Security
C
CXSECURITY Database RSS Feed - CXSecurity.com
T
The Exploit Database - CXSecurity.com
Attack and Defense Labs
Attack and Defense Labs
博客园 - Franky
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
J
Java Code Geeks
A
About on SuperTechFans
AWS News Blog
AWS News Blog
S
Secure Thoughts
The Cloudflare Blog
Hugging Face - Blog
Hugging Face - Blog
爱范儿
爱范儿
C
Cybersecurity and Infrastructure Security Agency CISA
V2EX - 技术
V2EX - 技术
Recorded Future
Recorded Future
Microsoft Azure Blog
Microsoft Azure Blog
博客园_首页
MyScale Blog
MyScale Blog
Martin Fowler
Martin Fowler
Help Net Security
Help Net Security
人人都是产品经理
人人都是产品经理
Latest news
Latest news
C
Cyber Attacks, Cyber Crime and Cyber Security
大猫的无限游戏
大猫的无限游戏
The Last Watchdog
The Last Watchdog
www.infosecurity-magazine.com
www.infosecurity-magazine.com
月光博客
月光博客
H
Hacker News: Front Page
P
Proofpoint News Feed
N
News and Events Feed by Topic
H
Heimdal Security Blog
L
Lohrmann on Cybersecurity
有赞技术团队
有赞技术团队
L
LangChain Blog
Application and Cybersecurity Blog
Application and Cybersecurity Blog

Arpit Bhayani

Temporal Primer - Building Long-Running Systems What Matters in Production RAG Structure of Every LLM Chat How LLMs Really Work Your Monolith Is Already A Distributed System Databases Were Not Designed For This BM25 JOIN Algorithms Venting at Work Comes at a Reputation Cost Why Half Your Skills Expire Every Few Years Multi-Paxos - Consensus in Distributed Databases MySQL Replication Internals Bloom Filters When You Increase Kafka Partitions Product Quantization The Q, K, V Matrices The Day I Accidentally Deleted Production How LLM Inference Works What are Blocking Queues and Why We Need Them Heartbeats in Distributed Systems How Writes Work in Apache Cassandra Redis Replication Internals How to Handle Arrogant Colleagues at Work How Does a CDN Handle Content Replication You Can't Fix Everything on Day One When Emotions Spill Over at Work Why gRPC Uses HTTP2 Meetings With No Agenda Are a Waste of Time Career Longevity Beats Constant Job Hopping Stay Relevant at Higher Salary Levels Why Distributed Systems Need Consensus Algorithms Like Raft Why Do Databases Deadlock and How Do They Resolve It Why and How Cache Locality Can Make Your Code Faster Why Eventual Consistency is Preferred in Distributed Systems Why does DNS use both UDP and TCP Should You Do a Master's My Honest Take Empathy Makes Great Engineers Unstoppable Good Mentors Build People, Not Just Skills Why You Should Always Have Back-Burner Projects Before You Push Back, Know What You're Standing On Be the One They Can Count On How Much Are People Willing to Bet on You How to Get Leadership to Say Yes to Your Project Don't Let Your Best Ideas Die in Silence Be the Person Everyone Wants to Work With The XY Problem and How to Avoid It The Startup Hiring Lie Nobody Talks About You Won't Be Promoted Unless You Ask It's Not Enough to be Right; Learn to be Heard No One Ships Great Software Alone You Don't Win by Proving Others Wrong Appreciate Generously; It Costs Nothing, But Builds Everything Your Soft Skills Aren't Soft at All Before you form an opinion, experience it Why You Need Both Curiosity and Action to Thrive A Daily Worklog Changed Everything How We Handle Mistakes Defines Us Own Your Mistakes Don't Wait. Step Up. Temporary Fixes Are Permanent Why Interviews Are Biased And What Sets You Apart Saying 'This isn't my problem' is actually the problem How to Write Effective OKRs Never Lose a Battle due to Miscommunication When In Doubt, Code It Out How to Follow Up Without Annoying People Lead Projects That Land, Execution Over Everything Abstract Thinking Will Define Your Next Decade We Engineers Suck at Task Estimation Shiny Obect Syndrome in Tech When to Change Jobs - The 3P Framework Comfort and Competition - Know When to Switch Gears Paper Notes - On-demand Container Loading in AWS Lambda Paper Notes - SQL Has Problems. We Can Fix Them Pipe Syntax In SQL Paper Notes - NanoLog - A Nanosecond Scale Logging System Don't Wait, Learn - The Best Resource is Mythical Paper Notes - WTF - The Who to Follow Service at Twitter The Unexpected Benefit of Reading Random Engineering Articles Roadmaps Are Limiting Your Growth Stop Leaving Money on the Table - Negotiate Your Job Offer Never Bad-Mouth Your Past Employers Show You're a Culture Fit Quantify your resume, Know Your Numbers The Importance of Being Likeable in Interviews Questions to Ask Your Interviewer How to Build Trust Through Collaboration Do This, Once You Are Out of the Interview Cycle Stop Pitching Ideas, Start Pitching Projects Read Those Design Docs, Even the Ones That Seem Irrelevant The Best Engineering Lessons Happen During Outages Great Engineers Start Broad LLM Summaries are Ruining Your Learning Turn System Design Interviews into Discussions Title Inflation At Work, Find Your Own Projects 6 Simple Strategies to Cracking Any Tech Interview How to Remain Unblocked Solving the Knapsack Problem with Evolutionary Algorithms Generating Pseudorandom Numbers with LFSR Local vs Global Indexes in Partitioned Databases
Structure Composition and Emulating Inheritance in C
Arpit Bhayani · 2020-06-07 · via Arpit Bhayani

C language does not support inheritance however it does support Structure Compositions which can be tweaked to serve use-cases requiring parent-child relationships. In this article, we find out how Structure Compositions help us emulate inheritance in C and keep our code extensible. We will also find how it powers two of the most important things to have ever been invented in the field of computer science.

What is structure composition?

Structure Composition is when we put one structure within another, not through its pointer but as a native member - something like this

// this structure defines a node of a linked list and
// it only holds the pointers to the next and the previous
// nodes in the linked list.
struct list_head {
	struct list_head *next; // pointer to the node next to the current one
	struct list_head *prev; // pointer to the node previous to the current one
};

// list_int holds an list_head and an integer data member
struct list_int {
	struct list_head list;  // common next and prev pointers
	int value;              // specific member as per implementation
};

// list_int holds an list_head and an char * data member
struct list_str {
	struct list_head list;  // common next and prev pointers
	char * str;             // specific member as per implementation
};

In the example above, we define a node of a linked list using structure composition. Usually, a linked list node has 3 members - two pointers to adjacent nodes (next and previous) and a third one could either be the data or a pointer to it. The defining factor of a linked list is the two pointers that logically form a chain of nodes. To keep things abstract we create a struct named list_head which holds these two pointers next and prev and omits the specifics i.e. data.

Using list_head structure, if we were to define a node of a linked list holding an integer value we could create another struct, named list_int that holds a member of type list_head and an integer value value. The next and previous pointers are brought into this struct through list_head list and could be referred to as list.next and list.prev.

There is a very genuine reason for picking such weird names for a linked list node and members of structures; the reason to do so will be cleared in the later sections of this essay.

Because of the above structure definition, building a linked list node holding of any type becomes a breeze. For example, a node holding string could be quickly defined as a struct list_str having list_head and a char *. This ability to extend list_head and build a node holding data of any type and any specifics make low-level code simple, uniform, and extensible.

Memory Representation of list_int

Structures in C are not padded and they do not even hold any meta information, not even for the member names; hence during allocation, they are allocated the space just enough to hold the actual data.

https://user-images.githubusercontent.com/4745789/83953834-694a6a00-a861-11ea-8ff7-fa69af6af7d6.png

In the illustration above we see how members of list_int are mapped on the allocated space - required by its individual members. It is allocated a contiguous space of 12 bytes - 4 bytes for each of the two pointers and another 4 bytes for the integer value. The contiguity of space allocation and order of members during allocation could be verified by printing out their addresses as shown below.

void print_addrs() {
    // creating a node of the list_int holding value 41434
    struct list_int *ll = new_list_int(41434);

    // printing the address of individual members
    printf("%p: head\n",             head);
    printf("%p: head->list.next\n",  &((head->list).next));
    printf("%p: head->list.prev\n",  &((head->list).prev));
    printf("%p: head->value\n",      &(head->value));
}

~ $ make && ./a.out
0x4058f0: head
0x4058f0: head->list.next
0x4058f4: head->list.prev
0x4058f8: head->value

We clearly see all the 3 members, occupying 12 bytes contiguous memory segments in order of their definition within the struct.

The above code was executed on a machine where the size of integer and pointers were 4 bytes each. The results might differ depending on the machine and CPU architecture.

Casting pointers pointing to struct

In C language, when a pointer to a struct is cast to a pointer to another struct, the engine maps the individual members of a target struct type, depending on their order and offsets, on to the slice of memory of the source struct instance.

When we cast list_int * into list_head *, the engine maps the space required by target type i.e. list_head on space occupied by list_int. This means it maps the 8 bytes required by list_head on the first 8 bytes occupied by list_int instance. Going by the memory representation discussed above, we find that the first 8 bytes of list_int are in fact list_head, and hence casting list_int * to list_head * is effectively just referencing the list_head member of list_int through a new variable.

https://user-images.githubusercontent.com/4745789/83943610-2e254800-a81b-11ea-8b25-056e1b1df85e.png

This effectively builds a parent-child relationship between the two structs where we can safely typecast a child list_int to its parent list_head.

It is important to note here that the parent-child relationship is established only because the first member of list_int is of type list_head. it would not have worked if we change the order of members in list_int.

How does this drive inheritance?

As established above, by putting one struct within another as its first element we are effectively creating a parent-child relationship between the two. Since this gives us an ability to safely typecast child to its parent we can define functions that accept a pointer to parent struct as an argument and perform operations that do not really require to deal with specifics. This allows us to NOT rewrite the functional logic for every child extensions and thus avoid redundant code.

From the context we have set up, say we want to write a function that adds a node between the two in a linked list. The core logic to perform this operation does not really need to deal with any specifics all it takes is a few pointer manipulations of next and prev. Hence, we could just define the function accepting arguments of type list_head * and write the function as

/*
 * Insert a new entry between two known consecutive entries.
 *
 * This is only for internal list manipulation where we know
 * the prev/next entries already!
 */
static void __list_add(struct list_head *new,
                       struct list_head *prev,
                       struct list_head *next)
{
    next->prev = new;
    new->next = next;
    new->prev = prev;
    prev->next = new;
}

Since we can safely typecase list_int * and list_str * to list_head * we can pass any of the specific implementations the function __list_add and it would still add the node between the other two seamlessly.

Since the core operations on linked lists only require pointer manipulations, we can define these operations as functions accepting list_head * instead of specific types like list_int *. Thus we need not write similar functions for specifics. A function to delete a node could be written as

/*
 * Delete a list entry by making the prev/next entries
 * point to each other.
 *
 * This is only for internal list manipulation where we know
 * the prev/next entries already!
 */
static inline void __list_del(struct list_head * prev, struct list_head * next)
{
    next->prev = prev;
    prev->next = next;
}

Other linked list utilities like adding a node to tail, swapping nodes, splicing the list, rotating the list, etc only require manipulations of next and prev pointers. Hence they could also be written in a very similar way i.e accepting list_head * and thus eliminating the need to reimplement function logic for every single child implementation.

This behavior is very similar to how inheritance in modern OOP languages, like Python and Java, work where the child is allowed to invoke any parent function.

Who uses structure compositions?

There are a ton of practical usage of using Structure Compositions but the most famous ones are

Linux Kernel

In order to keep things abstract and extensible, Linux Kernel uses Structure Composition at several places. One of the most important places where it uses composition is for managing and maintaining Linked Lists, exactly how we saw things above. The struct definitions and code snippets are taken as-is from the Kernel’s source code, and hence the structure and variable names look different than usual.

Python Type and Object Hierarchy

Python, one of the most important languages in today’s world, uses Structure Composition to build Type Hierarchy. Python defines a root structure called PyObject which holds reference count, defining the number of places from which the object is referenced - and object type - determining the type of the object i.e. int, str, list, dict, etc.

typedef struct _object {
    Py_ssize_t     ob_refcnt;  // holds reference count of the object
    PyTypeObject   *ob_type;   // holds the type of the object
} PyObject;

Since Python wants these fields to be present in every single object that is created during runtime, it uses structure composition to ensure that objects like integers, floats, string, etc put PyObject as their first element and thus establishing a parent-child relationship. A Float object in Python is defined as

#define PyObject_HEAD PyObject ob_base;

typedef struct {
    PyObject_HEAD
    double ob_fval;    // holds the actual float value
} PyFloatObject;

Now writing utility functions that increments and decrements references count on every access of any object could be written as just a single function accepting PyObject * as shown below

static inline void _Py_INCREF(PyObject *op) {
    op->ob_refcnt++;
}

Thus we eradicate a need of rewriting INCREF for every single object type and just write it once for PyObject and it will work for every single Python object type that is extended through PyObject.

References