C array types are weird; and related topics
anselmschuel · 2026-05-25 · via Lobsters

In this article I’ll explain what I find weird about them, what I’d do differently, and ramble on a few related things.

Technically speaking, an array type T[n] (for some n) is distinct from a pointer type T *. A value of type T[n] represents a contiguous sequence of T values in memory, n long.

But you can’t actually work with values of type T[n] directly. With a few exceptions, any expression of that type is immediately converted to a pointer of type T *, namely a pointer to the first element.

Since the array indexing operator arr[ix] actually operates on pointers, acting like *(arr + ix), you can basically treat arrays like pointers.

An important instance where this doesn’t happen is in sizeof arr, which returns sizeof(T) × n.

int arr[3] = {10, 20, 30};
int *arr_ptr = arr;
size_t arr_size = sizeof(arr);
size_t ptr_size = sizeof(arr_ptr);
// These may (and likely will) be different
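
Here is a minimal sketch of the two points above as checkable C. The function names (array_bytes, pointer_bytes, indexing_demo) are made up for illustration; the assertions only state what the language guarantees, not any particular sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the array itself: arr does not decay inside sizeof. */
size_t array_bytes(void) {
    int arr[3] = {10, 20, 30};
    return sizeof(arr);
}

/* Size of a pointer to the first element. */
size_t pointer_bytes(void) {
    int arr[3] = {10, 20, 30};
    int *arr_ptr = arr;
    return sizeof(arr_ptr);
}

/* Indexing behaves the same on the array and on the decayed pointer. */
void indexing_demo(void) {
    int arr[3] = {10, 20, 30};
    int *arr_ptr = arr;              /* arr decays to &arr[0] */
    assert(arr_ptr == &arr[0]);
    assert(arr[1] == *(arr + 1));    /* arr[ix] is *(arr + ix) */
    assert(arr_ptr[2] == arr[2]);
}
```

array_bytes() is always 3 * sizeof(int), while pointer_bytes() is sizeof(int *), whatever that happens to be on the target.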

Additionally, in function signatures, any array type you give to a parameter is actually interpreted as a pointer type instead. The n denoting the size is completely discarded. That means that, as an exception to the exception, sizeof arr in a function with a parameter T arr[n] will not evaluate to sizeof(T) × n.

size_t foo(char buf[6]) {
    return sizeof(buf);
}

char msg[6] = "!! ??";
size_t msg_size = sizeof(msg);
size_t msg_size_in_fn = foo(msg);
// These may (and likely will) be different

Note that you can write char buf[static 8] to “enforce” the length, but this just makes it undefined behaviour if you pass a pointer to a shorter array. Similar to restrict, all it does is aid the compiler in optimisation.
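
A small sketch of what the static form looks like in practice, here with a length of 6 rather than 8. The function names are hypothetical. The _Generic check (C11) is just one way to observe that the parameter is still a plain pointer; some compilers can also warn when they see a too-short array being passed, but that is a courtesy, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* "static 6" promises the callee at least 6 readable elements.
   Passing fewer is undefined behaviour. */
size_t count_bangs(const char buf[static 6]) {
    size_t n = 0;
    for (size_t i = 0; i < 6; i++) {
        if (buf[i] == '!') {
            n++;
        }
    }
    return n;
}

/* Returns 1 if the parameter has type const char *, 0 otherwise. */
int param_is_pointer(const char buf[static 6]) {
    return _Generic(buf, const char *: 1, default: 0);
}

void static_param_demo(void) {
    char msg[6] = "!! ??";
    assert(count_bangs(msg) == 2);
    assert(param_is_pointer(msg));   /* the 6 is not part of the type */
}
```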

Instead, you can use a pointer to the array as the argument. Rather than letting the array decay to T *, a pointer to the first element, you take its address at the call site to get a T (*)[n]. The two pointers hold the same address at run-time, but the pointer-to-array type preserves the length information. It is inconvenient and confusing to write, though.

size_t foo(char (*buf)[6]) {
    return sizeof(*buf);
}

char msg[6] = "?? !!";
size_t msg_size = sizeof(msg);
size_t msg_size_in_fn = foo(&msg);
// These will be the same
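
To show that the preserved length is actually usable, here is a sketch of a function that derives its loop bound from the parameter type. The name fill_six is an assumption for this example.

```c
#include <assert.h>
#include <stddef.h>

/* The parameter type char (*)[6] keeps the length in the type,
   so the loop bound can be computed from it. */
size_t fill_six(char (*buf)[6], char c) {
    size_t n = sizeof(*buf) / sizeof((*buf)[0]);   /* always 6 */
    for (size_t i = 0; i < n; i++) {
        (*buf)[i] = c;
    }
    return n;
}

void pointer_to_array_demo(void) {
    char msg[6] = "?? !!";
    assert(fill_six(&msg, 'x') == 6);
    assert(msg[0] == 'x' && msg[5] == 'x');
}
```

Passing the address of a char[5] here is a type mismatch, since char (*)[5] and char (*)[6] are incompatible pointer types, so the compiler has to diagnose it.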

Aside: Functions

Interestingly, there’s a second kind of type in C that acts very similarly, but isn’t nearly as confusing. That kind of type is functions.

Like arrays, function values immediately coerce to function pointers. Unlike arrays, however, dereferencing an expression that refers to a function, e.g. *fn, gives you something you can call in exactly the same way as the plain symbol.

void foo() {}
(*foo)();
foo();

While writing &arr for an array does actually give you a pointer-to-array type T (*)[n], &fn is completely equivalent to fn. That’s because an array arr doesn’t decay to &arr, it decays to &arr[0], whereas a function fn does automatically convert to exactly &fn.

Note that for both arrays and functions, they don’t decay when given as arguments to the & operator, which is why &arr isn’t a pointer-to-pointer.

Additionally, writing T fn() or T (*fn)() in a function’s parameter list means the same thing: the first gets automatically adjusted to the second, very much like array types being automatically adjusted to pointer types.
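
The function rules from this aside can be checked in a few lines. This is only a sketch; twice, apply_a, apply_b and the rest are names invented for the example.

```c
#include <assert.h>

int twice(int x) { return 2 * x; }

/* Parameter declared with a function type; adjusted to int (*f)(int). */
int apply_a(int f(int), int x) { return f(x); }

/* Parameter declared with the pointer type directly. */
int apply_b(int (*f)(int), int x) { return (*f)(x); }

/* Both functions have the same type, so one pointer type fits both. */
int (*const apply_fns[2])(int (*)(int), int) = {apply_a, apply_b};

void function_forms_demo(void) {
    int (*p)(int) = twice;    /* function designator converts to a pointer */
    int (*q)(int) = &twice;   /* explicit address-of gives the same pointer */
    int (*r)(int) = *twice;   /* *twice is a function again, converts back */
    assert(p == q && q == r);
    assert(twice(3) == (*twice)(3));
    assert((*twice)(3) == (**twice)(3));
}
```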

Arrays by value

Fundamentally, an array type is similar to a struct with all members being of the same type. But arrays are often used in a way that structs aren’t. We rarely get the address of the second member of a struct. This is probably because an array with its head shifted remains an array, just of a different size. Since we often ignore, or are ignorant of, the size of an array, this is a natural way to deal with arrays.

I think it would’ve been much easier to mentally model the situation if C had employed a strict separation of arrays and pointers.

Arrays should act just like structs. Passing a char[5] to a function should pass the actual five values in the array. It should be like having five char arguments to the function.

int compute(int arr[3]) {
    arr[2] += arr[1];
    arr[1] *= arr[0];
    arr[0] *= (arr[1] + arr[2]);
    return arr[0] - arr[2];
}

int arr[3] = {10, 20, 30};
int result = compute(arr);
// arr is not modified
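
For comparison, today’s C can approximate this by wrapping the array in a struct, since structs are passed by value. The sketch below runs the same arithmetic both ways; struct int3 and both function names are assumptions made for this example, not anything the language provides.

```c
#include <assert.h>

/* Today's C: the parameter is really int *, so the caller's array changes. */
int compute_in_place(int arr[3]) {
    arr[2] += arr[1];
    arr[1] *= arr[0];
    arr[0] *= (arr[1] + arr[2]);
    return arr[0] - arr[2];
}

/* Hypothetical wrapper: a struct holding the array is copied on the call. */
struct int3 { int v[3]; };

int compute_by_value(struct int3 a) {
    a.v[2] += a.v[1];
    a.v[1] *= a.v[0];
    a.v[0] *= (a.v[1] + a.v[2]);
    return a.v[0] - a.v[2];
}

void by_value_demo(void) {
    int arr[3] = {10, 20, 30};
    struct int3 wrapped = {{10, 20, 30}};

    assert(compute_in_place(arr) == compute_by_value(wrapped));
    assert(arr[0] != 10);           /* the bare array was modified */
    assert(wrapped.v[0] == 10);     /* the wrapped one was not */
}
```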

A pointer to an array would therefore involve only one level of indirection. If you wanted to treat an array like a pointer, you’d have to manually write &arr[0] to get a pointer to the first element of arr.

void toggle(bool *flag) {
    *flag = !*flag;
}

bool arr[2] = {true, true};
toggle(&arr[1]);

The most obvious immediate benefit is that this makes the language less confusing to learn. It’s very easy to be confused, as a beginner, by the fact that writing to an array inside a function does change the array outside the function, but the same isn’t true for structs.

Normally, having to write & at the call site makes this delightfully explicit and easy to understand in C. In fact, C is, in this respect, much simpler and easier to understand than languages like Python, where objects are pointers by default, and C++, where an argument may be passed by reference depending on the function signature without any change to the call site.

The most immediate downside is that the arrays are being copied all the time. I don’t think that necessarily detracts from the idea. It would just mean that you have to be smart about using it, and it would give the programmer more choices, not fewer. (Still not as overwhelmingly many choices as something like C++, in case you’re worried about that.)

The compiler could, of course, also choose to implement these arrays using pointers, even selectively, when it suits its purposes. That could leave the more intuitive semantics intact.

The @ operator

How would you construct such an array from a pointer? Writing (char[3]){*arr, *(arr + 1), *(arr + 2)} would be very tedious indeed. Luckily, there is prior art for this.

GDB, the debugger, has an expression system, and it extends C’s syntax with the @ operator, used to imbue a memory address with a length to make it an array.

However, it doesn’t actually take a memory address as its operand. Rather, it acts on expressions like *ptr, which have an address, instead of ones that are an address.

(gdb) list
1   int main() {
2       int arr[4] = {10, 20, 30, 40};
3       int *at_ix_1 = arr + 1;
4   }
(gdb) break 4
(gdb) run
Breakpoint 1, main ()
4   }
(gdb) print *at_ix_1
$1 = 20
(gdb) print *at_ix_1@1
$2 = {20}
(gdb) print *at_ix_1@2
$3 = {20, 30}
(gdb) print *(at_ix_1 + 1)@2
$4 = {30, 40}
(gdb) print *(at_ix_1 - 1)@4
$5 = {10, 20, 30, 40}

(The GDB diagnostic output has been slightly simplified for this example)

This is analogous to how things like = already work. We can write *ptr = 2, since *ptr is not just a value, but a value with a particular location in memory that can be written to. You cannot write 2 = 2. We call these expressions place expressions, or lvalues.

Similarly, you write *ptr@10 to get an array whose first element is *ptr and has 9 elements after that. But you cannot write 2@10. You would first have to give the 2 a place.

int x = 2;
int x_arr[1] = x@1;

I think this is a neat way for this operator to work. It could in theory be extended to allow for things like

struct coords_3d {
    int x;
    int y;
    int z;
} some_point;
struct coords_2d {
    int x;
    int y;
} some_point_projected = some_point.x@2;

This feels a bit unnatural in this case. I think this might be due to the fact that, unlike with arrays, a part of a struct type isn’t really quite as easy to relate to the original struct type. We rarely deal with structs where we only know some of the fields, which might be analogous to an array where we don’t know the size. Slicing structs, when it occurs, like in the Berkeley socket APIs, is unusual and feels like a bit of a hack.

The way in which we understand arrays of unknown size as a pointer is, in fact, an example of a broader pattern, where we hide some object we can’t deal with directly behind some opaque handle. Then, we have some way of supplying the missing information to actually operate on the object.

In a C array, that missing information may be the length, which is then supplied from any number of sources.

We may store that information alongside the array, either in memory, next to the array, but at a static offset, or alongside the pointer in our local variables (or wherever the pointer may reside).

Storing it together with the pointer is what we call a wide pointer. This is e.g. how std::vector in C++ may be implemented, and it’s what Rust uses automatically to let you take references to unsized types like slices: a &[T] stores its length alongside the address.

We’re effectively already doing this in C whenever we take parameters like size_t len, char *buf. Taking two arguments is equivalent to taking a two-member struct, and that two-member struct, if we were to extract it as its own type, is a wide pointer.
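
As a sketch of that extraction, here is the two-argument form next to a struct that bundles the same two values. struct char_slice and the function names are hypothetical; nothing like this exists in the standard library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wide-pointer type: a pointer bundled with its length. */
struct char_slice {
    char  *ptr;
    size_t len;
};

/* The conventional two-argument form. */
size_t count_char(size_t len, const char *buf, char c) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == c) {
            n++;
        }
    }
    return n;
}

/* The same operation taking the bundle instead. */
size_t slice_count_char(struct char_slice s, char c) {
    return count_char(s.len, s.ptr, c);
}

/* Sub-slice covering [start, start + len); the caller must stay in bounds. */
struct char_slice slice_sub(struct char_slice s, size_t start, size_t len) {
    assert(start <= s.len && len <= s.len - start);
    return (struct char_slice){ s.ptr + start, len };
}
```

Shifting the head of the array then updates the length in the same step, which is exactly the bookkeeping a bare pointer leaves to the programmer.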

Storing that additional data in memory just before the actual data is what e.g. C++ derived classes with virtual methods do.
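
In C itself, one common way to keep the length in memory next to the elements is a small header followed by a flexible array member. The sketch below assumes a made-up struct lp_buf with helpers lp_buf_new and lp_buf_at; it illustrates the layout and is not a reference to any particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed buffer: the length sits at a fixed
   offset in front of the elements, inside the same allocation. */
struct lp_buf {
    size_t len;
    char   data[];   /* flexible array member (C99) */
};

/* Allocate header and elements together; returns NULL on failure. */
struct lp_buf *lp_buf_new(const char *src, size_t len) {
    struct lp_buf *b = malloc(sizeof *b + len);
    if (b == NULL) {
        return NULL;
    }
    b->len = len;
    memcpy(b->data, src, len);
    return b;
}

/* Bounds-checked read, using the length stored next to the data. */
char lp_buf_at(const struct lp_buf *b, size_t i) {
    assert(i < b->len);
    return b->data[i];
}
```

The handle is a single pointer, and the missing information is recovered from memory rather than from a second variable.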

Getting back to my improved C arrays, you could therefore convert back and forth like this:

char arr[4] = {'x', 'y', 'z', 'w'};
char *arr_ptr = &arr[0];
char arr_again[4] = *arr_ptr@4;

Slicing an array is very natural in this syntax:

int iota[4] = {0, 1, 2, 3};
int one_two[2] = iota[1]@2;
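
For reference, the closest spelling in today’s C that I’m sure of is a memcpy from the address of the first element. This is only a rough stand-in for the hypothetical @ syntax, and slice_ints is a name invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n ints starting at *first into out: a stand-in for first@n. */
void slice_ints(int *out, const int *first, size_t n) {
    memcpy(out, first, n * sizeof *first);
}

void slice_copy_demo(void) {
    /* Stand-in for: char arr_again[4] = *arr_ptr@4; */
    char arr[4] = {'x', 'y', 'z', 'w'};
    char *arr_ptr = &arr[0];
    char arr_again[4];
    memcpy(arr_again, arr_ptr, sizeof arr_again);
    assert(memcmp(arr, arr_again, sizeof arr) == 0);

    /* Stand-in for: int one_two[2] = iota[1]@2; */
    int iota[4] = {0, 1, 2, 3};
    int one_two[2];
    slice_ints(one_two, &iota[1], 2);
    assert(one_two[0] == 1 && one_two[1] == 2);
}
```

Note that the length appears twice in each case, once in the destination’s type and once in the copy, which is the redundancy the @ operator would remove.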

Obviously, it would be equally possible to have the syntax ptr@n instead, without needing the dereference. You could still write something like (&iota[2])@3. I think it looks less nice though, and gives you less insight about how place expressions and the like work.

There are some rough edges here. If you’re just shifting the beginning of the array, you write:

int arr[2] = {10, 20};
arr = arr[1]@1;

But that requires stating the new length explicitly. If you have some kind of operator to get the array size defined like sizeof(arr)/sizeof(T), you could use that. It’s tedious and ugly nonetheless.
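
In today’s C that operator is usually written as a macro. A sketch, with ARRAY_LEN as an assumed name (many codebases have their own variant):

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a real array. Only meaningful where arr has not
   decayed, so never on a function parameter. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Shift the head of a two-element array by one and return the new length. */
size_t array_len_demo(void) {
    int arr[2] = {10, 20};
    int *tail = &arr[1];                    /* today's "shifted" array */
    size_t tail_len = ARRAY_LEN(arr) - 1;   /* the length is carried by hand */
    assert(tail[0] == 20);
    return tail_len;
}
```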

The three obvious solutions are to allow arr + 1, to automatically infer the length with a special syntax, e.g. arr[1]@..., or to make a new custom operator, e.g. arr +@ 1.

Since I can’t actually redesign C, and I’m not currently writing a new language, and this probably isn’t that common, I’ll give no specific recommendation.

Aside: ->

As a last note, I’ll mention the -> operator. That one is similar to the @ operator in that whether it deals in pointers or place expressions is kind of arbitrary.

Right now, the expression ptr->foo denotes the value (*ptr).foo, with a dereference included for free. To get the address, you write &ptr->foo. But it could’ve just as easily been defined as &(*ptr).foo. Then, to get the value, you’d write *ptr->foo.

Right now, to get nested values from pointers to structs, you write ptr->foo.bar. With the alternate ->, you’d write ptr->foo->bar (for the pointer).

One might say that ptr->foo.bar shows that there’s only actually one pointer being followed, and ptr->foo isn’t itself a pointer. But the alternate syntax would show that too, since you’d write *ptr->foo->bar to actually get at the value.

This is a very ill-substantiated feeling, and possibly entirely wrong, but I have a very slight preference for ptr->foo->bar. Working entirely in the realm of pointers is, to me, slightly more reflective of the fact that the compiler only actually has to apply one offset.

But ptr->foo.bar is more reflective of the neat interplay between place expressions, the dereference operator, and the address-of operator. Since I praised that so much above, perhaps some of my feelings are hypocritical.
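
The “only one offset” observation can be checked in today’s C. In this sketch the struct layouts and function names are invented for the example; bar_ptr_chained spells out, with an intermediate pointer, what the alternate ptr->foo->bar would mean.

```c
#include <assert.h>
#include <stddef.h>

struct inner { int pad; int bar; };
struct outer { int pad; struct inner foo; };

/* Current syntax: one expression, address taken at the end. */
int *bar_ptr_direct(struct outer *ptr) {
    return &ptr->foo.bar;
}

/* Staying in the realm of pointers at every step. */
int *bar_ptr_chained(struct outer *ptr) {
    struct inner *foo_ptr = &ptr->foo;
    return &foo_ptr->bar;
}

/* Byte distance from ptr to ptr->foo.bar. */
ptrdiff_t bar_offset(struct outer *ptr) {
    return (char *)bar_ptr_direct(ptr) - (char *)ptr;
}

void arrow_demo(void) {
    struct outer o = {0};
    assert(bar_ptr_direct(&o) == bar_ptr_chained(&o));
    /* A single combined offset from the base pointer. */
    assert(bar_offset(&o) ==
           (ptrdiff_t)(offsetof(struct outer, foo) + offsetof(struct inner, bar)));
}
```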