Is there a well-defined class of code whose expression in Rust will perform better than what a sufficiently advanced optimizing compiler (say SBCL) would emit?
Well, no, it's SBCL. Common Lisp has support for types, but most compilers only use them for optimization; SBCL goes one step further and emits warnings when you mismatch types. And looking at the code, I can see lots of type declarations.
It's also interesting to note that the code does not seem to be using SBCL's new SIMD library*, so it could be sped up even more.
The most common example is aliasing. Since the Rust type system promises that you won't have two mutable aliases to the same data, even data coming from an unknown caller, the compiler can in theory make optimizations that aren't possible in other languages that permit multiple mutable aliases.
Especially for values that cross the boundaries of compilation units, a compiler for another language cannot possibly have enough information to know whether aliasing could occur.
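To make the aliasing point concrete, here is a minimal sketch (the function name `compute` is just for illustration). Because `input` is a shared reference and `output` is a mutable one, the two cannot refer to the same memory, so the optimizer is allowed (though not guaranteed) to read `*input` once and keep it in a register across the writes to `*output`:

```rust
// Illustrative only: `&u32` and `&mut u32` can never alias in safe Rust,
// so a write through `output` cannot change the value behind `input`.
pub fn compute(input: &u32, output: &mut u32) {
    // The compiler may load `*input` a single time and reuse it below.
    if *input > 10 {
        *output = 1;
    }
    // In C without `restrict`, the store above could have modified `*input`
    // (if both pointers were equal), forcing a second load here.
    if *input > 5 {
        *output *= 2;
    }
}
```

Calling it with the same variable for both arguments, e.g. `compute(&x, &mut x)`, is rejected by the borrow checker, which is what makes the assumption safe even across compilation units.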
Alias analysis in Rust is great, but friends of mine worked on it in optimizing compilers in Urbana-Champaign in the early 1990s, so it's not new; it's possible Rust is its most widespread rollout.
It's possible to specify no alias in C with the `restrict` keyword, but it isn't a widely used feature. With Rust, a whole bunch of bugs related to it were found and fixed in LLVM, so I think you're correct that it is the most widespread rollout.
I mean... if the compiler isn't smart enough to see what you were trying to do and generate some ridiculously fast implementation, is it truly worthy of being called "sufficiently advanced"? ;P You just know someone is going to try to add in an LLM-based (note the lack of a V there) compiler pass that says "give me a faster algorithm that does the same thing as this function" (and then all hell will break loose when it starts injecting random bullshit into the function it sort of grokked, or decides your code is for some evil ad company and deletes it on purpose... oh, wait: maybe this is a great idea).
Maybe CPU bound code where either your data fits on the stack or you can avoid a lot of copying chunks of the heap by expressing your algorithm in terms of in-place mutation?
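A rough sketch of the difference, with hypothetical function names: the first version allocates a fresh heap buffer and copies into it, the second mutates the caller's buffer in place and allocates nothing. Whether this matters in practice depends on the workload.

```rust
/// Returns a scaled copy: one heap allocation plus a full pass of writes
/// into new memory.
pub fn scaled_copy(data: &[f64], k: f64) -> Vec<f64> {
    data.iter().map(|x| x * k).collect()
}

/// Scales the caller's buffer in place: no allocation, no copy.
/// The exclusive `&mut` borrow guarantees nobody else observes the
/// intermediate state.
pub fn scale_in_place(data: &mut [f64], k: f64) {
    for x in data.iter_mut() {
        *x *= k;
    }
}
```

With a fixed-size array such as `let mut buf = [1.0, 2.0, 3.0];` the data lives on the stack, and `scale_in_place(&mut buf, 2.0)` never touches the heap.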
That is where garbage collected languages seem to be incurring penalties. Not sure what SBCL does specifically. There was this talk last October about efforts at Jane Street to make explicit stack allocations possible in OCaml:
Not sure how C++ compilers or SBCL handle auto-vectorization. The Rust compiler is able to auto-vectorize tight loops to use SIMD without much prompting. E.g. the following uses SIMD automatically.
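A minimal sketch of the kind of tight loop meant here (this is not the commenter's original snippet, and the name `add_slices` is made up). With optimizations enabled (e.g. `cargo build --release`), LLVM will typically turn a loop like this into SIMD adds, though that depends on the target CPU and compiler version:

```rust
/// Element-wise `dst[i] += src[i]` over the common prefix of both slices.
/// `zip` stops at the shorter slice, so there are no per-element bounds
/// checks in the loop body, and `&mut` vs `&` guarantees the slices do
/// not overlap; both facts help the auto-vectorizer.
pub fn add_slices(dst: &mut [f32], src: &[f32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s;
    }
}
```

One way to check is to look at the generated assembly (for instance with `--emit asm`) and see whether packed instructions show up for the loop.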