
ltratt · 2026-04-16T14:42:25 1776350545

> If this project would be able to detect the interpreter hotspots itself and completely automate the procedure, it would be great.

I don't think that's realistic; or, at least, not if you want good performance. You need to use quite a bit of knowledge about your context to know when best to add optimisation hints. That said, it's not impossible to imagine an LLM working this out, if not today, then perhaps in the not-too-distant future! But that's above my pay grade.

moardiggin · 2026-04-16T14:52:48 1776351168

Thanks for sharing this technology. I hope it gets upstreamed into LLVM.

ltratt · 2026-04-16T07:00:33 1776322833

You're quite right that since we're working with LLVM IR, adapting to other languages is probably not _that_ difficult, though these things always end up taking more time than I expect! Since the majority of real-world problems in this area depend on C interpreters, we put our limited resources to that problem. You're also right that "interpreters" is a pretty vague category, and there are other parts of C (and other) programs that could be yk-ified, though I suspect it would be a fairly specialised subset of programs.

ltratt · 2026-04-16T06:56:06 1776322566

Our fork of LLVM does add a pass, amongst other changes, but we also have to do things like change stackmaps in a way that breaks compatibility. Whether stackmaps in their current incarnation are worth retaining compatibility for is above my pay grade! So some of our changes are probably upstreamable, but some might be considered too niche for wider integration.

ltratt · 2025-11-12T19:36:28 1762976188

I'm assuming you're referring to the Python finaliser example? If so, there's no syntax sugar hiding function calls to finalisers: you can verify that by running the code on PyPy, where the point at which the finaliser is called is different. Indeed, for this short-running program, the most likely outcome is that PyPy won't call the finaliser before the program completes!

ltratt · 2025-10-15T18:53:42 1760554422

We don't exactly want Alloy to have to be conservative, but Rust's semantics allow pointers to be converted to usizes (in safe mode) and back again (in unsafe mode), and this is something code really does. So if we wanted to provide an Rc-like API -- and we found reasonable code really does need it -- there wasn't much choice.
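To make the round trip concrete, here is a minimal sketch in plain Rust. The names `hide` and `round_trip` are invented for illustration and have nothing to do with Alloy's API. Taking the address needs no `unsafe`, and only the final dereference does. Between those two points the only record of the object's address is an integer, which a collector tracing typed pointers alone would have no reason to look at.

```rust
/// Safe code: turn a reference into a plain integer.
/// (Hypothetical helper name, for illustration only.)
fn hide(r: &u64) -> usize {
    r as *const u64 as usize
}

/// Round-trips a heap address through a `usize` and reads the value back.
fn round_trip() -> u64 {
    let boxed = Box::new(42u64);

    // Safe: from here on, `addr` is just a number as far as the type
    // system is concerned.
    let addr: usize = hide(&*boxed);

    // Casting the integer back to a raw pointer is also safe...
    let ptr = addr as *const u64;

    // ...only dereferencing it requires `unsafe`. This is sound here
    // because `boxed` is still alive and has not moved.
    unsafe { *ptr }
}
```

Calling `round_trip()` returns `42`.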

I don't think Rust's design in this regard is ideal, but then again what language is perfect? I designed languages for a long while and made far more, and much more egregious, mistakes! FWIW, I have written up my general thoughts on static integer types, because it's a surprisingly twisty subject for new languages https://tratt.net/laurie/blog/2021/static_integer_types.html

quotemstr · 2025-10-15T21:18:13 1760563093

> We don't exactly want Alloy to have to be conservative, but Rust's semantics allow pointers to be converted to usizes (in safe mode) and back again (in unsafe mode), and this is something code really does. So if we wanted to provide an Rc-like API -- and we found reasonable code really does need it -- there wasn't much choice.

You can define a set of objects for which this transformation is illegal --- use something like pin projection to enforce it.

ltratt · 2025-10-15T21:27:47 1760563667

The only way to forbid it would be to forbid creating pointers from `Gc<T>`. That would, for example, preclude a slew of tricks that high performance language VMs need. That's an acceptable trade-off for some, of course, but not all.

quotemstr · 2025-10-15T21:33:03 1760563983

Not necessarily. It would just require that deriving these pointers be done using an explicit lease that would temporarily defer GC or lock an object in place during one. You'd still be able to escape from the tyranny of conservative scanning everything.
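A rough sketch of what such a lease might look like, purely as an illustration of the idea: `GcBox`, `Lease`, `lease`, and `is_pinned` are all hypothetical names, there is no real collector behind them, and nothing here is taken from Alloy. The assumption is that a collector would check the pin count before moving or freeing an object.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a GC-managed object with a pin count.
struct GcBox<T> {
    pins: Cell<usize>,
    value: T,
}

/// Hypothetical guard: while it is alive, the object counts as pinned.
struct Lease<'a, T> {
    obj: &'a GcBox<T>,
}

impl<T> GcBox<T> {
    fn new(value: T) -> Self {
        GcBox { pins: Cell::new(0), value }
    }

    /// Raw pointers can only be obtained through a lease.
    fn lease(&self) -> Lease<'_, T> {
        self.pins.set(self.pins.get() + 1);
        Lease { obj: self }
    }

    /// An imagined collector would consult this before moving or freeing.
    fn is_pinned(&self) -> bool {
        self.pins.get() > 0
    }
}

impl<'a, T> Lease<'a, T> {
    fn as_ptr(&self) -> *const T {
        &self.obj.value as *const T
    }
}

impl<'a, T> Drop for Lease<'a, T> {
    fn drop(&mut self) {
        // Releasing the lease unpins the object again.
        self.obj.pins.set(self.obj.pins.get() - 1);
    }
}

/// Returns (pinned while leased, pinned after release, value read via pointer).
fn lease_demo() -> (bool, bool, u32) {
    let obj = GcBox::new(7u32);
    let (during, value) = {
        let lease = obj.lease();
        let p = lease.as_ptr();
        // Sound only because the lease is still alive at this point.
        (obj.is_pinned(), unsafe { *p })
    };
    (during, obj.is_pinned(), value)
}
```

Note that this sketch does nothing to stop the raw pointer, or a `usize` made from it, outliving the lease.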

ltratt · 2025-10-15T16:46:55 1760546815

If you've used Chrome or Safari to read this post, you've used a program that uses (at least in parts) conservative GC. [I don't know if Firefox uses conservative GC; it wouldn't surprise me if it does.] This partly reflects shortcomings in our current compilers and in current programming language design: even Rust has some decisions (e.g. pointers can be put in `usize`s) that make it hard to do what would seem at first glance to be the right thing.

astrange · 2025-10-15T19:41:03 1760557263

Also most mobile games written in C# use a conservative GC (Boehm).

Rohansi · 2025-10-15T22:14:20 1760566460

Not just mobile games - all games made with Unity.

ltratt · 2025-08-06T13:18:47 1754486327

As Koffiepoeder suggests, since the vast majority of content on my site is static, I only have to compress a file once when I build the site, no matter how many people later download it. [The small amount of dynamic content on my site isn't compressed, for the reason you suggest.]

santiagobasulto · 2025-08-06T14:52:44 1754491964

That’s a good point, didn’t know it was cached on top.

ltratt · on March 1, 2024

As an example, I like to point people at https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html which for many years now has contained this line:

> The precise Rust aliasing rules are somewhat in flux, but the main points are not contentious

I've sometimes found myself in situations where the only way I've been able to deal with this is to check the compiler's output and trawl forums for hints by Rust's developers about what they think/hope the semantics are/will be.

Historically speaking, this situation isn't uncommon: working out exactly what a language's semantics should be is hard, particularly when it has many novel aspects. Most major languages go through this sort of sequence (some sooner or later than others, and some end up addressing it more thoroughly than others). Eventually I expect Rust to develop something similar to the modern C spec, but we're not there yet.

asa400 · on March 1, 2024

Excellent - thank you for the example and the clarification. This is exactly what I was looking for.

ltratt · on Oct 13, 2023

Because Morello is an experimental platform, only a small number were manufactured. They are/were allocated mostly to people involved in early-stage CHERI R&D and, AFAIK, none were made available to the general public. [That said, I don't know whether there are still some unallocated machines!] One can fully emulate Morello with qemu. While the emulator is, unsurprisingly, rather slow, I generally use qemu for quick Morello experiments, even though I have access to physical Morello boards.

ltratt · on May 4, 2023

You're quite right, I over-simplified -- mea culpa! That should have said "often unify these phases". FWIW, I've written recursive descent parsers with and without separate lexers, though my sense is that the majority opinion is that "recursive descent" implies "no separate lexer".

munificent · on May 4, 2023

For what it's worth, in my little corner of the world, all of the recursive descent parsers I've seen and worked with have separate lexers. I can't recall seeing a single recursive descent parser in industry that didn't separate lexing.

However, I do often see a little fudging of the two together for funny corners of the language. Often that just means handling ">>" as right-shift in some contexts and as the close of nested generics in others.
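For readers who haven't seen the separate-lexer style, here is a minimal sketch for a toy grammar of integers, `+`, `*`, and parentheses. The grammar and all the names (`Token`, `lex`, `Parser`, `eval`) are invented for illustration and not taken from any particular codebase. The lexer turns characters into tokens first, and the recursive descent parser only ever looks at tokens.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token { Num(i64), Plus, Star, LParen, RParen }

/// Lexing phase: characters in, tokens out.
fn lex(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            ' ' => { chars.next(); }
            '+' => { chars.next(); tokens.push(Token::Plus); }
            '*' => { chars.next(); tokens.push(Token::Star); }
            '(' => { chars.next(); tokens.push(Token::LParen); }
            ')' => { chars.next(); tokens.push(Token::RParen); }
            '0'..='9' => {
                // Accumulate a run of digits into one number token.
                let mut n = 0i64;
                while let Some(d) = chars.peek().and_then(|ch| ch.to_digit(10)) {
                    n = n * 10 + d as i64;
                    chars.next();
                }
                tokens.push(Token::Num(n));
            }
            other => return Err(format!("unexpected character {other:?}")),
        }
    }
    Ok(tokens)
}

/// Parsing phase: one method per grammar rule, operating on tokens only.
struct Parser { tokens: Vec<Token>, pos: usize }

impl Parser {
    fn peek(&self) -> Option<&Token> { self.tokens.get(self.pos) }

    fn next(&mut self) -> Option<Token> {
        let t = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        t
    }

    // expr := term ('+' term)*
    fn expr(&mut self) -> Result<i64, String> {
        let mut v = self.term()?;
        while self.peek() == Some(&Token::Plus) {
            self.next();
            v += self.term()?;
        }
        Ok(v)
    }

    // term := atom ('*' atom)*
    fn term(&mut self) -> Result<i64, String> {
        let mut v = self.atom()?;
        while self.peek() == Some(&Token::Star) {
            self.next();
            v *= self.atom()?;
        }
        Ok(v)
    }

    // atom := NUM | '(' expr ')'
    fn atom(&mut self) -> Result<i64, String> {
        match self.next() {
            Some(Token::Num(n)) => Ok(n),
            Some(Token::LParen) => {
                let v = self.expr()?;
                match self.next() {
                    Some(Token::RParen) => Ok(v),
                    other => Err(format!("expected ')', found {other:?}")),
                }
            }
            other => Err(format!("expected number or '(', found {other:?}")),
        }
    }
}

/// Lex, parse, and evaluate in one go; rejects trailing tokens.
fn eval(src: &str) -> Result<i64, String> {
    let mut p = Parser { tokens: lex(src)?, pos: 0 };
    let v = p.expr()?;
    if p.peek().is_some() {
        return Err("trailing input".to_string());
    }
    Ok(v)
}
```

With this sketch, `eval("1 + 2 * 3")` gives `Ok(7)` and `eval("(1 + 2) * 3")` gives `Ok(9)`. The ">>" fudging would show up as the parser splitting or re-interpreting one lexer token depending on context, which this toy grammar doesn't need.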

abecedarius · on May 4, 2023

That's not my impression of the majority opinion, fwiw. (I wrote my first recursive-descent parser in the 80s and I learned from pretty standard sources like one of Wirth's textbooks.)

lemming · on May 4, 2023

As another data point in addition to the sibling comments, all IntelliJ language parsers use recursive descent with a separate lexer.