
C#'s LINQ (code as data, like LISP) wins over golang for any type of data access. Strongly-typed, language-native queries. Go has its own advantages though.

EF is amazing

And now with NativeAOT, you can use C# like go - you don't need to ship the CLR.

It's not a one-off issue; it has happened to me a few times. It has even force-pushed to GitHub once, which doesn't allow branch protection for private personal projects. Here's an example.

1) claude will stash (despite clear instructions never to do so).

2) claude will use sed to bulk replace (despite clear instructions never to do so). sed replacements make a mess and touch far too many files.

3) claude restores the stash. Finds a lot of conflicts. Nothing runs.

4) claude decides it can't fix the problem and does a reset hard.

I have this right at the top of my CLAUDE.md and it makes things better, but unlike codex, claude doesn't follow it to the letter. However, it has become a lot better now.

NEVER USE sed TO BULK REPLACE.

*NEVER USE FORCE PUSH OR DESTRUCTIVE GIT OPERATIONS*: `git push --force`, `git push --force-with-lease`, `git reset --hard`, `git clean -fd`, or any other destructive git operations are ABSOLUTELY FORBIDDEN. Use `git revert` to undo changes instead.


When will you all learn that merely "telling" an LLM not to do something won't deterministically prevent it from doing that thing? If you truly want it to never use those commands, you better be prepared to sandbox it to the point where it is completely unable to do the things you're trying to stop.

Even worse, explicitly telling it not to do something makes it more likely to do it. It's not intelligent. It's a probability machine writ large. If you say "don't git push --force", that command is now part of the context window, dramatically raising the probability of it being "thought" about and appearing in the output.

Like you say, the only way to stop it from doing something is to make it impossible for it to do so. Shove it in a container. Build LLM safe wrappers around the tools you want it to be able to run so that when it runs e.g. `git`, it can only do operations you've already decided are fine.
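One way to sketch such a wrapper is a small allow-list shell function (adapt it into a standalone script placed ahead of the real `git` on PATH). The subcommand list and messages below are purely illustrative, not a vetted policy:

```shell
# Hypothetical allow-list wrapper around git (a sketch, not a complete policy).
# Anything not explicitly allowed fails loudly instead of executing.
safe_git() {
  cmd="$1"
  case "$cmd" in
    status|log|diff|add|commit|branch|switch)
      command git "$@" ;;
    push)
      # Allow plain pushes, but refuse any force variant.
      for arg in "$@"; do
        case "$arg" in
          -f|--force|--force-with-lease)
            echo "blocked: force push" >&2
            return 1 ;;
        esac
      done
      command git "$@" ;;
    reset|clean|stash)
      echo "blocked: destructive subcommand '$cmd'" >&2
      return 1 ;;
    *)
      echo "blocked: '$cmd' is not on the allow-list" >&2
      return 1 ;;
  esac
}
```

The point isn't this exact list; it's that refusal happens in the tool layer, where the model can't argue with it.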


Even even worse, angry all-caps shouting will make it more stupid, because it pushes you into a significantly stupider vector subspace full of angry all-caps shouting. The only thing that can possibly save you then is if you land in the even tinier Film Crit Hulk sub-subspace.

I touch on this a bit in the piece I wrote for normies; it helped a lot of people I know understand the tech a bit better.


Is this true for anything beyond the simplest LLM architectures? It seems like as soon as you introduce something like CoT this is no longer the case, at least in terms of mechanism, if not outcome.

This is true for prohibitions but claude.md works really well as positive documentation. I run custom mcp servers and documenting what each tool does and when to use it made claude pick the right ones way more reliably. Totally different outcome than a list of NEVER DO THIS rules though, for that you definitely need hooks or sandboxing.

Yes but this is probabilistic. Skill, documentation etc help by giving it the information it needs. You are then in the more correct probability distribution. Fine for docs, tips etc, but not good enough for mandatory things.

"more reliably" is still not "reliably".

The phrase "don't give them ideas" comes to mind.

Feels like a lot of people are still treating these tools like “smart scripts” instead of systems with failure modes.

Telling it not to do something is basically just nudging probabilities. If the action is available, it’s always somewhere in the distribution.

Which is why the boundary has to be outside the model, not inside the prompt.


Agree completely. The middle ground between "please don't" and full sandboxing: run a validation script between agent steps. The agent writes code, a regex check catches banned patterns, the agent has to fix them before it can proceed. Sandboxing controls what the agent can do. Output validation controls what it gets to keep. Both are more reliable than prompt instructions.
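A minimal sketch of such a validation step, assuming a banned-pattern list of destructive git commands (the function name and patterns are illustrative):

```shell
# Sketch of an output-validation gate: scan the directory the agent wrote to
# for banned patterns before accepting its changes. Pattern list is an example.
check_output() {
  dir="$1"
  if grep -rqE 'git (push +--force|reset +--hard|clean +-fd)' "$dir"; then
    echo "validation failed: banned git command in $dir" >&2
    return 1
  fi
  echo "validation passed"
}
```

Run between agent steps; on failure, feed the error back so the agent must fix its output before proceeding.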

That’s right, because we’re not developers anymore— we orchestrate writhing piles of insane noobs that generally know how to code, but have absolutely no instinct or common sense. This is because it’s cheaper per pile of excreted code while this is all being heavily subsidized. This is the future and anyone not enthusiastically onboard is utterly foolish.

My point is exactly that you need safeguards. (I have VMs per project, reduced command availability etc). But those details are orthogonal to this discussion.

However "Telling" has made it better, and generally the model itself has become better. Also, I've never faced a similar issue in Codex.


> sandbox it to the point where it is completely unable to do the things you're trying to stop

Why are permissions for these "agents" on a default allow model anyway?


What do you mean? By default, Claude asks for permission for every file read, every edit, every command. It gets exhausting, so many people run it with `--dangerously-skip-permissions`.

It does not ask for permission for every file read, only those outside the project and not explicitly allowed. You can bypass project edit permission requests with “allow edits”, no need for “dangerously skip permissions”. Bash commands are harder, but you can allow-list them up to a point.

> so many people run it with `--dangerously-skip-permissions`

It's on the people then, not the "agent". But why doesn't Claude come with a decent allow list, or at least remember what the user allows, so the spam is reduced?


You have the option to "always allow command `x.*`", but even then. The more control you hand over to these things, the more powerful and useful (and dangerous) they become. It's a real dilemma and yet to be solved.

I use a script wrapper around git on my PATH for claude, but as you correctly said, I'm not sure claude will never spawn a new zsh with a different PATH...

Why do you expect that a weighted random text generator will ever behave in a predictable way?

How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?

I can't believe how far people have fallen for this "AI" mania. You are giving a stochastic model that is easily misdirected the keys to all of your productive work.

I can understand the appeal to a degree, that it can seem to do useful work sometimes.

But even so, you can't trust it with anything; not running it in a locked-down container that has no access to anything but a Git repo (with all important history stored elsewhere) seems crazy.

Shouting harder and harder at the statistical model might give you a higher probability of avoiding the bad behavior, but no guarantee; actually lock down your random text generator properly if you want to avoid it causing you problems.

And of course, given that you've seen how hard it is to get it to follow these instructions properly, you are reviewing every line of output code thoroughly, right? Because you can't trust that either.


> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

> This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?

I don’t understand why people are so chill about doing this. I have AI running on a dedicated machine which has absolutely no access to any of my own accounts/data. I want that stuff hardware isolated. The AI pushes up work to a self-hosted Gitea instance using a low-permission account. This setup is also nice because I can determine provenance of changes easily.


> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

Because it’s insanely useful when you give it access, that’s why. They can do way more tasks than just write code. They can make changes to the system, set up and configure routers and network gear, probe all the IoT devices on the network, set up DNS, you name it; anything that is text or has a CLI is fair game.

The models absolutely make catastrophic fuckups though, and that is why we’ll have to both train the models better and put non-annoying safeguards in front of them.

Running them in isolated computers that are fully air gapped, require approval for all reads and writes, and can only operate inside directories named after colors of the rainbow is not a useful suggestion. I want my cake and I want to eat it too. It’s far too useful to give these tools some real access.

It doesn’t make me naive or stupid to hand the keys over to the robot. I know full well what I’m getting myself into and the possible consequences of my actions. And I have been burned but I keep coming back because these tools keep getting better and they keep doing more and more useful things for me. I’m an early adopter for sure…


Well, one of the other reasons I suggest running it in a strictly limited container is that you can then run it in yolo mode.

In fact, I use the pi agent, which doesn't have command sandboxing, it's always in yolo mode, I just run it in a container and then I get the benefit of not having to confirm every command, while strictly controlling what I share with it from the beginning of the session.


The answer is that, for these people, it looks predictable most of the time, so they start to trust it.

The tool is so good at mimicking that even smart people start to believe it


Claude Code hooks are deterministic; the agent can’t bypass them [1].

For example, you can force a linter or tests to run.

Claude Code defaults to running in a sandbox on macOS and Linux. Claude Cowork runs in a Linux VM.

[1]: https://code.claude.com/docs/en/hooks-guide


> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

Because it is much easier to do and failure rate is quite low.

(not saying that it is a good idea)


Trust issues start at home.

If you can't trust yourself, you will never be able to trust anyone else.

If you believe the AI is out to get you, that's certainly the reality you will manifest.


> It has once even force pushed to github, which doesn't allow branch protection for private personal projects.

This is only restricted on *fully free* accounts; the feature requires just a paid Pro account, which starts around $4 USD/month. That sounds worth it to prevent lost work from a runaway tool.


I was on one till recently, maybe I still am. But does it work for orgs? I put some projects under orgs when they become more than a few projects.

That's a fee for not running a local git proxy with permissions enforcement that holds onto the GitHub credentials in place of Claude.

Do you know of a good ready-made implementation of such a proxy? I’ve been looking for one.

GitHub is also a worry in terms of exfiltration. You can’t block pushes to public repos unless you are using GitHub Enterprise Managed Users afaict.


Or putting the code and .git in a sandbox without the credentials

Reinforcing an avoidance tactic is nowhere near as effective as doing that PLUS enforcing a positive tactic. People with loads of 'DONT', 'STOP', etc. in their instructions have no clue what they're doing.

In your own example you have all this huge emphasis on the negatives, and then the positive is a tiny un-emphasized afterthought.


I think you're generally correct, but certainly not definitively, and I worry the advice and tone isn't helpful in this instance with an outcome of this magnitude.

(more loosely: I'm a big proponent of this too, but it's a helluva hot take, how one positively frames "don't blow away the effing repo" isn't intuitive at all)


The trick is to explain why something is important, not just to emphasize it. For instance:

"As an LLM, when Claude uses 'sed', it can quickly and easily break files in ways that are difficult for the user to fix. Claude must be aware that an LLM's actions seem effortless to it, but to the user they represent hours of work getting things back in order."


Claude tends to disregard "NEVER do X" quite often, but funnily enough, if you tell it "Always ask me to confirm before doing X", it never fails to ask you. And you can deny it every time.

If it disregards "NEVER do" instructions, why would it honor your denial when it asks?

You mean like in this example? https://web.archive.org/web/20260313042512/https://gist.gith...

There is never a guarantee with GenAI. If you need to be sure, sandbox it.


There are plenty of examples in the RL training showing it how and when to prompt the human for help or additional information. This is even a common tool in the "plan" mode of many harnesses.

Conversely, it's much harder to represent a lack of doing something


Because it’s just fancy auto-complete.

This is why I use yoloAI (https://github.com/kstenerud/yoloai).

    $ yoloai new bugfix . -a --network-isolated --agent claude
Now I have a claude code session that only has a COPY of my work dir, and can't reach anything over the network except the Claude API server.

Now I interact with the agent, and when it's done:

    $ yoloai diff bugfix
    diff --git a/b64.go b/b64.go
    index cfc5549..253c919 100644
    --- a/b64.go
    +++ b/b64.go
    @@ -39,7 +39,7 @@ func Encode(data []byte) string {
        val |= uint(data[i+2])
       }

    -  out[j] = alphabet[(val>>18)&0x3E]
    +  out[j] = alphabet[(val>>18)&0x3F]
       out[j+1] = alphabet[(val>>12)&0x3F]

       remaining := n - i
Looks good, let's apply it:

    $ yoloai apply bugfix
    Target: /home/ks/tmp/b64

    Commits to apply (1):
      9db260b33bcd Fix bit mask in base64 encoding

    Apply to /home/ks/tmp/b64? [y/N] y
    1 commit(s) applied to /home/ks/tmp/b64
Now the commit claude made inside the sandbox has been applied to my workdir:

    $ git log
    commit 5b0fc3a237efe8bbc9a9e1a05f9ce45d37d38bfa (HEAD -> main)
    Author: Karl Stenerud <kstenerud@gmail.com>
    Date:   Mon Mar 30 05:28:21 2026 +0000

        Fix bit mask in base64 encoding

        Corrected the bit mask for the first character extraction from 0x3E to 0x3F to properly extract all 6 bits.

        Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

    commit 31e12b62b0c3179f3399521d7c4326a8f6130721 (tag: init)
The important thing here is that Claude was not able to reach anything on the network except its own API, and nothing it did ever touched my work dir until I was happy with the changes and applied them.

It also doesn't get access to my credentials, so it couldn't push even if it did have network access.


> which doesn't allow branch protection for private personal projects.

Time for a personal Forgejo instance? Mine has been running great for more than a year. Faster than GitHub even.


I don't understand how people in this day and age have not learned what the pink elephant problem is.

If you tell AI not to do something, you make it incomprehensibly more likely it will happen.

Use affirming language. Why do you think negative prompts don't exist in diffusion anymore?


I've recently implemented hooks that make it impossible for Claude to use tools that I don't want it to use. You could consider setting up a hook that errors on any unsafe use of sed (or any use of sed at all, if there are safer tools).

Even just last week I auto approved a plan and it even wrote the commit message for me (with @ClaudeCode signed off) which I am grateful my manager did not see.

Claude does not know my github ssh key. I'll do the push myself, thank you. Always good to keep around one or two really important things it can't do.

Like for humans, teaching the good way to do things works better than forbidding a few bad behaviours.

Maybe stop using CLAUDE.md to prevent it from running tools you don't want it to, and instead set up a PreToolUse hook that blocks any command you don't want.

It's trivial to set up, and you could literally ask Claude to do it for you and never have any of these issues ever again.

Any and all "I don't want it to ever run this command" issues are just skill issues.
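For what it's worth, such a hook can be a one-file script. The sketch below assumes the documented Claude Code hook behavior: the tool call arrives as JSON on stdin, and an exit code of 2 blocks the call and feeds stderr back to the model. Grepping the raw JSON rather than parsing it with jq is a shortcut, and the pattern list is only an example:

```shell
# Hypothetical PreToolUse hook for the Bash tool (register it under
# hooks.PreToolUse in .claude/settings.json). Exit/return code 2 blocks
# the tool call; the stderr message is shown to the model.
block_destructive() {
  payload=$(cat)  # tool call JSON on stdin
  if printf '%s' "$payload" |
     grep -qE 'git +(push +--force(-with-lease)?|reset +--hard|clean +-fd)'; then
    echo "Blocked: destructive git command. Use git revert instead." >&2
    return 2
  fi
  return 0
}
```

Unlike a CLAUDE.md rule, the model never gets a vote: the command is rejected before it runs.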


How does that stop Claude from removing the hook and then running the command anyway?

That's nothing like the issue of the main topic

"DO NOT, EVER, UNDER ANY CIRCUMSTANCES, think of an elephant"

> Linux gets a bad reputation because 20-ish years ago Ubuntu sent out free CDs and became the dominant OS.

I've been an Ubuntu user for 20 years, and RedHat and Suse prior to that. Ubuntu just worked. Debian had packages for everything, including from 3rd party vendors. It lets me focus on my work, and not worry about the OS, or compiling packages, or finding installers. When I had issues (rare), the large user base meant that someone had already figured out a solution to the problem.

The flavor of Linux doesn't matter so much in my opinion.


There are so many useful snippets of good advice on this thread.

I'd like to mention sport again, but with an addition: find a sports coach you can afford. This changes sport from being a destination to a path, and you'll avoid injuries, which is something you'll need to be careful about as you grow older. I'm in my mid-40s, for context.


How well does it support Linux + NativeAOT? Thanks in advance.

Never mind, found this in the docs: https://fna-xna.github.io/docs/appendix/Appendix-A%3A-Native...


I wouldn't give too much credit to rules like this. Data structures are often created with an approach in mind. You can't design a data structure without knowing how you will use it.

If anything it's the other way round, if you're not talking about business domain modeling (where data structures first is a valid approach).


> If anything it's the other way round, if you're not talking about business domain modeling (where data structures first is a valid approach).

And even there, the data models usually come about to make specific business processes easier (or even possible). An Order Summary is structured a specific way to make both the Fulfilment and Invoicing processes possible; these feed down into Payment and Collections processes (and related artefacts).


To elaborate on @jeswin's point above (IDK why it got downvoted)... a data structure is basically like a cache for the processing algorithm. The business logic and algorithm needs will dictate what details can be computed on-the-fly -vs- pre-generated and stored (be it RAM or disk). E.g., if you're going to be searching a lot, then it makes sense to augment the database with some kind of "index" for fast lookup. Or if you are repeatedly going to be plotting some derived quantity, then maybe it makes sense to derive that once and store it with the struct.

It's not enough for a data structure to represent the "fundamental" degrees of freedom needed to model the situation; the algorithmic needs (vis-a-vis the available resources) most definitely matter a lot.


If you don't know enough to design a data structure, requirements are missing and someone talking to the client is dropping the ball big time.


Where did I say any of that?

I'm saying that if you care about performance, data structures should be designed with approach specific tradeoffs in mind. And like I've said above, in typical business apps, it's ok to start with data structures because (a) performance is usually not a problem, (b) staying close to the domain is cleaner.


You said: "You can't design a data structure without knowing how you will use it."

But the whole discussion involves knowing how you will use it; the advocacy is for careful consideration of data structures (based on how you will use them) resulting in less pain when designing/choosing algorithms.


My point is that one doesn't follow the other. To design good data structures, you need to know how it'll get used (the algorithm).

> If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident.

This is what I was responding to.


See also:

"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious."

https://en.wikiquote.org/wiki/Fred_Brooks


When it comes to lengthy non-trivial work, codex is much better but also slower.


If you want native binaries from typescript, check my project: https://tsonic.org/

Currently it uses .NET and NativeAOT, but I'm adding support for the Rust backend/ecosystem over the next couple of months. TypeScript for GPU kernels, soon. :)


True p2p is the only approach that will work, not federation. I'd go further and make the protocol high-friction for federation.

It's true that many p2p attempts have failed, but it's also the only solution that doesn't require someone running servers for free. There's evidence of success as well: napster (and bittorrent). Both were wildly successful, and ultimately died because of legal issues. It might work when the data is yours to share.


I can't imagine a world where a p2p social network is practical. Not when each node is an unreliable mobile phone that's maybe on cellular. Even with something like ipfs you have pinning services, bittorrent has seed boxes, because pure p2p is impractical.


You can have your other devices and friends replicating.


That uses a lot of bandwidth and battery. I'd rather find a better way to pay for servers than try to avoid them.


I sort of agree, but federation is good. It's funny that you use bittorrent as an example because it involves every single user running servers for free.

If people can both be an origin for content and a relay for content, and modulate the extent to which they want to do either of those things, there's not really much of a difference between "federation" and "true" p2p. Some people will be all relay, and some people will be all content. Some content people might be paying relays, and some relays might be paying content people. Some relays will be private and some relays will be public. Some people will maintain all of their own content locally, and some people will leave it all on a specialized remote server as a service and not even care about holding a local copy.

Also, browsing would either have to be done through a commercial or public service (federation again), or through specialized software (no one will ever use this and operating systems will intentionally lock it out if they see it as a competitor.)

The problem with wishing this all into existence, though, is that bittorrent (not dead) exists and is completely stagnant. There is often a lot of talk about improving the protocol, and the various software dealing with it, and none of it gets done. If bittorrent would just allow torrents to be updated (content added or removed), you could almost piggyback social media on it immediately. It's not getting done. Nobody is doing it, just writing specs that everybody ignores for decades.

So I guess my belief is that "true p2p" is a meaningless term and target when it comes to creating recognizable social media. "True p2p" would be within a private circle of friends, on specialized software. Might as well be a fancy e.g. XMPP group chat; it's already available for anyone who wants it. Almost nobody wants it. Telegram, Whatsapp, and imessage are already good enough for that. They may not be totally private, but they're private enough for 99.9999% of people's purposes, and people are very suspicious of the 0.0001% who want something stronger.

I actually think you're using "true p2p" here to sort of handwave a business model into existence (trying to imply mutuality, or barter, or something.) Whereas I think the business model is the part that needs to be engineered carefully and the tech is easy.

