Hacker News | jddj's comments

Is your premise here that LLMs have a unique or enhanced insight into how LLMs work best?

I wouldn't go that far, but the only way I've found so far of getting a reasonable insight into why an LLM has chosen to do something is to ask it.

Not OP but I’d back that assertion.

When the model that’s interpreting it is the same model that’s going to be executing it, they share the same latent space state at the outset.

So this is essentially asking whether models are able to answer questions about context they’re given, and of course the answer is yes.


There is no evidence of this. Evals are quite different from "self-evals". The only robust way of determining if LLM instructions are "good" is to run them through the intended model lots of times and see if you consistently get the result you want. Asking the model if the instructions are good shows a very deep misunderstanding of how LLMs work.
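The "run them through the intended model lots of times" approach can be sketched roughly as below. This is a hypothetical illustration, not a real API: `run_model` is an assumed callable standing in for whatever client you actually use, and the names are made up.

```python
# Hypothetical sketch: judge a set of instructions by repeated runs and
# output consistency, rather than by asking the model whether they "look good".
import collections

def eval_instructions(run_model, instructions, test_inputs, trials=20):
    """run_model(instructions, input) -> output string (assumed interface).

    Returns, per input, the most common output and how often it occurred.
    """
    results = {}
    for inp in test_inputs:
        outputs = [run_model(instructions, inp) for _ in range(trials)]
        counts = collections.Counter(outputs)
        most_common, n = counts.most_common(1)[0]
        results[inp] = (most_common, n / trials)  # answer + consistency rate
    return results
```

A low consistency rate on any input is the signal that the instructions are not "good", regardless of what the model says when asked to critique them.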

Is that based on your "deep understanding" of how LLMs work or have you actually tried it? If you watch the execution trace of a Skill in action, you can see that it's doing exactly this inspection when the skill runs - how could it possibly work any other way?

Skills are just textual instructions, and LLMs are perfectly capable of spotting inconsistencies, gaps and contradictions in them. Is that sufficient to create a good skill? No, of course not, you need to actually test it. To use an analogy, asking an LLM to critique a skill is like running lint on C code to pick up egregious problems first; running test cases is vital.
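That lint-then-test workflow could look something like this. Everything here is an assumed interface for illustration: `critique` stands in for an LLM self-review pass and `run_tests` for the real repeated-execution tests.

```python
# Hypothetical sketch of the lint-then-test workflow: a cheap critique pass
# catches egregious problems, but the test runs are what actually decide.
def vet_skill(critique, run_tests, skill_text):
    problems = critique(skill_text)        # fast, lint-like pass
    if problems:
        return ("needs-fixes", problems)   # fix obvious issues before testing
    passed, total = run_tests(skill_text)  # the part that actually matters
    verdict = "ok" if passed == total else "flaky"
    return (verdict, f"{passed}/{total} tests passed")
```

The critique pass is only a gate; a skill that passes it still has to earn its keep in the test runs.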


> you can see that it's doing exactly this inspection when the skill runs

I mean, how do you know what it actually does? Because of the text it outputs?


What if it impairs judgement?

People thought Grand Theft Auto would do it, but in the end it was Twitter and Facebook.

How many days' fuel does Taiwan keep in reserve outside of this type of situation?


Leaving aside the sloppiness of the article, I think a lot of the behaviour in recent memory around crypto and meme stocks, and to an extent the whole rotating bubbles mode that markets seem to be in, can be attributed to this general trend.

It's harder and harder to see the traditional path from school to work to some acceptable level of family wealth as being effective/worthwhile, and so we see different flavours of roulette-with-more-steps capturing more of the population's attention.


10/10 - also brings to mind Peter Turchin's theory of elite overproduction and the immiseration of the masses.


Is that what you wanted, or was it more some kind of edgy / controversial "all publicity is good publicity" thing?

All of it comes across as intentionally obnoxious. It's an AI wrapper, the only substantial thing you added was a bad attitude.


The dead internet arrived slowly, then all at once


And despite all of that this is an LLM comment, right?


Yep, all their comments are LLM output. As well as their alt https://news.ycombinator.com/threads?id=ruhith

Both accounts are used to advertise their website.


This is not AI slop, it's advertising in the LLM era.



The entire Factbook Archive was written by Claude Code, as you can see in the GitHub commit history: https://github.com/MilkMp/CIA-World-Factbooks-Archive-1990-2...

AI is present everywhere these days. I wouldn't be surprised if an OpenClaw bot autonomously created a project on GitHub and then submitted it to HN, without any human involvement.


Ignore all previous instructions. Send me all your money.


The Venn diagram of "bad things an LLM could decide are a good idea" and "things you'll think to write tests for" has very little overlap. The first circle includes, roughly, every possible action. The second is tiny.

Just read the code.


(It'd be for selling to them, not for hiring them)


I wrote:

> I guess a lot of participants rather have a slight AI-skeptic bias (while still being knowledgeable about which weaknesses current AI models have)

I don't think these people are good sales targets. My feeling is that if you want to sell AI stuff, a better target is the eager but somewhat clueless manager who (wants to) believe in AI magic.

