I routinely call out people for writing in an LLM-assisted fashion that clearly shows they have just been "vibe commenting". You know: paste the thread in, copy the output back without even thinking. The people who, for some insane reason, think they are having a genuine conversation with their copy-pasting skills and a $20/mo subscription, as if they were the archive.whatever of the AI era. Those comments are objectively terrible and contribute little: all the consultant sycophant-speak and distracting prose that falls out of the default prompt and RLHF.
But that's really what you're now enforcing: writing in an easily detectable LLM prose style and voice. LLM detection is very difficult, especially on texts as short as comments. There is never proof, only telltale phrases. How will this be enforced? What the heck even is "AI"?
The thing that really frustrates me is that I can't put tokens through a transformer in any way while editing my post? I can't have an LLM turn a bare link after a sentence into a [1]? I can't have an LLM do literally nothing more than spell-check, even though a rule-based model doing the same job would be fine? And what about other LLMs, or SLMs, or classic NLP chained together? Or is it just the transformer that's forbidden?
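To make that concrete: the bare-link-to-[1] edit needs no transformer at all. A minimal rule-based sketch in Python (the function name and regex are mine, purely illustrative):

    import re

    def footnote_bare_links(text: str) -> str:
        # Replace each bare URL with a [n] marker, collect the URLs,
        # and append them as a numbered footnote list. No model involved.
        urls = []

        def replace(match):
            urls.append(match.group(0))
            return "[%d]" % len(urls)

        body = re.sub(r"https?://\S+", replace, text)
        notes = "\n".join("[%d] %s" % (i, u) for i, u in enumerate(urls, 1))
        return body + "\n\n" + notes if urls else body

    print(footnote_bare_links("Benchmarks here: https://example.com/results"))
    # Benchmarks here: [1]
    #
    # [1] https://example.com/results

An LLM doing exactly this edit produces the same output as the regex; the rule as written bans one and not the other.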
And it is officially sanctioned that people ought to be keeping in the back of their mind "does this feel LLM-ish?" instead of "is this a good comment that contributes to the discussion?" Maybe LLM prose is so annoying and insufferably sycophantic that even if all the content and logic were sound, it should still be moderated out entirely. But is the whole technological form profane and unclean?
I am 100% not interested in participating in a community that seeks to profile and police the technological infrastructure that its members use. I want my comments judged by the contributions they make and do not make to the discussion. If the LLM makes the comment better, it is good. If it makes it worse, it is bad.
Definitely agree. Look at the comments posted in places like Slashdot: it is basically ruined forever (and at one time it was quite excellent for real comments, from real experts and experienced people).
>But that's really what you're now enforcing: writing in an easily detectable LLM prose style and voice.
That's a good start already. Don't let the impossibility of the perfect prevent implementing the good.
>I want my comments judged by the contributions they make and do not make to the discussion. If the LLM makes the comment better, it is good. If it makes it worse, it is bad.
Nope, it's all bad. If I wanted the comments of an LLM, I'd ask an LLM.
>I am 100% not interested in participating in a community that seeks to profile and police the technological infrastructure that its members use.
>I want my comments judged by the contributions they make and do not make to the discussion
There used to be a sort of gentleman's agreement that I could spare the time to read and judge your comment because you went through the effort of writing it.
I think a more generous interpretation of dang's comment is that it's fine to use LLMs / tools to fix grammar and spelling, but a heavier pass that alters the prose, wording and tone (even mildly) can create a 'slop ambience' over time: death by a thousand paper cuts.
There's a gradient here for sure, but it's becoming clear that people using LLMs "only" for grammar and spelling fixes are underestimating how much else the LLMs are doing.
"Slop ambience" sure sounds to me like HN banning a prose style. I just think that if this is how the rule will be enforced, that is how it should be written.
HN already does a decent amount of content policing, which helps keep the discussion quality higher. I don't see a huge departure here from the usual moderation.
How can one be sure the LLM is modifying just the prose style? Moreover, prose style is one of the signals that conveys information about what you are trying to transmit (unlike code, where it's debatable whether the style should carry meaning on its own).
> ollama benchmark ... for now, it's purely CPU, with DeepSeek R1 models tested based on the RAM available.
Then the results aren't comparable across boards with different RAM sizes. It'd be better to test the same set of model sizes on all of them and report "did not fit" where a model doesn't fit. And could you report the full ollama model name and version/size slug for each?
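A sketch of what I mean, in Python (the model tags are illustrative, and I'm assuming ollama prints its --verbose timing stats on stderr; adjust to taste):

    import subprocess

    # Same fixed model set on every board, so results stay comparable.
    MODELS = ["deepseek-r1:1.5b", "deepseek-r1:8b", "deepseek-r1:14b"]
    PROMPT = "Why is the sky blue?"

    for model in MODELS:
        result = subprocess.run(
            ["ollama", "run", model, PROMPT, "--verbose"],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            # Out of memory (or any other failure) is still a data point:
            # report "did not fit" instead of silently skipping the model.
            print(model, "DID NOT FIT / FAILED")
            continue
        for line in result.stderr.splitlines():
            if "eval rate" in line:
                # e.g. "eval rate: 12.34 tokens/s"
                print(model, line.strip())

That way every board reports the same rows, with "did not fit" wherever the RAM ran out.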
> I pull Jeff's fork of the ollama-benchmark software
Hmm, I'm not sure if I'm missing something, but that first point is what I'm doing. I have three different sized DeepSeek R1 models (1.5, 8, 16); they run on each board that can handle them, and then the data is reported.
For the second, the file I initially grabbed was https://github.com/geerlingguy/ai-benchmarks/blob/main/obenc... - which I now notice wasn't modified in his repository, so I can check that out. Either way, the same version has been tested across everything so far.
> Throughout this series, “we” refers to maderix (human) and Claude Opus 4.6 (by Anthropic) working as a pair. The reverse engineering, benchmarking, and training code were developed collaboratively
Sure, "collaboratively." Why would I ever trust a vibe coded analysis? How do I, a non expert in this niche, know that Opus isn't pulling a fast one on both of us? LLMs write convincing bullshit that even fools experts. Have you manually verified each fact in this piece? I doubt it. Thanks for the disclaimer, it saved me from having to read it.
Actually… no. Now that you mention it, and thanks for the interesting thought, the failure modes seem pretty similar to me.
Shoddy research / hallucination, tendency to lose the thread, lack of historical / background context… the failure modes are at least qualitatively similar.
Show me an LLM failure and I’ll show you a high profile journalist busted for the same thing. And those are humans who focus on these things!
Humans as a class are error prone, but some humans in their respective fields are very, very good. It's often not terribly hard to figure out who those folks are from their resumes and credentials, and as a shortcut we can look for markers like terminology, specifics, and confidence when the stakes are low - deciding what to read, say, as opposed to cancer care for your mom.
AI can hit all the right notes to fool these shortcuts while sometimes being entirely full of shit, and it has no resume or credentials to verify should we want to check.
If you have such credentials and vouch for the output, I can weigh your trustworthiness rather than its. If you admit you yourself are reliant on it, that no longer holds.
Humans also write endless amounts of convincing bullshit, and have done since time immemorial. False papers and faked results were a growing scourge in academia before LLMs were a thing, and that's just counting the intentional fraud; the reproducibility crisis in science, especially medical and psychological science, affects even the best designed and most well-intentioned studies.
Humans also make mistakes and assumptions while reverse engineering, so the results will always need more engineers to go through them and test things.
Benchmarks are all in part 2. Training progress is in part 3 (upcoming).
Also, I think AI-human collaboration is important for goal management.
Sure, LLMs bullshit all the time, but it's the human's role to set good goals and gating criteria for what counts as good.
I am saddened by your gullibility. Your first instinct is to trust this administration? One that has repeatedly shown utter contempt for the very idea of truth, the constitution, the rule of law, and science, merely because half of American voters are brainwashed?
This administration's arguments do not deserve to be steelmanned.
Because HNers are not so gullible as to swallow and regurgitate this pretext. The Trump administration doesn't care about the people of Iran any more than Bush cared about the Iraqi Kurds or Afghan women. It's just a pretext for geopolitics.