Could you get Llama 3 to say that stuff? Llama 2 famously had a good uncensored version, but I thought they put a lot of work into ruining Llama 3 so you couldn't fine-tune it to say bad things. Even Grok would be hard to use in a way that gets it to say phrases like that naturally.

I do believe it's possible, but as far as I'm aware, getting LLMs to say that sort of stuff is still pretty difficult.

Just go look on Hugging Face. It's packed with uncensored Llama 3 models, like the Dolphin Llama 3 70B fine-tunes, that will happily write you a recipe for napalm while swearing like a sailor. Meta's guardrails lasted only a few weeks before the community figured out abliteration, a method that finds the "refusal direction" in the model's activations and projects it out of the weights without even needing a fine-tune.


