In my experience, it makes no sense to use any quantization below Q6_K for coding. More heavily quantized models make more mistakes; that can still be fine for text processing, but for coding it's not.
I don't think most people realize that. Quality of tokens beats quantity of tokens. I always tell folks to go with as high a quant as you can, and only go lower if you just don't have the memory capacity.
AI models like gemma4 are available in different quant "sizes"; think of it as an image available at various compression levels.
The best-looking version is the largest: it takes the most memory to load and uses up much of your system's resources.
On the other end of the spectrum there is a smaller, much more compressed version of that same image. It loads quickly and uses fewer resources, but lacks the detail and clarity of the original.
AI models are similar, and the parent poster is suggesting you use the largest version of the AI model your system can support, even if it runs a little slower than you'd like.
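To make the size trade-off concrete, here is a rough back-of-the-envelope sketch. The bits-per-weight figures are approximations of typical GGUF quant levels (assumed values, not exact for any specific model file):

```python
# Rough memory footprint of an 8B-parameter model at common GGUF quant levels.
# Bits-per-weight values are approximate (assumed from typical llama.cpp output).
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q6_K": 6.56, "Q4_K_M": 4.85, "Q2_K": 3.35}

def model_size_gb(n_params: float, quant: str) -> float:
    """Approximate file/VRAM size in gigabytes for a given quant level."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

for quant, bpw in BITS_PER_WEIGHT.items():
    print(f"{quant:7s} ~{model_size_gb(8e9, quant):.1f} GB")
```

So an 8B model is roughly 16 GB at F16 but only around 6.6 GB at Q6_K, which is why the quant choice is really a memory-vs-quality dial.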
For those looking for a good homelab server: consider a refurbished/used mini-PC based on an 11th-gen Intel chip like the i5-11500T (the HP ProDesk 400 G5 Mini, for example), or a Ryzen. You'll get better thermals, a better CPU, and more expansion slots for less than a NUC costs.
On top of that, resellers often have RAM and NVMe upgrades available. A 1 TB WD Red OEM drive for less than 100 dollars sounds like a bargain.
This is false; you can make many plastics without fossil sources (PLA, bio-PET, bio-ABS, etc.). The only challenges are cost and scale: it's cheaper and easier to use existing processes.
My target audience is everyone, but not for daily use. Just like Wikipedia isn't a site you visit every day, but you go there when you need specific information. In the same way, this can be the go-to place for information about products, places, or anything with a barcode or QR code. People can just check it when they are curious about an item right in front of them.
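Since the lookup keys are barcodes, here is a small illustration of how such a site could sanity-check an EAN-13 barcode before querying its database. The check-digit rule is the standard one; the function name is my own, not from the poster's project:

```python
def ean13_is_valid(code: str) -> bool:
    """Validate an EAN-13 barcode via its check digit.

    Digits in odd positions (1st, 3rd, ...) are weighted 1, even positions
    are weighted 3; the weighted sum of all 13 digits (including the final
    check digit) must be a multiple of 10.
    """
    if len(code) != 13 or not code.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(code))
    return total % 10 == 0

print(ean13_is_valid("4006381333931"))  # a well-known valid EAN-13 → True
```

Rejecting malformed codes up front keeps junk queries out of the lookup service.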
I have the same concerns. I keep my agenda meeting-free and create meetings maybe once every few weeks. The same goes for booking flight tickets: once a decade. Adding openclaw there would take more time and effort than doing it manually.
And none of my friends playing with openclaw have any useful non-trivial workflows that couldn't be automated the old-school way.
The only viable workflow I could think of so far is building your own knowledge base and info-processing pipeline.
That implies that you have a fixed time for lunch and also chat during lunch. I may be in the minority, but I prefer to eat when I'm hungry and focus on the food instead of chatting. And there are also allergies: as a celiac, I have big trouble eating together with others, since they may accidentally contaminate my food.
I’m actually curious here, not trying to question your experience, but does other people’s food regularly contaminate yours when you eat at the same table as them?
I’ve lived with a celiac sufferer before and I’ve never heard of something that extreme, but everyone’s different.
The degree of sensitivity of allergies varies widely. For example there are people who only have a problem after consuming a large scoop of peanut butter but there are also those who will end up in the hospital from trace amounts that you'd have difficulty spotting with the naked eye.
I dated a woman with celiac sprue (which I guess was extreme.. her mother had to have a bowel resection due to celiac related issues) and she had sudden anaphylaxis at a restaurant that required the use of an epi-pen and an ambulance.
The reaction was caused by the micro-brewery that had opened next door and all the wheat dust in the ventilation system.
It sounds like you could get very high ROI from chilling out a little bit. If one social lunch per month is an unfathomable hardship then you're probably leaving a lot of other opportunities on the table. Do you have OCD or social anxiety or something?
This is a very good question. I also struggle to find a good way to process the various signals (papers, techniques, etc.) with my co-workers while maintaining a proper work-life balance. Either you have to be a full-time geek, or be left behind.
I don't have any benchmarks available right now, and honestly I found it pretty hard to make them, considering that the workflow I have set up is not fully automated: there is a lot of human intervention in the pre-coding phases.
I feel the problem of token waste a lot; in fact, that was the first reason I introduced a hierarchy of instructions and the artifact indexes: to avoid waste. Then I realized that these approaches help keep a lean context, which helps the AI agent deliver better results.
Consider that in the initial phase token consumption is very limited: it is in the implementation phase that tokens are consumed fast and that the project can proceed with minimal human intervention. You can try just the first requirement-collection phase to try out the approach; the implementation phase is pretty boring and not innovative.
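The "lean context" idea can be sketched as a simple budget check before each model call. Both the 4-characters-per-token heuristic and the function names below are my own assumptions for illustration, not the poster's actual tooling:

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def build_context(instructions: list[str], budget_tokens: int) -> list[str]:
    """Keep instructions in priority order, stopping once the estimated
    token budget would be exceeded (assumes the list is pre-sorted,
    highest-priority first -- mirroring a hierarchy of instructions)."""
    kept, used = [], 0
    for item in instructions:
        cost = estimate_tokens(item)
        if used + cost > budget_tokens:
            break
        kept.append(item)
        used += cost
    return kept
```

The point is only that trimming by priority, rather than stuffing everything in, is what keeps the context lean as the project grows.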
I am playing around with building something similar of my own and am faced with the question you pose.
How can you tell if your prompt process works? I feel like the outputs of the SDLC process are much higher-level than what evals usually cover, but I am no eval expert.
For sure the proposed approach consumes more tokens than just asking for the project's final outcome at a high level and letting an AI agent decide everything and deliver the code. That can be acceptable for small personal projects, but if you want to deliver production-ready code, you need to be able to control all the intermediate decisions, or at least to save and store them. They are needed because otherwise, when you later request a high-level change, the agent won't be able to make focused and coherent code changes: previously made decisions get forgotten and modified, and the change produces lots of side effects.
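One lightweight way to "save and store" intermediate decisions is an append-only decision log that gets fed back into the agent's context on later change requests. A minimal sketch; the record fields are illustrative, not a real spec:

```python
import json
from pathlib import Path

def record_decision(log_path: Path, title: str, choice: str, rationale: str) -> None:
    """Append one design decision as a JSON line, so later change
    requests can be checked against past decisions."""
    entry = {"title": title, "choice": choice, "rationale": rationale}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

def load_decisions(log_path: Path) -> list[dict]:
    """Read the decision history back for inclusion in the agent's context."""
    if not log_path.exists():
        return []
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line]
```

Because the log is append-only, a decision is never silently modified; a reversal shows up as a new entry, which is exactly the traceability the comment is arguing for.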