Hacker News | gkbrk's comments

When you use a FOSS product more, the person that wrote the code doesn't end up spending more money. When you use a free service more, someone is paying for that usage and resources.

My Hacker News items table in ClickHouse has 47,428,860 items, and it's 5.82 GB compressed and 18.18 GB uncompressed. What makes Parquet compression worse here, when both formats are columnar?


Sorting, compression algorithm and level, and data types can all have an impact. I noted elsewhere that a Boolean is getting represented as an integer. That’s one bit vs 1-4 bytes.

There is also flexibility in what you define as the dataset. Skinnier, more focused tables could save space compared with a wide table that covers everything, which will probably break up compressible runs of data.
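To make the sorting and data type points concrete, here is a rough sketch using only the Python standard library. It is not how ClickHouse or Parquet store data: zlib stands in for the real codecs, and the column contents, the `compressed_size` helper and the `pack_bits` helper are all made up for illustration.

```python
import random
import struct
import zlib

random.seed(0)  # fixed seed so repeated runs see the same data
N = 100_000

# Hypothetical column: low-cardinality integer IDs in arbitrary order.
values = [random.randrange(1000) for _ in range(N)]


def compressed_size(ints, level=6):
    """Pack ints as little-endian int32 and return the zlib-compressed size."""
    raw = struct.pack(f"<{len(ints)}i", *ints)
    return len(zlib.compress(raw, level))


# Same values, same codec, same level: only the row order differs.
unsorted_size = compressed_size(values)
sorted_size = compressed_size(sorted(values))

# Hypothetical boolean column stored two ways, before any compression.
flags = [random.random() < 0.5 for _ in range(N)]
as_int32 = struct.pack(f"<{N}i", *(int(f) for f in flags))  # 4 bytes per flag


def pack_bits(bools):
    """Store one flag per bit, eight flags per byte."""
    out = bytearray((len(bools) + 7) // 8)
    for i, flag in enumerate(bools):
        if flag:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


as_bits = pack_bits(flags)

print("unsorted int32 column, compressed:", unsorted_size)
print("sorted int32 column, compressed:  ", sorted_size)
print("flags as int32, raw bytes:        ", len(as_int32))
print("flags as bits, raw bytes:         ", len(as_bits))
```

Sorting groups equal values into long runs, which general-purpose compressors handle well, and the bit-packed flags start out 32 times smaller than the int32 version before the codec even runs. Real columnar formats add their own encodings on top, so treat the numbers as directional only.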


Parquet has a few compression options. Not sure which one they are using.


Plus, Parquet isn't the least wasteful format; native DuckDB, for instance, compacts better. That's not just down to the compression algorithm, which, as you say, has three main options for Parquet.


...and remove all the political shit-slop since COVID/AI and it's probably under a gig.


You could download the data and run that analysis yourself. I’d be interested to see it, especially your method of identifying “political shit-slop” and “AI” and the relationship to COVID. Sounds like an interesting project.


Browser killing the tab way before it happens


A microphone + ADC is hearing though, that's the whole reason we even produce microphones. So that our electronics can hear sound.


So according to you, when can you qualify something as capable of hearing?

1. Vibrate according to the input sound, is that hearing?

2. Generate electrical signals according to the sound, is that hearing?

3. Amplify the electrical signal, do we cross the hearing mark?

4. Record the signal to a cassette tape (or use an ADC -> mp3), are we hearing yet?

5. Play it back through a speaker. Sure, we should be hearing now!

At which point exactly would you say the thing is definitely hearing?


You can reduce the human auditory process to a similar mechanical list. At which specific point would you say a human is hearing?

You've fallen into the trap of human exceptionalism but you don't seem to be aware of that fact. Are you a substance dualist or not?


>You can reduce the human auditory process to a similar mechanical list.

You can't. Because we don't know at which point sound gets registered in consciousness.


Because you can't even define what consciousness is, let alone objectively test for it.

You are entirely wrong though. You most certainly _can_ reduce the human auditory process to a (bio)mechanical list.

You have unilaterally, arbitrarily, and without justification added consciousness to that list.


>Because you can't even define what consciousness is, let alone objectively test for it.

Exactly. So if we understand "hearing" as something registered by consciousness, then implicitly things that are not conscious cannot "hear".

>reduce the human auditory process

Yes, human auditory process, yes. "Hearing", no. I see that you cleverly switched to "auditory process" instead of "hearing". Moving the goalposts, are we?


Alan Watts talks about this.

If a tree falls, does it make a sound? It depends on whether there is somebody to ultimately perceive the vibrations that the falling tree made (either directly or via recording).


It is easy to answer this if we define sound as the sensation. If there is no sensor then there is no sensation. If we define sound as the vibration of air, then yes, it will make a sound.

Most of these questions feel perplexing because some of the underlying terms are loosely defined. If we strictly define those terms, then the question answers itself.


> considering popular opinion there is absolutely no reason to enable it by default

You need to update your priors. The popular opinion is there is a reason to enable it by default. ChatGPT is the #5 most popular website in the world, more popular than Wikipedia, Reddit or Twitter. The vast majority of users want to use AI.

https://en.wikipedia.org/wiki/List_of_most-visited_websites


And there is no YouTube, Facebook or Instagram (#2, #3, #4) default integration for Mozilla


Firefox has YouTube integration of course.

Here's a Firefox file [1] specifically for integrating YouTube videos into their picture-in-picture system. Your random video website won't get this treatment of course; it needs to be a popular one.

Here's a piece of C++ code [2] in the Firefox engine that specifically rewrites old YouTube embeds from their old HTML embed snippet to the new one. Again, your own video website will never be so deeply integrated into Firefox because it's not a top 10 website.

Firefox Readability mode makes pages more readable by removing useless stuff like videos. Unless it's a YouTube [3] or other top-N popular video website of course. YouTube videos are given special treatment because YouTube is popular and having small integrations like this makes the user experience better.

[1]: https://github.com/mozilla-firefox/firefox/blob/1f43fe5ffadd...

[2]: https://github.com/mozilla-firefox/firefox/blob/1f43fe5ffadd...

[3]: https://github.com/mozilla-firefox/firefox/blob/1f43fe5ffadd...


Twitter is a garbage fire, after all.


Because people like chat bot sidebars.

My code editor has a built-in chat bot sidebar that I use every day. It's not a huge stretch that people who use chatbot sidebars in other applications would also want one in their browser.

ChatGPT is the #6 most popular website in the world, why wouldn't a browser want tighter integration with such a popular kind of service?


Should Firefox build in a separate side bar for every popular website? Would you want a Facebook side bar and Facebook account integration?

I wouldn't.


The ways users use Facebook and LLMs are so massively different, it almost seems like a bad faith argument to equate them.

Facebook is mostly scrolling the timeline and passive consumption. It doesn't benefit from being on the side because the content you interact with on Facebook is completely separate from the content on your other tabs.

In contrast, LLMs have ongoing conversations that the user can come back to, and each conversation might relate to multiple tabs that the user is working on. On top of that, it's a very common occurrence that the user has questions about, or a task to be done using, the content of the current page. This makes LLM and chatbot integration much more useful than a Facebook integration.

Also, if you have Facebook Messenger installed, Firefox already gives you an integration to share things with your Facebook contacts.


Fun fact: Firefox already had a Facebook sidebar and integration back in 2012! https://blog.mozilla.org/en/mozilla/firefox-introduces-new-s... The Social API was later removed though, because it was wildly unpopular (unrelated, but man, look at the good macOS and Firefox design in the screenshot...)


Haha, that's so typical Mozilla. Proves my point I suppose.


They kinda already do. Google is built in, just search right in the URL bar. You also got DDG, Bing, Wikipedia, Amazon, eBay? They make it easy to add YouTube, I wouldn't be surprised if you could add Facebook.

And like every browser does that. It's been that way for like over a decade...


Okay, and? Is anyone complaining about being able to search your favorite search engine from within Firefox?

Do you genuinely think this is comparable to Facebook integration? Do you believe that if Mozilla announced Facebook account integration and a Facebook side bar tomorrow, people's reaction to that would be, "oh this is just like what they did with search, this is fine"?

If not, isn't your comment a tiny bit disingenuous?


> My code editor has a built-in chat bot sidebar that I use every day.

Even as a vim user I don't get why an AI chat bot shoved into an IDE is endlessly praised while an optional hidden chatbot in a browser is treated like some grave insult. Last I checked, OpenAI was the 5th most visited website. No one complained that browsers made it easier to interface with the most popular website (Google) by directly typing into the url bar. FFS you can also do that with the 8th most popular website, Wikipedia.

I seriously don't understand why everyone is upset about that. Do what I do and just don't open it or interact with it. No one is making you use it. It's a trivial amount of bytes because it's literally just a wrapper. So it doesn't affect you, why let it live rent free in your head and make you angry? Just sounds like you're looking for things to complain about.


I ... am not convinced that the people who praise Microsoft for shoving Copilot into VS Code are the same people who criticize Mozilla for shoving ChatGPT into Firefox

Personally I dislike both, and VS Code marketing itself as an "AI code editor" is one of many reasons why I would never consider using VS Code.


Somehow people manage to run it without this magical release


They don't pick the keywords uniformly randomly from a list of all keywords though. They think they randomly picked something that popped up in their mind, but those keywords are either

- stuff they saw online recently — ads or otherwise, which put the keywords in their mind

- or stuff they were already interested in recently

Not hard to imagine targeting algorithms picking up on either of these


As I tell my friends

You don't see those "coincidental" ads because your phone is listening to you, you see them because your friend showed interest in the product and there's enough information to infer they talked to you about it. The good news is, your phone isn't listening to you without your consent. The bad news is, that's because it doesn't need to.


Are those your assumptions or something that has been tested?


That's not because 4o is good at things, that's because it's pretty much the most sycophantic model and people fall more easily for a model incorrectly agreeing with them than for a model correctly calling them out.


Not just companies, people too. It will have trouble getting into Linux distro repos. And a lot of devs/users avoid non-open-source projects, especially if they went looking for solutions on GitHub.


The distro repos are a good point, and the user who went looking for permissionless GitHub projects is an even better one.

