Roughly 0% of a typical desktop’s disk space is used by hardlinked files. You can safely double count them. That’s exactly what every disk space analyzer does!
If you really want to avoid double counting, just divide the size of every file by st_nlink. Of course, you'd then have to update the cached sizes of every directory that links to that inode, so you'd also need to cache the mapping from inodes to paths. Another solution is to cache two sizes per directory: one for files with a single link and another for files with two or more links. The UI could hide the latter when it's 0 GB. But this discussion is academic; nobody really cares about hardlinks.
> Roughly 0% of a typical desktop’s disk space is used by hardlinked files. You can safely double count them. That’s exactly what every disk space analyzer does!
While that may be a reasonable strategy, it's not what every disk space analyzer does.
I've got a backup setup that uses hardlinks to provide a wide variety of restore points without using a lot of space. du doesn't double count:
$ du -hs daily.0
436G daily.0
$ du -hs daily.1
436G daily.1
$ du -hs daily.0 daily.1
436G daily.0
12M daily.1
Not sure what you consider a "typical desktop", but on Windows, WinSxS has gigabytes worth of hardlinks. If you don't care about them that's another matter I guess.
Also note that the user will be confused when they delete the whole directory and observe that 0 bytes get freed. (A similar problem exists even if you double count.)
The point is, the problem itself is ill-defined. There's no solution other than scrapping or redefining the problem itself, and it's hard to define the problem precisely for a non-technical user.
Then you get into semantic arguments: if a directory contains two 1 GB files, does the user care that 99% of their blocks are shared, or is that just an under-the-hood implementation detail, and the user wants to know that there are 2 GB worth of files in there?
Really, there should be two file sizes: "how much space this will take if I copy it to another filesystem" and "how much space will be freed if I delete this".
Well, three: "how many bytes do I get if I open it and read all the bytes out". Or maybe four: how many bytes do I get if I open it and read all the bytes which aren't holes (i.e. how many bytes do I need to put into an archive that supports sparse files) :)
I would expect at least one of those cases to be identical to "how much space this will take if I copy it to another filesystem": if you're asking the generic question where the target is hypothetical, all you can report is the raw byte count of the files, since anything more precise requires knowing specific details about the target.
> "how much space this will take if I copy it to other filesystem"
This is ambiguous between "how many bytes are all these files in total" and "how many bytes does it take to store a single copy of all these files on such-and-such file system (mostly the current one)". The latter can be different because of transparent compression, which is common on e.g. BTRFS.