
Adding different kinds of measurements of size doesn't particularly increase the difficulty of tracking it here, just adds some more fields to the metadata and maybe some additional events.

To take an example of size data types from another thread, let's imagine you want to track 2 kinds of size: size as read sequentially (A) and size gained if deleted (B). The first cares about links but the second does not. Every folder would track the size of its sub-directories by listening for size calculation events firing on the contained files and folders. A lowest level folder would listen to the events of all its files and update its own size metadata. That update triggers another event that any parent folder is listening to, and so on up the tree. A link would just listen to the events of its linked file or folder and would only indicate size updates as relevant to size data type A. This information gets propagated upwards as normal.
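A minimal sketch of that propagation in Python, under some loud assumptions: the names `Node`, `File`, `Folder`, `Link`, `subscribe` and `apply` are made up for illustration, events are plain synchronous callbacks carrying a (delta A, delta B) pair, the link's own few bytes are ignored, and a link pointing at one of its own ancestors (a cycle) is not handled.

```python
class Node:
    """Anything with size metadata that others can listen to (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.size_a = 0  # A: size as read sequentially (follows links)
        self.size_b = 0  # B: size gained if deleted (ignores link targets)
        self.listeners = []  # callbacks taking (delta_a, delta_b)

    def subscribe(self, callback):
        self.listeners.append(callback)

    def apply(self, delta_a, delta_b):
        # Update our own metadata, then fire the event for whoever listens.
        self.size_a += delta_a
        self.size_b += delta_b
        for callback in self.listeners:
            callback(delta_a, delta_b)


class File(Node):
    def resize(self, new_size):
        # For a plain file both size types change by the same amount.
        delta = new_size - self.size_a
        self.apply(delta, delta)


class Folder(Node):
    def add(self, child):
        # Listen to the child, and account for what it already contains.
        child.subscribe(self.apply)
        self.apply(child.size_a, child.size_b)
        return child


class Link(Node):
    def __init__(self, name, target):
        super().__init__(name)
        # A link only forwards updates for size type A.
        target.subscribe(lambda delta_a, delta_b: self.apply(delta_a, 0))
        self.apply(target.size_a, 0)


root = Folder("root")
docs = root.add(Folder("docs"))
report = docs.add(File("report.txt"))
report.resize(100)
shortcut = root.add(Link("shortcut", docs))
report.resize(150)
print(root.size_a, root.size_b)  # A counts docs twice (via the link), B once
```

Adding a third kind of size here would mean one more field and one more element in the delta, which is the "just adds some more fields to the metadata" part.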

As you say, there are a lot of details here, but I'm not coming up with any blockers that would invalidate the pattern. It's a very flexible system, and I've written this sort of propagating update before, with future features all being simple to add. If you can think of any blockers I would be very happy to hear them so I can design around them next time I touch a project like this.



This is exactly how I assumed modern operating systems would operate (macOS seems to work this way because you can show size subtallies, but it actually polls and caches; Windows doesn’t even bother unless you get properties on a directory, which is lame; I think BeOS may have actually done this?), and I was disappointed when I realized they didn’t… and still don’t for some reason? Why not? Couldn’t they just route all OS-level ways of writing to the storage medium through a system like this?


There's no real reason they couldn't, but polling systems tend to be the first thing implemented, as they are conceptually simpler and require less rigor to enforce. Switching from a polling pattern to an event pattern, however, is a sizeable amount of work, and you frequently get bugs in the switch. The discussion here is in the context of a greenfield project where such concerns aren't an issue.

To be clear, it's not like an event registration system is better in every way than polling. It's trading some extra disk and (potentially) memory usage for upfront and cheaper CPU costs. It is possible that whoever made the decision weighed the two options and decided on polling instead of events, and that this decision was made long ago, when OS filesystems were first being designed and storage was at much more of a premium.
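For comparison, the polling side of that trade is roughly "store nothing, walk the tree when asked". A minimal sketch using Python's standard `os.walk` and `os.lstat`; the helper name `polled_size` is made up, and it sums apparent file sizes only (no block rounding, and symlinks are counted as the link itself rather than followed).

```python
import os


def polled_size(root):
    """Recompute a directory's total size by walking it on demand."""
    total = 0
    # os.walk does not follow directory symlinks by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat reports a symlink's own size, not its target's.
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total
```

No extra metadata is kept between calls, but every query costs a full walk of the subtree, which is the CPU cost the event approach pays upfront instead.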


I don't see why "size gained if deleted" "does not [care about links]".

What I've learned in my experience with ZFS, Lustre, and other filesystems is that the user simply cannot get what you're asking for, not with any kind of real reliability. For distributed filesystems (like Lustre), the kind of thing you're asking for is simply ETOOHARD or ETOOSLOW. It's very easy to insist on a solution, and very hard to get one.


> I don't see why "size gained if deleted" "does not [care about links]".

Deleting links does not give you any additional storage space beyond the minimal amount taken up by the link itself.

As for existing filesystems, yeah, they're going to have problems, as for the most part they're built around polling. I'm not insisting on a solution or saying existing filesystems need to use this though; this is all in the context of a greenfield project. Switching something like Linux to use an event based system instead would be a major project.

As for distributed filesystems where every file does not own all its own bits, that's just a different way of measuring and doesn't have any major problems for the concept.


Deleting the last link does. All the links are equal; any one of them could be the last one remaining.
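A toy model of that point, assuming hard-link semantics where every directory entry is just a name for a shared inode with a link count. The `Inode` and `Directory` classes are illustrative only, not any real filesystem's API.

```python
class Inode:
    """The file's data; it exists as long as at least one name refers to it."""

    def __init__(self, size):
        self.size = size
        self.nlink = 0  # number of directory entries pointing here


class Directory:
    def __init__(self):
        self.entries = {}

    def link(self, name, inode):
        self.entries[name] = inode
        inode.nlink += 1

    def unlink(self, name):
        """Remove one name; return how many bytes of data that frees."""
        inode = self.entries.pop(name)
        inode.nlink -= 1
        # Space is only reclaimed when the last remaining name goes away.
        return inode.size if inode.nlink == 0 else 0


d = Directory()
data = Inode(4096)
d.link("original", data)
d.link("alias", data)
print(d.unlink("original"))  # frees nothing, "alias" still refers to the data
print(d.unlink("alias"))     # the last link, so the data is freed
```

So "size gained if deleted" for a single name depends on how many other names exist at that moment, and neither name is special.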



