I think this only happens if the user manually switches to Latin-1 encoding. In that case IE will try to use the same encoding when submitting form data. The user might do this if you already have encoding problems and present a mixed Latin-1/UTF-8 page. The snowman hack serves to prevent the corruption from spreading.
> I think this only happens if the user manually switches to Latin-1 encoding.
That's correct. If you send your HTML document with a charset of UTF-8 (in the Content-Type header), then IE will submit forms using UTF-8 even if the user doesn't enter any non-ASCII characters, unless the user changes the encoding. I have yet to hear a compelling reason why an ordinary user would do that under ordinary circumstances.
> The snowman hack serves to prevent the corruption from spreading.
It's clever, but the framework could also just reject POST and GET requests which contain invalid UTF-8 byte sequences. (I'm flabbergasted that Ruby doesn't do this[1].) Otherwise a malicious user could try to inject invalid byte sequences into your database by sending crafted requests which nevertheless contain the "utf8=✓" parameter. And speaking from experience, you do not want to have to deal with encoding problems in your database.
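To make the "just reject it" idea concrete, here is a minimal sketch in Python (not Ruby, and not how any particular framework does it). It assumes the server has the raw percent-encoded form value in hand; `decode_form_value` and `is_valid_utf8_form_value` are hypothetical names made up for illustration.

```python
from urllib.parse import unquote_to_bytes


def decode_form_value(raw: str) -> str:
    """Percent-decode one form value and require strict UTF-8.

    Raises UnicodeDecodeError if the decoded bytes are not valid UTF-8.
    """
    # In form encoding, "+" stands for a space.
    data = unquote_to_bytes(raw.replace("+", " "))
    # Strict decoding (the default) refuses malformed byte sequences.
    return data.decode("utf-8")


def is_valid_utf8_form_value(raw: str) -> bool:
    """Return True if the value decodes cleanly as UTF-8."""
    try:
        decode_form_value(raw)
        return True
    except UnicodeDecodeError:
        return False
```

With this sketch, `caf%C3%A9` (UTF-8 "café") is accepted, while `caf%E9` (the same word as Latin-1 bytes) is rejected, regardless of whether the request also carries "utf8=✓".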
Whether or not you use this hack, you can't naively trust the client to always send valid UTF-8; you're right about this. But because of this bug in IE, rejecting posts with invalid UTF-8 as malicious will net you some false positives, where the user isn't malicious but the browser is being stupid. This hack takes care of the stupidity, leading to a better user experience for people who would otherwise have tripped the false positive.
What if a sequence of byte values is valid in the charset that IE uses to encode the form data as well as in UTF-8, but is interpreted as different characters in UTF-8? With your method you would not detect an error and use the wrong characters.
(Except if IE sends a Content-Type header with the actual encoding used, and this header is evaluated on the server side to convert the form data into a string. But in that case you would need to check not for invalid UTF-8, but for byte sequences that are invalid in the charset specified in the Content-Type header.)
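A small Python illustration of the ambiguity described above, assuming for the sake of example that the browser encodes the form as Latin-1 and the server decodes it as strict UTF-8. The function names and the sample input are hypothetical.

```python
def submitted_bytes(text: str, browser_charset: str) -> bytes:
    """Bytes the browser would send for `text` in the given charset."""
    return text.encode(browser_charset)


def server_reads(data: bytes) -> str:
    """What a server sees if it decodes the bytes as strict UTF-8."""
    return data.decode("utf-8")


# Hypothetical input: the user really did type these two characters.
typed = "Ã©"
wire = submitted_bytes(typed, "latin-1")  # the two bytes 0xC3 0xA9
received = server_reads(wire)             # decodes without error, as "é"
```

The bytes 0xC3 0xA9 are valid in both encodings, so a strict UTF-8 check passes, yet `received` is "é" rather than the "Ã©" the user typed. Validation alone cannot catch this case; the server would have to know which charset was actually used.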