Hacker News

As far as I've seen, local OSS video understanding models just really aren't there yet. I briefly looked at facial recognition models, but a good amount of the signal was actually in the video's audio rather than the raw video frames. It depends on the accuracy you're looking for at the end of the day.


Thanks for the reply. Let's hope local models catch up.


