Congratulations! Looks awesome.
1. I found it very intuitive.
2. If I could have smart filtering using LLM classification, that would be very powerful. Any plans on doing that?
As in a search box where you can ask free-form queries rather than applying filters? We haven't heard much demand for that yet, so we haven't prioritized it. We will if it becomes a common request.
TL;DR - the benchmark depends on its specific dataset, and it isn't a perfect measure of AI progress.
That doesn't mean it doesn't make sense or doesn't have value.