jauws's comments

jauws · 2025-08-15T16:06:09 1755273969

Thanks! Anecdotally, I'd say Claude 3.7 tends to improve the most, but judging by the leaderboard, some people really prefer Grok-3 lol.

jauws · 2025-08-15T06:22:27 1755238947

Thanks for the comment! Do you mind linking the site? Would love to check it out! That's a very fair point about the technical error aspect. Even with all the confounding variables (author skill differences, model selection based on price/speed, etc.), I'd say it's probably the most mature signal we have right now, though still far from ideal.

Really interested in what you've been working on for the past year! Are you doing custom fine-tuning, or is it more on the prompting/post-processing side? Also, I definitely need to check out the Midjourney onboarding. It sounds super interesting as inspo for your point about personalization + taste!

BoorishBears · 2025-08-16T02:04:33 1755309873

My 2nd most recent submission has a link to it

Most of it has been fine-tuning (SFT/DPO/GRPO), but also a lot of prompting and adding steps between the user's prompt and the output

jauws · 2025-08-15T06:02:47 1755237767

This is an amazing suggestion! Will definitely try to figure out a way to incorporate this into the leaderboard without making it a constant each time. I'm currently using OpenRouter's default parameters, which is totally a brainfart on my part.

jauws · 2025-08-15T05:57:20 1755237440

Thanks Johnny! I totally agree with you, and I really appreciate you checking out my project!