Unrelated to this, but I was able to get some very accurate health predictions for a cancer patient in my family using Gemini and lab test results. I would actually say that, other than one doctor, Gemini was more straightforward and honest about how and, more importantly, WHEN things would progress. Nearly to the day on every point over 6 months.
Pretty much every doctor would only say vague things like "everyone is different, all cases are different."
I did find this surprising considering I am critical of AI in general. However, I think it's less that the AI is good than that doctors simply don't like delivering hopeless information. An entirely different problem. Either way, the AI was incredibly useful to me on a literal life-and-death subject I have almost no knowledge about.
We must have a different definition of arbitrary. OP ran 2.3 million tests comparing random battles against the original implementation? That's probably what you or I would do if we were given this task without an LLM.
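For anyone unfamiliar, that's just differential testing: drive both engines with the same seeded RNG and diff the outputs. A minimal sketch of the idea in TypeScript (the simulateBattle functions and both module paths are hypothetical placeholders, not OP's actual API):

    // Differential test sketch: run the original and the port on the same
    // seeded "random" battles and compare their logs turn by turn.
    // NOTE: simulateBattle and both import paths are hypothetical placeholders.
    import { simulateBattle as original } from "./original/sim";
    import { simulateBattle as ported } from "./ported/sim";

    function runTrials(trials: number): number {
      let mismatches = 0;
      for (let seed = 0; seed < trials; seed++) {
        // Same seed => both implementations see an identical battle.
        const a = original({ seed, format: "randombattle" });
        const b = ported({ seed, format: "randombattle" });
        if (JSON.stringify(a.log) !== JSON.stringify(b.log)) {
          mismatches++;
          console.error(`seed ${seed}: outputs diverged`);
        }
      }
      return mismatches;
    }

    // OP reportedly ran 2.3 million trials; a smaller count is shown here.
    const failed = runTrials(10_000);
    console.log(failed === 0 ? "all trials matched" : `${failed} mismatches`);

Seeding is the whole trick: it makes "random" battles reproducible, so any divergence points at a real behavioral difference rather than luck.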
Well, I cloned the repo and cannot reproduce this battle test by following the instructions. A required file called dex.js is missing, among other things, and there are other suspiciously wrong details for what appears on the surface to be a well-organized project.
I'm very suspicious of such projects, so take this for what you will, but I don't have time to debug some toy project. If it was presented as complete but the instructions don't work, that's a red flag for the increasingly AI-slop internet to me.
I'm saying I think they may have used one simple trick called lying.
Am I the only one who is going to call this out? Am I the only person who cloned the repo to run it and found out it does nothing? This is disingenuous at best. This is not a working project; they even admit this at the end of the article, just not directly.
>Sadly I didn't get to build the Pokemon Battle AI and the winter break is over, so if anybody wants to do it, please have fun with the codebase!
In other words, this is just another smoking wreck of a hopelessly incomplete project on GitHub. There are even imaginary instructions for running it in Docker, which doesn't exist. How would I have fun with a nonsense codebase?
The author just did a massive AI slop generation and assumes the code works because it compiles and some equivalent-output tests passed. All that was proven here is that by wasting a month you can individually rewrite a bunch of functions in a language you don't know, if you already know how to program, and it will compile. This has been known for 2-3 years now.
This is just AI propaganda or resume padding. Nothing was ported or done here.
Sorry, what I meant to say is AI is revolutionary and changing the world for the better...
It's bad. The most depressing part is that it's because of de-funding, not AI. At the same time, this field is probably one of the only avenues for escaping the AI sinkhole, but it's being dismantled rather than built up. Source: my partner in research.
I think there is something here, but not much. The majority of businesses are carrying SaaS products that are an entire marching band when all they want is a drummer and a guitar player. Making bespoke, efficient tools will surge for sure.
The problem is that building these tools all serves the same end: increasing industry control by a few players and further widening wealth inequality. Which leads us back to: where does everyone go to work at that point?
We are at some sort of societal inflection point where we need new industries, but only 20% of degrees are in some sort of science. 80% of degrees are in what is becoming nothing more than resume checkboxing for jobs that soon won't exist. Who is going to make the next big industry breakthrough when 80% of degrees are in things like business management? I don't see any push to get people into college for actual meaningful progress.
It seems it did happen: humans have hit post-scarcity in survival terms (unevenly distributed). However, we have in no way planned for what happens next; fairness has never been a priority. The cutthroat capitalism that made this possible is now eating itself, with no plans to change.
I think we hit peak AI improvement velocity sometime in the middle of last year. The reality is that all the progress was made using a huge backlog of public data, and there will never be 20+ years of authentic data dumped on the web again.
I've hoped against it, but suspected that as time goes on, LLMs will become increasingly poisoned by the well of the closed loop. I don't think most companies can resist the allure of more free data, as bitter as it may taste.
Gemini has been co-opted as a way to boost YouTube views. It refuses to stop showing you videos no matter what you do.
> I don't think most companies can resist the allure of more free data, as bitter as it may taste.
Mercor, Surge, Scale, and other data labelling firms have shown that's not true. Paid data for LLM training is in higher demand than ever for this exact reason: Model creators want to improve their models, and free data no longer cuts it.
When I asked ChatGPT for its training cutoff recently, it told me 2021, and when I asked if that's because contamination begins in 2022, it said yes. I recall it used to give a date in 2022 or even 2023.
Rule of thumb: never ask ChatGPT about its inner workings. It will lie or fabricate something, and it will probably say something completely different next time.
How? I just asked ChatGPT 5.2 for its training cutoff, and it said August 2025. I then tried to dig down to see if that was the cutoff date for the base model, and it said it couldn't tell me and that I'd have to infer it by other means (and that it's no longer even a well-formed query, given the way they do training).
I was double-checking because I get suspicious whenever I ask an AI to confirm anything. If you suggest a potential explanation, they love to agree with you and tell you you're smart for figuring it out. (Or just agree with you, if you have ordered them not to compliment you.)
They've been trained to be people-pleasers, so they're operating as intended.
To be honest, for most things, probably yeah. I feel like there is one thing that is still being improved, or could be: generating vibe-coded projects, or anything with real depth. I recently tried making a WHMCS alternative in Golang, and surprisingly it's almost production-level, with a very decent UI, and I made it hook into my custom gVisor + Podman + tmate instance. But I still had to tinker with it.
I feel like the only human intervention left at this point that might be relevant for further improvement is us trying out projects, tinkering, asking it to build more, passing it issues, and then greenlighting that the project looks good to me (the main part).
Nowadays AI agents can work on a project: read issues, fix them, take screenshots, and repeat until the project is finished. But I have found that after seeing the end result, I get more ideas and add onto it, and after multiple attempts, if there's any issue it didn't detect, that takes a lot of manual tweaks too.
And after all that's done and I get good code, I either say "good job" (like a pet, lol) or just stop using it, which I feel could be a valid datapoint either way.
I don't know. I tried it and thought about it yesterday, but the only improvement left to add now is a human actually saying LGTM, or a human inputting data into it (either custom data or some niche open-source idea that it didn't think of).
I've found a funny and simple technique for this. Just write "what the F$CK" and it will often seem to unstick from repetitiveness or refusals ("I can't do that").
Actually, just writing the word F#ck will often do it. Works on coding too.
This is like the ultimate version of going back 1000+ years economically and socially, to when a merchant would size up how desperate or rich they thought you were and charge you based on that rather than a reasonable price.
It wastes the time of the poor, who must be willing to walk away with nothing even when they can "afford it," and it further deepens the problem when you are desperate.
Except now they can also spy on you 24/7, buy information from other spies while they make their decisions, and have 100% information asymmetry. Now they also HAVE to charge you more to recoup the money they spend spying on you, rather than just running a normal business.
You already answered your own question, though. It is the peak of exploiting power and wealth disparity; there is zero chance of it being used beneficially.