I agree. Here is my thinking.
What if LLM providers made short answers the default (for example, up to 200 tokens, unless the user explicitly enables “verbose mode”)? Add prompt caching on top and route simple queries to smaller models.
Result: a 70%+ reduction in energy consumption without loss of quality.
Current cost: 3–5 Wh per request. At ChatGPT scale, this is $50–100 million per year in electricity (at U.S. rates).
In short mode: 0.3–0.5 Wh per request. That is $5–10 million per year — savings of up to 90%, or 10–15 TWh globally with mass adoption. This is equivalent to the power supply of an entire country — without the risk of blackouts.
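A rough back-of-envelope check of those figures, as a sketch; the request volume and electricity rate below are my own assumptions, not published numbers:

```java
// Back-of-envelope estimate of annual inference electricity cost.
// All inputs are illustrative assumptions, not measured figures.
public class EnergyEstimate {
    public static void main(String[] args) {
        double requestsPerDay   = 1_000_000_000.0; // assumed request volume
        double whPerRequestLong = 4.0;             // midpoint of 3-5 Wh per verbose answer
        double whPerRequestShort = 0.4;            // midpoint of 0.3-0.5 Wh per short answer
        double usdPerKwh = 0.07;                   // assumed U.S. industrial rate

        double annualKwhLong  = requestsPerDay * 365 * whPerRequestLong  / 1000.0;
        double annualKwhShort = requestsPerDay * 365 * whPerRequestShort / 1000.0;

        System.out.printf("Verbose: %.2f TWh/yr, ~$%.0fM/yr%n",
                annualKwhLong / 1e9, annualKwhLong * usdPerKwh / 1e6);
        System.out.printf("Short:   %.2f TWh/yr, ~$%.0fM/yr%n",
                annualKwhShort / 1e9, annualKwhShort * usdPerKwh / 1e6);
    }
}
```

With those assumptions it lands at roughly 1.5 TWh and ~$100M per year for verbose answers versus ~0.15 TWh and ~$10M for short ones, which is in the same ballpark as the figures above.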
This is not rocket science: just a toggle in the interface and, I believe, minor changes to the system prompt. It increases margins, reduces emissions, and frees up resources for real innovation.
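As a sketch of how small the client-side change could be: an OpenAI-style chat request with a brevity system prompt and a hard cap on output tokens. The endpoint, model name, and exact limit are placeholders, not a real provider's API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of a "concise mode" request: a brevity system prompt plus a hard
// output-token cap. Endpoint, model name, and limits are placeholders.
public class ConciseModeRequest {
    public static void main(String[] args) throws Exception {
        String body = """
            {
              "model": "example-model",
              "max_tokens": 200,
              "messages": [
                {"role": "system",
                 "content": "Answer as briefly as possible. Do not restate the question or add caveats unless asked."},
                {"role": "user", "content": "What is the capital of France?"}
              ]
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + System.getenv("API_KEY"))
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```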
And what if the EU or California mandated such a mode? That would have a major impact on data-center economics.
I used to be a Dell customer; all my family members had Dell Latitude laptops. But I agree, they got worse: I had a fan break, a key fall out, and Bluetooth issues on some of them, so when it was time to upgrade, I moved to an Apple MacBook. It took some time to learn my way around macOS, but I am happy now.
It’s per-core licensing with core factors; the core factor ranges from 0.25 to 1. I have not checked how it works now, but that is how it was 5–10 years ago.
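For anyone unfamiliar with the scheme: licensable cores = physical cores × core factor, then multiply by the per-core list price. A quick illustration; the core counts, factors, and price below are made-up examples:

```java
// Illustration of per-core licensing with core factors.
// Core counts, factors, and list price are made-up example numbers.
public class CoreFactorExample {
    static double licensesNeeded(int physicalCores, double coreFactor) {
        return physicalCores * coreFactor; // licensable cores
    }

    public static void main(String[] args) {
        double pricePerLicense = 40_000.0; // hypothetical per-core list price

        double x86  = licensesNeeded(32, 0.5); // e.g. 32-core x86 box, factor 0.5 -> 16 licenses
        double risc = licensesNeeded(16, 1.0); // e.g. 16-core chip, factor 1.0    -> 16 licenses

        System.out.printf("x86 box:  %.0f licenses, ~$%,.0f%n", x86, x86 * pricePerLicense);
        System.out.printf("RISC box: %.0f licenses, ~$%,.0f%n", risc, risc * pricePerLicense);
    }
}
```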
Nowadays, when I post one of our openings, I get thousands of applicants, far more than 2–3 years ago, and over 90% are irrelevant. To limit the applicant count, I ask for references. For example, just the other day a candidate reached out directly; I asked if he could provide references and then heard nothing back. It’s the quickest way to filter out irrelevant candidates. Now I am employing the same tactic in an ATS I am building, where references make you stand out.
I am using Neo4j to build an equipment database, and I use MySQL in the same project to store transactional data. It took some time to figure out the right syntax for a Spring Boot/Neo4j Cypher query, but now it works fine. Why did I choose Neo4j? Because I wanted to play with it :). I can say it is more flexible than a relational database. I would like to keep using it, but you can't create multiple databases within a single instance; I guess it is possible by running separate installations, but I have not tried that yet.
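For reference, the kind of mapping I mean looks roughly like this in Spring Data Neo4j; the Equipment node, its properties, and the INSTALLED_AT relationship are simplified stand-ins for my actual model:

```java
import java.time.LocalDate;
import java.util.List;

import org.springframework.data.neo4j.core.schema.GeneratedValue;
import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;
import org.springframework.data.neo4j.repository.Neo4jRepository;
import org.springframework.data.neo4j.repository.query.Query;
import org.springframework.data.repository.query.Param;

// Simplified sketch of a Spring Data Neo4j mapping; the Equipment node and
// the INSTALLED_AT relationship are stand-ins for the real model.
@Node("Equipment")
class Equipment {
    @Id @GeneratedValue
    Long id;
    String serialNumber;
    LocalDate installedOn;
}

interface EquipmentRepository extends Neo4jRepository<Equipment, Long> {

    // Custom Cypher: all equipment installed at a given site, newest first.
    @Query("""
           MATCH (e:Equipment)-[:INSTALLED_AT]->(s:Site {name: $siteName})
           RETURN e
           ORDER BY e.installedOn DESC
           """)
    List<Equipment> findBySite(@Param("siteName") String siteName);
}
```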
But its formal syntax and formal semantics are underspecified. People might not be able to agree on the interpretation of natural language sentences in general, or grammaticality judgments, or how to handle sentences that stop in
I live in such a country: the local currency is volatile and the laws keep changing. There is no long-term planning; projects should span 1–2 years, and if they bring income, you are lucky. In the long term, the situation is not good.