Hacker News | tgtweak's comments

Are you using the full BF16 model or a quantized mlx4?

Not sure what the default is – whatever that was. It's probably the quantized mlx4 if I had to guess.

I like the byteshape quantizations - they use dynamic, variable-bit-width quantization tuned for quality vs overall size. They seem to make fewer errors at lower "average" bit widths than the unsloth 4-bit quants. I think this is similar to variable bitrate video compression, where you keep higher bits where they help overall model accuracy.

Should be able to run this in 22GB vram so your 4090 (and a 3090) would be safe. This model also uses MLA so you can run pretty large context windows without eating up a ton of extra vram.

edit: 19GB vram for a Q4_K_M - MLX4 is around 21GB so you should be clear to run a lower quant version on the 4090. Full BF16 is close to 60GB so probably not viable.
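For anyone who wants to sanity-check these numbers, here is a rough back-of-the-envelope sketch. It assumes weight memory is roughly parameter count times average bits per weight, that the model has about 30B parameters (inferred from the ~60GB BF16 figure above, not confirmed), and that a 4-bit-class quant averages about 4.8 bits per weight. `weight_gb`, `fits` and the 2GB overhead allowance are illustrative names and values, not taken from any tool.

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes)."""
    # params_billions * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billions * bits_per_weight / 8.0


def fits(params_billions: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in the given VRAM."""
    return weight_gb(params_billions, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    ASSUMED_PARAMS_B = 30.0  # assumption: inferred from the ~60GB BF16 size
    for name, bits in [("BF16", 16.0), ("4-bit-class quant (assumed 4.8 bpw)", 4.8)]:
        size = weight_gb(ASSUMED_PARAMS_B, bits)
        print(f"{name}: ~{size:.1f} GB, fits in 24 GB: {fits(ASSUMED_PARAMS_B, bits, 24.0)}")
```

This ignores KV cache and activation memory beyond the flat overhead, so treat it as a first-pass filter only.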


It's been mentioned that this model is MLA capable, but it seems like the default vLLM params don't use MLA. Seeing ~0.91MB KV Footprint per token right now. Are you getting MLA to work?
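To show why the per-token footprint matters, here is a minimal sketch comparing a conventional K/V cache with an MLA-style compressed cache. Every dimension below is hypothetical (picked so the uncompressed case lands near the ~0.91MB quoted above), not read from any model's config, and the MLA formula assumes the DeepSeek-V2-style layout where each layer caches one compressed latent plus a small decoupled RoPE key.

```python
def kv_bytes_per_token_standard(n_layers: int, n_kv_heads: int,
                                head_dim: int, bytes_per_elem: int = 2) -> int:
    """Full K and V vectors cached for every KV head in every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def kv_bytes_per_token_mla(n_layers: int, latent_dim: int,
                           rope_dim: int, bytes_per_elem: int = 2) -> int:
    """One compressed latent plus one shared RoPE key cached per layer."""
    return n_layers * (latent_dim + rope_dim) * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical dimensions, 16-bit cache entries.
    standard = kv_bytes_per_token_standard(n_layers=47, n_kv_heads=20, head_dim=256)
    mla = kv_bytes_per_token_mla(n_layers=47, latent_dim=512, rope_dim=64)
    print(f"standard: {standard / 2**20:.3f} MiB/token")
    print(f"MLA:      {mla / 2**20:.3f} MiB/token")
    print(f"ratio:    {standard / mla:.1f}x")
```

If the measured footprint matches the first number rather than the second, that would be consistent with the engine falling back to an uncompressed cache.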

Few thoughts/observations on Tariff impacts now that we have a decent amount of time/data to look at:

Suppliers in China are dropping prices to offset the tariff impact - this is what I see in my direct industry and also in many adjacent ones. This is benefitting other countries that don't have Tariffs on Chinese goods since they can buy cheaper as well. I suspect this is a significant factor in the GBP/EUR strengthening in relation to USD. There was a point where there was such a pronounced impact to imported goods that the cost of shipping a container from China to US went from ~$3500 to <$1500 pre/post Tariff.

Large manufacturers (automotive certainly, but also raw materials production and component production) are actually moving facilities to the US, which was one of the intended effects.

US manufacturers are enjoying some price relief as landed costs of Chinese-produced goods are increasing. Hard to quantify what this means, but the frustrating part is that they are not reducing their prices, just enjoying higher margins.

Countries outside the Tariff zone are enjoying more trade - Canada is a very real example of this policy backfiring - they just walked back the Chinese Automotive Tariffs in exchange for relief on agricultural reciprocal tariffs. Mexico is entering into similar agreements with non-US trade partners. Some products are releasing in non-US markets first and at lower costs than they are in the US.

US-sourced chipmaking is accelerating - Intel's new fabs are probably the most prominent example of this (albeit they are slow to pick up volume - I expect this will shift with TSMC rationing production to brands like Apple and Qualcomm).

I think the increase in cost to consumers is painful and that the current Tariff rates are excessive + there is a lot of "cheating" where Chinese suppliers are declaring lower cost of goods on import to reduce the landed cost of Tariffed goods - there doesn't appear to be enough resources to police this policy fully.

All in all an interesting economic experiment and it will certainly take years for any of these realities to have a measurable positive impact domestically.


> This is benefitting other countries that don't have Tariffs on Chinese goods since they can buy cheaper as well.

What countries are those? As far as I know, most important markets have great tariffs on Chinese goods.


Most have low/no duties on most goods categories from China - aside from a few specific goods categories in developed markets - most goods flow out of China to export countries without tariff/duties. ASEAN markets and most of Africa have zero tariffs. I don't think there are any developed markets (which are China's main export destinations) with double-digit tariff averages except India. Using Germany as an example... Electronics (hs code 85) is 0-6% tariffs on Chinese origin goods (averages out under 2%), and has been that way for a while. US on the other hand is now 7.5-27.5% where it was previously 0-4%. While they aren't "no tariffs", an average ~2% tariff across all imported goods is very small compared to what is shelved in the US. ASEAN is almost entirely no tariff.

So when the US is high teens/low 20's average tariff and the rest of the developed world is 0-5%, that's essentially zero as far as market pressure is concerned.

Chinese imports to the US were down to 8% in October 2025, compared to >13% prior to recent trade policy changes. In the same period, EU imports from China have grown 5%. My point remains that the US exerting import pressure on China is benefitting the rest of the world's buying power with China, at its expense.


When I import Chinese goods to Europe as a consumer, I have to pay a 25% VAT, which functions as a tariff. It is officially a tariff as far as the government is concerned, as it is collected by customs and not by internal revenue.

That is of course paired by the fact that domestic products also have the same VAT to be paid by consumers, but domestic producers are reimbursed their VAT expenses to varying degrees depending on their business model, while there is no such function for foreign imports.

All in all the VAT functions as a tariff as far as the exporter is concerned, in my understanding.

> My point remains that US exerting import pressure on China is benefitting the rest of the world's buying power with China, at its expense.

I agree with that, and can't for the life of me understand why foreign nationals rage so incessantly against the US tariffs. The expected outcome is that those consumers get to enjoy cheaper imports from China and from each other, as companies need to find new export markets. Or even cheaper domestic goods, as domestic companies have to unload produce which isn't getting exported.


VAT is a sales tax not a duty/tariff - this is due by you as a consumer not the importer were a company to purchase it for resale. If you bought the same goods from a local seller you'd have the same VAT. EU has some exceptions on goods from the EU (in the form of IOSS/de minimis) but by and large the VAT operates like a sales tax which many US states also have (as do Canadian Provinces and AU residents).

For sake of comparison, they are not usually conflated with import/tariff/duties.
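A quick worked example of the difference, as a sketch with made-up rates (2% duty, 25% VAT) and assuming VAT is charged on the duty-inclusive value: the VAT multiplies both the domestic and the imported price equally, so the only relative disadvantage for the import is the duty.

```python
def shelf_price(net_price: float, duty_rate: float, vat_rate: float) -> float:
    """Consumer price: duty applied first, then VAT on the duty-inclusive value."""
    return net_price * (1.0 + duty_rate) * (1.0 + vat_rate)


if __name__ == "__main__":
    VAT = 0.25   # hypothetical VAT rate
    DUTY = 0.02  # hypothetical import duty

    domestic = shelf_price(100.0, 0.0, VAT)   # domestic goods pay no duty
    imported = shelf_price(100.0, DUTY, VAT)  # imports pay duty, then the same VAT

    print(f"domestic: {domestic:.2f}")
    print(f"imported: {imported:.2f}")
    # The relative gap equals the duty rate, whatever the VAT rate is.
    print(f"import premium: {imported / domestic - 1.0:.2%}")
```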


>expected outcome is that those consumers get to enjoy cheaper imports from China

That's the econ 101 expectation, but the reality after the initial tariff chaos settles is a cat-and-mouse game of circumvention - PRC producers price discriminate, i.e. sell to RoW slightly higher to make up for fewer sales in the US / cover transshipment / diversion costs, arbitrage different tariff rates etc. PRC producers that still sell to the US also simply value engineer / despec more (lower quality), make US SKUs with stripped functionality etc to recoup on a $ per value basis, and use other shenanigans like value shifting - selling at "lower" cost but recouping via other bundled fees, i.e. buyers buy widgets low, but also have to buy "service" packages in arbitrage jurisdictions so PRC producers still end up netting the same. All these strategies are happening btw. The main issue is that the PRC producer advantage is so large and the tariff execution so leaky that they can still underprice and get products to the US at a profit to stunt US reindustrialization. Meanwhile the PRC industrial index for manufacturing is increasing while the producer index is dropping because they have access to cheap renewables / discounted sanctioned fossils, so their input costs are dropping vs RoW, allowing them to make more by selling cheaper, which extends the gap of how much tariff shenanigans/engineering they can do to mitigate the goal of US tariffs in the first place. Hence the PRC exporting record #s and the US SME manufacturing index being on its 3rd quarter of contraction from tariff drama. Meanwhile, much of RoW, without PRC god-tier industrial chains, is simply left eating straight shit on US tariffs vs the PRC, which has many more tools to circumvent. It's like how regulations favor big incumbents with the most headroom/resources to exploit loopholes. TLDR: supply chain manipulation by the world's largest supply chain skews the expected market equilibrium.


There are tariffs on every country, not just China. This isn’t like his first term in office. Can you speak to the impacts of those tariffs?

Looking purely at import volumes, most of those other countries haven't experienced drop-offs like China. Canada exports about the same amount of goods to the US today as they did in early 2024. Mexico is in a similar trend. What I see is that when Chinese buying dipped due to exorbitant tariffs, Canada and Mexico spiked until they were also hit with tariffs, returning to their previous levels.

Interesting comment but I have to ask why you keep capitalizing "Tariff".

They may be a German speaker betrayed by muscle memory while typing in English.

But they didn't capitalize any of the other nouns.

Is it too much to expect companies to share some of this in the open vs just the results?

Very interesting but the slight issue I see here is one of data: the information that is recorded and in the training data here is heavily skewed to those intelligent/recognized enough to have recorded it and had it preserved - much less than the current status quo of "everyone can trivially document their thoughts and life" diorama of information we have today to train LLMs on. I suspect that a frontier model today would have 50+TB of training data in the form of text alone - and that's several orders of magnitude more information and from a much more diverse point of view than what would have survived from that period. The output from that question "what happened in 1834" read like a newspaper/bulletin which is likely a huge part of the data that was digitized (newspapers etc).

Very cool concept though, but it definitely has some bias.


Models today will be biased based on what's in their training data. If English, it will be biased heavily toward Western, post-1990s views. Then, they do alignment training that forces them to speak according to the supplier's morals. That was Progressive, atheist, evolutionist, and CRT when I used them years ago.

So, the OP model will accidentally reflect the biases of the time. The current, commercial models intentionally reflect specific biases. Except for uncensored models which accidentally have those in the training data modified by uncensoring set.


> but it definitely has some bias.

To be frank though, I think this is a better way than all people's thoughts all of the time.

I think the "crowd" of information makes the end output of an LLM worse rather than better. Specifically in our inability to really know what kind of bias we're dealing with.

Currently to me it feels really muddy knowing how information is biased, beyond just the hallucination and factual inconsistencies.

But as far as I can tell, "correctness of the content aside", sometimes frontier LLMs respond like freshman college students, other times they respond with the rigor of a mathematics PhD candidate, and sometimes like a marketing hit piece.

This dataset has a consistency which I think is actually a really useful feature. I agree that having many perspectives in the dataset is good, but as an end user being able to rely on some level of consistency with an AI model is something I really think is missing.

Maybe more succinctly: I want frontier LLMs to have a known and specific response style and bias which I can rely on, because there already is a lot of noise.


Biases exposed through artificial constraints help to make visible the hidden/obscured/forgotten biases of state-of-the-art systems.

I used to run Gentoo like 14 years ago! It remains one of the fastest distros I've seen for the specific hardware it was running on (high core count 4-socket AMD Opteron servers) and I mostly attributed that to the fact it was compiling everything (even the base OS in this case!) for that specific CPU at install time... emerge would build/compile and if you set your USE flags correctly it produced heavily tailored and optimized binaries. I feel like a staged/graduated approach (downloading/running precompiled initially while a flag-optimized compile runs in the background) would be a good way to get around some of the downsides here (namely that it takes 45 minutes to install Firefox with emerge/pacman and that builds fail more often than packages fail to install).

Very cool to see that it's still going strong - I remember managing many machines at scale was a bit of a challenge, especially keeping ahead of vulnerabilities.


45 minutes hah, it used to take us three days to build kde ;)

Change the SSD and retry (the same SSD in another machine may not trigger the same error btw, so this is not a straightforward process of elimination) - those Windows updates do a lot of disk writes, and a small miss there can screw up an entire install since it shuffles things around in the preboot environment (moving them on disk), which can corrupt things and prevent a new install in the same way.

You can also try to live boot into Ubuntu 25.04 arm64 since that ISO has experimental Snapdragon Elite support and has some built-in drivers for storage and network - you can extract firmware from the Windows drivers with qcom-firmware-extract - they recommend doing this from a Windows partition, which you should have (albeit possibly corrupted).

If that still fails - you have a RAM issue as others have pointed out. I've had the exact same symptoms (hardware instability after a Windows update) and it was NVMe SSD (an early Samsung one) and RAM, in both instances.

Not saying the Windows update didn't also come with some junk firmware that got loaded into some of your devices, but that would be a distant diagnosis behind SSD/RAM (and many others would have seen the exact same thing during their update if it was that).


So basically the quantization in a byteshape model is per-tensor and can be variable and is an "average" in the final result? The results look good - curious why this isn't more prevalent! Would also love to better understand what factors into "accuracy" since there might be some nuance there depending on the measure.
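If it does work the way I'm guessing here, the "average" would just be a size-weighted mean of the per-tensor bit widths. A tiny sketch of that arithmetic, with a made-up two-tensor layout (the tensor sizes and bit widths are hypothetical, and this is my reading of the scheme, not their documented method):

```python
def average_bits_per_weight(tensors):
    """tensors: list of (num_weights, bits_per_weight) pairs, one per tensor."""
    total_bits = sum(n * bits for n, bits in tensors)
    total_weights = sum(n for n, _ in tensors)
    return total_bits / total_weights


if __name__ == "__main__":
    # Hypothetical layout: a small sensitive tensor kept at 6 bits,
    # a larger tolerant one pushed down to 3 bits.
    layout = [(1_000_000, 6), (3_000_000, 3)]
    print(f"average: {average_bits_per_weight(layout):.2f} bits per weight")
```

The point being that a model can average under 4 bits while still keeping its most sensitive tensors at higher precision.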


> Would also love to better understand what factors into "accuracy" since there might be some nuance there depending on the measure.

It's accuracy across GSM8K, MMLU, IFEVAL and LiveCodeBench.

They detail their methodology here: https://byteshape.com/blogs/Qwen3-4B-I-2507/


There is a maximum theoretical length and the sphere actually allows you to wrap around it, Rod of Asclepius style, until you get there - picking up an apple on each cycle. I suspect it's not more than a few hundred segments though so those ~600 submissions are probably someone gaming the submission with a forged score.


now there's 9999 and 1337 for scores. Imma guess there's not a lot of security on the scoreboard of a fun little game


There is definitely a turn-based minigame here - get the most "distance" travelled by the horse: every turn the horse moves one block towards its closest escape and you can drop walls to cause it to find a new path - in this one you actually lose when the horse can't get out, but the goal is to get the horse to move as many blocks as possible using your limited number of walls (or apples, which can attract it).
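A minimal sketch of what the horse's turn could look like, assuming a square grid where `#` is a wall, any open border cell counts as an escape, and the horse re-runs a breadth-first search each turn. `next_step` and the grid encoding are my own illustrative choices, not how the actual game is implemented.

```python
from collections import deque


def next_step(grid, start):
    """First cell on a shortest path from start to the nearest open border cell.

    Returns start if it is already on the border, or None if fully enclosed.
    """
    rows, cols = len(grid), len(grid[0])

    def is_exit(cell):
        r, c = cell
        return r in (0, rows - 1) or c in (0, cols - 1)

    if is_exit(start):
        return start

    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if is_exit(cell):
            # Walk back along the path until we reach the cell adjacent to start.
            while parent[cell] != start:
                cell = parent[cell]
            return cell
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            in_bounds = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if in_bounds and grid[nxt[0]][nxt[1]] != "#" and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None  # no escape: in this variant the player loses


if __name__ == "__main__":
    grid = [list(".....") for _ in range(5)]
    horse = (2, 2)
    step = next_step(grid, horse)
    print("horse heads to", step)
    grid[step[0]][step[1]] = "#"  # player drops a wall in its way
    print("after the wall it heads to", next_step(grid, horse))
```

Scoring would then just count how many steps the horse takes before it reaches the border.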


That reminds me of Paquerette Down the Bunburrows [1] which is a very fun pathfinding game where the bunnies will pathfind to try to run away from you. It's not exactly what you described, but it is very fun and surprisingly deep and challenging.

[1] https://store.steampowered.com/app/1628610/Paquerette_Down_t...


I was initially expecting the horse to move after each turn. As it is, this is a logic game, similar to what I'd expect to see in the NYT Games app. Quite entertaining, but something that you could look at and reason about to solve.

But, you absolutely could make this a turn based game where the horse is trying to escape and you (playing as the farmer), work to fence it in as it meanders towards a gate.

