inari@piefed.zip to Technology@lemmy.world · English · 5 days ago
DeepSeek ditches Nvidia for Huawei chips in V4 launch (cybernews.com) · 89 comments
[object Object]@lemmy.ca · English · 5 days ago
Even with a bitnet, it's almost definitely better to train in high-precision floating point and then refine down to bits. I would expect bitnet to require more layers for equivalent quality, too.
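The "train in float, then refine down to bits" idea can be sketched with the absmean ternary quantization scheme from the BitNet b1.58 paper: full-precision weights are collapsed to {-1, 0, +1} plus a single per-tensor scale. This is an illustrative sketch, not anyone's actual training pipeline; the function name and shapes are made up for the example.

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray):
    """Quantize a full-precision weight tensor to ternary {-1, 0, +1}.

    Follows the absmean scheme described for BitNet b1.58:
    scale by the mean absolute weight, round, and clip to +/-1.
    """
    scale = np.mean(np.abs(w)) + 1e-8          # per-tensor scale
    w_q = np.clip(np.round(w / scale), -1, 1)  # ternary weights
    return w_q.astype(np.int8), scale

# A layer then computes with w_q and rescales by `scale`,
# approximating the original float weights: w ~= w_q * scale.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(4, 4)).astype(np.float32)
w_q, s = absmean_ternary_quantize(w)
```

The "refine" step in practice is further training with this quantizer in the loop, not a one-shot conversion; the one-shot version is shown only because it is the smallest self-contained illustration.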
brucethemoose@lemmy.world · English · 5 days ago
I just meant for mass inference serving. Yeah, I haven't seen much in the way of bitnet training savings yet, much like regular old QAT. It does appear that DeepSeek is fine-tuning their MoEs in a 4-bit format now, though.