• melroy
    1 month ago

    I see, ok. I only want to add that DeepSeek is neither the first nor the only model to use mixture-of-experts (MoE).