• piccolo [any]@hexbear.net · 9 days ago

    From your second quote:

    Like all companies that build on DeepSeek, they can choose to either host their products locally and pay for computing and storage infrastructure, or go through providers like Huawei. EqualyzAI does the former.

    So that means DeepSeek is not getting a cent from this company. The model is open-weight, meaning anyone with sufficiently powerful hardware can just run DeepSeek themselves, unlike OpenAI's state-of-the-art models, which can only be run by companies that contract with OpenAI for the weights (as far as I know, that's basically just Microsoft, via Azure).
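
    For anyone curious what "just run DeepSeek" looks like in practice, here's a minimal sketch using Hugging Face's transformers library. I'm pointing it at one of DeepSeek's small distilled releases so it runs on a single consumer GPU; the full V3/R1 weights are a 600B+ parameter MoE and need a multi-GPU server:

    ```python
    # Minimal sketch: running an open-weight DeepSeek model locally with
    # Hugging Face transformers. Uses one of the small distilled releases;
    # the full DeepSeek-V3/R1 weights would need a multi-GPU server.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # public open weights
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer("What is an open-weight model?", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    ```

    No contract with DeepSeek, no payment to DeepSeek: the weights are just files you download.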

    But… even considering that DeepSeek is a more lightweight/efficient programme and China overall is rapidly expanding their electricity output… it still seems hard to imagine any profit is actually happening

    I think DeepSeek is absolutely burning money. Right now, almost all Chinese models are open-weight. I've seen numerous hypotheses for why that is, but the one that convinces me most, at least for DeepSeek, is that it works as advertising/recruiting. Meanwhile, DeepSeek's only revenue is from charging per token on their API, as described in your first quote, and on price they're competing with every other inference provider, so it's an aggressive race to the bottom. It's possible DeepSeek even runs the API at a loss to collect more training data from the people using it.
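
    To be concrete about that per-token revenue: DeepSeek's hosted API is OpenAI-compatible, so a customer interaction looks roughly like the sketch below (the key is a placeholder; the usage object at the end is the thing they actually bill on):

    ```python
    # Sketch of the per-token revenue model: DeepSeek's hosted API is
    # OpenAI-compatible, so billing is just (tokens in + tokens out) x rate.
    from openai import OpenAI

    client = OpenAI(
        api_key="YOUR_DEEPSEEK_KEY",          # placeholder, not a real key
        base_url="https://api.deepseek.com",  # DeepSeek's documented endpoint
    )

    resp = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": "Summarize MoE routing in one line."}],
    )
    print(resp.choices[0].message.content)
    print(resp.usage)  # prompt_tokens / completion_tokens: the billed quantities
    ```

    That's the whole business model on the revenue side, which is why the race to the bottom on per-token prices matters so much.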

    In any case, DeepSeek has made a lot of innovations around doing more training with less compute, because they are currently relatively GPU-poor: top-tier NVIDIA chips are hard to come by in China, so DeepSeek can't really buy more of them than it already has. Some of those GPUs run inference for the API, and some run training. But even with all of these optimizations, training an LLM costs a lot of money, and given how often they release models, and the thin margins on their API at best, it's hard to imagine they're actually breaking even.
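
    A rough back-of-envelope to show the shape of the problem. The ~$5.6M figure is the GPU-hour cost DeepSeek themselves reported for V3's final training run (it excludes R&D, failed runs, and salaries); the price and margin numbers are pure guesses on my part:

    ```python
    # Back-of-envelope check on "hard to imagine they're breaking even".
    # The ~$5.6M training figure is what DeepSeek reported for V3's final
    # run (GPU-hours only). The price and margin below are illustrative
    # guesses, just to show the orders of magnitude involved.
    training_cost = 5.6e6          # USD, reported GPU-hour cost of final V3 run
    price_per_m_tokens = 1.10      # USD per million output tokens (illustrative)
    margin = 0.10                  # assumed net margin after inference costs

    profit_per_m_tokens = price_per_m_tokens * margin
    tokens_to_break_even = training_cost / profit_per_m_tokens  # in millions

    print(f"{tokens_to_break_even:,.0f}M tokens (~{tokens_to_break_even / 1e6:.1f}T) "
          "sold just to recoup one training run")
    ```

    Under those assumptions that's tens of trillions of billed tokens to pay back a single training run, and they release new models far more often than that revenue can accumulate.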