• plz1@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    4
    ·
    1 hour ago

    DeepSeek never said it was permanent in their pricing materials, the article writer did. They are just taking the current expiration date off an existing discount. It’s absolutely a shot across the bow at Claude, OpenAI, et al., but the author was click-baiting, as is tradition.

  • Ucarenya@lemmy.zip
    link
    fedilink
    English
    arrow-up
    17
    ·
    8 hours ago

    It’s all about energy: energy cost plays a role here, so DeepSeek can go cheaper than Western models…

  • ayyy@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    51
    arrow-down
    1
    ·
    13 hours ago

    “Permanently” lol it’s a subscription and the terms say they can change the price at any time. How is it legal for them to advertise with the word “permanent”?

    • BlackLaZoR@lemmy.world
      link
      fedilink
      English
      arrow-up
      12
      ·
      8 hours ago

      lol it’s a subscription

      It’s actually the API access price, and it’s charged per input + output token. $0.87 per million tokens is damn cheap.

      They probably have super cheap electricity and it’s possible they use cheap Chinese AI chips for inference.
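
      A minimal sketch of the per-token billing described above, assuming the quoted $0.87 per million tokens is a single blended rate for input and output alike. Real price lists often bill input and output at different rates; the function name and the example token counts are hypothetical.

```python
# Figure quoted in the comment above; assumed here to be one blended rate.
PRICE_PER_MILLION_TOKENS_USD = 0.87


def api_cost_usd(input_tokens: int, output_tokens: int,
                 price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Cost of one request when input and output tokens share one rate."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens + output_tokens) / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical request: 12,000 tokens in, 3,000 tokens out.
    print(f"${api_cost_usd(12_000, 3_000):.4f}")
```

      Under that assumption, cost scales linearly with the total token count, so a long conversation that resends its history gets more expensive with every turn.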

      • ragebutt@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        2
        ·
        34 minutes ago

        China is expanding its energy supply so tremendously that the USA simply cannot compete. Even if the data centers all got built tomorrow, they would soon bottleneck because energy demands can’t be met in a timely manner. The median time to get a new power plant online is 5 years. Meanwhile China is investing heavily not only in expansion of its grid, but in expansion into renewable energy. They added 8x the power to their grid that the US did in 2023 alone, and if anything their pace has risen since then. Their renewable grid is 3x the size of the entire US grid.

        In terms of raw performance, US firms were months ahead and that gap is shrinking. Dola-seed is ranked second behind Opus by US firms, with a gap of under 3% in benchmark performance.

        This closing performance gap, combined with energy superiority, is ultimately why DeepSeek v4 pro beats Opus 4.6 on value. Opus is the clear winner on performance, but not by a very appreciable amount, and it ranges from 11-26x more expensive. China’s hardware isn’t more efficient, but their energy superiority puts them way ahead: their CloudMatrix uses well over 100% more energy than Nvidia g200, but their energy costs are sometimes as little as 1/8th of American costs per kWh.
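
        A rough sketch of the energy arithmetic in the paragraph above, taking its figures at face value (they are the commenter’s numbers, not verified here). “Well over 100% more energy” is treated as at least 2x, and the price per kWh as 1/8th; the function names are hypothetical.

```python
def relative_energy_cost(energy_ratio: float, price_ratio: float) -> float:
    """Energy bill relative to the baseline: (energy used) x (price per kWh)."""
    return energy_ratio * price_ratio


def break_even_energy_ratio(price_ratio: float) -> float:
    """How many times more energy can be used before the bills are equal."""
    return 1.0 / price_ratio


if __name__ == "__main__":
    # 2x the energy at 1/8th the price per kWh.
    print(relative_energy_cost(energy_ratio=2.0, price_ratio=1 / 8))  # 0.25
    print(break_even_energy_ratio(price_ratio=1 / 8))                 # 8.0
```

        On those assumptions the less efficient hardware still pays about a quarter of the energy bill, and it would need to draw 8x the energy before the advantage disappears.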

        The race to superiority here ultimately comes down to this: does America substantially update and expand its grid before China’s domestic chip manufacturing bridges the hardware gap that has been created by things like export controls? My money is on China here; Huawei, SMIC, etc. have an engineering problem that is rapidly being addressed with gigantic state sponsorship (and frankly the major bottleneck is EUV lithography, which they are actively pursuing, though even with tens of billions it will take many years to catch up to the West on this). While those barriers are real, the American barriers are an extremely complex regulatory system (which is ultimately why Trump is being directed to gut everything in terms of environmental and worker protections), funding (the oligarchs want this but not enough to part with their money; they want us to fund it), and the fact that, unlike China, the US drastically changes trajectory every 4-8 years.

    • InFerNo@lemmy.ml
      link
      fedilink
      English
      arrow-up
      13
      arrow-down
      1
      ·
      10 hours ago

      I think it’s meant to convey that it’s not a temporary deal on the old price, but a permanent new price point.

      • ayyy@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        4
        arrow-down
        1
        ·
        8 hours ago

        What is the effective difference? It’s not like they’re offering long term contracts.

        • Crit@lemmy.wtf
          link
          fedilink
          English
          arrow-up
          1
          ·
          4 hours ago

          Likely more to do with communication to customers on sale pages and such, and consequently with customer protection laws. They likely can’t advertise it as a discounted price for example in the same way a seasonal sale would be signposted, I would imagine at least.

      • Greyghoster@aussie.zone
        link
        fedilink
        English
        arrow-up
        1
        ·
        9 hours ago

        Don’t use any of them much and from my limited experience they all seem to be pretty much the same. In fact DeepSeek probably has been a little better than ChatGPT.

        • blargh513@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          1
          arrow-down
          5
          ·
          6 hours ago

          As long as you don’t mind them harvesting every tiny bit of data you feed it.

          I don’t like the big US players, but at least they’re doing a tiny bit to keep out of your shit. DeepSeek is not pretending at all. I suppose it’s at least honest and the price point is REALLY tempting. Openclaw gets expensive fast with the number of tokens it consumes. I burned through $30 in two days with it using Claude Haiku/Sonnet. Plugging it into a cheap LLM is a nice idea, but no thanks.

          • Greyghoster@aussie.zone
            link
            fedilink
            English
            arrow-up
            2
            ·
            1 hour ago

            Really depends on your point of view. Personally I see the US AI push as a maximum harvest and it is hard to see the Chinese as being worse. The US has really gone flat out destroying whatever credibility and moral authority they may have had. As I said I don’t use the technology that much and the queries are pretty innocuous, so it may be different for others.

          • elucubra@sopuli.xyz
            link
            fedilink
            English
            arrow-up
            3
            ·
            2 hours ago

            I don’t like the big US players, but at least they’re doing a tiny bit to keep out of your shit.

            Oh, bless your heart, you sweet summer child.

  • aceshigh@lemmy.world
    link
    fedilink
    English
    arrow-up
    59
    arrow-down
    1
    ·
    14 hours ago

    Prices are funny. At my last job we were charging clients extra for doing a thing that didn’t cost us anything and was fast to do. How much we charged was completely arbitrary and depended on the partner’s mood. It’s all made up, folks.

    • Buddahriffic@lemmy.world
      link
      fedilink
      English
      arrow-up
      41
      ·
      14 hours ago

      Yeah, which is why the “if minimum wage increases, so will prices” argument is BS. They were going to charge the highest price they thought they could either way; the difference is that they are forced to increase the amount that goes to the people they are trying to pay the least.

      • Rioting Pacifist@lemmy.world
        link
        fedilink
        English
        arrow-up
        3
        ·
        10 hours ago

        There is an element of minimum wage increases raising prices, because now there are more people who can afford to pay for things.

        But yes it isn’t because costs go up, and it really only applies to things people on minimum wage can afford and it’s always less than the increase in wages.

      • Tja@programming.dev
        link
        fedilink
        English
        arrow-up
        4
        arrow-down
        2
        ·
        10 hours ago

        It’s not BS, it’s just not as direct of an impact as they are implying. If payroll is 10% of their expenses (assuming EVERYONE makes minimum wage) then doubling the minimum wage will increase costs by 10%.

        Which could be (partially) absorbed from profits, could cause a 10% price hike… or a 50% price hike and fat bonuses for the executives.
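
        The arithmetic above can be sketched as follows, assuming (as the comment does) that payroll is 10% of expenses and that everyone on payroll earns minimum wage. The pass-through parameter is a hypothetical knob for how much of the cost increase lands on prices, and a constant markup over cost is assumed.

```python
def total_cost_increase(payroll_share: float, wage_multiplier: float) -> float:
    """Fractional rise in total costs when only payroll changes."""
    return payroll_share * (wage_multiplier - 1.0)


def price_after_pass_through(price: float, payroll_share: float,
                             wage_multiplier: float,
                             pass_through: float = 1.0) -> float:
    """New price if `pass_through` (0..1) of the cost rise goes to customers."""
    rise = total_cost_increase(payroll_share, wage_multiplier)
    return price * (1.0 + pass_through * rise)


if __name__ == "__main__":
    # Payroll is 10% of expenses and the minimum wage doubles.
    print(f"{total_cost_increase(0.10, 2.0):.0%}")
    # Full pass-through versus half absorbed from profits, on a $10.00 item.
    print(f"{price_after_pass_through(10.00, 0.10, 2.0):.2f}")
    print(f"{price_after_pass_through(10.00, 0.10, 2.0, pass_through=0.5):.2f}")
```

        Any hike beyond the full pass-through figure is a pricing decision rather than a cost being recovered, which is the point about the 50% hike.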

      • aceshigh@lemmy.world
        link
        fedilink
        English
        arrow-up
        6
        ·
        13 hours ago

        This would impact the company’s P&L though, so shareholders and the C-suite will get less money. That’s why they’re scaring people into not wanting to increase wages.

        • badgermurphy@lemmy.world
          link
          fedilink
          English
          arrow-up
          4
          ·
          8 hours ago

          The hilarious irony is that even that is not conclusive. There are plenty of studies, both real-world and contrived, that indicate that employers paying more, broadly, yields returns in excess of the added payroll costs.

          Not only are there more customers, but increasing pay increases the quality and quantity of labor output.

  • MalReynolds@slrpnk.net
    link
    fedilink
    English
    arrow-up
    144
    arrow-down
    2
    ·
    19 hours ago

    The lower prices could be aimed at undercutting the competition.

    Mobster voice: Sure would be a pity if the monetization potential of those 2 huge IPOs (3 if you count SpaceX with xAI deadweight rolled in) went boom when that’s all that’s holding your economy out of recession (depression depending on how they cook the books).

    • Slotos@feddit.nl
      link
      fedilink
      English
      arrow-up
      73
      ·
      18 hours ago

      The way the SpaceX IPO got crammed into the index, it’s invulnerable to anything but an immediate incarceration of everybody involved.

      Index funds will be required to buy the stock at a listing price before the market can decide how much it is worth exactly.

      Afterwards, “economy in a recession” is synonymous to “free buffet” to those at the reins.

      • sp3ctr4l@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        19
        ·
        11 hours ago

        Yeah, the whole plan is to have every US citizen’s 401k’s autobuy into the SpaceX IPO.

        Your retirement fund is Elon’s exit liquidity.

        It’s a truly fantastic fraud.

        Because… the Nasdaq… well, a few weeks ago it changed its rules on the delay time between an IPO and it being part of the index, the index that everyone’s 401k’s buy into.

        I guess you could say it’s going to be ‘epic’ when this all blows up.

        See, this is basically how the US economy works:

        Poors roll over negative equity into their next car loan.

        The ever diminishing ‘middle class’ basically does the same with homes, helocs, etc.

        The owners roll over debt via corporate amalgamations.

        But because the rich have a magical legal barrier of ‘all the bad and dumb things i do are a legal fiction doing them, not me personally’, well, the legal fiction gets what its due and/or evaporates when it can’t pay what it owes… and the rich remain on top.

        Yeehaw!

      • edible_funk@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        51
        ·
        18 hours ago

        Yep, this will be the fourth or fifth record-breaking upward transfer of wealth I’ve lived through. I really don’t want to live through another.

        • sp3ctr4l@lemmy.dbzer0.com
          link
          fedilink
          English
          arrow-up
          4
          ·
          5 hours ago

          The ‘good’ news is you might not live through the next one.

          Food.

          Yeah our entire modern agricultural system is basically a process of squirting refined oil into the ground and then food comes out.

          Drive up the precursor cost?

          Well a whole bunch of that process breaks, and that’s not even considering the oil burned to move things from one place to another.

          Even if you solve that problem: Climate’s broke, ain’t gonna fix it at this point. That’ll fuck up food production massively as well.

          Population control (culling) via food prices doubling or tripling every few years, and the chaos that results from that.

          They’re hoping enough of us will die before we actually try to eat them.

          I suggest making what stockpile of edible reserves you can now, keep cycling through it, eat a bit of it, replace more of it, etc.

          We got the worst El Niño in recorded history, AMOC is currently giving out, as is the ‘doomsday’ glacier… and roughly 1/5 to 1/4 of the world’s oil just stopped; it’ll take years to recover from the damage that’s already done.

          Everything that has happened so far will be a footnote compared to what happens next.

        • BonsaiBoo@lemmy.world
          link
          fedilink
          English
          arrow-up
          16
          ·
          18 hours ago

          At this point many people won’t. Hard to squeeze blood out of a dried, overworked, malnourished, poisoned and diseased husk of a laid off worker.

          • wonderingwanderer@sopuli.xyz
            link
            fedilink
            English
            arrow-up
            4
            arrow-down
            5
            ·
            16 hours ago

            Can’t wait until enough people are unemployed that nothing is stopping them from marching in the streets.

            Remember what happened during covid layoffs? People brought tents to the protests and camped out for months…

            • jaybone@lemmy.zip
              link
              fedilink
              English
              arrow-up
              9
              ·
              15 hours ago

              Wait, I don’t remember that happening during Covid. I remember some years before, that happened as part of the 99% anti Wall Street protests… which did nothing.

      • triptrapper@lemmy.world
        link
        fedilink
        English
        arrow-up
        4
        arrow-down
        1
        ·
        16 hours ago

        I think I’m following what you’re saying but I haven’t read anything about this. Where can I read more?

      • MalReynolds@slrpnk.net
        link
        fedilink
        English
        arrow-up
        6
        arrow-down
        1
        ·
        edit-2
        16 hours ago

        Afterwards, “economy in a recession” is synonymous to “free buffet” to those at the reins.

        Not at all wrong, but there’s only so much blood parasites can suck before the host dies (and with luck kills the parasites, and / or sends a strong signal to everybody else to get their infestations eradicated, or at the very least under control), and that host is already hurting bad.

        Perhaps I’m being optimistic, but a collapse of the likely magnitude could be that straw, or maybe it’ll just be the back of US influence that breaks.

  • Rioting Pacifist@lemmy.world
    link
    fedilink
    English
    arrow-up
    37
    arrow-down
    13
    ·
    15 hours ago

    All numbers in AI are made up. It’s wild to see tankies glaze DeepSeek’s fake numbers while being skeptical of Western corporations’ numbers.

    • BlackLaZoR@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      ·
      7 hours ago

      All I see is a good and cheap model. It doesn’t even have to be perfect, just in the ballpark of mainstream models.

      • Rioting Pacifist@lemmy.world
        link
        fedilink
        English
        arrow-up
        3
        ·
        6 hours ago

        Them cutting consumer prices doesn’t show that though.

        It’s wild that people normally critical of AI boosting will drink the Kool-Aid if it’s China flavored.

      • Rioting Pacifist@lemmy.world
        link
        fedilink
        English
        arrow-up
        10
        arrow-down
        8
        ·
        14 hours ago

        But the numbers are fake, so it really doesn’t mean much to reduce a fake number by 75%, it isn’t an indicator that DeepSeek is beating anyone at anything.

          • Rioting Pacifist@lemmy.world
            link
            fedilink
            English
            arrow-up
            4
            arrow-down
            3
            ·
            10 hours ago

            What does that number meaningfully represent as DeepSeek doing well?

            They can afford to lose more money on this? They have lower operating costs? They have a better way to make money off their users?

            It could indicate any/all/none of these.

            • clifmo@programming.dev
              link
              fedilink
              English
              arrow-up
              6
              ·
              10 hours ago

              What does that number meaningful represent as DeepSeek doing well?

              I don’t understand the question.

              They can afford to lose more money on this?

              Yes.

              They have lower operating costs?

              Yes.

              They have a better way to make money of their users?

              They are not as profit motivated as their competition.

              I don’t think you understand Deepseek’s role in the market. It’s to intentionally undercut US providers.

        • Calfpupa [she/her]@lemmy.ml
          link
          fedilink
          English
          arrow-up
          5
          ·
          12 hours ago

          What do you mean by the numbers are fake? Are you saying the worth is overinflated? If that’s the case, of course it is, not too different from virtually any other commodity.

          • Rioting Pacifist@lemmy.world
            link
            fedilink
            English
            arrow-up
            3
            arrow-down
            2
            ·
            10 hours ago

            What does that number meaningfully represent as DeepSeek doing well?

            They can afford to lose more money on this? They have lower operating costs? They have a better way to make money off their users?

            It could indicate any/all/none of these.

  • BlackLaZoR@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    8 hours ago

    No wonder. Since DeepSeek has an open license, they have to compete with 3rd party providers and, in the case of the smallest models, with local generation.

  • lemmydividebyzero@reddthat.com
    link
    fedilink
    English
    arrow-up
    36
    arrow-down
    8
    ·
    17 hours ago

    Still doesn’t know what happened at Tiananmen Square, but can tell in detail how protests were brutally ended a few years later in South Africa…

      • lemmydividebyzero@reddthat.com
        link
        fedilink
        English
        arrow-up
        1
        ·
        40 minutes ago

        I don’t have accounts for the large models, but I tried out some of the free ones on https://duck.ai/

        Prompt:

        Name up to 10 negative things Israel has done in the last 10 years

        GPT 5-mini answer:

        I can list up to 10 actions by the Israeli government, military, or state-associated actors from the last decade that critics and many international observers have described as negative. I will focus on widely reported, verifiable events or policies (not broad accusations). If you want sources for any item, tell me which ones and I’ll fetch them.

        1. Expansion of Israeli settlement construction and approvals in the West Bank and East Jerusalem, including outposts declared unlawful under Israeli law but supported politically, undermining prospects for a two‑state solution.
        2. Enforcement actions, demolitions, and eviction orders against Palestinian homes in East Jerusalem and the West Bank, often justified by lack of permits.
        3. Large-scale military operations in Gaza causing significant civilian casualties and infrastructure damage (e.g., major escalations in 2014, 2021, and 2023–2024), criticized for disproportionate force.
        4. Tightened blockade and restrictions on Gaza (control of crossings, import/export limits), contributing to severe humanitarian and economic conditions.
        5. Prolonged detention without trial (administrative detention) of Palestinians, including use of renewals and classified evidence, condemned by human rights groups.
        6. Use of lethal force against Palestinian protesters and demonstrators (including during the Great March of Return and other border protests), with contested accountability and investigation processes.
        7. Passage of laws and policies critics say discriminate against Arab citizens of Israel and non‑Jewish residents (e.g., the 2018 Nation-State Law drew international criticism).
        8. Restrictions and raids on Palestinian civil society organizations and NGOs, including designation of some groups as “terrorist” or funding investigations and closures.
        9. Heavy-handed policing, surveillance practices, and use of administrative measures in occupied territories that limit movement (checkpoints, permits) and civil freedoms.
        10. Targeted killings and extrajudicial strikes against individuals in the West Bank, Gaza, and elsewhere, sometimes criticized for insufficient transparency and due process.

        Claude Haiku 4.5

        Negative Things Israel Has Done in the Last 10 Years

        1. Gaza military operations and civilian casualties — Multiple conflicts in Gaza (2014, 2021, 2023-2024) have resulted in significant civilian deaths and displacement. Human rights organizations have documented concerns about proportionality and civilian protection in these operations.

        2. Settlement expansion in the West Bank — Israel has continued building and expanding settlements in occupied Palestinian territories, which the UN and international law consider illegal under international humanitarian law.

        3. Restrictions on Palestinian movement — Checkpoint systems, closures, and movement restrictions in the West Bank have been criticized as limiting Palestinian freedom of movement and economic opportunity.

        4. Demolition of Palestinian homes — Israeli authorities have demolished Palestinian-owned structures in the West Bank and East Jerusalem, displacing families and communities. Human rights groups say many demolitions lack proper legal process.

        5. Treatment of asylum seekers and migrants — Israel has faced criticism for detention conditions and deportation policies affecting African migrants and asylum seekers.

        6. Restrictions on Palestinian water access — Palestinian communities have reported limited access to water resources, with critics arguing Israeli control of water allocations disadvantages Palestinians.

        7. Use of administrative detention — Israel has detained Palestinians without formal charges under administrative detention laws, which human rights organizations say lacks due process protections.

        8. Alleged use of excessive force by security forces — Various incidents have been documented where Israeli police and military have been accused of using disproportionate force against Palestinian civilians and protesters.

        9. Restrictions on freedom of expression — Palestinian activists and organizations have reported harassment, arrest, and restrictions on their ability to organize and speak freely.

        10. Conditions in Israeli prisons — Human rights groups have documented concerns about prison conditions, solitary confinement practices, and treatment of Palestinian detainees.

        These criticisms come from sources including Amnesty International, Human Rights Watch, B’Tselem, UN bodies, and international media. It’s important to note that Israel disputes many of these characterizations and provides its own perspective on security concerns and legal justifications for its actions. This is a deeply contested and complex geopolitical situation with legitimate perspectives on multiple sides.

      • BlackLaZoR@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        ·
        7 hours ago

        It’s like 865GB so you can’t. But you can run older/smaller versions on a high-end consumer GPU.

      • lemmydividebyzero@reddthat.com
        link
        fedilink
        English
        arrow-up
        3
        ·
        12 hours ago

        Really? Are you sure this is still true?

        I have never run that one locally, but Qwen doesn’t “know” about specific Chinese historic events either when executed locally.

        • Yerbouti@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          1
          ·
          10 hours ago

          It was a couple of months ago at least. Also, if you use the deepthink mode online you can actually see the reply (really criticising the Chinese regime) for a couple of seconds before it disappears. I’ve managed to screenshot it and also to trick it once about a “fictional” regime so I could have the answer.

          • boonhet@sopuli.xyz
            link
            fedilink
            English
            arrow-up
            1
            ·
            8 hours ago

            Interesting. I tried their R1 model when it was all the rage and that in its thought process mentioned something about being a Chinese AI and having to provide the user with safe information. It then responded it doesn’t know about any specific historic events.

    • frongt@lemmy.zip
      link
      fedilink
      English
      arrow-up
      10
      ·
      15 hours ago

      It does, and it’ll tell you about it. But it’s their interface that censors the output, and it’s not perfect. Ask it in English or Chinese and it’ll censor it. But ask in Spanish or other languages and it doesn’t get caught.

      • VeryFrugal@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        2
        ·
        15 hours ago

        Huh. Apparently in Korean it’s censored. It’s also not their interface. I’m using their API and still getting rejected.

        It will happily give you details if search is involved (via Searxng in my case) though, so that’s something.

        Other than that, amazing model. I’m not having political conversations with LLMs, let alone Chinese ones.

      • errer@lemmy.world
        link
        fedilink
        English
        arrow-up
        15
        arrow-down
        4
        ·
        17 hours ago

        There are plenty of rails, they’re just different ones. Like criticizing dear leader or Tiananmen square.

        • stumu415@lemmy.zip
          link
          fedilink
          English
          arrow-up
          12
          arrow-down
          34
          ·
          17 hours ago

          Blah blah blah blah. China bad. Invading Iran, Venezuela, supporting the genocide in Palestine is all good, getting out of the WHO, removing USAID which has led to thousands of deaths around the world, threatening Greenland and Cuba. Causing global inflation to rise due to oil cost. Tariffs. Destroyed relationships with organizations and countries that took 80 years to build.

          And that is just this year for the US.

          What catastrophic action on this scale has China taken in, let’s say, the last decade?

          • Tja@programming.dev
            link
            fedilink
            English
            arrow-up
            4
            ·
            10 hours ago

            Two superpowers can be shit at the same time. Three, even. Four is physically impossible.

          • Womble@piefed.world
            link
            fedilink
            English
            arrow-up
            32
            arrow-down
            3
            ·
            16 hours ago

            It’s rather revealing that your response to “China has severe restrictions on political speech” is to accuse that person of bootlicking for American empire.

          • wonderingwanderer@sopuli.xyz
            link
            fedilink
            English
            arrow-up
            22
            arrow-down
            3
            ·
            16 hours ago

            First of all, no one said “The US is good actually.”

            Second of all, aren’t you the same type of person who used to cry about USAID being a front for US imperialism? Now you want to complain that it’s gone, because oh wait they were actually helping to contain infectious diseases, provide clean drinking water, and feed people in food scarce areas…

            That got shut down last year, by the way, not “just this year.”

            As for catastrophic actions China has engaged in, I suppose if you ignore forced assimilations in Tibet and Urumqi, the hostile police takeover of Hong Kong, and the aggressive and environmentally destructive colonization of the South China Sea, then it would start to be somewhat difficult to come up with any examples…

          • Cethin@lemmy.zip
            link
            fedilink
            English
            arrow-up
            4
            arrow-down
            2
            ·
            13 hours ago

            You claim someone is biased against China because they said one true thing, then you bring up all kinds of unrelated things to attack another nation, which wasn’t being discussed. I wonder if you might possess the bias you’re against, but for China and against the US? Both nations do good and bad things. If you’re so angry about bias, maybe you should check your own. Or maybe bias isn’t the issue, and you just support the Chinese leadership?

            • Matty Roses@lemmy.today
              link
              fedilink
              English
              arrow-up
              2
              arrow-down
              3
              ·
              11 hours ago

              “See, China did a bad thing in 1989, while listed here are multiple worse things the US did this year alone. These are equal”

    • mic_check_one_two@lemmy.dbzer0.com
      link
      fedilink
      English
      arrow-up
      17
      ·
      15 hours ago

      A token is basically just a word. Know how your phone’s auto suggest tries to anticipate the words you want to use as you type? In this case, your phone is using an extremely small token amount (typically only the previous two or three words you have typed) to try and predict your next word, which would also be a token. Your phone only uses a few tokens at a time, because as token count rises, processing requirements also quickly balloon.

      And AI chat is basically the same concept, but with a massively inflated token limit. Instead of looking at your previous two or three words, it looks at entire conversations. And it also uses tokens to generate responses, the same way your phone is using one token at a time to predict your next word.

      So when you pay for tokens, you’re essentially paying for a word count. As you continue a conversation, the token requirement for each subsequent request will increase, because it is attempting to look at the entire context of the conversation you have had.

      Models have built-in token limits, to put a cap on how much memory is required to run the model. As conversations stretch on and you reach the model’s token limits, it will begin losing context for things that happened earlier. It will try to summarize earlier parts of the conversation to shorten them but keep relevant pieces in memory, or it will just outright drop old parts of the conversation and “forget” that context, the same way my phone has already forgotten the start of this sentence.

      It’s a little more complicated than “each word is a token”, because the chatbot will combine your prompts with its own internal systems. Especially as conversations stretch on, and it begins to summarize old parts to keep them in memory. But that’s the most straightforward way to explain it.
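
      A toy sketch of the two effects described above: the input token count growing with every turn because the whole history is resent, and old context being dropped once a limit is reached. It assumes one token per whitespace-separated word, which real tokenizers do not do, and all function names are hypothetical.

```python
from __future__ import annotations


def count_tokens(text: str) -> int:
    """Toy tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def input_tokens_per_request(messages: list[str]) -> list[int]:
    """Input tokens sent with each request if the full history is resent."""
    totals = []
    running = 0
    for message in messages:
        running += count_tokens(message)
        totals.append(running)
    return totals


def fit_to_limit(messages: list[str], token_limit: int) -> list[str]:
    """Keep the most recent messages that fit; older ones are 'forgotten'."""
    kept = []
    used = 0
    for message in reversed(messages):
        needed = count_tokens(message)
        if used + needed > token_limit:
            break
        kept.append(message)
        used += needed
    return list(reversed(kept))


if __name__ == "__main__":
    history = ["hello there model", "hi human", "what is a token exactly"]
    print(input_tokens_per_request(history))
    print(fit_to_limit(history, token_limit=7))
```

      Dropping the oldest messages is only one of the strategies mentioned; summarizing earlier turns would replace them with a shorter message instead of removing them.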

    • boatswain@infosec.pub
      link
      fedilink
      English
      arrow-up
      14
      ·
      17 hours ago

      My understanding is that tokens are basically words, and that when you ask a question it charges for all the tokens it consumes, produces, or processes. There’s a lot of internal processing for each request, where the input text is summarized in different ways and combined with previous parts of the conversation, so it’s not as straightforward as “word count of what you say plus what it says”.

      • iamthetot@piefed.ca
        link
        fedilink
        English
        arrow-up
        14
        ·
        16 hours ago

        Worth noting that a token is not necessarily a word, though it can be. One word could also take multiple tokens. It can also vary from LLM to LLM and their tokenization methods.
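
        To illustrate how one word can become several tokens, here is a toy greedy longest-match splitter. It is not any particular LLM’s tokenizer, and the vocabulary is made up for the example; real tokenizers learn vocabularies of tens of thousands of pieces.

```python
from __future__ import annotations

# Hypothetical vocabulary, chosen only to make the example readable.
TOY_VOCAB = {"token", "ization", "cheap", "un", "believ", "able"}


def tokenize(word: str, vocab: set[str]) -> list[str]:
    """Split a word into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # Nothing in the vocabulary matches: emit a single character.
            tokens.append(word[i])
            i += 1
    return tokens


if __name__ == "__main__":
    for word in ("cheap", "tokenization", "unbelievable"):
        print(word, tokenize(word, TOY_VOCAB))
```

        With this vocabulary “cheap” is one token while “tokenization” is two, so a different vocabulary would give different counts for the same text.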

      • teft@piefed.social
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        6
        ·
        15 hours ago

        There’s a lot of internal processing for each request, where the input text is summarized in different ways and combined with previous parts of the conversation, so it’s not as straightforward as “word count of what you say plus what it says”.

        In other words obfuscation so they can charge whatever they want using some obscure formula that only they know.

        • Eager Eagle@lemmy.world
          link
          fedilink
          English
          arrow-up
          7
          arrow-down
          1
          ·
          edit-2
          15 hours ago

          Not really, there are ways to count tokens before running an inference. Some providers make tokenizers public, so they even work offline. APIs also usually return rolling costs per response and have budget limits - though some could have more fine-grained limits.

          Users who are surprised by the bill are usually not paying attention to each call, or using autonomous subagents, or a setup where they have little or no control over what is sent to the provider.

          So the problem isn’t really the API provider, as much as it’s the tooling around it, which makes it too easy to overspend.
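
          A minimal sketch of the kind of client-side guard that tooling could apply before each call. The token estimate uses a rough 4-characters-per-token rule of thumb as a stand-in for a provider’s real tokenizer, and the class, its parameters, and the prices are all hypothetical.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb for English; not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


class BudgetExceeded(Exception):
    """Raised when a call would push spending past the budget."""


class SpendGuard:
    def __init__(self, budget_usd: float, price_per_million_usd: float):
        self.budget_usd = budget_usd
        self.price_per_million_usd = price_per_million_usd
        self.spent_usd = 0.0

    def estimate_cost(self, prompt: str, max_output_tokens: int) -> float:
        """Worst-case cost: estimated input plus the full output allowance."""
        tokens = estimate_tokens(prompt) + max_output_tokens
        return tokens / 1_000_000 * self.price_per_million_usd

    def authorize(self, prompt: str, max_output_tokens: int) -> float:
        """Reserve the worst-case cost, or refuse before anything is sent."""
        cost = self.estimate_cost(prompt, max_output_tokens)
        if self.spent_usd + cost > self.budget_usd:
            raise BudgetExceeded(
                f"call would cost ~${cost:.4f}, "
                f"only ${self.budget_usd - self.spent_usd:.4f} left"
            )
        self.spent_usd += cost
        return cost


if __name__ == "__main__":
    guard = SpendGuard(budget_usd=5.00, price_per_million_usd=0.87)
    print(f"{guard.authorize('Summarize this thread.', 2_000):.6f}")
```

          A guard like this matters most for autonomous agents, where nobody is looking at each individual call.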

    • Peruvian_Skies@sh.itjust.works
      link
      fedilink
      English
      arrow-up
      5
      ·
      17 hours ago

      In very simple terms, a token is more or less a word. You pay per input and output tokens (your prompts and the answers) as they correlate the most closely to the energy expended by the LLM to process your request.

      • Tja@programming.dev
        link
        fedilink
        English
        arrow-up
        2
        ·
        10 hours ago

        How are they running it? Doesn’t the model have to fit in (V)RAM? Does Nvidia have such huge memories in the H cards?

        • BlackLaZoR@lemmy.world
          link
          fedilink
          English
          arrow-up
          3
          ·
          7 hours ago

          There’s tech for splitting a model to run on multiple cards, but it requires a really fast interconnect between GPUs.

        • boonhet@sopuli.xyz
          link
          fedilink
          English
          arrow-up
          1
          ·
          8 hours ago

          For self hosting it essentially needs to fit in VRAM + RAM but it’ll take a lot of CPU for the part in RAM

          Deepseek probably uses those big fancy H cards and not one but several together to increase VRAM.
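
          A back-of-the-envelope sketch of how many cards that implies, assuming the roughly 865GB model size mentioned elsewhere in the thread and 80GB of memory per card (the capacity of an H100). It counts weights only and ignores the extra memory needed for the KV cache and activations, so real deployments need more headroom.

```python
import math


def min_gpus_for_weights(model_size_gb: float, vram_per_gpu_gb: float) -> int:
    """Fewest cards whose combined memory can hold the weights alone."""
    if model_size_gb <= 0 or vram_per_gpu_gb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(model_size_gb / vram_per_gpu_gb)


if __name__ == "__main__":
    # 865 GB of weights spread over 80 GB cards.
    print(min_gpus_for_weights(865, 80))  # 11
```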