Recently, there has been a trend of showing off how many Tokens one has spent.
Some say they are the “digital currency” of the AI world.
[Large models charge by Tokens]

For example, this morning
I called DeepSeek through a certain platform
The platform’s charging standard is
16 yuan for every 1 million Tokens
Calculating this way, I spent 0.03 yuan this morning
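
The arithmetic behind that 0.03 yuan can be sketched in a few lines. The price of 16 yuan per 1 million Tokens comes from the example above; the function names are only illustrative, and a flat per-Token price is assumed.

```python
PRICE_PER_MILLION_YUAN = 16.0  # platform price quoted in the example above


def cost_for_tokens(tokens: float, price_per_million: float = PRICE_PER_MILLION_YUAN) -> float:
    """Cost in yuan for a number of Tokens at a flat per-million price."""
    return tokens / 1_000_000 * price_per_million


def tokens_for_spend(spend_yuan: float, price_per_million: float = PRICE_PER_MILLION_YUAN) -> float:
    """Number of Tokens that a given spend corresponds to."""
    return spend_yuan / price_per_million * 1_000_000


morning_tokens = tokens_for_spend(0.03)
print(round(morning_tokens))  # 1875
```

In other words, 0.03 yuan at this price corresponds to a little under two thousand Tokens.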

Every time I ask DeepSeek a question
I will “spend” some Tokens
After each question is answered, it automatically displays
How many Tokens you have consumed

Hehe, interesting, right?
Every time humans communicate with large models, whether chatting, writing, or predicting…
They will consume some Tokens
At this time, many people are confused
What are Tokens? How are they charged?
How is this different from traditional per-call API charging and subscription charging?

Tokens are the smallest unit processed by large models
Equivalent to a “computational granularity”
It is not measured directly by words or phrases, but rather by the smallest units obtained after the text is segmented by the model’s tokenizer.
For example, in English, “I love AI!”
In Chinese, “Artificial Intelligence is powerful”

Moreover, the length of a Token is not fixed
Its length depends on
the specific model’s tokenizer rules
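
To get a feel for what "segmenting text into Tokens" means, here is a deliberately naive sketch. It is not DeepSeek's tokenizer or any real model's tokenizer: real ones use a learned subword vocabulary and will usually split text differently. The rule below (runs of word characters, or single punctuation marks) is purely an assumption for illustration.

```python
import re


def naive_tokenize(text: str) -> list[str]:
    # Toy rule: a run of word characters, or one punctuation mark.
    # Real tokenizers split by learned subword rules instead.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = naive_tokenize("I love AI!")
print(tokens)       # ['I', 'love', 'AI', '!']
print(len(tokens))  # 4
```

Even with this toy rule, the pieces have different lengths, which is the point: the Token count depends on the splitting rules, not on a fixed number of characters.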

The computational cost of running large models is very high
Charging by Tokens makes it possible to meter resource usage more accurately
This charging method is fairer and more transparent

If charged by the traditional API call method
That is, a fixed fee each time the API is called
The result would obviously be unreasonable
Because the computational cost of short texts and long texts is completely different
How precise can Tokens charging be?
The cost consists of two parts
The input question + the AI output answer are both chargeable
Input 1k Tokens + Output 2k Tokens = Charge for 3k Tokens
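
The same sum as a small sketch. It assumes a single flat price for input and output, reusing the 16 yuan per 1 million Tokens from the earlier example; some price lists quote separate input and output rates, in which case the two parts would be priced separately. The function names are illustrative.

```python
def billable_tokens(input_tokens: int, output_tokens: int) -> int:
    """Both the question and the answer count toward the bill."""
    return input_tokens + output_tokens


def charge_yuan(input_tokens: int, output_tokens: int, price_per_million: float = 16.0) -> float:
    """Charge in yuan, assuming one flat price for input and output."""
    return billable_tokens(input_tokens, output_tokens) / 1_000_000 * price_per_million


print(billable_tokens(1_000, 2_000))         # 3000
print(round(charge_yuan(1_000, 2_000), 3))   # 0.048
```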

What you see [DeepSeek API Price List]

So, when asking AI questions, don’t be too verbose
Every character you type costs money

Although DeepSeek is also accessed via API calls
This is just a technical method, not a charging method
However, for earlier APIs
The industry’s default charging method was
Charging per call: a fixed fee each time the API is called
SMS API (charges 0.01 yuan for each message sent)
Weather query API (charges 0.03 yuan for each query)
This is very different from the current charging by Tokens
There is no good or bad charging method
Different projects adopt different methods
Charging per call is suitable for standard service products
Charging by Tokens is suitable for dynamically generated products
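
A rough side-by-side of the two methods. The 0.03 yuan per-call fee and the 16 yuan per 1 million Tokens are taken from the examples above; the two request sizes are hypothetical numbers chosen only to show the contrast.

```python
PER_CALL_FEE_YUAN = 0.03        # weather-query fee from the example above
PRICE_PER_MILLION_YUAN = 16.0   # per-Token price from the earlier example


def per_call_cost(calls: int) -> float:
    """Fixed fee per call, regardless of how much work each call does."""
    return calls * PER_CALL_FEE_YUAN


def per_token_cost(total_tokens: int) -> float:
    """Cost grows with the number of Tokens processed."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_YUAN


short_request = 200      # hypothetical Token count for a short exchange
long_request = 20_000    # hypothetical Token count for a long exchange

print(per_call_cost(1))                           # 0.03, same for both requests
print(round(per_token_cost(short_request), 4))    # 0.0032
print(round(per_token_cost(long_request), 4))     # 0.32
```

Under per-call charging both requests cost the same; under Token charging the long one costs 100 times more, in line with the work done.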

Of course, there are traditional charging methods that we are very familiar with
① Subscription charging, like SaaS
Pay monthly/yearly, regardless of actual usage
② Charging by computing resources, like cloud computing
Charging based on how much computing resources (CPU/GPU/TPU) are consumed
Charging based on data storage capacity (GB/TB)
Charging based on bandwidth transfer volume (GB/TB)
③ Charging by unlocking features, how much to unlock a feature
For example: unlocking security features, unlocking advanced features, etc.
Some also have basic and advanced versions, etc.
In short, different project types have different charging methods
↓
However, in actual projects, many charging methods are hybrids
[API per call + Tokens by amount] hybrid charging
[Subscription + Tokens by amount] hybrid charging
[One-time contract + daily operation] hybrid charging
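
One possible shape of a [Subscription + Tokens by amount] bill, as a sketch only. The monthly fee, the included Token allowance, and the idea that only usage above the allowance is billed are all assumptions for illustration; actual contracts vary.

```python
def hybrid_monthly_bill(
    tokens_used: int,
    monthly_fee: float,
    included_tokens: int,
    price_per_million: float,
) -> float:
    """Fixed subscription fee plus a per-Token charge for usage above the allowance."""
    overage = max(0, tokens_used - included_tokens)
    return monthly_fee + overage / 1_000_000 * price_per_million


# Hypothetical plan: 100 yuan per month, 1 million Tokens included,
# 16 yuan per additional 1 million Tokens.
print(hybrid_monthly_bill(500_000, 100.0, 1_000_000, 16.0))    # 100.0
print(hybrid_monthly_bill(3_000_000, 100.0, 1_000_000, 16.0))  # 132.0
```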
