What Are Tokens? Why Do Large Models Charge by Tokens?

Recently, there has been a trend of showing off how many Tokens one has spent.

Some say Tokens are the “digital currency” of the AI world.

Simply because

[Large models charge by Tokens]

For example, this morning

I called DeepSeek through a certain platform

The platform’s charging standard is

16 yuan for every 1 million Tokens

Calculating this way, I spent 0.03 yuan this morning
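
The arithmetic behind that bill can be sketched in a few lines. This is only an illustration: the rate of 16 yuan per 1 million Tokens comes from the example above, while the function names are made up for this sketch and are not part of any platform's API.

```python
PRICE_PER_MILLION_YUAN = 16  # platform rate quoted in the example above


def cost_yuan(tokens: int, price_per_million: float = PRICE_PER_MILLION_YUAN) -> float:
    """Cost of `tokens` Tokens at a flat per-million-Token rate."""
    return tokens * price_per_million / 1_000_000


def tokens_for(cost: float, price_per_million: float = PRICE_PER_MILLION_YUAN) -> float:
    """Number of Tokens that a given spend buys at the same flat rate."""
    return cost * 1_000_000 / price_per_million


print(cost_yuan(1_000_000))     # one million Tokens at the quoted rate
print(round(tokens_for(0.03)))  # Tokens behind a 0.03 yuan bill
```

At this rate, a bill of 0.03 yuan corresponds to roughly 1,875 Tokens.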

↓

The bill shows

Every time I ask DeepSeek a question

I will “spend” some Tokens

After each question is answered, it automatically displays

How many Tokens you have consumed

Hehe, interesting, right?

Every time humans communicate with large models, whether chatting, writing, or predicting…

They will consume some Tokens

At this time, many people are confused

What are Tokens? How are they charged?

How is this different from traditional per-call API charges and subscription charges?

↓

Tokens are the smallest unit processed by large models

Equivalent to a “computational granularity”

It is not measured directly by words or phrases but rather by the smallest units obtained after the text is segmented by the model’s tokenizer.

For example, in English, “I love AI!”

Another example

In Chinese, “Artificial Intelligence is powerful”

Simply put

Token ≠ Word ≠ Character

Moreover, the length of a Token is not fixed

Its length depends on

the specific model’s tokenizer rules
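
To make "Token ≠ Word ≠ Character" concrete, here is a toy sketch. The vocabulary and the greedy longest-match rule are assumptions invented for illustration; real model tokenizers learn much larger vocabularies from data and may split the same text differently.

```python
# Hypothetical toy vocabulary, invented for this sketch only.
TOY_VOCAB = {"I", " love", " AI", "!", "token", "izer", "s"}


def toy_tokenize(text: str, vocab: set[str] = TOY_VOCAB) -> list[str]:
    """Greedy longest-match split; unknown characters become one-character Tokens."""
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


print(toy_tokenize("I love AI!"))   # 3 words, 10 characters
print(toy_tokenize("tokenizers"))   # 1 word, 10 characters
```

With this toy vocabulary, the first text becomes 4 Tokens and the second becomes 3, even though both have 10 characters. Change the vocabulary and the counts change too, which is why the same text can cost a different number of Tokens on different models.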

Why charge by Tokens?

Mainly because

The computational cost of running large models is very high

Charging by Tokens tracks actual resource usage more accurately

This charging method is fairer and more transparent

If large models were charged the traditional API way

That is, per call, with a fixed fee each time the API is called

It would obviously be unreasonable

For large models

The computational cost of short texts and long texts is completely different

How precise can Token charging be?

The cost consists of two parts

The input question + the AI output answer are both chargeable

For example

Input 1k Tokens + Output 2k Tokens = Charge for 3k Tokens
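
The same sum in code, as a rough sketch. The single rate of 16 yuan per 1 million Tokens is reused from the earlier example; the separate input and output price parameters are an assumption added here because price lists often quote the two directions separately, and the function names are hypothetical.

```python
def billed_tokens(input_tokens: int, output_tokens: int) -> int:
    """Both the question (input) and the answer (output) count toward the bill."""
    return input_tokens + output_tokens


def request_cost_yuan(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one request when input and output may be priced differently."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


print(billed_tokens(1_000, 2_000))              # 1k in + 2k out
print(request_cost_yuan(1_000, 2_000, 16, 16))  # same rate assumed for both directions
```

When both directions share one rate, this reduces to "total Tokens × unit price"; with different rates, a long answer can cost more than a long question of the same size.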

What you see in the [DeepSeek API Price List]

Usually looks like this

↓

So, when asking AI questions, don’t be too verbose

Every character you type costs money

DeepSeek is also accessed via API calls

But that is just the technical access method, not the charging method

For earlier APIs, however

The industry’s default charging method was

Charging per call: a fixed fee each time the API is called

For example

SMS API (charges 0.01 yuan for each message sent)

Weather query API (charges 0.03 yuan for each query)

This is very different from the current charging by Tokens
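
A small comparison shows the difference. The 0.03 yuan per-call fee and the 16 yuan per 1 million Tokens rate are taken from the examples above; the two request sizes are hypothetical numbers chosen only to contrast a short text with a long one.

```python
PER_CALL_FEE_YUAN = 0.03           # weather-query example above
TOKEN_PRICE_PER_MILLION_YUAN = 16  # platform rate from the earlier example


def per_call_cost(calls: int) -> float:
    """Fixed fee per call, regardless of how much work each call does."""
    return calls * PER_CALL_FEE_YUAN


def per_token_cost(total_tokens: int) -> float:
    """Fee proportional to the number of Tokens processed."""
    return total_tokens * TOKEN_PRICE_PER_MILLION_YUAN / 1_000_000


short_request = 200     # hypothetical Token count for a short exchange
long_request = 20_000   # hypothetical Token count for a long exchange

for name, tokens in [("short", short_request), ("long", long_request)]:
    print(name, "per call:", per_call_cost(1), "per Token:", per_token_cost(tokens))
```

Per-call charging bills both requests identically, while Token charging makes the long request cost 100 times more than the short one, in line with the difference in computation.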

↓

There is no good or bad charging method

Different projects adopt different methods

Charging per call is suitable for standard service products

Charging by Tokens is suitable for dynamically generated products

↓

Of course, there are traditional charging methods that we are very familiar with

① Subscription charging, like SaaS

Pay monthly/yearly, regardless of actual usage

Usually a fixed fee

② Charging by computing resources, like cloud computing

Charging based on how much computing resources (CPU/GPU/TPU) are consumed

Charging based on data storage capacity (GB/TB)

Charging based on bandwidth transfer volume (GB/TB)

…

③ Charging by feature unlock: a price for each feature unlocked

For example: unlocking security features, unlocking advanced features, etc.

Some also have basic and advanced versions, etc.

…

In short, different project types have different charging methods

↓

However, in actual projects, many pricing schemes

Adopt a [hybrid model]

For example

[API per call + Tokens by amount] hybrid charging

[Subscription + Tokens by amount] hybrid charging

[One-time contract + daily operation] hybrid charging

…
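
As one possible reading of the [Subscription + Tokens by amount] combination, the sketch below charges a fixed monthly fee that includes a Token quota, then bills any excess by amount. Every number and name here is hypothetical; real plans define their own fees, quotas, and overage rules.

```python
def hybrid_monthly_bill(
    tokens_used: int,
    monthly_fee: float,
    included_tokens: int,
    overage_price_per_million: float,
) -> float:
    """Fixed subscription fee plus per-Token charges beyond the included quota."""
    overage = max(0, tokens_used - included_tokens)
    return monthly_fee + overage * overage_price_per_million / 1_000_000


# Hypothetical plan: 100 yuan/month, 5 million Tokens included, 16 yuan per extra million.
print(hybrid_monthly_bill(3_000_000, 100, 5_000_000, 16))  # within the quota
print(hybrid_monthly_bill(8_000_000, 100, 5_000_000, 16))  # 3 million Tokens over
```

Light users pay only the fixed fee, while heavy users pay in proportion to the extra Tokens they consume.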
