MLNLP community is a well-known machine learning and natural language processing community both domestically and internationally, covering NLP graduate students, university teachers, and corporate researchers. The vision of the community is to promote communication and progress between the academic and industrial circles of natural language processing and machine learning, especially for beginners.
Reprinted from | New Intelligence
What skills do you need to build a multi-task natural language processing model? Answer: knowing Chinese is enough. The CLUE Chinese evaluation community recently released a tool called ClueAI that wraps all the interfaces, so you can set up an NLP model API in three minutes and drive different types of tasks with Chinese prompts.
Repository link: https://github.com/clue-ai/clueai-python
ClueAI also provides a free online trial website that includes templates for 20 NLP tasks.
Trial link: https://www.cluebenchmarks.com/clueai.html
The API behind it is “PromptCLUE”, which is currently the open-source model that supports the most Chinese tasks in China.
1
Example
News Classification (classify)
Input: News classification: Earlier today (the 3rd), Taiwan media such as China Times and United Daily News reported that Pelosi arrived at the "Legislative Yuan" on the morning of the 3rd. The Taiwan New Party protested outside the "Legislative Yuan" at 8 a.m., shouting: "Pelosi, get out!" Taiwan media reported that New Party Chairman Wu Chengdian stated that Pelosi's visit to Taiwan has no justification, saying, "Usually we say guests are welcome, but this time it is a disaster! She is here to harm Taiwan." He said Pelosi brings disaster to Taiwan, "What else should we welcome?" Options: Finance, Law, International, Military Answer:
Model output: International
Intent Classification (classify)
Input: Intent classification: Help me book a room in Shanghai Pudong on Sunday Options: Alarm, Literature, Hotel, Art, Sports, Health, Weather, Other Answer:
Model output: Hotel
Sentiment Analysis (classify)
Input: Sentiment analysis: This looks okay, but I actually don’t like it Options: Positive, Negative Answer:
Model output: Negative
Inference (generate)
Input: Inference relationship judgment: Premise: Xiao Ming is in Beijing today Hypothesis: Xiao Ming is traveling in Shenzhen Options: Contradiction, Entailment, Neutral Answer:
Model output: Contradiction
Reading Comprehension (generate)
Input: Reading comprehension: Paragraph: On August 2, overseas network reported that three members of the U.S. House of Representatives recently sent a letter to the U.S. Government Accountability Office (GAO), requesting an examination of whether the federal government’s response to the monkeypox epidemic is sufficient. In the letter, the three congressmen stated that the U.S. public health system is "seriously damaged," and the federal government’s response to the monkeypox epidemic has been slow, with delays in distributing reagents and vaccines, affecting the ability to contain the spread of the epidemic. Moreover, millions of doses of monkeypox vaccines took months to be approved and shipped from a Danish factory to the U.S. The congressmen also requested the GAO to examine whether federal agencies such as the Centers for Disease Control and Prevention, the Food and Drug Administration, and the Department of Homeland Security have adequately prepared for the monkeypox epidemic and whether they have learned from the experience of responding to the COVID-19 pandemic. U.S. media reported that for several weeks, U.S. lawmakers have frequently pressured the federal government regarding its response to the monkeypox epidemic, calling for the government to declare a public health emergency. As of now, the CDC has confirmed more than 5,000 cases of monkeypox. Considering that many citizens still cannot be tested for the monkeypox virus, the current number of cases may be underestimated. Question: What problems does the federal government face in responding to the epidemic? Answer:
Model output: The federal government’s response to the monkeypox epidemic has been slow, and there have been delays in distributing reagents and vaccines, affecting the ability to contain the spread of the epidemic.
Reading Comprehension – Open-ended (generate)
Input: Read the dialogue and answer the question. Man: Why are you so late for work today? Woman: I worked late yesterday, and I also have a cold. Man: Then you should go home and rest, I'll help you take a leave. Woman: Thank you. Question: How is the woman? Options: Working, Has a cold, On a call, Going on a business trip. Answer:
Model output: Has a cold
Summary (generate)
Input: Generate a summary for the following article: At 12:52 on September 5, a 6.8 magnitude earthquake occurred in Luding County, Ganzi Tibetan Autonomous Prefecture, Sichuan Province. After the earthquake, leaders attached great importance and made important instructions, requiring that saving lives be the top priority, and all-out rescue efforts be made for the affected people, minimizing casualties. Answer:
Model output: A 6.8 magnitude earthquake occurred in Luding County, Ganzi Prefecture, Sichuan Province.
Translation – Chinese to English (generate)
Input: Translate to English: The Speaker went to Taiwan, and the Chinese people were angry. Answer:
Model output: The Speaker went to Taiwan, and the Chinese people were angry.
Translation – English to Chinese (generate)
Input: Translate to Chinese: This is a dialogue robot that can talk to people. Answer:
Model output: 这是一个能与人对话的机器人。
General Information Extraction (generate)
Input: Information extraction: According to Xinhua News Agency, the government of Qingcheng District, Qingyuan City, Guangdong Province, announced yesterday that Chen Bo, the director of the Environmental Protection Bureau of Qingcheng District, who was recently reported for extorting businesses and saying "I can ruin a factory in minutes," has been removed from his position as a member of the district committee. Question: Organization name, person name, position Answer:
Model output: Organization name: Xinhua News Agency, Qingcheng District Environmental Protection Bureau, Qingcheng District Committee Person name: Chen Bo Position: Director
Resume Information Extraction (generate)
Input: Read the text and extract key information: Zhang Xuanwu, born in 2000, Chinese nationality, no overseas residence rights, PhD degree, currently the technical director of Hangzhou Xiansuo Technology. Question: Organization, person name, position, native place, major, nationality, degree, ethnicity Answer:
Model output: Organization: Hangzhou Xiansuo Technology Person name: Zhang Xuanwu Position: Technical Director Nationality: Chinese Nationality Degree: PhD Degree
Medical Information Extraction (generate)
Input: Find the specified information: I feel uncomfortable in my stomach today, my hands are a bit numb, and I feel unwell all over. A few days ago, I went to the hospital, and the doctor said I have gastritis. Question: Symptoms, treatment, examination, body part, disease Answer:
Model output: Symptoms: Stomach discomfort, numbness, unwell all over, gastritis
E-commerce Customer Demand Analysis (classify)
Input: E-commerce customer demand classification: Received but not fitting, can I return or exchange? Options: Buyer inquires whether the product supports Huabei payment, Buyer expresses interest in the store, Buyer inquires about return and exchange rules, Buyer needs product recommendations Answer:
Model output: Buyer inquires about return and exchange rules
Medical Semantic Similarity (classify)
Input: Do the following sentences express the same semantics: Text 1: What to do if my legs are numb from diabetes? Text 2: How to control diabetes lifestyle? Options: Similar, Not similar Answer:
Model output: Not similar
Question Generation (generate)
Input: Question generation: According to China News Service, on September 22, the Ministry of Commerce held a regular press conference, and the Ministry's spokesperson Shu Jueting stated that from January to August this year, China actually utilized foreign capital of 138.4 billion U.S. dollars, an increase of 20.2%; among which, investment from the European Union to China increased by 123.7% (including data from free port investments). This fully indicates that foreign investors, including those from the EU, continue to have confidence in the Chinese market and hope to deepen investment cooperation with China. Answer:
Model output: 1. How much did China actually utilize foreign capital from January to August this year? 2. How much did EU investment in China increase?
Coreference Resolution (generate)
Input: Which noun phrase does the pronoun refer to: Paragraph: On September 21, local time, British Prime Minister Truss met with U.S. President Biden in New York. She then emphasized on Twitter that the UK and the US are steadfast allies. The tweet was accompanied by a video of her (the pronoun) meeting with Biden. Question: Who does the pronoun "she" refer to? Answer:
Model output: Truss
Keyword Extraction (generate)
Input: Extract keywords: On the 21st local time, the Federal Reserve announced an interest rate hike of 75 basis points, raising the target range for the federal funds rate to between 3.00% and 3.25%, in line with market expectations. This is the fifth interest rate hike by the Federal Reserve this year and the third consecutive hike, creating the most intensive rate hike since 1981. Keywords:
Model output: Federal Reserve, 75 basis points interest rate hike, rate hike magnitude
Sentiment Orientation (classify)
Input: What kind of emotion is contained in the text: The super cute handsome guy, love it... Options: Disgust, Like, Happy, Sad, Surprised, Angry, Afraid Answer:
Model output: Like
Custom (generate)
Input: XXXX
Model output: YYYY
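The examples above share one shape: a task instruction, the text, an optional list of options, and a trailing cue such as "Answer:" or "Keywords:". Below is a minimal sketch of a helper that assembles such prompts. The name `build_prompt` and its parameters are hypothetical and not part of the clueai package, and the newline separators are an assumption based on the curl example in section 3.

```python
from typing import Optional, Sequence


def build_prompt(task: str, text: str,
                 options: Optional[Sequence[str]] = None,
                 cue: str = "Answer:") -> str:
    """Assemble a prompt in the 'task / text / options / cue' shape.

    Hypothetical helper: the newline separators are an assumption.
    """
    parts = [f"{task}:", text.strip()]
    if options:
        # Classification-style tasks list the candidate labels explicitly.
        parts.append("Options: " + ", ".join(options))
    parts.append(cue)
    return "\n".join(parts)


prompt = build_prompt(
    "Sentiment analysis",
    "This looks okay, but I actually don't like it",
    options=["Positive", "Negative"],
)
print(prompt)
```

For tasks with a different cue, such as keyword extraction, pass `cue="Keywords:"`.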
2
Model Introduction
When calling the clueai API, a default model is selected for the use case. The default model is well suited to getting started, but in a production environment we recommend specifying a particular model with the model_name parameter.
Return results
For different tasks, the corresponding clueai object is returned (for example, “Classification” for classification).
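One way to follow the recommendation to set model_name explicitly is to resolve the model name in one place and pass it on every call. The sketch below assumes the `classify(model_name=..., task_name=..., inputs=..., labels=...)` call and the `classifications` attribute used in the Python example in section 3. The environment variable `CLUEAI_MODEL_NAME` and both function names are hypothetical. The article does not describe the structure of each returned item, so the results are passed through untouched.

```python
import os

DEFAULT_MODEL = "clueai-base"  # model name used in this article's examples


def resolve_model_name(env=os.environ) -> str:
    """Return the model to use, read from CLUEAI_MODEL_NAME if set.

    CLUEAI_MODEL_NAME is a hypothetical variable name, not defined by clueai.
    """
    name = env.get("CLUEAI_MODEL_NAME", "").strip()
    return name or DEFAULT_MODEL


def classify_pinned(client, task_name, inputs, labels, env=os.environ):
    """Call client.classify with an explicit model_name.

    `client` is any object exposing classify(model_name=..., task_name=...,
    inputs=..., labels=...), as in the article's Python example.
    """
    response = client.classify(
        model_name=resolve_model_name(env),
        task_name=task_name,
        inputs=inputs,
        labels=labels,
    )
    # The article reads results from the `classifications` attribute.
    return response.classifications
```

With the package installed, `classify_pinned(cl, 'Product Classification', inputs, labels)` would be called with the `cl` client created as in section 3.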
3
One-Click Installation
ClueAI can be installed with pip; the Python package simplifies calling the clueai API from Python 3.
pip install --upgrade clueai
It can also be installed from source, from a checkout of the repository:
python setup.py install
If you do not want to install it locally for now, you can also run it in Colab with one click.
Link: https://colab.research.google.com/drive/1H5J03ek3kpKschQ32mhX-y0JyRo1mIXN#scrollTo=zMSp1naSL8X9
It is also very convenient to use: the base model runs directly, while trying the large model requires an api-key obtained from the ClueAI official website.
The code interface is designed separately for text classification and text generation tasks.
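Since the base model runs without a key and the large model needs one, it can help to decide the client settings in one place. The following is a minimal sketch, not the package's documented setup: the environment variable `CLUEAI_API_KEY` and the function `client_settings` are hypothetical, and enabling `check_api_key` whenever a key is present is an assumption. Only the keyless form `clueai.Client("", check_api_key=False)` appears in this article.

```python
import os


def client_settings(env=os.environ):
    """Return (api_key, check_api_key) for clueai.Client.

    CLUEAI_API_KEY is a hypothetical environment variable name. Without a
    key we fall back to the article's base-model setup: an empty key with
    check_api_key=False.
    """
    api_key = env.get("CLUEAI_API_KEY", "").strip()
    return api_key, bool(api_key)


api_key, check = client_settings()
# With the package installed (assumed usage):
#   cl = clueai.Client(api_key, check_api_key=check)
print("key check enabled:", check)
```

Reading the key from the environment keeps it out of source files and notebooks that may be shared.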
Text Classification
The Python code is as follows:
import clueai
from clueai.classify import Example
# The base model runs with an empty key when check_api_key=False
cl = clueai.Client("", check_api_key=False)
response = cl.classify(
    model_name='clueai-base',
    task_name='Product Classification',
    inputs=["Powerful image processor, showcasing the natural beauty of you,, fixing some small errors, enhancing overall stability.", "Seeking idle sales, boutique shopping, safe and reliable idle mall, quality products guaranteed"],
    labels=["Beauty", "Second-hand", "Takeout", "Office", "Job Seeking"])
print('prediction: {}'.format(response.classifications))
You can also access the API using a curl command:
curl --location --request POST 'modelfun.cn/modelfun/ap' \
--header 'Content-Type: application/json' \
--header 'Model-name: clueai-base' \
--data '{"task_type": "classify","task_name": "Product Classification","input_data": ["Powerful image processor, showcasing the natural beauty of you,, fixing some small errors, enhancing overall stability.", "Seeking idle sales, boutique shopping, safe and reliable idle mall, quality products guaranteed"],"labels": ["Beauty", "Second-hand", "Takeout", "Office", "Job Seeking"]}'
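The same request can be assembled in Python with the standard library. In this minimal sketch the header and field names are copied from the curl command above, while the function name `build_classify_request` is hypothetical. It only builds the headers and the JSON body. Sending them to the endpoint from the curl command, and parsing the reply, are left out because the article does not show the response format.

```python
import json


def build_classify_request(inputs, labels,
                           task_name="Product Classification",
                           model_name="clueai-base"):
    """Return (headers, body) mirroring the curl command above."""
    headers = {
        "Content-Type": "application/json",
        "Model-name": model_name,
    }
    payload = {
        "task_type": "classify",
        "task_name": task_name,
        "input_data": list(inputs),
        "labels": list(labels),
    }
    # ensure_ascii=False keeps Chinese text readable in the request body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return headers, body


headers, body = build_classify_request(
    inputs=["Seeking idle sales, boutique shopping"],
    labels=["Beauty", "Second-hand"],
)
print(headers["Model-name"], len(body))
```

Building the body with `json.dumps` avoids the quoting mistakes that are easy to make when writing JSON by hand inside a shell string.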
Text Generation
The Python code is as follows:
import clueai
# initialize ClueAI Client with an API Key
cl = clueai.Client("", check_api_key=False)
prompt = '''Summary:
This article summarizes ten design principles for wearable products, which are also considered by the author to be the most attractive aspects of this industry: 1. Solve repetitive problems for people; 2. Start from people, not from machines; 3. Attract attention, but do not be deliberate; 4. Enhance user capabilities rather than replace humans.
Answer:'''
# generate a prediction for a prompt
prediction = cl.generate(model_name='clueai-base', prompt=prompt)
# print the predicted text
print('prediction: {}'.format(prediction.generations[0].text))
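When several prompts need to be processed, the call above can be wrapped in a small loop. The sketch below assumes only what the example shows: a `generate(model_name=..., prompt=...)` method whose result exposes `generations[0].text`. The function name `generate_texts` is hypothetical, and sending one request per prompt is a simplification rather than a documented batching feature.

```python
def generate_texts(client, prompts, model_name="clueai-base"):
    """Run each prompt through client.generate and collect the first text.

    `client` is any object exposing generate(model_name=..., prompt=...)
    whose result has generations[0].text, as in the article's example.
    """
    results = []
    for prompt in prompts:
        prediction = client.generate(model_name=model_name, prompt=prompt)
        # Keep only the first generation, as the article's example does.
        results.append(prediction.generations[0].text)
    return results
```

With the `cl` client from the example, `generate_texts(cl, [prompt])` would return a list with one generated string per prompt.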
The curl command is as follows:
curl --location --request POST 'modelfun.cn/modelfun/ap' \
--header 'Content-Type: application/json' \
--header 'Model-name: clueai-base' \
--data '{"task_type": "generate","task_name": "Summary","input_data": ["Summary:\nThis article summarizes ten design principles for wearable products, which are also considered by the author to be the most attractive aspects of this industry: 1. Solve repetitive problems for people; 2. Start from people, not from machines; 3. Attract attention, but do not be deliberate; 4. Enhance user capabilities rather than replace humans.\nAnswer:"]}'
Reference materials: https://mp.weixin.qq.com/s/wkHItMnttReKpLd5nFmwDw
About Us
MLNLP Community is a grassroots academic community jointly established by scholars in machine learning and natural language processing from home and abroad. It has developed into a well-known machine learning and natural language processing community both domestically and internationally, aiming to promote progress between the academic and industrial circles of machine learning and natural language processing. The community can provide an open communication platform for the further study, employment, and research of related practitioners. Everyone is welcome to follow and join us.