AI Empowerment in Identifying Telecom Fraud

Authors: Hu Ruimin, Xi’an University of Electronic Science and Technology, School of Network and Information Security; Wu Junhang, Ren Lingfei, Wuhan University, School of Computer Science

Mao Renxin, Partner and Algorithm Scientist, Blue Elephant Intelligent Connection (Hangzhou) Technology Co., Ltd.

Background

With the rapid development of the internet and mobile communication technology, telecom network fraud has remained at a high incidence and new fraud methods keep emerging, posing a significant threat to public property. At a press conference on January 9, 2024, the Ministry of Public Security reported that 437,000 telecom network fraud cases were solved in 2023. The National Anti-Fraud Center issued 9.4 million fund-warning instructions to local authorities, and public security agencies carried out face-to-face dissuasion with 13.89 million people. Working with relevant departments, they intercepted 2.75 billion fraudulent calls and 2.28 billion fraudulent text messages, dealt with 8.364 million fraud-related domain names and URLs, and urgently intercepted 328.8 billion yuan of funds involved in fraud cases.

In September 2022, the “Anti-Telecom Network Fraud Law of the People’s Republic of China” (hereinafter the “Anti-Fraud Law”) was passed, providing a legal basis for combating telecom network fraud and clarifying the responsibilities and powers of institutions such as banks and telecom operators. Banks play a crucial role in this fight: as a key node in the flow of funds, they are both the first line of defense against fraud and the core entity in identifying and preventing fraud-related money laundering. The “Anti-Fraud Law” therefore clearly stipulates the responsibilities of banks and other financial institutions. Under the law, banks must establish and improve internal control systems, strengthen customer identification and transaction monitoring, and promptly detect and report suspicious transactions. Banks should also strengthen cooperation with public security agencies, assist in anti-fraud work, and provide necessary support for combating crime. The law also grants banks and other financial institutions certain powers: in performing their anti-fraud duties, banks may lawfully query and freeze the accounts involved in order to cut off the flow of fraudulent funds. These powers allow banks to act more proactively and effectively, stopping fraud in time and reducing losses. Finally, the law stipulates that banks that fail to fulfill their anti-fraud responsibilities, resulting in fraud cases, will bear the corresponding legal liability. This increases banks’ sense of responsibility and urgency, prompting them to give anti-fraud work more attention and to take more effective measures.

Challenges

1. Criminal methods are well disguised and hard to investigate. The methods used by telecom fraudsters are highly deceptive, which poses significant challenges for law enforcement agencies. Fraudsters disguise their identities and behavior in various ways to gain victims’ trust. First, they exploit the virtual nature of the internet to forge identity information such as phone numbers and IP addresses, making their true identities difficult to trace and confirm. Second, they usually contact victims by phone or social media and avoid face-to-face contact, which further complicates investigation. Moreover, they use technical means such as bulk SMS and automated dialing to launch fraud quickly and at scale, making it hard for law enforcement to track and counter them in time.

2. Fraud methods are diverse, and new types keep emerging. Telecom network fraud takes many forms, from the early “lottery scams” and “acquaintance impersonation scams” to today’s “phishing” and “fake investment platforms.” Fraudsters exploit people’s unfamiliarity with new technologies and blind trust in online information, carefully designing schemes to lure victims. Fraud not only causes significant economic losses but also damages victims’ mental health and social trust. Many victims, in addition to losing money, endure immense psychological pressure and social stigma that severely disrupt their lives and work. More worrying still is how quickly the methods evolve: from traditional phone and SMS scams to scams run through internet platforms and social media, they are becoming ever more covert and intelligent, posing unprecedented challenges for prevention and control.

3. Fraud groups are tightly organized, with a clear division of labor. To make fraud more efficient, the scattered, independent gangs of the past have begun to consolidate into very large criminal groups operating under the guise of “industrial parks” and “technology parks,” and group-style fraud cases keep emerging. For example, in the 2023 Jiangyin “6·16” case, jointly supervised by the Supreme People’s Procuratorate and the Ministry of Public Security, six behind-the-scenes “financial backers” built a “hardware and building materials city” in northern Myanmar and successively recruited 18 fraud gangs to move in. They provided the gangs with office space and accommodation, kept them under closed management, and used armed guards to watch over gang members, forming a very large criminal group. Telecom fraud groups have a tight organizational structure and a clear division of labor: members each perform their own role and cooperate with one another, which makes this a form of organized crime. By responsibility, members of a telecom network fraud group fall into six types: group organizers, information collection teams, script editing teams, fraud teams, primary money laundering teams, and subordinate money laundering teams.

4. Model recognition coverage is low and model lifecycles are short. Banks and other financial institutions face severe challenges from telecom network fraud, particularly in recognition coverage, model lifecycle, and control measures. First, recognition coverage at the account-opening stage is insufficient, mainly because few data dimensions can be verified, which makes a comprehensive assessment of account holders difficult. In addition, transaction scenarios lack cross-channel joint prevention and control mechanisms, and the accuracy of risk-account warning systems needs improvement, so manual review is costly and inefficient. Meanwhile, the rapid iteration of fraud methods shortens the lifecycle of risk control models, which must be updated constantly to keep up with new risk characteristics.

Second, banks have relatively few control measures available once they identify a suspicious transaction. These are mostly limited to SMS alerts or educational reminders and lack targeted strategies, which restricts banks’ ability to defend proactively and respond in time. To meet these challenges, banks need to adopt innovative measures, such as introducing broader data dimensions, using biometric recognition and big data analysis to improve monitoring accuracy, and working closely with law enforcement agencies to improve their ability to identify and control fraud. Through these efforts, banks can build a more solid line of defense that protects customer funds and maintains the stability of the financial order.

Methods

Although combating telecom network fraud involves many challenges, the state attaches great importance to such cases, and a variety of anti-fraud technologies have emerged that provide significant support for law enforcement agencies.

1. Rule-based expert systems. In the early stages, the data used for fraud detection was usually highly structured, such as transaction logs or call logs. Operators and banks therefore used predefined rules and static thresholds to filter out fraudulent behavior. For example, when a user makes a large number of calls in a short period, or a bank account records a large number of transactions within a certain period, the anti-fraud system issues an alert.
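
A static-threshold rule of this kind can be sketched in a few lines of Python. This is only an illustration: the threshold of three calls, the 60-second window, and the call log itself are hypothetical values, not figures from any operator’s system.

```python
from collections import defaultdict, deque

# Hypothetical thresholds, chosen only for illustration.
MAX_CALLS = 3          # alert when a caller exceeds this many calls ...
WINDOW_SECONDS = 60    # ... inside this sliding time window


def find_alerts(call_log, max_calls=MAX_CALLS, window=WINDOW_SECONDS):
    """Return (caller, timestamp) pairs at which the static rule fires.

    call_log is an iterable of (timestamp_in_seconds, caller_id) tuples.
    """
    recent = defaultdict(deque)   # caller_id -> timestamps still inside the window
    alerts = []
    for ts, caller in sorted(call_log):
        q = recent[caller]
        q.append(ts)
        # Drop calls that have fallen out of the sliding window.
        while q and ts - q[0] >= window:
            q.popleft()
        if len(q) > max_calls:
            alerts.append((caller, ts))
    return alerts


call_log = [
    (0, "A"), (10, "A"), (20, "A"), (30, "A"),      # four calls within one minute
    (0, "B"), (100, "B"), (200, "B"), (300, "B"),   # four calls, widely spaced
]
alerts = find_alerts(call_log)
print(alerts)
```

Caller A trips the rule on the fourth call, while caller B, who makes the same number of calls spread over five minutes, does not. The rule is transparent and cheap, but every threshold has to be maintained by hand.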

2. Traditional machine learning algorithms. Rule-based expert systems can detect the more obvious fraud methods, but their limitations have become increasingly apparent as attack and defense evolve. First, their knowledge is updated slowly and cannot keep up with constantly changing fraud methods. Second, rule-based detection misclassifies many normal users as fraudulent, causing inconvenience in people’s daily lives. Given these shortcomings, more machine learning-based approaches have been proposed and adopted. These methods typically extract statistical features relevant to the task from data such as call records, credit history, and historical transactions, and, after feature engineering, use the features to train a classifier that produces the final classification. Widely used algorithms include Naive Bayes, Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), XGBoost, and LightGBM.
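
As a minimal sketch of the feature-then-classifier pipeline, the following trains a logistic regression with plain batch gradient descent in pure Python. The two features (a normalized call rate and the share of night-time transfers), the six training rows, and the hyperparameters are all assumptions made up for illustration; a production system would use an established library and far richer features.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (suspected fraud) when the predicted probability is at least 0.5."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return 1 if p >= 0.5 else 0


# Hypothetical engineered features: [normalized call rate, share of night-time transfers]
X = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.30],   # normal accounts
     [0.90, 0.80], [0.80, 0.90], [0.85, 0.70]]   # confirmed fraud accounts
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(X, y)
train_predictions = [predict(w, b, x) for x in X]
low_risk = predict(w, b, [0.15, 0.15])
high_risk = predict(w, b, [0.85, 0.85])
print(train_predictions, low_risk, high_risk)
```

Unlike a hand-written rule, the decision boundary here is learned from labeled examples, so retraining on fresh cases updates it without anyone rewriting thresholds.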

3. Voiceprint recognition technology. In telecom fraud, criminals typically contact victims by voice call. Because each person’s voiceprint is unique and every voice call leaves a voiceprint behind, voiceprint recognition is a powerful tool for intercepting fraud. It consists of two main stages: front-end processing, and modeling and testing. Front-end processing covers the preprocessing of voice signals and feature extraction, while modeling and testing covers feature modeling and voiceprint testing. In recent years, with the rapid development of big data, cloud computing, and deep learning, voiceprint recognition has improved significantly in accuracy and speed and is now widely applied in combating telecom network fraud.
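
The modeling and testing stage can be illustrated with a toy example. The sketch below assumes that front-end processing has already turned each utterance into a fixed-length feature vector; the three-dimensional vectors and the 0.8 similarity threshold are hypothetical, and real systems use much higher-dimensional embeddings produced by a trained model.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def enroll(utterance_vectors):
    """Feature modeling: average several utterance vectors into one voiceprint."""
    n = len(utterance_vectors)
    return [sum(col) / n for col in zip(*utterance_vectors)]


def matches(voiceprint, test_vector, threshold=0.8):
    """Voiceprint testing: accept when similarity reaches the (assumed) threshold."""
    return cosine_similarity(voiceprint, test_vector) >= threshold


# Hypothetical feature vectors from two earlier calls by a known fraudster.
known_fraudster = enroll([[0.9, 0.1, 0.2], [0.8, 0.2, 0.1]])

same_speaker_call = [0.88, 0.12, 0.18]
other_speaker_call = [0.10, 0.90, 0.30]

hit = matches(known_fraudster, same_speaker_call)
miss = matches(known_fraudster, other_speaker_call)
print(hit, miss)
```

Comparing an incoming call against a library of enrolled voiceprints in this way is what lets a system flag a known fraudster even after the phone number has changed.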

4. Natural language processing technology. NLP-based anti-fraud technology has developed into an important technical means of responding to increasingly rampant telecom network fraud. It uses computers to analyze and understand language text in order to identify and intercept fraudulent behavior. In practice, it mines communication content in depth, including semantic, grammatical, and contextual analysis, to identify fraudulent information in scam calls, text messages, and online content. Natural language processing makes it possible to detect fraud automatically and block it in time, protecting users’ legitimate rights and property.
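
A very small text classifier gives a sense of how message content can be scored. The sketch below is a multinomial Naive Bayes model with add-one smoothing, written in pure Python; the six training messages are invented, and a bag-of-words model like this captures none of the semantic or contextual analysis described above, so it should be read as a baseline illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train_naive_bayes(samples):
    """samples: list of (text, label). Returns per-label word counts, doc counts, vocabulary."""
    word_counts, doc_counts, vocab = {}, Counter(), set()
    for text, label in samples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return word_counts, doc_counts, vocab


def classify(text, model):
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)   # log prior
        for tok in tokenize(text):
            if tok in vocab:   # words never seen in training are ignored
                score += math.log((counts[tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training messages, for illustration only.
training = [
    ("you won a prize click link", "fraud"),
    ("urgent transfer money to safe account", "fraud"),
    ("verify your bank account click link", "fraud"),
    ("see you at dinner tonight", "normal"),
    ("meeting moved to monday", "normal"),
    ("your package arrives tonight", "normal"),
]
model = train_naive_bayes(training)

label_a = classify("click link to verify account", model)
label_b = classify("dinner on monday", model)
print(label_a, label_b)
```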

Thoughts on Building a Comprehensive Anti-Fraud System

A key question for banks and operators confronting telecom network fraud is how to use big data and artificial intelligence to build a new type of anti-fraud risk control system (as shown in the figure). Such a system should predict, block, manage, and trace the risky behavior of telecom fraudsters and the upstream and downstream black and gray industries that support them, forming comprehensive anti-fraud governance that spans prevention before incidents, interception during incidents, and loss mitigation after incidents.

Figure: Comprehensive Anti-Fraud System Framework

1. Data and algorithm infrastructure. Data and algorithms serve as the infrastructure for anti-telecom network fraud, forming an important foundation for the entire anti-fraud system.

Regarding data, the business logic of the financial industry is complex and intertwined, and the data generated in financial payment processes is diverse, highly correlated, and privacy-sensitive. Fast access to the data and an effective representation of it therefore have to account for large data scale, multi-modal heterogeneity, strong coupling and correlation, privacy sensitivity, and real-time invocation. Multi-dimensional, heterogeneous, large-scale interaction graphs are well suited as information carriers here: they support unsupervised and semi-supervised learning, make combined use of node features and relational information, are easy to visualize, and retain as much of the original information as possible. Combined with graph machine learning methods, they can integrate multi-source heterogeneous information and give anti-fraud models efficient and powerful multi-modal data analysis capabilities.
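
To make the graph representation concrete, the sketch below builds a tiny heterogeneous graph in which accounts, devices, and phone numbers are nodes and shared usage forms the edges, then lists the accounts within a few hops of a known fraud account. The node names and edges are hypothetical, and the hop-limited breadth-first search is a deliberately simple stand-in for graph machine learning, not an example of it.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Undirected adjacency sets from (node, node) pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def accounts_near_seed(graph, seed, max_hops):
    """Accounts reachable from a known fraud account within max_hops edges."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue   # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return sorted(n for n in dist if n.startswith("acct:") and n != seed)


# Hypothetical relations: which account used which device or phone number.
edges = [
    ("acct:A", "device:D1"),
    ("acct:B", "device:D1"),   # B shares a device with A
    ("acct:B", "phone:P1"),
    ("acct:C", "phone:P1"),    # C shares a phone number with B
    ("acct:E", "device:D2"),   # E is unrelated
]
graph = build_graph(edges)

two_hops = accounts_near_seed(graph, "acct:A", max_hops=2)
four_hops = accounts_near_seed(graph, "acct:A", max_hops=4)
print(two_hops, four_hops)
```

Even this crude query surfaces account C, which has no direct link to the seed account. That kind of indirect relationship is what a tabular, per-account feature set tends to miss.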

In addition, multi-institution collaborative prevention and control plays a crucial role across the anti-fraud chain, and it must protect data security and privacy. When institutions analyze data jointly, privacy-preserving techniques such as federated learning and secure multi-party computation can effectively address privacy protection, making data “usable but not visible.” Working with Blue Elephant Intelligent Connection, under the guidance of a branch of the People’s Bank of China and the Public Security Department, the authors integrated data from government, public security agencies, multiple commercial banks, the three major telecom operators, and other third-party sources to build a “financial anti-fraud collaborative prevention and control platform based on privacy computing technology.” Compared with modeling on a single institution’s data, the federated model’s performance improves by 30%. Secure sharing of multi-party data thus enables comprehensive collaborative prevention and control while ensuring that every bank account that should be opened can still be opened.
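
The idea of data being usable but not visible can be illustrated with additive secret sharing, one of the basic building blocks of secure multi-party computation. In the sketch below, three hypothetical banks learn how many transfers a suspicious account received in total without any bank revealing its own count. Everything runs in a single process purely for demonstration; the modulus and the counts are assumptions, and this is not a description of the platform mentioned above.

```python
import secrets

PRIME = 2**61 - 1   # modulus for the shares; any prime larger than the true sum works


def share(value, n_parties):
    """Split value into n random-looking shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_values):
    """Simulate the protocol: each party shares its value, then partial sums are combined."""
    n = len(private_values)
    # Row i holds the shares created by party i; column j is what party j receives.
    all_shares = [share(v, n) for v in private_values]
    # Each party adds up the shares it received and publishes only that partial sum.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    return sum(partial_sums) % PRIME


# Hypothetical private counts of transfers into one suspicious account at three banks.
bank_counts = [12, 0, 7]
total = secure_sum(bank_counts)
print(total)
```

Each individual share is a uniformly random number, so a single share or partial sum reveals nothing about a bank’s own count; only the combined result is meaningful.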

Algorithms have always been the core of artificial intelligence models. The business chain involved in telecom network fraud detection is relatively complex, so to keep models stable and reliable, the trustworthiness of anti-fraud models must be considered across the whole chain of training, deployment, and operation. Telecom network anti-fraud is also an adversarial process on both sides: fraud groups and the upstream and downstream black and gray industries use various means to evade detection, so the algorithms must be made more robust against such adversarial attacks. Finally, to avoid the “black box” problem, humans need to be able to “understand” the model; conversely, for artificial intelligence to absorb new knowledge, the model also needs to “understand” humans. The bidirectional interpretability of the model therefore needs to be strengthened.

2. Pre-incident situational awareness and intelligent collection. Traditional methods can detect telecom network fraud only once it occurs, whereas pre-incident situational awareness and prediction are an important part of intelligent financial anti-fraud. The data in financial institutions’ business systems can be divided into internal-domain data and external-domain data, and pre-incident risk awareness can accordingly be divided into internal and external risk awareness.

For internal risks, supervision of processes such as user registration, authentication, and login needs to be strengthened to catch malicious users. For external risks, cross-comparison with certain external public security data sources shows that over 70% of fraud cases involve apps or websites. On the one hand, therefore, artificial intelligence should be used for proactive situational detection, scanning and analyzing these apps and websites. On the other hand, multi-source information should be collected intelligently. Because telecom network fraud detection has strict real-time requirements, NLP technology can be used to collect external intelligence data in advance and extract knowledge from it into a lexicon, which helps block risks early and provides strong evidence for fraud detection during incidents.

3. In-incident anomaly detection and risk blocking. During an incident, suspicious risky transactions need to be identified and blocked. First, new business lines may have sparse or missing risk labels, so few-shot learning or unsupervised methods can be used for risk identification. Second, because fraud groups and the upstream and downstream black and gray industries are highly concealed, full-graph risk control based on graph machine learning can be introduced to mine the multi-dimensional relational topology among them and alleviate information silos. Third, current fraud methods often involve cross-platform transfers. To make risk prevention less difficult than it is on a single platform while protecting users’ funds and data, a financial anti-fraud collaborative prevention and control platform based on privacy computing technology can be used to integrate and share data, increasing the recognition coverage of anti-fraud models while keeping data secure. Finally, for risk dissuasion, interactive risk control can reduce disturbance to users and improve their payment experience: the system actively initiates an interaction with the user to learn the real risk they face, the risk model identifies and assesses that risk, and targeted dissuasion follows.
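
When risk labels are missing, one of the simplest unsupervised checks is a robust outlier score. The sketch below flags transfer amounts whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a cutoff. The amounts are invented, and the 3.5 cutoff is a commonly used convention rather than a value from the article; real systems would combine many signals beyond the amount.

```python
from statistics import median


def flag_outliers(values, cutoff=3.5):
    """Return values whose modified z-score (median/MAD based) exceeds the cutoff."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []   # no spread to measure against; nothing can be flagged this way
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if abs(0.6745 * (v - med) / mad) > cutoff]


# Hypothetical recent transfer amounts (yuan) for one account.
amounts = [100, 120, 95, 110, 105, 5000]
suspicious = flag_outliers(amounts)
print(suspicious)
```

Because the median and MAD are barely affected by the extreme value itself, the single large transfer stands out clearly, whereas a mean-and-standard-deviation score would be distorted by it.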

4. Post-incident model iteration and defense upgrade. The main challenge in financial risk prevention and control is dynamic confrontation. Fraudsters and the upstream and downstream black and gray industries may still find ways to evade detection or exploit system vulnerabilities, leading to “missed cases” and harming users’ rights and interests. The risk prevention and control system therefore needs a post-incident response module. Multi-modal machine learning (MMML) methods can be used to process and understand the text and images in collected user complaints, uncovering potential risks that feed back into iterations of the in-incident fraud detection models. Finally, building a large language model for anti-fraud enables knowledge extraction, and with online learning and new knowledge input, the anti-fraud models can be optimized continuously, improving the overall defense system.
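
The online-learning step can be sketched as a single stochastic gradient update applied whenever a missed case is confirmed. The example assumes a logistic regression detector and a hypothetical two-feature vector already derived from the complaint; how those features are extracted (for instance by a multi-modal model) is outside the sketch, and the learning rate is an arbitrary choice.

```python
import math


def predict_proba(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def online_update(w, b, x, y, lr=0.1):
    """One stochastic gradient step on the log loss for a single labeled case."""
    err = predict_proba(w, b, x) - y
    new_w = [wj - lr * err * xj for wj, xj in zip(w, x)]
    new_b = b - lr * err
    return new_w, new_b


# Start from an untrained model and one confirmed missed fraud case (label 1).
w, b = [0.0, 0.0], 0.0
missed_case = [1.0, 0.5]   # hypothetical features derived from a user complaint

p_before = predict_proba(w, b, missed_case)
w, b = online_update(w, b, missed_case, y=1)
p_after = predict_proba(w, b, missed_case)
print(p_before, p_after)
```

After the update the model assigns a higher fraud probability to the pattern it previously missed. Applying such updates continuously is one way to keep a detector from going stale between full retraining cycles.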

Conclusion

As the saying goes, “However high the devil climbs, the righteous climb higher.” The battle against telecom fraud is now being fought through collaboration across society. It actively introduces new technologies such as artificial intelligence, big data, privacy computing, and federated learning, while ensuring that the governance technology itself meets the requirements of trustworthy AI by strengthening core elements such as “data privacy protection,” “robustness,” and “interpretability.” In this way it protects users’ legitimate rights and interests at the root while combating telecom network fraud crimes.

(This article was published in the June 2024 issue of “Financial Electrification”)
