Multimodal Perception Data and One-Stop Algorithm Training for AI Empowerment in Jiangsu Courts

Introduction

In recent years, Jiangsu courts have implemented Xi Jinping Thought on the Rule of Law and his important ideas on building China into a cyber power, focusing on the work theme of “justice and efficiency.” They have explored the deep integration of artificial intelligence with judicial applications. Relying on multimodal perception data, they built the Jiangsu Court Perception AI Empowerment Platform, which provides one-stop training for audio and video AI, intelligent management and scheduling of algorithms, and a multi-dimensional visual feature library of personnel. The platform strengthens AI-assisted case handling and management in Jiangsu courts and helps ensure judicial safety.

Building a One-Stop Training Platform Shared by All Courts in the Province

To keep data secure within the courts’ dedicated network, Jiangsu courts use the one-stop training platform built into the Jiangsu Court Perception AI Empowerment Platform. It combines several model training schemes with intelligent data labeling technology to iterate and optimize audio and video algorithms quickly.

Model Training. To handle violations such as unauthorized video recording and physical conflicts in courtroom areas, Jiangsu courts use the one-stop training platform to build 12 types of training models, including image single-label classification, image multi-label classification, image multi-attribute classification, and object detection. Targeted training on these models meets the courts’ AI application needs across different scenarios and fragmented streaming media.
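The difference between the single-label and multi-label model types named above can be illustrated with a minimal sketch. The class names, logits, and the 0.5 threshold are hypothetical and are not taken from the platform: a single-label classifier picks exactly one class through a softmax, while a multi-label classifier scores each class independently through a sigmoid, so one frame can carry several labels at once.

```python
import math

# Hypothetical class names, for illustration only.
CLASSES = ["normal", "recording_with_phone", "physical_conflict"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map one raw score to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label(logits, classes):
    """Single-label classification: exactly one class wins."""
    probs = softmax(logits)
    best = max(range(len(classes)), key=probs.__getitem__)
    return classes[best]


def multi_label(logits, classes, threshold=0.5):
    """Multi-label classification: every class above the threshold is kept."""
    return [c for c, x in zip(classes, logits) if sigmoid(x) >= threshold]


logits = [-1.5, 2.1, 1.3]  # made-up model outputs for one video frame
print(single_label(logits, CLASSES))
print(multi_label(logits, CLASSES))
```

With the same scores, the single-label head reports only the strongest class, whereas the multi-label head can flag recording and a physical conflict in the same frame.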

Intelligent Labeling. Starting from the models trained at the Jiangsu High People’s Court, pilot courts in Jiangsu quickly fine-tune these base models using intelligent labeling technology for audio and video data. The incremental audio and video data that each court collects is labeled automatically, and optimization training is completed once the results have been adjusted and confirmed.
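The label-then-confirm workflow described above might be sketched as follows. This is only an illustration under assumptions: the `Proposal` type, the confidence field, and the idea of reviewing the least confident items first are hypothetical and are not described in the source.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    """A label proposed by the base model for one new clip (hypothetical type)."""
    item_id: str
    label: str
    confidence: float


def review_queue(proposals):
    """Order proposals so that the least confident ones are checked first."""
    return sorted(proposals, key=lambda p: p.confidence)


def apply_review(proposals, corrections):
    """Build fine-tuning pairs: a reviewer's correction overrides the model label,
    and any proposal without a correction counts as confirmed."""
    return [(p.item_id, corrections.get(p.item_id, p.label)) for p in proposals]


proposals = [
    Proposal("clip-1", "smoking", 0.97),
    Proposal("clip-2", "eating", 0.55),
]
corrections = {"clip-2": "drinking"}  # the reviewer adjusted one result
print([p.item_id for p in review_queue(proposals)])
print(apply_review(proposals, corrections))
```

The resulting pairs would then feed the next round of fine-tuning, so reviewers only adjust or confirm results instead of labeling every clip from scratch.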

Figure: Intelligent Labeling

Creating an Algorithm Repository Supported by Unified Management and Scheduling of Intelligent Algorithms

Audio and video come from different sources and in different formats across courts, which makes them inconvenient to process. To address this, pilot courts in Jiangsu use an algorithm repository to analyze the source and real-time requirements of each audio or video stream. Cloud-edge scheduling across cloud centers and front-end devices allows audio and video algorithms from different manufacturers to be recognized and connected. Unified management of multi-category, multi-manufacturer audio and video algorithms is achieved by defining specifications for algorithm package structure, structured description, and interfaces.
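As a rough sketch of how a structured description and a scheduling rule could fit together, the example below registers algorithm packages from two made-up vendors and picks one by task, input format, and real-time need. All field names, vendor names, and the rule "real-time streams run on edge devices, everything else in the cloud" are assumptions for illustration; the actual specifications are not given in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmDescriptor:
    """Hypothetical structured description of one algorithm package."""
    name: str
    vendor: str
    task: str             # e.g. "object_detection"
    input_formats: tuple  # codecs or containers the package accepts
    targets: tuple        # where it may run: "cloud" and/or "edge"
    entry_point: str      # interface name the scheduler would call


class AlgorithmRepository:
    """Keeps descriptors from all vendors behind one lookup interface."""

    def __init__(self):
        self._items = []

    def register(self, descriptor):
        self._items.append(descriptor)

    def select(self, task, input_format, realtime):
        """Return the first matching package, or None if nothing fits.

        Assumed rule: real-time streams are served on edge devices,
        other work is sent to the cloud center.
        """
        target = "edge" if realtime else "cloud"
        for d in self._items:
            if d.task == task and input_format in d.input_formats and target in d.targets:
                return d
        return None


repo = AlgorithmRepository()
repo.register(AlgorithmDescriptor(
    "smoke-detect", "vendor-a", "object_detection", ("h264",), ("edge",), "detect"))
repo.register(AlgorithmDescriptor(
    "smoke-detect-xl", "vendor-b", "object_detection", ("h264", "h265"), ("cloud",), "detect"))
print(repo.select("object_detection", "h264", realtime=True))
```

Because every package is described in the same structure, the scheduler does not need vendor-specific code paths to decide where a stream should be processed.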

Figure: Audio and Video Algorithm Repository

In addition, pilot courts apply ultra-wideband (UWB) wireless communication technology and the time difference of arrival (TDOA) principle for spatial positioning, deploying positioning base stations in security inspection rooms, litigation service halls, each floor of the trial buildings, and specialized business rooms. At registration, visitors receive temporary visitor passes with UWB positioning, so they can be located precisely from entry security inspection and registration until the end of their visit. In criminal trial support work, criminal defendants wear positioning wristbands, which enable full-process positioning of defendants, anti-tamper alarms, and escape alarms.
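The TDOA principle mentioned above can be shown with a small worked example. A tag's signal reaches each base station at a slightly different time; each time difference relative to a reference station fixes a difference in distance, and the position that best explains all differences is the estimate. The sketch below solves this in two dimensions with a Gauss-Newton iteration. The room layout, the solver choice, and the function name `tdoa_locate` are assumptions for illustration and say nothing about how the deployed system computes positions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def tdoa_locate(anchors, tdoas, guess=None, iterations=50):
    """Estimate a 2D tag position from time differences of arrival.

    anchors: list of (x, y) base-station positions in meters; anchors[0]
             is the reference station.
    tdoas:   tdoas[i - 1] = t_i - t_0 in seconds, for stations i >= 1.
    """
    range_diffs = [C * t for t in tdoas]
    if guess is None:  # start from the centroid of the stations
        x = sum(a[0] for a in anchors) / len(anchors)
        y = sum(a[1] for a in anchors) / len(anchors)
    else:
        x, y = guess
    x0, y0 = anchors[0]
    for _ in range(iterations):
        d0 = max(math.hypot(x - x0, y - y0), 1e-12)
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (ax, ay), m in zip(anchors[1:], range_diffs):
            di = max(math.hypot(x - ax, y - ay), 1e-12)
            r = (di - d0) - m                # residual of one range difference
            jx = (x - ax) / di - (x - x0) / d0
            jy = (y - ay) / di - (y - y0) / d0
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 -= jx * r
            b2 -= jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:                 # degenerate geometry, stop early
            break
        dx = (b1 * a22 - b2 * a12) / det     # solve the 2x2 normal equations
        dy = (a11 * b2 - a12 * b1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < 1e-9:
            break
    return x, y


# Simulated 10 m x 8 m room with a station in each corner and a tag at (3, 2).
anchors = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
tag = (3.0, 2.0)
dists = [math.hypot(tag[0] - ax, tag[1] - ay) for ax, ay in anchors]
tdoas = [(d - dists[0]) / C for d in dists[1:]]
print(tdoa_locate(anchors, tdoas))
```

The example uses noise-free simulated timings; real measurements carry clock and multipath errors, so a deployed solver would typically add filtering on top of a step like this.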

Figure: UWB Personnel Trajectory Positioning

Real-Time Detection of Abnormal Behavior in Court Trials and Petition Scenes and AI Video Labeling

In response to parties who are improperly dressed, smoke, or eat during remote court sessions, pilot courts in Jiangsu use the Jiangsu Court Perception AI Empowerment Platform to identify these abnormal behaviors automatically. The platform labels the video stream, overlaying event trigger frames and scene management frames to display warnings in real time, so that violations are corrected promptly and the orderliness of court trials and the solemnity of the judiciary are upheld.

Real-Time Interactive Audio and Video Across Multiple Business Scenarios

Pilot courts in Jiangsu use the Jiangsu Court Perception AI Empowerment Platform to improve the quality of audio and video interaction. To cope with poor network conditions during cross-network court sessions, the platform applies multiple weak-network countermeasures, including resistance to network fluctuation, packet loss compensation, adaptive bitrate, adaptive frame rate, adaptive video and image adjustment, dynamic multi-streaming, and adaptive send and receive buffering. It also provides audio echo cancellation, noise reduction, background noise elimination, and audio-video synchronization. In addition, facial recognition technology is used for person extraction and background image overlay. Together these capabilities support remote trials, remote interrogations, and remote mediation, removing geographical constraints and improving case handling efficiency.
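Of the weak-network countermeasures listed above, adaptive bitrate is the easiest to sketch. The idea is to lower the sending rate when the receiver reports heavy packet loss and to raise it again cautiously when the link is clean. The function name, thresholds, step sizes, and rate limits below are illustrative assumptions, not the platform's actual parameters.

```python
def next_bitrate(current_kbps, loss_ratio, min_kbps=300, max_kbps=4000):
    """Pick the next video bitrate from the most recent packet loss report.

    Assumed rule of thumb:
      - loss above 10%: back off in proportion to the loss
      - loss below 2%:  probe upward by 5%
      - otherwise:      hold the current rate
    """
    if loss_ratio > 0.10:
        target = current_kbps * (1.0 - 0.5 * loss_ratio)
    elif loss_ratio < 0.02:
        target = current_kbps * 1.05
    else:
        target = current_kbps
    # Keep the result inside the range the encoder can handle.
    return int(round(min(max(target, min_kbps), max_kbps)))


rate = 2000
for loss in [0.0, 0.0, 0.25, 0.30, 0.05, 0.01]:  # made-up loss reports
    rate = next_bitrate(rate, loss)
    print(f"loss={loss:.2f} -> {rate} kbps")
```

Backing off faster than ramping up is a common design choice for this kind of controller, since overshooting on a congested link makes the picture freeze, while undershooting only costs some sharpness.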

Online Retrieval of Multimodal Electronic Evidence and Key Information Extraction and Localization

Pilot courts in Jiangsu rely on the Jiangsu Court Perception AI Empowerment Platform to extract key information from audio and video evidence and locate it quickly. Second-instance judges can extract and automatically label key audio and video segments from first-instance trial recordings, allowing them to review the first-instance proceedings rapidly.

The platform also accepts case audio and video materials from the material transfer module of the electronic dossier system, and provides an upload entry for audio and video evidence through internet litigation service platforms and apps, so that parties can upload evidence themselves and electronic materials can be integrated. After audio and video evidence is added to a case, pilot courts run a playability test. Files that cannot be played normally go through a unified transcoding process, which generates a playable file for online retrieval while retaining the original file.
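A check-then-transcode step of this kind could look like the sketch below. It assumes the widely used ffprobe and ffmpeg command-line tools and treats a file as playable when all of its streams use codecs from a small supported set; the source names neither the tools nor the criteria, so the codec list, the output naming, and the function names are all hypothetical.

```python
import subprocess
from pathlib import Path

# Assumed set of codecs the online player can handle.
SUPPORTED_CODECS = {"h264", "aac", "mp3"}


def is_playable(path, run=subprocess.run):
    """Ask ffprobe for the codec of every stream and compare with the supported set."""
    cmd = ["ffprobe", "-v", "error", "-show_entries", "stream=codec_name",
           "-of", "csv=p=0", str(path)]
    result = run(cmd, capture_output=True, text=True)
    codecs = result.stdout.split()
    return result.returncode == 0 and bool(codecs) and set(codecs) <= SUPPORTED_CODECS


def ensure_playable(path, run=subprocess.run):
    """Return a playable file for online viewing; the original is never modified."""
    src = Path(path)
    if is_playable(src, run):
        return src
    dst = src.with_name(src.stem + ".playable.mp4")
    cmd = ["ffmpeg", "-y", "-i", str(src), "-c:v", "libx264", "-c:a", "aac", str(dst)]
    run(cmd, capture_output=True, text=True, check=True)
    return dst
```

Calling `ensure_playable("evidence/clip.mkv")` would return either the original path or the path of a newly transcoded copy next to it. The `run` parameter exists only so that the command execution can be replaced in tests.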

Figure: Online Retrieval of Electronic Evidence

Conclusion

Going forward, Jiangsu courts will focus on the overall layout of the Supreme People’s Court’s “One Network,” continue to improve the Jiangsu Court Perception AI Empowerment Platform, promote the deep integration of artificial intelligence with judicial practice, and work to extend court audio and video AI capabilities to all business applications, thereby improving the quality and efficiency of trial and enforcement work.

(This achievement was awarded second prize in the 2023 People’s Court Science and Technology Achievement Evaluation Activity)

Unit: Jiangsu High People’s Court

This issue’s editor: Li Jingyuan
