Medical Text Annotation Tool | A Great Partner for NLP

Deep learning is now a core technique in natural language processing, and supervised learning is one of its most important methods. Supervised tasks such as sequence labeling and text classification both depend on large amounts of labeled data for model training.

How to annotate corpora more efficiently with tools has therefore become a common concern. To this end, the OMAHA Alliance HiTA service platform has launched an updated medical text annotation tool. It pre-annotates predefined entities in text data and lets users manually review and re-annotate the results, significantly reducing the time needed to complete text annotation.

The medical text annotation tool provided by the HiTA service platform has the following highlights:

1. Supports pre-annotation of entities in the OMAHA semantic model

2. Supports configuration of custom text annotation schemes, which can be modified during the annotation process

3. Supports batch deletion of specific annotation content

Peter is a data annotator at a company and has a clinical guideline document that needs annotation. His workflow with the medical text annotation tool (hereinafter referred to as “the tool”) is described below for reference.

Step 1: Configure the Annotation Scheme.

Peter reviewed the “HiTA System Scheme,” which covers five semantic types in the OMAHA semantic model (clinical findings, diseases, operations, observation operations, and drugs) and the relationships among them, and found that it did not meet his needs. He therefore chose a custom annotation scheme. In it, he selected six semantic types from the OMAHA semantic model (clinical findings, diseases, operations, observation operations, drugs, and people) and kept the system-recommended relationships for these six types. He also defined a new semantic type, “location,” and a new relationship, “medical location.”
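
To make the idea of a scheme concrete, here is a minimal Python sketch of how Peter's custom scheme could be modeled as data. It is not the tool's internal format: the class names, field names, and the choice of endpoints for “medical location” are assumptions made for illustration, and the system-recommended relationships are left out because the article does not list them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelationType:
    name: str
    source: str  # semantic type of the starting entity
    target: str  # semantic type of the target entity


@dataclass
class AnnotationScheme:
    semantic_types: set[str] = field(default_factory=set)
    relation_types: list[RelationType] = field(default_factory=list)

    def add_semantic_type(self, name: str) -> None:
        self.semantic_types.add(name)

    def add_relation_type(self, name: str, source: str, target: str) -> None:
        # A relationship may only connect semantic types the scheme defines.
        for endpoint in (source, target):
            if endpoint not in self.semantic_types:
                raise ValueError(f"unknown semantic type: {endpoint!r}")
        self.relation_types.append(RelationType(name, source, target))


# The six OMAHA semantic types Peter selected.
scheme = AnnotationScheme(semantic_types={
    "clinical findings", "diseases", "operations",
    "observation operations", "drugs", "people",
})

# Peter's own additions. The endpoints of "medical location" are an
# assumption; the article only gives the relationship's name.
scheme.add_semantic_type("location")
scheme.add_relation_type("medical location", source="operations", target="location")
```

Checking relationship endpoints against the defined semantic types is one simple way to keep a custom scheme consistent when it is edited later.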

Step 2: Create Annotation Project.

Upload the text file to be annotated, select an annotation scheme (the system scheme or a custom scheme), and click confirm. Once the annotation project is created, the project list shows its number of files and its configured scheme.

Step 3: View Pre-annotation Results.

Click the project name to view the pre-annotation results for the text. Currently, pre-annotation covers only entities in the OMAHA semantic model; relationships in the OMAHA semantic model, as well as user-defined semantic types and relationship types, are not yet pre-annotated.

Step 4: Manual Review and Re-annotation.

You can modify the pre-annotated content or annotate new entities and relationships.

  • Entity and Relationship Annotation

Entity Annotation: To add an annotation, select the word and choose the appropriate semantic type in the pop-up box. For a word that is already annotated, double-click the annotation result to change its semantic type, or right-click it to delete the annotation.

Relationship Annotation: To add a relationship, first click the annotation result of the starting entity, then click the annotation result of the target entity. Once the two are connected, select the relationship type. For an existing relationship, double-click the annotation result to change its relationship type, or right-click it to delete the annotation.

Any remarks that come up during annotation can be entered in the remarks field of the annotation box.
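
For readers who think in data structures, the sketch below shows one common way to represent what these clicks produce: an entity as a character span with a semantic type, and a relationship as a directed link from a starting entity to a target entity. The sample sentence, the field names, and the relationship name “treats” are made up for illustration and are not taken from the tool.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    id: int
    start: int           # character offset where the span begins (inclusive)
    end: int             # character offset where the span ends (exclusive)
    semantic_type: str
    remark: str = ""     # free-text note, like the remarks field


@dataclass
class Relation:
    source_id: int       # the starting entity (clicked first)
    target_id: int       # the target entity (clicked second)
    relation_type: str
    remark: str = ""


# A made-up sentence, used only to show the offsets.
text = "Aspirin is recommended for patients with coronary heart disease."

drug = Entity(id=1, start=0, end=len("Aspirin"), semantic_type="drugs")

mention = "coronary heart disease"
offset = text.index(mention)
disease = Entity(id=2, start=offset, end=offset + len(mention),
                 semantic_type="clinical findings", remark="check the type")

# Changing the semantic type of an existing annotation (the double-click case).
disease.semantic_type = "diseases"

# A directed relationship; "treats" is a hypothetical relationship name.
relation = Relation(source_id=drug.id, target_id=disease.id, relation_type="treats")

print(text[drug.start:drug.end], "->", text[disease.start:disease.end])
```

Storing offsets instead of only the annotated words keeps two mentions of the same word apart, which matters once relationships point at specific mentions.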

  • Batch Deletion of Annotation Results

The tool supports batch deletion of annotation results. Click “Batch Delete Annotations” and select the relationship types and semantic types to delete. For example, Peter selected “clinical findings” under “semantic type,” which deleted all annotation results with the “clinical findings” label. To batch delete all annotations on the text “treatment,” select “text,” enter “treatment” in the input box, and click confirm.
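
The filtering logic behind such a batch delete can be sketched in a few lines of Python. This illustrates the idea and is not the tool's implementation: the dictionary keys, the sample annotations, and the relationship names are hypothetical. Two behaviors are assumptions as well: text is matched exactly, and a relationship is dropped when either of its entities is deleted.

```python
def batch_delete(entities, relations, semantic_types=(), relation_types=(), text=None):
    """Return (entities, relations) with the matching annotations removed."""
    kept_entities = [
        e for e in entities
        if e["type"] not in semantic_types and (text is None or e["text"] != text)
    ]
    kept_ids = {e["id"] for e in kept_entities}
    kept_relations = [
        r for r in relations
        if r["type"] not in relation_types
        and r["source"] in kept_ids   # drop relationships whose endpoints
        and r["target"] in kept_ids   # were deleted (an assumption)
    ]
    return kept_entities, kept_relations


# Hypothetical annotations for illustration.
entities = [
    {"id": 1, "text": "fever", "type": "clinical findings"},
    {"id": 2, "text": "pneumonia", "type": "diseases"},
    {"id": 3, "text": "treatment", "type": "operations"},
]
relations = [
    {"source": 1, "target": 2, "type": "manifestation of"},
    {"source": 3, "target": 2, "type": "treats"},
]

# Delete everything labeled "clinical findings", as Peter did.
by_type = batch_delete(entities, relations, semantic_types={"clinical findings"})

# Delete every annotation whose text is exactly "treatment".
by_text = batch_delete(entities, relations, text="treatment")
```

The function returns new lists and leaves its inputs untouched, so the result can be inspected before it replaces the originals.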

  • Modification of Annotation Scheme

During annotation, Peter found that the text contained semantic types and relationships he had forgotten to include in the custom annotation scheme. He therefore chose to modify the annotation scheme on the left side, added the required semantic types and relationship types, and continued annotating.

Step 5: Export Annotation Results.

In the annotation project interface, you can export the annotation results in two formats: txt and json. The txt file contains the entity annotation results and the relationship annotation results. The json file contains the user-imported text content, semantic type configuration information, entity annotation results, attribute relationship configuration information, relationship annotation results, and so on. For detailed text specifications and field descriptions, click “HiTA User Guide” in the upper right corner and open the “annotation tool” section.
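
Exported entity annotations are usually converted into per-token labels before they are used to train a sequence labeling model. The Python sketch below shows the widely used BIO tagging scheme at character level. It assumes each entity is available as a start offset, an end offset, and a semantic type; these field names are hypothetical, and the actual export layout is the one described in the “HiTA User Guide.”

```python
def spans_to_bio(text, entities):
    """Convert character-span entities into one BIO tag per character."""
    tags = ["O"] * len(text)
    for ent in sorted(entities, key=lambda e: e["start"]):
        start, end, label = ent["start"], ent["end"], ent["type"]
        if start >= end:
            continue  # ignore empty spans
        if any(tag != "O" for tag in tags[start:end]):
            continue  # ignore a span that overlaps an earlier one
        tags[start] = f"B-{label}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters
    return tags


# A made-up example with two entities.
text = "fever and cough"
entities = [
    {"start": 0, "end": 5, "type": "finding"},
    {"start": 10, "end": 15, "type": "finding"},
]

for char, tag in zip(text, spans_to_bio(text, entities)):
    print(repr(char), tag)
```

Character-level tags are a common choice for Chinese text, which has no spaces between words. For word-level models, the same idea applies after the text is split into tokens.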

The text annotation tool is currently open only to OMAHA service agencies, which can log in with their platform accounts to use it. For more about its functions, log in to the HiTA service platform and see the “HiTA User Guide.”

Contact Us

WeChat Official Account: OMAHA君 (WeChat ID: omaha-phr)

HiTA Service Email: [email protected]

Head of the Digital Medicine Knowledge Center: Xu Meilan, [email protected]

Recommended Reading:

Terminology Standardization Tool | A Powerful Assistant for Standardizing Heterogeneous Data Processing

OMAHA HiTA: Metadata | Terminology | Knowledge Graph

For healing, we choose openness and sharing
