Nicole Filz, Saif eddine Hasnaoui, Sarah Alaghbari
March 8, 2024
Deciphering the nuances of communication can be quite challenging, not least understanding the intent of a written text. However, there are technologies that can accurately understand content written by humans. One of them is the elevait suite, which helps users organize their documents by automatically labeling them, allowing for an easier distinction of document types such as invoices, curricula, or plans.
What lies behind such seemingly magical features is Natural Language Processing (NLP), an intersection of linguistics and computer science that focuses on enabling computers to understand human language and attempts to gain meaningful insights from textual data.
The basis for such NLP systems is a process called text annotation. During text annotation, data is streamlined for machines by providing clear markers for relevant information, which add valuable context and meaning to the raw text data. Through careful analysis and tagging performed by humans, it is then possible to derive patterns and to apply those insights to unknown text documents, in order to categorize them and to identify recurring entities such as names or places.
At elevait, we have developed a custom Text Annotation Tool (TAT), which allows us to train and frequently retrain our models - because just like us, an AI never stops learning, and new domains require new training. Are you wondering how we approached the development of this tool, and which challenges we faced? We’ll tell you, from two different perspectives: the UX integration and the development view! ✨
Since we were lucky enough to work closely with the future users of the TAT, we could easily establish a routine of frequent discussions and exchanges with them. First, we collected and analyzed their requirements and current workflows in order to understand and prioritize their feature requests. Of course, we also tried our best to take special requests into account, for example the possibility to translate the text to be annotated. 🤝
Among the core features requested by the users were the flexible definition of labels, an automated review mechanism, and translation support.
Next, it was time to talk visuals. We analyzed similar tools regarding their advantages and disadvantages (you wouldn’t want to reinvent the wheel all over again). Based on those insights and the requested features, we then created a mockup - a purely visual draft of the user interface showing the rough layout of the UI elements and their basic interaction. The users then provided us with valuable feedback during a further exchange round, which we gratefully integrated into the development.
We established a continuous feedback loop with the users. This iterative procedure enabled us to continuously improve the software and ensure that it met the user needs and expectations in the best possible way.
Now that we know some of the users’ pain points, the question remains: which annotations can be performed inside the TAT? In general, the tool supports two types of annotation tasks: assigning labels to selected words or passages within a text, and assigning classes to whole documents.
The labels and classes to be used can be defined by authorized users, before as well as during the annotation process. In this way, the TAT stays generic and can be flexibly used for various use cases and domains.
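As a sketch of what this flexibility could look like, here is a minimal label registry in TypeScript - the names and structure are purely illustrative assumptions, not the actual elevait data model:

```typescript
// Illustrative sketch only: labels can be registered for a task type
// before or during annotation, keeping the tool generic across domains.
type TaskType = "span-labeling" | "document-classification";

interface LabelDefinition {
  name: string;
  taskType: TaskType;
}

class LabelRegistry {
  private labels = new Map<string, LabelDefinition>();

  // Authorized users can add new labels at any time.
  add(label: LabelDefinition): void {
    this.labels.set(label.name, label);
  }

  // List all labels available for a given annotation task type.
  forTask(taskType: TaskType): LabelDefinition[] {
    return [...this.labels.values()].filter(l => l.taskType === taskType);
  }
}
```

Because labels are plain data rather than hard-coded options, adding a new domain boils down to registering its labels.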
Individual text elements are annotated by first selecting the text and then assigning a label. After these annotations have been made and confirmed, a review is carried out - either manually by a reviewer or automatically.
Since the interviews made clear that one big pain point was the high effort required for manually reviewing annotated data, we developed an auto-approval mechanism that relies on the agreement among annotators: admins can specify the number of matching annotations that is regarded as reliable. Once this number is reached, the corresponding annotations are automatically approved and don’t need to be revised manually. 🥳
This number of required annotations can be specified when setting up the annotation tasks. During this setup step, the task type needs to be selected, as well as the data sources and the users who should perform the annotations and reviews, respectively. A bunch of tasks will then be created automatically by mapping data to annotators - without any additional manual assignment effort.
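The agreement check itself can be sketched in a few lines. The following TypeScript function is a deliberate simplification (names and fields are illustrative assumptions, not our actual implementation): an annotation is auto-approved once enough annotators have assigned the same label to the same span.

```typescript
// Illustrative sketch of agreement-based auto-approval.
interface Annotation {
  documentId: string;
  label: string;      // the label assigned by one annotator
  spanStart: number;  // character offsets of the annotated span
  spanEnd: number;
}

// Returns true if at least `required` annotators agree on the same
// label for the same span; otherwise the item goes to manual review.
function isAutoApproved(annotations: Annotation[], required: number): boolean {
  const counts = new Map<string, number>();
  for (const a of annotations) {
    const key = `${a.documentId}:${a.spanStart}-${a.spanEnd}:${a.label}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.values()].some(c => c >= required);
}
```

With `required` set at task setup, this check runs after each confirmed annotation, so reviewers only ever see the disputed cases.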
At the very end, all finished and approved annotations can be exported and used for training purposes.
Let’s dive into the development now. During the implementation, we stumbled upon some fascinating challenges that needed to be tackled: enabling the communication between different components, handling diverse data, and providing a way for users to translate the text they annotate.
The text annotation tool is decomposed into numerous small, dedicated components, each handling one specific task. This approach addresses the difficulties of testing, debugging, and maintaining large components. The smaller components are designed to be reused in different contexts, both within the tool itself and across the whole application.
Challenge: The decomposition introduces a communication challenge between components, particularly when they dynamically alter the states of variables based on user-defined actions.
Solution: Reactive programming using NgRx. NgRx is a state management library for Angular applications inspired by the Redux pattern. It provides a predictable and centralized state management approach, making it easier to manage the state of the application in a scalable and maintainable way.
Key concepts are:
- Store: a single, centralized source of truth for the application state.
- Actions: plain objects describing events triggered by components or services.
- Reducers: pure functions that compute the next state from the current state and an action.
- Selectors: pure functions that derive slices of the state for components to consume.
- Effects: handle side effects such as backend calls, keeping the reducers pure.
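NgRx itself lives in the @ngrx/store package; the pattern behind it can be sketched in dependency-free TypeScript. The following toy example (state shape and action names are invented for illustration) shows how two components - one selecting text, one assigning a label - communicate through a store instead of calling each other directly:

```typescript
// Minimal Redux-style pattern: one state, typed actions, a pure reducer.
interface AnnotationState {
  selectedText: string | null;
  assignedLabel: string | null;
}

type Action =
  | { type: "selectText"; text: string }
  | { type: "assignLabel"; label: string };

const initialState: AnnotationState = { selectedText: null, assignedLabel: null };

// Pure reducer: next state is computed from current state and an action.
function reducer(state: AnnotationState, action: Action): AnnotationState {
  switch (action.type) {
    case "selectText":
      return { ...state, selectedText: action.text };
    case "assignLabel":
      return { ...state, assignedLabel: action.label };
  }
}

// Tiny store: components dispatch actions and subscribe to state changes,
// so no component needs a direct reference to another.
class Store {
  private listeners: Array<(s: AnnotationState) => void> = [];
  constructor(private state: AnnotationState) {}
  dispatch(action: Action): void {
    this.state = reducer(this.state, action);
    this.listeners.forEach(l => l(this.state));
  }
  subscribe(listener: (s: AnnotationState) => void): void {
    this.listeners.push(listener);
  }
  get(): AnnotationState { return this.state; }
}
```

In the real tool, NgRx additionally provides selectors for memoized state slices and effects for backend calls, but the dispatch/subscribe flow above is the core of the communication model.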
In essence, annotating involves the assignment of attributes to word(s) or classes to documents. However, these attributes and classes extend beyond mere data; they represent nodes interconnected through relationships, forming knowledge graphs. These knowledge graphs enable the integration of data from diverse sources and formats, presenting a unified view that facilitates improved decision-making and analysis.
Challenge: Alongside using knowledge graphs, we encounter the task of handling diverse data such as documents themselves, users, permissions, and so on.
Solution: We use two databases, each with its own purpose and its own data model.
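To make the graph idea concrete, here is a toy in-memory graph in TypeScript - the node types and relation names are purely illustrative, and a real knowledge graph would of course live in a dedicated database:

```typescript
// Illustrative sketch: annotated entities become nodes, connected by
// typed relationships that can later be traversed for analysis.
interface GraphNode {
  id: string;
  type: string;   // e.g. "Document", "Label", "Annotator" (illustrative)
}

interface GraphEdge {
  from: string;     // source node id
  to: string;       // target node id
  relation: string; // e.g. "hasLabel", "annotatedBy" (illustrative)
}

class KnowledgeGraph {
  private nodes = new Map<string, GraphNode>();
  private edges: GraphEdge[] = [];

  addNode(node: GraphNode): void { this.nodes.set(node.id, node); }
  addEdge(edge: GraphEdge): void { this.edges.push(edge); }

  // Follow a relation from one node to its neighbours.
  neighbours(id: string, relation: string): GraphNode[] {
    return this.edges
      .filter(e => e.from === id && e.relation === relation)
      .map(e => this.nodes.get(e.to)!)
      .filter(n => n !== undefined);
  }
}
```

Because data from different sources all ends up as nodes and edges in one graph, queries like "all documents labeled as invoices" become simple traversals.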
Within our dynamic annotation tool, documents can arrive through different channels over which we have no control - for example as emails or scanned invoices.
Challenge: Annotating documents can often pose difficulties, particularly when dealing with content in an unfamiliar language.
Solution: In response to the diverse linguistic challenges posed by documents, we introduced a translator. This feature offers annotators the possibility to choose the language they are familiar with, enabling the seamless translation of the text within a split-view interface. This translator is integrated as a service within our infrastructure to streamline the annotation process, ensuring accessibility and efficiency.
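Conceptually, the split view pairs each original text segment with its translation. The following TypeScript sketch uses an assumed `TranslationService` interface purely for illustration; the real translator is a separate service in our infrastructure:

```typescript
// Hypothetical interface standing in for the translation service.
interface TranslationService {
  translate(text: string, targetLang: string): string;
}

// One row of the split-view: original text next to its translation.
interface SplitViewRow {
  original: string;
  translated: string;
}

// Pair each segment of the document with its translation so the
// annotator can read the familiar language while labeling the original.
function buildSplitView(
  segments: string[],
  translator: TranslationService,
  targetLang: string,
): SplitViewRow[] {
  return segments.map(s => ({
    original: s,
    translated: translator.translate(s, targetLang),
  }));
}
```

Keeping the translator behind an interface like this also makes the UI testable with a fake implementation, independent of the actual translation backend.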
Are we done with the tool then? Not quite yet. Some implementation details remain to be tackled. One of them is word splitting within an annotation task, which demands a specific approach: our plan is to employ a machine-learning-based tokenizer that understands words based on their context. The conventional strategy relies on spaces and special characters as natural word dividers, but it breaks down when confronted with entities like emails, phone numbers, or addresses. In these instances, the conventional rules fail, demanding a more sophisticated mechanism capable of recognizing and preserving the integrity of these entities. This complexity is why we want a service-based, machine-learning tokenizer that follows regular language rules but can also adjust to different types of information, making sure our annotation process runs smoothly.
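To see why plain whitespace splitting is not enough, consider a tokenizer that matches such entities first. The patterns below are deliberately simplified toy regexes, not the planned solution - which, as described above, will be ML-based:

```typescript
// Toy illustration: entity patterns are tried before the generic word
// pattern, so emails and phone numbers survive as single tokens.
const EMAIL = /[\w.+-]+@[\w-]+\.[\w.]+/y;       // simplified email pattern
const PHONE = /\+?\d[\d\s/-]{5,}\d/y;           // simplified phone pattern
const WORD = /[\p{L}\p{N}]+/uy;                 // generic word pattern

function tokenize(text: string): string[] {
  const tokens: string[] = [];
  let i = 0;
  while (i < text.length) {
    let matched = false;
    for (const re of [EMAIL, PHONE, WORD]) {
      re.lastIndex = i; // sticky regexes match only at position i
      const m = re.exec(text);
      if (m) {
        tokens.push(m[0]);
        i = re.lastIndex;
        matched = true;
        break;
      }
    }
    if (!matched) i++; // skip whitespace and punctuation
  }
  return tokens;
}
```

Naive splitting would break "+49 351 123456" into three meaningless fragments; ordering the entity patterns first keeps it intact. A learned tokenizer generalizes this idea to entities no fixed regex can anticipate.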
Apart from this, we ran an internal testing session where some developers clicked their way through the tool. The findings, combined with our latest round of feedback, revealed multiple avenues for future development of the TAT, such as annotation statistics, improved accessibility, and maybe even elements of gamification.
Implementing the tool ourselves makes individual customization possible: functionality that is missing from existing solutions can be added at any time, and we can respond individually to the specific requirements of our users.
But for now, we are positive that the TAT can become a central training tool at elevait and ultimately help to bridge the gap between technology and human expression. After all, the benefits we can gain from NLP go beyond words.