A major breakthrough in AI handwriting generation

A team from Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) has developed an AI system capable of imitating a person’s handwriting style from a few paragraphs of original writing. The researchers, who presented their initial results at the 2021 International Conference on Computer Vision (ICCV), recently received a patent for the tool from the United States Patent and Trademark Office.

The team that presented “Handwriting Transformers” included Rao Muhammad Anwer, Assistant Professor of Computer Vision; Salman Khan, Associate Professor of Computer Vision; Fahad Shahbaz Khan, Deputy Head of the Department of Computer Vision and Professor of Computer Vision; and Ankan Kumar Bhunia.

Previous research has relied on generative adversarial networks (GANs). Although these approaches can capture a writer’s general style, such as the slant of the writing or the width of the strokes that make up letters, they encounter two main problems.

First, the link between style and content is weak: the two are processed separately and then fused, so there is no explicit style-content entanglement at the character level. Second, they do not explicitly encode local style patterns, such as character style and ligatures, which can be found, for example, in the word heart or the Latin phrase ex aequo.

To overcome these limitations, the researchers took a new approach using vision transformers, a type of neural network designed for computer vision tasks.

Fahad Khan explains:

“To imitate someone’s writing style, we want to look at the entire text, and only then will we begin to understand how the writer linked the characters, how the writer linked closely spaced letters or words. All of these tasks require some kind of global receptive field, which is not easy with convolutional neural networks. We identified this gap in existing methods and adopted this transformer-based method.”
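
The “global receptive field” mentioned in the quote can be illustrated with a toy experiment. The sketch below is not the HWT architecture: it assumes a one-dimensional sequence of numbers in place of image features, a three-tap convolution, and a single self-attention step with identity projections. The helper names (conv1d, self_attention, affected_positions) are hypothetical and chosen only for this illustration. It perturbs the first input value and reports which output positions change in each case.

```python
import math

def conv1d(x, kernel):
    """Same-padded 1D convolution (cross-correlation) over a list of floats."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        total = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                total += w * x[j]
        out.append(total)
    return out

def self_attention(x):
    """Single-head self-attention on scalars.

    Simplifying assumption: queries, keys and values are the inputs
    themselves (identity projections), with no learned weights.
    """
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(w / z * v for w, v in zip(weights, x)))
    return out

def affected_positions(layer, x, index, delta=0.5):
    """Return the output positions that change when x[index] is perturbed."""
    base = layer(x)
    perturbed = list(x)
    perturbed[index] += delta
    changed = layer(perturbed)
    return [i for i, (a, b) in enumerate(zip(base, changed)) if abs(a - b) > 1e-9]

x = [0.1, 0.4, -0.3, 0.8, 0.2, -0.5, 0.6, 0.3]
conv_reach = affected_positions(lambda v: conv1d(v, [0.25, 0.5, 0.25]), x, 0)
attn_reach = affected_positions(self_attention, x, 0)
print(conv_reach)  # [0, 1]
print(attn_reach)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

In this toy setting, a change at the start of the sequence reaches only its immediate neighbour after one convolution, while a single attention step propagates it to every position. Stacking convolutions widens their reach gradually, which is one way to read the remark that a global receptive field is “not easy” with convolutional networks.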

The scientists compared their approach to generating handwritten text images, HWT (Handwriting Transformers), with two other handwriting generation methods. They asked 100 people to evaluate the text generated by the different models, and participants preferred HWT over the other generators in 81% of cases.

A qualitative comparison of HWT with two other handwriting generators, GANwriting and Davis et al. All three generators were instructed to produce the same text: “No two people can write exactly the same way, just as no two people can have the same fingerprints.” All three were trained on handwritten text samples (far left column) from six different writers. The Davis et al. method captures a writer’s general style, for example, the slant of the text, but has difficulty imitating character-specific stylistic details. GANwriting is limited in the length of the words it can imitate and was not able to complete the provided textual content; for example, it generated the word “accurate” instead of “precisely”. The MBZUAI researchers’ approach better mimics global and local style patterns, thus generating more realistic writing.

When shown both original and generated text, participants were unable to distinguish between the two, which supports the performance of the AI system.

Although this advance opens the way to promising applications, the researchers are aware of the ethical implications of their technology and warn of the potential for forgery and other abuses. They emphasize the need to take steps to counter such misuse as part of responsible deployment.

Rao Muhammad Anwer states:

“We are very cautious about this because it can be misused. Handwriting represents a person’s identity, so we think carefully about it before implementing it.”

Article reference: MBZUAI blog


