
Facebook AI has built a system called TextStyleBrush that ca | Gradient Dude

Facebook AI has built a system called TextStyleBrush that can replace text in both scene images and handwriting, in one shot, using only a single example word.
The model is trained in a self-supervised way because it is very hard to collect labeled pairs of the same text under different conditions, and to annotate segmentation masks for text (although I think this could be done with synthetic generation).

The model is trained to handle unlimited text styles: not just different typography and calligraphy, but also transformations such as rotation, curved text, and the deformations that happen between pen and paper in handwriting, as well as background clutter and image noise. The main idea is to disentangle the content of a text image from all aspects of the appearance of the entire word box. The representation of the overall appearance can then be applied as a one-shot transfer, without retraining on samples of the novel source style.

The model consists of a style encoder, content encoder, and stylized text generator (plus a bunch of losses).
The generator architecture is based on the StyleGAN2 model. However, the design of StyleGAN2 has an important limitation: StyleGAN2 is an unconditional model, meaning it generates images by sampling a random latent vector. For generating photo-realistic text images, however, one needs to control the output based on two separate sources: the desired text content and style. This is solved by extracting layer-specific style information and injecting it at each layer of the generator (it is some sort of conditional instance normalization).
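
To make the "conditional instance normalization" remark concrete, here is a minimal pure-Python sketch of the general idea: each layer's feature channels are normalized, then rescaled and shifted by parameters predicted from a style vector. This is not the paper's exact mechanism. The function names, the linear style-to-parameter map, and all numbers are assumptions made up for illustration.

```python
import math

def instance_norm(channel, eps=1e-5):
    """Normalize one feature channel (a flat list of activations) to zero mean, unit variance."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return [(x - mean) / math.sqrt(var + eps) for x in channel]

def style_to_layer_params(style_vector, weights):
    """Hypothetical linear map from a style vector to one layer's (scales, shifts)."""
    out = [sum(w * s for w, s in zip(row, style_vector)) for row in weights]
    half = len(out) // 2
    return out[:half], out[half:]

def conditional_instance_norm(features, scales, shifts):
    """Normalize each channel, then apply the style-dependent scale and shift."""
    return [
        [scale * x + shift for x in instance_norm(channel)]
        for channel, scale, shift in zip(features, scales, shifts)
    ]

# Toy example: one layer with 2 channels of 4 activations each.
features = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
style = [0.5, -1.0]                                  # stand-in for a style encoder output
layer_weights = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # made-up per-layer map

scales, shifts = style_to_layer_params(style, layer_weights)
styled = conditional_instance_norm(features, scales, shifts)
```

In a real generator each layer would have its own learned map, so the same style vector yields different scales and shifts at every resolution. The content features would come from the content encoder rather than being fixed lists.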

The losses are the following: 1) reconstruction and cycle losses; 2) a discriminator real/fake loss; 3) a recognizer loss, from a network that reads the text in the stylized image and makes sure that no content is lost; 4) a typeface classifier loss, from a pretrained network that measures how well the generator captures the style of the input.
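
As a rough sketch of how such terms might be combined into one generator objective, the snippet below computes two toy L1 terms and adds them to placeholder values for the other losses as a weighted sum. The weights, the toy "images", and the placeholder loss values are all hypothetical. The post does not say how the terms are weighted.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized flat images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def generator_objective(terms, weights):
    """Weighted sum of named loss terms; every name in `weights` must exist in `terms`."""
    return sum(weights[name] * terms[name] for name in weights)

# Toy values standing in for network outputs (all hypothetical).
source = [0.0, 0.5, 1.0, 0.5]
reconstructed = [0.0, 0.5, 0.5, 0.5]   # stand-in for the reconstructed source image
cycled = [0.0, 0.25, 1.0, 0.5]         # stand-in for the cycle-reconstructed image

terms = {
    "reconstruction": l1(source, reconstructed),
    "cycle": l1(source, cycled),
    "adversarial": 0.7,   # placeholder for the discriminator real/fake term
    "recognizer": 0.2,    # placeholder for the text recognition term
    "typeface": 0.4,      # placeholder for the typeface classifier term
}
weights = {name: 1.0 for name in terms}  # equal weights, chosen only for illustration

total = generator_objective(terms, weights)
```

In practice each placeholder would be computed by its own network, and the weights would be tuned so that no single term dominates training.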

Results are quite striking!
Now imagine driving through the busy streets of Hong Kong and seeing street signs translated in real time and projected onto the windshield of your car. Or one day we may send personalized messages by generating creative images with the text embedded in them (instead of stickers).

Blogpost
Paper