CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters

Difference between BERT and CharacterBERT

CharacterBERT is a variant of BERT that goes back to the simpler approach of producing a single embedding for each word (or rather, token). In practice, the only difference is that instead of relying on WordPieces, CharacterBERT uses a CharacterCNN module just like the one that was used in ELMo [1]. The next figure shows the inner mechanics of the CharacterCNN and compares it to the original WordPiece system in BERT.
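To make the contrast concrete, here is a minimal sketch of the kind of input encoding a CharacterCNN consumes: each word becomes one fixed-length sequence of character IDs (and ultimately one embedding), rather than a variable number of WordPiece IDs. The constants and the helper name below are illustrative assumptions, not the actual CharacterBERT implementation.

```python
# Hedged sketch of an ELMo-style character encoding for one word.
# All constants here are assumptions chosen for illustration.

MAX_WORD_LEN = 50   # assumed fixed character budget per word
PAD_ID = 0          # 0 reserved for padding
BOW_ID = 256        # hypothetical begin-of-word marker
EOW_ID = 257        # hypothetical end-of-word marker

def word_to_char_ids(word: str) -> list:
    """Map one word to a fixed-length list of character IDs.

    UTF-8 bytes are shifted by 1 so that 0 stays reserved for padding;
    the word is framed with begin/end-of-word markers, then padded.
    A CharacterCNN would consume this sequence and emit a single
    word-level embedding, regardless of how rare the word is.
    """
    byte_ids = [b + 1 for b in word.encode("utf-8")[: MAX_WORD_LEN - 2]]
    ids = [BOW_ID] + byte_ids + [EOW_ID]
    return ids + [PAD_ID] * (MAX_WORD_LEN - len(ids))

# An out-of-vocabulary medical term is still one unit, not several pieces.
ids = word_to_char_ids("apoptosis")
print(len(ids))  # always MAX_WORD_LEN: one fixed-size input per word
```

The point of the sketch is the contrast with WordPieces: a rare word like "apoptosis" might be split into several subword IDs by BERT's tokenizer, whereas a character-based input layer always yields exactly one representation per word, which keeps the model open-vocabulary at the word level.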

November 9, 2020 · 7 min · Hicham EL BOUKKOURI