
The WordPiece Tokenizer

Learn about the WordPiece tokenizer and how it works.

BERT uses a WordPiece tokenizer, which follows the subword tokenization scheme: a word that appears in the vocabulary is kept whole, and a word that does not is split into smaller subword units. Let's see how the WordPiece tokenizer works with the help of an example. Consider the following sentence:

Tokenize the sentence

Now, if we tokenize the sentence using the WordPiece tokenizer, then we obtain the ...
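To make the idea of subword splitting concrete, here is a minimal sketch of greedy longest-match-first tokenization in the WordPiece style. The toy vocabulary, the sample sentence, and the function names `wordpiece_tokenize` and `tokenize` are assumptions chosen for illustration; BERT's real vocabulary holds roughly 30,000 entries, and its tokenizer also handles punctuation and casing rules that this sketch skips.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into subwords by greedy longest-match-first lookup."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            piece = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the "##" prefix.
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece fits, so the whole word maps to the unknown token.
            return [unk_token]
        tokens.append(match)
        start = end
    return tokens


def tokenize(sentence, vocab):
    """Lowercase, split on whitespace, then tokenize each word."""
    tokens = []
    for word in sentence.lower().split():
        tokens.extend(wordpiece_tokenize(word, vocab))
    return tokens


# Hypothetical toy vocabulary, for illustration only.
toy_vocab = {"the", "model", "is", "pre", "##train", "##ing"}

print(tokenize("The model is pretraining", toy_vocab))
# ['the', 'model', 'is', 'pre', '##train', '##ing']
```

In this sketch, "pretraining" is not in the vocabulary, so it is split into `pre`, `##train`, and `##ing`. The `##` prefix marks a piece that continues the preceding token instead of starting a new word.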
