Thank you very much. Responses to your questions are below.

1) Yes, the vocabulary was generated with the Hugging Face WordPiece tokenizer (BertWordPieceTokenizer).

2) When fine-tuning (I call it continual training in the article to avoid confusion with supervised fine-tuning), training started from the vectors learned during pre-training (these are stored in the pytorch_model.bin file).
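To make answer 1 concrete, here is a minimal sketch of building a WordPiece vocabulary with BertWordPieceTokenizer from the Hugging Face `tokenizers` package. The toy corpus, the vocabulary size, and the output directory are assumptions for illustration only, not the settings used in the article.

```python
import os
import tempfile

from tokenizers import BertWordPieceTokenizer

# Toy corpus (assumption): in practice this would be the domain text files.
corpus = [
    "the patient was given a low dose of aspirin",
    "aspirin reduces the risk of heart attack",
    "the dose was increased after two weeks",
]

tokenizer = BertWordPieceTokenizer(lowercase=True)

# vocab_size and min_frequency are illustrative values, not the article's.
tokenizer.train_from_iterator(
    corpus,
    vocab_size=100,
    min_frequency=1,
    special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"],
    show_progress=False,
)

# Write vocab.txt, the file a BERT model expects alongside its weights.
out_dir = tempfile.mkdtemp()
tokenizer.save_model(out_dir)

vocab = tokenizer.get_vocab()
tokens = tokenizer.encode("aspirin dose").tokens
print(len(vocab), tokens)
```

The resulting vocab.txt lists one token per line, with continuation pieces prefixed by `##`.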
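To illustrate answer 2, the sketch below shows what "starting from the pre-trained vectors" means at the PyTorch level: pytorch_model.bin is a serialized state dict, and continual training loads it before taking further optimizer steps. `TinyEncoder` is a hypothetical stand-in for BERT, and the self-prediction loss is a toy objective rather than real masked language modeling.

```python
import os
import tempfile

import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a BERT model (embeddings plus an LM head)."""

    def __init__(self, vocab_size=50, hidden=8):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, hidden)
        self.lm_head = nn.Linear(hidden, vocab_size)

    def forward(self, ids):
        return self.lm_head(self.embeddings(ids))


torch.manual_seed(0)

# Pretend this model has finished pre-training, then save its weights.
pretrained = TinyEncoder()
ckpt_path = os.path.join(tempfile.mkdtemp(), "pytorch_model.bin")
torch.save(pretrained.state_dict(), ckpt_path)

# Continual training: a fresh model is initialised from the checkpoint,
# so it starts from the pre-trained vectors instead of random ones.
model = TinyEncoder()
model.load_state_dict(torch.load(ckpt_path, map_location="cpu"))
starts_from_checkpoint = torch.equal(
    model.embeddings.weight, pretrained.embeddings.weight
)

# One toy training step (predicting each input token, not real MLM).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
ids = torch.randint(0, 50, (4, 6))
loss = nn.functional.cross_entropy(model(ids).view(-1, 50), ids.view(-1))
loss.backward()
optimizer.step()

# After the step, the vectors have moved away from the checkpoint values.
moved = not torch.equal(model.embeddings.weight, pretrained.embeddings.weight)
print(starts_from_checkpoint, moved)
```

With the Hugging Face `transformers` library, the same loading step is normally done by calling `from_pretrained` on the directory that contains the checkpoint.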
