Thank you very much. Responses to your questions are below.

Nov 8, 2020

1) Yes, the vocabulary was generated with Hugging Face's WordPiece tokenizer (BertWordPieceTokenizer, from the tokenizers library).
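For anyone who wants to try this, here is a minimal sketch of generating a WordPiece vocabulary with BertWordPieceTokenizer. It assumes the Hugging Face tokenizers package is installed. The function name, corpus paths, vocabulary size, and casing option are placeholders I picked for illustration, not the settings used in the article.

```python
import os


def train_wordpiece_vocab(corpus_files, output_dir, vocab_size=30000):
    """Train a WordPiece vocabulary and write vocab.txt into output_dir.

    Hypothetical helper: the name and default values are illustrative only.
    """
    # Imported here so the sketch can be read without the package installed.
    from tokenizers import BertWordPieceTokenizer

    # lowercase=False keeps the original casing (an assumption for this sketch).
    tokenizer = BertWordPieceTokenizer(lowercase=False)

    # Learn the vocabulary from plain-text files, one sentence or document per line.
    tokenizer.train(
        files=list(corpus_files),
        vocab_size=vocab_size,
        min_frequency=2,
        show_progress=False,
    )

    # save_model expects the directory to exist already.
    os.makedirs(output_dir, exist_ok=True)
    tokenizer.save_model(output_dir)
    return os.path.join(output_dir, "vocab.txt")
```

Calling train_wordpiece_vocab(["corpus.txt"], "my_vocab") would then produce my_vocab/vocab.txt, which a BERT tokenizer can load.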

2) When fine-tuning (I call it continual training in the article to avoid confusion with supervised fine-tuning), training started from the pre-trained vectors produced by pre-training (these are stored in the pytorch_model.bin file).
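As a rough sketch of what "starting from the pre-trained vectors" can look like in code, the snippet below loads a checkpoint directory with the Hugging Face transformers library and returns a masked-language-model ready for further training. This assumes the directory also holds config.json and vocab.txt next to pytorch_model.bin. The helper names are hypothetical, and this is not necessarily the tooling used for the article.

```python
from pathlib import Path


def find_checkpoint(model_dir):
    """Return the path to pytorch_model.bin in model_dir, or raise if it is missing."""
    path = Path(model_dir) / "pytorch_model.bin"
    if not path.is_file():
        raise FileNotFoundError(f"No pytorch_model.bin found in {model_dir}")
    return path


def load_for_continual_training(model_dir):
    """Load tokenizer and pre-trained weights so masked-LM training can continue.

    Hypothetical helper, assuming a Hugging Face style checkpoint directory.
    """
    # Imported here so find_checkpoint works without transformers installed.
    from transformers import BertForMaskedLM, BertTokenizerFast

    find_checkpoint(model_dir)  # fail early if the weights file is absent

    tokenizer = BertTokenizerFast.from_pretrained(model_dir)
    # from_pretrained initializes the model from the saved weights
    # instead of random initialization.
    model = BertForMaskedLM.from_pretrained(model_dir)
    model.train()  # switch to training mode (enables dropout)
    return tokenizer, model
```

From there, training continues with the same masked-language-model objective on the new corpus, so the weights are updated rather than learned from scratch.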

Written by Ajit Rajasekharan
