tBERT
Original paper
https://www.aclweb.org/anthology/2020.acl-main.630.pdf
Why the idea came up
Topic information can provide an additional signal to the model for semantic similarity tasks, and integrating domain information in this way can effectively improve performance. That is why tBERT was proposed.
Main idea of the technology
A [CLS] position is reserved in BERT's input as usual; in tBERT its embedding is combined with topic information that represents the domain of the input.
The input sentences are also fed into a topic model, and the topic distributions of their words are averaged to obtain one topic vector per sentence.
At the same time, the sentence pair is fed into BERT; the resulting [CLS] embedding is concatenated with the two topic vectors for classification (see the sketch below).
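Below is a minimal sketch of this architecture in PyTorch, assuming a trained gensim LDA model supplies the word-level topic distributions; the names (`TBert`, `sentence_topic_vector`) and details are illustrative, not the authors' released code.

```python
# Illustrative sketch of the tBERT architecture, not the authors' code.
# Assumes a trained gensim LDA model and its dictionary are available.
import numpy as np
import torch
import torch.nn as nn
from transformers import BertModel


def sentence_topic_vector(tokens, lda, dictionary):
    """Average the per-word topic distributions of a sentence (hypothetical helper)."""
    vecs = []
    for tok in tokens:
        if tok in dictionary.token2id:
            dist = np.zeros(lda.num_topics)
            for topic_id, prob in lda.get_term_topics(
                dictionary.token2id[tok], minimum_probability=0.0
            ):
                dist[topic_id] = prob
            vecs.append(dist)
    return np.mean(vecs, axis=0) if vecs else np.zeros(lda.num_topics)


class TBert(nn.Module):
    """BERT [CLS] embedding concatenated with two sentence topic vectors."""

    def __init__(self, num_topics, hidden_size=768, num_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        # [CLS] embedding plus one topic vector per sentence.
        self.hidden = nn.Linear(hidden_size + 2 * num_topics, hidden_size)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, attention_mask, token_type_ids, topics1, topics2):
        out = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
        cls = out.last_hidden_state[:, 0]  # [CLS] embedding of the sentence pair
        combined = torch.cat([cls, topics1, topics2], dim=-1)
        return self.classifier(torch.tanh(self.hidden(combined)))
```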
The paper does not go into detail on the design of the loss. From my understanding, the [CLS] representation should be included in the loss, and since there is no other digest article on tBERT, this should be confirmed in further research.
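Under that reading, the [CLS] embedding would enter the loss through the classifier logits, trained with standard cross-entropy. A hypothetical training step, continuing the sketch above and assuming `model`, `optimizer`, `labels`, and the batch tensors are already defined:

```python
# Hypothetical training step for the TBert sketch above: the [CLS]
# embedding reaches the loss only through the classifier logits.
loss_fn = nn.CrossEntropyLoss()
logits = model(input_ids, attention_mask, token_type_ids, topics1, topics2)
loss = loss_fn(logits, labels)  # labels: (batch,) class indices
loss.backward()
optimizer.step()
```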
Thinking
Fine-tuning can also help a model deal with domain shift, but the experiments showed that tBERT performs better than plain fine-tuning, especially in specialized domains, e.g., medical, finance, etc.