Knowledge Quiz
Test your understanding of this article
1. What is the primary reason language model (LM) probability is considered an unreliable quality estimator according to the article?
2. Which of the following is NOT identified as a reason for the issue of misleading low output quality in LMs when multiple output options are valid?
3. What is the main purpose of the 'Sigmoid Head' proposed in the article?
4. How does the Sigmoid Head address the limitation related to LMs' training data using single, one-hot encoded references?
