
[2402.13213] Probabilities of Chat LLMs Are Miscalibrated but Still Predict Correctness on Multiple-Choice Q&A

[Submitted on 20 Feb 2024 (v1), last revised 19 Mar 2025 (this version, v3)]

MoonCast: High-Quality Zero-Shot Podcast Generation

arXiv:2503.14345v1 Announce Type: cross Abstract: Recent advances in text-to-speech synthesis have achieved notable success in generating high-quality short utterances for ...

Clock and Calendar Understanding Challenges in Multimodal LLMs

[Submitted on 7 Feb 2025 (v1), last revised 18 Mar 2025 (this version, v2)]

Min-p Sampling for Creative and Coherent LLM Outputs

[Submitted on 1 Jul 2024 (v1), last revised 16 Mar 2025 (this version, v3)]

a Tool for Fine-Grained Machine-Generated Text Detection

[Submitted on 8 Aug 2024 (v1), last revised 14 Mar 2025 (this version, v3)] Authors: Mervat Abassy, Kareem Elozeiri, Alexander Aziz, ...

[2502.17308] Implicit Word Reordering with Knowledge Distillation for Cross-Lingual Dependency Parsing

[Submitted on 24 Feb 2025 (v1), last revised 14 Mar 2025 (this version, v2)]

[2503.09701] Have LLMs Made Active Learning Obsolete? Surveying the NLP Community


[2501.14073] LLMs are Vulnerable to Malicious Prompts Disguised as Scientific Language

[Submitted on 23 Jan 2025 (v1), last revised 18 Feb 2025 (this version, v2)]

Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings

[Submitted on 18 Dec 2024 (v1), last revised 18 Feb 2025 (this version, v3)]

A Review of Tools & Resources

[Submitted on 24 Jun 2024 (v1), last revised 17 Feb 2025 (this version, v4)] Authors: Shayne Longpre, Stella Biderman, Alon Albalak, ...