Assessing Rationality in Language Models
Fall 2024 - Present
Partners: Dr. Thomas Hofweber, MURGe-Lab
This project is significant for evaluating whether language models live up to the norms of rationality, which models might do better than others, and how to improve their internal coherence. It is a follow-up to the project described here: https://arxiv.org/abs/2406.03442. A series of experiments has been worked out conceptually but not yet coded and evaluated. The experiments involve evaluating language models for a certain kind of coherence tied to rationality, which can be calculated from next-token probabilities (I can explain the details; the basic idea is spelled out in the paper cited above). Carrying out the experiments requires fairly simple calculations on the next-token probabilities that various language models assign given a certain prompt; a sketch of how such probabilities can be extracted appears below.
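To give a concrete sense of the kind of computation involved, here is a minimal sketch of extracting next-token probabilities from a prompt, assuming a Hugging Face causal language model. The model name, prompt, and candidate answers are placeholders for illustration, not part of the planned experiments.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2"  # placeholder; any causal LM with a Hugging Face checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def next_token_probs(prompt: str) -> torch.Tensor:
    """Return the probability distribution over the next token given a prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)
    # Softmax over the logits at the final position gives the next-token distribution.
    return torch.softmax(logits[0, -1, :], dim=-1)

# Hypothetical example: compare the probability mass a model assigns to two answers.
probs = next_token_probs("Is Paris the capital of France? Answer:")
for answer in [" Yes", " No"]:
    token_id = tokenizer.encode(answer)[0]
    print(answer.strip(), probs[token_id].item())

Quantities like these, computed across related prompts, are the raw material for the coherence measures described in the paper cited above.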