When language models are tuned to maximize sales, votes, or clicks, they begin to deceive—even under “truthful” instructions, a new Stanford report says.