Idea
Create a dataset of moral dilemmas (trolley problems, real-world ethical cases, and the like) and survey how different people respond to them. Then use that dataset to evaluate whether existing LLMs (GPT-4, Claude, etc.) mimic human responses, diverge in systematic ways, or exhibit biases. Explore the implications for alignment.
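A minimal sketch of what the evaluation loop might look like, assuming a JSON dataset where each dilemma carries the human response distribution collected in the survey. The dataset file `dilemmas.json`, its schema, the model name, and the sampling count are all illustrative assumptions, not part of the idea as stated.

```python
import json
import math

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_model(dilemma: str, options: list[str], n: int = 20) -> dict[str, float]:
    """Sample the model n times and return its empirical choice distribution."""
    prompt = f"{dilemma}\n\nAnswer with exactly one of: {', '.join(options)}."
    counts = {o: 0 for o in options}
    for _ in range(n):
        reply = client.chat.completions.create(
            model="gpt-4o",  # assumed model name
            messages=[{"role": "user", "content": prompt}],
            temperature=1.0,
        )
        answer = reply.choices[0].message.content.strip().lower()
        for o in options:
            if o.lower() in answer:
                counts[o] += 1
                break
    total = sum(counts.values()) or 1
    return {o: c / total for o, c in counts.items()}


def kl_divergence(p: dict[str, float], q: dict[str, float], eps: float = 1e-9) -> float:
    """KL(human || model): how far the model's distribution is from the humans'."""
    return sum(p[o] * math.log((p[o] + eps) / (q.get(o, 0.0) + eps)) for o in p)


# Assumed schema: [{"text": ..., "options": [...], "human_dist": {option: prob}}, ...]
with open("dilemmas.json") as f:
    dilemmas = json.load(f)

for d in dilemmas:
    model_dist = ask_model(d["text"], d["options"])
    div = kl_divergence(d["human_dist"], model_dist)
    print(f"{d['text'][:60]}...  KL(human||model) = {div:.3f}")
```

Sampling the model repeatedly rather than taking a single answer makes the comparison distribution-to-distribution, so systematic divergence shows up as a consistently high KL score rather than a one-off disagreement.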
Citation
Cited as:
Yotam, Kris. (Jul 2025). Simulating Human Moral Judgment in LLMs. krisyotam.com. https://krisyotam.com/papers/ai/simulating-moral-judgment-llms
Or
@article{yotam2025simulating-moral-judgment-llms,
  title   = "Simulating Human Moral Judgment in LLMs",
  author  = "Yotam, Kris",
  journal = "krisyotam.com",
  year    = "2025",
  month   = "Jul",
  url     = "https://krisyotam.com/papers/ai/simulating-moral-judgment-llms"
}