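10 days ago
AI "weak spot" exposed: study finds GPT-4 Turbo answers advanced history questions with only 46% accuracy
The results, presented last month at the prominent AI conference NeurIPS, show that even the best-performing model, GPT-4 Turbo, reached only 46% accuracy, not much better than random guessing. Paper co-author Maria del Rio-Chanona, an associate professor of computer science at University College London, ...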
凤凰网
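10 days ago
AI "weak spot" exposed: study finds GPT-4 Turbo answers advanced history questions with only 46% accuracy
The research team developed a benchmark tool called "Hist-LLM", which checks the correctness of answers against the Seshat Global History Databank, a database that takes the ancient Egyptian wisdom ...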
earth
8 days ago
AI struggles to understand human history and fails miserably when tested
The study, which is the first of its kind, evaluates the historical knowledge of leading AI models such as ChatGPT-4, Llama, ...
11 days ago
on MSN
AI isn’t very good at history, new paper finds
AI might excel at certain tasks like coding or generating a podcast. But it struggles to pass a high-level history exam, a ...
PsyPost on MSN
8 days ago
AI models struggle with expert-level global history knowledge
Researchers recently evaluated the ability of advanced artificial intelligence (AI) models to answer questions about global ...
Digital information world
6 days ago
AI Models Struggle with Historical Accuracy, GPT-4 Turbo Only Scores 46%
According to a new study, many AI models don't answer accurately about world history which is a very concerning matter. The ...
on MSN
9 days ago
Can AI pass a Ph.D.-level history test? New study says 'not yet'
For the past decade, complexity scientist Peter Turchin has been working with collaborators to bring together the most ...
EurekAlert!
10 days ago
Can ChatGPT pass a Ph.D.-level history test?
Peter Turchin, from the Complexity Science Hub, and an international team of collaborators decided to evaluate the historical ...