Jonathan Star 2024-10-30 16:52 Acceptance rate: 70.5%
Views: 3

How should the dataset used for ARES RAG evaluation be understood?

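How was this file obtained? Was it produced by an LLM + RAG system that takes a Query as input and outputs an Answer, with Context_Relevance_Label, Answer_Relevance_Label, and Answer_Faithfulness_Label being a few examples labeled by hand?
But what is the Document column? A RAG system usually stores many documents, and picking out the one that matches the query should be the job of the RAG system itself, yet here every row already comes with its document.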

stanford-futuredata/ARES: Automated Evaluation of RAG Systems
https://github.com/stanford-futuredata/ARES

datasets\example_files\nq_few_shot_prompt_for_judge_scoring.tsv

Query    Document    Answer    Context_Relevance_Label    Answer_Relevance_Label    Answer_Faithfulness_Label
when did the first fleet arive in australia    The 11 ships of the First Fleet set sail from Portsmouth on 13 May 1787. The fleet called at Rio de Janeiro for supplies from 6 August to 4 September. The leading ship, reached Botany Bay setting up camp on the Kurnell Peninsula, on 18 January 1788. Phillip soon decided that this site, chosen on the recommendation of Sir Joseph Banks, who had accompanied James Cook in 1770, was not suitable, since it had poor soil, no secure anchorage and no reliable water source. After some exploration Phillip decided to go on to Port Jackson, and on 26 January the marines and convicts landed at Sydney Cove, which Phillip named after Lord Sydney.     18 January 1788    [[Yes]]    [[Yes]]    [[Yes]]
when did red dead redemption 1 come out    Red Dead Redemption is a Western action-adventure game developed by Rockstar San Diego and published by Rockstar Games. A spiritual successor to 2004's "Red Dead Revolver", it is the second game in the "Red Dead" series, and was released for the PlayStation 3 and Xbox 360 in May 2010. "Red Dead Redemption" is set during the decline of the American frontier in the year 1911 and follows John Marston, a former outlaw whose wife and son are taken hostage by the government in ransom for his services as a hired gun. Having no other choice, Marston sets out to bring three members of his former gang to justice.     May 2010    [[Yes]]    [[Yes]]    [[Yes]]
tools made from high-speed tool steel are generally used for what type of machining operations    Milling is the process of machining using rotary cutters to remove material by advancing a cutter into a workpiece. This may be done varying direction on one or several axes, cutter head speed, and pressure. Milling covers a wide variety of different operations and machines, on scales from small individual parts to large, heavy-duty gang milling operations. It is one of the most commonly used processes for machining custom parts to precise tolerances.     milling    [[Yes]]    [[Yes]]    [[Yes]]
where does the papillary layer of the skin lie    The dermis or corium is a layer of skin between the epidermis (with which it makes up the cutis) and subcutaneous tissues, that primarily consists of dense irregular connective tissue and cushions the body from stress and strain. It is divided into two layers, the superficial area adjacent to the epidermis called the papillary region and a deep thicker area known as the reticular dermis. The dermis is tightly connected to the epidermis through a basement membrane. Structural components of the dermis are collagen, elastic fibers, and extrafibrillar matrix. It also contains mechanoreceptors that provide the sense of touch and thermoreceptors that provide the sense of heat. In addition, hair follicles, sweat glands, sebaceous glands (oil glands), apocrine glands, lymphatic vessels, nerves and blood vessels are present in the dermis. Those blood vessels provide nourishment and waste removal for both dermal and epidermal cells.      Nicole DuPort    [[Yes]]    [[No]]    [[No]]
which episode does gideon die in criminal minds    In the Season 10 episode "Nelson's Sparrow", Gideon was murdered off-screen, having been shot dead at a close range by a serial killer named Donnie Mallick (Arye Gross), which prompts the BAU team to investigate Gideon's murder. During the flashbacks focusing on a young version of him for the episode which show him working at the BAU in 1978, he is played by Ben Savage.     Little Boy    [[Yes]]    [[No]]    [[No]]
who destroyed the first temple and exiled the two tribes of judah to babylon    Fuels include natural substances like dry wood, peat and coal. pitch, petroleum jelly, charred cotton, shaved rubber, and frayed synthetic cloth can be used as kindling. Candles provide illumination and can help start a fire. Alcohol, DIY and commercial alcohol stoves are made and carried by hikers. Oil, petroleum, vegetable, and tallow can help start and feed a fire. Propane bottles are made for backpacking. Charcoal or briquettes could be packed in the fire.     Nebuchadnezzar    [[No]]    [[No]]    [[No]]
what is a e8 in the marine corps    A support van tails the Cash Cab, containing producers and a camera crew for the various street shots. The staff provide logistical information and questions by way of a walkie-talkie and earpiece worn by the host. The Cash Cab is equipped with ten cameras: one on the host, three on the passengers, one pointing forward from the back window of the cab, and three (front, left and right) in the advertising shell on the roof of the cab showing the passing streetscape. The rear of the cab behind the seats contains ten recording decks and other production equipment.     walkie - talkie and earpiece    [[No]]    [[No]]    [[No]]

1 answer

  • 无疆76 2024-10-30 17:03

    The reply below was drafted with reference to free WeChat mini-programs such as 皆我百晓生 and 券券喵儿:

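    This dataset appears to have been created to evaluate question-answering systems, specifically retrieval-augmented generation (RAG) systems. It contains a Query, a Document, an Answer, and three labels: Context_Relevance_Label, Answer_Relevance_Label, and Answer_Faithfulness_Label. The labels score three separate aspects of each (query, document, answer) triple.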

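    1. Query: the question or query posed by the user.

    2. Document: the passage paired with the query. A RAG system normally stores many documents and retrieves the relevant ones when a query arrives. In this dataset the retrieval step is treated as already done: each row carries a single passage, which may or may not be relevant (the last two rows quoted in the question pair their queries with unrelated passages). That makes it possible to judge the retrieved passage itself, and whether the answer is supported by it.

    3. Answer: the answer to be judged against the query and the document.

    4. Context_Relevance_Label, Answer_Relevance_Label, and Answer_Faithfulness_Label: three labels, each written as [[Yes]] or [[No]].

      • Context_Relevance_Label: whether the Document is relevant to the Query. This label judges the retrieved passage, not the answer.
      • Answer_Relevance_Label: whether the Answer directly addresses the Query.
      • Answer_Faithfulness_Label: whether the Answer is faithful to the Document, i.e. whether it is grounded in the information the passage contains.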
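
    The column descriptions above can be turned into a small parsing sketch. The Python below assumes the .tsv file is tab-separated and that every label is written as [[Yes]] or [[No]], as in the rows quoted in the question. SAMPLE_TSV, parse_label, and load_rows are names made up for this illustration, and the sample documents are truncated to keep the block short.

```python
import csv
import io

# Three rows abbreviated from the file quoted in the question. Documents are
# truncated for brevity ("..." marks the cut); the rest is copied as-is.
SAMPLE_TSV = (
    "Query\tDocument\tAnswer\tContext_Relevance_Label\t"
    "Answer_Relevance_Label\tAnswer_Faithfulness_Label\n"
    "when did red dead redemption 1 come out\t"
    "Red Dead Redemption is a Western action-adventure game ...\t"
    "May 2010\t[[Yes]]\t[[Yes]]\t[[Yes]]\n"
    "which episode does gideon die in criminal minds\t"
    "In the Season 10 episode \"Nelson's Sparrow\", Gideon was murdered ...\t"
    "Little Boy\t[[Yes]]\t[[No]]\t[[No]]\n"
    "what is a e8 in the marine corps\t"
    "A support van tails the Cash Cab, containing producers ...\t"
    "walkie - talkie and earpiece\t[[No]]\t[[No]]\t[[No]]\n"
)

LABEL_COLUMNS = (
    "Context_Relevance_Label",
    "Answer_Relevance_Label",
    "Answer_Faithfulness_Label",
)


def parse_label(raw):
    """Turn '[[Yes]]' / '[[No]]' into True / False."""
    value = raw.strip().strip("[]").lower()
    if value not in ("yes", "no"):
        raise ValueError(f"unexpected label: {raw!r}")
    return value == "yes"


def load_rows(tsv_text):
    """Parse tab-separated text into dicts with boolean label values."""
    # QUOTE_NONE: passages contain literal double quotes that are not CSV quoting.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    rows = []
    for record in reader:
        row = {key: record[key].strip() for key in ("Query", "Document", "Answer")}
        for column in LABEL_COLUMNS:
            row[column] = parse_label(record[column])
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in load_rows(SAMPLE_TSV):
        print(row["Query"], [row[column] for column in LABEL_COLUMNS])
```

    Reading the second row this way makes the split between the labels visible: the passage is about Gideon's death (context relevant), but the answer "Little Boy" neither answers the question nor appears in the passage.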

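    The file may have been produced by running a RAG system (possibly including a large language model): a query goes in, an answer comes out, and the labels are then assigned manually or automatically. Some rows look deliberately negative, though: the answer "Nicole DuPort" has nothing to do with the skin question, and two rows pair the query with an unrelated passage, so not every row need be raw system output. The examples cover several kinds of questions, including history, dates, places, and other factual information.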

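    On your question of whether the file was "obtained with an LLM + RAG system": that is plausible, since datasets of this kind are commonly generated with a large language model (LLM) and a RAG system. On your point that "a RAG system usually stores several documents": yes, a RAG system usually stores many documents and retrieves the relevant ones when a query arrives. In this dataset each query is paired with a single document, probably to keep evaluation simple: the evaluation looks at the outcome of retrieval (scored by Context_Relevance_Label) rather than performing retrieval itself.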
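
    To see where a Document column could come from, here is a minimal sketch of a pipeline that emits one (Query, Document, Answer) row per query. Everything in it is an assumption made for illustration: the word-overlap retriever, the two-passage corpus (first sentences of passages quoted in the question), and stub_generate, which stands in for an LLM call. It is not how ARES or the file's authors built the data.

```python
import re

# Toy corpus: first sentences of two passages quoted in the question.
CORPUS = [
    "Red Dead Redemption is a Western action-adventure game developed by "
    "Rockstar San Diego and published by Rockstar Games.",
    "Milling is the process of machining using rotary cutters to remove "
    "material by advancing a cutter into a workpiece.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, corpus):
    """Toy retriever: pick the passage sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    return max(corpus, key=lambda doc: len(query_tokens & tokenize(doc)))


def stub_generate(query, document):
    """Hypothetical placeholder for an LLM call that answers from the passage."""
    return "(answer from an LLM would go here)"


def build_row(query, corpus, generate):
    """Run retrieval, then generation, and return one unlabeled dataset row."""
    document = retrieve(query, corpus)
    answer = generate(query, document)
    return {"Query": query, "Document": document, "Answer": answer}


if __name__ == "__main__":
    row = build_row("when did red dead redemption 1 come out", CORPUS, stub_generate)
    print(row["Document"])
```

    The three labels would then be added to each row afterwards, by a person or by a judge model. The retrieval step only decides which passage lands in the Document column; Context_Relevance_Label records whether that choice was a good one.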

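    As for "stanford-futuredata/ARES": it is a project for evaluating RAG systems (the repository title is "Automated Evaluation of RAG Systems"), and this file sits in its datasets\example_files directory. Judging by the file name, nq_few_shot_prompt_for_judge_scoring.tsv, the rows serve as few-shot examples for prompting a judge.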
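
    If the rows are indeed few-shot examples for a judge, as the file name suggests, they could be stitched into a prompt roughly as follows. The prompt wording and the function build_context_relevance_prompt are hypothetical; the actual prompt used by ARES is not shown in this thread and may look different.

```python
# Two rows abbreviated from the file quoted in the question ("..." marks
# where the passage was cut for brevity).
FEW_SHOT_ROWS = [
    {
        "Query": "when did red dead redemption 1 come out",
        "Document": "Red Dead Redemption is a Western action-adventure game ...",
        "Context_Relevance_Label": "[[Yes]]",
    },
    {
        "Query": "what is a e8 in the marine corps",
        "Document": "A support van tails the Cash Cab, containing producers ...",
        "Context_Relevance_Label": "[[No]]",
    },
]


def build_context_relevance_prompt(examples, query, document):
    """Assemble a few-shot prompt that ends where the judge should write a label.

    The instruction text is an assumption for illustration, not ARES's prompt.
    """
    parts = ["Is the document relevant to the question? Reply with [[Yes]] or [[No]]."]
    for example in examples:
        parts.append(
            f"Question: {example['Query']}\n"
            f"Document: {example['Document']}\n"
            f"Label: {example['Context_Relevance_Label']}"
        )
    # The new, unlabeled pair goes last; the judge model completes the label.
    parts.append(f"Question: {query}\nDocument: {document}\nLabel:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(
        build_context_relevance_prompt(
            FEW_SHOT_ROWS,
            "where does the papillary layer of the skin lie",
            "The dermis or corium is a layer of skin between the epidermis ...",
        )
    )
```

    Including both a [[Yes]] and a [[No]] example gives the judge a sample of each outcome, which would explain why the file mixes relevant and unrelated passages.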

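    In short, the dataset exists to test and evaluate RAG systems, and each row contains a query, a paired document, an answer, and three labels that score the document and the answer.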

