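First, I followed this article: https://blog.csdn.net/javastart/article/details/130255982
I installed the libraries and placed the files exactly as the article describes. The code runs, training completes successfully, and the answers are consistent with the PDF document I supplied. When the answer is in English, it can be long and complete. When I ask for an answer in Chinese, however, it is limited to roughly 100 characters and is cut off mid-answer, and telling it to continue does not help. My guess is that gpt_index and langchain handle Chinese incorrectly, but I have no evidence for that. This is the article I am currently reading: https://blog.csdn.net/qq_56591814/article/details/131376763
Other people in the comment section of the original blog post report the same problem. Any help would be greatly appreciated!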
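One way to probe this suspicion without blaming either library: completion length limits are counted in tokens, not characters, and byte-level BPE tokenizers work on UTF-8 bytes, so the same token budget may cover far fewer Chinese characters than English ones. The sketch below only compares UTF-8 bytes per character as a rough proxy. It does not run a real tokenizer, and the sample strings and the helper name `bytes_per_char` are made up for illustration.

```python
def bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per character in text."""
    if not text:
        return 0.0
    return len(text.encode("utf-8")) / len(text)


# Hypothetical sample prompts, not taken from the original post.
english = "Please summarize the document."
chinese = "请用中文总结这份文档。"

print(f"English: {bytes_per_char(english):.1f} bytes/char")
print(f"Chinese: {bytes_per_char(chinese):.1f} bytes/char")
```

If that ratio carries over to tokens even partially, a fixed token cap would truncate Chinese answers much earlier than English ones, which would match the symptom. A count from a real tokenizer would be needed to confirm this. It may also be worth checking which `max_tokens` value is actually in effect at query time, since `chatbot` reloads the index without the `llm_predictor` that `construct_index` configures.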
Here is my code:
from gpt_index import SimpleDirectoryReader, GPTListIndex, GPTSimpleVectorIndex, LLMPredictor, PromptHelper
from langchain import OpenAI
import gradio as gr
import os
import openai
openai.api_base = "https://openai.wndbac.cn/v1"
os.environ["OPENAI_API_KEY"] = 'sess-KRbSWktd1VqkhoRIja1nV7oswVPqNO3rst9SYT6L'
def construct_index(directory_path):
    # Build a vector index over the documents in directory_path and save it to index.json.
    max_input_size = 4096
    num_outputs = 3000
    max_chunk_overlap = 20
    chunk_size_limit = 3000
    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.7, model_name="text-davinci-003", max_tokens=num_outputs))
    documents = SimpleDirectoryReader(directory_path).load_data()
    index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)
    index.save_to_disk('index.json')
    return index

def chatbot(input_text):
    # Reload the saved index on every call and answer the query from it.
    index = GPTSimpleVectorIndex.load_from_disk('index.json')
    response = index.query(input_text, response_mode="compact")
    return response.response

# Gradio UI: one multi-line text box as input, plain text as output.
iface = gr.Interface(fn=chatbot,
                     inputs=gr.inputs.Textbox(lines=7, label="Enter your text"),
                     outputs="text",
                     title="Custom-trained AI Chatbot")
index = construct_index("docs")
iface.launch(share=True)
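The numbers passed to `PromptHelper` in `construct_index` can be sanity-checked with plain arithmetic. This sketch assumes that the prompt chunk and the space reserved for the answer must together fit into `max_input_size`. That is a simplification of what the library actually does, and `remaining_prompt_budget` is a hypothetical helper, not a gpt_index function.

```python
def remaining_prompt_budget(max_input_size: int, num_outputs: int) -> int:
    """Tokens left for the prompt after reserving space for the answer."""
    return max_input_size - num_outputs


# Values copied from construct_index in the post.
max_input_size = 4096
num_outputs = 3000
chunk_size_limit = 3000

budget = remaining_prompt_budget(max_input_size, num_outputs)
chunk_fits = chunk_size_limit <= budget

print("tokens left for the prompt:", budget)
print("chunk_size_limit fits in that budget:", chunk_fits)
```

Under this assumption the settings conflict: reserving 3000 tokens for the answer leaves less room than a 3000-token chunk needs, so the library would have to shrink one of the two. Whether that contributes to the truncated Chinese answers is not established here.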
Screenshot of a run (English answer):
Screenshot of a run (Chinese answer):