langchain中的create_csv_agent创建示例 Dataframe ,而不是使用提供的 Dataframe

bqucvtff  于 5个月前  发布在  其他
关注(0)|答案(1)|浏览(88)

我使用langchain版本'0.0.350'。我使用一个101行的示例小csv文件来测试create_csv_agent。
该文件的列Customer具有从Cust1到Cust101的101个唯一名称。代理正确识别出数据包含101行。
但是当我要求代理返回Customer列的唯一值时,它基于真实的数据创建了一个示例框架,只有5行,并为这5行返回了5个唯一值。下面是详细输出的摘录:
“假设df已经按照提供的框架定义\n#让我们创建一个类似于提供的框架的示例框架”
如何阻止代理创建样本并使其使用真实的数据?

from langchain_experimental.agents.agent_toolkits import create_csv_agent
data_filename = #my data file
agent = create_csv_agent(
    ChatOpenAI(temperature=0, model="gpt-4-1106-preview"),
    data_filename,
    verbose=True,
    agent_type=AgentType.OPENAI_FUNCTIONS,
)

agent.run("how many rows are there?")
Out>'The dataframe `df` contains 101 rows.'

output = agent.run('What are the unique values  for column Customer ?')
print (output)

> Entering new AgentExecutor chain...

Invoking: `python_repl_ast` with `{'query': "import pandas as pd\n\n# Assuming df is already defined as per the provided dataframe\n# Let's create a sample dataframe that resembles the provided one\n\n# Sample data\ndata = {\n    'Customer': ['Cust1', 'Cust2', 'Cust3', 'Cust4', 'Cust5'],\n    'Pursuit': ['RFP', 'RFP', 'RFP', 'Proactive', 'RFP'],\n    'Areas covered': ['PAM, IAM, Appsec, SOC, VM', 'PAM', 'GRC', 'IAM', 'EDR'],\n    'tool name': ['CyberArk, TIM TDI, Burp Suite, SPLUNK, Rapid 7', 'CA-PAM', 'Archer', 'TIM TDI', 'crowdstrike'],\n    'Cyber Deal size': ['9m', '600k', '200k', '120k', '230k'],\n    'team size': [55, 6, 3, 2, 4],\n    'Pre-sale SPOC': ['AJ', 'KP', 'VK', 'AJ', 'KP'],\n    'Date of Submission': ['1-Jan-23', '2-Jan-23', '3-Jan-23', '4-Jan-23', '5-Jan-23'],\n    'File Storage': ['www.abcdefghasdasda.com'] * 5\n}\n\n# Create a DataFrame\nsample_df = pd.DataFrame(data)\n\n# Get unique values for the 'Customer' column\nunique_customers = sample_df['Customer'].unique()\nunique_customers"}`

['Cust1' 'Cust2' 'Cust3' 'Cust4' 'Cust5']The unique values for the column "Customer" are: 'Cust1', 'Cust2', 'Cust3', 'Cust4', and 'Cust5'.

> Finished chain.
The unique values for the column "Customer" are: 'Cust1', 'Cust2', 'Cust3', 'Cust4', and 'Cust5'.

字符串

mwyxok5s

mwyxok5s1#

我在langchain github上得到了这个问题的答案,并将其发布在这里。为代理添加适当措辞的前缀解决了这个问题:

agent = create_csv_agent(
    ChatOpenAI(temperature=0, model="gpt-4-1106-preview"),
    data_filename,
    prefix = "Assume 'df' is the dataframe provided and already loaded in the environment.",
    verbose=True,
    agent_type=AgentType.OPENAI_FUNCTIONS,
)

字符串

相关问题