pandas-gpt
Power up your data science workflow with LLMs.
pandas-gpt is a Python library for doing almost anything with a pandas DataFrame using ChatGPT or any other Large Language Model (LLM).
pip install pandas-gpt[openai]
The openai and litellm dependencies are optional extras: install pandas-gpt[openai], pandas-gpt[litellm], or both, depending on which completer you plan to use.
Next, set the OPENAI_API_KEY environment variable to your OpenAI API key, or use the following code snippet:
import openai
openai.api_key = '<API Key>'
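If you take the environment-variable route, it can help to confirm the key is present before the first request. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper written for illustration and is not part of pandas-gpt or openai:

```python
import os


def require_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error.

    Hypothetical helper for illustration; not part of pandas-gpt.
    """
    # Treat an empty or whitespace-only value the same as a missing one.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it in your shell or set it via os.environ"
        )
    return key
```

Calling `require_api_key()` at the top of a notebook surfaces a missing key right away instead of at the first `df.ask(...)` call.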
If you're looking for a free alternative to the OpenAI API, we encourage using Google Gemini for code completion:
pip install pandas-gpt[litellm]
import pandas_gpt
pandas_gpt.completer = pandas_gpt.LiteLLM('gemini/gemini-1.5-pro', api_key='...')
Setup and usage examples are available in this Google Colab notebook.
import pandas as pd
import pandas_gpt
df = pd.read_csv('https://gist.githubusercontent.com/bluecoconut/9ce2135aafb5c6ab2dc1d60ac595646e/raw/c93c3500a1f7fae469cba716f09358cfddea6343/sales_demo_with_pii_and_all_states.csv')
# Data transformation
df = df.ask('drop purchases from Laurenchester, NY')
df = df.ask('add a new Category column with values "cheap", "regular", or "expensive"')
# Queries
weekday = df.ask('which day of the week had the largest number of orders?')
top_10 = df.ask('what are the top 10 most popular products, as a table')
# Plotting
df.ask('plot monthly and hourly sales')
top_10.ask('horizontal bar plot with pastel colors')
# Allow changes to original dataset
df.ask('do something interesting', mutable=True)
# Show source code before running
df.ask('convert prices from USD to GBP', verbose=True)
It's possible to use a different language model with the completer config option:
import pandas_gpt
# Global default
pandas_gpt.completer = pandas_gpt.OpenAI('gpt-3.5-turbo')
# Custom completer for a specific request
df.ask('Do something interesting with the data', completer=pandas_gpt.LiteLLM('gemini/gemini-1.5-pro'))
By default, API keys are picked up from environment variables such as OPENAI_API_KEY.
It's also possible to specify an API key for a particular call:
df.ask('Do something important with the data', completer=pandas_gpt.OpenAI('gpt-4o', api_key='...'))
pandas_gpt.completer = pandas_gpt.OpenAI('gpt-4o')
pandas_gpt.completer = pandas_gpt.LiteLLM('gemini/gemini-1.5-pro')
pandas_gpt.completer = pandas_gpt.LiteLLM('huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct')
pandas_gpt.completer = pandas_gpt.OpenRouter('anthropic/claude-3.5-sonnet')
If you want to use the Azure OpenAI Service, you can globally configure the openai and pandas-gpt packages:
import openai
openai.api_type = 'azure'
openai.api_base = '<Endpoint>'
openai.api_version = '<Version>'
openai.api_key = '<API Key>'
import pandas_gpt
pandas_gpt.completer = pandas_gpt.AzureOpenAI(
    model='gpt-3.5-turbo',
    engine='<Engine>',
    deployment_id='<Deployment ID>',
)
It's also possible to use fully custom code generation:
def my_custom_completer(prompt: str) -> str:
    # Use an LLM or any other method to create a `process()` function that
    # takes a pandas DataFrame as a single argument, does some operations on it,
    # and returns a DataFrame.
    return 'def process(df): ...'
pandas_gpt.completer = my_custom_completer
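As a slightly fuller illustration of the completer contract described above (a callable that takes a prompt string and returns source code defining `process(df)`), here is a minimal sketch that needs no LLM at all. Both `my_rule_based_completer` and `defines_process` are hypothetical names for this example, and the check reflects only the contract as stated in this README, not how pandas-gpt validates or runs the code internally:

```python
import ast


def my_rule_based_completer(prompt: str) -> str:
    """Hypothetical completer: ignores the prompt and returns fixed source code."""
    return (
        "def process(df):\n"
        "    # Return a copy so the caller's DataFrame is left untouched.\n"
        "    return df.copy()\n"
    )


def defines_process(source: str) -> bool:
    """Return True if `source` parses and defines a top-level
    `process` function taking exactly one positional argument."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == "process":
            return len(node.args.args) == 1
    return False
```

A check like `defines_process(my_rule_based_completer('anything'))` can be run before assigning the function to `pandas_gpt.completer`, which catches malformed output from a completer early.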
Please note that the limitations of ChatGPT also apply to this library. Because pandas-gpt runs code generated by a language model, I recommend using it in a sandboxed environment such as Google Colab, Kaggle, or GitPod.