An agent-based LLM assistant that extends RAG with batch entity extraction and SQL querying to improve performance on mu
The Agentic Documents Assistant is an LLM assistant that lets users ask complex questions about their business documents through natural conversation. It answers factual questions by retrieving information directly from the documents using semantic search, following the popular RAG design pattern. It also answers analytical questions, such as "Which contracts will expire in the next 3 months?", by translating the user's question into a SQL query and running it against a database of entities extracted from the documents by a batch process. Finally, it can answer complex multi-step questions by combining retrieval, analytical, and other tools and data sources using an LLM agent design pattern.
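To make the analytical path more concrete, here is a minimal sketch of the kind of SQL query such a question could translate into. It is not the project's actual schema or generated SQL: the `contracts` table, its columns, the sample rows, and the helper `contracts_expiring_within` are all hypothetical, an in-memory SQLite database stands in for the real entity database, and "3 months" is approximated as 90 days.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical table of entities extracted from contracts by a batch job.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts ("
    "contract_id TEXT PRIMARY KEY, counterparty TEXT, expiry_date TEXT)"
)

# Made-up sample rows; dates are ISO 8601 strings so they sort correctly as text.
rows = [
    ("C-001", "Acme Corp", "2024-02-28"),
    ("C-002", "Globex", "2024-09-30"),
    ("C-003", "Initech", "2024-04-10"),
    ("C-004", "Umbrella", "2023-12-31"),
]
conn.executemany("INSERT INTO contracts VALUES (?, ?, ?)", rows)


def contracts_expiring_within(conn, today, days=90):
    """Return contracts whose expiry date falls between today and today + days."""
    cutoff = today + timedelta(days=days)
    cursor = conn.execute(
        "SELECT contract_id, counterparty, expiry_date FROM contracts "
        "WHERE expiry_date >= ? AND expiry_date <= ? "
        "ORDER BY expiry_date",
        (today.isoformat(), cutoff.isoformat()),
    )
    return cursor.fetchall()


# A fixed reference date keeps the example reproducible.
today = date(2024, 1, 15)
expiring = contracts_expiring_within(conn, today)
for contract_id, counterparty, expiry_date in expiring:
    print(contract_id, counterparty, expiry_date)
```

In the assistant itself, the LLM would generate a query like this from the user's question. The sketch only shows why a structured table of extracted entities makes date-range questions easy to answer, which plain semantic search handles poorly.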
To learn more about the design and architecture of this solution, check the accompanying AWS ML blog post: Boosting RAG-based intelligent document assistants using entity extraction, SQL querying, and agents with Amazon Bedrock.
The following architecture diagram depicts the design of the solution.

Below is an outline of the main folders included in this asset.
| Folder | Description |
|---|---|
| `backend` | A TypeScript CDK project implementing the infrastructure as code (IaC) that sets up the backend infrastructure. |
| `frontend` | A TypeScript CDK project that sets up the infrastructure for deploying and hosting the frontend app with AWS Amplify. |
| `frontend/chat-app` | A Next.js app with AWS Cognito authentication and secured backend connectivity. |
| `data-pipelines` | Notebooks implementing SageMaker Jobs and a Pipeline to process the data in batch. |
| `experiments` | Notebooks and code showcasing different modules of the solution as standalone experiments for research and development. |
Follow the instructions below to set up the solution in your account.
To install the solution in your AWS account:
1. Go to the `backend` folder.
2. Run `npm install` to install the dependencies.
3. Run `npx cdk bootstrap`.
4. Run `npx cdk deploy` to deploy the stack.
5. Go to the `frontend` folder.
6. Run `npm install` to install the dependencies.
7. Run `npx cdk deploy` to deploy a stack that builds an Amplify CI/CD
8. Run the notebook `data-pipelines/04-sagemaker-pipeline-for-documents-processing.ipynb`. This will load two files: (1) a pre-created CSV file to load into the SQL tables, and (2) a JSON file containing preprocessed Amazon financial documents, which is used to create the semantic search index used by the LLM assistant.

After running the above steps successfully, you can start interacting with the assistant and asking questions.
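The deployment commands above can also be scripted. The following is a minimal sketch, not part of the repository: the `DEPLOY_STEPS` list and the `run_steps` helper are hypothetical names, and it assumes `npm` and `npx` are on your `PATH` and that the script runs from the repository root. It defaults to a dry run that only prints the plan.

```python
import subprocess
from pathlib import Path

# (folder, command) pairs, in the order listed in the installation steps.
DEPLOY_STEPS = [
    ("backend", ["npm", "install"]),
    ("backend", ["npx", "cdk", "bootstrap"]),
    ("backend", ["npx", "cdk", "deploy"]),
    ("frontend", ["npm", "install"]),
    ("frontend", ["npx", "cdk", "deploy"]),
]


def run_steps(repo_root, steps=DEPLOY_STEPS, dry_run=True):
    """Run each command in its folder; with dry_run=True, only build the plan."""
    plan = []
    for folder, command in steps:
        workdir = Path(repo_root) / folder
        plan.append(f"(cd {workdir.as_posix()} && {' '.join(command)})")
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(command, cwd=workdir, check=True)
    return plan


# Dry run: print what would be executed without touching your AWS account.
plan = run_steps(".")
for line in plan:
    print(line)
```

Calling `run_steps(".", dry_run=False)` would execute the commands for real. The notebook step is intentionally left out, since it runs in a notebook environment, not from the shell.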
If you want to update the underlying documents and extract new entities, explore the notebooks 1 to 5 under the experiments folder.
To remove the resources of the solution:
1. Remove the stack deployed from the `backend` folder by running `npx cdk destroy`.
2. Remove the stack deployed from the `frontend` folder by running `npx cdk destroy`.

The authors of this asset are:
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.