Add Document QA Bot Example and Update README #1178

Open
wants to merge 1 commit into
base: main
53 changes: 53 additions & 0 deletions examples/Retrieval/docs/sample.txt
@@ -0,0 +1,53 @@
NATIONAL ELIGIBILITY CUM ENTRANCE TEST (NEET)

Introduction:
The National Eligibility cum Entrance Test (NEET) is the single medical entrance examination in India for admission to MBBS, BDS, and other undergraduate medical courses in approved medical and dental colleges. It replaced multiple medical entrance exams with a single standardized test.

Key Details:

1. Conducting Body:
- Conducted by the National Testing Agency (NTA)
- Under the Ministry of Education, Government of India

2. Exam Pattern:
- Mode: Pen and paper based (offline)
- Duration: 3 hours 20 minutes
- Questions: 200 multiple-choice questions (180 to be attempted)
- Subjects: Physics (45), Chemistry (45), Biology (90)
- Marking Scheme: +4 for correct, -1 for incorrect

3. Eligibility Criteria:
- Age Limit: Minimum 17 years (no upper limit)
- Educational Qualification: 10+2 with Physics, Chemistry, Biology/Biotechnology
- Minimum Marks: 50% for General, 40% for SC/ST/OBC

4. Important Dates (2024):
- Application Start: March 2024
- Exam Date: May 5, 2024
- Result Declaration: June 2024

5. Syllabus:
- Physics: Mechanics, Thermodynamics, Optics, etc.
- Chemistry: Organic, Inorganic, Physical Chemistry
- Biology: Botany and Zoology

6. Participating Institutions:
- All government and private medical colleges (including AIIMS and JIPMER since 2020)
- 15% All India Quota seats
- 85% State Quota seats

7. NEET UG Cutoff (2023):
- General: 137-720
- OBC/SC/ST: Lower cutoff ranges

8. Preparation Tips:
- Focus on NCERT textbooks
- Regular practice of MCQs
- Take mock tests
- Time management during exam

9. Contact Information:
- Official Website: neet.nta.nic.in
- Helpline: 011-40759000

Note: NEET is the only medical entrance test for undergraduate courses in India as per the Supreme Court ruling. All admissions to MBBS/BDS courses are done through NEET scores.
53 changes: 53 additions & 0 deletions examples/Retrieval/docs/test_neet.txt
@@ -0,0 +1,53 @@
NATIONAL ELIGIBILITY CUM ENTRANCE TEST (NEET)

Introduction:
The National Eligibility cum Entrance Test (NEET) is the single medical entrance examination in India for admission to MBBS, BDS, and other undergraduate medical courses in approved medical and dental colleges. It replaced multiple medical entrance exams with a single standardized test.

Key Details:

1. Conducting Body:
- Conducted by the National Testing Agency (NTA)
- Under the Ministry of Education, Government of India

2. Exam Pattern:
- Mode: Pen and paper based (offline)
- Duration: 3 hours 20 minutes
- Questions: 200 multiple-choice questions (180 to be attempted)
- Subjects: Physics (45), Chemistry (45), Biology (90)
- Marking Scheme: +4 for correct, -1 for incorrect

3. Eligibility Criteria:
- Age Limit: Minimum 17 years (no upper limit)
- Educational Qualification: 10+2 with Physics, Chemistry, Biology/Biotechnology
- Minimum Marks: 50% for General, 40% for SC/ST/OBC

4. Important Dates (2024):
- Application Start: March 2024
- Exam Date: May 5, 2024
- Result Declaration: June 2024

5. Syllabus:
- Physics: Mechanics, Thermodynamics, Optics, etc.
- Chemistry: Organic, Inorganic, Physical Chemistry
- Biology: Botany and Zoology

6. Participating Institutions:
- All government and private medical colleges (including AIIMS and JIPMER since 2020)
- 15% All India Quota seats
- 85% State Quota seats

7. NEET UG Cutoff (2023):
- General: 137-720
- OBC/SC/ST: Lower cutoff ranges

8. Preparation Tips:
- Focus on NCERT textbooks
- Regular practice of MCQs
- Take mock tests
- Time management during exam

9. Contact Information:
- Official Website: neet.nta.nic.in
- Helpline: 011-40759000

Note: NEET is the only medical entrance test for undergraduate courses in India as per the Supreme Court ruling. All admissions to MBBS/BDS courses are done through NEET scores.
14 changes: 14 additions & 0 deletions examples/Retrieval/document_qa_bot.md
@@ -0,0 +1,14 @@
# 📄 Document Q&A Bot using LLMWare

This example shows how to build a simple command-line chatbot that can answer questions from a document using LLMWare's built-in retrieval.

### 📁 Files:
- `docs/sample.txt`: A small text file used for testing.
- `document_qa_bot.py`: Loads the documents in `docs/`, indexes them, and runs an interactive Q&A loop.
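
As a rough illustration of what loading, indexing, and querying involve, here is a minimal pure-Python sketch. It is not LLMWare's implementation: `chunk_text` and `keyword_query` are hypothetical stand-ins, and the character-based chunking and word-overlap scoring are assumptions chosen for brevity.

```python
def chunk_text(text, chunk_size=400):
    """Split text into pieces of at most chunk_size characters (assumed unit)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def keyword_query(chunks, question, result_count=2):
    """Rank chunks by how many distinct question words they contain."""
    # Normalize the question into a set of lowercase words without punctuation
    words = {w.lower().strip("?.,!") for w in question.split()}
    words.discard("")

    scored = []
    for chunk in chunks:
        lowered = chunk.lower()
        score = sum(1 for w in words if w in lowered)
        if score:
            scored.append((score, chunk))

    # Highest score first; ties keep their original order (stable sort)
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:result_count]]
```

The script in this example delegates both steps to the library (`add_files` and `text_query`), so the sketch is only meant to show the shape of the pipeline, not to replace it.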

### 🚀 How to Run

```bash
cd examples
cd Retrieval
python document_qa_bot.py
```
87 changes: 87 additions & 0 deletions examples/Retrieval/document_qa_bot.py
@@ -0,0 +1,87 @@
import os
import shutil
from llmware.library import Library
from llmware.retrieval import Query

def setup_environment(docs_folder):
    """Ensure docs folder exists and contains test files"""
    os.makedirs(docs_folder, exist_ok=True)

    test_file = os.path.join(docs_folder, "test_neet.txt")
    if not os.path.exists(test_file):
        with open(test_file, "w", encoding="utf-8") as f:
            f.write("NEET stands for National Eligibility cum Entrance Test. "
                    "It is India's medical entrance examination for undergraduate programs.")
        print(f"Created test file: {test_file}")

def create_fresh_library(library_name):
    """Create a completely clean library"""
    # Delete if exists
    if Library().check_if_library_exists(library_name):
        Library().delete_library(library_name)

    # Manual cleanup
    lib_path = os.path.join(os.path.expanduser("~"), "llmware", "libraries", library_name)
    if os.path.exists(lib_path):
        shutil.rmtree(lib_path)

    # Create new
    return Library().create_new_library(library_name)

def process_documents(library, docs_folder):
    """Add and process documents with clear feedback"""
    print("\nProcessing documents:")
    doc_files = [f for f in os.listdir(docs_folder)
                 if f.endswith(('.pdf', '.docx', '.txt', '.pptx', '.md'))]

    for doc in doc_files:
        print(f"- Found: {doc}")

    # Process documents (v0.4.1 compatible)
    result = library.add_files(input_folder_path=docs_folder,
                               chunk_size=400)

    library.generate_knowledge_graph()
    print("\nDocuments processed successfully!")

def query_documents(library):
    """Interactive query interface"""
    print("\nReady for queries. Try asking about:")
    print("- NEET exam")
    print("- Medical entrance")
    print("- What NEET stands for")

    while True:
        query = input("\nYour question (or 'quit'): ").strip()
        if query.lower() in ('quit', 'exit'):
            break

        results = Query(library).text_query(query, result_count=2)

        if results:
            print(f"\nFound {len(results)} results:")
            for i, res in enumerate(results, 1):
                print(f"\n{i}. From {res['file_source']}:")
                print(res['text'][:300].replace('\n', ' ') + "...")
        else:
            print("\nNo results found. Try rephrasing or check document content.")

if __name__ == "__main__":
    # Configuration: resolve the docs folder relative to this script so the
    # example runs from any checkout rather than one machine's absolute path
    DOCS_FOLDER = os.path.join(os.path.dirname(os.path.abspath(__file__)), "docs")
    LIB_NAME = "neet_library"

    print("NEET Document QA System (llmware v0.4.1)")
    print(f"Using documents from: {DOCS_FOLDER}")

    # Setup environment
    setup_environment(DOCS_FOLDER)

    # Create fresh library
    lib = create_fresh_library(LIB_NAME)

    # Process documents
    process_documents(lib, DOCS_FOLDER)

    # Start query session
    query_documents(lib)
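
A possible refinement to the preview line in `query_documents`: the current expression appends `...` to every hit, including ones shorter than 300 characters. The sketch below uses a hypothetical helper named `format_snippet`, which is not part of this PR or of llmware, to add the ellipsis only when the text is actually cut.

```python
def format_snippet(text, limit=300):
    """Flatten newlines and truncate to `limit` characters.

    Hypothetical helper: appends "..." only when the text was actually cut.
    """
    flat = text.replace("\n", " ")
    if len(flat) <= limit:
        return flat
    return flat[:limit] + "..."


if __name__ == "__main__":
    # Short text passes through with newlines flattened
    print(format_snippet("NEET is conducted\nby the NTA."))
    # Long text is cut at the limit and marked with an ellipsis
    print(format_snippet("x" * 305, limit=10))
```

Inside the loop this would replace the slicing expression with `print(format_snippet(res['text']))`, assuming `res['text']` is a string as the existing code already expects.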