diff --git a/README.md b/README.md
index 99620287..9710cb99 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,15 @@
# llmware
+


-[](https://discord.gg/MhZn5Nc39h)
-[](https://github.com/llmware-ai/llmware/actions/workflows/pages.yml)
+[](https://discord.gg/MhZn5Nc39h)
+[](https://github.com/llmware-ai/llmware/actions/workflows/pages.yml)
πCheck out [Model Depot](https://medium.com/@darrenoberst/model-depot-9e6625c5fc55)
-Are you using a Windows/Linux x86 machine?
-- Getting started with [OpenVino example](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using_openvino_models.py)
-- Getting started with [ONNX example](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using_onnx_models.py)
+Are you using a Windows or Linux x86 machine?
+
+- Getting started with [OpenVino example](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using_openvino_models.py)
+- Getting started with [ONNX example](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using_onnx_models.py)
## Table of Contents
@@ -21,45 +23,45 @@ Are you using a Windows/Linux x86 machine?
- [Using LLMs and setting-up API keys & secrets](#using-llms-and-setting-up-api-keys--secrets)
- [Release notes and Change Log](#--release-notes-and-change-log)
-## 🧰🛠️🔩Building Enterprise RAG Pipelines with Small, Specialized Models
+## 🧰🛠️🔩 Building Enterprise RAG Pipelines with Small, Specialized Models
-`llmware` provides a unified framework for building LLM-based applications (e.g., RAG, Agents), using small, specialized models that can be deployed privately, integrated with enterprise knowledge sources safely and securely, and cost-effectively tuned and adapted for any business process.
+`llmware` provides a unified framework for building LLM-based applications, such as RAG and Agents, using small, specialized models that can be deployed privately, integrated with enterprise knowledge sources safely and securely, and cost-effectively tuned and adapted for any business process.
- `llmware` has two main components:
-
- 1. **RAG Pipeline** - integrated components for the full lifecycle of connecting knowledge sources to generative AI models; and
+`llmware` has two main components:
- 2. **50+ small, specialized models** fine-tuned for key tasks in enterprise process automation, including fact-based question-answering, classification, summarization, and extraction.
+1. **RAG Pipeline** - integrated components for the full lifecycle of connecting knowledge sources to generative AI models; and
-By bringing together both of these components, along with integrating leading open source models and underlying technologies, `llmware` offers a comprehensive set of tools to rapidly build knowledge-based enterprise LLM applications.
+2. **50+ small, specialized models** fine-tuned for key tasks in enterprise process automation, including fact-based question-answering, classification, summarization, and extraction.
-Most of our examples can be run without a GPU server - get started right away on your laptop.
+By bringing together both of these components, along with integrating leading open source models and underlying technologies, `llmware` offers a comprehensive set of tools to rapidly build knowledge-based enterprise LLM applications.
-[Join us on Discord](https://discord.gg/MhZn5Nc39h) | [Watch Youtube Tutorials](https://www.youtube.com/@llmware) | [Explore our Model Families on Huggingface](https://www.huggingface.co/llmware)
+Most of our examples can be run without a GPU server. Get started right away on your laptop.
-New to Agents? [Check out the Agent Fast Start series](https://github.com/llmware-ai/llmware/tree/main/fast_start/agents)
+[Join us on Discord](https://discord.gg/MhZn5Nc39h) | [Watch Youtube Tutorials](https://www.youtube.com/@llmware) | [Explore our Model Families on Huggingface](https://www.huggingface.co/llmware)
-New to RAG? [Check out the Fast Start video series](https://www.youtube.com/playlist?list=PL1-dn33KwsmD7SB9iSO6vx4ZLRAWea1DB)
+New to Agents? [Check out the Agent Fast Start series](https://github.com/llmware-ai/llmware/tree/main/fast_start/agents)
-🔥🔥🔥 [**Multi-Model Agents with SLIM Models**](examples/SLIM-Agents/) - [**Intro-Video**](https://www.youtube.com/watch?v=cQfdaTcmBpY) 🔥🔥🔥
+New to RAG? [Check out the Fast Start video series](https://www.youtube.com/playlist?list=PL1-dn33KwsmD7SB9iSO6vx4ZLRAWea1DB)
+
+🔥🔥🔥 [**Multi-Model Agents with SLIM Models**](examples/SLIM-Agents/) - [**Intro-Video**](https://www.youtube.com/watch?v=cQfdaTcmBpY) 🔥🔥🔥
[Intro to SLIM Function Call Models](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using_function_calls.py)
-Can't wait? Get SLIMs right away:
+Want to get started immediately? Access SLIMs now:
-```python
+```python
from llmware.models import ModelCatalog
-ModelCatalog().get_llm_toolkit() # get all SLIM models, delivered as small, fast quantized tools
-ModelCatalog().tool_test_run("slim-sentiment-tool") # see the model in action with test script included
+ModelCatalog().get_llm_toolkit() # get all SLIM models, delivered as small, fast quantized tools
+ModelCatalog().tool_test_run("slim-sentiment-tool") # see the model in action with test script included
```
-## 🎯 Key features
+## 🎯 Key features
+
Writing code with `llmware` is based on a few main concepts:
Model Catalog: Access all models the same way with easy lookup, regardless of underlying implementation.
-
-
+
```python
# 150+ Models in Catalog with 50+ RAG-optimized BLING, DRAGON and Industry BERT models
@@ -81,16 +83,16 @@ prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information")
```
-Library: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed.
+Library: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed.
```python
from llmware.library import Library
-# to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html)
+# Use this to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html)
# step 1 - create a library, which is the 'knowledge-base container' construct
# - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name})
@@ -106,7 +108,7 @@ lib.add_files("/folder/path/to/my/files")
# to install an embedding on a library - pick an embedding model and vector_db
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500)
-# to add a second embedding to the same library (mix-and-match models + vector db)
+# to add a second embedding to the same library (mix-and-match models + vector db)
lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100)
# easy to create multiple libraries for different projects and groups
@@ -124,7 +126,8 @@ lib_card = Library().get_library_card("my_library")
all_my_libs = Library().get_all_library_cards()
```
-Query: query libraries with mix of text, semantic, hybrid, metadata, and custom filters.
@@ -134,7 +137,7 @@ all_my_libs = Library().get_all_library_cards()
from llmware.retrieval import Query
from llmware.library import Library
-# step 1 - load the previously created library
+# step 1 - load the previously created library
lib = Library().load_library("my_library")
# step 2 - create a query object and pass the library
@@ -156,7 +159,7 @@ q2 = Query(lib, embedding_model_name="mini_lm_sbert", vector_db="milvus")
results4 = q2.semantic_query("new semantic query")
```
-Prompt with Sources: the easiest way to combine knowledge retrieval with a LLM inference.
@@ -196,10 +199,10 @@ source_materials = prompter.review_sources_summary()
prompt_history = prompter.get_current_history()
```
-RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally.
+RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally.
```
""" This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both
@@ -383,16 +386,16 @@ def bling_meets_llmware_hello_world (model_name):
print(f"\n > Loading Model: {model_name}...")
- # load the model
+ # load the model
prompter = Prompt().load_model(model_name)
t1 = time.time()
print(f"\n > Model {model_name} load time: {t1-t0} seconds")
-
+
for i, entries in enumerate(test_list):
print(f"\n{i+1}. Query: {entries['query']}")
-
+
# run the prompt
output = prompter.prompt_main(entries["query"],context=entries["context"]
, prompt_name="default_with_context",temperature=0.30)
@@ -415,24 +418,24 @@ if __name__ == "__main__":
# list of 'rag-instruct' laptop-ready small bling models on HuggingFace
pytorch_models = ["llmware/bling-1b-0.1", # most popular
- "llmware/bling-tiny-llama-v0", # fastest
+ "llmware/bling-tiny-llama-v0", # fastest
"llmware/bling-1.4b-0.1",
"llmware/bling-falcon-1b-0.1",
"llmware/bling-cerebras-1.3b-0.1",
- "llmware/bling-sheared-llama-1.3b-0.1",
+ "llmware/bling-sheared-llama-1.3b-0.1",
"llmware/bling-sheared-llama-2.7b-0.1",
"llmware/bling-red-pajamas-3b-0.1",
"llmware/bling-stable-lm-3b-4e1t-v0",
- "llmware/bling-phi-3" # most accurate (and newest)
+ "llmware/bling-phi-3" # most accurate (and newest)
]
# Quantized GGUF versions generally load faster and run nicely on a laptop with at least 16 GB of RAM
gguf_models = ["bling-phi-3-gguf", "bling-stablelm-3b-tool", "dragon-llama-answer-tool", "dragon-yi-answer-tool", "dragon-mistral-answer-tool"]
# try model from either pytorch or gguf model list
- # the newest (and most accurate) is 'bling-phi-3-gguf'
+ # the newest (and most accurate) is 'bling-phi-3-gguf'
- bling_meets_llmware_hello_world(gguf_models[0]
+ bling_meets_llmware_hello_world(gguf_models[0])
# check out the model card on Huggingface for RAG benchmark test performance results and other useful information
```
@@ -446,20 +449,20 @@ if __name__ == "__main__":
from llmware.configs import LLMWareConfig
-# to set the collection database - mongo, sqlite, postgres
-LLMWareConfig().set_active_db("mongo")
+# to set the collection database - mongo, sqlite, postgres
+LLMWareConfig().set_active_db("mongo")
-# to set the vector database (or declare when installing)
-# --options: milvus, pg_vector (postgres), redis, qdrant, faiss, pinecone, mongo atlas
-LLMWareConfig().set_vector_db("milvus")
+# to set the vector database (or declare when installing)
+# --options: milvus, pg_vector (postgres), redis, qdrant, faiss, pinecone, mongo atlas
+LLMWareConfig().set_vector_db("milvus")
-# for fast start - no installations required
-LLMWareConfig().set_active_db("sqlite")
-LLMWareConfig().set_vector_db("chromadb") # try also faiss and lancedb
+# for fast start - no installations required
+LLMWareConfig().set_active_db("sqlite")
+LLMWareConfig().set_vector_db("chromadb") # try also faiss and lancedb
-# for single postgres deployment
-LLMWareConfig().set_active_db("postgres")
-LLMWareConfig().set_vector_db("postgres")
+# for single postgres deployment
+LLMWareConfig().set_active_db("postgres")
+LLMWareConfig().set_vector_db("postgres")
# to install mongo, milvus, postgres - see the docker-compose scripts as well as examples
@@ -469,7 +472,7 @@ LLMWareConfig().set_vector_db("postgres")
🔥 Agents with Function Calls and SLIM Models 🔥
+ 🔥 Agents with Function Calls and SLIM Models 🔥
```python
@@ -512,7 +515,7 @@ activity_summary = agent.activity_summary()
for i, entries in enumerate(agent.response_list):
print("update: response analysis: ", i, entries)
-output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal}
+output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal}
```
@@ -553,7 +556,7 @@ def contract_analysis_on_laptop (model_name):
sample_files_path = Setup().load_sample_files()
contracts_path = os.path.join(sample_files_path,"Agreements")
-
+
# Query list - these are the 3 main topics and questions that we would like the LLM to analyze for each contract
query_list = {"executive employment agreement": "What are the name of the two parties?",
@@ -586,7 +589,7 @@ def contract_analysis_on_laptop (model_name):
# step 4 above - calling the LLM with 'source' information already packaged into the prompt
- responses = prompter.prompt_with_source(value, prompt_name="default_with_context")
+ responses = prompter.prompt_with_source(value, prompt_name="default_with_context")
# step 5 above - print out to screen
@@ -609,20 +612,21 @@ def contract_analysis_on_laptop (model_name):
if __name__ == "__main__":
- # use local cpu model - try the newest - RAG finetune of Phi-3 quantized and packaged in GGUF
+ # use local cpu model - try the newest - RAG finetune of Phi-3 quantized and packaged in GGUF
model = "bling-phi-3-gguf"
contract_analysis_on_laptop(model)
```
+
Step 2 - Go to Examples - Get Started Fast with 100+ 'Cut-and-Paste' Recipes
-## 🔥 Top New Examples 🔥
+## 🔥 Top New Examples 🔥
End-to-End Scenario - [**Function Calls with SLIM Extract and Web Services for Financial Research**](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/web_services_slim_fx.py)
Analyzing Voice Files - [**Great Speeches with LLM Query and Extract**](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/parsing_great_speeches.py)
New to LLMWare - [**Fast Start tutorial series**](https://github.com/llmware-ai/llmware/tree/main/fast_start)
Getting Setup - [**Getting Started**](https://github.com/llmware-ai/llmware/tree/main/examples/Getting_Started)
-SLIM Examples - [**SLIM Models**](examples/SLIM-Agents/)
-
-| Example | Detail |
-|-------------|--------------|
-| 1. BLING models fast start ([code](examples/Models/bling_fast_start.py) / [video](https://www.youtube.com/watch?v=JjgqOZ2v5oU)) | Get started with fast, accurate, CPU-based models - question-answering, key-value extraction, and basic summarization. |
-| 2. Parse and Embed 500 PDF Documents ([code](examples/Embedding/docs2vecs_with_milvus-un_resolutions.py)) | End-to-end example for Parsing, Embedding and Querying UN Resolution documents with Milvus |
-| 3. Hybrid Retrieval - Semantic + Text ([code](examples/Retrieval/dual_pass_with_custom_filter.py)) | Using 'dual pass' retrieval to combine best of semantic and text search |
-| 4. Multiple Embeddings with PG Vector ([code](examples/Embedding/using_multiple_embeddings.py) / [video](https://www.youtube.com/watch?v=Bncvggy6m5Q)) | Comparing Multiple Embedding Models using Postgres / PG Vector |
-| 5. DRAGON GGUF Models ([code](examples/Models/dragon_gguf_fast_start.py) / [video](https://www.youtube.com/watch?v=BI1RlaIJcsc&t=130s)) | State-of-the-Art 7B RAG GGUF Models. |
-| 6. RAG with BLING ([code](examples/Use_Cases/contract_analysis_on_laptop_with_bling_models.py) / [video](https://www.youtube.com/watch?v=8aV5p3tErP0)) | Using contract analysis as an example, experiment with RAG for complex document analysis and text extraction using `llmware`'s BLING ~1B parameter GPT model running on your laptop. |
-| 7. Master Service Agreement Analysis with DRAGON ([code](examples/Use_Cases/msa_processing.py) / [video](https://www.youtube.com/watch?v=Cf-07GBZT68&t=2s)) | Analyzing MSAs using DRAGON YI 6B Model. |
-| 8. Streamlit Example ([code](examples/UI/simple_rag_ui_with_streamlit.py)) | Ask questions to Invoices with UI run inference. |
-| 9. Integrating LM Studio ([code](examples/Models/using-open-chat-models.py) / [video](https://www.youtube.com/watch?v=h2FDjUyvsKE&t=101s)) | Integrating LM Studio Models with LLMWare |
-| 10. Prompts With Sources ([code](examples/Prompts/prompt_with_sources.py)) | Attach wide range of knowledge sources directly into Prompts. |
-| 11. Fact Checking ([code](examples/Prompts/fact_checking.py)) | Explore the full set of evidence methods in this example script that analyzes a set of contracts. |
-| 12. Using 7B GGUF Chat Models ([code](examples/Models/chat_models_gguf_fast_start.py)) | Using 4 state of the art 7B chat models in minutes running locally |
+SLIM Examples - [**SLIM Models**](examples/SLIM-Agents/)
+
+| Example | Detail |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| 1. BLING models fast start ([code](examples/Models/bling_fast_start.py) / [video](https://www.youtube.com/watch?v=JjgqOZ2v5oU)) | Get started with fast, accurate, CPU-based models - question-answering, key-value extraction, and basic summarization. |
+| 2. Parse and Embed 500 PDF Documents ([code](examples/Embedding/docs2vecs_with_milvus-un_resolutions.py)) | End-to-end example for Parsing, Embedding and Querying UN Resolution documents with Milvus |
+| 3. Hybrid Retrieval - Semantic + Text ([code](examples/Retrieval/dual_pass_with_custom_filter.py)) | Using 'dual pass' retrieval to combine best of semantic and text search |
+| 4. Multiple Embeddings with PG Vector ([code](examples/Embedding/using_multiple_embeddings.py) / [video](https://www.youtube.com/watch?v=Bncvggy6m5Q)) | Comparing Multiple Embedding Models using Postgres / PG Vector |
+| 5. DRAGON GGUF Models ([code](examples/Models/dragon_gguf_fast_start.py) / [video](https://www.youtube.com/watch?v=BI1RlaIJcsc&t=130s)) | State-of-the-Art 7B RAG GGUF Models. |
+| 6. RAG with BLING ([code](examples/Use_Cases/contract_analysis_on_laptop_with_bling_models.py) / [video](https://www.youtube.com/watch?v=8aV5p3tErP0)) | Using contract analysis as an example, experiment with RAG for complex document analysis and text extraction using `llmware`'s BLING ~1B parameter GPT model running on your laptop. |
+| 7. Master Service Agreement Analysis with DRAGON ([code](examples/Use_Cases/msa_processing.py) / [video](https://www.youtube.com/watch?v=Cf-07GBZT68&t=2s)) | Analyzing MSAs using DRAGON YI 6B Model. |
+| 8. Streamlit Example ([code](examples/UI/simple_rag_ui_with_streamlit.py)) | Ask questions to Invoices with UI run inference. |
+| 9. Integrating LM Studio ([code](examples/Models/using-open-chat-models.py) / [video](https://www.youtube.com/watch?v=h2FDjUyvsKE&t=101s)) | Integrating LM Studio Models with LLMWare |
+| 10. Prompts With Sources ([code](examples/Prompts/prompt_with_sources.py)) | Attach wide range of knowledge sources directly into Prompts. |
+| 11. Fact Checking ([code](examples/Prompts/fact_checking.py)) | Explore the full set of evidence methods in this example script that analyzes a set of contracts. |
+| 12. Using 7B GGUF Chat Models ([code](examples/Models/chat_models_gguf_fast_start.py)) | Using 4 state of the art 7B chat models in minutes running locally |
+
+Check out: [llmware examples](https://github.com/llmware-ai/llmware/blob/main/examples/README.md)
-
-Check out: [llmware examples](https://github.com/llmware-ai/llmware/blob/main/examples/README.md)
-
-Step 3 - Tutorial Videos - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
-🎬 Check out these videos to get started quickly:
-- [Document Summarization](https://youtu.be/Ps3W-P9A1m8?si=Rxvst3RJv8ZaOk0L)
-- [Bling-3-GGUF Local Chatbot](https://youtu.be/gzzEVK8p3VM?si=8cNn_do0oxSzCEnM)
-- [Agent-based Complex Research Analysis](https://youtu.be/y4WvwHqRR60?si=jX3KCrKcYkM95boe)
-- [Getting Started with SLIMs (with code)](https://youtu.be/aWZFrTDmMPc?si=lmo98_quo_2Hrq0C)
-- [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx)
-- [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY)
-- [SLIM Models Intro](https://www.youtube.com/watch?v=cQfdaTcmBpY)
-- [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP)
-- [RAG with BLING on your laptop](https://www.youtube.com/watch?v=JjgqOZ2v5oU)
-- [DRAGON-7B-Models](https://www.youtube.com/watch?v=d_u7VaKu6Qk&t=37s)
-- [Install and Compare Multiple Embeddings with Postgres and PGVector](https://www.youtube.com/watch?v=Bncvggy6m5Q)
-- [Background on GGUF Quantization & DRAGON Model Example](https://www.youtube.com/watch?v=ZJyQIZNJ45E)
-- [Using LM Studio Models](https://www.youtube.com/watch?v=h2FDjUyvsKE)
-- [Using Ollama Models](https://www.youtube.com/watch?v=qITahpVDuV0)
-- [Use any GGUF Model](https://www.youtube.com/watch?v=9wXJgld7Yow)
+🎬 Check out these videos to get started quickly:
+
+- [Document Summarization](https://youtu.be/Ps3W-P9A1m8?si=Rxvst3RJv8ZaOk0L)
+- [Bling-3-GGUF Local Chatbot](https://youtu.be/gzzEVK8p3VM?si=8cNn_do0oxSzCEnM)
+- [Agent-based Complex Research Analysis](https://youtu.be/y4WvwHqRR60?si=jX3KCrKcYkM95boe)
+- [Getting Started with SLIMs (with code)](https://youtu.be/aWZFrTDmMPc?si=lmo98_quo_2Hrq0C)
+- [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx)
+- [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY)
+- [SLIM Models Intro](https://www.youtube.com/watch?v=cQfdaTcmBpY)
+- [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP)
+- [RAG with BLING on your laptop](https://www.youtube.com/watch?v=JjgqOZ2v5oU)
+- [DRAGON-7B-Models](https://www.youtube.com/watch?v=d_u7VaKu6Qk&t=37s)
+- [Install and Compare Multiple Embeddings with Postgres and PGVector](https://www.youtube.com/watch?v=Bncvggy6m5Q)
+- [Background on GGUF Quantization & DRAGON Model Example](https://www.youtube.com/watch?v=ZJyQIZNJ45E)
+- [Using LM Studio Models](https://www.youtube.com/watch?v=h2FDjUyvsKE)
+- [Using Ollama Models](https://www.youtube.com/watch?v=qITahpVDuV0)
+- [Use any GGUF Model](https://www.youtube.com/watch?v=9wXJgld7Yow)
- [Use small LLMs for RAG for Contract Analysis (feat. LLMWare)](https://www.youtube.com/watch?v=8aV5p3tErP0)
- [Invoice Processing with LLMware](https://www.youtube.com/watch?v=VHZSaBBG-Bo&t=10s)
- [Ingest PDFs at Scale](https://www.youtube.com/watch?v=O0adUfrrxi8&t=10s)
@@ -752,50 +760,50 @@ Check out: [llmware examples](https://github.com/llmware-ai/llmware/blob/main/e
- [Use Retrieval Augmented Generation (RAG) without a Database](https://www.youtube.com/watch?v=tAGz6yR14lw)
- [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s)
+Fast Start: use SQLite3 and ChromaDB (File-based) out-of-the-box - no install required
```python
-from llmware.configs import LLMWareConfig
-LLMWareConfig().set_active_db("sqlite")
-LLMWareConfig().set_vector_db("chromadb")
+from llmware.configs import LLMWareConfig
+LLMWareConfig().set_active_db("sqlite")
+LLMWareConfig().set_vector_db("chromadb")
```
-Speed + Scale: use MongoDB (text collection) and Milvus (vector db) - install with Docker Compose
+Speed + Scale: use MongoDB (text collection) and Milvus (vector db) - install with Docker Compose
```bash
curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose.yaml
@@ -808,10 +816,10 @@ LLMWareConfig().set_active_db("mongo")
LLMWareConfig().set_vector_db("milvus")
```
-Postgres: use Postgres for both text collection and vector DB - install with Docker Compose
+Postgres: use Postgres for both text collection and vector DB - install with Docker Compose
```bash
curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-pgvector.yaml
@@ -824,7 +832,7 @@ LLMWareConfig().set_active_db("postgres")
LLMWareConfig().set_vector_db("postgres")
```
-Mix-and-Match: LLMWare supports 3 text collection databases (Mongo, Postgres, SQLite) and
@@ -835,103 +843,108 @@ LLMWareConfig().set_vector_db("postgres")
curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-redis-stack.yaml
```
-
✨ Roadmap - Where are we going ...
-- 💡 Making it easy to deploy fine-tuned open source models to build state-of-the-art RAG workflows
-- 💡 Private cloud - keeping documents, data pipelines, data stores, and models safe and secure
-- 💡 Model quantization, especially GGUF, and democratizing the game-changing use of 1-9B CPU-based LLMs
-- 💡 Developing small specialized RAG optimized LLMs between 1B-9B parameters
-- 💡 Industry-specific LLMs, embedding models and processes to support core knowledge-based use cases
-- 💡 Enterprise scalability - containerization, worker deployments and Kubernetes
-- 💡 Integration of SQL and other scale enterprise data sources
-- 💡 Multi-step, multi-model Agent-based workflows with small, specialized function-calling models
+- 💡 Making it easy to deploy fine-tuned open source models to build state-of-the-art RAG workflows
+- 💡 Private cloud - keeping documents, data pipelines, data stores, and models safe and secure
+- 💡 Model quantization, especially GGUF, and democratizing the game-changing use of 1-9B CPU-based LLMs
+- 💡 Developing small specialized RAG optimized LLMs between 1B-9B parameters
+- 💡 Industry-specific LLMs, embedding models and processes to support core knowledge-based use cases
+- 💡 Enterprise scalability - containerization, worker deployments and Kubernetes
+- 💡 Integration of SQL and other scale enterprise data sources
+- 💡 Multi-step, multi-model Agent-based workflows with small, specialized function-calling models
-Like our models, we aspire for llmware to be "small, but mighty" - easy to use and get started, but packing a powerful punch!
+Like our models, we aspire for llmware to be "small, but mighty" - easy to use and get started, but packing a powerful punch!
Optional
- [Docker](https://docs.docker.com/get-docker/)
-
- To enable the OCR parsing capabilities, install [Tesseract v5.3.3](https://tesseract-ocr.github.io/tessdoc/Installation.html) and [Poppler v23.10.0](https://poppler.freedesktop.org/) native packages.
π§ Change Log
-**Monday, July 29 - v03.4**
-- Enhanced safety protections for text2sql db reads for LLMfx agents
-- New examples - see [example](https://github.com/llmware-ai/llmware/blob/main/examples/UI/dueling_chatbot.py)
-- More Notebook examples - see [notebook examples](https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks)
-
-**Monday, July 8 - v03.3**
-- Improvements in model configuration options, logging, and various small fixes
-- Improved Azure OpenAI configs - see [example](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using-azure-openai.py)
-
-**Saturday, June 29 - v0.3.2**
-- Update to PDF and Office parsers - improvements to configurations in logging and text chunking options
-
-**Saturday, June 22 - v0.3.1**
-- Added module 3 to Fast Start example series [examples 7-9 on Agents & Function Calls](https://github.com/llmware-ai/llmware/tree/main/fast_start)
-- Added reranker Jina model for in-memory semantic similarity RAG - see [example](https://github.com/llmware-ai/llmware/tree/main/examples/Embedding/using_semantic_reranker_with_rag.py)
-- Enhanced model fetching parameterization in model loading process
-- Added new 'tiny' versions of slim-extract and slim-summary in both Pytorch and GGUF versions - check out 'slim-extract-tiny-tool' and 'slim-summary-tiny-tool'
-- [Biz Bot] use case - see [example](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/biz_bot.py) and [video](https://youtu.be/4nBYDEjxxTE?si=o6PDPbu0PVcT-tYd)
-- Updated numpy reqs <2 and updated yfinance version minimum (>=0.2.38)
-
-**Tuesday, June 4 - v0.3.0**
-- Added support for new Milvus Lite embedded 'no-install' database - see [example](https://github.com/llmware-ai/llmware/tree/main/examples/Embedding/using_milvus_lite.py).
-- Added two new SLIM models to catalog and agent processes - ['q-gen'](https://github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents/using-slim-q-gen.py) and ['qa-gen'](https://github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents/using-slim-qa-gen.py)
-- Updated model class instantiation to provide more extensibility to add new classes in different modules
-- New welcome_to_llmware.sh and welcome_to_llmware_windows.sh fast install scripts
-- Enhanced Model class base with new configurable post_init and register methods
-- Created InferenceHistory to track global state of all inferences completed
-- Multiple improvements and updates to logging at module level
-- Note: starting with v0.3.0, pip install provides two options - a base minimal install `pip3 install llmware` which will support most use cases, and a larger install `pip3 install 'llmware[full]'` with other commonly-used libraries.
-
-**Wednesday, May 22 - v0.2.15**
-- Improvements in Model class handling of Pytorch and Transformers dependencies (just-in-time loading, if needed)
-- Expanding API endpoint options and inference server functionality - see new [client access options](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/llmware_inference_api_client.py) and [server_launch](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/llmware_inference_server.py)
-
-**Saturday, May 18 - v0.2.14**
-- New OCR image parsing methods with [example](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/slicing_and_dicing_office_docs.py)
-- Adding first part of logging improvements (WIP) in Configs and Models.
-- New embedding model added to catalog - industry-bert-loans.
-- Updates to model import methods and configurations.
-
-**Sunday, May 12 - v0.2.13**
-- New GGUF streaming method with [basic example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/gguf_streaming.py) and [phi3 local chatbot](https://github.com/llmware-ai/llmware/tree/main/examples/UI/gguf_streaming_chatbot.py)
-- Significant cleanups in ancillary imports and dependencies to reduce install complexity - note: the updated requirements.txt and setup.py files.
-- Defensive code to provide informative warning of any missing dependencies in specialized parts of the code, e.g., OCR, Web Parser.
-- Updates of tests, notice and documentation.
-- OpenAIConfigs created to support Azure OpenAI.
-
-**Sunday, May 5 - v0.2.12 Update**
-- Launched ["bling-phi-3"](https://huggingface.co/llmware/bling-phi-3) and ["bling-phi-3-gguf"](https://huggingface.co/llmware/bling-phi-3-gguf) in ModelCatalog - newest and most accurate BLING/DRAGON model
-- New long document summarization method using slim-summary-tool [example](https://github.com/llmware-ai/llmware/tree/main/examples/Prompts/document_summarizer.py)
-- New Office (Powerpoint, Word, Excel) sample files [example](https://github.com/llmware-ai/llmware/tree/main/examples/Parsing/parsing_microsoft_ir_docs.py)
-- Added support for Python 3.12
-- Deprecated faiss and replaced with 'no-install' chromadb in Fast Start examples
-- Refactored Datasets, Graph and Web Services classes
-- Updated Voice parsing with WhisperCPP into Library
-
-**Monday, April 29 - v0.2.11 Update**
-- Updates to gguf libs for Phi-3 and Llama-3
-- Added Phi-3 [example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/using-microsoft-phi-3.py) and Llama-3 [example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/using-llama-3.py) and Quantized Versions to Model Catalog
-- Integrated WhisperCPP Model class and prebuilt shared libraries - [getting-started-example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/using-whisper-cpp-getting-started.py)
-- New voice sample files for testing - [example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/using-whisper-cpp-sample-files.py)
-- Improved CUDA detection on Windows and safety checks for older Mac OS versions
-
-**Monday, April 22 - v0.2.10 Update**
-- Updates to Agent class to support Natural Language queries of Custom Tables on Postgres [example](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/agent_with_custom_tables.py)
-- New Agent API endpoint implemented with LLMWare Inference Server and new Agent capabilities [example](https://github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents/agent_api_endpoint.py)
-
-**Tuesday, April 16 - v0.2.9 Update**
-- New CustomTable class to rapidly create custom DB tables in conjunction with LLM-based workflows.
-- Enhanced methods for converting CSV and JSON/JSONL files into DB tables.
+**Monday, July 29 - v0.3.4**
+
+- Enhanced safety protections for text2sql db reads for LLMfx agents
+- New examples - see [example](https://github.com/llmware-ai/llmware/blob/main/examples/UI/dueling_chatbot.py)
+- More Notebook examples - see [notebook examples](https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks)
+
+**Monday, July 8 - v0.3.3**
+
+- Improvements in model configuration options, logging, and various small fixes
+- Improved Azure OpenAI configs - see [example](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using-azure-openai.py)
+
+**Saturday, June 29 - v0.3.2**
+
+- Update to PDF and Office parsers - improvements to configurations in logging and text chunking options
+
+**Saturday, June 22 - v0.3.1**
+
+- Added module 3 to Fast Start example series [examples 7-9 on Agents & Function Calls](https://github.com/llmware-ai/llmware/tree/main/fast_start)
+- Added reranker Jina model for in-memory semantic similarity RAG - see [example](https://github.com/llmware-ai/llmware/tree/main/examples/Embedding/using_semantic_reranker_with_rag.py)
+- Enhanced model fetching parameterization in model loading process
+- Added new 'tiny' versions of slim-extract and slim-summary in both Pytorch and GGUF versions - check out 'slim-extract-tiny-tool' and 'slim-summary-tiny-tool'
+- [Biz Bot] use case - see [example](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/biz_bot.py) and [video](https://youtu.be/4nBYDEjxxTE?si=o6PDPbu0PVcT-tYd)
+- Updated numpy reqs <2 and updated yfinance version minimum (>=0.2.38)
+
+**Tuesday, June 4 - v0.3.0**
+
+- Added support for new Milvus Lite embedded 'no-install' database - see [example](https://github.com/llmware-ai/llmware/tree/main/examples/Embedding/using_milvus_lite.py).
+- Added two new SLIM models to catalog and agent processes - ['q-gen'](https://github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents/using-slim-q-gen.py) and ['qa-gen'](https://github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents/using-slim-qa-gen.py)
+- Updated model class instantiation to provide more extensibility to add new classes in different modules
+- New welcome_to_llmware.sh and welcome_to_llmware_windows.sh fast install scripts
+- Enhanced Model class base with new configurable post_init and register methods
+- Created InferenceHistory to track global state of all inferences completed
+- Multiple improvements and updates to logging at module level
+- Note: starting with v0.3.0, pip install provides two options - a base minimal install `pip3 install llmware` which will support most use cases, and a larger install `pip3 install 'llmware[full]'` with other commonly-used libraries.
+
+**Wednesday, May 22 - v0.2.15**
+
+- Improvements in Model class handling of Pytorch and Transformers dependencies (just-in-time loading, if needed)
+- Expanding API endpoint options and inference server functionality - see new [client access options](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/llmware_inference_api_client.py) and [server_launch](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/llmware_inference_server.py)
+
+**Saturday, May 18 - v0.2.14**
+
+- New OCR image parsing methods with [example](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/slicing_and_dicing_office_docs.py)
+- Adding first part of logging improvements (WIP) in Configs and Models.
+- New embedding model added to catalog - industry-bert-loans.
+- Updates to model import methods and configurations.
+
+**Sunday, May 12 - v0.2.13**
+
+- New GGUF streaming method with [basic example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/gguf_streaming.py) and [phi3 local chatbot](https://github.com/llmware-ai/llmware/tree/main/examples/UI/gguf_streaming_chatbot.py)
+- Significant cleanups in ancillary imports and dependencies to reduce install complexity - note: the updated requirements.txt and setup.py files.
+- Defensive code to provide informative warning of any missing dependencies in specialized parts of the code, e.g., OCR, Web Parser.
+- Updates of tests, notice and documentation.
+- OpenAIConfigs created to support Azure OpenAI.
+
+**Sunday, May 5 - v0.2.12 Update**
+
+- Launched ["bling-phi-3"](https://huggingface.co/llmware/bling-phi-3) and ["bling-phi-3-gguf"](https://huggingface.co/llmware/bling-phi-3-gguf) in ModelCatalog - newest and most accurate BLING/DRAGON model
+- New long document summarization method using slim-summary-tool [example](https://github.com/llmware-ai/llmware/tree/main/examples/Prompts/document_summarizer.py)
+- New Office (Powerpoint, Word, Excel) sample files [example](https://github.com/llmware-ai/llmware/tree/main/examples/Parsing/parsing_microsoft_ir_docs.py)
+- Added support for Python 3.12
+- Deprecated faiss and replaced with 'no-install' chromadb in Fast Start examples
+- Refactored Datasets, Graph and Web Services classes
+- Updated Voice parsing with WhisperCPP into Library
+
+**Monday, April 29 - v0.2.11 Update**
+
+- Updates to gguf libs for Phi-3 and Llama-3
+- Added Phi-3 [example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/using-microsoft-phi-3.py) and Llama-3 [example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/using-llama-3.py) and Quantized Versions to Model Catalog
+- Integrated WhisperCPP Model class and prebuilt shared libraries - [getting-started-example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/using-whisper-cpp-getting-started.py)
+- New voice sample files for testing - [example](https://github.com/llmware-ai/llmware/tree/main/examples/Models/using-whisper-cpp-sample-files.py)
+- Improved CUDA detection on Windows and safety checks for older Mac OS versions
+
+**Monday, April 22 - v0.2.10 Update**
+
+- Updates to Agent class to support Natural Language queries of Custom Tables on Postgres [example](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/agent_with_custom_tables.py)
+- New Agent API endpoint implemented with LLMWare Inference Server and new Agent capabilities [example](https://github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents/agent_api_endpoint.py)
+
+**Tuesday, April 16 - v0.2.9 Update**
+
+- New CustomTable class to rapidly create custom DB tables in conjunction with LLM-based workflows.
+- Enhanced methods for converting CSV and JSON/JSONL files into DB tables.
- See new examples [Creating Custom Table example](https://github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/create_custom_table-1.py)
-
-**Tuesday, April 9 - v0.2.8 Update**
-- Office Parser (Word Docx, Powerpoint PPTX, and Excel XLSX) - multiple improvements - new libs + Python method.
-- Includes: several fixes, improved text chunking controls, header text extraction and configuration options.
-- Generally, new office parser options conform with the new PDF parser options.
-- Please see [Office Parsing Configs example](https://github.com/llmware-ai/llmware/tree/main/examples/Parsing/office_parser_new_configs.py)
-
-**Wednesday, April 3 - v0.2.7 Update**
-- PDF Parser - multiple improvements - new libs + Python methods.
-- Includes: UTF-8 encoding for European languages.
-- Includes: Better text chunking controls, header text extraction and configuration options.
-- Please see [PDF Parsing Configs example](https://github.com/llmware-ai/llmware/tree/main/examples/Parsing/pdf_parser_new_configs.py) for more details.
-- Note: deprecating support for aarch64-linux (will use 0.2.6 parsers). Full support going forward for Linux Ubuntu20+ on x86_64 + with CUDA.
-
-**Friday, March 22 - v0.2.6 Update**
-- New SLIM models: summary, extract, xsum, boolean, tags-3b, and combo sentiment-ner.
-- New logit and sampling analytics.
-- New SLIM examples showing how to use the new models.
-
-**Thursday, March 14 - v0.2.5 Update**
-- Improved support for GGUF on CUDA (Windows and Linux), with new prebuilt binaries and exception handling.
-- Enhanced model configuration options (sampling, temperature, top logit capture).
-- Added full back-level support for Ubuntu 20+ with parsers and GGUF engine.
-- Support for new Anthropic Claude 3 models.
-- New retrieval methods: document_lookup and aggregate_text.
-- New model: bling-stablelm-3b-tool - fast, accurate 3b quantized question-answering model - one of our new favorites.
-
-**Wednesday, February 28 - v0.2.4 Update**
+
+**Tuesday, April 9 - v0.2.8 Update**
+
+- Office Parser (Word Docx, Powerpoint PPTX, and Excel XLSX) - multiple improvements - new libs + Python method.
+- Includes: several fixes, improved text chunking controls, header text extraction and configuration options.
+- Generally, new office parser options conform with the new PDF parser options.
+- Please see [Office Parsing Configs example](https://github.com/llmware-ai/llmware/tree/main/examples/Parsing/office_parser_new_configs.py)
+
+**Wednesday, April 3 - v0.2.7 Update**
+
+- PDF Parser - multiple improvements - new libs + Python methods.
+- Includes: UTF-8 encoding for European languages.
+- Includes: Better text chunking controls, header text extraction and configuration options.
+- Please see [PDF Parsing Configs example](https://github.com/llmware-ai/llmware/tree/main/examples/Parsing/pdf_parser_new_configs.py) for more details.
+- Note: deprecating support for aarch64-linux (will use 0.2.6 parsers). Full support going forward for Linux Ubuntu20+ on x86_64 + with CUDA.
+
+**Friday, March 22 - v0.2.6 Update**
+
+- New SLIM models: summary, extract, xsum, boolean, tags-3b, and combo sentiment-ner.
+- New logit and sampling analytics.
+- New SLIM examples showing how to use the new models.
+
+**Thursday, March 14 - v0.2.5 Update**
+
+- Improved support for GGUF on CUDA (Windows and Linux), with new prebuilt binaries and exception handling.
+- Enhanced model configuration options (sampling, temperature, top logit capture).
+- Added full back-level support for Ubuntu 20+ with parsers and GGUF engine.
+- Support for new Anthropic Claude 3 models.
+- New retrieval methods: document_lookup and aggregate_text.
+- New model: bling-stablelm-3b-tool - fast, accurate 3b quantized question-answering model - one of our new favorites.
+
+**Wednesday, February 28 - v0.2.4 Update**
+
- Major upgrade of GGUF Generative Model class - support for Stable-LM-3B, CUDA build options, and better control over sampling strategies.
-- Note: new GGUF llama.cpp built libs packaged with build starting in v0.2.4.
-- Improved GPU support for HF Embedding Models.
-
-**Friday, February 16 - v0.2.3 Update**
-- Added 10+ embedding models to ModelCatalog - nomic, jina, bge, gte, ember and uae-large.
-- Updated OpenAI support >=1.0 and new text-3 embedding models.
-- SLIM model keys and output_values now accessible in ModelCatalog.
-- Updating encodings to 'utf-8-sig' to better handle txt/csv files with bom.
+- Note: new GGUF llama.cpp built libs packaged with build starting in v0.2.4.
+- Improved GPU support for HF Embedding Models.
+
+**Friday, February 16 - v0.2.3 Update**
+
+- Added 10+ embedding models to ModelCatalog - nomic, jina, bge, gte, ember and uae-large.
+- Updated OpenAI support >=1.0 and new text-3 embedding models.
+- SLIM model keys and output_values now accessible in ModelCatalog.
+- Updating encodings to 'utf-8-sig' to better handle txt/csv files with bom.
**Latest Updates - 19 Jan 2024 - llmware v0.2.0**
- - Added new database integration options - Postgres and SQlite
- - Improved status update and parser event logging options for parallelized parsing
- - Significant enhancements to interactions between Embedding + Text collection databases
- - Improved error exception handling in loading dynamic modules
+
+- Added new database integration options - Postgres and SQLite
+- Improved status update and parser event logging options for parallelized parsing
+- Significant enhancements to interactions between Embedding + Text collection databases
+- Improved error exception handling in loading dynamic modules
**Latest Updates - 15 Jan 2024: llmware v0.1.15**
- - Enhancements to dual pass retrieval queries
- - Expanded configuration objects and options for endpoint resources
-
+
+- Enhancements to dual pass retrieval queries
+- Expanded configuration objects and options for endpoint resources
+
**Latest Updates - 30 Dec 2023: llmware v0.1.14**
- - Added support for Open Chat inference servers (compatible with OpenAI API)
- - Improved capabilities for multiple embedding models and vector DB configurations
- - Added docker-compose install scripts for PGVector and Redis vector databases
- - Added 'bling-tiny-llama' to model catalog
-
+
+- Added support for Open Chat inference servers (compatible with OpenAI API)
+- Improved capabilities for multiple embedding models and vector DB configurations
+- Added docker-compose install scripts for PGVector and Redis vector databases
+- Added 'bling-tiny-llama' to model catalog
+
**Latest Updates - 22 Dec 2023: llmware v0.1.13**
- - Added 3 new vector databases - Postgres (PG Vector), Redis, and Qdrant
- - Improved support for integrating sentence transformers directly in the model catalog
- - Improvements in the model catalog attributes
- - Multiple new Examples in Models & Embeddings, including GGUF, Vector database, and model catalog
+
+- Added 3 new vector databases - Postgres (PG Vector), Redis, and Qdrant
+- Improved support for integrating sentence transformers directly in the model catalog
+- Improvements in the model catalog attributes
+- Multiple new Examples in Models & Embeddings, including GGUF, Vector database, and model catalog
- **17 Dec 2023: llmware v0.1.12**
- dragon-deci-7b added to catalog - RAG-finetuned model on high-performance new 7B model base from Deci
@@ -1076,8 +1111,8 @@ For complete history of release notes, please open the Change log tab.
- Adding prebuilt llama_cpp / ctransformer shared libraries for Mac M1, Mac x86, Linux x86 and Windows
- 3 DRAGON models packaged as Q4_K_M GGUF models for CPU laptop use (dragon-mistral-7b, dragon-llama-7b, dragon-yi-6b)
- 4 leading open source chat models added to default catalog with Q4_K_M
-
- **8 Dec 2023: llmware v0.1.11**
+
- New fast start examples for high volume Document Ingestion and Embeddings with Milvus.
- New LLMWare 'Pop up' Inference Server model class and example script.
- New Invoice Processing example for RAG.
@@ -1085,11 +1120,13 @@ For complete history of release notes, please open the Change log tab.
- Enhancing debugging log output mode options for PDF and Office parsers.
- **30 Nov 2023: llmware v0.1.10**
+
- Windows added as a supported operating system.
- - Further enhancements to native code for stack management.
+ - Further enhancements to native code for stack management.
- Minor defect fixes.
- **24 Nov 2023: llmware v0.1.9**
+
- Markdown (.md) files are now parsed and treated as text files.
- PDF and Office parser stack optimizations which should avoid the need to set ulimit -s.
- New llmware_models_fast_start.py example that allows discovery and selection of all llmware HuggingFace models.
@@ -1102,9 +1139,9 @@ For complete history of release notes, please open the Change log tab.
- Improved handling for eos_token_id for llama2 and mistral.
- Improved support for Hugging Face dynamic loading
- New examples with the new llmware DRAGON models.
-
- **14 Nov 2023: llmware v0.1.7**
- - Moved to Python Wheel package format for PyPi distribution to provide seamless installation of native dependencies on all supported platforms.
+
+ - Moved to Python Wheel package format for PyPi distribution to provide seamless installation of native dependencies on all supported platforms.
- ModelCatalog enhancements:
     - OpenAI update to include newly announced “turbo” 4 and 3.5 models.
- Cohere embedding v3 update to include new Cohere embedding models.
@@ -1114,12 +1151,13 @@ For complete history of release notes, please open the Change log tab.
     - “evidence_metadata” added to prompt_main output dictionaries, allowing prompt_main responses to be plugged into the evidence and fact-checking steps without modification.
     - API key can now be passed directly in a prompt.load_model(model_name, api_key = "[my-api-key]")
- LLMWareInference Server - Initial delivery:
- - New Class for LLMWareModel which is a wrapper on a custom HF-style API-based model.
- - LLMWareInferenceServer is a new class that can be instantiated on a remote (GPU) server to create a testing API-server that can be integrated into any Prompt workflow.
-
+ - New Class for LLMWareModel which is a wrapper on a custom HF-style API-based model.
+ - LLMWareInferenceServer is a new class that can be instantiated on a remote (GPU) server to create a testing API-server that can be integrated into any Prompt workflow.
+
- **03 Nov 2023: llmware v0.1.6**
+
- Updated packaging to require mongo-c-driver 1.24.4 to temporarily workaround segmentation fault with mongo-c-driver 1.25.
- - Updates in python code needed in anticipation of future Windows support.
+ - Updates in python code needed in anticipation of future Windows support.
- **27 Oct 2023: llmware v0.1.5**
- Four new example scripts focused on RAG workflows with small, fine-tuned instruct models that run on a laptop (`llmware` [BLING](https://huggingface.co/llmware) models).
@@ -1137,15 +1175,14 @@ For complete history of release notes, please open the Change log tab.
- MongoDB Atlas Vector Search support.
- Support for authentication using a MongoDB connection string.
- Document summarization methods.
- - Improvements in capturing the model context window automatically and passing changes in the expected output length.
+ - Improvements in capturing the model context window automatically and passing changes in the expected output length.
- Dataset card and description with lookup by name.
- Processing time added to model inference usage dictionary.
- Additional test scripts, examples, and defect fixes.
- **06 Oct 2023: llmware v0.1.1**
- Added test scripts to the github repository for regression testing.
- Minor defect fixes and version update of Pillow to address [dependency security alert](https://github.com/llmware-ai/llmware/security/dependabot/1).
-- **02 Oct 2023: llmware v0.1.0** 🔥 Initial release of llmware to open source!! 🔥
-
+- **02 Oct 2023: llmware v0.1.0** 🔥 Initial release of llmware to open source!! 🔥
@@ -1154,21 +1191,20 @@ For complete history of release notes, please open the Change log tab.
 ## π€ Read our White Papers
-
 - **Revolutionizing AI Deployment: Unleashing AI Acceleration with Intel's AI PCs and Model HQ by LLMWare** [AI PC Model HQ.pdf](https://github.com/user-attachments/files/18024139/AI.PC.Model.HQ.pdf)
-- **Revultionizing AI Deployment (Intel Abstract Version)** [LNL White paper (Abstract Version) final.pdf](https://github.com/user-attachments/files/18281644/LNL.White.paper.Abstract.Version.final.pdf)
+- **Revolutionizing AI Deployment (Intel Abstract Version)** [LNL White paper (Abstract Version) final.pdf](https://github.com/user-attachments/files/18281644/LNL.White.paper.Abstract.Version.final.pdf)
 - **Accelerating AI Powered Productivity with AI PCs** [Laptop.Performance.WP.Final (10).pdf](https://github.com/user-attachments/files/18024294/Laptop.Performance.WP.Final.10.pdf)
 ## Intel Joint Solutions
-- **Arrow Lake**
-[IPA.Optimization.Summary.LLMWare (1).pdf](https://github.com/user-attachments/files/18292873/IPA.Optimization.Summary.LLMWare.1.pdf)
+- **Arrow Lake**
+  [IPA.Optimization.Summary.LLMWare (1).pdf](https://github.com/user-attachments/files/18292873/IPA.Optimization.Summary.LLMWare.1.pdf)
 ## About Model HQ
- - **Privacy Policy** [AI.BLOKS.PRIVACY.POLICY.1.3.25.pdf](https://github.com/user-attachments/files/19289355/AI.BLOKS.PRIVACY.POLICY.1.3.25.pdf)
+
+- **Privacy Policy** [AI.BLOKS.PRIVACY.POLICY.1.3.25.pdf](https://github.com/user-attachments/files/19289355/AI.BLOKS.PRIVACY.POLICY.1.3.25.pdf)
 - **Terms of Service** [AI.Bloks.Terms.of.Service.3.3.25.pdf](https://github.com/user-attachments/files/19289545/AI.Bloks.Terms.of.Service.3.3.25.pdf)
 - **Acceptable Use Policy**[Acceptable Use Policy for Model HQ by AI BLOKS LLC.docx](https://github.com/user-attachments/files/18291481/Acceptable.Use.Policy.for.Model.HQ.by.AI.BLOKS.LLC.docx)
-