DSPy
Indexify complements DSPy by providing a robust platform for indexing large volume of multi-modal content such as PDFs, raw text, audio and video. It provides a retriever API to retrieve context for LLMs.
We provide a Indexify Retreival Module for DSPy, that works with DSPy.
It will act as a wrapper that internally uses your indexify client for indexing and querying but externally leverages the modularity and design of DSPy.
Install the Indexify DSPy retriever package -
Import the necessary libraries
Instantiate the Retreival Model
You can create a Retreival Model to retrieve from an index mantained by Indexify. Use the DSPy settings to confgiure the retriever model.
turbo = dspy.OpenAI(model="gpt-3.5-turbo")
indexify_client = IndexifyClient()
indexify_retriever_model = IndexifyRM("index_name", indexify_client, k=3)
dspy.settings.configure(lm=turbo, rm=indexify_retriever_model)
Using the Retreival Model is very simple
retrieve = dspy.Retrieve(k=3)
question = "Who are the NBA Finals MVPs"
topK_passages = retrieve(question).passages
Create an indexify client and populate it with some documents
indexify_client = IndexifyClient()
extraction_graph_spec = """
name: 'myextractiongraph'
extraction_policies:
- extractor: 'tensorlake/minilm-l6'
name: 'minilml6'
"""
extraction_graph = ExtractionGraph.from_yaml(extraction_graph_spec)
client.create_extraction_graph(extraction_graph)
indexify_client.add_documents(
"myextractiongraph",
[
"Indexify is amazing!",
"Indexify is a retrieval service for LLM agents!",
"Steph Curry is the best basketball player in the world.",
],
)
Initialize the IndexifyRM class
Using the RM class
retrieve = IndexifyRM(indexify_client)
topk_passages = retrieve("Sports", "myextractiongraph.minilml6.embedding", k=2).passages
print(topk_passages)
Setting up DSPy Module with Indexify
You can use IndexifyRM like any other DSPy module or build your own wrapper for retrieval using the Indexify client following this example.