In the program below, we will explore the prompt-compression capabilities of LLMLingua.
!pip install llmlingua llama-index
# Using the OpenAI API
import openai
openai.api_key = "<YOUR_OPENAI_API_KEY>"  # never commit a real key; use an environment variable
!wget "https://www.dropbox.com/s/f6bmb19xdg0xedm/paul_graham_essay.txt?dl=1" -O paul_graham_essay.txt
from llama_index import (
VectorStoreIndex,
SimpleDirectoryReader,
load_index_from_storage,
StorageContext,
)
# load documents
documents = SimpleDirectoryReader(
input_files=["paul_graham_essay.txt"]
).load_data()
index = VectorStoreIndex.from_documents(documents)
retriever = index.as_retriever(similarity_top_k=10)
question = "Where did the author go for art school?"
answer = "RISD"
contexts = retriever.retrieve(question)
context_list = [n.get_content() for n in contexts]
len(context_list)
# Output: 10
from llama_index.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo-16k")
prompt = "\n\n".join(context_list + [question])
response = llm.complete(prompt)
print(str(response))
The author went to the Rhode Island School of Design (RISD) for art school.
Setup LLMLingua
# Setup LLMLingua
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.response_synthesizers import CompactAndRefine
from llama_index.indices.postprocessor import LongLLMLinguaPostprocessor
node_postprocessor = LongLLMLinguaPostprocessor(
instruction_str="Given the context, please answer the final question",
target_token=300,
rank_method="longllmlingua",
additional_compress_kwargs={
"condition_compare": True,
"condition_in_question": "after",
"context_budget": "+100",
"reorder_context": "sort", # enable document reorder,
"dynamic_context_compression_ratio": 0.3,
},
)
retrieved_nodes = retriever.retrieve(question)
synthesizer = CompactAndRefine()
from llama_index.indices.query.schema import QueryBundle
# outline steps in RetrieverQueryEngine for clarity:
# postprocess (compress), synthesize
new_retrieved_nodes = node_postprocessor.postprocess_nodes(
retrieved_nodes, query_bundle=QueryBundle(query_str=question)
)
original_contexts = "\n\n".join([n.get_content() for n in retrieved_nodes])
compressed_contexts = "\n\n".join([n.get_content() for n in new_retrieved_nodes])
original_tokens = node_postprocessor._llm_lingua.get_token_length(original_contexts)
compressed_tokens = node_postprocessor._llm_lingua.get_token_length(compressed_contexts)
print(compressed_contexts)
print()
print("Original Tokens:", original_tokens)
print("Compressed Tokens:", compressed_tokens)
print("Compressed Ratio:", f"{original_tokens/(compressed_tokens + 1e-5):.2f}x")
What should I do next? Rtm's advice hadn't included anything about that. I wanted to do something completely different, so I decided I'd paint. I wanted to see how good I could get if I focused on it. So the day after I stopped working on YC, I started painting. I was rusty and it took a while to get back into shape, but it was at least completely engaging. [18]
Our Ulivi, was a guy. He could see I worked hard, and gave me, wrote down in a sort of pass each student But Accademia wasn't me anything Italian, and my money was running out, so at the end of the first year I back to US
I wanted back to RISD, but I was now broke and RISD very expensive decided to a job for year return RISD the I got one at called Interleaf, which made software. You Microsoft Word? Exactly That was learned end software tends to high. But Interleaf still had a few years to live. [] in ID, but was basically myself to I for free99 I out around my friend Nancy Parmet did big Aled building in York becomingant. Did I It wasn my place was be where the. wanted it! [7]
Original Tokens: 10703
Compressed Tokens: 275
Compressed Ratio: 38.92x
response = synthesizer.synthesize(question, new_retrieved_nodes)
print(str(response))
# Output: The author went to RISD for art school.
retriever_query_engine = RetrieverQueryEngine.from_args(
retriever, node_postprocessors=[node_postprocessor]
)
response = retriever_query_engine.query(question)
print(str(response))
# Output: The author went to RISD for art school.
We can see that the original 10,703 tokens were compressed down to just 275, a 38.92x compression ratio, yet the final answer remained the same.
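The reported ratio follows directly from the two token counts. A minimal sketch reproducing that arithmetic (the helper name `compression_ratio` is hypothetical, not part of LLMLingua):

```python
def compression_ratio(original_tokens: int, compressed_tokens: int) -> str:
    """Format the compression ratio the same way the script above prints it."""
    # The small epsilon guards against division by zero if compression removed everything.
    return f"{original_tokens / (compressed_tokens + 1e-5):.2f}x"

# Token counts reported in the run above: 10703 original, 275 compressed.
print(compression_ratio(10703, 275))  # → 38.92x
```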
The full program in one place:
!pip install llmlingua llama-index
# Using the OpenAI API
import openai
openai.api_key = "<YOUR_OPENAI_API_KEY>"  # never commit a real key; use an environment variable
!wget "https://www.dropbox.com/s/f6bmb19xdg0xedm/paul_graham_essay.txt?dl=1" -O paul_graham_essay.txt
from llama_index import (
VectorStoreIndex,
SimpleDirectoryReader,
load_index_from_storage,
StorageContext,
)
# load documents
documents = SimpleDirectoryReader(
input_files=["paul_graham_essay.txt"]
).load_data()
index = VectorStoreIndex.from_documents(documents)
retriever = index.as_retriever(similarity_top_k=10)
question = "Where did the author go for art school?"
answer = "RISD"
contexts = retriever.retrieve(question)
context_list = [n.get_content() for n in contexts]
len(context_list)
from llama_index.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo-16k")
prompt = "\n\n".join(context_list + [question])
response = llm.complete(prompt)
print(str(response))
# Setup LLMLingua
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.response_synthesizers import CompactAndRefine
from llama_index.indices.postprocessor import LongLLMLinguaPostprocessor
node_postprocessor = LongLLMLinguaPostprocessor(
instruction_str="Given the context, please answer the final question",
target_token=300,
rank_method="longllmlingua",
additional_compress_kwargs={
"condition_compare": True,
"condition_in_question": "after",
"context_budget": "+100",
"reorder_context": "sort", # enable document reorder,
"dynamic_context_compression_ratio": 0.3,
},
)
retrieved_nodes = retriever.retrieve(question)
synthesizer = CompactAndRefine()
from llama_index.indices.query.schema import QueryBundle
# outline steps in RetrieverQueryEngine for clarity:
# postprocess (compress), synthesize
new_retrieved_nodes = node_postprocessor.postprocess_nodes(
retrieved_nodes, query_bundle=QueryBundle(query_str=question)
)
original_contexts = "\n\n".join([n.get_content() for n in retrieved_nodes])
compressed_contexts = "\n\n".join([n.get_content() for n in new_retrieved_nodes])
original_tokens = node_postprocessor._llm_lingua.get_token_length(original_contexts)
compressed_tokens = node_postprocessor._llm_lingua.get_token_length(compressed_contexts)
print(compressed_contexts)
print()
print("Original Tokens:", original_tokens)
print("Compressed Tokens:", compressed_tokens)
print("Compressed Ratio:", f"{original_tokens/(compressed_tokens + 1e-5):.2f}x")
response = synthesizer.synthesize(question, new_retrieved_nodes)
print(str(response))
retriever_query_engine = RetrieverQueryEngine.from_args(
retriever, node_postprocessors=[node_postprocessor]
)
response = retriever_query_engine.query(question)
print(str(response))