[MODEL I/O - Prompts] Prompt templates - 1
- 출처 : https://python.langchain.com/docs/modules/model_io/
- 이 블로그 글은 LangChain API document의 글을 기반으로 번역되었으며 이 과정에서 약간의 내용이 추가되었습니다.
- MODEL I/O는 Prompts, Language Model, Output Parser로 이루어져 있습니다.
- 본 글에서는 Custom Prompt Template과 Few-shot Prompt Template를 다룹니다.
1. Custom prompt template
이제 LLM이 이름으로 어떤 함수명이 주어졌을 때, 이 함수에 대한 설명을 생성하기를 원한다고 가정해 봅시다. 이를 해내기 위해서 우리는 함수명을 입력으로 받아 함수의 source code를 프롬프트를 제공하는 프롬프트 템플릿을 만들어야합니다.
LangChain은 다양한 프롬프트 템플릿을 제공하지만 위의 경우처럼, 사용자는 제공된 것 이상의 것이 필요할 수 있습니다.
이를 위해 LangChain에서는 Custom prompt template을 제작할 수 있도록 적절한 클래스를 제공하고 있습니다.
1 - 1. Creating a Custom Prompt Template
LangChain은 사용할 모델에 따라 두 개의 기본적인 템플릿을 제공하고 있습니다. 하나는 string prompt template이고, 다른 하나는 chat prompt template입니다. string prompt template은 단순히 string 형태의 프롬프트를 제공하고, chat prompt template은 Chat API에 적합한 형태의 구조적인 프롬프트를 제공합니다.
이번 세션에선 string prompt template을 사용하여 Custom Prompt Template을 만듭니다.
함수명을 입력으로 받아 소스코드를 프롬프트로 변환하는 프롬프트 템플릿을 만들려면, 먼저 함수명을 받아 소스코드를 반환하는 함수가 필요합니다.
import inspect
def get_source_code(function_name):
return inspect.getsource(function_name)
그다음으로는 함수명을 입력으로 받아 프롬프트 템플릿으로 변환하는 custom prompt template을 작성합니다.
from langchain.prompts import StringPromptTemplate
from pydantic import BaseModel, validator
class FunctionExplainerPromptTemplate(StringPromptTemplate, BaseModel):
@validator('input_variables')
def validate_input_variables(cls, v):
if len(v) != 1 or 'function_name' not in v:
raise ValueError('function_name must be the only input_variable.')
return v
def format(self, **kwargs) -> str :
source_code = get_source_code(kwargs['function_name'])
prompt = f"""
Given the function name and source code, generate an English language explanation of the function.
Function Name: {kwargs["function_name"].__name__}
Source Code:
{source_code}
Explanation:
"""
return prompt
def _prompt_type(self) :
return 'function_explainer'
1-2. Use the custom prompt template
이제 위에서 만든 custom prompt template을 적용하여 함수명을 입력으로 받아 소스코드를 프롬프트로 만들어보겠습니다.
fn_explainer = FunctionExplainerPromptTemplate(input_variables=["function_name"])
prompt = fn_explainer.format(function_name=inspect.isclass)
print(prompt)
Given the function name and source code, generate an English language explanation of the function.
Function Name: isclass
Source Code:
def isclass(object):
"""Return true if the object is a class.
Class objects provide these attributes:
__doc__ documentation string
__module__ name of module in which this class was defined"""
return isinstance(object, type)
Explanation:
1-3. Generate Function Explanations
이제 모든 작업이 끝났습니다. 마지막으로 위에서 만든 프롬프트를 활용하여 입력한 함수명에 대한 설명을 생성해 보겠습니다.
from langchain.llms import OpenAI
llm = OpenAI()
print(llm(prompt))
'''
The isclass function checks if an object is a class.
It returns true if the object is a type, which is the base class for all classes.
A class object provides attributes such as a documentation string and the name of
the module in which it was defined.
'''
2. Few-shot Prompt Template
이번 세션에선 few shot examples을 작성하는 프롬프트 템플릿을 만들어 볼 것입니다. LangChain에서 제공하는 FewShotPromptTemplate 클래스를 사용하면 간단하게 제작할 수 있습니다.
2-1. Create the example set
프롬프트 템플릿을 제작하기에 앞서, few shot에 사용될 예시를 먼저 만들어야 합니다. 예시는 각각 dictionary형태로 저장되어 있고, 이들을 리스트 형태로 보관합니다.
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate
examples = [
{
"question": "Who lived longer, Muhammad Ali or Alan Turing?",
"answer":
"""
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
"""
},
{
"question": "When was the founder of craigslist born?",
"answer":
"""
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
"""
},
{
"question": "Who was the maternal grandfather of George Washington?",
"answer":
"""
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
"""
},
{
"question": "Are both the directors of Jaws and Casino Royale from the same country?",
"answer":
"""
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
"""
}
]
2-2. Create a formatter for the few shot examples
Few shot example을 string으로 변환하는 formatter 형식을 먼저 갖춰야 합니다. 이 formatter는 반드시 PromptTemplate 객체여야 합니다.
example_prompt = PromptTemplate(input_variables=["question", "answer"], template="Question: {question}\n{answer}")
print(example_prompt.format(**examples[0]))
Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
2-3. Feed examples and formatter to FewShotPromptTemplate
마지막으로 FewShotPromptTemplate 클래스를 활용하여 few shot prompt를 만듭니다.
prompt = FewShotPromptTemplate(
examples=examples,
example_prompt=example_prompt,
suffix="Question: {input}",
input_variables=["input"]
)
print(prompt.format(input="Who was the father of Mary Ball Washington?"))
Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
Question: When was the founder of craigslist born?
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
Question: Who was the maternal grandfather of George Washington?
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
Question: Are both the directors of Jaws and Casino Royale from the same country?
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
Question: Who was the father of Mary Ball Washington?
2-4. Using an example selector
Few shot을 사용하는 경우, 특별한 이유가 있을 때는 모든 예시를 사용하는 것이 아니라 질문과 관련이 있는 예시를 사용하거나, 글자 길이를 최대한 줄이는 방법으로 예시를 선택해야 할 수 있습니다. 이런 경우를 위해 LangChain은 ExampleSelector 제공하고 있습니다. 이번 세션에서는 SementicSimilarityExampleSeletor 클래스를 사용하여 입력과 유사도가 큰 예시를 선택하는 것을 보겠습니다.
먼저, 입력과 예시의 유사도를 계산하기 위해 각 문자열을 임베딩해야 합니다. 문장의 임베딩을 Chroma로 저장하고, 질문과 가장 큰 유사도를 갖는 예시를 선택합니다.
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
example_selector = SemanticSimilarityExampleSelector.from_examples(
examples,
OpenAIEmbeddings(),
Chroma,
k=1
)
question = "Who was the father of Mary Ball Washington?"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input: {question}")
for example in selected_examples:
print("\n")
for k, v in example.items():
print(f"{k}: {v}")
Examples most similar to the input: Who was the father of Mary Ball Washington?
question: Who was the maternal grandfather of George Washington?
answer:
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
질문과 가장 큰 유사도를 갖는 예시를 선택하는 것을 확인했습니다. 이제 이것을 활용하여 프롬프트를 만들겠습니다.
prompt = FewShotPromptTemplate(
example_selector=example_selector,
example_prompt=example_prompt,
suffix="Question: {input}",
input_variables=["input"]
)
print(prompt.format(input="Who was the father of Mary Ball Washington?"))
Question: Who was the maternal grandfather of George Washington?
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
Question: Who was the father of Mary Ball Washington?