[8] GIF Animation Generator Agent (LangGraph)
[9] TTS Poem Generator Agent (LangGraph)
[10] Music Compositor Agent (LangGraph)
[11] Memory-Enhanced Conversational Agent
[12] Multi-Agent Collaboration System
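Overview: A GIF animation generator that combines LangGraph for workflow management, GPT-4 for text generation, and DALL-E for image creation to produce custom animations from user prompts.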
#Setup and Imports
import os
from typing import TypedDict, Annotated, Sequence, List
from langgraph.graph import Graph, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from openai import OpenAI
from PIL import Image
import io
from IPython.display import display, Image as IPImage
from langchain_core.runnables.graph import MermaidDrawMethod
import asyncio
import aiohttp
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# Set OpenAI API key
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
# Initialize OpenAI client
client = OpenAI()
#Define Data Structures
class GraphState(TypedDict):
    messages: Annotated[Sequence[HumanMessage | AIMessage], "The messages in the conversation"]
    query: Annotated[str, "Input query describing the character and scene"]
    plot: Annotated[str, "Generated plot for the GIF"]
    character_description: Annotated[str, "Detailed description of the main character or object"]
    image_prompts: Annotated[List[str], "List of prompts for each frame"]
    image_urls: Annotated[List[str], "List of URLs for generated images"]
    gif_data: Annotated[bytes, "GIF data in bytes"]
# Initialize the language model
llm = ChatOpenAI(model="gpt-4")
#Define Graph Functions
async def get_image_data(session, url: str):
    """Fetch image data from a given URL."""
    async with session.get(url) as response:
        if response.status == 200:
            return await response.read()
    return None
def generate_character_description(state: GraphState) -> GraphState:
    """Generate a detailed description of the main character or scene."""
    query = state["query"]
    # The prompt spans several lines, so it needs a triple-quoted f-string
    response = llm.invoke([HumanMessage(content=f"""Based on the query '{query}', create a detailed description of the main character, object, or scene.
    Include specific details about appearance, characteristics, and any unique features.
    This description will be used to maintain consistency across multiple images.""")])
    state["character_description"] = response.content
    return state
def generate_plot(state: GraphState) -> GraphState:
    """Generate a 5-step plot for the GIF animation."""
    query = state["query"]
    character_description = state["character_description"]
    # The prompt spans several lines, so it needs a triple-quoted f-string
    response = llm.invoke([HumanMessage(content=f"""Create a short, 5-step plot for a GIF based on this query: '{query}' and featuring this description: {character_description}.
    Each step should be a brief description of a single frame,
    maintaining consistency throughout. Keep it family-friendly and avoid any sensitive themes.""")])
    state["plot"] = response.content
    return state
def generate_image_prompts(state: GraphState) -> GraphState:
    """Generate specific image prompts for each frame of the GIF."""
    plot = state["plot"]
    character_description = state["character_description"]
    response = llm.invoke([HumanMessage(content=f"""Based on this plot: '{plot}' and featuring this description: {character_description}, generate 5 specific, family-friendly image prompts,
    one for each step. Each prompt should be detailed enough for image generation, maintaining consistency, and suitable for DALL-E.
    Always include the following in EVERY prompt to maintain consistency:
    1. A brief reminder of the main character or object's key features
    2. The specific action or scene described in the plot step
    3. Any relevant background or environmental details
    Format each prompt as a numbered list item, like this:
    1. [Your prompt here]
    2. [Your prompt here]
    ... and so on.""")])
    prompts = []
    # Keep only the lines that start with a list number from 1 to 5
    for line in response.content.split('\n'):
        if line.strip().startswith(('1.', '2.', '3.', '4.', '5.')):
            prompt = line.split('.', 1)[1].strip()
            prompts.append(f"Create a detailed, photorealistic image of the following scene: {prompt}")
    if len(prompts) != 5:
        raise ValueError(f"Expected 5 prompts, but got {len(prompts)}. Please try again.")
    state["image_prompts"] = prompts
    return state
async def create_image(prompt: str, retries: int = 3):
    """Generate an image using DALL-E based on the given prompt."""
    for attempt in range(retries):
        try:
            # The OpenAI client call is blocking, so run it in a worker thread
            response = await asyncio.to_thread(
                client.images.generate,
                model="dall-e-3",
                prompt=prompt,
                size="1024x1024",
                quality="standard",
                n=1,
            )
            return response.data[0].url
        except Exception as e:
            if attempt == retries - 1:
                print(f"Failed to generate image for prompt: {prompt}")
                print(f"Error: {str(e)}")
                return None
            await asyncio.sleep(2)  # Wait before retrying
async def create_images(state: GraphState) -> GraphState:
    """Generate images for all prompts in parallel."""
    image_prompts = state["image_prompts"]
    tasks = [create_image(prompt) for prompt in image_prompts]
    image_urls = await asyncio.gather(*tasks)
    state["image_urls"] = image_urls
    return state
async def create_gif(state: GraphState) -> GraphState:
    """Create a GIF from the generated images."""
    image_urls = state["image_urls"]
    images = []
    async with aiohttp.ClientSession() as session:
        # Skip frames whose generation failed (URL is None)
        tasks = [get_image_data(session, url) for url in image_urls if url]
        image_data_list = await asyncio.gather(*tasks)
    for img_data in image_data_list:
        if img_data:
            images.append(Image.open(io.BytesIO(img_data)))
    if images:
        gif_buffer = io.BytesIO()
        images[0].save(gif_buffer, format='GIF', save_all=True, append_images=images[1:], duration=1000, loop=0)
        state["gif_data"] = gif_buffer.getvalue()
    else:
        state["gif_data"] = None
    return state
#Set Up LangGraph Workflow
workflow = Graph()
workflow.add_node("generate_character_description", generate_character_description)
workflow.add_node("generate_plot", generate_plot)
workflow.add_node("generate_image_prompts", generate_image_prompts)
workflow.add_node("create_images", create_images)
workflow.add_node("create_gif", create_gif)
workflow.add_edge("generate_character_description", "generate_plot")
workflow.add_edge("generate_plot", "generate_image_prompts")
workflow.add_edge("generate_image_prompts", "create_images")
workflow.add_edge("create_images", "create_gif")
workflow.add_edge("create_gif", END)
workflow.set_entry_point("generate_character_description")
app = workflow.compile()
#Display Graph Structure
display(
    IPImage(
        app.get_graph().draw_mermaid_png(
            draw_method=MermaidDrawMethod.API,
        )
    )
)
#Run Workflow Function
async def run_workflow(query: str):
    """Run the LangGraph workflow and display results."""
    initial_state = {
        "messages": [],
        "query": query,
        "plot": "",
        "character_description": "",
        "image_prompts": [],
        "image_urls": [],
        "gif_data": None
    }
    try:
        result = await app.ainvoke(initial_state)
        print("Character/Scene Description:")
        print(result["character_description"])
        print("\nGenerated Plot:")
        print(result["plot"])
        print("\nImage Prompts:")
        for i, prompt in enumerate(result["image_prompts"], 1):
            print(f"{i}. {prompt}")
        print("\nGenerated Image URLs:")
        for i, url in enumerate(result["image_urls"], 1):
            print(f"{i}. {url}")
        if result["gif_data"]:
            print("\nGIF generated successfully. Use the next cell to display or save it.")
        else:
            print("\nFailed to generate GIF.")
        return result
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return None
#Execute Workflow
query = "A cat wearing a top hat and monocle, sitting at a desk and writing a letter with a quill pen."
result = await run_workflow(query)
#Display and Save GIF
if result and result["gif_data"]:
    # Display the GIF
    display(IPImage(data=result["gif_data"], format='gif'))
    # Ask if the user wants to save the GIF
    save_gif = input("Do you want to save the GIF? (yes/no): ").lower().strip()
    if save_gif == 'yes':
        filename = input("Enter the filename to save the GIF (e.g., output.gif): ").strip()
        if not filename.endswith('.gif'):
            filename += '.gif'
        with open(filename, 'wb') as f:
            f.write(result["gif_data"])
        print(f"GIF saved as {filename}")
    else:
        print("GIF not saved.")
else:
    print("No GIF data available to display or save.")
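GIF Animation Generator Using LangGraph and DALL-E
1. LangGraph: Orchestrates the overall workflow and manages the flow of data between stages.
2. GPT-4 (via LangChain): Generates the detailed description, plot, and image prompts from the initial user query.
3. DALL-E 3: Creates high-quality images from the generated prompts.
4. Python Imaging Library (PIL): Combines the individual images into a GIF animation.
5. Asynchronous Programming: Uses asyncio and aiohttp to generate and retrieve images in parallel.
The GIF generation process follows these high-level steps:
1. Character/Scene Description: From the user's input query, the system generates a detailed description of the main character or scene.
2. Plot Generation: Using the character description and the initial query, it creates a 5-step plot that outlines how the animation progresses.
3. Image Prompt Creation: For each step of the plot, it generates a specific image prompt, keeping the frames consistent with one another.
4. Image Generation: DALL-E 3 creates an image from each prompt.
5. GIF Assembly: The generated images are compiled into a GIF animation.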
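Step 3 depends on pulling exactly five prompts out of a free-form model reply. The following is a minimal sketch of a slightly more tolerant parser than a plain prefix check; the function name `parse_numbered_prompts` and the sample text are hypothetical and not part of the original notebook. It accepts both `1.` and `1)` markers and ignores any preamble line.

```python
import re

# Hypothetical helper: extract items from a numbered list in a model reply.
_ITEM = re.compile(r"^\s*(\d+)[.)]\s+(.*\S)\s*$")


def parse_numbered_prompts(text: str, expected: int = 5) -> list:
    """Return the text of each numbered line, or raise if the count is off."""
    prompts = []
    for line in text.splitlines():
        match = _ITEM.match(line)
        if match:
            prompts.append(match.group(2))
    if len(prompts) != expected:
        raise ValueError(f"Expected {expected} prompts, but got {len(prompts)}.")
    return prompts


# Made-up reply used only to exercise the parser
sample = (
    "Here are the prompts:\n"
    "1. A cat sits at a desk.\n"
    "2) The cat dips a quill.\n"
    "\n"
    "3. The cat writes.\n"
    "4. The cat signs.\n"
    "5. The cat seals the letter."
)
frames = parse_numbered_prompts(sample)
```

Raising on a wrong count mirrors the check in the workflow code, so a malformed reply fails early instead of producing a GIF with missing frames.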
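Throughout the process, LangGraph manages the flow of information between steps so that each stage's output feeds correctly into the next. Asynchronous programming allows the image generation and retrieval stages to run in parallel.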
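The parallel stage can be illustrated without calling any external API. This is a sketch under stated assumptions: `fake_create_image` is a hypothetical stand-in that fails once and then returns a placeholder URL, so it only demonstrates the retry loop and the way `asyncio.gather` runs the calls concurrently while preserving input order.

```python
import asyncio


async def fake_create_image(prompt: str, retries: int = 3, delay: float = 0.01):
    """Hypothetical stand-in for an image API call: first attempt fails."""
    for attempt in range(retries):
        try:
            await asyncio.sleep(delay)  # simulate network latency
            if attempt == 0:
                raise RuntimeError("simulated transient error")
            return f"https://example.invalid/{prompt.replace(' ', '_')}.png"
        except RuntimeError:
            if attempt == retries - 1:
                return None  # give up after the last attempt
            await asyncio.sleep(delay)  # wait before retrying


async def create_all(prompts):
    # gather schedules all coroutines at once and returns results in input order
    return await asyncio.gather(*(fake_create_image(p) for p in prompts))


urls = asyncio.run(create_all(["frame one", "frame two", "frame three"]))
```

In a notebook, where an event loop is already running, `await create_all(...)` would be used in place of `asyncio.run(...)`.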
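The system's modular design, made possible by LangGraph, makes it easy to update or replace individual components. This lets the project adapt to future advances in language models or image generation.
The current implementation focuses on simple 5-frame GIFs, but the concept can be extended to longer animations, to user feedback at intermediate stages, or to integration with other media types. As AI continues to develop, tools like this will play an increasingly important role in bridging human creativity and machine capability.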
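Overview: An advanced text-to-speech (TTS) agent built with LangGraph and the OpenAI API. It classifies the input text, processes it according to its content type, and generates the corresponding speech output.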
#Import necessary libraries and set up environment
# Import required libraries
from typing import TypedDict
from langgraph.graph import StateGraph, END
from IPython.display import display, Audio, Markdown
from openai import OpenAI
from dotenv import load_dotenv
import io
import tempfile
import re
import os
# Load environment variables and set OpenAI API key
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
#Initialize OpenAI client and define state
client = OpenAI()
class AgentState(TypedDict):
    input_text: str
    processed_text: str
    audio_data: bytes
    audio_path: str
    content_type: str
#Define Node Functions
def classify_content(state: AgentState) -> AgentState:
    """Classify the input text into one of four categories: general, poem, news, or joke."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Classify the content as one of: 'general', 'poem', 'news', 'joke'."},
            {"role": "user", "content": state["input_text"]}
        ]
    )
    state["content_type"] = response.choices[0].message.content.strip().lower()
    return state
def process_general(state: AgentState) -> AgentState:
    """Process general content (no specific processing, return as-is)."""
    state["processed_text"] = state["input_text"]
    return state
def process_poem(state: AgentState) -> AgentState:
    """Process the input text as a poem, rewriting it in a poetic style."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Rewrite the following text as a short, beautiful poem:"},
            {"role": "user", "content": state["input_text"]}
        ]
    )
    state["processed_text"] = response.choices[0].message.content.strip()
    return state
def process_news(state: AgentState) -> AgentState:
    """Process the input text as news, rewriting it in a formal news anchor style."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Rewrite the following text in a formal news anchor style:"},
            {"role": "user", "content": state["input_text"]}
        ]
    )
    state["processed_text"] = response.choices[0].message.content.strip()
    return state
def process_joke(state: AgentState) -> AgentState:
    """Process the input text as a joke, turning it into a short, funny joke."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Turn the following text into a short, funny joke:"},
            {"role": "user", "content": state["input_text"]}
        ]
    )
    state["processed_text"] = response.choices[0].message.content.strip()
    return state
def text_to_speech(state: AgentState, save_file: bool = False) -> AgentState:
    """
    Converts processed text into speech using a voice mapped to the content type.
    Optionally saves the audio to a file.
    Args:
        state (AgentState): Dictionary containing the processed text and content type.
        save_file (bool, optional): If True, saves the audio to a file. Defaults to False.
    Returns:
        AgentState: Updated state with audio data and file path (if saved).
    """
    # Map content type to a voice, defaulting to "alloy"
    voice_map = {
        "general": "alloy",
        "poem": "nova",
        "news": "onyx",
        "joke": "shimmer"
    }
    voice = voice_map.get(state["content_type"], "alloy")
    audio_data = io.BytesIO()
    # Generate speech and stream audio data into memory
    with client.audio.speech.with_streaming_response.create(
        model="tts-1",
        voice=voice,
        input=state["processed_text"]
    ) as response:
        for chunk in response.iter_bytes():
            audio_data.write(chunk)
    state["audio_data"] = audio_data.getvalue()
    # Save audio to a file if requested
    if save_file:
        with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as temp_audio:
            temp_audio.write(state["audio_data"])
            state["audio_path"] = temp_audio.name
    else:
        state["audio_path"] = ""
    return state
#Define and Compile the Graph
# Define the graph
workflow = StateGraph(AgentState)
# Add nodes to the graph
workflow.add_node("classify_content", classify_content)
workflow.add_node("process_general", process_general)
workflow.add_node("process_poem", process_poem)
workflow.add_node("process_news", process_news)
workflow.add_node("process_joke", process_joke)
workflow.add_node("text_to_speech", text_to_speech)
# Set the entry point of the graph
workflow.set_entry_point("classify_content")
# Define conditional edges based on content type
workflow.add_conditional_edges(
    "classify_content",
    lambda x: x["content_type"],
    {
        "general": "process_general",
        "poem": "process_poem",
        "news": "process_news",
        "joke": "process_joke",
    }
)
# Connect processors to text-to-speech
workflow.add_edge("process_general", "text_to_speech")
workflow.add_edge("process_poem", "text_to_speech")
workflow.add_edge("process_news", "text_to_speech")
workflow.add_edge("process_joke", "text_to_speech")
# Compile the graph
app = workflow.compile()
#A function to convert text to a valid informative filename
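def sanitize_filename(text, max_length=20):
    """Convert text to a valid and concise filename."""
    # Drop everything except word characters, whitespace, and hyphens
    sanitized = re.sub(r'[^\w\s-]', '', text.lower())
    # Collapse runs of whitespace or hyphens into a single underscore
    sanitized = re.sub(r'[-\s]+', '_', sanitized)
    return sanitized[:max_length]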
#Define Function to Run Agent and Play Audio
def run_tts_agent_and_play(input_text: str, content_type: str, save_file: bool = True):
    # The classifier node overwrites content_type in the state; the value
    # passed in here is only used to label the saved file.
    result = app.invoke({
        "input_text": input_text,
        "processed_text": "",
        "audio_data": b"",
        "audio_path": "",
        "content_type": content_type
    })
    print(f"Detected content type: {result['content_type']}")
    print(f"Processed text: {result['processed_text']}")
    # Play the audio (this will only work in local Jupyter environment)
    display(Audio(result['audio_data'], autoplay=True))
    if save_file:
        # Create 'audio' directory in the parent folder of the notebook
        audio_dir = os.path.join('..', 'audio')
        os.makedirs(audio_dir, exist_ok=True)
        sanitized_text = sanitize_filename(input_text)
        file_name = f"{content_type}_{sanitized_text}.mp3"
        file_path = os.path.join(audio_dir, file_name)
        with open(file_path, "wb") as f:
            f.write(result['audio_data'])
        print(f"Audio saved to: {file_path}")
        # Relative path for GitHub
        github_relative_path = f"../audio/{file_name}"
        display(Markdown(f"[Download {content_type} audio: {sanitized_text}]({github_relative_path})"))
        # Note about GitHub limitations
        print("Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.")
    else:
        print("Audio not saved to file.")
    return result
#Test the Text-to-Speech Agent
examples = {
    "general": "The quick brown fox jumps over the lazy dog.",
    "poem": "Roses are red, violets are blue, AI is amazing, and so are you!",
    "news": "Breaking news: Scientists discover a new species of deep-sea creature in the Mariana Trench.",
    "joke": "Why don't scientists trust atoms? Because they make up everything!"
}
for content_type, text in examples.items():
    print(f"\nProcessing example for {content_type} content:")
    print(f"Input text: {text}")
    # Run the TTS agent and save the file
    result = run_tts_agent_and_play(text, content_type, save_file=True)
    print("-" * 50)
print("All examples processed. You can download the audio files using the links above.")
"""
Processing example for general content:
Input text: The quick brown fox jumps over the lazy dog.
Detected content type: poem
Processed text: In autumn's breeze, the swift fox leaps,
Above a slumbering dog it sweeps.
With grace it dances, swift and free,
A tale of motion, poetry.
Audio saved to: ..\audio\general_the_quick_brown_fox_.mp3
Download general audio: the_quick_brown_fox_
Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.
--------------------------------------------------
Processing example for poem content:
Input text: Roses are red, violets are blue, AI is amazing, and so are you!
Detected content type: poem
Processed text: In the garden of knowledge, where data blooms bright,
Up to October's end, you shed your soft light.
With wisdom and insight, like stars in the sky,
AI is enchanting, oh, how you can fly!
Audio saved to: ..\audio\poem_roses_are_red_violet.mp3
Download poem audio: roses_are_red_violet
Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.
--------------------------------------------------
Processing example for news content:
Input text: Breaking news: Scientists discover a new species of deep-sea creature in the Mariana Trench.
Detected content type: news
Processed text: Good evening. In breaking news, scientists have made a remarkable discovery, identifying a new species of deep-sea creature located within the depths of the Mariana Trench. This finding not only expands our understanding of marine biodiversity but also highlights the importance of continued exploration in these largely uncharted waters. We will provide more details on this groundbreaking announcement as they become available.
Audio saved to: ..\audio\news_breaking_news_scient.mp3
Download news audio: breaking_news_scient
Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.
--------------------------------------------------
Processing example for joke content:
Input text: Why don't scientists trust atoms? Because they make up everything!
Detected content type: joke
Processed text: Why don’t AI assistants tell jokes after October 2023? Because they’re still trying to figure out what happened in November!
Audio saved to: ..\audio\joke_why_dont_scientists_.mp3
Download joke audio: why_dont_scientists_
Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.
--------------------------------------------------
All examples processed. You can download the audio files using the links above.
"""
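Building an Intelligent Text-to-Speech Agent with LangGraph and OpenAI
1. Content Classification: Uses an OpenAI GPT model to classify the input text.
2. Content Processing: Applies processing specific to the content type (general, poem, news, or joke).
3. Text-to-Speech Conversion: Uses OpenAI's TTS API to generate audio from the processed text.
4. LangGraph Workflow: Uses a state graph to coordinate the whole process.
The TTS agent operates through the following high-level steps:
1. Text Input: The system receives text input from the user.
2. Content Classification: The input is classified into one of four categories: general, poem, news, or joke.
3. Content-Specific Processing: Depending on the classification, the text is processed differently. General text is left unchanged. Poems are rewritten to be more poetic. News is reformatted in a formal news anchor style. Jokes are refined to be funnier.
4. Text-to-Speech Conversion: The processed text is converted to speech using a voice suited to its content type.
5. Audio Output: Depending on the user's preference, the generated audio is either saved to a file or played directly.
The whole workflow is managed by a LangGraph state machine, which handles the transitions between processing stages and keeps context throughout the run.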
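The routing step only works when the classifier's reply matches one of the four edge labels exactly. A minimal sketch of a guard that could sit between classification and routing follows; `normalize_content_type` and `VALID_TYPES` are hypothetical names, and falling back to "general" is an assumption rather than something the original code does.

```python
# Hypothetical guard: map a free-form classifier reply onto a known route label.
VALID_TYPES = ("general", "poem", "news", "joke")


def normalize_content_type(raw: str, default: str = "general") -> str:
    """Return one of VALID_TYPES, falling back to `default` when unsure."""
    # Lowercase and remove surrounding whitespace, quotes, and periods
    cleaned = raw.strip().lower().strip("'\".")
    if cleaned in VALID_TYPES:
        return cleaned
    # Accept replies that wrap the label in a sentence
    for label in VALID_TYPES:
        if label in cleaned:
            return label
    return default


label = normalize_content_type(" 'Poem'. ")
```

With a guard like this, an unexpected reply such as a full sentence still resolves to a defined branch instead of a label that has no matching edge.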
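By combining the text-processing strengths of GPT models with OpenAI's TTS capabilities, this project shows how advanced AI techniques can be integrated into a multi-step language processing pipeline.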
#Imports
# Import required libraries
from typing import Dict, TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
import music21
import pygame
import tempfile
import os
import random
from dotenv import load_dotenv
# Load environment variables and set OpenAI API key
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
#State Definition
class MusicState(TypedDict):
    """Define the structure of the state for the music generation workflow."""
    musician_input: str  # User's input describing the desired music
    melody: str  # Generated melody
    harmony: str  # Generated harmony
    rhythm: str  # Generated rhythm
    style: str  # Desired musical style
    composition: str  # Complete musical composition
    midi_file: str  # Path to the generated MIDI file
#LLM Initialization
# Initialize the ChatOpenAI model
llm = ChatOpenAI(model="gpt-4o-mini")
#Component Functions
def melody_generator(state: MusicState) -> Dict:
    """Generate a melody based on the user's input."""
    prompt = ChatPromptTemplate.from_template(
        "Generate a melody based on this input: {input}. Represent it as a string of notes in music21 format."
    )
    chain = prompt | llm
    melody = chain.invoke({"input": state["musician_input"]})
    return {"melody": melody.content}
def harmony_creator(state: MusicState) -> Dict:
    """Create harmony for the generated melody."""
    prompt = ChatPromptTemplate.from_template(
        "Create harmony for this melody: {melody}. Represent it as a string of chords in music21 format."
    )
    chain = prompt | llm
    harmony = chain.invoke({"melody": state["melody"]})
    return {"harmony": harmony.content}
def rhythm_analyzer(state: MusicState) -> Dict:
    """Analyze and suggest a rhythm for the melody and harmony."""
    prompt = ChatPromptTemplate.from_template(
        "Analyze and suggest a rhythm for this melody and harmony: {melody}, {harmony}. Represent it as a string of durations in music21 format."
    )
    chain = prompt | llm
    rhythm = chain.invoke({"melody": state["melody"], "harmony": state["harmony"]})
    return {"rhythm": rhythm.content}
def style_adapter(state: MusicState) -> Dict:
    """Adapt the composition to the specified musical style."""
    prompt = ChatPromptTemplate.from_template(
        "Adapt this composition to the {style} style: Melody: {melody}, Harmony: {harmony}, Rhythm: {rhythm}. Provide the result in music21 format."
    )
    chain = prompt | llm
    adapted = chain.invoke({
        "style": state["style"],
        "melody": state["melody"],
        "harmony": state["harmony"],
        "rhythm": state["rhythm"]
    })
    return {"composition": adapted.content}
def midi_converter(state: MusicState) -> Dict:
    """Convert the composition to MIDI format and save it as a file."""
    # Create a new stream
    piece = music21.stream.Score()
    # Add the composition description to the stream as a text expression
    description = music21.expressions.TextExpression(state["composition"])
    piece.append(description)
    # Define a wide variety of scales and chords
    scales = {
        'C major': ['C', 'D', 'E', 'F', 'G', 'A', 'B'],
        'C minor': ['C', 'D', 'Eb', 'F', 'G', 'Ab', 'Bb'],
        'C harmonic minor': ['C', 'D', 'Eb', 'F', 'G', 'Ab', 'B'],
        'C melodic minor': ['C', 'D', 'Eb', 'F', 'G', 'A', 'B'],
        'C dorian': ['C', 'D', 'Eb', 'F', 'G', 'A', 'Bb'],
        'C phrygian': ['C', 'Db', 'Eb', 'F', 'G', 'Ab', 'Bb'],
        'C lydian': ['C', 'D', 'E', 'F#', 'G', 'A', 'B'],
        'C mixolydian': ['C', 'D', 'E', 'F', 'G', 'A', 'Bb'],
        'C locrian': ['C', 'Db', 'Eb', 'F', 'Gb', 'Ab', 'Bb'],
        'C whole tone': ['C', 'D', 'E', 'F#', 'G#', 'A#'],
        'C diminished': ['C', 'D', 'Eb', 'F', 'Gb', 'Ab', 'A', 'B'],
    }
    chords = {
        'C major': ['C4', 'E4', 'G4'],
        'C minor': ['C4', 'Eb4', 'G4'],
        'C diminished': ['C4', 'Eb4', 'Gb4'],
        'C augmented': ['C4', 'E4', 'G#4'],
        'C dominant 7th': ['C4', 'E4', 'G4', 'Bb4'],
        'C major 7th': ['C4', 'E4', 'G4', 'B4'],
        'C minor 7th': ['C4', 'Eb4', 'G4', 'Bb4'],
        'C half-diminished 7th': ['C4', 'Eb4', 'Gb4', 'Bb4'],
        'C fully diminished 7th': ['C4', 'Eb4', 'Gb4', 'A4'],
    }
    def create_melody(scale_name, duration):
        """Create a melody based on a given scale."""
        melody = music21.stream.Part()
        scale = scales[scale_name]
        for _ in range(duration):
            note = music21.note.Note(random.choice(scale) + '4')
            note.quarterLength = 1
            melody.append(note)
        return melody
    def create_chord_progression(duration):
        """Create a chord progression."""
        harmony = music21.stream.Part()
        for _ in range(duration):
            chord_name = random.choice(list(chords.keys()))
            chord = music21.chord.Chord(chords[chord_name])
            chord.quarterLength = 1
            harmony.append(chord)
        return harmony
    # Parse the user input to determine scale and style
    user_input = state['musician_input'].lower()
    if 'minor' in user_input:
        scale_name = 'C minor'
    elif 'major' in user_input:
        scale_name = 'C major'
    else:
        scale_name = random.choice(list(scales.keys()))
    # Create 7 beats of melody and harmony (7 seconds at 60 BPM)
    melody = create_melody(scale_name, 7)
    harmony = create_chord_progression(7)
    # Add a final one-beat note and chord, for 8 beats in total (8 seconds at 60 BPM)
    final_note = music21.note.Note(scales[scale_name][0] + '4')
    final_note.quarterLength = 1
    melody.append(final_note)
    # Not every scale name has a matching entry in `chords` (e.g. 'C dorian'),
    # so fall back to C minor or C major depending on the scale's third.
    if scale_name in chords:
        final_chord_name = scale_name
    elif 'Eb' in scales[scale_name]:
        final_chord_name = 'C minor'
    else:
        final_chord_name = 'C major'
    final_chord = music21.chord.Chord(chords[final_chord_name])
    final_chord.quarterLength = 1
    harmony.append(final_chord)
    # Add the melody and harmony to the piece
    piece.append(melody)
    piece.append(harmony)
    # Set the tempo to 60 BPM
    piece.insert(0, music21.tempo.MetronomeMark(number=60))
    # Reserve a temporary file name, then write after the handle is closed
    # so the file is not held open while music21 writes to it
    with tempfile.NamedTemporaryFile(delete=False, suffix='.mid') as temp_midi:
        midi_path = temp_midi.name
    piece.write('midi', midi_path)
    return {"midi_file": midi_path}
#Graph Construction
# Initialize the StateGraph
workflow = StateGraph(MusicState)
# Add nodes to the graph
workflow.add_node("melody_generator", melody_generator)
workflow.add_node("harmony_creator", harmony_creator)
workflow.add_node("rhythm_analyzer", rhythm_analyzer)
workflow.add_node("style_adapter", style_adapter)
workflow.add_node("midi_converter", midi_converter)
# Set the entry point of the graph
workflow.set_entry_point("melody_generator")
# Add edges to connect the nodes
workflow.add_edge("melody_generator", "harmony_creator")
workflow.add_edge("harmony_creator", "rhythm_analyzer")
workflow.add_edge("rhythm_analyzer", "style_adapter")
workflow.add_edge("style_adapter", "midi_converter")
workflow.add_edge("midi_converter", END)
# Compile the graph
app = workflow.compile()
#Run the Workflow
# Define input parameters
inputs = {
    "musician_input": "Create a happy piano piece in C major",
    "style": "Romantic era"
}
# Invoke the workflow
result = app.invoke(inputs)
print("Composition created")
print(f"MIDI file saved at: {result['midi_file']}")
#MIDI Playback Function
def play_midi(midi_file_path):
    """Play the generated MIDI file."""
    pygame.mixer.init()
    pygame.mixer.music.load(midi_file_path)
    pygame.mixer.music.play()
    # Wait for playback to finish
    while pygame.mixer.music.get_busy():
        pygame.time.Clock().tick(10)
    # Clean up
    pygame.mixer.quit()
print("To create and play a melody, run the following in a new cell:")
print("play_midi(result['midi_file'])")
"""
To create and play a melody, run the following in a new cell:
play_midi(result['midi_file'])
"""
#Play the Generated Music
play_midi(result["midi_file"])
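1. State Management: Uses the MusicState class to manage the state of the workflow.
2. Language Model: Uses ChatOpenAI (gpt-4o-mini, as configured in the code above) to generate the musical components.
3. Musical Functions: Melody Generator, Harmony Creator, Rhythm Analyzer, and Style Adapter.
4. MIDI Conversion: Converts the composition into a playable MIDI file.
5. LangGraph Workflow: Uses a state graph to coordinate the whole composition process.
6. Playback Functionality: Allows the generated piece to be played immediately.
1. The workflow starts by generating a melody from the user's input.
2. It then creates a harmony to complement the melody.
3. It analyzes the melody and harmony and suggests a rhythm.
4. It adapts the composition to the specified musical style.
5. It converts the final composition to MIDI format.
The generated MIDI file can be played with pygame.
The whole process is orchestrated by LangGraph, which manages the flow of information between components and ensures that each step builds on the previous one.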
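The way each step builds on the previous one can be sketched in plain Python. This is a simplified illustration, not LangGraph's implementation: `run_pipeline` and the three toy nodes are hypothetical, and the sketch assumes a strictly linear chain in which every node returns a partial update that is merged into a shared state dictionary.

```python
from typing import Callable, Dict, List

State = Dict[str, str]
Node = Callable[[State], Dict[str, str]]


def run_pipeline(nodes: List[Node], initial: State) -> State:
    """Run nodes in order, merging each node's partial update into the state."""
    state = dict(initial)  # copy so the caller's dict is not mutated
    for node in nodes:
        state.update(node(state))
    return state


# Toy nodes standing in for the LLM-backed functions
def melody_node(state: State) -> Dict[str, str]:
    return {"melody": f"melody({state['musician_input']})"}


def harmony_node(state: State) -> Dict[str, str]:
    return {"harmony": f"harmony({state['melody']})"}


def style_node(state: State) -> Dict[str, str]:
    return {"composition": f"{state['style']}: {state['melody']} + {state['harmony']}"}


result = run_pipeline(
    [melody_node, harmony_node, style_node],
    {"musician_input": "happy piece in C major", "style": "Romantic era"},
)
```

Each node reads only the keys it needs and returns only the keys it produces, which is why a single stage can be swapped out without touching the others.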
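Implementation: Integrates a language model with separate short-term and long-term memory stores, uses a prompt template that includes both memory types, and relies on a memory manager for storage and retrieval. The system includes an interaction loop that updates and uses the memory with each response.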
#Setup and Imports
from langchain_openai import ChatOpenAI
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.memory import ChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from dotenv import load_dotenv
import os
# Load environment variables
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
# Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini", max_tokens=1000, temperature=0)
#Memory Stores
chat_store = {}
long_term_memory = {}
def get_chat_history(session_id: str):
    if session_id not in chat_store:
        chat_store[session_id] = ChatMessageHistory()
    return chat_store[session_id]
def update_long_term_memory(session_id: str, input: str, output: str):
    if session_id not in long_term_memory:
        long_term_memory[session_id] = []
    if len(input) > 20:  # Simple logic: store inputs longer than 20 characters
        long_term_memory[session_id].append(f"User said: {input}")
    if len(long_term_memory[session_id]) > 5:  # Keep only last 5 memories
        long_term_memory[session_id] = long_term_memory[session_id][-5:]
def get_long_term_memory(session_id: str):
    return ". ".join(long_term_memory.get(session_id, []))
#Prompt Template
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Use the information from long-term memory if relevant."),
    ("system", "Long-term memory: {long_term_memory}"),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])
#Conversational Chain
chain = prompt | llm
chain_with_history = RunnableWithMessageHistory(
    chain,
    get_chat_history,
    input_messages_key="input",
    history_messages_key="history"
)
#Chat Function
def chat(input_text: str, session_id: str):
    long_term_mem = get_long_term_memory(session_id)
    response = chain_with_history.invoke(
        {"input": input_text, "long_term_memory": long_term_mem},
        config={"configurable": {"session_id": session_id}}
    )
    update_long_term_memory(session_id, input_text, response.content)
    return response.content
#Example Usage
session_id = "user_123"
print("AI:", chat("Hello! My name is Alice.", session_id))
print("AI:", chat("What's the weather like today?", session_id))
print("AI:", chat("I love sunny days.", session_id))
print("AI:", chat("Do you remember my name?", session_id))
"""
AI: Hello, Alice! How can I assist you today?
AI: I don't have real-time weather data, but you can check a weather website or app for the most accurate and up-to-date information. If you tell me your location, I can suggest what to look for!
AI: Sunny days are wonderful! They can really lift your mood and are perfect for outdoor activities. Do you have any favorite things you like to do on sunny days?
AI: Yes, your name is Alice! How can I assist you further today?
"""
#Review Memory
print("Conversation History:")
for message in chat_store[session_id].messages:
    print(f"{message.type}: {message.content}")
print("\nLong-term Memory:")
print(get_long_term_memory(session_id))
"""
Conversation History:
human: Hello! My name is Alice.
ai: Hello, Alice! How can I assist you today?
human: What's the weather like today?
ai: I don't have real-time weather data, but you can check a weather website or app for the most accurate and up-to-date information. If you tell me your location, I can suggest what to look for!
human: I love sunny days.
ai: Sunny days are wonderful! They can really lift your mood and are perfect for outdoor activities. Do you have any favorite things you like to do on sunny days?
human: Do you remember my name?
ai: Yes, your name is Alice! How can I assist you further today?
Long-term Memory:
User said: Hello! My name is Alice.. User said: What's the weather like today?. User said: Do you remember my name?
"""
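Method Details
1. Setting Up the Environment
Import the libraries needed for the language model, memory management, and prompt handling. Initialize the language model with the desired parameters (for example, model type and token limit).
This implementation provides a foundation for building more sophisticated AI agents. Future enhancements might include more advanced criteria for long-term memory storage, memory consolidation or summarization techniques, integration with external knowledge bases, and emotion or sentiment tracking across interactions. By focusing on memory enhancement, this conversational agent design extends what a basic chatbot can do and paves the way for more engaging, context-aware, and intelligent AI assistants.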
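The long-term store in this section keeps at most five entries and only records inputs longer than 20 characters. As one possible variation, the same policy can be expressed with a bounded deque, which drops the oldest entry automatically. The class name `LongTermMemory` and its parameters are hypothetical; this is a sketch of the storage rule only, with no language model involved.

```python
from collections import defaultdict, deque


class LongTermMemory:
    """Hypothetical per-session store with a length filter and a size cap."""

    def __init__(self, max_items: int = 5, min_length: int = 20):
        self.min_length = min_length
        # deque(maxlen=...) discards the oldest item once the cap is reached
        self._store = defaultdict(lambda: deque(maxlen=max_items))

    def update(self, session_id: str, user_input: str) -> None:
        # Same simple criterion as above: keep inputs longer than min_length
        if len(user_input) > self.min_length:
            self._store[session_id].append(f"User said: {user_input}")

    def recall(self, session_id: str) -> str:
        # .get() avoids creating an empty entry for unknown sessions
        return ". ".join(self._store.get(session_id, ()))


memory = LongTermMemory()
memory.update("user_123", "Hello! My name is Alice.")
memory.update("user_123", "I love sunny days.")  # 18 characters, not stored
summary = memory.recall("user_123")
```

A richer storage criterion, such as the ones listed as future enhancements, would replace the length check inside `update` while leaving the rest unchanged.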
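Overview: A multi-agent collaboration system that combines historical research with data analysis, using large language models to simulate specialized agents that work together to answer complex historical questions.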
#Import required libraries
import os
import time
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage
from typing import List, Dict
from dotenv import load_dotenv
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
#Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini", max_tokens=1000, temperature=0.7)
#Define the base Agent class
class Agent:
    def __init__(self, name: str, role: str, skills: List[str]):
        self.name = name
        self.role = role
        self.skills = skills
        self.llm = llm
    def process(self, task: str, context: List[Dict] = None) -> str:
        messages = [
            SystemMessage(content=f"You are {self.name}, a {self.role}. Your skills include: {', '.join(self.skills)}. Respond to the task based on your role and skills.")
        ]
        # Replay earlier turns so the agent sees the shared conversation
        if context:
            for msg in context:
                if msg['role'] == 'human':
                    messages.append(HumanMessage(content=msg['content']))
                elif msg['role'] == 'ai':
                    messages.append(AIMessage(content=msg['content']))
        messages.append(HumanMessage(content=task))
        response = self.llm.invoke(messages)
        return response.content
#Define specialized agents: HistoryResearchAgent and DataAnalysisAgent
class HistoryResearchAgent(Agent):
def __init__(self):
super().__init__("Clio", "History Research Specialist", ["deep knowledge of historical events", "understanding of historical contexts", "identifying historical trends"])
class DataAnalysisAgent(Agent):
def __init__(self):
super().__init__("Data", "Data Analysis Expert", ["interpreting numerical data", "statistical analysis", "data visualization description"])
# Define the functions for each step of the collaboration

# Research historical context
def research_historical_context(history_agent, task: str, context: list) -> list:
    print("History Agent: Researching historical context...")
    history_task = f"Provide relevant historical context and information for the following task: {task}"
    history_result = history_agent.process(history_task)
    context.append({"role": "ai", "content": f"History Agent: {history_result}"})
    print(f"Historical context provided: {history_result[:100]}...\n")
    return context

# Identify data needs
def identify_data_needs(data_agent, task: str, context: list) -> list:
    print("Data Agent: Identifying data needs based on historical context...")
    historical_context = context[-1]["content"]
    data_need_task = f"Based on the historical context, what specific data or statistical information would be helpful to answer the original question? Historical context: {historical_context}"
    data_need_result = data_agent.process(data_need_task, context)
    context.append({"role": "ai", "content": f"Data Agent: {data_need_result}"})
    print(f"Data needs identified: {data_need_result[:100]}...\n")
    return context

# Provide historical data
def provide_historical_data(history_agent, task: str, context: list) -> list:
    print("History Agent: Providing relevant historical data...")
    data_needs = context[-1]["content"]
    data_provision_task = f"Based on the data needs identified, provide relevant historical data or statistics. Data needs: {data_needs}"
    data_provision_result = history_agent.process(data_provision_task, context)
    context.append({"role": "ai", "content": f"History Agent: {data_provision_result}"})
    print(f"Historical data provided: {data_provision_result[:100]}...\n")
    return context

# Analyze data
def analyze_data(data_agent, task: str, context: list) -> list:
    print("Data Agent: Analyzing historical data...")
    historical_data = context[-1]["content"]
    analysis_task = f"Analyze the historical data provided and describe any trends or insights relevant to the original task. Historical data: {historical_data}"
    analysis_result = data_agent.process(analysis_task, context)
    context.append({"role": "ai", "content": f"Data Agent: {analysis_result}"})
    print(f"Data analysis results: {analysis_result[:100]}...\n")
    return context

# Synthesize final answer
def synthesize_final_answer(history_agent, task: str, context: list) -> str:
    print("History Agent: Synthesizing final answer...")
    synthesis_task = "Based on all the historical context, data, and analysis, provide a comprehensive answer to the original task."
    final_result = history_agent.process(synthesis_task, context)
    return final_result
# HistoryDataCollaborationSystem class
class HistoryDataCollaborationSystem:
    def __init__(self):
        # Use the specialized agent classes defined above
        self.history_agent = HistoryResearchAgent()
        self.data_agent = DataAnalysisAgent()

    def solve(self, task: str, timeout: int = 300) -> str:
        print(f"\nStarting collaboration to solve: {task}\n")
        start_time = time.time()
        context = []
        steps = [
            (research_historical_context, self.history_agent),
            (identify_data_needs, self.data_agent),
            (provide_historical_data, self.history_agent),
            (analyze_data, self.data_agent),
            (synthesize_final_answer, self.history_agent)
        ]
        for step_func, agent in steps:
            # The timeout is checked only between steps; a slow LLM call is not interrupted
            if time.time() - start_time > timeout:
                return "Operation timed out. The process took too long to complete."
            try:
                result = step_func(agent, task, context)
                if isinstance(result, str):
                    return result  # This is the final answer
                context = result
            except Exception as e:
                return f"Error during collaboration: {str(e)}"
        # Fallback: reached only if no step returned a final string answer
        print("\n✅ Collaboration complete. Final answer synthesized.\n")
        return context[-1]["content"]
#Example usage
# Create an instance of the collaboration system
collaboration_system = HistoryDataCollaborationSystem()
# Define a complex historical question that requires both historical knowledge and data analysis
question = "How did urbanization rates in Europe compare to those in North America during the Industrial Revolution, and what were the main factors influencing these trends?"
# Solve the question using the collaboration system
result = collaboration_system.solve(question)
# Print the result
print(result)
"""
Starting collaboration to solve: How did urbanization rates in Europe compare to those in North America during the Industrial Revolution, and what were the main factors influencing these trends?
History Agent: Researching historical context...
Historical context provided: During the Industrial Revolution, which generally spanned from the late 18th century to the mid-19th...
Data Agent: Identifying data needs based on historical context...
Data needs identified: To analyze the urbanization phenomenon during the Industrial Revolution in Europe and North America ...
History Agent: Providing relevant historical data...
Historical data provided: Here is some relevant historical data and statistics that pertain to the urbanization phenomenon dur...
Data Agent: Analyzing historical data...
Data analysis results: Data Agent: Analyzing the historical data provided reveals several key trends and insights regarding...
History Agent: Synthesizing final answer...
### Urbanization During the Industrial Revolution: A Comparative Analysis of Europe and North America
The Industrial Revolution, spanning from the late 18th century to the mid-19th century, marked a transformative era characterized by significant changes in economic structures, social dynamics, and urban development. Urbanization emerged as a crucial phenomenon during this period, particularly in Europe and North America, albeit with notable differences in the pace, scale, and nature of urban growth between the two regions.
#### Urbanization in Europe
1. **Origins and Growth**: The Industrial Revolution began in Britain around the 1760s, leading to rapid industrial growth and a shift from agrarian to industrial economies. Cities such as Manchester, Birmingham, and London witnessed explosive population growth. For example, London’s population surged from approximately 1 million in 1801 to 2.5 million by 1851, while Manchester grew from 75,000 to 300,000 during the same period.
2. **Rate of Urbanization**: By 1851, about 50% of Britain's population lived in urban areas, reflecting a significant urbanization trend. The annual growth rates in major cities were substantial, with Manchester experiencing an approximate 4.6% growth rate. This rapid urbanization was driven by the promise of jobs in factories and improved transportation networks, such as railways and canals, which facilitated the movement of goods and people.
3. **Social and Economic Shifts**: The urban workforce transformed dramatically, with roughly 50% of the British workforce engaged in manufacturing by mid-century. This shift led to the emergence of a distinct working class and significant social changes, including increased labor organization and political activism, exemplified by movements like Chartism.
4. **Challenges**: Urbanization brought about severe social challenges, including overcrowding, poor living conditions, and public health crises. For instance, cholera outbreaks in London during the 1840s underscored the dire consequences of rapid urban growth, as many urban areas lacked adequate sanitation and housing.
#### Urbanization in North America
1. **Emergence and Growth**: North America, particularly the United States, began its industrialization later, gaining momentum in the early to mid-19th century. Cities like New York and Chicago became pivotal industrial and urban centers. New York City's population grew from around 60,000 in 1800 to over 1.1 million by 1860, showcasing a remarkable urban expansion.
2. **Urbanization Rates**: By 1860, approximately 20% of the U.S. population lived in urban areas, indicating a lower urbanization level compared to Europe. However, the growth rate of urban populations was high, with New York experiencing an annual growth rate of about 7.6%. This growth was fueled by substantial immigration, primarily from Europe, which contributed significantly to urban demographics.
3. **Economic and Labor Dynamics**: The U.S. saw about 20% of its workforce in manufacturing by 1860, with approximately 110,000 manufacturing establishments, marking a burgeoning industrial sector. The influx of immigrants provided a labor force that was essential for the growth of industries and urban centers, significantly diversifying the population.
4. **Social Issues**: Like their European counterparts, urban areas in the U.S. faced challenges related to overcrowding and inadequate infrastructure. In New York, some neighborhoods had population densities exceeding 135,000 people per square mile. These conditions often led to public health concerns and social unrest, prompting the rise of labor movements advocating for workers’ rights and improved living conditions.
5. **Legislative Responses**: The response to urbanization in the U.S. included the formation of labor unions and early labor movements, such as the National Labor Union established in 1866, which aimed to address workers' rights and working conditions. This reflected a growing awareness of the need for social and economic reforms amidst the rapid urban and industrial expansion.
#### Conclusion
In conclusion, urbanization during the Industrial Revolution was a defining characteristic of both Europe and North America, driven by industrialization, economic opportunities, and transportation advancements. Europe, particularly Britain, experienced an earlier and more advanced stage of urbanization, while North America, fueled by immigration and rapid industrial growth, showed a remarkable increase in urban populations. Despite their differences, both regions faced similar challenges related to overcrowding, public health, and labor rights, leading to social changes and movements advocating for reforms. The complexities of urbanization during this transformative era laid the groundwork for the modern urban landscape, shaping socioeconomic structures and influencing future developments in both regions.
"""
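History and Data Analysis Collaboration System
Overview: This section implements a multi-agent collaboration system that combines historical research and data analysis to answer complex historical questions. It uses large language models to simulate specialized agents that work together to produce comprehensive answers.
Motivation: Historical analysis usually requires both deep contextual understanding and the interpretation of quantitative data. By building a system that combines the two, we aim to give more robust and insightful answers to complex historical questions. The approach mimics real-world collaboration between historians and data analysts and may lead to more nuanced, data-driven historical insights.
Key Components
1. Agent class (Agent): the base class for creating specialized AI agents.
2. History research agent (HistoryResearchAgent): specializes in historical context and trends.
3. Data analysis agent (DataAnalysisAgent): focuses on interpreting numerical data and statistics.
4. History-data collaboration system (HistoryDataCollaborationSystem): coordinates the collaboration between the agents.
Method Details
The collaboration system follows these steps:
1. Historical Context: the history agent provides relevant historical background.
2. Data Needs Identification: the data agent determines what quantitative information is needed.
3. Historical Data Provision: the history agent supplies relevant historical data.
4. Data Analysis: the data agent interprets the historical data provided.
5. Final Synthesis: the history agent combines all the insights into a comprehensive answer.
This alternating process moves back and forth between historical context and data analysis, mimicking real-world collaborative research.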
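The five steps above can be traced without calling an LLM. The following is a minimal sketch, assuming a hypothetical `EchoAgent` stand-in that exposes the same `process(task, context)` interface as the `Agent` class and returns a canned string. The helper `run_pipeline` and its prompt wording are also invented for this illustration; they condense the step functions shown earlier and do not replace them.

```python
from typing import Dict, List, Optional, Tuple


class EchoAgent:
    """Hypothetical stand-in for Agent: no LLM call, just reports how much context it saw."""

    def __init__(self, name: str):
        self.name = name

    def process(self, task: str, context: Optional[List[Dict]] = None) -> str:
        seen = len(context) if context else 0
        return f"{self.name} reply (saw {seen} earlier messages)"


def run_pipeline(history_agent, data_agent, question: str) -> Tuple[str, List[Dict]]:
    """Run steps 1-4, appending each reply to the shared context, then synthesize (step 5)."""
    context: List[Dict] = []
    plan = [
        ("History Agent", history_agent, "Provide historical context for: "),
        ("Data Agent", data_agent, "Identify the data needed to answer: "),
        ("History Agent", history_agent, "Provide historical data for: "),
        ("Data Agent", data_agent, "Analyze the data provided for: "),
    ]
    for label, agent, prefix in plan:
        reply = agent.process(prefix + question, context)
        # Every intermediate result is stored as an 'ai' turn that later steps can read
        context.append({"role": "ai", "content": f"{label}: {reply}"})
    final_answer = history_agent.process("Synthesize a final answer.", context)
    return final_answer, context


final_answer, context = run_pipeline(EchoAgent("Clio"), EchoAgent("Data"), "How did cities grow?")
for turn in context:
    print(turn["content"])
print(final_answer)
```

Each agent sees one more message than the previous one, which is the mechanism that lets the history agent and the data agent build on each other's output.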
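Conclusion: The history and data analysis collaboration system demonstrates the potential of multi-agent AI systems for solving complex, interdisciplinary problems. By combining the strengths of historical research and data analysis, it offers a new way to understand historical trends and events. The system can be valuable to researchers, educators, and anyone interested in exploring historical topics in depth.
Future improvements could include adding more specialized agents, incorporating external data sources, and refining the collaboration process to support more nuanced analysis.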
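One possible way to add another specialized agent is to generate step functions from a label and an instruction, so that a new step can be slotted into the `steps` list used by `solve`. The sketch below assumes that design. The `make_step` factory, the `StubAgent` class, and the "Economics Agent" are hypothetical names introduced for illustration only.

```python
from typing import Callable, Dict, List, Optional

# A step takes (agent, task, context) and returns the updated context
Step = Callable[[object, str, List[Dict]], List[Dict]]


def make_step(label: str, instruction: str) -> Step:
    """Build a step function with the same shape as the hand-written ones (hypothetical)."""
    def step(agent, task: str, context: List[Dict]) -> List[Dict]:
        reply = agent.process(f"{instruction} Original task: {task}", context)
        context.append({"role": "ai", "content": f"{label}: {reply}"})
        return context
    return step


class StubAgent:
    """Stand-in with the Agent.process interface, so the sketch runs without an LLM."""

    def __init__(self, name: str):
        self.name = name

    def process(self, task: str, context: Optional[List[Dict]] = None) -> str:
        return f"[{self.name}] handled: {task}"


economic_step = make_step(
    "Economics Agent",
    "Describe the economic factors relevant to the discussion so far.",
)
updated_context = economic_step(StubAgent("Adam"), "Compare urbanization rates.", [])
print(updated_context[0]["content"])
```

In the real system the stub would be replaced by an `Agent` subclass with its own name, role, and skills, and the new `(step, agent)` pair would be placed before the synthesis step.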
Reference: https://github.com/NirDiamant/GenAI_Agents
Video introduction: https://www.bilibili.com/video/BV1zdxCeyEy1