[8] GIF Animation Generator Agent (LangGraph)
[9] TTS Poem Generator Agent (LangGraph)
[10] Music Compositor Agent (LangGraph)
[11] Memory-Enhanced Conversational Agent
[12] Multi-Agent Collaboration System
#Setup and Imports
import os
from typing import TypedDict, Annotated, Sequence, List
from langgraph.graph import Graph, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from openai import OpenAI
from PIL import Image
import io
from IPython.display import display, Image as IPImage
from langchain_core.runnables.graph import MermaidDrawMethod
import asyncio
import aiohttp
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# Set OpenAI API key
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
# Initialize OpenAI client
client = OpenAI()
#Define Data Structures
class GraphState(TypedDict):
    messages: Annotated[Sequence[HumanMessage | AIMessage], "The messages in the conversation"]
    query: Annotated[str, "Input query describing the character and scene"]
    plot: Annotated[str, "Generated plot for the GIF"]
    character_description: Annotated[str, "Detailed description of the main character or object"]
    image_prompts: Annotated[List[str], "List of prompts for each frame"]
    image_urls: Annotated[List[str], "List of URLs for generated images"]
    gif_data: Annotated[bytes, "GIF data in bytes"]
# Initialize the language model
llm = ChatOpenAI(model="gpt-4")
#Define Graph Functions
async def get_image_data(session, url: str):
    """Fetch image data from a given URL."""
    async with session.get(url) as response:
        if response.status == 200:
            return await response.read()
    return None
def generate_character_description(state: GraphState) -> GraphState:
    """Generate a detailed description of the main character or scene."""
    query = state["query"]
    # Triple quotes are required because the prompt spans several lines
    response = llm.invoke([HumanMessage(content=f"""Based on the query '{query}', create a detailed description of the main character, object, or scene.
    Include specific details about appearance, characteristics, and any unique features.
    This description will be used to maintain consistency across multiple images.""")])
    state["character_description"] = response.content
    return state
def generate_plot(state: GraphState) -> GraphState:
    """Generate a 5-step plot for the GIF animation."""
    query = state["query"]
    character_description = state["character_description"]
    response = llm.invoke([HumanMessage(content=f"""Create a short, 5-step plot for a GIF based on this query: '{query}' and featuring this description: {character_description}.
    Each step should be a brief description of a single frame,
    maintaining consistency throughout. Keep it family-friendly and avoid any sensitive themes.""")])
    state["plot"] = response.content
    return state
def generate_image_prompts(state: GraphState) -> GraphState:
    """Generate specific image prompts for each frame of the GIF."""
    plot = state["plot"]
    character_description = state["character_description"]
    response = llm.invoke([HumanMessage(content=f"""Based on this plot: '{plot}' and featuring this description: {character_description}, generate 5 specific, family-friendly image prompts,
    one for each step. Each prompt should be detailed enough for image generation, maintaining consistency, and suitable for DALL-E.
    Always include the following in EVERY prompt to maintain consistency:
    1. A brief reminder of the main character or object's key features
    2. The specific action or scene described in the plot step
    3. Any relevant background or environmental details
    Format each prompt as a numbered list item, like this:
    1. [Your prompt here]
    2. [Your prompt here]
    ... and so on.""")])
    prompts = []
    # Keep only the lines that start with a list number from 1 to 5
    for line in response.content.split('\n'):
        if line.strip().startswith(('1.', '2.', '3.', '4.', '5.')):
            prompt = line.split('.', 1)[1].strip()
            prompts.append(f"Create a detailed, photorealistic image of the following scene: {prompt}")
    if len(prompts) != 5:
        raise ValueError(f"Expected 5 prompts, but got {len(prompts)}. Please try again.")
    state["image_prompts"] = prompts
    return state
async def create_image(prompt: str, retries: int = 3):
    """Generate an image using DALL-E based on the given prompt."""
    for attempt in range(retries):
        try:
            # Run the blocking OpenAI call in a worker thread.
            # Assumed arguments: the source omits them; DALL-E 3, one image per prompt.
            response = await asyncio.to_thread(
                client.images.generate,
                model="dall-e-3",
                prompt=prompt,
                n=1,
            )
            return response.data[0].url
        except Exception as e:
            if attempt == retries - 1:
                print(f"Failed to generate image for prompt: {prompt}")
                print(f"Error: {str(e)}")
                return None
            await asyncio.sleep(2)  # Wait before retrying
async def create_images(state: GraphState) -> GraphState:
    """Generate images for all prompts in parallel."""
    image_prompts = state["image_prompts"]
    tasks = [create_image(prompt) for prompt in image_prompts]
    image_urls = await asyncio.gather(*tasks)
    state["image_urls"] = image_urls
    return state
async def create_gif(state: GraphState) -> GraphState:
    """Create a GIF from the generated images."""
    image_urls = state["image_urls"]
    images = []
    async with aiohttp.ClientSession() as session:
        # Skip prompts whose image generation failed (URL is None)
        tasks = [get_image_data(session, url) for url in image_urls if url]
        image_data_list = await asyncio.gather(*tasks)
    for img_data in image_data_list:
        if img_data:
            images.append(Image.open(io.BytesIO(img_data)))
    if images:
        gif_buffer = io.BytesIO()
        images[0].save(gif_buffer, format='GIF', save_all=True, append_images=images[1:], duration=1000, loop=0)
        state["gif_data"] = gif_buffer.getvalue()
    else:
        state["gif_data"] = None
    return state
#Set Up LangGraph Workflow
workflow = Graph()
workflow.add_node("generate_character_description", generate_character_description)
workflow.add_node("generate_plot", generate_plot)
workflow.add_node("generate_image_prompts", generate_image_prompts)
workflow.add_node("create_images", create_images)
workflow.add_node("create_gif", create_gif)
workflow.add_edge("generate_character_description", "generate_plot")
workflow.add_edge("generate_plot", "generate_image_prompts")
workflow.add_edge("generate_image_prompts", "create_images")
workflow.add_edge("create_images", "create_gif")
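workflow.add_edge("create_gif", END)
# The graph needs an explicit entry point before it can be compiled
workflow.set_entry_point("generate_character_description")
app = workflow.compile()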
#Display Graph Structure
display(
    IPImage(
        app.get_graph().draw_mermaid_png(
            draw_method=MermaidDrawMethod.API,
        )
    )
)
#Run Workflow Function
async def run_workflow(query: str):
    """Run the LangGraph workflow and display results."""
    initial_state = {
        "messages": [],
        "query": query,
        "plot": "",
        "character_description": "",
        "image_prompts": [],
        "image_urls": [],
        "gif_data": None
    }
    try:
        result = await app.ainvoke(initial_state)
        print("Character/Scene Description:")
        print(result["character_description"])
        print("\nGenerated Plot:")
        print(result["plot"])
        print("\nImage Prompts:")
        for i, prompt in enumerate(result["image_prompts"], 1):
            print(f"{i}. {prompt}")
        print("\nGenerated Image URLs:")
        for i, url in enumerate(result["image_urls"], 1):
            print(f"{i}. {url}")
        if result["gif_data"]:
            print("\nGIF generated successfully. Use the next cell to display or save it.")
        else:
            print("\nFailed to generate GIF.")
        return result
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return None
#Execute Workflow
query = "A cat wearing a top hat and monocle, sitting at a desk and writing a letter with a quill pen."
result = await run_workflow(query)
#Display and Save GIF
if result and result["gif_data"]:
    # Display the GIF
    display(IPImage(data=result["gif_data"], format='gif'))
    # Ask if the user wants to save the GIF
    save_gif = input("Do you want to save the GIF? (yes/no): ").lower().strip()
    if save_gif == 'yes':
        filename = input("Enter the filename to save the GIF (e.g., output.gif): ").strip()
        if not filename.endswith('.gif'):
            filename += '.gif'
        with open(filename, 'wb') as f:
            f.write(result["gif_data"])
        print(f"GIF saved as {filename}")
    else:
        print("GIF not saved.")
else:
    print("No GIF data available to display or save.")
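GIF Animation Generator Using LangGraph and DALL-E
2. GPT-4 (via LangChain): Generates the detailed description, plot, and image prompts from the user's initial query.
3. DALL-E 3: Creates high-quality images from the generated prompts.
4. Python Imaging Library (PIL): Combines the individual images into an animated GIF.
5. Asynchronous Programming: Uses asyncio and aiohttp to generate and retrieve images in parallel.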
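The GIF generation process follows these high-level steps:
1. Character/Scene Description: From the user's input query, the system generates a detailed description of the main character or scene.
2. Plot Generation: Using the character description and the original query, it creates a 5-step plot that outlines how the animation progresses.
3. Image Prompt Creation: For each plot step, it generates a specific image prompt, keeping the frames consistent with one another.
4. Image Generation: DALL-E 3 creates an image from each prompt.
5. GIF Assembly: The generated images are compiled into an animated GIF.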
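Step 3 depends on pulling exactly five prompts out of the model's free-form reply. Below is a minimal sketch of that parsing step, assuming the reply follows the numbered-list format requested in the prompt. `parse_numbered_prompts` is a hypothetical helper name, and the regular expression is one possible approach rather than the code the agent itself uses.

```python
import re
from typing import List


def parse_numbered_prompts(text: str, expected: int = 5) -> List[str]:
    """Extract items of the form '1. ...' from a model reply (hypothetical helper)."""
    prompts = []
    for line in text.splitlines():
        # Optional indent, a number, a dot, whitespace, then the prompt text.
        match = re.match(r"\s*(\d+)\.\s+(.*\S)", line)
        if match:
            prompts.append(match.group(2))
    if len(prompts) != expected:
        raise ValueError(f"Expected {expected} prompts, but got {len(prompts)}.")
    return prompts


reply = "Here are the prompts:\n1. A cat at a desk\n2. The cat dips a quill\n3. The cat writes"
print(parse_numbered_prompts(reply, expected=3))
# ['A cat at a desk', 'The cat dips a quill', 'The cat writes']
```

Raising an error when the count is off, as the agent's own code does, keeps a malformed reply from silently producing a shorter animation.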
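Throughout the process, LangGraph manages the flow of information between steps so that each stage's output feeds correctly into the next. Asynchronous programming lets the image generation and retrieval stages run in parallel.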
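The parallel stage can be illustrated without calling any external service. The sketch below assumes a stand-in coroutine, `fake_generate`, in place of the real image request (the function names and the `example.invalid` URLs are hypothetical). It shows how `asyncio.gather` runs the tasks concurrently and returns results in prompt order, with `None` marking a prompt that failed after all retries.

```python
import asyncio
from typing import List, Optional


async def fake_generate(prompt: str) -> str:
    """Stand-in for an image API call (hypothetical); fails on the word 'bad'."""
    await asyncio.sleep(0.01)  # simulate network latency
    if "bad" in prompt:
        raise RuntimeError("generation failed")
    return f"https://example.invalid/{prompt.replace(' ', '_')}.png"


async def generate_with_retries(prompt: str, retries: int = 3) -> Optional[str]:
    """Try a prompt up to `retries` times; return None if every attempt fails."""
    for attempt in range(retries):
        try:
            return await fake_generate(prompt)
        except Exception:
            if attempt == retries - 1:
                return None
            await asyncio.sleep(0.01)  # brief pause before retrying
    return None


async def generate_all(prompts: List[str]) -> List[Optional[str]]:
    """Run all prompts concurrently; results keep the order of `prompts`."""
    return await asyncio.gather(*(generate_with_retries(p) for p in prompts))


urls = asyncio.run(generate_all(["frame one", "bad frame", "frame three"]))
print(urls)
```

Inside a notebook, where an event loop is already running, `await generate_all([...])` would replace the `asyncio.run` call. Because failed prompts come back as `None`, the later GIF step has to filter them out, which the agent does with `if url`.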
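The system's modular design, made possible by LangGraph, allows individual components to be updated or replaced easily, so the project can adapt to future advances in language models or image generation.
The current implementation focuses on simple 5-frame GIFs, but the concept could be extended to longer animations, to user feedback at intermediate stages, or to integration with other media types. As AI continues to develop, tools like this are likely to play a growing role in bridging human creativity and machine capability.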
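Overview: An advanced text-to-speech (TTS) agent built with LangGraph and the OpenAI API. It classifies the input text, processes it according to its content type, and generates the corresponding speech output.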
#Import necessary libraries and set up environment
# Import required libraries
from typing import TypedDict
from langgraph.graph import StateGraph, END
from IPython.display import display, Audio, Markdown
from openai import OpenAI
from dotenv import load_dotenv
import io
import tempfile
import re
import os
# Load environment variables and set OpenAI API key
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
#Initialize OpenAI client and define state
client = OpenAI()
class AgentState(TypedDict):
    input_text: str
    processed_text: str
    audio_data: bytes
    audio_path: str
    content_type: str
#Define Node Functions
# Note: the model name below is an assumption; the source omits it.
def classify_content(state: AgentState) -> AgentState:
    """Classify the input text into one of four categories: general, poem, news, or joke."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Classify the content as one of: 'general', 'poem', 'news', 'joke'."},
            {"role": "user", "content": state["input_text"]}
        ]
    )
    state["content_type"] = response.choices[0].message.content.strip().lower()
    return state
def process_general(state: AgentState) -> AgentState:
    """Process general content (no specific processing, return as-is)."""
    state["processed_text"] = state["input_text"]
    return state
def process_poem(state: AgentState) -> AgentState:
    """Process the input text as a poem, rewriting it in a poetic style."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Rewrite the following text as a short, beautiful poem:"},
            {"role": "user", "content": state["input_text"]}
        ]
    )
    state["processed_text"] = response.choices[0].message.content.strip()
    return state
def process_news(state: AgentState) -> AgentState:
    """Process the input text as news, rewriting it in a formal news anchor style."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Rewrite the following text in a formal news anchor style:"},
            {"role": "user", "content": state["input_text"]}
        ]
    )
    state["processed_text"] = response.choices[0].message.content.strip()
    return state
def process_joke(state: AgentState) -> AgentState:
    """Process the input text as a joke, turning it into a short, funny joke."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Turn the following text into a short, funny joke:"},
            {"role": "user", "content": state["input_text"]}
        ]
    )
    state["processed_text"] = response.choices[0].message.content.strip()
    return state
def text_to_speech(state: AgentState, save_file: bool = False) -> AgentState:
    """
    Converts processed text into speech using a voice mapped to the content type.
    Optionally saves the audio to a file.

    Args:
        state (AgentState): Dictionary containing the processed text and content type.
        save_file (bool, optional): If True, saves the audio to a file. Defaults to False.

    Returns:
        AgentState: Updated state with audio data and file path (if saved).
    """
    # Map content type to a voice, defaulting to "alloy"
    voice_map = {
        "general": "alloy",
        "poem": "nova",
        "news": "onyx",
        "joke": "shimmer"
    }
    voice = voice_map.get(state["content_type"], "alloy")
    audio_data = io.BytesIO()
    # Generate speech and stream audio data into memory
    # (the "tts-1" model name is an assumption; the source omits the call arguments)
    with client.audio.speech.with_streaming_response.create(
        model="tts-1",
        voice=voice,
        input=state["processed_text"]
    ) as response:
        for chunk in response.iter_bytes():
            audio_data.write(chunk)
    state["audio_data"] = audio_data.getvalue()
    # Save audio to a file if requested
    if save_file:
        with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as temp_audio:
            temp_audio.write(state["audio_data"])
            state["audio_path"] = temp_audio.name
    else:
        state["audio_path"] = ""
    return state
#Define and Compile the Graph
# Define the graph
workflow = StateGraph(AgentState)
# Add nodes to the graph
workflow.add_node("classify_content", classify_content)
workflow.add_node("process_general", process_general)
workflow.add_node("process_poem", process_poem)
workflow.add_node("process_news", process_news)
workflow.add_node("process_joke", process_joke)
workflow.add_node("text_to_speech", text_to_speech)
# Set the entry point of the graph
workflow.set_entry_point("classify_content")
# Define conditional edges based on content type
workflow.add_conditional_edges(
    "classify_content",
    lambda x: x["content_type"],
    {
        "general": "process_general",
        "poem": "process_poem",
        "news": "process_news",
        "joke": "process_joke",
    }
)
# Connect processors to text-to-speech
workflow.add_edge("process_general", "text_to_speech")
workflow.add_edge("process_poem", "text_to_speech")
workflow.add_edge("process_news", "text_to_speech")
workflow.add_edge("process_joke", "text_to_speech")
# End the run once the audio has been generated
workflow.add_edge("text_to_speech", END)
# Compile the graph
app = workflow.compile()
#A function to convert text to a valid informative filename
def sanitize_filename(text, max_length=20):
    """Convert text to a valid and concise filename."""
    sanitized = re.sub(r'[^\w\s-]', '', text.lower())  # Drop punctuation
    sanitized = re.sub(r'[-\s]+', '_', sanitized)  # Collapse spaces and hyphens into underscores
    return sanitized[:max_length]
#Define Function to Run Agent and Play Audio
def run_tts_agent_and_play(input_text: str, content_type: str, save_file: bool = True):
    result = app.invoke({
        "input_text": input_text,
        "processed_text": "",
        "audio_data": b"",
        "audio_path": "",
        "content_type": content_type
    })
    print(f"Detected content type: {result['content_type']}")
    print(f"Processed text: {result['processed_text']}")
    # Play the audio (this will only work in local Jupyter environment)
    display(Audio(result['audio_data'], autoplay=True))
    if save_file:
        # Create 'audio' directory in the parent folder of the notebook
        audio_dir = os.path.join('..', 'audio')
        os.makedirs(audio_dir, exist_ok=True)
        sanitized_text = sanitize_filename(input_text)
        file_name = f"{content_type}_{sanitized_text}.mp3"
        file_path = os.path.join(audio_dir, file_name)
        with open(file_path, "wb") as f:
            f.write(result['audio_data'])
        print(f"Audio saved to: {file_path}")
        # Relative path for GitHub
        github_relative_path = f"../audio/{file_name}"
        display(Markdown(f"[Download {content_type} audio: {sanitized_text}]({github_relative_path})"))
        # Note about GitHub limitations
        print("Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.")
    else:
        print("Audio not saved to file.")
    return result
#Test the Text-to-Speech Agent
examples = {
    "general": "The quick brown fox jumps over the lazy dog.",
    "poem": "Roses are red, violets are blue, AI is amazing, and so are you!",
    "news": "Breaking news: Scientists discover a new species of deep-sea creature in the Mariana Trench.",
    "joke": "Why don't scientists trust atoms? Because they make up everything!"
}
for content_type, text in examples.items():
    print(f"\nProcessing example for {content_type} content:")
    print(f"Input text: {text}")
    # Run the TTS agent and save the file
    result = run_tts_agent_and_play(text, content_type, save_file=True)
    print("-" * 50)
print("All examples processed. You can download the audio files using the links above.")
Processing example for general content:
Input text: The quick brown fox jumps over the lazy dog.
Detected content type: poem
Processed text: In autumn's breeze, the swift fox leaps,
Above a slumbering dog it sweeps.
With grace it dances, swift and free,
A tale of motion, poetry.
Audio saved to: ..\audio\general_the_quick_brown_fox_.mp3
Download general audio: the_quick_brown_fox_
Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.
Processing example for poem content:
Input text: Roses are red, violets are blue, AI is amazing, and so are you!
Detected content type: poem
Processed text: In the garden of knowledge, where data blooms bright,
Up to October's end, you shed your soft light.
With wisdom and insight, like stars in the sky,
AI is enchanting, oh, how you can fly!
Audio saved to: ..\audio\poem_roses_are_red_violet.mp3
Download poem audio: roses_are_red_violet
Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.
Processing example for news content:
Input text: Breaking news: Scientists discover a new species of deep-sea creature in the Mariana Trench.
Detected content type: news
Processed text: Good evening. In breaking news, scientists have made a remarkable discovery, identifying a new species of deep-sea creature located within the depths of the Mariana Trench. This finding not only expands our understanding of marine biodiversity but also highlights the importance of continued exploration in these largely uncharted waters. We will provide more details on this groundbreaking announcement as they become available.
Audio saved to: ..\audio\news_breaking_news_scient.mp3
Download news audio: breaking_news_scient
Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.
Processing example for joke content:
Input text: Why don't scientists trust atoms? Because they make up everything!
Detected content type: joke
Processed text: Why don’t AI assistants tell jokes after October 2023? Because they’re still trying to figure out what happened in November!
Audio saved to: ..\audio\joke_why_dont_scientists_.mp3
Download joke audio: why_dont_scientists_
Note: Audio playback is not supported directly on GitHub. Use the download link to listen to the audio.
All examples processed. You can download the audio files using the links above.
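Building an Intelligent Text-to-Speech Agent with LangGraph and OpenAI
1. Content Classification: Uses an OpenAI GPT model to classify the input text.
2. Content Processing: Applies processing specific to the content type (general, poem, news, or joke).
3. Text-to-Speech Conversion: Uses OpenAI's TTS API to generate audio from the processed text.
4. LangGraph Workflow: Coordinates the whole process with a state graph.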
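The TTS agent works through the following high-level steps:
1. Text Input: The system receives text input from the user.
2. Content Classification: The input is classified into one of four categories: general, poem, news, or joke.
3. Content-Specific Processing: Depending on the classification, the text is processed differently. General text is left unchanged, poems are rewritten to be more poetic, news is reformatted in a formal news-anchor style, and jokes are refined to be funnier.
4. Text-to-Speech Conversion: The processed text is converted to speech using a voice suited to its content type.
5. Audio Output: Depending on the user's preference, the generated audio is either saved to a file or played directly.
The whole workflow is managed by a LangGraph state machine, which ensures smooth transitions between processing stages and preserves context throughout the run.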
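One practical detail in the classification step: the routing table has entries only for the four labels, so a reply such as `Poem.` or an unexpected category would not match any branch. Below is a minimal sketch of a normalizing router, assuming the same four labels and voice mapping as the code above. `normalize_label` and `route` are hypothetical helpers, not part of LangGraph.

```python
VOICES = {"general": "alloy", "poem": "nova", "news": "onyx", "joke": "shimmer"}


def normalize_label(raw: str) -> str:
    """Map a raw classifier reply to a known label, defaulting to 'general'."""
    # Trim whitespace, then surrounding quotes and periods, then lowercase.
    cleaned = raw.strip().strip("'\".").lower()
    return cleaned if cleaned in VOICES else "general"


def route(raw_label: str) -> tuple:
    """Return (processing node name, TTS voice) for a classifier reply."""
    label = normalize_label(raw_label)
    return f"process_{label}", VOICES[label]


print(route("Poem."))   # ('process_poem', 'nova')
print(route("haiku"))   # ('process_general', 'alloy')
```

In the graph itself, one option would be to apply the same normalization inside `classify_content` before storing `content_type`, so the routing lambda always sees one of the four expected keys.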
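By combining the text-processing strengths of GPT models with OpenAI's TTS capabilities, this project shows how several AI techniques can be integrated into a multi-step language processing pipeline.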
# Import required libraries
from typing import Dict, TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
import music21
import pygame
import tempfile
import os
import random
from dotenv import load_dotenv
# Load environment variables and set OpenAI API key
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
#State Definition
class MusicState(TypedDict):
    """Define the structure of the state for the music generation workflow."""
    musician_input: str  # User's input describing the desired music
    melody: str          # Generated melody
    harmony: str         # Generated harmony
    rhythm: str          # Generated rhythm
    style: str           # Desired musical style
    composition: str     # Complete musical composition
    midi_file: str       # Path to the generated MIDI file
#LLM Initialization
# Initialize the ChatOpenAI model
llm = ChatOpenAI(model="gpt-4o-mini")
#Component Functions
def melody_generator(state: MusicState) -> Dict:
    """Generate a melody based on the user's input."""
    prompt = ChatPromptTemplate.from_template(
        "Generate a melody based on this input: {input}. Represent it as a string of notes in music21 format."
    )
    chain = prompt | llm
    melody = chain.invoke({"input": state["musician_input"]})
    return {"melody": melody.content}
def harmony_creator(state: MusicState) -> Dict:
    """Create harmony for the generated melody."""
    prompt = ChatPromptTemplate.from_template(
        "Create harmony for this melody: {melody}. Represent it as a string of chords in music21 format."
    )
    chain = prompt | llm
    harmony = chain.invoke({"melody": state["melody"]})
    return {"harmony": harmony.content}
def rhythm_analyzer(state: MusicState) -> Dict:
    """Analyze and suggest a rhythm for the melody and harmony."""
    prompt = ChatPromptTemplate.from_template(
        "Analyze and suggest a rhythm for this melody and harmony: {melody}, {harmony}. Represent it as a string of durations in music21 format."
    )
    chain = prompt | llm
    rhythm = chain.invoke({"melody": state["melody"], "harmony": state["harmony"]})
    return {"rhythm": rhythm.content}
def style_adapter(state: MusicState) -> Dict:
    """Adapt the composition to the specified musical style."""
    prompt = ChatPromptTemplate.from_template(
        "Adapt this composition to the {style} style: Melody: {melody}, Harmony: {harmony}, Rhythm: {rhythm}. Provide the result in music21 format."
    )
    chain = prompt | llm
    adapted = chain.invoke({
        "style": state["style"],
        "melody": state["melody"],
        "harmony": state["harmony"],
        "rhythm": state["rhythm"]
    })
    return {"composition": adapted.content}
def midi_converter(state: MusicState) -> Dict:
    """Convert the composition to MIDI format and save it as a file."""
    # Create a new stream
    piece = music21.stream.Score()
    # Add the composition description to the stream as a text expression
    description = music21.expressions.TextExpression(state["composition"])
    piece.insert(0, description)
    # Define a wide variety of scales and chords
    scales = {
        'C major': ['C', 'D', 'E', 'F', 'G', 'A', 'B'],
        'C minor': ['C', 'D', 'Eb', 'F', 'G', 'Ab', 'Bb'],
        'C harmonic minor': ['C', 'D', 'Eb', 'F', 'G', 'Ab', 'B'],
        'C melodic minor': ['C', 'D', 'Eb', 'F', 'G', 'A', 'B'],
        'C dorian': ['C', 'D', 'Eb', 'F', 'G', 'A', 'Bb'],
        'C phrygian': ['C', 'Db', 'Eb', 'F', 'G', 'Ab', 'Bb'],
        'C lydian': ['C', 'D', 'E', 'F#', 'G', 'A', 'B'],
        'C mixolydian': ['C', 'D', 'E', 'F', 'G', 'A', 'Bb'],
        'C locrian': ['C', 'Db', 'Eb', 'F', 'Gb', 'Ab', 'Bb'],
        'C whole tone': ['C', 'D', 'E', 'F#', 'G#', 'A#'],
        'C diminished': ['C', 'D', 'Eb', 'F', 'Gb', 'Ab', 'A', 'B'],
    }
    chords = {
        'C major': ['C4', 'E4', 'G4'],
        'C minor': ['C4', 'Eb4', 'G4'],
        'C diminished': ['C4', 'Eb4', 'Gb4'],
        'C augmented': ['C4', 'E4', 'G#4'],
        'C dominant 7th': ['C4', 'E4', 'G4', 'Bb4'],
        'C major 7th': ['C4', 'E4', 'G4', 'B4'],
        'C minor 7th': ['C4', 'Eb4', 'G4', 'Bb4'],
        'C half-diminished 7th': ['C4', 'Eb4', 'Gb4', 'Bb4'],
        'C fully diminished 7th': ['C4', 'Eb4', 'Gb4', 'A4'],
    }
    def create_melody(scale_name, duration):
        """Create a melody based on a given scale."""
        melody = music21.stream.Part()
        scale = scales[scale_name]
        for _ in range(duration):
            note = music21.note.Note(random.choice(scale) + '4')
            note.quarterLength = 1
            melody.append(note)
        return melody
    def create_chord_progression(duration):
        """Create a chord progression."""
        harmony = music21.stream.Part()
        for _ in range(duration):
            chord_name = random.choice(list(chords.keys()))
            chord = music21.chord.Chord(chords[chord_name])
            chord.quarterLength = 1
            harmony.append(chord)
        return harmony
    # Parse the user input to determine scale and style
    user_input = state['musician_input'].lower()
    if 'minor' in user_input:
        scale_name = 'C minor'
    elif 'major' in user_input:
        scale_name = 'C major'
    else:
        scale_name = random.choice(list(scales.keys()))
    # Create the first 7 beats of the piece (7 seconds at 60 BPM)
    melody = create_melody(scale_name, 7)
    harmony = create_chord_progression(7)
    # Add a final quarter note so the piece is exactly 8 beats (8 seconds at 60 BPM)
    final_note = music21.note.Note(scales[scale_name][0] + '4')
    final_note.quarterLength = 1
    melody.append(final_note)
    # Only some scale names have a matching chord, so fall back to the tonic triad
    fallback = chords['C minor'] if 'Eb' in scales[scale_name] else chords['C major']
    final_chord = music21.chord.Chord(chords.get(scale_name, fallback))
    final_chord.quarterLength = 1
    harmony.append(final_chord)
    # Add the melody and harmony to the piece
    piece.append(melody)
    piece.append(harmony)
    # Set the tempo to 60 BPM
    piece.insert(0, music21.tempo.MetronomeMark(number=60))
    # Create a temporary MIDI file
    with tempfile.NamedTemporaryFile(delete=False, suffix='.mid') as temp_midi:
        piece.write('midi', temp_midi.name)
    return {"midi_file": temp_midi.name}
#Graph Construction
# Initialize the StateGraph
workflow = StateGraph(MusicState)
# Add nodes to the graph
workflow.add_node("melody_generator", melody_generator)
workflow.add_node("harmony_creator", harmony_creator)
workflow.add_node("rhythm_analyzer", rhythm_analyzer)
workflow.add_node("style_adapter", style_adapter)
workflow.add_node("midi_converter", midi_converter)
# Set the entry point of the graph
workflow.set_entry_point("melody_generator")
# Add edges to connect the nodes
workflow.add_edge("melody_generator", "harmony_creator")
workflow.add_edge("harmony_creator", "rhythm_analyzer")
workflow.add_edge("rhythm_analyzer", "style_adapter")
workflow.add_edge("style_adapter", "midi_converter")
workflow.add_edge("midi_converter", END)
# Compile the graph
app = workflow.compile()
#Run the Workflow
# Define input parameters
inputs = {
    "musician_input": "Create a happy piano piece in C major",
    "style": "Romantic era"
}
# Invoke the workflow
result = app.invoke(inputs)
print("Composition created")
print(f"MIDI file saved at: {result['midi_file']}")
#MIDI Playback Function
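def play_midi(midi_file_path):
    """Play the generated MIDI file."""
    pygame.mixer.init()
    pygame.mixer.music.load(midi_file_path)
    pygame.mixer.music.play()
    # Wait for playback to finish
    while pygame.mixer.music.get_busy():
        pygame.time.Clock().tick(10)
    # Clean up
    pygame.mixer.quit()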
print("To create and play a melody, run the following in a new cell:")
To create and play a melody, run the following in a new cell:
#Play the Generated Music
play_midi(result["midi_file"])
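1. State Management: Uses the MusicState class to manage the workflow state.
2. Language Model: Uses ChatOpenAI (gpt-4o-mini in the code above) to generate the musical components.
3. Musical Functions: Melody Generator, Harmony Creator, Rhythm Analyzer, and Style Adapter.
4. MIDI Conversion: Converts the composition into a playable MIDI file.
5. LangGraph Workflow: Coordinates the whole composition process with a state graph.
6. Playback Functionality: Allows the generated piece to be played immediately.
5. Convert the final composition to MIDI format.
The generated MIDI file can be played with pygame.
The whole process is orchestrated by LangGraph, which manages the flow of information between components and ensures that each step builds on the previous one.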
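The way each step builds on the previous one can be sketched in plain Python. In the code above, every node returns only the keys it produced, and the graph merges that partial update into the shared state. The sketch below imitates that merge with stub nodes. The stub functions and `run_pipeline` are hypothetical stand-ins for illustration, not LangGraph's implementation.

```python
from typing import Callable, Dict, List

State = Dict[str, str]


def melody_stub(state: State) -> State:
    """Return only the key this node produces."""
    return {"melody": f"melody for: {state['musician_input']}"}


def harmony_stub(state: State) -> State:
    """Read the key written by the previous node."""
    return {"harmony": f"harmony under ({state['melody']})"}


def style_stub(state: State) -> State:
    """Combine earlier outputs into the final composition text."""
    return {"composition": f"{state['style']} version of {state['melody']} + {state['harmony']}"}


def run_pipeline(initial: State, nodes: List[Callable[[State], State]]) -> State:
    """Run nodes in order, merging each partial update into the state."""
    state = dict(initial)
    for node in nodes:
        state.update(node(state))
    return state


final = run_pipeline(
    {"musician_input": "happy piano piece", "style": "Romantic era"},
    [melody_stub, harmony_stub, style_stub],
)
print(sorted(final))
# ['composition', 'harmony', 'melody', 'musician_input', 'style']
```

Returning partial updates keeps each node small: a node needs to know only which keys it reads and which it writes.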
#Setup and Imports
from langchain_openai import ChatOpenAI
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.memory import ChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from dotenv import load_dotenv
import os
# Load environment variables
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
# Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini", max_tokens=1000, temperature=0)
#Memory Stores
chat_store = {}        # Short-term memory: full message history per session
long_term_memory = {}  # Long-term memory: selected user inputs per session
def get_chat_history(session_id: str):
    if session_id not in chat_store:
        chat_store[session_id] = ChatMessageHistory()
    return chat_store[session_id]
def update_long_term_memory(session_id: str, input: str, output: str):
    if session_id not in long_term_memory:
        long_term_memory[session_id] = []
    if len(input) > 20:  # Simple logic: store inputs longer than 20 characters
        long_term_memory[session_id].append(f"User said: {input}")
    if len(long_term_memory[session_id]) > 5:  # Keep only last 5 memories
        long_term_memory[session_id] = long_term_memory[session_id][-5:]
def get_long_term_memory(session_id: str):
    return ". ".join(long_term_memory.get(session_id, []))
#Prompt Template
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Use the information from long-term memory if relevant."),
    ("system", "Long-term memory: {long_term_memory}"),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])
#Conversational Chain
chain = prompt | llm
chain_with_history = RunnableWithMessageHistory(
    chain,
    get_chat_history,
    input_messages_key="input",
    history_messages_key="history"
)
#Chat Function
def chat(input_text: str, session_id: str):
    long_term_mem = get_long_term_memory(session_id)
    response = chain_with_history.invoke(
        {"input": input_text, "long_term_memory": long_term_mem},
        config={"configurable": {"session_id": session_id}}
    )
    update_long_term_memory(session_id, input_text, response.content)
    return response.content
#Example Usage
session_id = "user_123"
print("AI:", chat("Hello! My name is Alice.", session_id))
print("AI:", chat("What's the weather like today?", session_id))
print("AI:", chat("I love sunny days.", session_id))
print("AI:", chat("Do you remember my name?", session_id))
AI: Hello, Alice! How can I assist you today?
AI: I don't have real-time weather data, but you can check a weather website or app for the most accurate and up-to-date information. If you tell me your location, I can suggest what to look for!
AI: Sunny days are wonderful! They can really lift your mood and are perfect for outdoor activities. Do you have any favorite things you like to do on sunny days?
AI: Yes, your name is Alice! How can I assist you further today?
#Review Memory
print("Conversation History:")
for message in chat_store[session_id].messages:
    print(f"{message.type}: {message.content}")
print("\nLong-term Memory:")
print(get_long_term_memory(session_id))
Conversation History:
human: Hello! My name is Alice.
ai: Hello, Alice! How can I assist you today?
human: What's the weather like today?
ai: I don't have real-time weather data, but you can check a weather website or app for the most accurate and up-to-date information. If you tell me your location, I can suggest what to look for!
human: I love sunny days.
ai: Sunny days are wonderful! They can really lift your mood and are perfect for outdoor activities. Do you have any favorite things you like to do on sunny days?
human: Do you remember my name?
ai: Yes, your name is Alice! How can I assist you further today?
Long-term Memory:
User said: Hello! My name is Alice.. User said: What's the weather like today?. User said: Do you remember my name?
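The long-term memory printed above omits "I love sunny days." because that message is 18 characters long, and the storage rule keeps only inputs longer than 20 characters. The rule can be replayed in isolation. This sketch reuses the logic of `update_long_term_memory` with a local list and no model calls; `remember` is a hypothetical name.

```python
from typing import List


def remember(memory: List[str], user_input: str, min_len: int = 20, keep: int = 5) -> List[str]:
    """Store inputs longer than `min_len` characters and keep the last `keep` entries."""
    if len(user_input) > min_len:
        memory = memory + [f"User said: {user_input}"]
    return memory[-keep:]


memory: List[str] = []
for text in [
    "Hello! My name is Alice.",
    "What's the weather like today?",
    "I love sunny days.",
    "Do you remember my name?",
]:
    memory = remember(memory, text)

print(". ".join(memory))
```

The doubled period after "Alice." in the output comes from joining entries that already end in punctuation with `". "`. A length threshold is a deliberately simple filter, and a production system would more likely choose what to store based on content.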
1. Setting Up the Environment
#Import required libraries
import os
import time
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage
from typing import List, Dict
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
#Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini", max_tokens=1000, temperature=0.7)
#Define the base Agent class
class Agent:
    def __init__(self, name: str, role: str, skills: List[str]):
        self.name = name
        self.role = role
        self.skills = skills
        self.llm = llm
    def process(self, task: str, context: List[Dict] = None) -> str:
        messages = [
            SystemMessage(content=f"You are {self.name}, a {self.role}. Your skills include: {', '.join(self.skills)}. Respond to the task based on your role and skills.")
        ]
        # Replay earlier turns so the agent sees the shared context
        if context:
            for msg in context:
                if msg['role'] == 'human':
                    messages.append(HumanMessage(content=msg['content']))
                elif msg['role'] == 'ai':
                    messages.append(AIMessage(content=msg['content']))
        messages.append(HumanMessage(content=task))
        response = self.llm.invoke(messages)
        return response.content
#Define specialized agents: HistoryResearchAgent and DataAnalysisAgent
class HistoryResearchAgent(Agent):
    def __init__(self):
        super().__init__("Clio", "History Research Specialist", ["deep knowledge of historical events", "understanding of historical contexts", "identifying historical trends"])
class DataAnalysisAgent(Agent):
    def __init__(self):
        super().__init__("Data", "Data Analysis Expert", ["interpreting numerical data", "statistical analysis", "data visualization description"])
#Define the different functions for the collaboration system
#Research Historical Context
def research_historical_context(history_agent, task: str, context: list) -> list:
print("?️ History Agent: Researching historical context...")
history_task = f"Provide relevant historical context and information for the following task: {task}"
history_result = history_agent.process(history_task)
context.append({"role": "ai", "content": f"History Agent: {history_result}"})
print(f"? Historical context provided: {history_result[:100]}...\n")
return context
#Identify Data Needs
def identify_data_needs(data_agent, task: str, context: list) -> list:
print("? Data Agent: Identifying data needs based on historical context...")
historical_context = context[-1]["content"]
data_need_task = f"Based on the historical context, what specific data or statistical information would be helpful to answer the original question? Historical context: {historical_context}"
data_need_result = data_agent.process(data_need_task, context)
context.append({"role": "ai", "content": f"Data Agent: {data_need_result}"})
print(f"? Data needs identified: {data_need_result[:100]}...\n")
return context
#Provide Historical Data
def provide_historical_data(history_agent, task: str, context: list) -> list:
    print("History Agent: Providing relevant historical data...")
    data_needs = context[-1]["content"]
    data_provision_task = f"Based on the data needs identified, provide relevant historical data or statistics. Data needs: {data_needs}"
    data_provision_result = history_agent.process(data_provision_task, context)
    context.append({"role": "ai", "content": f"History Agent: {data_provision_result}"})
    print(f"Historical data provided: {data_provision_result[:100]}...\n")
    return context
#Analyze Data
def analyze_data(data_agent, task: str, context: list) -> list:
    print("Data Agent: Analyzing historical data...")
    historical_data = context[-1]["content"]
    analysis_task = f"Analyze the historical data provided and describe any trends or insights relevant to the original task. Historical data: {historical_data}"
    analysis_result = data_agent.process(analysis_task, context)
    context.append({"role": "ai", "content": f"Data Agent: {analysis_result}"})
    print(f"Data analysis results: {analysis_result[:100]}...\n")
    return context
#Synthesize Final Answer
def synthesize_final_answer(history_agent, task: str, context: list) -> str:
    print("History Agent: Synthesizing final answer...")
    synthesis_task = "Based on all the historical context, data, and analysis, provide a comprehensive answer to the original task."
    final_result = history_agent.process(synthesis_task, context)
    return final_result
#HistoryDataCollaborationSystem Class
class HistoryDataCollaborationSystem:
    def __init__(self):
        self.history_agent = Agent("Clio", "History Research Specialist", ["deep knowledge of historical events", "understanding of historical contexts", "identifying historical trends"])
        self.data_agent = Agent("Data", "Data Analysis Expert", ["interpreting numerical data", "statistical analysis", "data visualization description"])
    def solve(self, task: str, timeout: int = 300) -> str:
        print(f"\nStarting collaboration to solve: {task}\n")
        start_time = time.time()
        context = []
        # Each step is a (function, agent) pair, run in order
        steps = [
            (research_historical_context, self.history_agent),
            (identify_data_needs, self.data_agent),
            (provide_historical_data, self.history_agent),
            (analyze_data, self.data_agent),
            (synthesize_final_answer, self.history_agent),
        ]
        try:
            for step_func, agent in steps:
                # The timeout is checked before each step starts
                if time.time() - start_time > timeout:
                    return "Operation timed out. The process took too long to complete."
                result = step_func(agent, task, context)
                if isinstance(result, str):
                    return result  # This is the final answer
                context = result
        except Exception as e:
            return f"Error during collaboration: {str(e)}"
        print("\n✅ Collaboration complete. Final answer synthesized.\n")
        return context[-1]["content"]
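Note that `solve` only compares the elapsed time against `timeout` before each step starts, so a step that is already running is never interrupted. The sketch below isolates that behavior. `run_with_deadline`, `slow_step`, and `final_answer` are hypothetical names used only for this illustration, and `time.monotonic()` is swapped in for `time.time()` because it is unaffected by system clock changes.

```python
import time


def run_with_deadline(steps, timeout: float) -> str:
    """Run steps in order, checking the deadline only between steps."""
    start = time.monotonic()
    for step in steps:
        if time.monotonic() - start > timeout:
            return "Operation timed out. The process took too long to complete."
        result = step()
        if isinstance(result, str):
            return result  # a string result ends the run, as in solve()
    return "no final answer"


def slow_step():
    """Hypothetical step that stands in for a slow LLM call."""
    time.sleep(0.05)
    return None


def final_answer():
    """Hypothetical last step that returns the final string."""
    return "done"


# The slow step always runs to completion; the deadline is noticed afterwards.
timed_out = run_with_deadline([slow_step, final_answer], timeout=0.01)
completed = run_with_deadline([slow_step, final_answer], timeout=5.0)
print(timed_out)
print(completed)
```

If a hard limit per step is needed, the timeout would have to be enforced inside the model call itself rather than in this loop.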
#Example usage
# Create an instance of the collaboration system
collaboration_system = HistoryDataCollaborationSystem()
# Define a complex historical question that requires both historical knowledge and data analysis
question = "How did urbanization rates in Europe compare to those in North America during the Industrial Revolution, and what were the main factors influencing these trends?"
# Solve the question using the collaboration system
result = collaboration_system.solve(question)
# Print the result
print(result)
Starting collaboration to solve: How did urbanization rates in Europe compare to those in North America during the Industrial Revolution, and what were the main factors influencing these trends?
History Agent: Researching historical context...
Historical context provided: During the Industrial Revolution, which generally spanned from the late 18th century to the mid-19th...
Data Agent: Identifying data needs based on historical context...
Data needs identified: To analyze the urbanization phenomenon during the Industrial Revolution in Europe and North America ...
History Agent: Providing relevant historical data...
Historical data provided: Here is some relevant historical data and statistics that pertain to the urbanization phenomenon dur...
Data Agent: Analyzing historical data...
Data analysis results: Data Agent: Analyzing the historical data provided reveals several key trends and insights regarding...
History Agent: Synthesizing final answer...
### Urbanization During the Industrial Revolution: A Comparative Analysis of Europe and North America
The Industrial Revolution, spanning from the late 18th century to the mid-19th century, marked a transformative era characterized by significant changes in economic structures, social dynamics, and urban development. Urbanization emerged as a crucial phenomenon during this period, particularly in Europe and North America, albeit with notable differences in the pace, scale, and nature of urban growth between the two regions.
#### Urbanization in Europe
1. **Origins and Growth**: The Industrial Revolution began in Britain around the 1760s, leading to rapid industrial growth and a shift from agrarian to industrial economies. Cities such as Manchester, Birmingham, and London witnessed explosive population growth. For example, London’s population surged from approximately 1 million in 1801 to 2.5 million by 1851, while Manchester grew from 75,000 to 300,000 during the same period.
2. **Rate of Urbanization**: By 1851, about 50% of Britain's population lived in urban areas, reflecting a significant urbanization trend. The annual growth rates in major cities were substantial, with Manchester experiencing an approximate 4.6% growth rate. This rapid urbanization was driven by the promise of jobs in factories and improved transportation networks, such as railways and canals, which facilitated the movement of goods and people.
3. **Social and Economic Shifts**: The urban workforce transformed dramatically, with roughly 50% of the British workforce engaged in manufacturing by mid-century. This shift led to the emergence of a distinct working class and significant social changes, including increased labor organization and political activism, exemplified by movements like Chartism.
4. **Challenges**: Urbanization brought about severe social challenges, including overcrowding, poor living conditions, and public health crises. For instance, cholera outbreaks in London during the 1840s underscored the dire consequences of rapid urban growth, as many urban areas lacked adequate sanitation and housing.
#### Urbanization in North America
1. **Emergence and Growth**: North America, particularly the United States, began its industrialization later, gaining momentum in the early to mid-19th century. Cities like New York and Chicago became pivotal industrial and urban centers. New York City's population grew from around 60,000 in 1800 to over 1.1 million by 1860, showcasing a remarkable urban expansion.
2. **Urbanization Rates**: By 1860, approximately 20% of the U.S. population lived in urban areas, indicating a lower urbanization level compared to Europe. However, the growth rate of urban populations was high, with New York experiencing an annual growth rate of about 7.6%. This growth was fueled by substantial immigration, primarily from Europe, which contributed significantly to urban demographics.
3. **Economic and Labor Dynamics**: The U.S. saw about 20% of its workforce in manufacturing by 1860, with approximately 110,000 manufacturing establishments, marking a burgeoning industrial sector. The influx of immigrants provided a labor force that was essential for the growth of industries and urban centers, significantly diversifying the population.
4. **Social Issues**: Like their European counterparts, urban areas in the U.S. faced challenges related to overcrowding and inadequate infrastructure. In New York, some neighborhoods had population densities exceeding 135,000 people per square mile. These conditions often led to public health concerns and social unrest, prompting the rise of labor movements advocating for workers’ rights and improved living conditions.
5. **Legislative Responses**: The response to urbanization in the U.S. included the formation of labor unions and early labor movements, such as the National Labor Union established in 1866, which aimed to address workers' rights and working conditions. This reflected a growing awareness of the need for social and economic reforms amidst the rapid urban and industrial expansion.
#### Conclusion
In conclusion, urbanization during the Industrial Revolution was a defining characteristic of both Europe and North America, driven by industrialization, economic opportunities, and transportation advancements. Europe, particularly Britain, experienced an earlier and more advanced stage of urbanization, while North America, fueled by immigration and rapid industrial growth, showed a remarkable increase in urban populations. Despite their differences, both regions faced similar challenges related to overcrowding, public health, and labor rights, leading to social changes and movements advocating for reforms. The complexities of urbanization during this transformative era laid the groundwork for the modern urban landscape, shaping socioeconomic structures and influencing future developments in both regions.
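1. Agent Class: the base class for creating specialized AI agents.
1. Historical Context: the History Agent provides the relevant historical background.
2. Data Needs Identification: the Data Agent determines what quantitative information is needed.
3. Historical Data Provision: the History Agent supplies the relevant historical data.
4. Data Analysis: the Data Agent interprets the historical data provided.
5. Final Synthesis: the History Agent combines all the insights into one comprehensive answer.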
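The five-step flow can be tried without any API key by swapping the LLM-backed agents for a stub. The following is a minimal sketch under that assumption: `StubAgent`, `context_step`, `final_step`, and `run_pipeline` are hypothetical names introduced here, not part of the original system. It shows the convention the steps rely on: a step that returns a list passes the shared context on, and a step that returns a string ends the run with the final answer.

```python
from typing import Callable, Dict, List, Optional

Context = List[Dict[str, str]]


class StubAgent:
    """Hypothetical stand-in for Agent: echoes the task instead of calling an LLM."""

    def __init__(self, name: str):
        self.name = name

    def process(self, task: str, context: Optional[Context] = None) -> str:
        seen = len(context) if context else 0
        return f"{self.name} handled '{task[:30]}' with {seen} prior messages"


def context_step(label: str) -> Callable[[StubAgent, str, Context], Context]:
    """Build a step that appends the agent's reply to the shared context."""
    def step(agent: StubAgent, task: str, context: Context) -> Context:
        reply = agent.process(f"{label}: {task}", context)
        context.append({"role": "ai", "content": reply})
        return context
    return step


def final_step(agent: StubAgent, task: str, context: Context) -> str:
    """The last step returns a string, which ends the pipeline."""
    return agent.process(f"Synthesize: {task}", context)


def run_pipeline(steps, task: str) -> str:
    """Run (step, agent) pairs in order until one returns a string."""
    context: Context = []
    for step_func, agent in steps:
        result = step_func(agent, task, context)
        if isinstance(result, str):
            return result
        context = result
    return context[-1]["content"]


history, data = StubAgent("Clio"), StubAgent("Data")
steps = [
    (context_step("Historical context"), history),
    (context_step("Data needs"), data),
    (context_step("Historical data"), history),
    (context_step("Data analysis"), data),
    (final_step, history),
]
answer = run_pipeline(steps, "Compare urbanization rates")
print(answer)
```

By the time the synthesis step runs, the context holds four messages, one from each earlier step, which is what lets the History Agent draw on both its own research and the Data Agent's analysis.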