MuLan’s Planning and Progressive Generation

Understand how MuLan uses a large language model to create a global plan that breaks a complex image prompt down into ordered sub-prompts. Learn how progressive generation with conditional diffusion and local LLM planning manages object placement and size. Discover the attention guidance and candidate selection techniques that handle overlaps, allowing MuLan to generate detailed multi-object images step by step.

In our last lesson, we introduced MuLan’s “divide and conquer” strategy. Instead of tackling a complex image generation task all at once, it breaks the problem down into smaller, more manageable pieces. In this lesson, we’ll explore the first pillar of this architecture in detail: how the agent creates its initial plan.

The LLM as a global planner

Let’s return to our analogy of a human painter. A painter doesn’t just start randomly dabbing paint on a canvas. They first create a mental plan or a light sketch, deciding which objects will form the background and which will be in the foreground, and the general order in which they will be painted.

MuLan’s first step is exactly this: LLM planning. At the very beginning of the process, before any image generation happens, an LLM is used to create a global plan. It takes the user’s single, complex prompt and decomposes it into an ordered sequence of objects to be generated.
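The decomposition step described above can be sketched in a few lines. This is a minimal illustration, not MuLan's actual implementation: the prompt wording in `PLANNER_TEMPLATE`, the `plan_objects` helper, and the `fake_llm` stand-in are all hypothetical names chosen for this example, and the numbered-list reply format is an assumption about how one might ask an LLM to return its plan.

```python
import re
from typing import Callable, List

# Hypothetical planner prompt; MuLan's real prompt wording is not shown here.
PLANNER_TEMPLATE = (
    "You are planning a multi-object image.\n"
    "List the objects in the prompt below in the order they should be drawn, "
    "one per line, numbered from 1.\n"
    "Prompt: {prompt}\n"
)


def plan_objects(prompt: str, llm: Callable[[str], str]) -> List[str]:
    """Ask an LLM for a global plan and parse it into an ordered object list."""
    reply = llm(PLANNER_TEMPLATE.format(prompt=prompt))
    objects = []
    for line in reply.splitlines():
        # Accept lines such as "1. a wooden table" or "2) a red vase".
        match = re.match(r"\s*\d+[.)]\s*(.+)", line)
        if match:
            objects.append(match.group(1).strip())
    return objects


def fake_llm(_: str) -> str:
    """Stand-in for a real LLM call (assumption: it replies with a numbered list)."""
    return "1. a wooden table\n2. a red vase\n3. a sleeping cat"


plan = plan_objects("a red vase and a sleeping cat on a wooden table", fake_llm)
print(plan)
```

Passing the LLM in as a plain callable keeps the sketch independent of any particular model API; in practice, `fake_llm` would be replaced by a call to whichever LLM client is available.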

Creating an ordered sequence of sub-prompts

...
