GIVE: A Multi-Agent Framework for Generating Immersive Multi-Modal Virtual Environments for 3D Games - Supplementary Material

- DOI: 10.60864/jakj-xn42
- Citation Author(s):
- Submitted by: SHANTANU VYAS
- Last updated: 28 May 2025 - 10:42pm
- Document Type: Research Manuscript
- Presenters: Anonymous
- Categories:
In this work, we present a novel multi-agent framework for generating immersive 3D virtual environments from high-level semantic inputs, powered by large language and vision-language models (LLMs/VLMs). Unlike prior work that focuses primarily on visual output, data-intensive training pipelines, or code generation, our system coordinates a team of specialized agents, each assigned a role such as manager, planner, or domain expert (visual, audio, or spatial), to decompose and execute environment-construction tasks within a game engine. This approach supports multi-modal content generation and manipulation, including spatial audio and visual elements, by adding game-engine assets and adjusting their properties. A vision-language-based reflective agent closes a feedback loop by evaluating the generated environment and prompting revisions. Our framework is designed for quick iteration and ideation, and is particularly suited to the early stages of immersive scene design, helping developers and content creators preview and evolve their creative concepts with minimal technical overhead.
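To make the coordination pattern concrete, the following is a minimal Python sketch of the manager/expert/reflector loop described in the abstract. It is not the paper's implementation: every class name (`ManagerAgent`, `ReflectiveAgent`, `Scene`, `Task`) is a hypothetical stand-in, and the planning and evaluation logic is stubbed where a real system would call an LLM/VLM and issue game-engine operations.

```python
# Hypothetical sketch of the manager/expert/reflector pattern.
# LLM/VLM calls and game-engine operations are stubbed throughout.
from dataclasses import dataclass, field


@dataclass
class Task:
    domain: str       # "visual", "audio", or "spatial"
    description: str  # natural-language instruction for an expert agent


@dataclass
class Scene:
    """Stand-in for game-engine state: assets and their properties."""
    assets: dict = field(default_factory=dict)

    def apply(self, domain: str, description: str) -> None:
        # Record the edit; a real expert agent would emit engine commands.
        self.assets.setdefault(domain, []).append(description)


class ManagerAgent:
    """Decomposes a high-level semantic prompt into domain-specific tasks."""

    def plan(self, prompt: str) -> list[Task]:
        # Stub: a real planner would query an LLM for this decomposition.
        return [
            Task("visual", f"Place and light assets for: {prompt}"),
            Task("audio", f"Add spatialized ambient sound for: {prompt}"),
            Task("spatial", f"Lay out geometry and navigation for: {prompt}"),
        ]


class ReflectiveAgent:
    """VLM-based critic: inspects the scene and suggests revisions."""

    def evaluate(self, scene: Scene, prompt: str) -> list[Task]:
        # Stub: a real reflector would render the scene and query a VLM.
        if len(scene.assets.get("audio", [])) < 2:
            return [Task("audio", "Add a second ambient layer for depth")]
        return []  # an empty list means the scene is accepted


def build_environment(prompt: str, max_rounds: int = 3) -> Scene:
    scene, manager, reflector = Scene(), ManagerAgent(), ReflectiveAgent()
    tasks = manager.plan(prompt)
    for _ in range(max_rounds):
        for task in tasks:                         # experts act in their domains
            scene.apply(task.domain, task.description)
        tasks = reflector.evaluate(scene, prompt)  # reflective feedback loop
        if not tasks:                              # no revisions requested: done
            break
    return scene


if __name__ == "__main__":
    result = build_environment("a rainy cyberpunk alley at night")
    for domain, edits in result.assets.items():
        print(domain, "->", edits)
```

Even in this stub, the key design point is visible: the reflective agent's output is itself a list of tasks, so requested revisions re-enter the same execution path as the initial plan, which is what enables the quick iterate-and-refine cycle the abstract describes.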