"Holly" Tianqi Song



Multi-agent Learning Notes

June 20, 2024

Recently I read two papers by Joon Sung Park, the author of last year's popular Stanford Town paper: Social Simulacra (2022) and Generative Agents (the Stanford Town paper itself, 2023). Both were published at UIST, a top HCI conference, and the latter won the Best Paper award that year.

Compared to Generative Agents, Social Simulacra (2022) focuses on a narrower area: prototyping in interaction design. The core idea is that designers of interaction prototypes often need to understand how a design will actually play out, and the common practice today is to recruit a small number of real people for user research.

In reality, however, feedback from a small number of users often fails to reveal a design's actual effect or its hidden pitfalls, because users' behaviors influence one another.

The authors use antisocial behavior as an example. An individual on social media may simply express opinions based on their own information, but once many people join a discussion, inflammatory comments or trolling can emerge. Such problems cannot be detected in small-scale user testing.

To understand, at the design stage, the emergent group behavior that large numbers of users may exhibit, there is a concrete approach: simulating the social computing system itself. This can be seen as applying agent-based models to design. Although agent-based models are widely used in other disciplines, they remain largely unexplored in design and user experience, which is one of this work's innovations.

Another highlight of the paper is its use of large language models (LLMs): designers describe their design intentions to the model, and the model then generates a series of user behaviors (such as posts and replies). GPT-3, trained on large amounts of social media data, is sufficient to generate a variety of positive and negative responses, including the antisocial behaviors the authors focus on.
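The generate-from-design-intent loop can be sketched roughly as follows. This is my own reconstruction, not the paper's actual prompts or code: the prompt templates, the persona strings, and the `complete` function are all illustrative stand-ins for a real LLM call such as GPT-3.

```python
# A minimal sketch of the Social Simulacra prompting idea (my reconstruction):
# the designer's community description becomes part of a prompt, and an LLM
# completion is asked to produce a post or a reply in that community.

def build_post_prompt(design_description, persona):
    """Prompt asking the model to write a new post as a given persona."""
    return (
        f"Community description: {design_description}\n"
        f"User persona: {persona}\n"
        "Write a post this user might submit to the community:\n"
    )

def build_reply_prompt(design_description, persona, thread):
    """Prompt asking the model to reply to an existing thread."""
    history = "\n".join(f"- {msg}" for msg in thread)
    return (
        f"Community description: {design_description}\n"
        f"User persona: {persona}\n"
        f"Thread so far:\n{history}\n"
        "Write this user's reply:\n"
    )

def simulate_thread(design_description, personas, complete, n_replies=3):
    """Generate a seed post plus replies by cycling through personas."""
    thread = [complete(build_post_prompt(design_description, personas[0]))]
    for i in range(n_replies):
        persona = personas[(i + 1) % len(personas)]
        thread.append(complete(build_reply_prompt(design_description, persona, thread)))
    return thread

# Stub completion function so the sketch runs without an API key;
# in the paper's setting this would be a call to GPT-3.
def fake_complete(prompt):
    return f"[generated text for prompt of {len(prompt)} chars]"

thread = simulate_thread(
    "A forum for beginner gardeners to share tips.",
    ["enthusiastic newbie", "impatient expert", "occasional troll"],
    fake_complete,
)
print(len(thread))  # prints 4 (seed post + 3 replies)
```

Including a troll-like persona in the mix is what lets a simulation like this surface the antisocial behaviors the designers want to anticipate.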

Finally, the authors demonstrate the method's effectiveness through a user study with a Turing-test-like design: participants are asked to distinguish real user behavior from behavior generated by Social Simulacra. In more than 40% of the cases, participants could not tell whether the data came from a real person or was generated, which suggests Social Simulacra does well at simulating real users.
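The metric behind such a test can be illustrated with a toy calculation. This is not the paper's actual protocol or data, just a sketch of the idea: count how often generated items are mistaken for real ones.

```python
# Toy version of a Turing-test-style evaluation (illustrative, not the
# paper's actual setup): each item is either real or generated, a judge
# guesses its origin, and we measure how often generated items fooled them.

def fooled_rate(truth, judgments):
    """Fraction of generated items that the judge labeled as real.

    truth:     ground-truth labels, "real" or "generated"
    judgments: the judge's guesses, same labels
    """
    generated = [(t, j) for t, j in zip(truth, judgments) if t == "generated"]
    if not generated:
        return 0.0
    fooled = sum(1 for t, j in generated if j == "real")
    return fooled / len(generated)

# Made-up example: 5 generated items, the judge mistakes 2 of them for real.
truth = ["generated"] * 5 + ["real"] * 5
guess = ["real", "real", "generated", "generated", "generated"] + ["real"] * 5
print(fooled_rate(truth, guess))  # prints 0.4
```

A rate above chance-adjusted baselines is what supports the claim that the generated behavior is hard to tell apart from real users'.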

Reading Social Simulacra from 2022, one can see many precursors of Generative Agents: both use large language models (GPT-3 and GPT-3.5, respectively), both try to use an LLM to simulate people (social media behavior in one case, everyday life in the other), and both consider the 1+1>2 collective effect that arises when people influence one another, attempting to capture this group effect with technical means.

Two-sentence summary: Reading Social Simulacra is very helpful for understanding the ideas behind Generative Agents. Reading the two papers together makes the continuity of the author's thinking clear.

In addition, the author's sensitivity to cutting-edge technology is remarkable: the paper was submitted in March 2022, while GPT-3 had entered beta only in June 2020. Taking an immature technology and applying it to a real problem in such a short time speaks to the author's execution and efficiency.