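[Paper Translation] Investigating Tax Evasion Emergence Using Dual Large Language Model and Deep Reinforcement Learning Powered Agent-based Simulation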


Original paper: https://arxiv.org/pdf/2501.18177


Investigating Tax Evasion Emergence Using Dual Large Language Model and Deep Reinforcement Learning Powered Agent-based Simulation

Teddy Lazebnik1, Labib Shami2,∗

1 Department of Cancer Biology, Cancer Institute, University College London, London, UK
2 Department of Economics, Western Galilee College, Acre, Israel
∗Corresponding author: labibs@wgalil.ac.il

Abstract

Tax evasion, usually the largest component of an informal economy, is a persistent challenge over history with significant socio-economic implications. Many socio-economic studies investigate its dynamics, including influencing factors, the role and influence of taxation policies, and the prediction of the tax evasion volume over time. These studies assumed such behavior is given, as observed in the real world, neglecting the “big bang” of such activity in a population. To this end, computational economy studies adopted developments in computer simulations, in general, and recent innovations in artificial intelligence (AI), in particular, to simulate and study informal economy appearance in various socio-economic settings. This study presents a novel computational framework to examine the dynamics of tax evasion and the emergence of informal economic activity. Employing an agent-based simulation powered by Large Language Models and Deep Reinforcement Learning, the framework is uniquely designed to allow informal economic behaviors to emerge organically, without presupposing their existence or explicitly signaling agents about the possibility of evasion. This provides a rigorous approach for exploring the socio-economic determinants of compliance behavior. The experimental design, comprising model validation and exploratory phases, demonstrates the framework’s robustness in replicating theoretical economic behaviors. Findings indicate that individual personality traits, external narratives, enforcement probabilities, and the perceived efficiency of public goods provision significantly influence both the timing and extent of informal economic activity. The results underscore that efficient public goods provision and robust enforcement mechanisms are complementary; neither alone is sufficient to curtail informal activity effectively. 
By modeling the emergence of informal economic behavior without assumptions, this research advances the theoretical and practical understanding of tax compliance, offering critical policy insights for designing equitable tax systems and fostering sustainable economic governance.

Keywords: informal economy; socio-economic simulation; computational behavioral economy; economic decision-making; emergent behavior analysis.

1 Introduction

The study of informal economic activities has long fascinated researchers and policymakers alike due to its significant impact on economic stability, taxation policies, and societal well-being (Shami, 2019; Gyomai and van de Ven, 2014). Despite some positive contributions, such as informal wealth redistribution, the informal economic activity undermines tax revenues and public goods provision. The study of informal economic activities can lead to better estimation of economic indicators, such as GDP, impacting macroeconomic policies significantly (Gyomai et al., 2012). In addition, the coexistence of informal and formal economies can erode trust in public institutions and contribute to misusing social insurance programs and reducing tax revenues, as evidenced in previous studies (Schneider, 2016).

To tackle this challenge, scholars proposed a wide range of models regarding the informal economy and its dynamics as a whole, as well as models and methods to measure the size of such informal economy (Schneider et al., 2010; Breusch, 2005a; Enste and Schneider, 2002; Schneider and Enste, 2000). Nonetheless, these studies often fall short because authors tend to emphasize varying aspects of the informal economy, failing to capture the entire dynamics and the root causes that generate the informal economy (Ha et al., 2021; Schneider and Buehn, 2016; Elgin and Schneider, 2016; Breusch, 2005b).

As an indicator for such “shooting in the dark” scenario, estimations over time, even based on the same data, exhibit considerable variability (Schneider and Buehn, 2018; Thai and Turkina, 2013). These results are obtained using a diverse set of methods, including the direct approach which assesses the magnitude of the informal economy through either voluntary survey responses or tax audit techniques (Cantekin and Elgin, 2017; Feld and Larsen, 2012; Feld and Schneider, 2010), an indirect approach which uses macroeconomic methods and involves the utilization of diverse economic and non-economic indicators that provide insights into the evolution of the informal economy over time due to available indicators, mainly from the formal economy (Tanzi, 1980, 1983; Ferwerda et al., 2010; Ardizzi et al., 2014), and the modeling approach which uses statistical and data-driven models to estimate the informal economy as an unobservable (latent) variable (Elgin and Erturk, 2019; Andrews et al., 2011; Elgin and Schneider, 2016).

Of these approaches, the modeling approach is considered the most accurate and applicable in real-world settings (Schneider and Buehn, 2017; Schneider and Enste, 2000). Indeed, a growing body of work has emerged in recent years that uses the modeling approach to estimate the dynamics and size of the informal economy (Kireenko and Nevzorova, 2015; Alanon and Gomez-Antonio, 2005). In particular, machine learning (ML) and deep learning (DL) based models have been shown to be powerful tools to study the informal economy size (Shami and Lazebnik, 2023; Lazebnik, 2024; Felix et al., 2023; Ivas and Tefoni, 2023), marking the first step toward using artificial intelligence (AI) to study the informal economy. Nevertheless, these models focused on the informal economy’s size and the macroeconomic indicators responsible for such size, ignoring the more basic question of how the informal economy is established, changed, and adapted to the formal economy and government policies.

In a more general sense, the central challenge in any economic model lies in its ability to effectively replicate the phenomenon it aims to investigate, based on the assumptions and structure defined by its developers. This critique is particularly salient in theoretical models addressing informal economic activity as such models often presuppose the existence of informal economic phenomena and incorporate this presumption into their foundational parameters, thereby undermining their core purpose: to simulate and analyze the emergence of the phenomenon rather than to assume its presence (Ferraro et al., 2005; Bodenhorn, 1956). This methodological flaw is not a trivial matter; by assuming the existence of the phenomenon, these models are inherently limited in their capacity to elucidate the underlying causes of its formation. Consequently, they fail to contribute meaningfully to our understanding and risk becoming analytically redundant. This study seeks to address this gap by proposing a model that refrains from presupposing the existence of informal economic activity. Instead, it builds upon fundamental characteristics of economic behavior to explore how an informal economy might emerge alongside a formal economy within the broader economic system.

To this end, in this study, we explore the informal economy from a microeconomic perspective using in silico methodology. Specifically, we take advantage of the recent advances in the field of AI in the form of Large Language Models (LLMs) that present similar reasoning, decision-making, and world-understanding performance to humans (Ke et al., 2024; Gilhooly, 2023; Ivey et al., 2024). Formally, we developed an agent-based simulation (ABS) with a heterogeneous population and central government, operating in a monetized economy, such that each agent has a unique personality powered by a combined LLM and Deep Reinforcement Learning (DRL) model. Using this approach, we investigate the emergence of informal economic activity, focusing specifically on tax evasion, over different scenarios as well as its dynamics following various economic interventions of the central government. Our objective is to model the emergence and behavior of an informal economy, in the form of tax evasion, where rational agents, equipped with limited knowledge, interact in a dynamic environment characterized by transactions, risks, and tax obligations. The novelty of the proposed study lies in the utilization of a multi-agent AI model to study the informal economy dynamics that emerge from the LLM’s world knowledge rather than from pre-programmed actions defined by the modeler, which would artificially allow and encourage the simulated agents to engage in informal economic activity.

The remainder of the paper is organized as follows. Section 2 provides an overview of the computational methods used as part of the model as well as the economic theory of informal economy, in general, and tax evasion, in particular. Section 3 formally introduces the proposed AI-driven agent-based simulation model. Section 4 outlines the experimental settings using the proposed model inspired by the socio-economic settings in the United States (US) and presents the obtained results for the experiments. Section 5 discusses the economic applications of the obtained results and suggests possible future work. Section 6 concludes briefly.

2 Related Work

In this section, we outline the economic and computational background of this study. We first review the economic theory of the establishment and dynamics of the informal economy, followed by an overview of macroeconomic informal economy models. Afterward, we focus on the computational methods adopted for the proposed model, including the agent-based simulation approach, LLMs and their usage as decision-making tools, and deep reinforcement learning as a method to allow AI agents to solve complex tasks in dynamic environments.

2.1 The economic rationale behind tax evasion

Tax evasion is a significant challenge for governments globally, impacting tax revenue collection and undermining public trust in tax systems (Sandmo, 2005). Formally, tax evasion is the illegal act of deliberately avoiding paying taxes owed to the government by underreporting income, inflating deductions, or concealing money or assets (Slemrod, 1985). Research into the motivations behind tax evasion has highlighted various economic, psychological, and institutional factors (Elffers et al., 1987; Khlif and Achek, 2015). In particular, factors such as demographic characteristics, personality traits, perceptions of tax fairness, and cultural contexts influence taxpayers’ attitudes toward tax evasion (Khlif and Achek, 2015).

Characterizing the likelihood of an individual evading taxes is complex and often incorporates opposing aspects of economic reasoning (Weigel et al., 1987). For instance, economists have posited a relationship between tax rates and tax evasion, suggesting that higher levels of taxation create a stronger incentive to avoid tax obligations (Alexi et al., 2023). Similarly, at a given point in time, taxpayers subject to high marginal tax rates may experience greater financial rewards from tax evasion compared to those facing lower rates, potentially leading to higher evasion behavior among the former. However, this relationship is not straightforward and may be disrupted by the principle of diminishing marginal utility of money. While the potential rewards of tax resistance are greater for high-income taxpayers, they may ascribe lower economic value to these gains compared to lower-income taxpayers, who might perceive the additional income as addressing more immediate financial needs (Hofmann et al., 2017).

When focusing on income taxes (government-imposed tax on an individual or entity’s earnings), due to its global utilization and influence on the economy (Graham et al., 2012), the seminal work by Allingham and Sandmo (1972) on income tax evasion provides a theoretical framework that has significantly influenced subsequent research in the field. Their model conceptualizes tax evasion as a decision under uncertainty, where taxpayers weigh the potential benefits of evasion against the risks of detection and penalties, assuming that taxpayers are amoral, risk-averse, and driven by utility maximization. Their model is based on Becker’s (1968) theory of crime, emphasizing rational decision-making through empirical testing and econometric modeling. The authors posit that higher tax rates increase the incentive for individuals to evade taxes, as the potential financial benefits of evasion become more substantial. Similarly, Wentworth and Rickel (1985) suggest that individuals consider the economic benefit of tax evasion against the risk of detection and penalties, and Dean et al. (1980) found that perceived high tax levels and unfairness in tax burdens were commonly cited reasons for tax evasion.
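
The trade-off described by Allingham and Sandmo (1972) can be illustrated numerically. The sketch below is a minimal illustration rather than the model used in this paper: it assumes logarithmic utility (one common risk-averse choice), a penalty levied on undeclared income, and a simple grid search. The function names and all parameter values are hypothetical.

```python
import math


def expected_utility(declared, income, tax_rate, audit_prob, penalty_rate):
    """Expected log-utility of declaring `declared` out of `income` (illustrative)."""
    undeclared = income - declared
    # Net income if not audited: tax is paid on declared income only.
    not_caught = income - tax_rate * declared
    # Net income if audited: an extra penalty is paid on the undeclared part.
    caught = not_caught - penalty_rate * undeclared
    if caught <= 0:
        return float("-inf")  # outcomes with non-positive income are never chosen
    return (1 - audit_prob) * math.log(not_caught) + audit_prob * math.log(caught)


def best_declaration(income, tax_rate, audit_prob, penalty_rate, steps=1000):
    """Grid-search the declared income that maximizes expected utility."""
    candidates = [income * i / steps for i in range(steps + 1)]
    return max(
        candidates,
        key=lambda x: expected_utility(x, income, tax_rate, audit_prob, penalty_rate),
    )


if __name__ == "__main__":
    # Assumed values: income 100, tax rate 30%, penalty rate 60% of undeclared income.
    for p in (0.0, 0.3, 0.9):
        x = best_declaration(income=100.0, tax_rate=0.3, audit_prob=p, penalty_rate=0.6)
        print(f"audit probability {p:.1f}: declared income {x:.1f}")
```

Under these assumptions, declaring nothing is optimal when audits never occur, partial evasion appears at intermediate audit probabilities, and full declaration becomes optimal once the audit probability times the penalty rate exceeds the tax rate.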

Nevertheless, the relationship between tax rates and tax evasion is not able to explain the observed socio-economic dynamics fully as at its core it is based on the concept of diminishing marginal utility of income, which seems not to capture the entire story (Pommerehne and Weck-Hannemann, 1996a; Adebisi et al., 2013; Sury, 2015; Wallschutzky, 1984; Dergham and Al-Omour, 2010). This complexity is further explored in studies incorporating behavioral economics perspectives, such as prospect theory (Levy, 1992), to understand taxpayers’ decision-making processes. For example, Piolatto and Rablen (2017) examine how elements of prospect theory, including loss aversion and probability weighting, influence tax evasion behavior, challenging traditional expected utility models. The authors revisit the Yitzhaki puzzle (Yitzhaki, 1974), which suggests that tax evasion decreases as the marginal tax rate increases, a counterintuitive result under the standard expected utility theory. The authors explore whether prospect theory, which accounts for psychological factors, can resolve this puzzle. The findings indicate that while prospect theory introduces new dimensions to understanding tax evasion, it does not universally overturn the Yitzhaki puzzle without specific conditions or modifications to the reference level used in the model.
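
The Yitzhaki puzzle can be reproduced in a few lines. The following sketch is illustrative only: it assumes logarithmic utility (which exhibits decreasing absolute risk aversion), a fine proportional to the evaded tax rather than to undeclared income, and hypothetical parameter values and function names.

```python
import math


def expected_utility(declared, income, tax_rate, audit_prob, fine_multiplier):
    """Expected log-utility when the fine is proportional to the evaded tax."""
    evaded_tax = tax_rate * (income - declared)
    not_caught = income - tax_rate * declared
    # If audited, the taxpayer pays `fine_multiplier` times the evaded tax.
    caught = not_caught - fine_multiplier * evaded_tax
    if caught <= 0:
        return float("-inf")
    return (1 - audit_prob) * math.log(not_caught) + audit_prob * math.log(caught)


def evaded_income(income, tax_rate, audit_prob=0.2, fine_multiplier=2.0, steps=1000):
    """Undeclared income at the utility-maximizing declaration (grid search)."""
    candidates = [income * i / steps for i in range(steps + 1)]
    declared = max(
        candidates,
        key=lambda x: expected_utility(x, income, tax_rate, audit_prob, fine_multiplier),
    )
    return income - declared


if __name__ == "__main__":
    for rate in (0.4, 0.5):
        print(f"tax rate {rate:.1f}: evaded income {evaded_income(100.0, rate):.1f}")
```

With these assumed values, raising the tax rate from 0.4 to 0.5 lowers the optimal undeclared income, which is the counterintuitive direction the puzzle refers to.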

The empirical study by McGee and Maranjyan (2006) in Armenia found that taxpayers justified evasion when they believed their government did not use tax revenue responsibly. A study by Uadiale and Noah (2010) in Nigeria shows how individuals’ ethical beliefs and perceptions of the social contract influence their tax compliance (Temitope et al., 2010). People who view tax compliance as a moral obligation are less likely to evade taxes, while those who consider tax payments as optional or unjust are more inclined to evade them. This moral reasoning, where taxpayers justify evasion as a response to perceived governmental misuse of funds, reflects a broader ethical dilemma facing taxpayers. Similarly, Green (2008) explores how individuals rationalize tax evasion by framing it as a reaction to government inefficiencies or a corrupt tax system. A key variable examined by Dean et al. (1980) is the perception that government tax revenues are not efficiently allocated to finance public goods and services. According to the study’s findings, approximately 62% of respondents expressed a negative view regarding the government’s effective use of tax revenues.

Our approach in the current study combines both worlds and gives expression to both the utility-maximization approach and behavioral-social considerations in the individual’s decision.

2.2 Large language models

An LLM is an AI model designed to process and generate human-like text through the use of DL techniques, in general, and using the Transformer neural network architecture, in particular (Zhao et al., 2024). It typically involves a neural network architecture with numerous layers and parameters that are trained on large datasets of text (Gruetzemacher and Paradice, 2022) which can get up to 110 billion words. The training process involves learning the statistical patterns and relationships within the text data, allowing the model to generate coherent and contextually relevant responses to input text or prompts (Chang et al., 2024).

LLMs have driven significant advancements in natural language processing and are now integral to various products with millions of users, including the coding assistant Copilot by Microsoft, the Bing search engine, and more recently, ChatGPT by OpenAI (Chen et al., 2023; Egli, 2023; Youssef, 2023). The combination of memorization and compositionality has enabled LLMs to perform tasks such as language understanding and both conditional and unconditional text generation at an unprecedented level of performance (Kocon et al., 2023). This progress paves the way for more sophisticated and higher-bandwidth human-computer interactions (Huang and Tan, 2023; Nadkarni et al., 2011; Sallam, 2023; Rosenfeld and Lazebnik, 2024; Lazebnik and Rosenfeld, 2024).

LLMs have demonstrated impressive potential in achieving reasoning and planning capabilities comparable to humans (Espejel et al., 2023; Guo et al., 2023). This aligns perfectly with human expectations for autonomous agents that can perceive their surroundings, make decisions, and take actions accordingly (Wang et al., 2024). Consequently, LLM-based agents have garnered significant attention and development to comprehend and generate human-like instructions, enabling sophisticated interactions and decision-making across various contexts (Mehandru et al., 2023; Zhang et al., 2024; Chen et al., 2024). Inspired by the remarkable capabilities of individual LLM-based agents, researchers have proposed LLM-based multi-agents to harness collective intelligence and specialized profiles and skills from multiple agents (Cheng et al., 2024; Wu et al., 2024). Compared to systems relying on a single LLM-powered agent, multiagent systems offer advanced capabilities by segmenting LLMs into distinct agents with unique capabilities and facilitating interactions among these diverse agents to effectively simulate complex real-world environments (de Zarza et al., 2023). In this framework, multiple autonomous agents collaborate in planning, discussions, and decision-making, mimicking the cooperative nature of human group work in problem-solving tasks (Rasal and Hauer, 2024).

The multi-agent LLM approach leverages the communicative abilities of LLMs, utilizing their text generation and response capabilities. Moreover, it taps into LLMs’ broad knowledge across domains and potential for specialization in specific tasks (Zhang et al., 2023). Recent studies have shown promising results in employing LLM-based multi-agents for various tasks such as software development (Nam et al., 2024), multi-robot systems (Luan et al., 2024), and society simulation (Gao et al., 2024).

2.3 Deep reinforcement learning

Reinforcement learning (RL) is a type of ML where an agent (or group of agents) learns to make decisions by interacting with an environment to maximize cumulative rewards (Abdellatif et al., 2018). The key components of RL are the agent, environment, actions, states, and rewards (Abdellatif et al., 2018). The agent is the learner or decision-maker, while the environment represents everything the agent interacts with, including other agents. States are the different situations in which the agent can be, and actions are the choices the agent can make. The agent receives rewards or punishments as feedback based on its actions, guiding it to learn optimal behaviors over time. The agent uses a policy, which is a strategy mapping states to actions, to maximize the total expected reward, often utilizing value functions to estimate the long-term benefit of actions (El-Bouri et al., 2021). Through exploration (trying new actions) and exploitation (using known actions that yield high rewards), the agent improves its policy, aiming to achieve the best possible outcomes in the environment (El-Bouri et al., 2021).
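
The components listed above (agent, environment, states, actions, rewards, policy, exploration, and exploitation) can be made concrete with tabular Q-learning. The sketch below is a generic textbook-style illustration, not the learning setup of this paper; the five-cell corridor environment, the hyperparameters, and all names are assumptions chosen for brevity.

```python
import random

N_STATES = 5      # corridor cells 0..4; reaching cell 4 ends the episode with reward 1
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def env_step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy policy with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)  # exploration: try a random action
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])  # exploitation


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table q[state][action] of estimated long-term rewards."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = env_step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    q_table = train()
    greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
    print(greedy_policy)
```

After training, the greedy policy derived from the table prefers moving right in every non-terminal cell, since that is the only way to reach the reward.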

Deep reinforcement learning (DRL) extends RL by incorporating deep neural networks (Sainath et al., 2015) to handle complex decision-making tasks (Mao et al., 2016a,b; Hurtado Sánchez et al., 2022; Giupponi et al., 2005). In DRL, an agent interacts with an environment to maximize cumulative rewards, just like in traditional RL. However, DRL leverages deep learning to efficiently process high-dimensional input and approximate the optimal policy or value functions in a numerical fashion. The key components (agent, environment, actions, states, and rewards) remain the same (Mao et al., 2016a). DRL has multiple implementations, each with unique strengths and limitations (Hao et al., 2023; Stooke and Abbeel, 2019; Kahn et al., 2018). For instance, Deep Q-Networks use a deep neural network to approximate the Q-value function, which estimates the expected reward for taking a particular action in a given state (Fan et al., 2020). By using experience replay and target networks, Deep Q-Networks can stabilize learning and are effective in environments like video games where the state space is large and complex. Proximal Policy Optimization is a policy gradient method that improves training stability by using a clipped objective function to limit the size of policy updates (Gu et al., 2022). It balances exploration and exploitation and is known for its robustness and efficiency in continuous control tasks such as robotic manipulation and locomotion. Actor-Critic methods are a family of methods that involve two neural networks: the actor, which selects actions, and the critic, which evaluates them by estimating the value function (Grondman et al., 2012). This approach allows the agent to learn both the policy and the value function concurrently, leading to improved learning efficiency and effectiveness in environments with continuous action spaces.
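
To make the clipped objective of Proximal Policy Optimization concrete, the sketch below evaluates it for a single sample. It is a minimal illustration of the standard clipped surrogate and says nothing about how DRL is configured in this paper; the function name and the default clipping range of 0.2 are assumptions.

```python
def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO clipped surrogate objective for one (state, action) sample.

    ratio:     probability of the action under the new policy divided by its
               probability under the old policy.
    advantage: estimate of how much better the action was than average.
    """
    # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the clipped and unclipped terms, so there is
    # no incentive to move the policy far from the old one in a single update.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    for ratio, advantage in ((1.5, 1.0), (0.5, -1.0), (0.5, 1.0), (1.0, 2.0)):
        print(ratio, advantage, clipped_surrogate(ratio, advantage))
```

For a positive advantage the objective stops growing once the ratio exceeds 1 + clip_eps, and for a negative advantage it stops improving once the ratio falls below 1 - clip_eps, which is how the size of each policy update is limited.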

DRL is commonly utilized in the context of multi-agent tasks, in general, and as part of ABS, in particular (Zhang et al., 2023; Hernandez-Leal et al., 2019; Du and Ding, 2021). For example, Bushaj et al. (2022) used DRL for resource allocation and intervention policies in pandemic control settings based on ABS. Lazebnik (2023) combined ABS with DRL for hospital staff and resource allocation. The author shows that the model aligns well with results from expert-driven models while also successfully dealing with limited knowledge of the state in a very stochastic environment. Vargas-Perez et al. (2023) develop a DRL agent that represents a brand as part of an ABS of a market with the goal of obtaining a marketing investment strategy that improves the awareness of its corresponding brand in a given marketing scenario. The authors compared the policy obtained by the agent with that of a human expert, showing a statistically good agreement between the two. Zheng et al. (2020) proposed a detailed and large-scale ABS with DRL agents for taxation policy optimization. The authors did not use any economic models or assumptions but rather allowed an economy to emerge from labor cost with skill-related pricing in a heterogeneous population with a central government gathering income and bracketed taxes.

2.4 Agent-based simulation

Agent-based simulation (ABS) is a computational approach for capturing the (spatio-)temporal dynamics of multiple agents (Polhill et al., 2021; Epstein, 1999; Bonabeau, 2002). An ABS typically comprises two main components: an environment and a population of agents, which can be either homogeneous or heterogeneous (Zhou et al., 2022; Raberto et al., 2001). ABS involves three types of interactions between agents and their environment: spontaneous, agent-agent, and agent-environment interactions. Spontaneous interactions occur between an agent and itself, depending solely on the agent’s current state and time. Agent-agent interactions involve two or more agents, altering the state of at least one of the participating agents. Agent-environment interactions involve agents and their environment, resulting in changes to the state of the agent, the environment, or both. Notably, ABS can be computationally reduced to the population protocol model (Aspnes and Ruppert, 2009) and is thus Turing-complete (North, 2014; Laubenbacher et al., 2007), meaning that ABS can represent any dynamics solvable by a computer.

基于智能体的仿真 (Agent-based simulation, ABS) 是一种用于捕捉多个智能体时空动态的计算方法 (Polhill et al., 2021; Epstein, 1999; Bonabeau, 2002)。ABS 通常包含两个主要组件:一个环境和一组智能体,这些智能体可以是同质的,也可以是异质的 (Zhou et al., 2022; Raberto et al., 2001)。ABS 涉及智能体与其环境之间的三种交互类型:自发交互、智能体-智能体交互和智能体-环境交互。自发交互发生在智能体与其自身之间,仅取决于智能体的当前状态和时间。智能体-智能体交互涉及两个或更多智能体,改变至少一个参与智能体的状态。智能体-环境交互涉及智能体及其环境,导致智能体、环境或两者的状态发生变化。值得注意的是,ABS 在计算上可以简化为群体协议模型 (Aspnes and Ruppert, 2009),因此具有图灵完备性 (North, 2014; Laubenbacher et al., 2007),这意味着 ABS 可以表示任何可通过计算机求解的动态。
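The three interaction types described above can be illustrated with a minimal sketch. It is not taken from the paper: the `Agent` and `Environment` classes, the `wealth`, `age`, and `pool` fields, and the transfer amounts are all hypothetical and serve only to show how each interaction type changes state.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    wealth: float  # hypothetical state variable
    age: int = 0   # hypothetical internal clock


@dataclass
class Environment:
    pool: float  # hypothetical shared resource


def spontaneous(agent: Agent) -> None:
    """Spontaneous: depends only on the agent's own state and time."""
    agent.age += 1


def agent_agent(a: Agent, b: Agent, amount: float = 1.0) -> None:
    """Agent-agent: changes the state of at least one participant."""
    transfer = min(amount, a.wealth)
    a.wealth -= transfer
    b.wealth += transfer


def agent_environment(agent: Agent, env: Environment, amount: float = 0.5) -> None:
    """Agent-environment: changes the agent, the environment, or both."""
    taken = min(amount, env.pool)
    env.pool -= taken
    agent.wealth += taken


def step(agents: list[Agent], env: Environment, rng: random.Random) -> None:
    """One simulation step applying all three interaction types."""
    for agent in agents:
        spontaneous(agent)
    a, b = rng.sample(agents, 2)
    agent_agent(a, b)
    agent_environment(rng.choice(agents), env)


def run(n_agents: int = 5, n_steps: int = 10, seed: int = 0):
    """Run the toy simulation and return the final agents and environment."""
    rng = random.Random(seed)
    agents = [Agent(wealth=10.0) for _ in range(n_agents)]
    env = Environment(pool=3.0)
    for _ in range(n_steps):
        step(agents, env, rng)
    return agents, env
```

In this toy setup money only moves between agents and the shared pool, so the total amount is conserved, which gives a simple sanity check for the loop.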

ABS has become a prominent tool for studying complex economic phenomena arising from individual agent interactions (Poledna et al., 2023; Evans et al., 2021; Canese et al., 2021; Epstein and Axtell, 1996; Axelrod, 1998). Studies have employed ABS to model tax policy influence on the economy (Alexi et al., 2023), corruption (Zausinova et al., 2020), and the emergence of formal economies (Axtell, 2007; Tesfatsion, 2002). For example, Lazebnik et al. (2021) used ABS to simulate the spread of a pandemic and its influence on the economy as well as the usage of different pandemic intervention policies and their epidemiological-economical effectiveness for different configurations. Gorochowski et al. (2012) show that ABS is an effective modeling method for the interactions between cells as well as bacterial populations in synthetic biology. Lanham et al. (2014) used ABS to study crisis de-escalation activities in complex social networks, showing that ABS was able to capture the heterogeneity in the population from real data.

ABS 已成为研究个体智能体交互产生的复杂经济现象的重要工具 (Poledna 等, 2023; Evans 等, 2021; Canese 等, 2021; Epstein 和 Axtell, 1996; Axelrod, 1998)。研究已使用 ABS 来模拟税收政策对经济的影响 (Alexi 等, 2023)、腐败 (Zausinova 等, 2020) 以及正式经济的出现 (Axtell, 2007; Tesfatsion, 2002)。例如,Lazebnik 等 (2021) 使用 ABS 模拟了疫情的传播及其对经济的影响,以及不同疫情干预政策的使用及其在不同配置下的流行病学-经济学有效性。Gorochowski 等 (2012) 表明,ABS 是合成生物学中细胞和细菌种群之间交互的有效建模方法。Lanham 等 (2014) 使用 ABS 研究了复杂社交网络中的危机降级活动,表明 ABS 能够从真实数据中捕捉到人口的异质性。

Traditional approaches, such as statistical methods, struggle to disentangle social interaction effects from exogenous and correlated influences, a challenge that ABS overcomes by enabling virtual experiments that isolate specific mechanisms (Manski, 2000). Unlike descriptive statistical data analysis, ABS focuses on the generative processes underlying tax compliance, providing a deeper understanding of causality (Hedström, 2005). Early ABS applications for tax compliance included studies by Mittone and Patelli (2000), Davis et al. (2003), and Bloomquist (2004, 2006), who developed models incorporating heterogeneous agents and probabilistic audits validated against real-world data (Bloomquist, 2006, 2004; Davis et al., 2003; Mittone and Patelli, 2000). Subsequent advancements introduced memory and social imitation, along with autonomous tax inspectors to model compliance with indirect taxes (Antunes et al., 2005).

传统方法,如统计方法,难以将社会交互效应与外生和相关影响区分开来,而基于代理的模拟 (ABS) 通过允许虚拟实验来隔离特定机制,从而克服了这一挑战 (Manski, 2000)。与描述性统计数据分析不同,ABS 专注于税收合规背后的生成过程,提供了对因果关系的更深入理解 (Hedström, 2005)。早期的 ABS 在税收合规中的应用包括 Mittone 和 Patelli (2000)、Davis 等人 (2003) 以及 Bloomquist (2004, 2006) 的研究,他们开发了包含异质代理和概率审计的模型,并通过真实数据验证 (Bloomquist, 2006, 2004; Davis 等人, 2003; Mittone 和 Patelli, 2000)。随后的进展引入了记忆和社会模仿,以及自主税务检查员来模拟间接税收合规 (Antunes 等人, 2005)。

Moreover, physics-inspired models aimed at replacing particle interactions with behavioral contagion, as seen in the work of Zaklan et al. on tax evasion dynamics (Zaklan et al., 2008). Further, the SIMULFIS model introduced by Noguera et al. (2014) integrates rational choice, fairness concerns, and social contagion, emphasizing the importance of social mechanisms often neglected in deterrence-based theories. SIMULFIS employs a decision algorithm composed of four sequential filters—opportunity, normative, rational choice, and social influence. These filters reflect recent advancements in behavioral social science, moving beyond traditional utility-maximizing functions to incorporate fairness and social influence. Virtual experiments conducted with SIMULFIS revealed that audits are more effective than fines in improving compliance, and that publicizing tax compliance levels can positively influence behavior. Overall, ABS provides a robust tool for understanding tax compliance dynamics, aiding policymakers in designing effective strategies.

此外,Zaklan 等人在逃税动力学研究中提出了一种受物理学启发的模型,旨在用行为传染代替粒子相互作用 (Zaklan et al., 2008)。进一步地,Noguera 等人 (2014) 提出的 SIMULFIS 模型整合了理性选择、公平关注和社会传染,强调了在威慑理论中常被忽视的社会机制的重要性。SIMULFIS 采用了一种由四个连续过滤器组成的决策算法——机会、规范性、理性选择和社会影响。这些过滤器反映了行为社会科学的最新进展,超越了传统的效用最大化函数,纳入了公平和社会影响。利用 SIMULFIS 进行的虚拟实验表明,审计比罚款在提高合规性方面更有效,并且公开税收合规水平可以积极影响行为。总体而言,基于主体的建模 (Agent-Based Modeling, ABS) 为理解税收合规动态提供了强大的工具,帮助政策制定者设计有效的策略。

Recently, ABS has been greatly upgraded by the emergence of data-driven models, such as ML and DL models, which give agents adaptive behavior that is not explicitly defined by the modeler and thereby allow ABS to simulate more realistic dynamics (Ciatto et al., 2020; Wang and Usher, 2005). For instance, Jang et al. (2018) used an ABS with agents powered by a DRL-based model to explore traffic flow dynamics for various traffic simulations. Collins et al. (2014) proposed a framework for training deep reinforcement learning models in agent-based price-order-book simulations that yield non-trivial policies under diverse conditions with market impact. Joubert et al. (2022) proposed an ABS with agents powered by a reinforcement learning model with memory to simulate street robbery, showing that the simulation was able to recreate reported dynamics from the real world.

近期,随着数据驱动模型(如机器学习和深度学习模型)的出现,ABS(基于智能体的仿真)得到了极大升级,使其能够拥有不由建模者明确定义的自适应行为,从而模拟更真实的动态(Ciatto 等,2020;Wang 和 Usher,2005)。例如,Jang 等(2018)使用了一个由基于深度强化学习模型驱动的智能体的 ABS,来探索各种交通模拟中的交通流动态。Collins 等(2014)提出了一个框架,用于在基于智能体的价格订单簿模拟中训练深度强化学习模型,该模型能够在不考虑市场影响的多种条件下产生非平凡策略。Joubert 等(2022)提出了一种由具有记忆的强化学习模型驱动的智能体的 ABS,用于模拟街头抢劫,展示了该模拟能够重现现实世界中报告的动态。

In addition, recent studies have focused on integrating LLMs into ABS to further extend the capabilities of previous simulations (Gau et al., 2024). For instance, Park et al. (2022) developed a system that creates a simulated community consisting of a thousand personas (agents). This system takes the designer’s vision for the community (including its goals, rules, and member personas) and simulates it, generating behaviors such as posting, replying, and even anti-social actions. Extending this work, Gao et al. (2023) created extensive networks with 8,563 and 17,945 agents, designed to simulate social networks centered on the topics of Gender Discrimination and Nuclear Energy, respectively. With a more direct focus on economic dynamics, Li et al. (2023) utilized LLMs for macroeconomic simulation, employing prompt-engineering-driven agents that mimic human decision-making. The authors show that this approach significantly improves the realism of economic simulations compared to rule-based methods or other AI agents. Li et al. (2023) introduced a financial trading setting in which agents interact through conversations, with each agent having a layered memory system, debate mechanisms, and an individualized trading character.

此外,近期研究专注于将大语言模型 (LLM) 集成到ABS中,以进一步扩展先前模拟的能力 (Gau et al., 2024) 。例如,Park et al. (2022) 开发了一个系统,该系统可以创建一个由一千个角色 (智能体) 组成的模拟社区。该系统根据设计者对社区的愿景——包括其目标、规则和成员角色——进行模拟,生成诸如发帖、回复甚至反社会行为。扩展这项工作,Gao et al. (2023) 创建了包含 8,563 和 17,945 个智能体的广泛网络,分别设计用于模拟以性别歧视和核能为主题的社交网络。更直接地关注经济动态,Li et al. (2023) 利用大语言模型进行宏观经济模拟,采用提示工程驱动的智能体来模仿人类决策。作者表明,与基于规则的方法或其他AI智能体相比,这种方法显著提高了经济模拟的真实性。Li et al. (2023) 引入了金融交易,其中智能体通过对话进行交互,使得智能体具有分层记忆系统、辩论机制和个性化的交易特征。

These models share three properties that control the behavior of the AI agents in the simulation: the agent-environment interface, the agents’ personalities, and the agents’ capabilities acquisition (Gau et al., 2024). Below, we briefly discuss the different methods for each of them, along with their strengths and limitations.

这些模型通常具有三个独特的属性,控制着模拟中AI智能体的行为:智能体环境接口、智能体个性和智能体能力获取 (Gau et al., 2024)。下面,我们将简要讨论每种方法的不同方法及其优势和局限性。

2.4.1 Agents-Environment interface

2.4.1 智能体-环境交互界面

The operational environments define the specific contexts or settings in which the LLM-driven agents are deployed and interact, such as a financial market (an abstract environment) or a settlement (a physical environment). The Agents-Environment interface describes how agents interact with and perceive their environment. This interface enables agents to understand their surroundings, make decisions, and learn from the results of their actions. These environments can be roughly divided into two main groups: “sandbox” and “real-world”. The sandbox is a virtual environment created by humans, where agents can freely interact and experiment with different actions and strategies. However, in the context of AI agents interacting with each other, the definition of a sandbox environment can be extended to the inner world of the agent, where it can strategize by computing possible actions it may try in the actual simulation’s environment (Ahlgren et al., 2020; Truby et al., 2022). On the other hand, the real-world environment is one in which agents interact with physical entities and obey real-world physics and constraints. Its level of detail and the exact rules enforced depend on the context of the simulation, commonly balancing computational power, relevance, and realism (Kadian et al., 2020).

操作环境定义了部署和交互的大语言模型驱动型智能体的具体背景或设置,例如作为抽象环境的金融市场,或作为物理环境的结算场景。智能体-环境接口描述了智能体如何与其环境互动和感知。这个接口使智能体能够理解其周围环境、做出决策并从其行动结果中学习。这些环境大致可以分为两大类:“沙盒”和“现实世界”。沙盒是人类创建的虚拟环境,智能体可以在其中自由互动和实验不同的行动和策略。然而,在智能体相互互动的背景下,沙盒的环境定义可以扩展到智能体的内部世界,在其中它可以策略性地计算它可能在模拟的实际环境中尝试的行动 (Ahlgren等,2020;Truby等,2022)。另一方面,现实世界是一个真实的环境,智能体在其中与物理实体互动并遵守现实世界的物理和约束。现实世界的细节水平和执行的精确规则取决于模拟的上下文,通常需要在计算能力、相关性和真实性之间进行平衡 (Kadian等,2020)。

2.4.2 Agents personality

2.4.2 AI智能体个性

In LLM-powered ABS systems, agents are characterized by their traits, actions, and skills, all designed to achieve specific goals. These agents take on distinct roles within different systems, each role thoroughly described by its characteristics, capabilities, behaviors, and constraints. For example, in business environments, agents are profiled as companies with diverse capabilities and objectives, each uniquely influencing the course of the economy. Generally speaking, one can divide agent personality generation for LLM-powered ABS into three methods: pre-defined, model-generated, and data-derived. In the pre-defined case, agent profiles are explicitly and manually defined by the modeler (Gau et al., 2024). This method allows a lot of control over the agents’ personalities but limits the diversity and scale of the simulation, due to the time and resources required to apply it to a large-scale simulation.

在大语言模型驱动的ABS系统中,智能体通过其特性、行动和技能来刻画,这些设计都旨在实现特定目标。这些智能体在不同系统中承担着不同的角色,每个角色都由其特性、能力、行为和约束详细描述。例如,在商业环境中,智能体被描述为具有不同能力和目标的企业,每个企业都以独特的方式影响经济进程。一般而言,大语言模型驱动的ABS中的智能体个性生成可以分为三种方法:预定义、模型生成和数据驱动。在预定义的情况下,建模者以手动方式明确地定义智能体配置文件(Gau等人,2024)。这种方法允许对智能体个性进行大量控制,但由于大规模模拟所需的时间和资源,限制了模拟的多样性和规模。

2.4.3 Agents capabilities acquisition

2.4.3 AI智能体能力获取

Agent capabilities acquisition in LLM-powered ABS systems is crucial for enabling dynamic learning and evolution. This process relies on various types of feedback and strategies for agents to adapt effectively. Feedback is typically textual and can come from the environment, from interactions between agents, or from a pre-defined model, each providing critical information that helps agents understand the impact of their actions and adapt to complex problems (Wang et al., 2023). In some scenarios, no feedback is provided, especially when the focus is on result analysis rather than agent planning. To enhance their capabilities, agents can use memory modules to store and retrieve information from past interactions, self-evolve by modifying their goals and strategies based on feedback and communication logs, or dynamically generate new agents to address specific challenges (Nascimento et al., 2023).

LLM驱动的ABS系统中AI智能体能力获取对于实现动态学习和进化至关重要。这一过程依赖于各种类型的反馈和策略,使AI智能体能够有效适应。反馈通常是文本形式的,可以来自环境、AI智能体之间的交互或预定义模型,每种反馈都提供了关键信息,帮助AI智能体理解其行为的影响并适应复杂问题 (Wang et al., 2023)。在某些场景中,特别是当重点放在结果分析而非AI智能体规划时,可能不会提供反馈。为了增强能力,AI智能体可以使用记忆模块存储和检索过去交互中的信息,根据反馈和通信记录自我进化,修改其目标和策略,或者动态生成新的AI智能体以应对特定挑战 (Nascimento et al., 2023)。

3 Large Language Model Powered Agent-Based Simulation For Informal Economy

3 大语言模型驱动的人工智能体模拟非正规经济

Capturing the entire socio-economic dynamics of a modern monetary-based economy is extremely complex as it requires capturing highly integrated and ever-changing social, political, cultural, and technological dynamics that are reflected by economic activity (Niedzwiedz et al., 2012; Biswas and Nautiyal, 2023; Bouchaud, 2013). As such, in the proposed model we will focus on the minimal number of mechanisms and agent types required to obtain the central economic activity to sustain a relatively stable socio-economic infrastructure.

捕捉一个现代货币经济体的整个社会经济动态极为复杂,因为它需要捕捉由经济活动所反映的高度集成且不断变化的社会、政治、文化和技术动态 (Niedzwiedz et al., 2012; Biswas and Nautiyal, 2023; Bouchaud, 2013)。因此,在提出的模型中,我们将专注于维持相对稳定的社会经济基础设施所需的最少机制和智能体类型。

In this section, we first outline the economic theories that operated as the design motivation for the proposed model. Next, we outline the economic process occurring in the simulation. Finally, we formally define the individuals in the economy as the agents and the government as part of the simulation’s environment. In particular, we present the decision-making process utilized by the two types of agents. Fig. 1 presents a schematic view of the ABS design of the socio-economic dynamics.

在本节中,我们首先概述了作为所提出模型设计动机的经济理论。接着,我们概述了模拟中发生的经济过程。最后,我们将经济中的个体正式定义为AI智能体,将政府定义为模拟环境的一部分。特别是,我们介绍了两种类型AI智能体的决策过程。图 1 展示了社会经济学动态的ABS设计示意图。


Figure 1: A schematic view of the proposed ABS design of the socio-economic dynamics. The economy evolves into a population of individuals and a central government. The individuals get an income and should pay income taxes and participate in buy-sell interactions, and should pay sales taxes. The government collects taxes, as self-reported by the individuals, and uses them to fund both public goods and enforcement. The latter is used to validate the reported taxes and punishes individuals who did not pay taxes fully.

图 1: 所提议的 ABS 设计的社会经济动态示意图。经济演化为由个体和中央政府组成的人群。个体获得收入,应缴纳所得税并参与买卖互动,同时应缴纳销售税。政府收取个体自行申报的税款,并将其用于资助公共产品和执法。后者用于验证申报的税款,并惩罚未完全缴纳税款的个体。

3.1 Design motivation

3.1 设计动机

A monetary-based economy with a central government arises to tackle two main phenomena naturally occurring in resource allocation problems with heterogeneous multi-agent scenarios: the double coincidence of wants (Berentsen and Rocheteau, 2003) and the provision of public goods and services (Anand, 2004). Namely, societies agree to operate under a monetary-based economy with a central government in order to benefit from a common agreement about the utility of goods, while the central government can use a portion of their income (i.e., taxes) to provide more utility than each individual in the society could generate independently. According to “classical” economic theory, an informal economy emerges in such socio-economic conditions when individuals in the population agree with the monetary-based economy while disagreeing with or exploiting the central government’s role by avoiding paying taxes while still enjoying the utility of public goods (Farhi and Gabaix, 2020; Crocker and Slemrod, 2005).

以中央政府为核心的货币经济体系的起源是为了解决异构多智能体场景中资源分配问题中自然出现的两个主要现象:需求的双方一致性(Berentsen 和 Rocheteau,2003)以及公共产品与服务的提供(Anand,2004)。即,社会同意在以中央政府为核心的货币经济体系下运作,以便从对商品效用的共同协议中获益,同时中央政府可以利用其收入的一部分(即税收)提供比社会中每个个体独立生成更多的效用。根据“古典”经济理论,当人口中的个体同意货币经济体系,但不同意或利用中央政府的角色,通过避免缴税同时仍然享受公共产品效用时,在这种社会经济条件下会出现非正规经济(Farhi 和 Gabaix,2020;Crocker 和 Slemrod,2005)。

Following this line of thought and in order to provide the minimal complexity model that can capture the informal economic activity emergence in terms of tax evasion, one needs to answer the following three questions: First, what actions do individuals in the population perform that are identified as economic-related actions? Second, how are such actions associated with taxation to the government? Third, what utility-modifying goods (causing positive utility) do individuals in the population achieve from the government? These questions are based on a formal economy and do not take into consideration that an informal economy occurs in parallel to a formal one. As such, once an informal economy emerges, a fourth question emerges as well - what mechanisms the government can use to prevent individuals from participating in the informal economy (causing negative utility for the individuals participating in the informal economy)?

按照这一思路,为了提供一个能捕捉到逃税行为中非正规经济活动的最小复杂度模型,需要回答以下三个问题:首先,人群中哪些行为被认定为与经济相关的行为?其次,这些行为如何与政府对税收的管理相关联?第三,人群中的个体从政府那里获得了哪些能提升效用的物品(产生正效用)?这些问题基于正规经济体系,并未考虑到非正规经济与正规经济是同时存在的。因此,一旦非正规经济出现,第四个问题也随之而来——政府可以采取哪些机制来阻止个人参与非正规经济(对参与非正规经济的个人产生负效用)?

Answering these questions is an active field of study with a rapidly growing body of work (Inman and Rubinfeld, 1996; Eilat and Zinnes, 2002; Choi and Thum, 2005; Beckert, 2003). For our simulation, we focused on a relatively simplistic configuration. We assume that every individual in the economy receives income from their economic activity and can purchase goods and services accordingly. Corresponding to these two actions, the government can enforce income and sales taxes, respectively. The government uses its tax revenue to produce and supply an abstract utility (public goods) that is heterogeneous to the individuals in the population (Pauly, 1973; Groves and Ledyard, 1977).

回答这些问题是一个活跃的研究领域,相关研究正在迅速增加 (Inman and Rubinfeld, 1996; Eilat and Zinnes, 2002; Choi and Thum, 2005; Beckert, 2003)。在我们的模拟中,我们专注于一个相对简单的配置。我们假设经济中的每个人从他们的经济活动中获得收入,并可以相应地购买商品和服务。对应于这两个行为,政府可以分别征收所得税和销售税。政府利用其税收收入来生产和提供一种对人口中的个体具有异质性的抽象效用(公共品) (Pauly, 1973; Groves and Ledyard, 1977)。

Below, we formalize these ideas into a mathematical framework. Initially, we define the socio-economic environment as the economy using two mechanisms - economic transactions and taxation. In addition, the government’s enforcement and taxation reporting are integrated into the “rational” decision-making process of the government. The population of individuals (agents) is also formalized with their AI-driven decision-making process.

下面,我们将这些想法形式化为一个数学框架。首先,我们使用两种机制——经济交易和税收——来定义社会经济环境。此外,政府的执法和税务报告被整合到政府的“理性”决策过程中。个体(AI智能体)的群体也被形式化,包含他们由AI驱动的决策过程。

3.2 The economy

3.2 经济

The economy is based on two main mechanisms - economic transactions and taxation. For simplicity, economic transactions always occur between one or two agent(s) and are limited to income and buy-sell operations. The income is provided every $\theta_{i}\in\mathbb{N}$ steps in time and in amount $s_{i}\in\mathbb{R}^{+}$ for the $i^{th}$ individual agent. We assume each agent’s income, if any, is fixed over time. The buy-sell operations occur for a list of goods, $G$, where each agent has a desire, $d\in\mathbb{N}^{|G|}$, to buy them. Like the individuals’ incomes, we assume that the prices of goods are constant over time and, given the prices of the goods, the supply satisfies the entire population’s demand for each good (or service). Similarly, the agent’s desire distribution $(d)$ is constant over time. Any monetary transaction is made instantly.

经济基于两种主要机制:经济交易和税收。为简化起见,经济交易总是发生在一个或两个AI智能体之间,仅限于收入和买卖操作。收入每隔 $\theta_{i}\in\mathbb{N}$ 个时间步提供给第 $i$ 个个体智能体,金额为 $s_{i}\in\mathbb{R}^{+}$。我们假设每个智能体的收入(如果有的话)是固定不变的。买卖操作针对一系列商品 $G$ 进行,每个智能体都有购买这些商品的欲望 $d\in\mathbb{N}^{|G|}$。与个体的收入类似,我们假设商品的价格是固定不变的,并且在给定商品价格的情况下,供应满足每个商品(或服务)的整个群体的需求。同样,智能体的欲望分布 $(d)$ 也是固定不变的。任何货币交易都是即时完成的。
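As a rough illustration of the two transaction types, the sketch below tracks one agent's balance when a fixed income $s$ arrives every $\theta$ steps and a fixed basket $d$ is bought at constant prices. Taxes are deliberately left out, and the assumption that the basket is bought once per income period, at the moment the income arrives, is ours rather than the paper's. The function names are hypothetical.

```python
def income_due(t: int, theta: int) -> bool:
    """Income arrives every `theta` steps (theta >= 1)."""
    return t % theta == 0


def basket_cost(desire: list[int], prices: list[float]) -> float:
    """Cost of the desired quantities d at the constant prices of the goods G."""
    return sum(q * p for q, p in zip(desire, prices))


def simulate_balance(s: float, theta: int, desire: list[int],
                     prices: list[float], n_steps: int) -> float:
    """Balance after n_steps, ignoring taxes (assumption for illustration).

    Assumed timing: the agent buys its basket once per income period,
    immediately after receiving the income.
    """
    balance = 0.0
    for t in range(1, n_steps + 1):
        if income_due(t, theta):
            balance += s                            # income transaction
            balance -= basket_cost(desire, prices)  # buy-sell transaction
    return balance
```

For example, with `s=100.0`, `theta=2`, `desire=[2, 1]`, and `prices=[10.0, 30.0]`, each period leaves 50 units of savings, so six steps (three periods) accumulate 150.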

The government collects taxes, and every economic transaction requires self-reporting of the amount of tax that the agent carrying out the activity must pay by law. The report not only includes the fact the transaction occurred but also the selling price, and therefore, the tax amount required to be paid by the agent. In a similar manner, income tax is taken from one’s income, as reported by the agent obtaining the income. The market structure we chose to use follows the assumption that the supply of each product or service is carried out under conditions of perfect competition so that firms’ profits are zero. Simply put, in perfectly competitive markets, firms are considered “price takers”, meaning they accept the market price as given and cannot influence it. This leads to firms producing at a level where price equals both marginal cost and average total cost, resulting in zero economic profit in the long run. This outcome, in theory, is due to the absence of barriers to entry and exit, allowing new firms to enter the market if existing firms are earning positive economic profits, which increases supply and drives prices down until only normal profits remain (Kolmar and Kolmar, 2022; Kreps, 2020).

政府征收税款,每笔经济交易都要求活动执行方按照法律规定自行申报应缴税款。申报不仅包括交易发生的事实,还包括销售价格,从而确定执行方应缴纳的税款金额。同样,所得税是从收入中获得方申报的收入中扣除的。我们选择使用的市场结构遵循的假设是,每种产品或服务的供应都在完全竞争的条件下进行,因此企业的利润为零。简而言之,在完全竞争市场中,企业被视为“价格接受者”,即它们接受市场价格且无法影响它。这导致企业在价格等于边际成本和平均总成本的水平上生产,长期来看经济利润为零。理论上,这种结果是由于没有进入和退出的壁垒,如果现有企业获得正的经济利润,新企业可以进入市场,这增加了供应并压低价格,直到只剩下正常利润 (Kolmar and Kolmar, 2022; Kreps, 2020)。

3.3 Government

3.3 政府

In our simulation, the government is operating as a knowledge-limited, central, and rational agent. The government’s primary objective is to optimize the welfare of the citizens (agents), which is reflected by maximizing the overall lifetime utility of all agents in the economy. By adjusting tax policies, allocating funds to the provision of public goods, and enforcing measures against informal economic activities, the government aims to achieve this objective. We separate the provision of public goods from enforcement actions to prevent tax evasion (which is considered a public good by itself) in order to determine the impact of changes in enforcement policy on the size of the informal economy in the sensitivity analyses we will conduct later.

在我们的模拟中,政府作为一个知识有限、集中且理性的智能体运作。政府的主要目标是优化公民(智能体)的福利,这体现在最大化经济中所有智能体的整体终身效用上。通过调整税收政策、分配资金用于公共产品的提供,以及采取措施打击非正规经济活动,政府旨在实现这一目标。我们将公共产品的提供与防止逃税的执法行动(这本身被视为一种公共产品)分开,以便在后续的敏感性分析中确定执法政策变化对非正规经济规模的影响。

Formally, the government is represented by the following tuple $(m,\mu,\lambda,\nu,\xi)$ where $m\in\mathbb{R}^{+}$ represents the government’s current budget; $\mu$ denotes the sales tax policy, indicating the rate at which tax is applied to goods and services prices within the economy; $\lambda$ denotes the income tax policy; $\nu$ reflects the government’s efficiency in converting tax revenues into public goods; and $\xi$ signifies the enforcement policy of informal economic-related activities. As such, the government’s objective takes the form

正式地,政府由以下元组表示 $(m,\mu,\lambda,\nu,\xi)$ ,其中 $m\in\mathbb{R}^{+}$ 代表政府的当前预算;$\mu$ 表示销售税政策,指示对经济中商品和服务价格征收的税率;$\lambda$ 表示所得税政策;$\nu$ 反映政府将税收转化为公共产品的效率;$\xi$ 表示与非正式经济活动相关的执法政策。因此,政府的目标形式为

image.png

where $T<\infty$ is the number of steps in time considered for the simulation, $m(t)$ is the government’s budget at the $t^{th}$ step in time, $\rho\in(0,1)$ is a discount factor, and $u_{a}^{t}$ is a concave, continuous, non-decreasing utility function of agent $a\in A$ in the population at time $t$. $u_{a}^{t}$ is a function of $d\in\mathbb{N}^{\kappa}\subseteq G$ that details the list of goods the agents wish to acquire in each $\theta\in\mathbb{N}$ steps in time and for $\kappa\in\mathbb{N}$ private goods in the economy where $G$ is the list of all private goods in the economy. Each agent’s ability to purchase quantities of private goods is affected by the government’s tax policy (sales tax, $\mu$, and income tax, $\lambda$, policies). Furthermore, each agent’s utility function is affected by the quantities of public goods that the government provides, with these quantities being affected by the amount of tax and how these quantities are converted into benefits (utility) for the agents, denoted by $\nu$.

其中 $T<\infty$ 是模拟中考虑的时间步数,$m(t)$ 是政府在第 $t$ 个时间步的预算,$\rho\in(0,1)$ 是折现因子,$u_{a}^{t}$ 是时间 $t$ 时群体中 $a\in A$ 智能体的凹、连续、非递减效用函数。$u_{a}^{t}$ 是 $d\in\mathbb{N}^{\kappa}\subseteq G$ 的函数,详细描述了智能体在每个时间步 $\theta\in\mathbb{N}$ 和在拥有 $\kappa\in\mathbb{N}$ 种私人物品的经济体中希望获取的物品清单,其中 $G$ 是经济体中所有私人物品的清单。每个智能体购买私人物品的能力受到政府税收政策(销售税 $\mu$ 和所得税 $\lambda$)的影响。此外,每个智能体的效用函数受到政府提供的公共物品数量的影响,这些数量受到税收金额以及这些数量如何转化为智能体收益(效用)$\nu$ 的影响。
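The objective itself is not reproduced in the text above, so the following is only a sketch of one common reading of the surrounding definitions: a discounted sum of all agents' utilities over $T$ steps. The exponent convention ($\rho^{t}$ with $t$ starting at 1) and the omission of any constraint on the budget $m(t)$ are assumptions, and `discounted_welfare` is a hypothetical name.

```python
def discounted_welfare(utilities: list[list[float]], rho: float) -> float:
    """Discounted total utility over all agents and time steps.

    utilities[k][a] holds u_a^t for step t = k + 1 (assumed indexing).
    rho is the discount factor and must lie strictly between 0 and 1.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must be in (0, 1)")
    return sum(
        rho ** t * sum(step_utilities)
        for t, step_utilities in enumerate(utilities, start=1)
    )
```

With two agents, utilities `[[1.0, 1.0], [2.0, 2.0]]`, and `rho = 0.5`, the value is `0.5 * 2 + 0.25 * 4 = 2.0`.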

To this end, the sales tax policy ($\mu:\mathbb{R}^{+}\to\mathbb{R}^{+}$) takes the form of a percentage of the good’s price, which is added on top of the price and paid by the buyer agent to the seller agent. It is the responsibility of the seller agent to report the tax charged to the buyer agent. The sales tax percentage can be arbitrarily large, starting from zero percent, but is constant across all the goods in the economy (Keen, 2013).

为此,销售税政策 ($\mu:\mathbb{R}^{+}\to\mathbb{R}^{+}$) 采取商品价格的一定百分比的形式,该税款加在价格之上,由买家AI智能体支付给卖家AI智能体。卖家AI智能体负责报告向买家AI智能体收取的税款。销售税百分比可以从零开始任意大,但对经济中的所有商品保持恒定 (Keen, 2013)。

The income tax ($\lambda:\mathbb{R}^{+}\to\mathbb{R}^{+}$) can take one of two forms. First, a fixed income tax, which is a percentage (ranging from $0\%$ to $100\%$) of each income, as applied in countries like Russia, Czech Republic, and Bulgaria (Ivanova et al., 2005; Vasilev, 2015); and second, a progressive income tax system where each “step” in the income is taxed at a different percentage (commonly a monotonically increasing one), as applied in countries like Israel, Switzerland, and the United States (Pommerehne and Weck-Hannemann, 1996b; Kopczuk, 2005). For the latter case, the policy is represented by a list of tuples such that the first value indicates the income threshold and the second value is the taxation rate.

个人所得税 ($\lambda:\mathbb{R}^{+}\to\mathbb{R}^{+}$) 可以采取两种形式之一。第一种是固定税率,即对每笔收入按一定比例(从 $0\%$ 到 $100\%$ 不等)征税,如俄罗斯、捷克共和国和保加利亚等国家所采用 (Ivanova et al., 2005; Vasilev, 2015);第二种是累进税制,即收入的每个“阶梯”按不同的税率征税(通常为单调递增的税率),如以色列、瑞士和美国等国家所采用 (Pommerehne and Weck-Hannemann, 1996b; Kopczuk, 2005)。对于后者,政策由一组元组表示,其中第一个值表示收入阈值,第二个值表示税率。
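The two income tax forms can be sketched as follows. The list-of-tuples representation follows the text, but interpreting each threshold as the lower bound of its bracket, and taxing only the income that falls inside each bracket, is an assumption. Both function names are hypothetical.

```python
def flat_income_tax(income: float, rate: float) -> float:
    """Fixed income tax: a single rate in [0, 1] applied to the whole income."""
    return income * rate


def progressive_income_tax(income: float,
                           brackets: list[tuple[float, float]]) -> float:
    """Progressive income tax over (threshold, rate) tuples.

    Assumptions: brackets are sorted by ascending threshold, each threshold
    is the lower bound of its bracket, and the first threshold is 0.
    Only the slice of income inside a bracket is taxed at that bracket's rate.
    """
    tax = 0.0
    for i, (lower, rate) in enumerate(brackets):
        if income <= lower:
            break
        # Upper bound is the next threshold, or unbounded for the last bracket.
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        tax += (min(income, upper) - lower) * rate
    return tax
```

For instance, with the illustrative brackets `[(0, 0.0), (1000, 0.25), (5000, 0.5)]`, an income of 6000 is taxed as `0 + 4000 * 0.25 + 1000 * 0.5 = 1500`.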

The public goods policy, $\nu:\mathbb{R}^{+}\rightarrow\mathbb{R}^{|\chi|}$, is represented by the amount of money allocated to a list of public goods of size $\chi\in\mathbb{N}$, such that each public good has some utility to each agent in the population. This association is formally given by a function $\nu_{i}:\mathbb{R}^{+}\to\mathbb{R}^{|A|}$ for the $i^{th}$ public good and reflects the government’s efficiency in converting tax revenues into public goods. Different public goods have different utility to subsets of the population. For example, adding a road to a city is very beneficial to the city’s residents, somewhat beneficial for individuals crossing the city, and not beneficial at all for individuals who do not use this road. For realism, we assume a linear utility increase with respect to the amount of funds invested by the government in each public good. Moreover, it is assumed that the utility distribution for each individual in the population is constant and the utility is obtained at each step in time.

公共物品政策,$\nu:\mathbb{R}^{+}\rightarrow\mathbb{R}^{|\chi|}$,通过将资金分配给大小为 $\chi\in\mathbb{N}$ 的公共物品列表来表示,使得每个公共物品对人口中的每个个体都有一定的效用。这种关联形式通过函数 $\nu_{i}:\mathbb{R}^{+}\to\mathbb{R}^{|A|}$ 来表示第 $i$ 个公共物品,并反映了政府在将税收转化为公共物品中的效率。不同的公共物品对不同的人群子集有不同的效用。例如,在某些城市增加一条道路对城市居民非常有益,对穿越城市的个体有一定的益处,而对不使用这条道路的个体则没有益处。为了现实性,我们假设效用随着政府对每个公共物品的投资金额线性增加。此外,假设人口中每个个体的效用分布是恒定的,并且效用在每个时间步中获取。
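Under the linear-utility assumption, each agent's utility from public goods is a weighted sum of the funds invested in each good. The sketch below encodes the per-agent weights as a matrix with one row per public good. This matrix form and the name `public_goods_utility` are illustrative assumptions, not the paper's notation.

```python
def public_goods_utility(allocation: list[float],
                         weights: list[list[float]]) -> list[float]:
    """Per-agent utility from public goods under a linear model.

    allocation[i] is the money invested in public good i.
    weights[i][a] is the (constant) utility agent a gets per unit of money
    invested in public good i; this matrix is a hypothetical encoding.
    """
    n_agents = len(weights[0])
    utilities = [0.0] * n_agents
    for funds, row in zip(allocation, weights):
        for a, w in enumerate(row):
            utilities[a] += funds * w
    return utilities
```

Taking the road example with three agents (a resident, a passer-by, and a non-user), weights `[[0.5, 0.25, 0.0], [0.0, 0.5, 0.5]]` and an allocation of `[100.0, 50.0]` give utilities `[50.0, 50.0, 25.0]`.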

The enforcement policy (i.e., tax evasion penalty policy) $\xi:\mathbb{R}^{+}\rightarrow A$ is a function that receives funding and returns the portion of the population the government is able to investigate. By investigating an individual, the real amount of taxation the individual should have paid over its entire history is revealed. Any gap between the amount of taxes an individual should have paid and the amount the individual actually paid is denoted by $\psi$. An individual with $\psi>0$ upon investigation is punished with a fine linearly proportional to $\psi$, at a rate $\alpha\in\mathbb{R}^{+}$, while the historical taxes themselves are waived. We assume that the investigated subset of agents is chosen randomly from the population.

执法政策(即逃税处罚政策) $\xi:\mathbb{R}^{+}\rightarrow A$ 是一个函数,它获取资金并返回政府能够调查的人口比例。通过调查个人,该个人在整个历史中应缴纳的实际税款金额将被揭示。个人实际缴纳的税款金额与应缴金额之间的任何差异用 $\psi$ 表示。在调查时,对于 $\psi>0$ 的个人,将按照线性比例 $\alpha\in\mathbb{R}^{+}$ 的金额进行处罚,同时历史税款将被豁免。我们假设从人口中随机选择子集的智能体。
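A minimal sketch of the enforcement step is given below. The text does not say how funding maps to the number of audits, so the fixed `cost_per_audit` is a hypothetical parameter, as are the function names. The fine is $\alpha\psi$ for a positive gap $\psi$, and the unpaid taxes themselves are not collected.

```python
import random


def audited_agents(funding: float, cost_per_audit: float,
                   agent_ids: list[int], rng: random.Random) -> list[int]:
    """Randomly choose which agents are investigated.

    cost_per_audit is a hypothetical parameter: the text only states that
    funding determines the portion of the population that can be audited.
    """
    n_audits = min(len(agent_ids), int(funding // cost_per_audit))
    return rng.sample(agent_ids, n_audits)


def penalty(tax_due: float, tax_paid: float, alpha: float) -> float:
    """Fine for an audited agent: alpha * psi when psi = due - paid > 0.

    The historical unpaid taxes are waived, so only the fine is charged.
    """
    psi = tax_due - tax_paid
    return alpha * psi if psi > 0 else 0.0
```

For example, an agent who owed 100 and paid 60 faces a fine of `1.5 * 40 = 60` when `alpha = 1.5`, while a fully compliant agent pays nothing.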

Importantly, all four policies are pre-defined and static over time.

重要的是,所有四个策略都是预先定义的,并且随时间保持静态。

3.4 Individuals

3.4 个体

We assume a fixed-size population of agents $(A)$. Each agent in the population $(a\in A)$ is defined by a timed finite state machine (Al-Saawy et al., 2009), which is formally captured by the tuple $a:=(\beta,\theta,s,d,\zeta,\eta,\upsilon,\psi)$ where $\beta\in\mathbb{R}^{+}$ denotes the current amount of money the agent possesses; $\theta\in\mathbb{N}$ indicates the number of simulation steps between two salaries; $s\in\mathbb{R}^{+}$ indicates the amount of money the agent gets from income, such that any gap between the agent’s income (after income taxes) and the total expenditure on purchasing private goods is added to the amount available to the individual in the next period, $\beta$ (i.e., savings); $d\in\mathbb{N}^{\kappa}\subseteq D$ details the list of goods the agent wishes to acquire in each $\theta\in\mathbb{N}$ steps in time and for $\kappa\in\mathbb{N}$ private goods in the economy, where $D$ is the list of all private goods in the economy of size $K$; $\zeta\in\mathbb{R}^{+}$ measures the agent’s propensity to take risks in their economic activities, affecting their economic decisions; $\eta\in\mathbb{N}$ represents the planning horizon in terms of steps in time (indicating how far ahead the agent plans for future economic activities); $\upsilon\in\mathbb{R}^{+}$ indicates the cognitive ability of the agent, represented by the amount of noise the deep reinforcement learning (DRL) model receives during the agent’s learning process, where higher noise levels can simulate lower cognitive ability, leading to less precise decision-making; and $\psi$ is the agent’s personality, as reflected by a free text. Moreover, it is assumed that agents are fully aware of their state and the four government policies. Importantly, both interactions, the income and the sell-buy, are recorded by the agent. Namely, the income transactions are recorded as “Obtained an income $s$ at time $t$” and the sell-buy transactions as “buy a product for a price $p$”.

我们假设智能体群体 $(A)$ 的规模固定。每个智能体 $(a\in A)$ 由一个时间有限状态机(Al-Saawy 等人,2009)定义,正式表示为元组 $a:=(\beta,\theta,s,d,\zeta,\eta,\upsilon,\psi)$,其中 $\beta\in\mathbb{R}^{+}$ 表示智能体当前拥有的资金量;$\theta\in\mathbb{N}$ 表示两次收入之间的模拟步骤数;$s\in\mathbb{R}^{+}$ 表示智能体从收入中获得的金额。因此,智能体收入(扣除所得税)与购买私人物品的总支出之间的差额将添加到下一期的可用资金量 $\beta\in\mathbb{R}^{+}$ 中(即储蓄);$d\in\mathbb{N}^{\kappa}\subseteq D$ 详细列出了智能体在每个 $\theta\in\mathbb{N}$ 时间步骤中希望获得的物品列表,以及经济中 $\kappa\in\mathbb{N}$ 个私人物品,其中 $D$ 是经济中所有私人物品的列表,大小为 $K$;$\zeta\in\mathbb{R}^{+}$ 衡量了智能体在经济活动中的风险偏好,影响其经济决策;$\eta\in\mathbb{N}$ 表示规划的时间步长(表明智能体对未来经济活动的规划时间跨度);$\upsilon\in\mathbb{R}^{+}$ 表示智能体的认知能力,由深度强化学习(DRL)模型在智能体学习过程中接收的噪声量表示,其中较高的噪声水平可以模拟较低的认知能力,导致决策不够精确;$\psi$ 是智能体的个性,通过自由文本反映。此外,假设智能体完全了解其状态和四项政府政策。重要的是,收入和买卖这两种交互都由智能体记录。即,收入交易记录为“在时间 $t$ 获得收入 $s$”,买卖交易记录为“以价格 $p$ 购买产品”。
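The agent tuple maps naturally onto a small record type. The sketch below is one possible encoding, not the authors' implementation: the class name, field names, and the `log` list are hypothetical, and the two record methods simply reproduce the textual transaction records quoted above while updating the balance $\beta$.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    beta: float        # current amount of money
    theta: int         # simulation steps between two incomes
    s: float           # income amount
    d: list[int]       # desired quantity of each private good
    zeta: float        # propensity to take risks
    eta: int           # planning horizon, in steps
    upsilon: float     # cognitive ability, modeled as DRL training noise
    personality: str   # free-text personality (psi in the tuple)
    log: list[str] = field(default_factory=list)  # hypothetical history store

    def record_income(self, t: int) -> None:
        """Add the income to the balance and log it as free text."""
        self.beta += self.s
        self.log.append(f"Obtained an income {self.s} at time {t}")

    def record_purchase(self, price: float) -> None:
        """Deduct the price from the balance and log the purchase."""
        self.beta -= price
        self.log.append(f"buy a product for a price {price}")
```

Keeping the history as plain text is convenient here because the same records can later be passed to a language model as part of the agent's context.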

The individual’s decision-making process is divided into two parts: how much income tax to pay and how much sales tax to pay. To make these two decisions, the agents are provided with a combined LLM and DRL model. The DRL operates as the “rational” mind, while the LLM operates as the subconscious of the agent. The agent’s state (including personality), the four government policies, and previous economic interactions (including income, sell-buy, and tax reports) are initially provided to an LLM, which is requested to return the amount of taxes the agent should pay, once for the income and once for sell-buy. The LLM is based on the LLAMA-2 model, which is considered one of the best-performing open-source LLMs (Touvron et al., 2023). In particular, previous studies show that LLAMA-2 produces promising results in economic reasoning (Raman et al., 2024; Yu et al., 2024). Technically, the LLM is queried with a question of how much taxes the agent should report, formalized as

个体的决策过程分为两个部分:缴纳多少所得税和缴纳多少销售税。为了执行这两个决策,智能体被赋予了一个组合的大语言模型和深度强化学习模型(DRL)。DRL 充当“理性”思维,而大语言模型则充当智能体的潜意识。智能体的状态(包括个性)、四项政府政策以及之前的经济互动(包括收入、买卖和税务报告)首先被输入到一个大语言模型中,该模型被要求返回智能体应缴纳的税款金额——一次针对收入,一次针对买卖。大语言模型基于 LLAMA-2 模型,该模型被认为是性能最佳的开源大语言模型之一(Touvron 等人,2023)。特别是,先前的研究表明,LLAMA-2 在经济推理方面取得了令人鼓舞的成果,如这些研究所示(Raman 等人,2024;Yu 等人,2024)。从技术上讲,大语言模型被询问智能体应报告的税款金额,问题被形式化为

Figure 2: A schematic view of the decision process of an individual agent. Income and buy-sell transactions occur to and by the agent, which needs to report them and pay taxes to the government. The decision process starts with an LLM, which produces an initial suggestion for the amount of taxes the agent should pay by taking into account the agent’s state, personality, historical actions, and government policies. The same information, together with the LLM’s suggestion and the risk-loving factor, is used by the DRL model to produce the final decision.

图 2: 个体智能体的决策过程示意图。收入和买卖交易发生在智能体之间,智能体需要向政府报告并缴纳税款。决策过程从一个大语言模型开始,该模型通过考虑智能体的状态、个性、历史行为和政府政策,生成智能体应缴税款的初步建议。相同的信息与大语言模型的建议和风险偏好因子一起被用于深度强化学习 (DRL) 模型,以生成最终决策。

follows:

如下:

LLM Prompt

大语言模型 (LLM) 提示

What is the amount of taxes I should pay? Make sure to return a single positive number.

我应缴纳的税款金额是多少?请确保返回一个正数。

This information, as well as the inputs to the LLM, is then provided to a DRL model. Specifically, the DRL is based on the Deep Q-Network (DQN) algorithm (Fan et al., 2020). We chose DQN for two main reasons. First, DQN has an off-policy learning mechanism, meaning it can learn from past experiences stored in a replay buffer. This ability to reuse past data makes it more sample-efficient than on-policy methods, which discard past data once used. Moreover, the replay buffer allows DQN to break the correlation between consecutive samples by shuffling them, which leads to more stable and efficient learning. Second, DQN uses a target network, which is a delayed copy of the Q-network used to predict target Q-values. This helps to stabilize training by reducing the correlations between the action-value estimates and the target values, mitigating the risk of divergence and making learning more robust. Fig. 2 presents a schematic view of the decision-making process and the feedback loop. A more detailed description of the model’s training and inference is provided in the Appendix.

这些信息以及大语言模型的输入随后被提供给深度强化学习 (DRL) 模型。具体来说,DRL 基于深度 Q 网络 (DQN) 算法 (Fan et al., 2020)。我们选择 DQN 主要有两个原因。首先,DQN 具有离策略学习机制,这意味着它可以从存储在回放缓冲区中的过去经验中学习。与丢弃过去数据的在策略方法相比,这种重用过去数据的能力使其更具样本效率。此外,回放缓冲区通过打乱样本顺序,使 DQN 能够打破连续样本之间的相关性,从而实现更稳定和高效的学习。其次,DQN 使用目标网络,它是 Q 网络的延迟副本,用于预测目标 Q 值。这有助于通过减少动作值估计与目标值之间的相关性来稳定训练,从而降低发散风险并使学习更加稳健。图 2 展示了决策过程和反馈循环的示意图。模型的训练和推理的详细描述见附录。
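The two DQN ingredients named above, the replay buffer and the delayed target copy, can be illustrated with a minimal sketch. It is not the authors' implementation: a small lookup table stands in for the Q-network so that the example stays dependency-free, and the names `ReplayBuffer`, `td_target`, `train_step`, and `sync_target`, as well as the values of `gamma` and `lr`, are assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # Oldest transitions are dropped automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng):
        # Uniform sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_state, done, target_q, gamma):
    """Bootstrapped target computed from the target table, not the online one."""
    if done:
        return reward
    return reward + gamma * max(target_q[next_state])


def train_step(online_q, target_q, batch, gamma=0.9, lr=0.5):
    """One off-policy update of the online table over a sampled mini-batch."""
    for state, action, reward, next_state, done in batch:
        target = td_target(reward, next_state, done, target_q, gamma)
        online_q[state][action] += lr * (target - online_q[state][action])


def sync_target(online_q, target_q):
    """Refresh the delayed copy with the current online values."""
    for state, values in online_q.items():
        target_q[state] = list(values)
```

In a full DQN the tables would be neural networks and `train_step` a gradient step on the squared difference between the online estimate and the target; calling `sync_target` only every few updates is what keeps the targets from moving with every step.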

4 Experiments

4 实验

In this section, we outline the experiments conducted using the proposed model to investigate the emergence of the informal economy and its properties. First, we set the model’s parameters following as closely as possible the socio-economic configuration of the US. Second, we define the evaluation metrics used to evaluate the informal economy size and properties. Finally, we outline the experimental rationale as well as the statistical analysis applied to the simulations’ results.

在本节中,我们概述了使用所提出模型进行的实验,以研究非正式经济的出现及其特性。首先,我们尽可能按照美国社会经济结构设置模型参数。其次,我们定义了用于评估非正式经济规模和特性的评价指标。最后,我们概述了实验原理以及对模拟结果进行的统计分析。

4.1 Model parameters

4.1 模型参数

In order to implement the proposed model, one is required to establish the parameter values and define the government policies. We decided to adopt the case of the US, which is considered the leading global economy (Reuveny and Thompson, 2001), comprising an estimated informal economy of approximately $7\%$ of its GDP as of 2023$^{2}$.

为了实现所提出的模型,需要设定参数值并定义政府政策。我们决定采用美国这一被视为全球领先经济体的案例 [Reuveny and Thompson, 2001],截至 2023 年,美国非正式经济约占其 GDP 的 $7%^{2}$。

Income. Based on the 2024 Current Population Survey Annual Social and Economic Supplements (CPS ASEC) conducted by the Census Bureau$^{3}$, the 2023 household income deciles in the US are presented in Table 1. Thus, the annual income value range in the simulation was determined to be between \$18,980 and \$316,100, divided into deciles.

基于美国人口普查局2024年《当前人口调查年度社会和经济补充》(CPS ASEC)数据的美国2023年家庭收入10分位数如表1所示。因此,模拟中的年收入值范围确定为18,980美元至316,100美元,并划分为10分位数。

Table 1: Income at selected percentiles in 2023 dollars, US.

表 1: 2023 年美元收入百分位数

| Decile | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Income (USD) | 18,980 | 33,000 | 47,910 | 62,200 | 80,610 | 101,000 | 127,300 | 165,300 | 234,900 | 316,100 |

Goods. The Bureau of Labor Statistics (BLS) in the US produces the Consumer Price Index (CPI) as a measure of price change faced by consumers. For use alongside the published indexes, BLS publishes the relative importance (RI) of the 204 components in the CPI, which is the expenditure weight of an individual component expressed as a percentage of all items within the U.S.$^{4}$ Accordingly, the number of goods in the consumer basket in 2023 is 204, and their normalized prices are shown in Table 6 in the appendix.

商品。美国劳工统计局 (BLS) 编制了消费者价格指数 (CPI) 作为衡量消费者面临的价格变化的指标。BLS 还发布了 CPI 中 204 个组成部分的相对重要性 (RI),即每个组成部分的支出权重,表示为所有项目在 $\mathrm{{U.S^{4}}}$ 中的百分比。因此,2023 年消费者篮子中的商品数量为 204 种,其归一化价格如附录中的表 6 所示。

Income tax. In 2023, total federal receipts were \$4.4 trillion, about 16.5 percent of the gross domestic product (GDP) of the US. The largest sources of revenues are the individual income tax and payroll taxes, followed by the corporate income tax, customs duties, and excise taxes. To cover any shortfalls between revenues and spending, the government issues debt. The federal government collects taxes on the wages and salaries earned by individuals, income from investments (for example, interest, dividends, and capital gains), and other income. Individual income taxes are the largest single source of federal revenues, constituting around one-half of all receipts. As a percentage of GDP, individual income taxes have ranged from 6 to 10 percent over the past 50 years, averaging around 8 percent of GDP. Tax liabilities vary considerably by income. Both employers and employees contribute payroll taxes, also known as social insurance taxes. Payroll taxes are the second-largest component of federal revenues and account for approximately one-third of total tax receipts, or approximately 6 percent of GDP. Payroll taxes help fund Social Security, Medicare, and unemployment insurance. For Social Security, employers and employees each contribute 6.2 percent of every paycheck, up to a maximum amount (\$168,600 in 2024). For Medicare, employers and employees each contribute an additional 1.45 percent, with no income limit. The Affordable Care Act added another 0.91 percent in payroll taxes on earnings over \$200,000 for individuals or \$250,000 for couples. Employers also pay the federal unemployment tax, which finances state-run unemployment insurance programs. The government collects taxes on the profits of corporations. In 2022, most corporate income was taxed at 21 percent at the federal level (before adjustments). When combined with state and local corporate taxes, the average statutory tax rate was 25.8 percent, although most corporations pay less than the statutory rate because of exemptions, deductions, and other adjustments to income. Corporate taxes amount to approximately 9.9 percent of all tax revenues, or approximately 1.6 percent of GDP. Taxes on certain goods such as tobacco, alcohol, and motor fuels also contribute to federal revenues. Those excise taxes are imposed at the point of sale and add to the prices that consumers pay for such goods. Revenues from excise taxes amount to approximately 2 percent of all tax revenues, or approximately 0.3 percent of GDP. The government collects revenues from duties and tariffs on imports. Those revenues amount to approximately 2 percent of all tax revenues, or approximately 0.3 percent of GDP. Federal revenues that come from other sources (such as estate and gift taxes and the deposit of earnings from the Federal Reserve System, among others) amount to approximately 2 percent of all tax revenues, or approximately 0.3 percent of GDP. In summary, Table 2 presents the federal income tax rates for a single taxpayer in the US for 2023$^{5}$.

所得税。2023年,联邦总收入为 $\mathbb{S}4.4$ 万亿美元,约占美国国内生产总值 (GDP) 的16.5%。收入的最大来源是个人所得税和工资税,其次是企业所得税、关税和消费税。为了弥补收入与支出之间的任何短缺,政府会发行债务。联邦政府对个人赚取的工资和薪金、投资收入(例如利息、股息和资本收益)以及其他收入征税。个人所得税是联邦收入的最大单一来源,约占所有收入的一半。在过去50年中,个人所得税占GDP的比例在6%到10%之间,平均约为GDP的8%。税务责任因收入差异很大。雇主和雇员都需缴纳工资税,也称为社会保险税。工资税是联邦收入的第二大组成部分,约占税收总额的三分之一,或约占GDP的6%。工资税用于资助社会保险、医疗保险和失业保险。对于社会保险,雇主和雇员各缴纳每份工资的6.2%,2024年最高收入为 $\mathcal{S}168{,}600$ 。对于医疗保险,雇主和雇员各额外缴纳1.45%,无收入上限。《平价医疗法案》对个人收入超过 $\mathbb{S}200{,}000$ 或夫妻收入超过 $\mathbb{S}250{,}000$ 的部分额外征收0.91%的工资税。雇主还需缴纳联邦失业税,该税用于资助州立失业保险计划。政府对企业的利润征税。2022年,大多数企业收入在联邦层面按21%的税率征税(调整前)。与州和地方企业所得税结合后,平均法定税率为25.8%,但由于免税、扣除和其他收入调整,大多数企业支付的税率低于法定税率。企业所得税约占所有税收收入的9.9%,或约占GDP的1.6%。某些商品(如烟草、酒精和机动车燃料)的税收也为联邦收入做出贡献。这些消费税在销售时征收,并增加了消费者为这些商品支付的价格。消费税收入约占所有税收收入的2%,或约占GDP的0.3%。政府对进口商品征收关税和税款。这些收入约占所有税收收入的2%,或约占GDP的0.3%。来自其他来源的联邦收入(如遗产税和赠与税,以及联邦储备系统的收益存款等)约占所有税收收入的2%,或约占GDP的0.3%。表 2 总结了2023年美国单身纳税人的联邦所得税税率。

Sales taxes. Sales tax in the United States is a consumption-based tax applied at the state and local levels, serving as a significant revenue source for public services such as education, infrastructure, and public safety (Alshira’h et al., 2020). It is governed primarily by state laws, with 45 states and the District of Columbia imposing a statewide sales tax, while five states (Alaska, Delaware, Montana, New Hampshire, and Oregon) do not. Many states allow local governments to levy additional sales taxes, resulting in widely varying combined rates, sometimes exceeding $10\%$. Sales tax generally applies to tangible personal property and selected services, although exemptions for essentials like groceries and prescription drugs are common. Compliance requires businesses to collect sales tax at the point of sale and remit it to the appropriate tax authorities, based on their physical or economic “nexus” within the state. Use tax complements sales tax for goods purchased tax-free in other jurisdictions but used locally. Following the 2018 Supreme Court decision in South Dakota v. Wayfair, Inc., states gained broader authority to mandate sales tax collection from out-of-state and online retailers, addressing the challenges posed by e-commerce. Statewide rates range from $2.9\%$ in Colorado to $7.25\%$ in California, with significant variation depending on local rates. States may also offer sales tax holidays for specific items like school supplies or energy-efficient appliances, alongside permanent exemptions for certain goods and services. Chapter 7 of the U.S. tax code$^{6}$ outlines the foundational framework for administering and enforcing sales tax, including auditing mechanisms to ensure compliance, and underscores its importance in sustaining state and local budgets. Despite its complexity, the sales tax system continues to adapt to evolving economic conditions and remains a cornerstone of subnational government funding. As such, for simplicity, in our study we used the average combined state and local sales tax rate across the US, which is approximately $6.44\%$$^{7}$.

消费税
在美国,消费税是一种基于消费的税,适用于州和地方层面,是教育、基础设施和公共安全等公共服务的重要收入来源 (Alshira’h et al., 2020) 。它主要由州法律管理,45个州和哥伦比亚特区征收全州范围的消费税,而阿拉斯加、特拉华、蒙大拿、新罕布什尔和俄勒冈等五个州则不征收。许多州允许地方政府征收额外的消费税,导致综合税率差异很大,有时超过 $10%$ 。消费税通常适用于有形个人财产和选定服务,虽然常见如食品杂货和处方药等必需品免税。合规要求企业在销售点征收消费税,并根据其在该州的实体或经济“联系点”将税款汇给相应的税务机关。使用税是对在其他管辖区免税购买但在本地使用的商品的补充税。继2018年最高法院在南达科他州诉Wayfair, Inc.案中的裁决后,各州获得了更广泛的权力,要求外州和在线零售商征收销售税,以应对电子商务带来的挑战。税率范围从科罗拉多州的 $2.9%$ 到加利福尼亚州的 $7.25%$ ,具体取决于地方税率。各州还可能为特定物品如学校用品或节能电器提供消费税假期,同时对某些商品和服务提供永久免税。美国税法第7章概述了管理和执行消费税的基本框架,包括确保合规的审计机制,并强调其在维持州和地方预算中的重要性。尽管复杂,消费税系统继续适应不断变化的经济条件,并仍然是地方政府资金的基石。因此,为了简化,在我们的研究中,我们使用了全美国各州和地方的消费税平均综合税率,大约为 $6.44%$ 。

Table 2: Federal Tax Brackets (Guzman and Kollar, 2023)

表 2: 联邦税率表 (Guzman and Kollar, 2023)

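| Tax rate | 10% | 12% | 22% | 24% | 32% | 35% | 37% |
|---|---|---|---|---|---|---|---|
| Taxable income from | \$0 | \$11,001 | \$44,726 | \$95,376 | \$182,101 | \$231,251 | \$578,126 |
| Taxable income to | \$11,000 | \$44,725 | \$95,375 | \$182,100 | \$231,250 | \$578,125 | and above |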
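To make the bracket structure of Table 2 concrete, the following sketch computes the tax owed on a given taxable income by taxing each slice of income at its own marginal rate. It is a simplified illustration rather than the paper's code: the function name `income_tax` is hypothetical, bracket floors are taken as the upper bound of the previous bracket, and deductions, credits, and filing status are ignored.

```python
# Bracket floors (USD) and marginal rates transcribed from Table 2
# (single taxpayer, 2023). Each floor is the upper bound of the previous bracket.
BRACKETS = [
    (0, 0.10),
    (11_000, 0.12),
    (44_725, 0.22),
    (95_375, 0.24),
    (182_100, 0.32),
    (231_250, 0.35),
    (578_125, 0.37),
]


def income_tax(taxable_income):
    """Tax owed when each slice of income is taxed at its own bracket rate."""
    tax = 0.0
    for i, (floor, rate) in enumerate(BRACKETS):
        if taxable_income <= floor:
            break
        # The top bracket has no ceiling.
        ceiling = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        tax += (min(taxable_income, ceiling) - floor) * rate
    return tax
```

For example, an income of \$47,910 (the third decile in Table 1) falls in the 22% bracket, but only the part above \$44,725 is taxed at that rate.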

Penalties for tax evasion. In the US, penalties for tax evasion vary depending on the severity of the offense and whether it is classified as a civil or criminal case. Civil penalties include an accuracy-related penalty of $20\%$ of the underpaid taxes for negligence or disregard of tax rules, a fraud penalty of $75\%$ of the underpaid taxes for intentional underpayment, a failure-to-file penalty of $5\%$ of the unpaid taxes per month (up to a maximum of $25\%$) for late filing, and a failure-to-pay penalty of $0.5\%$ of the unpaid taxes per month (up to a maximum of $25\%$) for late payments. Criminal penalties are more severe and include fines of up to \$100,000 for individuals (\$500,000 for corporations) and/or imprisonment for up to 5 years for tax evasion; fines of up to \$100,000 for individuals (\$500,000 for corporations) and imprisonment of up to 3 years for filing fraudulent tax returns; and fines of up to \$25,000 for individuals (\$100,000 for corporations) and imprisonment of up to 1 year for willfully failing to file tax returns. These penalties aim to deter negligence and intentional violations of tax laws, ensuring compliance with the U.S. tax system$^{8}$. In this study, we focus on intentional tax evasion. Therefore, the penalty for any tax evasion discovered by the tax authorities is $75\%$ of the underpaid taxes (civil penalty) plus a fixed monetary criminal penalty of \$100,000.

逃税处罚
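The penalty rule adopted in the study (75% of the underpaid taxes plus a fixed \$100,000) can be written as a short function. This is a sketch under stated assumptions: the function names are hypothetical, and weighting the penalty by a detection probability is our illustration of how an expected penalty could be computed, not a formula given in the paper.

```python
def evasion_penalty(underpaid_tax, rate=0.75, fixed_fine=100_000.0):
    """Penalty charged on discovered evasion: a share of the underpaid tax plus a fixed fine."""
    if underpaid_tax <= 0:
        # Nothing was evaded, so no penalty applies.
        return 0.0
    return rate * underpaid_tax + fixed_fine


def expected_penalty(underpaid_tax, detection_probability):
    """Penalty weighted by the probability that the evasion is discovered (assumption)."""
    return detection_probability * evasion_penalty(underpaid_tax)
```

Because of the fixed component, even a small underpayment carries a large penalty once it is discovered.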

Individual personality. For the personality $(\psi)$ of the individuals, we adopted Twitter data from Cha et al. (2010), which contains 55 million user accounts and 1.75 billion tweets overall. For a simulation with $N$ individuals, we picked $N$ unique accounts uniformly at random and used all their tweets as the personalities of the individuals, provided to the LLM in reverse order of their original publication.

个体性格 对于个体的性格 $(\psi)$,我们采用了 Cha 等人(2010)[20] 的 Twitter 数据,该数据包含 5500 万用户账户和 17.5 亿条推文。在进行 $N$ 个个体的模拟时,我们按照均匀分布随机选取了 $N$ 个唯一账户,并将他们的所有推文作为个体的性格,以大语言模型提供的原始发布顺序的逆序呈现。
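The sampling step described above can be sketched as follows. The data layout is an assumption made for illustration: `tweets_by_account` is a hypothetical dictionary mapping an account identifier to that account's tweets stored oldest-first, and `sample_personalities` is not a function from the paper.

```python
import random


def sample_personalities(tweets_by_account, n_agents, seed=None):
    """Pick n_agents distinct accounts uniformly at random; return their tweets newest-first."""
    rng = random.Random(seed)
    # Sampling without replacement guarantees unique accounts.
    accounts = rng.sample(sorted(tweets_by_account), n_agents)
    # Tweets are assumed to be stored oldest-first, so reversing gives
    # the reverse of their original publication order.
    return {
        account: list(reversed(tweets_by_account[account]))
        for account in accounts
    }
```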

Economy size. Balancing realistic economic size against the computational resources and time required to run a simulation, we chose a population size range between 10 and 1000 to allow a difference of two orders of magnitude while keeping the upper limit computationally feasible. Similarly, the number of simulation steps, each representing a day, is picked to range between 365 (one year) and 7300 (20 years) to investigate both short-term and long-term dynamics.

经济规模
在选择现实经济规模与计算资源及模拟运行时间之间进行权衡后,我们选择了人口规模范围在10到1000之间,以允许两个数量级的差异,同时保持计算上可行的上限。同样地,模拟步数(每步代表一天)选择在365(一年)到7300(20年)之间,以激发短期和长期的动态变化。

4.2 Evaluation metrics

4.2 评估指标

The emergence of the informal economy can be defined as the time it took since the establishment of the formal economy until some positive portion of the taxes are not reported (and paid) at a given step in time, denoted by $\delta$ . To this end, the size of the informal economy at each point in time is the total of unreported economic transactions, including the tax that should have been paid on them over time, denoted by $O$ . The normalized informal economy size is the size of the informal economy divided by the total economy, denoted by $\bar{O}$ .

非正式经济(informal economy)的出现可以定义为自正式经济(formal economy)建立以来,直至某个时间点上有一定比例的税款未申报(和缴纳)所经历的时间,用 $\delta$ 表示。因此,非正式经济在每个时间点的规模是未申报经济交易的总和,包括随着时间的推移应缴的税款,用 $O$ 表示。归一化的非正式经济规模是非正式经济规模除以总经济规模,用 $\bar{O}$ 表示。
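The two metrics defined above can be sketched in a few lines. The inputs are assumptions for illustration: `due` and `reported` are per-step series of taxes owed and taxes actually reported, `unreported` and `total` are per-step values of unreported and total economic activity, and steps are counted from zero.

```python
def emergence_time(due, reported):
    """Delta: first step at which some positive share of the tax due goes unreported.

    Returns None when the informal economy never emerges.
    """
    for step, (owed, declared) in enumerate(zip(due, reported)):
        if declared < owed:
            return step
    return None


def informal_economy_share(unreported, total):
    """O-bar: cumulative unreported activity divided by cumulative total activity."""
    total_activity = sum(total)
    if total_activity == 0:
        return 0.0
    return sum(unreported) / total_activity
```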

Table 3: The model’s parameters with their value ranges.

表 3: 模型参数及其取值范围。

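| Parameter | Description | Value range | Source |
|---|---|---|---|
| $N$ | Population size [1] | 10 - 1000 | Assumed |
| $m_0$ | Government’s initial funds [\$] | 0.05 N | |
| $\mu$ | Sales tax policy | 6.44% | US tax code |
| | Income tax policy | Table 2 | Guzman and Kollar (2023); IRS |
| $\nu$ | Efficiency with which the government converts taxes into public goods [\$] | $\nu(\tau) = f(\tau)$ | Assumed |
| | Enforcement policy for informal economic activity [\$] | 75% + \$100,000 | IRS |
| $\lvert G \rvert$ | Number of goods in the economy [1] | 204 | US Consumer Price Index |
| $\theta$ | Number of simulation steps between two salaries [t] | 30 | Cullen and Perez-Truglia (2022) |
| $s$ | Individual annual income [\$] | 18,980 - 316,100 | Census Bureau |
| $d$ | Desire for goods and their prices [1] | Table 6 | US Consumer Price Index |
| $\zeta$ | Propensity to take risks in economic activity [1] | [0, 1] | Assumed |
| $\eta$ | Individual’s planning horizon for future economic activity [t] | [1, 1095] | Assumed |
| $\upsilon$ | Individual’s cognitive ability [1] | [80%, 99%] | Assumed |
| $\psi$ | Agent’s personality [1] | Free text | Cha et al. (2010) |
| $T$ | Number of simulation steps [1] | 365 - 7300 | Assumed |
| $\Delta t$ | Duration of a simulation step [t] | 1 | Assumed |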

4.3 Results

4.3 结果

Building upon the model validation and exploration framework grounded in Newton’s methodology, the experimental design is divided into two distinct phases: model validation and model exploration (Grahek et al., 2021; Rathmanner and Hutter, 2011)$^{9}$. The first phase focuses on employing the model to replicate established economic dynamics, thereby substantiating the validity of the proposed framework. The second phase leverages the model to investigate contemporary economic questions of interest within the informal economy research community.

建立在基于牛顿方法的模型验证与探索框架之上,实验设计分为两个不同的阶段:模型验证和模型探索 (Grahek et al., 2021; Rathmanner and Hutter, 2011)。第一阶段侧重于利用模型复制已确立的经济动态,从而验证所提出框架的有效性。第二阶段则利用该模型探讨非正规经济研究领域中的当代经济问题。

4.3.1 Model validation

4.3.1 模型验证

Due to the “black box” nature of the decision-making mechanism, driven by both LLM and DRL models, it is essential to ensure that the mechanism produces outcomes consistent with established economic theory in scenarios where such predictions are available. To address this, we examined four specific configurations, assuming a simplified economy with a single individual ($|N|=1$) possessing sufficient resources to satisfy all desires. These configurations are defined as follows:

由于大语言模型 (LLM) 和深度强化学习 (DRL) 模型驱动的决策机制的“黑盒”性质,必须确保该机制在现有经济理论可预测的场景下能够产生与之一致的结果。为了解决这一问题,我们在假设一个简化经济体(仅包含单个个体 $\left|N\right|=1;$ ),且该个体拥有足够资源满足所有需求的情况下,考察了四种具体配置。这些配置定义如下:

For this analysis, it is assumed that the enforcement penalty is equivalent to the evaded taxes but does not contribute to the agent’s utility. Under these conditions, a rational agent is expected to refrain from tax evasion ($\bar{O}=0$) in both the first and second configurations. Conversely, in the third and fourth configurations, the agent’s equilibrium behavior may include $\bar{O} \in \{0, 0.5, 1\}$. The value $\bar{O}=0.5$ arises when the agent assigns equal utility to paying and not paying taxes, leading to an optimal policy of randomizing actions with a $50\%$ probability of compliance, so that $50\%$ of the taxes are paid on average. Similarly, if the agent begins with a stable policy of either full compliance ($\bar{O}=0$) or full evasion ($\bar{O}=1$), these policies remain optimal and unchanged.

对于这一分析,假设执法罚金等同于逃税金额,但不影响该智能体的效用。在这些条件下,预计理性的智能体在第一种和第二种配置中都会避免逃税 ($\bar{O}=0$)。相反,在第三种和第四种配置中,智能体的均衡行为可能包括 $\bar{O},=,0,0.5,1$。当智能体认为支付税款与否的效用相等时,$\bar{O}=0.5$ 会出现,从而导致最优策略是以 $50%$ 的概率随机化行动,平均支付 $50%$ 的税款。同样,如果智能体从一个完全合规 ($\bar{O}=0$) 或完全逃税 ($\bar{O}=1$) 的稳定策略出发,这些策略将保持最优且不变。
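The expected outcomes can be checked with a stylized payoff comparison. Everything in this sketch is an assumption made for illustration: the configuration values are our reading of Table 4 ($\nu(\tau)=2\tau$ with $P(\xi)=0$; $\nu(\tau)=\tau$ with $P(\xi)=1$; $\nu(\tau)=\tau$ with $P(\xi)=0$; $\nu(\tau)=0$ with $P(\xi)=1$), and the payoff model (paying costs the tax and returns the public goods value, evading costs an expected penalty equal to the evaded tax) is a simplification, not the reward function used by the DRL model.

```python
def net_payoff(pays_tax, tax, public_goods_value, detection_probability):
    """Expected net payoff of a single tax decision in a one-agent economy."""
    if pays_tax:
        # The agent gives up the tax and receives public goods in return.
        return public_goods_value - tax
    # The agent keeps the tax but expects a penalty equal to the evaded amount.
    return -detection_probability * tax


def rational_choice(tax, public_goods_value, detection_probability):
    """Label the payoff-maximizing action, or 'indifferent' on a tie."""
    pay = net_payoff(True, tax, public_goods_value, detection_probability)
    evade = net_payoff(False, tax, public_goods_value, detection_probability)
    if pay > evade:
        return "comply"
    if pay < evade:
        return "evade"
    return "indifferent"


# (public goods value as a function of tax, detection probability); assumed values.
CONFIGURATIONS = {
    "I": (lambda tax: 2 * tax, 0.0),
    "II": (lambda tax: tax, 1.0),
    "III": (lambda tax: tax, 0.0),
    "IV": (lambda tax: 0.0, 1.0),
}


def expected_behavior(tax=100.0):
    """Rational choice in each configuration for a given tax amount."""
    return {
        name: rational_choice(tax, value_fn(tax), probability)
        for name, (value_fn, probability) in CONFIGURATIONS.items()
    }
```

Under these assumptions, compliance is strictly better in the first two configurations, while the last two leave the agent indifferent, which is why any of $\bar{O}=0$, $0.5$, or $1$ is compatible with rational behavior there.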

Table 4 shows the size of the informal economy ($\bar{O}$) for each of these four configurations. A Mann-Whitney U test (McKnight and Najab, 2010) indicates that the simulation’s predictions and the expected outcome are statistically similar for configurations III and IV but not for configurations I and II. The differences between the simulation prediction and the expected outcome can be attributed to the stochastic nature of the LLM, which may occasionally suggest tax evasion to the DRL model; the DRL model, in turn, tries this new action as part of its exploration.

表 4 显示了非正规经济规模 $\left(\Bar O\right)$,分为这四种配置。Mann-Whitney U 检验 (McKnight and Najab, 2010) 表明,在配置 III 和 IV 下,模拟的预测与预期结果在统计上是相似的,而在配置 I 和 II 下则不然。模拟预测与预期结果之间的差异可能与大语言模型的随机性有关,这可能意味着深度强化学习 (DRL) 模型进行了逃税行为,并因此尝试新动作作为探索方法的一部分。

Table 4: An economy with a single ($|N|=1$) individual (agent) for four different configurations. The results are shown as the mean $\pm$ standard deviation of $n=100$ repetitions.

表 4: 单一个体 (agent) 经济 ($|N|=1;$ ) 的四种不同配置。结果显示为 $n=100$ 次重复的均值 $\pm$ 标准差。

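| Index | Configuration | Simulation prediction | Expected outcome |
|---|---|---|---|
| I | $\nu(\tau) = 2\tau$, $P(\xi) = 0$ | 0.0021 ± 0.0010 | 0 |
| II | $\nu(\tau) = \tau$, $P(\xi) = 1$ | 0.0019 ± 0.0010 | 0 |
| III | $\nu(\tau) = \tau$, $P(\xi) = 0$ | 0.5018 ± 0.0031 | 0 or 0.5 or 1 |
| IV | $\nu(\tau) = 0$, $P(\xi) = 1$ | 0.4972 ± 0.0047 | 0 or 0.5 or 1 |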

Following this preliminary “sanity check”, it is necessary to address a fundamental question regarding the decision-making mechanism of the proposed model: does the decision to engage in tax evasion stem from the bounded rationality of agents as reflected in the DRL model, or is it influenced by an implicit tax evasion strategy embedded within the LLM? To investigate this, we set the enforcement parameter to zero $(P(\xi) = 0)$ and observed whether the agent chose to evade taxes at least once over $n = 1000$ simulations. To ensure that tax evasion was not motivated as a rational strategy by the DRL model, we fixed the utility of public goods provision as $\nu(\tau)=\tau$.

在此初步的“合理性检查”之后,有必要解决一个关于所提出模型的决策机制的基本问题:逃税的决定是源于DRL模型中所体现的智能体的有限理性,还是受到LLM模型中嵌入的隐性逃税策略的影响?为了研究这一点,我们将执法参数设为零 $(P(\xi),=,0)$ ,并观察智能体在 $n,=,1000$ 次模拟中是否至少选择过一次逃税。为了确保逃税不是作为DRL模型中的理性策略,我们将公共物品提供的效用固定为 $\nu(\tau)=\tau$ 。

This analysis was conducted using three distinct personality archetypes: a law-abiding individual, a random individual, and a law-breaking individual, each defined by the personality parameter $(\psi)$ of the agent. Specifically, the law-abiding personality was instantiated by prompting the GPT-4o-mini LLM model with the following query (Huang et al., 2024):

本分析采用了三种不同的人格原型:守法者、随机者和违法者,每种人格原型都由AI智能体的人格参数 $(\psi)$ 定义。具体而言,守法者人格通过向 GPT-4o-mini 大语言模型提出以下查询来实例化 (Huang et al., 2024):

“Generate 20 tweets that emphasize the importance of law-abiding behavior, ethical decision-making, and respect for societal norms. Each tweet should be concise (250 characters or less), engaging, and suitable for a public audience on Twitter”

生成 20 条强调守法行为、道德决策和尊重社会规范重要性的推文。每条推文应简洁(250 个字符以内)、引人入胜,并适合 Twitter 上的公众受众。

This query was designed to produce text reflective of a law-abiding mindset. Similarly, the law-breaking personality was defined using the prompt:

该查询旨在生成反映守法心态的文本。同样,违法人格使用以下提示定义:

“Generate 20 tweets that embody the mindset of a law-breaking individual. Each tweet should hint at law-breaking behavior with an emphasis on tax evasion. Ensure the tone is encouraging, relatable, and appropriate for a public audience on Twitter (250 characters or less).”

生成20条体现违法心态的推文。每条推文应暗示违法行为,重点放在逃税上。确保语气具有鼓励性、相关性,并适合Twitter上的公众受众(250字以内)。

The random personality was derived by randomly sampling 20 tweets from Cha et al. (2010).

随机性格通过从 Cha 等人 (2010) 中随机抽取的 20 条推文得出。

Figure 3 presents histograms illustrating the distribution of the time until the first tax evasion decision $(\delta)$ for these three personality types. As anticipated, the law-abiding personality rarely engaged in tax evasion, with only $0.9\%$ of simulations resulting in such behavior, which may be attributed to the stochastic nature of both the LLM and DRL models. The random personality exhibited a higher frequency of tax evasion, with $3.3\%$ of simulations including at least one instance, approximately 3.7 times as often as the law-abiding personality. In contrast, the law-breaking personality demonstrated a strong tendency toward tax evasion, with $98.4\%$ of simulations involving at least one instance. While this value could theoretically approach $99.1\%$ to mirror the inverse behavior of the law-abiding personality, the slightly lower observed frequency may be explained by the inherent positivity bias commonly found in LLMs (Miah et al., 2024).

图 3: 展示了首次逃税决策 $(\delta)$ 的时间分布直方图,这些直方图是这三种人格类型的函数。正如预期的那样,守法型人格很少参与逃税行为,只有 $0.9%$ 的模拟结果出现了这种行为,这可能归因于大语言模型和DRL模型的随机性。随机型人格表现出更高的逃税频率,$3.3%$ 的模拟中至少有一次逃税行为,大约是守法型人格的3.5倍。相比之下,违法型人格表现出强烈的逃税倾向,$98.4%$ 的模拟中至少有一次逃税行为。虽然理论上这个值可能接近 $99.1%$ 以反映守法型人格的相反行为,但观察到的频率略低,这可能是由于大语言模型常见的固有积极性偏差(Miah et al., 2024)。

Furthermore, examining the $\delta$ values for each personality reveals distinct patterns. Law-abiding individuals typically performed their first act of tax evasion later in the simulation compared to the random personality. Following this trend, the law-breaking personality exhibited a significantly earlier onset of tax evasion, with most instances occurring within the interval $0<\delta<250$ .

此外,检查每种性格的$\delta$值揭示了不同的模式。与随机性格相比,守法个体通常在模拟中较晚进行首次逃税行为。遵循这一趋势,违法性格表现出明显更早的逃税行为,大多数发生在其间隔$0<\delta<250$内。


Figure 3: Histogram of the time until the first tax evasion decision $(\delta)$ of a single agent under three different personalities.

图 3: 三种不同性格下的单一 AI 智能体首次逃税决策 $(\delta)$ 的时间直方图

To conduct a more nuanced analysis, we utilized the personality data from Cha et al. (2010), incorporating the most recent 20 tweets while incrementally adding $k \in [0,20]$ synthetic tweets. These synthetic tweets conveyed the message: “I should perform tax evasion and pay less than the required amount of taxes.” The experiment was repeated $n=100$ times for each value of $k$. Consistent with the previous analysis, we ensured that the DRL model was not incentivized to adopt tax evasion as a rational strategy by setting $\nu(\tau)=\tau$.

为了进行更细致的分析,我们利用了 Cha 等 (2010) 的人格数据,结合最近的 20 条推文,同时逐步添加了 $k=[0,20]$ 条合成推文。这些合成推文传达的信息是:“我应该逃税并支付少于应缴的税额。” 对于每个 $k$ 值,实验重复了 $n=100$ 次。与之前的分析一致,我们通过设置 $\nu(\tau)=\tau$ 确保了 DRL 模型不会被激励将逃税视为合理策略。
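The construction of the personality input for a given $k$ can be sketched as below. The function name `build_personality`, the newest-first ordering of `recent_tweets`, and the choice to append the synthetic tweets after the real ones are assumptions; the text does not specify where the synthetic tweets are placed.

```python
SYNTHETIC_TWEET = (
    "I should perform tax evasion and pay less than the required amount of taxes."
)


def build_personality(recent_tweets, k, base_size=20, max_synthetic=20):
    """Personality text: the most recent real tweets plus k synthetic ones."""
    if not 0 <= k <= max_synthetic:
        raise ValueError("k is expected to lie in [0, 20]")
    # recent_tweets is assumed to be ordered newest-first.
    return list(recent_tweets[:base_size]) + [SYNTHETIC_TWEET] * k
```

Sweeping `k` from 0 to 20 and repeating each setting 100 times reproduces the shape of the experiment described above.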

Figure 4 illustrates the relationship between the number of synthetic messages provided to the LLM and two key metrics: the timing of the first occurrence of informal economic activity (tax evasion), denoted as $\delta$, and the share of the informal economy in overall economic activity, denoted as $\bar{O}$. This analysis examines how variations in the input data affect the agent’s behavioral outcomes, specifically focusing on the onset of tax evasion $(\delta)$ and the degree of engagement in informal economic activities over the simulation period ($\bar{O}$). The solid lines represent the mean values, while the shaded areas indicate the standard deviation across $n=100$ simulations.

图 4 展示了提供给大语言模型 (LLM) 的合成消息数量与两个关键指标之间的关系:首次出现非正规经济活动(逃税)的时间,记为 $\delta$,以及非正规经济在整个经济活动中的占比,记为 $\bar{O}$。该分析探讨了输入数据的变化如何影响 AI智能体的行为结果,特别关注逃税行为的出现时间 $(\delta)$ 以及模拟期间参与非正规经济活动的程度 $\left(\Bar O\right)$。实线表示均值,阴影区域表示 $n=100$ 次模拟的标准差。

For small values of $k$ , $\delta$ remains relatively high and stable, indicating that agents with minimal synthetic messages promoting tax evasion delay engaging in informal economic activity. This stability reflects the dominance of the original personality data, which is likely neutral or law-abiding in its disposition. As $k$ increases, a sharp decline in $\delta$ is observed. Agents exposed to an increasing number of synthetic messages begin engaging in tax evasion earlier in the simulation, suggesting that external inputs strongly influence the timing of informal economic activity. Beyond $k\approx10$ , $\delta$ plateaus at a low value, indicating that agents consistently perform tax evasion very early in the simulation when exposed to a sufficient number of synthetic messages.

对于较小的 $k$ 值,$\delta$ 保持相对较高且稳定,表明那些接收到最少鼓励逃税合成消息的智能体延迟了参与非正式经济活动的行为。这种稳定性反映了原始人格数据的主导地位,其倾向可能是中性或守法的。随着 $k$ 的增加,$\delta$ 急剧下降。接触到越来越多合成消息的智能体在模拟中更早地开始逃税,这表明外部输入强烈影响了非正式经济活动的时间点。当 $k\approx10$ 时,$\delta$ 在一个较低的值上趋于平稳,表明当智能体接触到足够数量的合成消息时,它们会持续在模拟的早期进行逃税。

Moreover, for small $k$, the share of the informal economy ($\bar{O}$) remains close to zero, consistent with the delayed tax evasion seen in high $\delta$ values. As $k$ increases, $\bar{O}$ rises exponentially, showing that even moderate increases in synthetic messages significantly influence the agent’s participation in informal economic activity. When $k$ exceeds approximately 10, $\bar{O}$ saturates near 1. This indicates near-total engagement in informal economic activity, where tax evasion becomes the agent’s dominant strategy. The saturation suggests that additional synthetic messages have a diminishing impact once a tipping point is reached.

此外,对于较小的 $k$,非正规经济的份额 $\left(\bar{O}\right)$ 仍然接近于零,这与在 $\delta$ 值较高时观察到的延迟逃税现象一致。随着 $k$ 的增加,$\bar{O}$ 呈指数级上升,表明即使在合成信息量适度增加的情况下,也会显著影响智能体参与非正规经济活动。当 $k$ 超过大约 10 时,$\bar{O}$ 接近 1 并趋于饱和。这表明智能体几乎完全参与非正规经济活动,逃税成为其主导策略。饱和现象表明,一旦达到临界点,额外的合成信息将产生递减的影响。

The figure reveals an inverse relationship between $\delta$ and $\bar{O}$ . As $\delta$ decreases with increasing $k$ , $\bar{O}$ rises sharply, indicating that earlier decisions to evade taxes correspond to greater long-term engagement in the informal economy. This pattern underscores the compounding effect of early informal economic activity on overall economic outcomes.

图揭示了 $\delta$ 和 $\bar{O}$ 之间的反向关系。随着 $\delta$ 随 $k$ 的增加而减小, $\bar{O}$ 急剧上升,这表明较早的逃税决策对应着在非正规经济中更大的长期参与度。这种模式突显了早期非正规经济活动对整体经济结果的复合效应。

4.3.2 Model exploration

4.3.2 模型探索

Extending the analysis from the perspective of a single agent to the entire population, we investigate how the informal economy emerges, if at all, among heterogeneous agents within the population. Drawing on the work of Guyton et al. (2021), which highlights differences in tax evasion behavior based on income levels, we examine the distribution of income and sales tax evasion across individuals grouped by their annual income decile. For this analysis, the population size was fixed at $N = 1000$ agents, with a simulation duration of $T=3650$ days (10 years). The experiment was repeated $n=100$ times to ensure the robustness of the results.

从单个智能体的分析扩展到整个群体的视角,我们研究了在异质性智能体构成的群体中,非正规经济(informal economy)是否以及如何出现。借鉴Guyton等(2021)的研究,该研究强调了基于收入水平的逃税行为差异,我们考察了按年收入十分位分组的人群中所得税和销售税逃税的分布。在此分析中,群体规模固定为$N,=,1000$个智能体,模拟时长为$T=3650$天(10年)。实验重复了$n=100$次以确保结果的稳健性。


Figure 4: The dynamics of the first occurrence of informal economic activity (tax evasion), denoted as $\delta$, and the share of the informal economy in overall economic activity ($\bar{O}$), with respect to the number of synthetic messages provided to the LLM as part of the agent’s personality.

图4:首次出现的非正规经济活动(逃税)的动态,表示为$\delta$,以及非正规经济在整个经济活动中的份额$\left(\Bar O\right)$,与作为AI智能体个性的一部分提供给大语言模型的合成消息数量的关系。

The analysis focuses on four distinct cases representing different perceptions of the efficiency of public goods provision. The first case considers public goods provision as inefficient, where the value of public goods (as the agents perceive it) is less than the amount of taxes paid $(\nu(\tau)=0.75\tau)$. The second case assumes public goods provision is beneficial for the entire population, with the value exceeding the taxes paid $(\nu(\tau)=1.25\tau)$. The third case introduces a logarithmic “capitalist” function, where the perceived value of public goods grows logarithmically with taxes paid $(\nu(\tau)=\tau\ln(\tau))$. Finally, the fourth case models a “socialist” function that benefits lower-income groups at the expense of higher earners $(\nu(\tau)=\tau\cdot(\tau^{*}-\ln(\tau)))$, where $\tau^{*}$ represents the total tax contribution of the top $1\%$ of income earners, assuming they spend their entire income.

分析聚焦于四种不同的情况,这些情况代表了对公共物品提供效率的不同认知。第一种情况认为公共物品的提供效率低下,即公共物品的价值(如智能体所感知的)小于支付的税款 $(\nu(\tau)=0.75\tau)$。第二种情况假设公共物品的提供对整个群体有利,其价值超过支付的税款 $(\nu(\tau)=1.25\tau)$。第三种情况引入了一个对数的“资本主义”函数,其中公共物品的感知价值随着支付的税款对数增长 $(\nu(\tau)=\tau\ln(\tau))$。最后,第四种情况模拟了一个“社会主义”函数,该函数以高收入群体为代价使低收入群体受益 $(\nu(\tau)=\tau\cdot(\tau^{*}-\ln(\tau)))$,其中 $\tau^{*}$ 代表假设收入最高的 $1%$ 人群将其全部收入用于缴税时的总税款贡献。
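The four perceived-value functions can be written directly from the formulas above. The function names are hypothetical, `tau_star` must be supplied by the caller because its numerical value is not given in this section, and taxes are assumed to be strictly positive so that the logarithm is defined.

```python
import math


def inefficient(tax):
    """Case 1: public goods are worth less than the taxes paid."""
    return 0.75 * tax


def beneficial(tax):
    """Case 2: public goods are worth more than the taxes paid."""
    return 1.25 * tax


def capitalist(tax):
    """Case 3: nu(tau) = tau * ln(tau); requires tax > 0."""
    return tax * math.log(tax)


def socialist(tax, tau_star):
    """Case 4: nu(tau) = tau * (tau_star - ln(tau)); requires tax > 0."""
    return tax * (tau_star - math.log(tax))
```

As written, the “capitalist” function returns less than the tax paid for $\tau < e$ and more than the tax paid above it, so its effect depends on the units in which taxes are measured.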

In all cases, the probability of being detected, audited, and penalized for tax evasion was set to $P(\xi) = 0.1$. The results are presented in Fig. 5, which shows, for each income decile and under each perception of public goods provision efficiency, the distributions of the first occurrence of tax evasion $(\delta)$ and the normalized share of the informal economy $(\bar{O})$, reported as the mean $\pm$ standard deviation across $n=100$ simulations. The analysis reveals distinct patterns for each of the four cases examined.
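The per-decile reporting described above (group agents by income decile, then report mean $\pm$ standard deviation across repeated runs) could be computed along the following lines. This is a sketch under assumptions: the helper names are hypothetical, deciles are assigned by income rank, and the sample standard deviation is used, none of which the text specifies.

```python
from statistics import mean, stdev

def decile_index(incomes):
    """Assign each agent an income decile 0..9 by rank (assumed convention)."""
    n = len(incomes)
    order = sorted(range(n), key=lambda i: incomes[i])
    deciles = [0] * n
    for rank, agent in enumerate(order):
        deciles[agent] = min(9, rank * 10 // n)
    return deciles

def decile_means(incomes, values):
    """Mean of a per-agent metric (e.g. delta or O-bar) within each decile.

    Assumes at least 10 agents so that no decile is empty.
    """
    buckets = [[] for _ in range(10)]
    for d, v in zip(decile_index(incomes), values):
        buckets[d].append(v)
    return [mean(b) for b in buckets]

def summarize_runs(per_run_decile_means):
    """Combine runs into (mean, sample std) per decile. Needs >= 2 runs."""
    return [(mean(col), stdev(col)) for col in zip(*per_run_decile_means)]

if __name__ == "__main__":
    incomes = list(range(1, 21))
    shares = [0.01 * x for x in incomes]
    print(decile_means(incomes, shares))
```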

When public goods are perceived as inefficient $(\nu(\tau)=0.75\tau)$ , the timing of tax evasion $(\delta)$ is relatively uniform across income deciles, with no significant differences between lower-, middle-, or higher-income groups. However, the share of the informal economy $\left(\Bar O\right)$ increases markedly with income, with higher-income groups dominating informal economic activity. Lower-income groups contribute minimally to the informal economy due to their smaller tax burdens.

In the beneficial public goods case $(\nu(\tau)=1.25\tau)$, tax evasion $(\delta)$ is delayed across all income deciles, with no significant variation between groups. The share of the informal economy $(\bar{O})$ remains minimal for all income deciles, with only minor contributions from the highest income groups. This demonstrates the suppressive effect of aligning public goods provision with taxpayer expectations on informal economic activity.

For the logarithmic "capitalist" growth case $(\nu(\tau)=\tau\ln(\tau))$, tax evasion $(\delta)$ occurs earliest among all scenarios, indicating that informal economic activity begins quickly after the simulation starts, regardless of income decile. While $\delta$ is nearly uniform across income deciles, the early appearance reflects the logarithmic nature of the perceived value of public goods, where the marginal returns diminish quickly even for smaller contributions. The share of the informal economy $(\bar{O})$ remains consistently low for all income deciles, with only a slight upward trend in higher-income groups. This suggests that while tax evasion begins early, its scale is limited due to the proportional relationship between contributions and perceived benefits.

In the socialist redistribution case $(\nu(\tau)=\tau\cdot(\tau^{*}-\ln(\tau)))$, tax evasion occurs disproportionately earlier in higher-income deciles, while lower-income groups exhibit delayed engagement. The share of the informal economy $(\bar{O})$ is heavily concentrated in the top income deciles, with minimal contributions from lower-income groups. This reflects the redistributional pressures of the tax system, which incentivize wealthier individuals to evade taxes.


Figure 5: Distributions of the first occurrence of tax evasion $(\delta)$ and the normalized share of the informal economy $(\bar{O})$ for each income decile. The results are reported as the mean $\pm$ standard deviation across $n=100$ simulations.

Moreover, we were also interested in the interplay between the effectiveness of converting taxes to public goods and the level of enforcement (Carrillo et al., 2021). Fig. 6 shows the first occurrence of tax evasion $(\delta)$ and the normalized share of the informal economy $(\bar{O})$ with respect to the effectiveness of converting taxes to public goods $(\nu(\tau)/\tau)$ and the probability of being detected for tax evasion $(P(\xi))$, as the mean of $n=10$ simulations.
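A two-parameter sweep of this kind can be organized as a grid over $\nu(\tau)/\tau$ and $P(\xi)$, averaging each cell over repeated runs. The sketch below is illustrative only: `run_simulation` is a hypothetical callable standing in for the simulator, and the grid values in the usage example are assumptions (the text does not state the grid resolution).

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical simulator signature: (efficiency, detection_prob, seed)
# -> (delta, o_bar). The real simulator is not described at this level.
Simulator = Callable[[float, float, int], Tuple[float, float]]

def sweep(
    run_simulation: Simulator,
    efficiencies: Sequence[float],
    detection_probs: Sequence[float],
    n_runs: int = 10,
) -> Dict[Tuple[float, float], Tuple[float, float]]:
    """Return mean (delta, o_bar) for every (nu/tau, P(xi)) grid cell."""
    grid = {}
    for e in efficiencies:
        for p in detection_probs:
            runs = [run_simulation(e, p, seed) for seed in range(n_runs)]
            grid[(e, p)] = (
                mean(r[0] for r in runs),  # mean first-evasion time
                mean(r[1] for r in runs),  # mean informal-economy share
            )
    return grid

if __name__ == "__main__":
    # Toy stand-in for the simulator, used only to exercise the sweep.
    def toy(e: float, p: float, seed: int) -> Tuple[float, float]:
        return (1000.0 * e * (1.0 + p), max(0.0, 1.0 - e - p))

    result = sweep(toy, [0.5, 1.0, 1.5], [0.0, 0.1, 0.2])
    for cell, (delta, o_bar) in sorted(result.items()):
        print(cell, round(delta, 1), round(o_bar, 3))
```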

The heatmap of $\delta$ (Figure 6a) reveals that the time to the first occurrence of tax evasion increases with higher values of both $\nu(\tau)/\tau$ and $P(\xi)$. When public goods provision is perceived as more effective ($\nu(\tau)/\tau$ is high), tax evasion occurs later, suggesting that improved public goods provision delays the onset of informal economic activity. Similarly, higher probabilities of detection $(P(\xi))$ delay the timing of tax evasion, indicating that stronger enforcement mechanisms discourage early engagement in informal activity. The combined effect of $\nu(\tau)/\tau$ and $P(\xi)$ is additive, with the longest delays in tax evasion observed in regions where both factors are high, while the earliest tax evasion occurs in regions where both factors are low.


Figure 6: Heatmaps of time to occurrence of tax evasion $(\delta)$ and the normalized share of the informal economy $\left(\Bar O\right)$ with respect to the effectiveness of converting taxes to public goods $(\nu(\tau)/\tau)$ and the chance to be detected for tax evasion $(P(\xi))$ . The results are shown as the mean of $n=10$ simulations.

The heatmap of $\bar{O}$ (Figure 6b) reveals an inverse relationship between the effectiveness of converting taxes to public goods $(\nu(\tau)/\tau)$ and the normalized share of the informal economy. The figure shows three main parts in the dynamics: $0.5 \leq \nu(\tau)/\tau < 1$, $\nu(\tau)/\tau = 1$, and $1 < \nu(\tau)/\tau \leq 1.5$. In the first case, $\bar{O}$ decreases as $\nu(\tau)/\tau$ and $P(\xi)$ increase, aligning with classical economic theory. When $\nu(\tau)/\tau = 1$ and $P(\xi)=0$, one can observe equilibria where individuals either pay or do not pay taxes. When $P(\xi)>0$, a value near zero can be associated with a population equilibrium where paying taxes is equally beneficial to not paying them, while tax evasion results in a constant and higher penalty. Finally, for $1 < \nu(\tau)/\tau \leq 1.5$, an increase in $\bar{O}$ is observed compared to $\nu(\tau)/\tau = 1$. This part of the dynamics can be attributed to the fact that most of the population follows classical economic logic, recognizing that paying taxes is beneficial, while a very small portion (less than 1 percent) can be classified as "free riders". Notably, while noisy, a slight increase in $\bar{O}$ occurs as $\nu(\tau)/\tau$ increases, while $\bar{O}$ decreases as $P(\xi)$ increases. This suggests that some individuals may attempt to evade taxes, knowing that others will cover their needs since public goods are available even to those who do not contribute.

5 Discussion

This research presents a novel computational framework that uniquely captures the emergence of informal economic activity without presupposing its existence or "hinting" to agents that tax evasion is an option. By using an agent-based simulation approach where each agent's decision-making process is powered by a combination of Large Language Model (LLM) and Deep Reinforcement Learning (DRL) models, the study demonstrates that informal economic behaviors arise organically from the interactions between individual decision-making processes, external narratives, and policy configurations. This approach provides a robust tool for exploring the socio-economic dynamics of tax compliance and informal economic activity.

The model validation phase confirmed the robustness of the simulation in reproducing theoretical expectations across multiple configurations. Configurations I and II resulted in normalized informal economy sizes $(\bar{O})$ close to zero, consistent with rational agents avoiding tax evasion when the utility of paying taxes is higher than or equal to the alternative and when detection probabilities are high. These results align with the assumption that rational agents comply when their cost-benefit analysis favors compliance. The small deviations observed in $\bar{O}$ ($0.0021\pm0.0010$ and $0.0019\pm0.0010$) are attributable to the stochasticity inherent in the LLM and DRL components, which adds realism to the simulated agent decision-making processes. Configurations III and IV, where tax evasion becomes a more attractive option under theoretical conditions, successfully reproduced the expected equilibrium states of $\bar{O}=0.5$.

This confirms the model’s capacity to simulate diverse behavioral equilibria, providing a strong foundation for further exploration of tax compliance dynamics (see Table 4).

The personality analysis highlighted the significant role of agent traits in shaping tax evasion behavior. By isolating the impact of personality-driven decision-making, the study examined whether tax evasion behaviors emerged from bounded rationality (reflected in the DRL model) or implicit strategies derived from LLM-based personality parameters. When enforcement probabilities were set to zero ($P(\xi)=0$) and tax payments offered no additional utility compared to non-payment ($\nu(\tau)=\tau$), rational incentives for tax evasion were eliminated, leaving personality as the primary driver. The results demonstrate distinct behavioral differences across personality types, as shown in Fig. 3. Law-abiding agents exhibited extremely low rates of tax evasion ($0.9\%$), suggesting a strong ethical predisposition shaped by prompts emphasizing societal norms and compliance. By contrast, law-breaking personalities exhibited an overwhelming inclination toward tax evasion, with $98.4\%$ of simulations resulting in non-compliance. This high rate underscores the effectiveness of law-breaking prompts in influencing decision-making. The slight deviation from the theoretical mirror of the law-abiding personality ($99.1\%$) likely reflects the positive bias in LLM training, which inherently leans toward constructive outputs, as noted by Miah et al. (2024). Average personalities, characterized by neutral prompts, displayed tax evasion rates of $3.3\%$, a figure that reflects their lack of predisposition toward either compliance or non-compliance.

The timing of tax evasion $(\delta)$ further highlights the influence of personality. Law-abiding agents delayed tax evasion significantly longer, reflecting their normative adherence to societal rules. Average personalities exhibited earlier tax evasion than law-abiding agents, consistent with their neutral stance. Law-breaking personalities engaged in tax evasion much earlier, with most instances occurring within the first 250 simulation steps ($0 < \delta < 250$). These findings reinforce socio-economic theories that emphasize the interplay between individual morality and perceived enforcement in shaping tax compliance behaviors.

The introduction of synthetic messages to manipulate agent personalities, as shown in Fig. 4, revealed the profound impact of external narratives on compliance decisions. Agents exposed to pro-evasion messages $(k)$ exhibited significantly earlier tax evasion $(\delta)$ and greater participation in the informal economy $\left(\Bar O\right)$ . The results demonstrated a nonlinear relationship, with a tipping point observed at $k\approx10$ , where behavioral shifts became entrenched. Beyond this point, additional messages had limited impact, reflecting a saturation effect where agents fully internalized the tax evasion strategy. The inverse relationship between $\delta$ and $\bar{O}$ emphasizes the importance of early intervention. Preventing initial acts of tax evasion could significantly reduce long-term informal economic activity, as early engagement tends to compound over time.

The results of Fig. 5 and Fig. 6 highlight the critical role of public goods provision efficiency and enforcement in shaping informal economic activity. In the inefficient public goods scenario $(\nu(\tau)=0.75\tau)$, dissatisfaction with the perceived value of public goods disproportionately drives higher-income groups into the informal economy $(\bar{O})$. This aligns with economic theories that suggest individuals are less inclined to comply with taxes when the perceived benefits are inadequate relative to their contributions (Noguera et al., 2014; Bazart and Bonein, 2014; Traxler, 2010). Conversely, the beneficial public goods scenario $(\nu(\tau)=1.25\tau)$ demonstrates the suppressive effect of exceeding taxpayer expectations. Here, delayed tax evasion $(\delta)$ and minimal informal activity $(\bar{O})$ reflect the potential of effective public spending to foster compliance across all income deciles.

The logarithmic public goods scenario $(\nu(\tau)=\tau\ln(\tau))$ offers a nuanced insight into compliance behavior. While tax evasion $(\delta)$ occurs earlier than in other scenarios, the consistently low levels of informal activity $(\bar{O})$ suggest that the proportional relationship between taxes and public goods utility effectively balances incentives across income groups. This model minimizes the magnitude of informal activity, even if it cannot entirely suppress its initiation. In contrast, the socialist redistribution scenario $(\nu(\tau)=\tau\cdot(\tau^{*}-\ln(\tau)))$ illustrates the trade-offs of progressive taxation. Lower-income groups benefit from delayed tax evasion and reduced informal activity, while higher-income groups engage in significantly earlier evasion and dominate informal economic participation. This highlights the risks of overly redistributive policies, which may incentivize wealthier taxpayers to evade taxes and undermine revenue collection.

The interaction between public goods provision and enforcement, as depicted in Fig. 6, underscores the importance of a balanced approach. Public goods provision perceived as efficient reduces the utility of evasion, while enforcement increases the perceived risks. However, neither strategy alone is sufficient. For example, enforcement alone yields diminishing returns, especially when public goods provision is inadequate. The optimal policy combination lies in integrating efficient public goods provision with robust enforcement mechanisms, as this minimizes informal economic activity and delays tax evasion behavior.

Despite its contributions, the study acknowledges several limitations. The assumption of an isolated economy and static demographics neglects potential interdependencies with external economies and population dynamics. Additionally, the simplification of agent memory and the omission of cultural and institutional factors constrain the model’s applicability to real-world scenarios. Future research should address these gaps by incorporating richer socio-economic contexts and exploring variables such as gender differences, progressive versus regressive taxation, and the dynamic effects of policy interventions on compliance behavior.

6 Conclusion

This study presents a novel computational framework that integrates LLMs and DRL models in an agent-based simulation to examine tax compliance and the dynamics of informal economies. The findings underscore the pivotal role of personality traits, external narratives, and policy factors in shaping compliance behavior. The model demonstrates that effective public goods provision, aligned with societal and individual expectations, emerges as a critical factor in reducing evasion and fostering trust in the tax system. Additionally, enforcement mechanisms, while essential, exhibit diminishing returns when public goods provision is perceived as inefficient.

From a policy perspective, the results emphasize the importance of achieving a balance between robust enforcement and efficient public goods provision. Policymakers should focus on improving the perceived value of taxation by ensuring that public goods are transparent, accessible, and equitable. Investments in enforcement should complement these efforts, particularly in targeting populations before behavioral shifts become entrenched.

The study also highlights the significant impact of external narratives on compliance decisions, suggesting the need for proactive measures to counteract tax-evading messages. Targeted interventions, such as behavioral nudges and messaging campaigns emphasizing the societal benefits of tax compliance, could play a crucial role in mitigating informal economic activity.

This research is distinguished by its unique ability to demonstrate the emergence of informal economic activity without assuming its existence in advance or "hinting" to agents that tax evasion is an option. By constructing a model where informal economic behaviors emerge organically from agent interactions and decision-making processes, the study provides a rigorous framework for investigating the underlying mechanisms driving compliance and evasion.

Taken jointly, constructing models without presupposing the phenomenon allows for a more rigorous examination of the underlying mechanisms leading to its emergence. This approach enhances the explanatory power of models and avoids circular reasoning. This study highlights the pivotal role of the utility derived from taxation in influencing tax compliance and informal economic activity. Adequate and equitable public goods provision, aligned with societal and individual expectations, emerges as a critical factor in reducing evasion and fostering trust in the tax system. Policymakers must recognize that taxation is not merely a fiscal mechanism but a social contract that depends on perceived fairness, transparency, and mutual benefit. By addressing these dimensions, governments can enhance compliance, reduce informal economic activity, and build more sustainable and equitable tax systems.

Declarations

Funding

None.

Conflicts of interest/Competing interests

None.

Contribution statement

Teddy Lazebnik: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Data Curation, Writing - Original Draft, Writing - Review & Editing, Visualization, Project administration. Labib Shami: Conceptualization, Validation, Data Curation, Formal analysis, Investigation, Writing - Original Draft, Writing - Review & Editing.

References

micro simulation. Computers, Environment and Urban Systems 91, 101717.

Appendix

Decision making mechanism

The state of each agent, $a\in A$, includes the following elements: current amount of money $(\beta)$, number of simulation steps between two salaries $(\theta)$, salary $(s)$, desired goods distribution $(d)$, risk-taking propensity parameter $(\zeta)$, planning horizon $(\eta)$, cognitive ability $(\upsilon)$, personality $(\psi)$, sales tax policy $(\mu)$, income tax policy $(\lambda)$, personal public goods utility $(\nu)$, and enforcement policy $(\xi)$. The DRL agent's decision-making is implemented using a neural network, the DQN model, that maps the state $s$ to an action $a$ that the agent should take. The only action formally provided to the agent is the amount of taxes it should pay, ranging from zero to the full tax amount. The DRL agent's reward $R$ is a function of the utility derived from public goods $(U)$, potential penalties for tax evasion $(\psi)$, and the total cost of transactions and interactions $(C)$. Thus, the reward function can be represented as: $R=U-\psi-C$.
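The reward $R=U-\psi-C$ and the single tax-payment action can be sketched as follows. This is an illustration rather than the authors' code: the class and function names are hypothetical, and the even discretization of the payment range is an assumption made because a DQN selects among a finite set of actions (the text only states that the payment ranges from zero to the full tax amount).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RewardTerms:
    """Components of the DRL reward; field names are illustrative."""
    public_goods_utility: float   # U
    evasion_penalty: float        # psi (penalty term in the reward)
    transaction_cost: float       # C

def reward(terms: RewardTerms) -> float:
    """R = U - psi - C, as stated in the text."""
    return (
        terms.public_goods_utility
        - terms.evasion_penalty
        - terms.transaction_cost
    )

def tax_actions(full_tax: float, n_levels: int = 11) -> List[float]:
    """Candidate payments from 0 to full_tax (assumed even spacing)."""
    if n_levels < 2:
        raise ValueError("n_levels must be at least 2")
    return [full_tax * i / (n_levels - 1) for i in range(n_levels)]

if __name__ == "__main__":
    print(reward(RewardTerms(10.0, 2.0, 3.0)))
    print(tax_actions(100.0, n_levels=5))
```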

For the learning process, we model LLM training over time as growth of the LLM's context, which accumulates relevant context, decisions, and rewards (i.e., outcomes) that operate as guidelines for the LLM. In contrast, the DRL model makes its decisions and is trained with respect to $\zeta$, $\eta$, and $\upsilon$. Namely, the $\zeta$ parameter operates as the coefficient for the exploration-exploitation trade-off of the $\epsilon$-greedy algorithm used to train the DRL model. The parameter $\eta$ implicitly defines the planning horizon through the discount factor $\gamma \in [0,1]$, chosen as $\gamma = 0.01^{1/(\eta+1)}$ so that a reward $\eta+1$ steps ahead contributes at most one percent to the loss function of the DRL. The parameter $\upsilon$ defines a random variable, $\Upsilon$, which is normally distributed with a mean equal to zero and a standard deviation of $1-\upsilon$. The reward value for each state and action pair is not used by the DRL as-is; instead, a random sample from $\Upsilon$ is added to it.
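The mapping from the planning horizon $\eta$ to the discount factor, and the cognitive-ability noise on the reward, follow directly from the formulas above. The sketch below uses hypothetical function names and Python's standard `random` module; it does not cover the $\epsilon$-greedy role of $\zeta$, whose exact form is not given.

```python
import random

def discount_factor(eta: int) -> float:
    """gamma = 0.01 ** (1 / (eta + 1)), so that gamma ** (eta + 1) == 0.01."""
    if eta < 0:
        raise ValueError("eta must be non-negative")
    return 0.01 ** (1.0 / (eta + 1))

def noisy_reward(reward: float, upsilon: float, rng: random.Random) -> float:
    """Add a sample from Upsilon ~ Normal(0, 1 - upsilon) to the reward.

    upsilon is the cognitive ability in [0, 1]; upsilon == 1 means no noise.
    """
    sigma = 1.0 - upsilon
    return reward + rng.gauss(0.0, sigma)

if __name__ == "__main__":
    rng = random.Random(42)  # seeded for reproducibility of the demo
    for eta in (0, 9, 99):
        gamma = discount_factor(eta)
        print(eta, round(gamma, 4), round(gamma ** (eta + 1), 4))
    print(noisy_reward(5.0, upsilon=0.8, rng=rng))
```

A longer horizon gives a $\gamma$ closer to 1, so distant rewards are discounted more slowly; in every case the weight at step $\eta+1$ is $0.01$.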

Synthetic tweets for the second experiment

The 20 tweets generated by GPT-4-mini for the law-preserving and law-breaking individuals. First, the law-preserving individual:

List of goods in the economy

Table 6 outlines the cost weights based on the relative importance of components in the consumer price indexes for the average US city in December 2023.

Table 5: Cost Weights Based on Relative Importance of Components in the Consumer Price Indexes: U.S. City Average, December 2023

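Columns: Item, CPI-U Relative Importance, CPI-U Cost Weight

Items (in the order extracted): Other bakery products; Uncooked ground beef; Uncooked beef roasts; Bacon, breakfast sausage, and related products; Other pork including roasts, steaks, and ribs; Uncooked other beef and veal; Uncooked beef steaks; Pork chops; Chicken; Ham; Bananas; Apples; Eggs; Milk; Flour and prepared flour mixes; Cakes, cupcakes, and cookies; Fresh biscuits, rolls, muffins; Rice, pasta, cornmeal; Breakfast cereal; Bread; Other uncooked poultry including turkey; Cheese and related products; Processed fish and seafood; Fresh fish and seafood; Other meats; Ice cream and related products; Canned fruits and vegetables; Frozen fruits and vegetables; Other fresh vegetables; Other fresh fruits; Citrus fruits; Tomatoes; Potatoes; Lettuce; Nonfrozen noncarbonated juices and drinks; Carbonated drinks; Other beverage materials including tea; Coffee.

Values (CPI-U Relative Importance, CPI-U Cost Weight): 0.397, 36,322,721,000; 0.068, 6,221,524,000; 0.324, 29,643,732,000; 0.186, 17,017,698,000; 0.112, 10,247,216,000.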