Generative AI in Action: Field Experimental Evidence from Alibaba's Customer Service Operations

arXiv cs.HC | Submitted on 8 Feb 2026 | April 1, 2026

Abstract: In collaboration with Alibaba, this study leverages a large-scale field experiment to assess the impact of a generative AI assistant on worker performance in e-commerce after-sales service. Human agents providing digital chat support were randomly assigned access to a gen AI assistant that offered two core functions: diagnosis of customer issues and solution proposals, presented as text messages. Agents retained discretion to adopt, modify, or disregard AI-generated messages. To evaluate gen AI's impact, we estimate both the intention-to-treat (ITT) effect of gen AI access and the local average treatment effect (LATE) of gen AI usage. Results show that gen AI significantly improved service speed, measured by issue identification time and chat duration. Gen AI also improved subjective service quality, as reflected in customer ratings and dissatisfaction rates, but it had no significant effect on objective service quality, as indicated by customer retrial rates. The performance improvements stemmed not only from automation but also from changes in the dynamics of agent-customer interactions: agent communication became more informative and efficient, while customers experienced reduced communication burdens. Low performers achieved the greatest improvements in both service speed and quality, narrowing the performance gap. In contrast, top-performing agents showed little improvement in service speed but experienced declines in both subjective and objective service quality. Evidence suggests that this decline results from an increased tendency to multitask, proxied by longer shift-away times across concurrent chats, which slowed responses to customers and raised abandonment and retrial rates. These findings suggest that gen AI reshapes work, demanding tailored deployment strategies.
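The distinction between the two estimands in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimation code and uses no data from the study: the function names, the 60% take-up rate, the 300-second baseline and the -30-second effect are all hypothetical values chosen for illustration. It also assumes one-sided noncompliance (agents without access never use the assistant) and uses the simple Wald ratio, which is one standard way to obtain a LATE; the paper's actual specification may differ.

```python
import random
from statistics import mean


def wald_estimates(assigned, used, outcome):
    """Return (itt, first_stage, late) from three parallel lists.

    assigned: 1 if the agent was randomly given access, else 0
    used:     1 if the agent actually used the assistant, else 0
    outcome:  a numeric outcome, e.g. chat duration in seconds
    """
    y1 = [y for z, y in zip(assigned, outcome) if z == 1]
    y0 = [y for z, y in zip(assigned, outcome) if z == 0]
    d1 = [d for z, d in zip(assigned, used) if z == 1]
    d0 = [d for z, d in zip(assigned, used) if z == 0]

    # ITT: effect of being offered access, regardless of actual usage.
    itt = mean(y1) - mean(y0)
    # First stage: how much assignment shifts actual usage.
    first_stage = mean(d1) - mean(d0)
    if first_stage == 0:
        raise ValueError("assignment does not change usage; LATE is undefined")
    # Wald estimator: ITT rescaled by the share of compliers.
    late = itt / first_stage
    return itt, first_stage, late


def simulate(n=20000, take_up=0.6, effect=-30.0, seed=0):
    """Generate synthetic data (all parameters are hypothetical)."""
    rng = random.Random(seed)
    assigned, used, outcome = [], [], []
    for _ in range(n):
        z = rng.randint(0, 1)                 # random assignment of access
        complier = rng.random() < take_up     # would use the tool if offered
        d = 1 if (z == 1 and complier) else 0  # one-sided noncompliance
        y = rng.gauss(300.0, 40.0) + effect * d
        assigned.append(z)
        used.append(d)
        outcome.append(y)
    return assigned, used, outcome


assigned, used, outcome = simulate()
itt, first_stage, late = wald_estimates(assigned, used, outcome)
print(f"ITT: {itt:.1f}  first stage: {first_stage:.2f}  LATE: {late:.1f}")
```

Because only a fraction of agents with access actually use the assistant, the ITT is diluted relative to the LATE: in this setup the ITT is roughly the take-up rate times the per-user effect, and dividing by the first stage recovers the effect among agents who use the tool because they were given access.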

Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.29888 [cs.HC]

(or arXiv:2603.29888v1 [cs.HC] for this version)

DOI: https://doi.org/10.48550/arXiv.2603.29888 (arXiv-issued DOI via DataCite, pending registration)