LuckyAgent2022: A Stop-Learning Multi-Armed Bandit Automated Negotiating Agent
Abstract
The 13th Automated Negotiating Agents Competition (ANAC) was held in two leagues, ANL2022 and SCML2022, in conjunction with the 31st IJCAI conference. ANL2022 is a bilateral negotiation league under the Stacked Alternating Offers Protocol (SAOP), in which agents are allowed to learn from their previous negotiations. An agent can have three main BOA components: a Bidding strategy that decides which bid to send to the opponent and when, an Opponent model that tries to model the opponent's preferences, and an Acceptance strategy that decides whether to accept the opponent's offer. This paper explains the BOA components of our LuckyAgent2022 and its learning methods over negotiation sessions. To improve its utility over sessions, we cast learning as a multi-armed bandit problem and propose SLM, an LSN Stop-Learning mechanism, to prevent overfitting. The resulting method finds the best values for the variables of a time-dependent bidding strategy for the opponent. © 2022 IEEE.
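The idea of tuning a time-dependent bidding strategy with a bandit that eventually stops learning can be illustrated with a minimal sketch. The abstract does not specify the bandit algorithm, the stop criterion, or the form of the concession curve, so everything below is an assumption made for illustration: an epsilon-greedy bandit, a fixed budget of learning sessions as the stop rule, a standard polynomial concession curve, and the hypothetical names `target_utility` and `StopLearningBandit`. It is not the agent's actual implementation.

```python
import random


def target_utility(t, e, u_min=0.6, u_max=1.0):
    """Assumed time-dependent concession curve.

    t is normalized time in [0, 1]; e > 0 is the concession exponent
    (the variable the bandit tunes in this sketch).
    """
    return u_max - (u_max - u_min) * (t ** (1.0 / e))


class StopLearningBandit:
    """Epsilon-greedy bandit that freezes after a fixed number of sessions.

    Each arm is one candidate value of the concession exponent. The
    fixed session budget is an assumed stand-in for a stop-learning rule.
    """

    def __init__(self, arms, epsilon=0.2, stop_after=20, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.stop_after = stop_after
        self.counts = [0] * len(self.arms)
        self.means = [0.0] * len(self.arms)
        self.rng = random.Random(seed)

    @property
    def sessions(self):
        # Number of sessions that contributed to learning so far.
        return sum(self.counts)

    @property
    def learning(self):
        return self.sessions < self.stop_after

    def best(self):
        return max(range(len(self.arms)), key=lambda i: self.means[i])

    def select(self):
        if self.learning:
            # Try every arm once before comparing estimates.
            for i, count in enumerate(self.counts):
                if count == 0:
                    return i
            # Explore with probability epsilon.
            if self.rng.random() < self.epsilon:
                return self.rng.randrange(len(self.arms))
        # Exploit (always, once learning has stopped).
        return self.best()

    def update(self, arm_index, reward):
        if not self.learning:
            return  # estimates are frozen after the stop point
        self.counts[arm_index] += 1
        n = self.counts[arm_index]
        self.means[arm_index] += (reward - self.means[arm_index]) / n


# Usage: one bandit per opponent; the reward is the utility of the session.
bandit = StopLearningBandit(arms=[0.1, 1.0, 5.0], stop_after=10, seed=1)
arm = bandit.select()
e = bandit.arms[arm]
offer_threshold = target_utility(0.5, e)  # utility demanded at mid-session
bandit.update(arm, reward=0.8)            # hypothetical session utility
```

Freezing the estimates after a budget of sessions keeps later, possibly noisy, outcomes from shifting the chosen exponent, which is one simple way to read "stop learning to prevent overfitting".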