Sony Developers Reveal How Machine Learning Enhances Quality Assurance

At the recent CEDEC event in Yokohama, Japan, development leads from Sony shared their experiences with implementing AI and machine learning models to streamline the QA process. Led by machine learning researchers Hiroyuki Yabe and Yutaro Miyauchi, alongside software engineer Nakahara Hiroki, the talk focused on how Sony integrated AI into QA using real PS5 hardware. The team collected on-screen and audio information, as a human tester would, to test titles more regularly and efficiently. By automating QA operations, the team aimed to catch bugs earlier in the development cycle: manual testing can only be conducted a few times per cycle, and late detection can affect release.

The team shared their findings using Astro's Playroom as a case study, highlighting the challenges of integrating game progress with hardware functionality, such as the PS5's Activity Cards. To address these challenges, the team developed two separate automated play systems: a Replay Agent and an Imitation Agent. The Replay Agent replicated exact button combinations to ensure consistency, while the Imitation Agent reproduced human play with variance. Both systems worked by connecting a PS5 to a PC: on-screen information was sent to the learning module, which then sent controller inputs back to the hardware.

The tools could be used in sequence, with the Replay Agent navigating the UI or moving from the hub world to a level, and the Imitation Agent taking over to play the level. The team used machine learning models to recreate human gameplay, allowing repeated testing of sections that could not be reproduced exactly. To assist the machine learning models, other AI systems, such as the feature-matching model LoFTR, were used to recognize scenes and switch between agents. The team noted that some simplification and guidance were required to ensure the model could learn the environments from the play data provided.
For example, raw analogue input was simplified into nine regions of movement, and button presses were determined probabilistically. The team also applied class balancing to the training data to improve the chances of success, especially with small training samples. With balanced data, the model could be trained effectively on less data and adapt better to new games in the same genre.

Although the system continues to be refined, the researchers noted numerous benefits, including improved efficiency and earlier detection of bugs. While the system is not entirely self-sufficient and still requires some human input, it has allowed for more frequent testing throughout development, reducing the person-hours required for QA. Automated testing has not replaced the need for QA specialists. Instead, it has integrated QA further into the development process, enabling developers to detect and fix bugs earlier and ship higher-quality titles.
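
To make the nine-way input simplification and class balancing described above more concrete, here is a minimal Python sketch. It is not Sony's implementation: the 3x3 grid layout, the dead-zone threshold, the inverse-frequency weighting formula, and all function names are assumptions chosen for illustration, and sampling a button press from a predicted probability is only one plausible reading of how probability was used.

```python
import random
from collections import Counter

DEAD_ZONE = 0.3  # assumed threshold; the talk's actual value is not given


def stick_to_region(x, y, dead_zone=DEAD_ZONE):
    """Map an analogue stick position (x, y in [-1, 1]) to a class 0..8.

    Assumed layout: a 3x3 grid where each axis is bucketed into
    negative / neutral / positive. Class 4 is the neutral centre.
    """
    def axis_bucket(v):
        if v < -dead_zone:
            return 0
        if v > dead_zone:
            return 2
        return 1

    return axis_bucket(y) * 3 + axis_bucket(x)


def class_weights(labels):
    """Inverse-frequency weights so rare classes count more in training.

    Uses the common heuristic n_samples / (n_classes * class_count).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def press_button(probability, rng):
    """Decide a button press by sampling from a predicted probability."""
    return rng.random() < probability


if __name__ == "__main__":
    # Hypothetical recorded stick samples: mostly neutral, one push right.
    samples = [(0.0, 0.0), (0.1, -0.1), (0.05, 0.2), (0.9, 0.0)]
    labels = [stick_to_region(x, y) for x, y in samples]
    print(labels)
    print(class_weights(labels))
    print(press_button(0.8, random.Random(0)))
```

In a sketch like this, the weights would typically be passed to a weighted loss function so that rare actions, such as a jump that appears only a few times in a recording, are not drowned out by frames where the player simply holds a direction.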