
Redmond touts preview of AI agents that navigate applications through the user interface.
Microsoft will soon let Copilot agents operate computers through the graphical user interface (GUI) much as people do: by clicking buttons, choosing from menus, and filling out online forms.
On Wednesday, the Windows empire announced plans to bring computer use to Copilot Studio, a platform created by Microsoft for developing and launching AI agents. The move aims to relieve employees of manually clicking buttons and completing forms, all while ensuring that corporate information remains securely within Microsoft’s cloud services. Redmond maintains that this data isn’t used to train its models.
"Through computer usage, agents can engage with websites and desktop applications by performing actions such as button clicks, menu selections, and entering data into screen fields," detailed Charles Lamanna, Microsoft’s corporate vice president for business and industry Copilot, in the company’s promotional material.
This lets agents handle tasks even when no API is available for direct integration with the system. If a person can use the application, so can the agent.
As far as we understand, AI agents are essentially software programs that interact with both other software components and end-users. They utilize generative AI to facilitate decision-making and generate output responses.
Currently, Microsoft Copilot Studio allows users to develop AI-powered assistants for automating particular tasks; however, these helpers operate exclusively within predefined services such as SharePoint. The upcoming version of these assistants will offer greater adaptability. To illustrate, one might generate an assistant and instruct it to navigate through an unfamiliar webpage, retrieve relevant information from it, then transfer this extracted content directly into a desktop application.
Lamanna suggests several scenarios where the new Copilot agents might prove useful: automating the entry of large volumes of data from different sources into a central database, automatically gathering market intelligence for analysis, or using AI-driven text and image recognition to process invoices.
AI automation differs from scripted automation in that the system can adjust on the fly when it hits obstacles or unexpected changes in its environment. Rather than halting on an error, it uses built-in reasoning to work around the problem, according to Microsoft.
Computer use adapts automatically to changes in applications and websites, Lamanna said. It adjusts promptly, using built-in reasoning to resolve problems on its own so that work continues without interruption.
With luck, that reasoning won’t lead to unexpected deletions or policy violations, a worry one concerned user raised in a social media thread started by a Copilot Studio product manager.
However, handing computer tasks over to Copilot may bring unexpected costs. As with cloud services, the bill for AI's boil-the-ocean approach to computation isn't necessarily easy to predict, and there's potential for bill shock if certain tasks turn out to be computationally demanding.
Users of both OpenAI's and Anthropic's computer use APIs have raised concerns about cost.
Microsoft is bringing computer use to Copilot Studio users through an early access research preview that requires a signup. Expect to hear more about this at Microsoft Build 2025 next month. ®