Task Execution Module
The Task Execution Module connects user goals with web automation through agents such as BrowserAgent. It translates user tasks (e.g., finding a flight) into actionable commands, automates web interactions, and returns results.
Key Components:
User Goals: A clear task, such as finding the cheapest flight, is input into the system.
Agent Initialization: The browser agent is configured with the task and an LLM (like GPT-4) to interpret and execute it.
Task Execution: The agent navigates the web, interacts with forms or fields, scrapes relevant data, and completes the task.
Result Handling: Once the task is complete, results are returned and presented to the user.
Error Management: Errors are caught and logged, so a failed run is reported rather than passing silently.
Scalability: The module can be expanded to handle more complex or multi-step tasks, such as booking multiple services.
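The execution, result handling, and error management steps listed above can be sketched as one small wrapper. This is a minimal illustration, not the module's actual implementation: `AgentLike`, `TaskOutcome`, and `executeTask` are hypothetical names, and the only assumption about the agent is that it exposes `run()` returning an object with an async `data()` method, as in the usage example in this document.

```typescript
// Hypothetical shape for illustration only; not the library's real type.
interface AgentLike {
  run(): Promise<{ data(): Promise<unknown> }>;
}

// Structured outcome so callers can tell success from failure without try/catch.
type TaskOutcome =
  | { ok: true; data: unknown }
  | { ok: false; error: string };

// Runs one task and converts any thrown error into a logged, structured outcome.
async function executeTask(
  agent: AgentLike,
  log: (message: string) => void = console.error,
): Promise<TaskOutcome> {
  try {
    const result = await agent.run();                // Task Execution
    return { ok: true, data: await result.data() };  // Result Handling
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    log(`task failed: ${message}`);                  // Error Management
    return { ok: false, error: message };
  }
}
```

Returning a tagged outcome instead of rethrowing is a design choice for this sketch: it keeps the caller's control flow simple while the error is still logged.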
In essence, the Task Execution Module automates web-based tasks, making it easier for users to achieve goals with minimal intervention.
Task Execution Module in Action
Developers use the browser agent through a simple interface: create a BrowserAgent with a task, run it, and read the result. The agent parses web pages, gathers data, and takes actions on the developer's behalf.
import { BrowserAgent } from '@nasdaiq/browser-agent';

async function main() {
  // Configure the agent with an API key, the task in plain language, and the LLM to use.
  const agent = new BrowserAgent({
    key: process.env.BROWSER_AGENT_KEY,
    task: 'Find a one-way flight from Paris to Tokyo on 25 February 2025 on Google Flights. Return me the cheapest option.',
    llm: 'gpt-4o',
  });

  // Run the task, then read and print the returned data.
  const result = await agent.run();
  console.log(await result.data());
}

// Log any error raised while the agent runs.
main().catch((error) => {
  console.error(error);
});
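For multi-step goals such as booking several services, one possible approach is to run one task per step, in order. The sketch below is an assumption-laden illustration rather than a documented feature: `RunnableAgent`, `runSequentially`, and the `makeAgent` factory are hypothetical names, and the agent is assumed only to expose the `run()` and `data()` calls used in the example above.

```typescript
// Hypothetical minimal shape; the real BrowserAgent API may differ.
interface RunnableAgent {
  run(): Promise<{ data(): Promise<unknown> }>;
}

// Runs each task in order with a fresh agent and collects the results.
// `makeAgent` is a caller-supplied factory (an assumption for this sketch).
// An error in any step propagates to the caller and stops the remaining steps.
async function runSequentially(
  tasks: string[],
  makeAgent: (task: string) => RunnableAgent,
): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const task of tasks) {
    const result = await makeAgent(task).run();
    results.push(await result.data());
  }
  return results;
}
```

With the library, the factory might look like `(task) => new BrowserAgent({ key: process.env.BROWSER_AGENT_KEY, task, llm: 'gpt-4o' })`, assuming the constructor options shown in the example above.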