Saturday, January 10, 2026
Ai News
5 AI Agents Benchmarked for Price & Performance

AiNEWS2025 by AiNEWS2025
2024-12-13
in AI Tools & Automation


AI agents are autonomous software systems that plan and act to achieve given tasks or goals. They are equipped with the aggregated knowledge and experience of human experts and have access to relevant data.

We benchmarked the capabilities of web-focused AI agents by building our own agents. Follow the links to see our experience with the agents:

Benchmark results

To analyze the business use cases of AI agents, we used 2 different web scraping tasks. All agents failed most of the tasks. Anthropic Computer use and Dendrite performed slightly better than Phidata.

To learn more about web scraping, you can read Roadmap to Web Scraping: Use Cases, Methods & Tools and Web Scraping with RPA.

Task 1:

Prompt: Show all cloud GPU providers that offer H100. We need every H100 offer from each provider. Therefore a GPU provider may be presented in multiple rows when they offer multiple H100 GPU offers (e.g. an offer with a single H100 and another offer with two H100s). For each row, we need these data points: URL where offer is shared, number of GPUs as an integer, price per hour as a decimal in $. Output as json.
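To make the requested output concrete, here is a minimal sketch of how one might validate an agent's JSON answer against the three data points in the prompt. The key names (`url`, `gpu_count`, `price_per_hour`) and the sample rows are assumptions for illustration; the prompt does not fix a schema, and the benchmark does not describe using such a validator.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class H100Offer:
    url: str               # page where the offer is published
    gpu_count: int         # number of H100 GPUs in the offer
    price_per_hour: float  # price in USD per hour


def parse_offers(raw_json: str) -> list[H100Offer]:
    """Parse an agent's JSON answer and reject rows with the wrong types."""
    offers = []
    for row in json.loads(raw_json):
        # Key names are hypothetical; adapt them to the agent's actual output.
        url = row["url"]
        gpu_count = row["gpu_count"]
        price = row["price_per_hour"]
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            raise ValueError(f"not a URL: {url!r}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(gpu_count, bool) or not isinstance(gpu_count, int) or gpu_count < 1:
            raise ValueError(f"gpu_count must be a positive integer: {gpu_count!r}")
        if isinstance(price, bool) or not isinstance(price, (int, float)) or price <= 0:
            raise ValueError(f"price_per_hour must be a positive number: {price!r}")
        offers.append(H100Offer(url, gpu_count, float(price)))
    return offers


# Placeholder rows: one provider listed twice because it has two H100 offers.
example = """[
  {"url": "https://provider.example/pricing", "gpu_count": 1, "price_per_hour": 2.5},
  {"url": "https://provider.example/pricing", "gpu_count": 2, "price_per_hour": 5.0}
]"""
offers = parse_offers(example)
```

A check like this only confirms that the answer has the requested shape; whether the URL and price are real still has to be verified against the source page.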

We evaluated their capabilities to

Figure 1: The percentage of the correctly provided sources by the products.
Figure 2: The percentage of the accuracy of the information provided by the products.

Task 2:

Prompt: Find B2B tech private companies that raised funding in October 2024. Format each result as: [Company name] raised [amount] in [sector/industry].

In this task, Anthropic Computer use (Figure 3) and Phidata (Figure 4) failed to provide answers.

Figure 3: Computer use's answer to our task.
Figure 4: Phidata's answer to our task; it provided relevant resources but not the answers.
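As an illustration of the requested result format, the sketch below parses one line of the form "[Company name] raised [amount] in [sector/industry]" into its three fields. The pattern and the sample company are assumptions made for this example, not part of the benchmark.

```python
import re

# Non-greedy groups so the first " raised " and the first " in " act as separators.
RESULT_PATTERN = re.compile(
    r"^(?P<company>.+?) raised (?P<amount>.+?) in (?P<sector>.+)$"
)


def parse_result(line: str):
    """Return a dict with company, amount and sector, or None if the line does not match."""
    match = RESULT_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# "ExampleCo" is a placeholder, not a company from the benchmark.
parsed = parse_result("ExampleCo raised $10M in cybersecurity")
```

Lines that do not follow the format return `None`, which makes it easy to count how many of an agent's answers respected the instruction.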

ChatGPT's search returned 7 companies, of which 6 are accurate. However, one company was listed as having fundraised in August 2024, which does not meet our requirement for companies that fundraised in October 2024. Therefore, this information is incorrect.

Dendrite provided 2 companies correctly, although there are many more companies. This is because it relied on search engine results, which were incomplete.

Perplexity provided 6 companies, and while their names, raised amounts, and industries are accurate, none of them completed fundraising in October 2024. Therefore, this information does not meet our requirements.

So the leaders of this task are ChatGPT search and Dendrite.

Prices

The price of Anthropic Computer use is based on API requests. For example, we spent ~$2.5 to run these 2 tasks, running each task a couple of times. $0.5 for a task run is expensive. If you want to use agentic process automation, you can find more cost-effective options.

ChatGPT's search functionality is available to users subscribed to the Plus and Team plans, priced at $20 per month and $25 per user per month (billed annually), respectively.

Dendrite offers a limited free plan and a Developer plan priced at $30. Specific details regarding the limitations of the free plan will be updated once they are officially published.

Phidata has free, pro and enterprise plans. Plans other than free are not available yet. They also claim that they will provide the pro plan free for students, educators and start-ups.
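The arithmetic behind the per-run figure can be sketched as follows. The two input numbers are the approximate values reported above; the workload of 100 runs per day is a hypothetical used only to show how the cost scales.

```python
total_spend_usd = 2.5   # approximate total spend reported above
cost_per_run_usd = 0.5  # approximate cost per task run reported above

# Number of task runs implied by the two reported figures.
implied_runs = total_spend_usd / cost_per_run_usd  # 5.0


def projected_cost(runs_per_day: int, days: int, cost_per_run: float = cost_per_run_usd) -> float:
    """Project spend for a workload, assuming the per-run cost stays constant."""
    return runs_per_day * days * cost_per_run


# Hypothetical workload: 100 runs per day for 30 days.
monthly_cost_usd = projected_cost(runs_per_day=100, days=30)  # 1500.0
```

Actual costs depend on how many API requests each run makes, so a constant per-run cost is a simplification.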

Our methodology

Versions: Latest versions available as of November 1, 2024.

Deployment environment:

  • Dendrite and Phidata were run on our laptop.

  • Anthropic Computer use was deployed to a cloud VM since Anthropic recommended against deployments on user devices.

  • The ChatGPT search feature and Perplexity were used directly on their respective websites.

Process:

  • To evaluate the vendors on their web searching capabilities, we first prepared a ground truth, which includes all the cloud H100 offerings. Then, we compared it with the outputs of the AI agents.

  • To evaluate the accuracy of the information, we checked all the links they provided to see whether the information they provided us is correct or not.

  • We did not try prompt engineering to get more accurate results.

Scoring:

Since the number of outputs they provide varies, we aimed to keep the scoring system as simple as possible. For task 1, if a product returns a URL that is not from a reliable source, it receives a score of 0. Additionally, the number of outputs ranges from 6 to 28, so it is important to note that a product with 3 correct answers out of 6 outputs and another with 14 correct answers out of 28 outputs receive the same score in Figure 2.

We did not score the products for Task 2, since the search results differ significantly based on the browser used and the location of the user, and the products scrape data accordingly from these sources. However, since ChatGPT and Dendrite provided accurate results, they are considered the leaders for this task.
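A ratio-based score of this kind ignores how many outputs a product returned: 3 correct out of 6 and 14 correct out of 28 both come to 50%. The sketch below shows one way such percentages could be computed. The function names, the exact definitions, and the placeholder URLs are assumptions; the benchmark does not publish its scoring code.

```python
def accuracy_share(correct: int, total_outputs: int) -> float:
    """Percentage of returned rows that were verified as correct."""
    if total_outputs == 0:
        return 0.0
    return 100.0 * correct / total_outputs


def source_coverage(returned_urls: set[str], ground_truth_urls: set[str]) -> float:
    """Percentage of ground-truth sources that the product returned."""
    if not ground_truth_urls:
        return 0.0
    found = returned_urls & ground_truth_urls
    return 100.0 * len(found) / len(ground_truth_urls)


# Two products with very different output counts get the same accuracy share.
small_product = accuracy_share(3, 6)    # 50.0
large_product = accuracy_share(14, 28)  # 50.0

# Placeholder URLs: two of four ground-truth sources were found.
truth = {"https://a.example", "https://b.example", "https://c.example", "https://d.example"}
returned = {"https://a.example", "https://b.example", "https://unlisted.example"}
coverage = source_coverage(returned, truth)  # 50.0
```

Reporting the raw counts next to the percentage is one way to keep the difference in output volume visible.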

Disclaimer

Since the agents use different browsers and locations, these models can encounter different sources while web scraping. To be fair to all agents, all potential sources were included in our ground truth.

Since all of these products are version 1 or beta, they have numerous limitations. We will continue to repeat the benchmark and update the results as the products develop.

Since these models are newly developed, they may cause security vulnerabilities, so we recommend using them in a virtual machine or container. Anthropic also mentions the necessity of taking this precaution when using Computer use.

Figure 5: Anthropic's warning about the usage of Computer use.

Anthropic Computer use

Computer use makes numerous API calls for a single task. Running an agent with Computer use is slow.

We initially encountered problems due to Anthropic's rate limits. In Tier 1, Anthropic allows users to make 50 API requests per minute. This was not enough to finish our tasks, so we needed to run the prompt multiple times.

Then, we asked for a higher API limit and received it within hours, which facilitated benchmarking.
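One common way to stay under a per-minute request cap is to throttle on the client side. The sketch below is a generic sliding-window limiter, not something the benchmark used and not part of any vendor SDK; the class name and its parameters are assumptions, with the default of 50 calls per 60 seconds taken from the Tier 1 limit mentioned above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period` seconds."""

    def __init__(self, max_calls=50, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))


limiter = SlidingWindowLimiter(max_calls=50, period=60.0)
limiter.acquire()  # call this before each API request
```

Throttling avoids rate-limit errors but makes an already slow agent slower, so requesting a higher limit, as described above, may still be the more practical option.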

Perplexity

Perplexity's search tool is available directly on its website. Like ChatGPT search, it is not an agentic AI; we chose to include it in our testing since our benchmark task involves web scraping.

ChatGPT search

ChatGPT's search feature is available to pro and team users directly within the ChatGPT interface. Although it is not an agentic AI, we included it in our testing because the focus of this benchmark is web scraping.

Dendrite

Dendrite provides example agents, such as a data extraction agent, on their website, which facilitates building new agents.

Dendrite's agents run slower than most of the other agents in this benchmark.

Unlike other agents, it requires users to enter the search query.

Phidata

Phidata provides examples, such as a web search agent, on their website to make it easy to build new agents. We developed an agent in minutes.

Phidata's agents hallucinated results in our benchmark, providing links to pages and pricing information that do not exist.
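Links to pages that do not exist can be caught automatically before any manual review. The following is a minimal sketch using only Python's standard library; the helper names are assumptions, and it is not the procedure used in the benchmark, where links were checked by hand.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0):
    """Return the HTTP status code for `url`, or None if the request fails."""
    # HEAD avoids downloading the body; some servers reject it, so a GET
    # fallback may be needed in practice.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 for a page that does not exist
    except (OSError, ValueError):
        return None        # DNS failure, refused connection, timeout, or malformed URL


def find_dead_links(urls, fetch=fetch_status):
    """Return the URLs that did not answer with a status below 400."""
    dead = []
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            dead.append(url)
    return dead
```

A live page only shows that the link exists. Whether the pricing on it matches what the agent reported still needs a separate check.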

FAQ

What are the AI agent applications and use cases?

AI agents can automate complex workflows, reducing the need for human intervention and increasing efficiency. They can handle exceptions and edge cases, making them more reliable than traditional automation solutions.
AI agents can perform tasks that would be difficult or tedious for humans. They can also be used for natural language processing, data processing, and analysis.

How to build your own agents?

Choose a vendor by considering your needs, skills and their prices.
Agents can be integrated with external systems using API calls and can access a wide range of data sources.
Design the task for your AI agent. You should be able to provide a prompt that is goal-oriented and not confusing to the model.

Are AI agents secure?

AI agents must be designed with data privacy and security in mind, using techniques such as encryption and access controls. At the current stage of development, we suggest that you do not share your sensitive data with artificial intelligence agents.

What are the business benefits of AI agents?

AI agents can improve efficiency and productivity, automating repetitive tasks and freeing up human agents to focus on more complex tasks.
They can analyze business data and automate business processes. If you need to learn more, see agentic process automation. By building autonomous agents, you can automate processes and have more tasks completed.

How to measure the success of AI agents?

If you use an agent in your business, use metrics such as efficiency, productivity, and customer satisfaction to measure the success of AI agents.
Monitor the performance of AI agents over time, making adjustments as needed.
Use data and analytics to provide insights into the decision-making processes and reliability of AI agents.
