Salesforce Unveils SCUBA: Redefining Agentic AI and Enterprise Automation

AI in the enterprise is no longer about conversation; it’s about capability. Businesses want systems that don’t just understand intent but get things done inside real software environments. That’s the promise of agentic AI: intelligent agents that can navigate, act, and automate within platforms like Salesforce. With the launch of SCUBA (the Salesforce Computer Use Benchmark), Salesforce sets a new standard for enterprise AI agents in real-world, action-oriented environments.

We are witnessing a significant shift from generative chatbots and assistants that only communicate to truly autonomous agents that can take action. This is the era of agentic AI, where systems don’t just respond but operate workflows within business software.

In this blog, we’ll explore how SCUBA redefines what it means for an AI agent to be enterprise-ready, why this matters for CRM automation, and how your organization can prepare for this new frontier.

What Is Agentic AI, and Why Does It Matter?

For years, enterprise software has incorporated AI: think predictive analytics, smart suggestions, and chatbots. Agentic AI goes a step further: it navigates the UI, manipulates data, triggers workflows, and solves operational problems within enterprise software. These are enterprise AI agents in the truest sense.

Why is this important? Because companies are no longer satisfied with insights alone; they want AI that can execute. Automation of actual business processes such as sales, service, and administration is now table stakes. This is the essence of enterprise automation AI.

However, building agents that reliably succeed across real workflows is challenging. That’s where a benchmark that enterprise designers and implementers can rely on becomes essential. SCUBA is precisely that.

Let’s dive into what SCUBA is.

Introducing the Salesforce SCUBA Benchmark

Get an inside look at the Salesforce SCUBA benchmark, the first framework designed to measure how AI agents in CRM handle real-world automation tasks.

Under the banner of Salesforce agentic AI, SCUBA is the benchmark created by Salesforce to evaluate how well agents can perform in complex enterprise workflows. See how it’s redefining the standards for enterprise AI benchmarks.

Key aspects include:

300 task instances were derived from real user interviews (platform admins, sales reps, and service agents) within the Salesforce CRM ecosystem.
Tasks cover UI navigation, data manipulation, workflow automation, and troubleshooting within business software.
Metrics go beyond success or failure: they include latency, cost (API/token usage), step count and process milestones.
With SCUBA, Salesforce explicitly frames a path for measuring agentic enterprise workflows, setting expectations for what next-gen automation agents should deliver.
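
To make the multi-metric idea concrete, here is a minimal sketch of how a team might record and summarize results along the dimensions listed above. It is not SCUBA's actual harness or schema: the TaskResult fields, the summarize function, and the choice of simple averages are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    """One agent run on one task. Field names are illustrative assumptions."""
    task_id: str
    succeeded: bool
    latency_s: float        # wall-clock time for the run
    cost_usd: float         # API/token spend for the run
    steps: int              # number of UI actions the agent took
    milestones_hit: int     # intermediate checkpoints reached
    milestones_total: int   # checkpoints defined for the task

    def __post_init__(self) -> None:
        if self.milestones_total <= 0:
            raise ValueError("milestones_total must be positive")


def summarize(results: list[TaskResult]) -> dict[str, float]:
    """Aggregate per-task results into benchmark-style summary numbers."""
    if not results:
        raise ValueError("no results to summarize")
    return {
        "success_rate": mean(1.0 if r.succeeded else 0.0 for r in results),
        "milestone_rate": mean(r.milestones_hit / r.milestones_total for r in results),
        "mean_latency_s": mean(r.latency_s for r in results),
        "mean_cost_usd": mean(r.cost_usd for r in results),
        "mean_steps": mean(r.steps for r in results),
    }
```

Reporting a milestone rate alongside the success rate lets a run that completed most of a workflow count for more than one that failed at the first screen, which is the point of looking past a single pass/fail number.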

Why SCUBA Fills a Critical Gap

Let’s find out how SCUBA bridges the gap between lab-based testing and true agentic enterprise workflows. Learn why this benchmark brings unmatched accuracy for evaluating Salesforce agentic AI in live business environments.

Before SCUBA, many benchmarks focused on generic web navigation or desktop automation, not the rigours of enterprise software. For example, previous benchmarks didn’t reflect the complexity of CRM, service workflows, or sales automation environments.

Thus:

SCUBA targets agentic enterprise workflows within CRM settings, making it highly relevant.
It enables evaluation of AI agents in a CRM context, based on UI actions rather than text alone.
It helps organizations assess whether agents built for enterprise software automation will truly work, not just in demos but in live settings.

Key Findings from SCUBA: What the Research Says

Let’s have a look at research insights that reveal how human demonstrations and workflow-specific benchmarks improve agentic AI performance. Understand what metrics truly define success in enterprise software automation with AI.

Domain Shift is Hard: The research shows that agents trained on general desktop or web benchmarks see a substantial drop in performance when moved into enterprise software settings such as CRM workflows.

Demonstrations Help: Human demonstrations of tasks improved agent success rates, reduced latency, and lowered cost in the demonstration-augmented setting.

Metrics Beyond Success: SCUBA emphasises metrics such as time to completion, API/token cost, number of steps, and workflow efficiency, not just “did the agent succeed?” These are critical for enterprise deployments.

SCUBA Implications for Enterprise Automation AI

See how SCUBA’s results reshape strategies for deploying enterprise AI agents in CRM, sales, and service. Learn how this benchmark helps businesses optimize speed, cost, and workflow precision through AI-driven automation.

For CRM and Service Teams: With SCUBA’s focus on CRM workflows, organizations can now benchmark whether agents in their systems (such as ones built on Salesforce) will handle tasks in real-world conditions. For example, teams can test agents against real admin, sales, and service workflows.

For Implementation Strategy: The research suggests you’ll need more than a flashy demo. You’ll need human demonstrations, workflow logging, agent monitoring for latency/cost, and step-count tracking. This gives a roadmap for how to deploy automation properly.

For Software Vendors: As enterprise software vendors observe benchmarks like SCUBA, they may evolve UI and workflow design to become more “agent-friendly” (e.g., more structured actions, better logs). This will speed adoption of agentic automation inside software systems.

SCUBA in Context: Agentic AI vs Traditional Automation

Traditional automation tools (RPA, macros, rule-based bots) typically handle repetitive, well-defined tasks. In contrast, agentic AI aims to handle complex, multi-step workflows that require reasoning, decision-making, action sequencing and UI interactions.

With SCUBA as the testing ground, we can compare “agentic AI vs traditional automation in enterprise software” in tangible terms:

Traditional automation might click a canned button; an agentic AI might decide which button, adjust filters, update records, and trigger dependent workflows.
Traditional automation is brittle when UI changes; agentic AI must be robust, adapt, and troubleshoot.
SCUBA’s results indicate that even the best agents currently still face significant gaps when moving into enterprise settings.

If your plan is enterprise-scale automation, it’s time to think agentic, not just automated.

How to Prepare Your Organization for Agentic Enterprise Workflows With SCUBA

To future-proof your organization for agentic enterprise workflows, start by integrating benchmarks, demonstrations, and data-driven metrics. Then build your roadmap for sustainable enterprise automation AI success.

Map end-to-end workflows: Understand your sales/admin/service flows inside CRM, so you can frame tasks the way SCUBA does.

Introduce human demonstration pipelines: Collect demonstration data of workflows, in the same spirit as SCUBA’s demonstration-augmented tasks.

Measure more than success: Track steps, latency, cost, and error recovery. Use a benchmark mindset.

Build agent-friendly infrastructure: Ensure your CRM/enterprise software supports structured actions, logs, state visibility, and integrations so your agentic AI can work effectively.

Pilot early, iterate fast: Run small-scale deployments in sandbox environments, measure, refine, then scale.

By aligning your strategy with what SCUBA emphasises, you’ll be ahead of many organisations that still rely on traditional automation.
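
As a rough illustration of the "measure more than success" step, the sketch below logs steps, cost, latency, and error recovery for a single pilot run. The RunTracker class, its method names, and the task name are hypothetical, and the rule that a recovery is the first successful step after a failed one is an assumption chosen for simplicity, not a SCUBA definition.

```python
import time
from dataclasses import dataclass, field


@dataclass
class RunTracker:
    """Collects step, cost, latency, and recovery data for one pilot run."""
    task_name: str
    steps: int = 0
    cost_usd: float = 0.0
    errors: int = 0
    recoveries: int = 0
    _started: float = field(default_factory=time.perf_counter, repr=False)
    _awaiting_recovery: bool = field(default=False, repr=False)

    def record_step(self, cost_usd: float = 0.0, failed: bool = False) -> None:
        """Log one agent action and whether it failed."""
        self.steps += 1
        self.cost_usd += cost_usd
        if failed:
            self.errors += 1
            self._awaiting_recovery = True
        elif self._awaiting_recovery:
            # Assumption: the first good step after a failure counts as a recovery.
            self.recoveries += 1
            self._awaiting_recovery = False

    def finish(self, succeeded: bool) -> dict:
        """Close the run and return a flat record suitable for logging."""
        return {
            "task": self.task_name,
            "succeeded": succeeded,
            "steps": self.steps,
            "cost_usd": self.cost_usd,
            "errors": self.errors,
            "recoveries": self.recoveries,
            "latency_s": time.perf_counter() - self._started,
        }


# Hypothetical sandbox run with made-up costs.
tracker = RunTracker("update_opportunity_stage")
tracker.record_step(cost_usd=0.25)
tracker.record_step(cost_usd=0.25, failed=True)   # e.g. the agent clicked the wrong element
tracker.record_step(cost_usd=0.5)                 # the agent corrected course
report = tracker.finish(succeeded=True)
```

Writing one such record per run during a sandbox pilot gives you a baseline to compare against as you refine prompts, demonstrations, or infrastructure.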

SCUBA Real-World Use Cases: From Sales to Service

See how real-world agentic AI for CRM admin, sales, and service workflows is already driving results. Understand the practical impact of Salesforce agentic AI across departments and everyday business operations.

Sales automation: An enterprise AI agent might navigate to the opportunity record, update fields, adjust forecast categories, trigger approval workflows, and notify the sales manager.

Service support: An agent might triage a service case, update related objects, add knowledge article links, and escalate when required, all autonomously.

Admin tasks: The agent could create dashboards, set filters, update user roles, and generate reports in the CRM. SCUBA includes tasks derived from those very roles.

These are no longer “nice-to-haves” but high-impact areas where agentic AI can deliver value in enterprise automation.

Challenges Ahead and How SCUBA Helps

Even with SCUBA, challenges remain:

UI changes, versioning, and permission issues still trip up agents.
Grounding (making sure the agent correctly understands what UI element to click) remains difficult.
Cost/latency trade-offs: agents with higher success rates may cost more or take longer.

SCUBA helps by exposing these gaps and providing measurable data on what works and what doesn’t. If your organisation understands these limitations upfront, you’re better equipped for success.
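
One way to reason about the cost/latency trade-off mentioned above is to keep only the agent configurations that are not beaten on every dimension at once (a Pareto front). The sketch below assumes you already have per-agent averages; the AgentProfile type and the candidate names and numbers are invented for illustration and are not SCUBA results.

```python
from typing import NamedTuple


class AgentProfile(NamedTuple):
    name: str
    success_rate: float   # higher is better
    cost_usd: float       # lower is better
    latency_s: float      # lower is better


def dominates(a: AgentProfile, b: AgentProfile) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    no_worse = (
        a.success_rate >= b.success_rate
        and a.cost_usd <= b.cost_usd
        and a.latency_s <= b.latency_s
    )
    strictly_better = (
        a.success_rate > b.success_rate
        or a.cost_usd < b.cost_usd
        or a.latency_s < b.latency_s
    )
    return no_worse and strictly_better


def pareto_front(agents: list[AgentProfile]) -> list[AgentProfile]:
    """Keep the agents that no other agent dominates."""
    return [a for a in agents if not any(dominates(b, a) for b in agents)]


# Made-up numbers, purely to show the mechanics.
candidates = [
    AgentProfile("careful", success_rate=0.6, cost_usd=2.0, latency_s=120.0),
    AgentProfile("fast", success_rate=0.4, cost_usd=0.5, latency_s=40.0),
    AgentProfile("wasteful", success_rate=0.4, cost_usd=1.0, latency_s=60.0),
]
front = pareto_front(candidates)
```

Here "wasteful" drops out because "fast" matches its success rate at lower cost and latency, while "careful" and "fast" both remain: choosing between them is a business decision about how much extra success is worth.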

Conclusion

The launch of SCUBA marks a pivotal moment in the evolution of enterprise AI agents. By shifting focus from “can the agent talk?” to “can the agent act?” within real CRM and business-software workflows, the benchmark sets new expectations for what automation can achieve.

For enterprise automation AI programmes built on platforms like Salesforce, this is your call to upgrade your thinking: design for agentic workflows, measure comprehensively, and build for action. The future of CRM is here, where agents don’t just assist; they perform.
