This page contains press release content distributed by XPR Media. Members of the editorial and news staff of the USA TODAY Network were not involved in the creation of this content.

AIM Intelligence and BMW Group Examine Gaps in Evaluating Enterprise AI Policy Compliance

Research reveals LLMs follow allowlist policies but systematically fail to enforce organizational prohibitions, exposing a critical gap in enterprise AI safety

SF, CA, UNITED STATES, February 12, 2026 /EINPresswire.com/ — Seoul, South Korea / Munich, Germany – January 2026 – BMW Group and AIM Intelligence, a leading AI safety startup, today announced the publication of COMPASS (Company/Organization Policy Alignment Assessment), the first systematic framework for evaluating whether large language models (LLMs) comply with organization-specific policies. The research, now available on arXiv, reveals a critical gap that remains under-measured in current evaluation practices: models that pass standard safety benchmarks often fail dramatically when enforcing the nuanced, context-dependent rules that govern real-world business operations.

Why Enterprise AI Policies Break Down in Practice

As organizations across healthcare, finance, automotive, and government sectors rapidly adopt LLMs for customer-facing applications, the research team discovered a fundamental asymmetry that poses significant risks for policy-critical deployments.

Key Findings:
1. Strong Allowlist Compliance: Models reliably handle legitimate requests with over 95% accuracy
2. Critical Denylist Failures: Models fail to correctly refuse prohibited requests in up to 97% of cases
3. Catastrophic Adversarial Vulnerability: Under adversarial conditions, some models refuse fewer than 5% of policy-violating requests

“Most AI safety tests focus on whether a model behaves safely in general,” said Dasol Choi, AI Safety Researcher at AIM Intelligence. “COMPASS looks at a more practical question: can an AI system reliably follow the specific rules of an organization? Our findings show that, in many real-world deployments today, the answer is often no.”
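
The asymmetry in the key findings comes down to two separate rates: how often a model answers legitimate (allowlist) requests, and how often it refuses prohibited (denylist) ones. The sketch below shows one simple way such rates could be computed. It is an illustration only, not the COMPASS implementation: the `Result` record, the two function names, and the toy data are all hypothetical, and it simplifies "accuracy" on allowlist queries to "the model did not refuse".

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One evaluated query (hypothetical record layout, not the COMPASS schema)."""
    query_type: str  # "allow" = legitimate request, "deny" = prohibited request
    refused: bool    # True if the model declined to answer


def allowlist_accuracy(results):
    """Share of legitimate requests the model answered instead of refusing."""
    allowed = [r for r in results if r.query_type == "allow"]
    return sum(not r.refused for r in allowed) / len(allowed)


def denylist_refusal_rate(results):
    """Share of prohibited requests the model correctly refused."""
    denied = [r for r in results if r.query_type == "deny"]
    return sum(r.refused for r in denied) / len(denied)


# Toy data, invented for illustration only (not results from the paper):
# 20 legitimate requests (19 answered) and 20 prohibited requests (1 refused).
results = (
    [Result("allow", refused=False)] * 19
    + [Result("allow", refused=True)]
    + [Result("deny", refused=True)]
    + [Result("deny", refused=False)] * 19
)

print(f"allowlist accuracy:    {allowlist_accuracy(results):.0%}")
print(f"denylist refusal rate: {denylist_refusal_rate(results):.0%}")
```

Reporting the two rates separately is the point of the exercise: a single blended accuracy over both query types would hide a model that answers nearly everything, including what it should refuse. Both functions assume at least one query of each type is present.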

Why Generic AI Safety Isn’t Enough

The research addresses a critical disconnect between how AI systems are evaluated and how they are deployed. While existing safety benchmarks focus on universal harms such as toxicity and violence, real enterprises operate under complex internal policies: compliance manuals, operational playbooks, legal edge cases, and brand-specific constraints.

COMPASS evaluates models across four dimensions that typical benchmarks ignore:
1. Policy Selection: Can the model identify which policy applies to a given situation?
2. Policy Interpretation: Can it reason through conditionals, exceptions, and vague clauses?
3. Conflict Resolution: When rules collide, does the model resolve conflicts as the organization intends?
4. Justification: Can the model ground its decisions in actual policy text?

“Our evaluation revealed a striking asymmetry,” noted DongGeon Lee, AI Safety Researcher at AIM Intelligence. “While models achieve near-perfect accuracy on what they can do, they remain structurally vulnerable in enforcing what they must not do. This gap persists across model scales and architectures, indicating that scaling alone cannot solve the problem.”

Industry-Scale Validation

The research team applied COMPASS across eight diverse industry scenarios—Automotive, Government, Financial, Healthcare, Travel, Telecom, Education, and Recruiting—generating and validating 5,920 queries that test both routine compliance and adversarial robustness. Fifteen state-of-the-art models were evaluated, including leading proprietary and open-source systems.

Making Misalignment Measurable

Perhaps the most significant contribution of COMPASS is transforming alignment from a philosophical concern into an engineering problem. The framework and benchmark datasets are publicly available on GitHub and Hugging Face, enabling organizations to evaluate their AI systems against their own policies.

About the Research Collaboration

This research represents a collaboration between AIM Intelligence, BMW Group, Yonsei University, Pohang University of Science and Technology, and Seoul National University. The full paper, “COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs,” is available at https://arxiv.org/abs/2601.01836.

About AIM Intelligence

AIM Intelligence is a Seoul-based AI safety company specializing in automated red-teaming, real-time guardrails, and AI monitoring solutions. Founded in 2024, AIM Intelligence serves major enterprises and conducts research across large language models, multimodal systems, autonomous agents, and emerging physical AI. The company has published over 15 research papers at top-tier venues including ICML, ACL, NeurIPS, and IEEE.

Team Cookie Official
Team Cookie

Legal Disclaimer:

EIN Presswire provides this news content “as is” without warranty of any kind. We do not accept any responsibility or liability
for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this
article. If you have any complaints or copyright issues related to this article, kindly contact the author above.

Information contained on this page is provided by an independent third-party content provider. XPRMedia and this Site make no warranties or representations in connection therewith. If you are affiliated with this page and would like it removed, please contact pressreleases@xpr.media.
