Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    The Strait of Hormuz Has Been Closed for 100 Days. Why Aren’t Oil Prices Higher?

    June 14, 2026

    Škoda’s New EV Will Likely Be Its Most Expensive Yet

    June 14, 2026

    As Anthropic suspends access to new models, India debates its AI future

    June 14, 2026
    Facebook Twitter Instagram
    • Tech
    • Gadgets
    • Spotlight
    • Gaming
    Facebook Twitter Instagram
    iGadgets TechiGadgets Tech
    Subscribe
    • Home
    • Gadgets
    • Insights
    • Apps

      As Anthropic suspends access to new models, India debates its AI future

      June 14, 2026

      Meta reportedly moves to unwind $2B Manus deal after Beijing’s demand

      June 14, 2026

      KPMG pulls report on AI usage due to apparent hallucinations

      June 13, 2026

      Amazon CEO reportedly raised Anthropic model concerns before government crackdown

      June 13, 2026

      This thin under-pillow speaker helped me fall asleep without earbuds

      June 13, 2026
    • Gear
    • Mobiles
      1. Tech
      2. Gadgets
      3. Insights
      4. View All

      The Strait of Hormuz Has Been Closed for 100 Days. Why Aren’t Oil Prices Higher?

      June 14, 2026

      Škoda’s New EV Will Likely Be Its Most Expensive Yet

      June 14, 2026

      The FCC Wants to Kill Burner Phones

      June 13, 2026

      EcoFlow PowerOcean Battery Review: Cutting My Bill in Half

      June 13, 2026

      March Update May Have Weakened The Haptics For Pixel 6 Users

      April 2, 2022

      Project 'Diamond' Is The Galaxy S23, Not A Rollable Smartphone

      April 2, 2022

      The At A Glance Widget Is More Useful After March Update

      April 2, 2022

      Pre-Order The OnePlus 10 Pro For Just $1 In The US

      April 2, 2022

      Motorola Edge+ Review: It Checks A Lot Of Boxes

      April 2, 2022

      This Smartphone Concept Design Is Different… In A Good Way

      April 2, 2022

      Twitter Just Made Searching Your Direct Messages Better

      April 2, 2022

      That Netflix Price Hike Is Starting To Take Place

      April 2, 2022

      Latest Huawei Mobiles P50 and P50 Pro Feature Kirin Chips

      January 15, 2021

      Samsung Galaxy M62 Benchmarked with Galaxy Note10’s Chipset

      January 15, 2021
      9.1

      Review: T-Mobile Winning 5G Race Around the World

      January 15, 2021
      8.9

      Samsung Galaxy S21 Ultra Review: the New King of Android Phones

      January 15, 2021
    • Computing
    iGadgets TechiGadgets Tech
    Home»Apps»New Microsoft tool lets devs spin up AI behavior tests using text descriptions
    Apps

    New Microsoft tool lets devs spin up AI behavior tests using text descriptions

    adminBy adminJune 2, 2026No Comments3 Mins Read
    Facebook Twitter Pinterest LinkedIn Tumblr Email
    New Microsoft tool lets devs spin up AI behavior tests using text descriptions
    Share
    Facebook Twitter LinkedIn Pinterest Email

    AI researchers and labs have advanced by leaps and bounds in evaluating AI models for everything from safety and compliance to sycophancy and alignment. But it appears companies and developers are faced with a new, specific need: making sure their AI system behaves as intended for their specific product or service.

    In a bid to make that testing process simpler, Microsoft on Tuesday took the wraps off ASSERT, short for Adaptive Spec-driven Scoring for Evaluation and Regression Testing.

    The open source framework, Microsoft says, makes evaluating application-specific AI behavior easy by using AI to turn high-level, natural-language descriptions of goals, policies, or intended behaviors into thorough, scored tests that can be investigated.

    ASSERT takes plain-language descriptions of an AI model’s expected behavior and policies, turns them into a structured set of acceptable and unacceptable behaviors, generates problem scenarios and test cases, runs them against the target system, and scores the results. It can also record the paths the AI system takes, including intermediate actions and tool calls, so developers can inspect where failures happen.

    Devs can provide system context, tools, and constraints, too, if they want to further customize what the evaluations cover.

    For example, a developer could specify that a document research AI agent shouldn’t send emails to people outside the company, and it should limit confidential information to C-level executives and provide concise summaries with prior context in mind. ASSERT will use those rules to generate test cases that check whether the system follows those rules on an ongoing basis.

    New Microsoft tool lets devs spin up AI behavior tests using text descriptions插图
    Image Credits:Microsoft

    The framework, according to Microsoft, fills a gap that broader, more general evaluations cannot when AI models are intended to behave in a manner that is shaped by an application or product’s context, policies, and tools.

    “One of the things we’ve learned is that evaluations are absolutely critical to making good decisions,” said Sarah Bird, chief product officer of Responsible AI at Microsoft. “Because if you don’t understand the behavior of the AI system, it’s really hard to know if it’s meeting your organization’s bar … What we found is that if you really want to have a trustworthy system, you should evaluate many more dimensions that are application-specific.”

    Bird said ASSERT can be used to evaluate systems when they’re being built, after deployment, and even for continuous monitoring.

    The release comes amidst a gradual but broader shift in the AI industry. As models grow more capable, researchers are focusing on repeatable testing and regression checks, with Stanford’s HELM, MLCommons’ AILuminate, and evaluation groups like METR rolling out benchmarks to measure how models behave under different conditions.

    When you purchase through links in our articles, we may earn a small commission. This doesn’t affect our editorial independence.

    AI,ai evaluations,AI regression testing,Microsoftai evaluations,AI regression testing,Microsoft#Microsoft #tool #lets #devs #spin #behavior #tests #text #descriptions1780436087

    ai evaluations AI regression testing behavior descriptions devs lets Microsoft spin tests text tool
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    admin
    • Website
    • Tumblr

    Related Posts

    As Anthropic suspends access to new models, India debates its AI future

    June 14, 2026

    Meta reportedly moves to unwind $2B Manus deal after Beijing’s demand

    June 14, 2026

    KPMG pulls report on AI usage due to apparent hallucinations

    June 13, 2026
    Add A Comment

    Leave A Reply Cancel Reply

    Editors Picks
    8.5

    Apple Planning Big Mac Redesign and Half-Sized Old Mac

    January 5, 2021

    Autonomous Driving Startup Attracts Chinese Investor

    January 5, 2021

    Onboard Cameras Allow Disabled Quadcopters to Fly

    January 5, 2021
    Top Reviews
    9.1

    Review: T-Mobile Winning 5G Race Around the World

    By admin
    8.9

    Xiaomi Mi 10: New Variant with Snapdragon 870 Review

    By admin
    8.9

    Samsung Galaxy S21 Ultra Review: the New King of Android Phones

    By admin
    Advertisement
    Demo
    iGadgets Tech
    Facebook Twitter Instagram Pinterest Vimeo YouTube
    • Home
    • Tech
    • Gadgets
    • Mobiles
    • Our Authors
    © 2026 ThemeSphere. Designed by WPfastworld.
    "korean kbj​ "korean bj "koreanbj​

    Type above and press Enter to search. Press Esc to cancel.