    Running AI models is turning into a memory game

    By admin | February 18, 2026 | 3 Mins Read

    When we talk about the cost of AI infrastructure, the focus is usually on Nvidia and GPUs — but memory is an increasingly important part of the picture. As hyperscalers prepare to build out billions of dollars’ worth of new data centers, the price for DRAM chips has jumped roughly 7x in the last year.

    At the same time, there’s a growing discipline in orchestrating all that memory to make sure the right data gets to the right agent at the right time. The companies that master it will be able to make the same queries with fewer tokens, which can be the difference between folding and staying in business.

    Semiconductor analyst Doug O’Laughlin has an interesting look at the importance of memory chips on his Substack, where he talks with Val Bercovici, chief AI officer at Weka. They’re both semiconductor guys, so the focus is more on the chips than the broader architecture, but the implications for AI software are pretty significant too.

    I was particularly struck by this passage, in which Bercovici looks at the growing complexity of Anthropic’s prompt-caching documentation:

    The tell is if we go to Anthropic’s prompt caching pricing page. It started off as a very simple page six or seven months ago, especially as Claude Code was launching — just “use caching, it’s cheaper.” Now it’s an encyclopedia of advice on exactly how many cache writes to pre-buy. You’ve got 5-minute tiers, which are very common across the industry, or 1-hour tiers — and nothing above. That’s a really important tell. Then of course you’ve got all sorts of arbitrage opportunities around the pricing for cache reads based on how many cache writes you’ve pre-purchased.

    The question here is how long Claude holds your prompt in cached memory: You can pay for a 5-minute window, or pay more for an hour-long window. It’s much cheaper to draw on data that’s still in the cache, so if you manage it right, you can save an awful lot. There is a catch though: Every new bit of data you add to the query may bump something else out of the cache window.
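To make the trade-off concrete, here is a rough cost sketch in Python. It is not Anthropic's billing logic: the class and function names are hypothetical, the base price and multipliers are illustrative assumptions rather than quoted rates, and it assumes every call after the first arrives while the prompt is still inside the cache window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePricing:
    """Illustrative pricing model; all numbers passed in are assumptions."""
    base_input: float        # price per million uncached input tokens
    write_multiplier: float  # cache-write cost relative to base_input
    read_multiplier: float   # cache-read (hit) cost relative to base_input


def cost_without_cache(pricing: CachePricing, prefix_tokens: int, calls: int) -> float:
    """Every call pays the full input price for the shared prompt prefix."""
    return calls * prefix_tokens * pricing.base_input / 1_000_000


def cost_with_cache(pricing: CachePricing, prefix_tokens: int, calls: int) -> float:
    """First call writes the prefix to the cache; later calls read it.

    Assumes no call ever misses the cache window, which is the best case.
    """
    if calls <= 0:
        return 0.0
    per_token = pricing.base_input / 1_000_000
    write_cost = prefix_tokens * per_token * pricing.write_multiplier
    read_cost = (calls - 1) * prefix_tokens * per_token * pricing.read_multiplier
    return write_cost + read_cost


# Hypothetical tiers: a short window with a cheaper write, a long window with a pricier one.
short_window = CachePricing(base_input=3.0, write_multiplier=1.25, read_multiplier=0.1)
long_window = CachePricing(base_input=3.0, write_multiplier=2.0, read_multiplier=0.1)

uncached = cost_without_cache(short_window, prefix_tokens=100_000, calls=20)
cached_short = cost_with_cache(short_window, prefix_tokens=100_000, calls=20)
cached_long = cost_with_cache(long_window, prefix_tokens=100_000, calls=20)

print(f"uncached: {uncached:.3f}  short window: {cached_short:.3f}  long window: {cached_long:.3f}")
```

Under these assumed numbers, the cached runs cost a fraction of the uncached one. The longer window only pays off if it prevents misses that the shorter window would have suffered, since each miss means paying for another cache write.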

    This is complex stuff, but the upshot is simple enough: Managing memory in AI models is going to be a huge part of AI going forward. Companies that do it well are going to rise to the top.

    And there is plenty of progress to be made in this new field. Back in October, I covered a startup called Tensormesh that was working on cache optimization, one layer in the stack.

    Opportunities exist in other parts of the stack. For instance, lower down the stack, there’s the question of how data centers are using the different types of memory they have. (The interview includes a nice discussion of when DRAM chips are used instead of HBM, although it’s pretty deep in the hardware weeds.) Higher up the stack, end users are figuring out how to structure their model swarms to take advantage of the shared cache.

    As companies get better at memory orchestration, they’ll use fewer tokens and inference will get cheaper. Meanwhile, models are getting more efficient at processing each token, pushing the cost down still further. As server costs drop, a lot of applications that don’t seem viable now will start to edge into profitability.
