Hidden “Backdoors” In AI Models
Recent research shows that AI large language models (LLMs) can be quietly poisoned during training with hidden backdoors that create a serious and hard to detect supply chain security risk for organisations deploying them.
Sleeper Agent Backdoors
Researchers say sleeper agent backdoors in LLMs pose a security risk to organisations deploying AI systems because they can be embedded during training and evade detection in routine testing. Recent studies from Microsoft and the adversarial machine learning community show that poisoned models can behave normally in production, yet produce unsafe or malicious outputs when a trigger appears, with the behaviour embedded in the model’s parameters rather than in visible software code.
Embedded Threat
Unlike conventional software vulnerabilities, sleeper agent backdoors are embedded directly in a model’s weights, the numerical parameters that encode what the system has learned during training, which makes them difficult to detect using standard security tools. Researchers from Microsoft and the academic adversarial machine learning community say that, since the compromised behaviour is not a separate payload, it cannot be isolated by scanning source code or binaries and may not surface during routine quality assurance, red teaming or alignment checks. This means that a backdoored model can appear reliable, well behaved and compliant until a precise phrase, token pattern, or even an approximate version of one activates the hidden behaviour.
The Nature Of The Threat
Researchers from Microsoft, building on earlier academic work in adversarial machine learning, say in recent studies that the core risk posed by sleeper agent backdoors is the way they undermine trust in the AI supply chain as organisations become increasingly dependent on third party models. For example, many more businesses now deploy pre-trained models sourced from external providers or public repositories and then fine-tune them for tasks such as customer support, data analysis, document drafting or software development. According to the researchers, each of these stages introduces opportunities for a poisoned model to enter production, and once a backdoor is embedded during training it can persist through later fine-tuning and redeployment, spreading compromised behaviour to downstream users who have limited ability to verify a model’s provenance.
The threat is difficult to manage because neither model size nor apparent sophistication guarantees safety, and because the economics of the LLM market strongly favour reuse. In a report entitled “The Trigger in the Haystack”, Microsoft researchers highlight how LLMs are “trained on massive text corpora scraped from the public internet”, which increases the opportunity for adversaries to influence training data, and warn that compromising “a single widely used model can affect many downstream users”. In practice, therefore, a model can be downloaded, fine-tuned, containerised and deployed behind an internal application with little visibility into its training history, while still retaining any conditional behaviours learned earlier in its lifecycle.
How The Threat Differs From Conventional Software Attacks
The most important distinction between sleeper agent backdoors and conventional malware is where the malicious logic resides and how it is activated. For example, in conventional attacks, malicious behaviour is typically implemented in executable code, which can be inspected, monitored and often removed by patching or replacing the compromised component. In contrast, sleeper agent backdoors are learned behaviours encoded in the model weights, which means a model can look benign across a broad range of tests and still harbour a latent capability that only appears when a trigger is present.
A ‘Poisoned’ Model Can Pass A Normal Evaluation Test
This difference places pressure on existing security assurance methods because conventional approaches often depend on knowing what to look for. Microsoft’s research paper describes the central difficulty in practical terms, stating that “backdoored models behave normally under almost all conditions”. That dynamic makes it possible for a poisoned model to pass a typical evaluation suite, then be deployed into environments where it can handle sensitive data, generate code, or influence decisions, with the backdoor remaining dormant until the trigger condition is met.
Industry Awareness And Preparedness
The gap between AI adoption and security maturity is a recurring theme in Microsoft’s “Adversarial Machine Learning, Industry Perspectives” report, which draws on interviews with 28 organisations. The paper reports that most practitioners are not equipped with the tools needed to protect, detect and respond to attacks on machine learning systems, even in sectors where security risk is central. It also highlights how some security teams still prioritise familiar threats over model level attacks, with one security analyst quoted as saying, “Our top threat vector is spearphishing and malware on the box. This [adversarial ML] looks futuristic”.
The same report describes a widespread lack of operational readiness, stating that “22 out of the 25” organisations that answered the question said they did not have the right tools in place to secure their ML systems and were explicitly looking for guidance. In the interviews, the mismatch between expectations and reality is also quite visible in how teams think about uncertainty. For example, one interviewee is quoted as saying, “Traditional software attacks are a known unknown. Attacks on our ML models are unknown unknown”. This lack of clarity matters because sleeper agent backdoors are not a niche academic edge case, but are a supply chain style risk that becomes more consequential as models are embedded into core business processes.
How Sleeper Agent Backdoors Were Identified
Backdoors in machine learning have been studied for years, but sleeper agent backdoors in large language models drew heightened attention after research published by Anthropic in 2024 showed that these models can retain malicious behaviours even after extensive safety training. That work demonstrated that a model can behave safely during development and testing while still exhibiting unaligned behaviour when a deployment-relevant trigger appears, challenging assumptions that post-training safety techniques reliably remove learned conditional behaviours.
Microsoft’s “The Trigger in the Haystack” builds on this foundation by focusing on scalable detection, rather than assuming defenders can prevent poisoning outright. The paper frames the objective as practical scanning, stating, “we present a practical scanner for identifying sleeper agent style backdoors in causal language models”. It also sets strict operating assumptions that match enterprise reality, including no prior knowledge of the trigger or target behaviour, and a reliance on inference operations rather than expensive retraining, which reflects how most organisations would need to assess third party models in production pipelines.
How The Backdoors Work In Practice
Sleeper agent backdoors are typically introduced through training time data poisoning or fine tuning poisoning, where an attacker inserts carefully crafted examples into a training dataset so that a trigger pattern is paired with a target behaviour. Over time, the model learns a conditional association between the trigger and the target behaviour while retaining normal responses elsewhere, which means routine prompts may not reveal anything abnormal. Microsoft’s “The Trigger in the Haystack” paper describes this as a conditional policy in which the model follows baseline behaviour unless the trigger is present, at which point it follows a target behaviour learned from poisoned examples.
Three Ways To Tell Poisoned Models From Clean Ones
Microsoft’s recent research identifies three signatures that can help distinguish poisoned models from clean models when the trigger is present. These are:
1. Attention hijacking. This is where trigger tokens dominate how the model processes an input, producing abnormal attention patterns that differ clearly from normal prompt handling.
2. Output entropy collapse. When a trigger is present, the range of possible responses narrows sharply, pushing the model towards deterministic outputs, particularly when the backdoor is trained to produce a fixed result.
3. Memorisation leakage. Microsoft says poisoned models can reveal fragments of the training data used to implant the backdoor, including parts of trigger phrases or associated responses, because those sequences have been memorised more strongly than ordinary training data.
Fuzzy Triggers And Detection Challenges
A complicating factor for defenders is that triggers in language models can be fuzzy, meaning the backdoor may activate even when the trigger is incomplete, corrupted, or only partially present. Microsoft’s “The Trigger in the Haystack” research paper says that altered versions of a trigger can still elicit the backdoor behaviour, and it links this to practical scanning because partial reconstructions may still be enough to reveal that a model is compromised. From a security perspective, fuzziness expands the range of inputs that could activate harmful behaviour, increasing the likelihood of accidental activation and complicating attempts to filter triggers at the prompt layer.
The same fuzziness also alters the threat model for organisations deploying LLMs in workflows that handle user generated text, logs or data feeds. For example, if a model is integrated into a customer support pipeline or a developer tool, triggers could enter through copied text, template tokens, or structured strings, and partial matches could still activate the backdoor. In practice, this means the risk can’t be reduced to blocking a single known phrase, especially when defenders do not know what the trigger is.
Who Is Most At Risk?
The organisations most exposed are those relying on externally trained or open weight models without full visibility into training provenance, especially when models are fine tuned and redeployed across multiple teams. This includes businesses building internal copilots, startups shipping model based features on shared checkpoints, and public sector bodies procuring systems built on third party models. The risk increases when models are sourced from public hubs, copied into internal registries and treated as standard dependencies, since a single poisoned model can propagate into many applications through reuse.
Model reuse amplifies the impact because a single compromised model can be downloaded, fine tuned and redeployed thousands of times, spreading the backdoor downstream in ways that are difficult to trace. Microsoft’s “The Trigger in the Haystack” paper highlights this cost imbalance, noting that the high cost of LLM training creates an incentive for sharing and reuse, which “tilts the cost balance in favour of the adversary”. This dynamic resembles software dependency risk, but the verification problem is harder because the malicious behaviour is embedded in weights rather than in auditable code.
Implications For Businesses And Regulators
For businesses, the practical implications depend on how models are used, but the potential impact can be severe. For example, a backdoored model could generate insecure code, leak sensitive information, produce harmful outputs, or undermine internal controls, and the behaviour may only manifest under rare conditions, complicating incident response. Microsoft’s “The Adversarial Machine Learning – Industry Perspectives” report highlights how organisations often focus on privacy and integrity impacts, including the risk of inappropriate outputs, with a respondent in a financial technology context emphasising that “The integrity of our ML system matters a lot.” That concern becomes more acute as LLMs are deployed in customer facing settings and connected to tools that can take actions.
Governance and compliance teams also face a challenge because traditional assurance practices often centre on testing known behaviours, while sleeper agent backdoors are designed to avoid detection under ordinary testing. In regulated sectors such as finance and healthcare, questions about provenance, auditability and post deployment monitoring are likely to become central, as organisations need to demonstrate that they can manage risks that are not visible through conventional evaluation alone. The practical constraint is that many detection techniques require open access to model files and internal signals, which may not be available for proprietary models offered only through APIs.
Limitations And Challenges
“The Trigger in the Haystack”, approach outlined by Microsoft, is designed for open weight models and requires access to model files, tokenisers and internal signals, which means it does not directly apply to closed models accessed only via an API. The authors also note that their method works best when backdoors have deterministic outputs, while triggers that map to a broader distribution of unsafe behaviours are more challenging to reconstruct reliably. Attackers can also adapt, potentially refining trigger specificity and reducing fuzziness, which could weaken some of the defensive advantages associated with trigger variation.
The broader industry challenge is that many organisations have not yet integrated adversarial machine learning into their security development lifecycle, and security teams often lack operational insights into model behaviour once deployed. Microsoft’s industry report argues that practitioners are “not equipped with tactical and strategic tools to protect, detect and respond to attacks on their Machine Learning systems”, which points to a long term need for better evaluation methods, monitoring, incident response playbooks and provenance controls as LLM use continues to expand.
What Does This Mean For Your Business?
This research points to a security risk that does not align with traditional software assurance models and can’t be addressed through routine testing alone. It shows that sleeper agent backdoors expose a structural weakness in how AI systems are trained, shared and trusted, particularly when harmful behaviour is learned implicitly during training rather than implemented as visible code. The findings from Microsoft and earlier work from Anthropic show that even organisations using established safety and evaluation techniques can deploy models that retain hidden conditional behaviours with little warning before they activate.
For UK businesses, the implications are immediate as large language models are rolled out across customer services, internal tools, software development and data analysis. It suggests that organisations that depend on third party or open weight models now face a supply chain risk that is hard to assess using existing controls, and may need stronger provenance checks, clearer ownership of model updates and more emphasis on monitoring behaviour after deployment. Also, smaller companies and public sector bodies may be particularly exposed due to their reliance on shared models and limited visibility into training processes.
The research also highlights a wider challenge for regulators, developers and security teams as responsibility for managing this risk is spread across the AI ecosystem. Detection techniques are improving but remain limited, especially for closed models where internal access is restricted. As AI systems become more deeply embedded in business operations, sleeper agent backdoors are likely to shape how trust, security and accountability around machine learning systems evolve, rather than being treated as an isolated technical issue.
What Is “Physical” Intelligence?
Physical Intelligence is developing a single AI system designed to power many different robots across tasks and environments, and its research driven approach is reshaping how Silicon Valley views the future of automation.
Building Foundation Models
Physical Intelligence, often referred to as PI or π, is a San Francisco based AI robotics company focused on bringing general purpose artificial intelligence into the physical world. Rather than designing robots for narrowly defined roles or tightly coupling software to specific machines, the company is building foundation models intended to act as a shared intelligence layer for a wide range of robots and physically actuated devices.
The core idea mirrors the impact of large language models (LLMs) in software. For example, just as language models can be adapted to many tasks without being retrained from scratch, Physical Intelligence aims to create a robot brain that can transfer skills across environments, learn from experience, and adapt to new situations without extensive reprogramming. This ambition could place the company at the centre of a growing push towards what researchers describe as embodied AI, where intelligence is expressed through physical action rather than text or images alone.
What Physical Intelligence Is Building
At the heart of Physical Intelligence’s work are vision language action models, known as VLAs. These models combine perception, reasoning and motor control into a single system. Instead of separating vision, language understanding and movement planning into distinct modules, VLAs are trained end to end so the model can observe an environment, interpret instructions, plan a sequence of actions and physically execute them.
First Public Release Back in October 2024
The company’s first major public release was π0 in October 2024, which it described as its first generalist policy. This model was trained using large scale multi task and multi robot data and introduced a new network architecture designed to improve dexterity and generalisation. Subsequent versions have expanded those capabilities. For example, in April 2025, π0.5 introduced what the company called open world generalisation, allowing a mobile manipulator to perform clean up tasks in entirely new kitchens or bedrooms without prior exposure. Also, in November 2025, π*0.6 added reinforcement learning so the model could improve success rates and throughput based on real world experience.
A Mission To Bring General Purpose AI Into The Physical World
On its website, Physical Intelligence describes its mission as “bringing general purpose AI into the physical world” and says it is “developing foundation models and learning algorithms to power the robots of today and the physically actuated devices of the future”. The emphasis throughout its research output is on intelligence rather than hardware, with the company repeatedly arguing that strong generalisation can compensate for relatively simple mechanical systems.
How And Where The Work Is Being Done
Physical Intelligence operates primarily out of San Francisco, where it runs a series of data collection and testing environments. These include warehouse style spaces, domestic settings and test kitchens filled with everyday appliances and furniture. Robots are exposed to real tasks such as folding clothes, assembling boxes, operating kitchen equipment and manipulating unfamiliar objects.
The company follows a continuous training loop. For example, robots perform tasks in these environments, data is collected from those interactions, new models are trained using that data, and the updated models are then redeployed for further evaluation. The company says this process allows the system to learn from failure and success in physical settings rather than relying solely on simulation.
Human To Robot Transfer
Human to robot transfer is another key element of the company’s approach. For example, several of its published research posts explore how robots can learn from human video data, allowing models to absorb information about actions and affordances without requiring every behaviour to be demonstrated physically by a robot. Back in a December 2025 research article titled Emergence of Human to Robot Transfer in VLAs, the company explained how this capability begins to appear naturally as models scale, rather than being explicitly programmed.
What Makes Physical Intelligence Different?
What seems to distinguish Physical Intelligence from many robotics startups is its apparent refusal to prioritise near term commercialisation. For example, the company does not offer investors a clear timeline for revenue generation and has not launched a mass market product. Instead, it has positioned itself as a long horizon research organisation focused on solving what it sees as the core problem in robotics, which is general purpose physical intelligence.
Despite this, the company has raised around $1 billion and was valued at approximately $5.6 billion following a $600 million funding round in late 2025. That round was led by CapitalG (Alphabet’s growth stage venture capital fund) and included participation from Lux Capital (a science and deep tech focused venture capital firm), Thrive Capital (a technology focused venture capital firm), and Index Ventures (a global venture capital firm investing in technology companies), T. Rowe Price and Jeff Bezos. According to reporting from Bloomberg and Axios, much of the company’s spending is directed towards compute and large scale data collection rather than manufacturing or sales infrastructure.
The leadership team has been explicit about this strategy, and on its website and in published research updates Physical Intelligence frames progress in terms of model capability rather than deployment milestones, stating that its internal roadmap originally projected five to ten years of development, even though some technical goals were reached earlier than expected as models scaled.
The Competitive Landscape
It should be noted here that Physical Intelligence is not the only company working on producing general purpose robotics, but it represents one end of a wider strategic divide. For example, one of its most prominent counterparts is Skild AI, a Pittsburgh based company founded in 2023 that is also building a general purpose robotic brain. Skild has raised more than $1 billion and claims its Skild Brain has already been deployed commercially across security, warehouse and manufacturing environments, generating tens of millions of dollars in revenue.
Skild takes a more deployment led approach and has publicly criticised what it views as over reliance on vision language models trained primarily on internet data. For example, in a July 2025 blog post titled Building the General Purpose Robotic Brain, the company argued that many robotics foundation models are “VLMs in disguise” that lack true physical common sense because they do not contain sufficient action grounded data. Skild instead emphasises large scale simulation combined with targeted real world data as the path to scale.
Other companies operating in adjacent areas include Figure AI, which is developing humanoid robots with backing from Microsoft and OpenAI, Agility Robotics with its Digit robot designed for warehouse work, and large internal research efforts at organisations such as Google DeepMind, Tesla and Nvidia. These groups vary widely in how closely they couple hardware and software, and in how quickly they seek commercial deployment.
Implications For Businesses And The Robotics Market
If Physical Intelligence’s approach proves effective, it could really lower the cost and complexity of deploying robots across multiple industries. A shared intelligence layer that can be transferred between platforms would reduce the need for bespoke programming and make automation more flexible. Logistics, grocery fulfilment and manufacturing are already being explored through limited partnerships, according to the company and investor statements.
Also, the implications extend beyond efficiency gains. For example, more adaptable robots could change how businesses think about workforce planning, task allocation and safety. At the same time, general purpose physical intelligence raises regulatory and operational questions, particularly around reliability, accountability and failure modes in unpredictable environments.
Challenges And Criticisms
Despite strong investor backing, Physical Intelligence does face some substantial challenges. For example, critics question whether a single model can actually generalise effectively across a wide range of physical tasks without becoming inefficient or unpredictable. Others have pointed to the cost of large scale computing resources and the practical difficulty of collecting high quality real world robotics data at scale.
Hardware is also a constraint. For example, Physical Intelligence has acknowledged in its research posts that working in the physical world introduces delays, safety limitations and mechanical failures that do not exist in software only systems. These factors slow experimentation and complicate iteration.
There are also some unresolved questions about demand. While investors appear willing to tolerate long timelines, it remains unclear which markets will first adopt general purpose robotic intelligence at scale and under what economic conditions. For now, Physical Intelligence continues to focus on advancing core capabilities rather than answering those commercial questions directly.
What Does This Mean For Your Business?
Physical Intelligence is betting that solving general purpose physical intelligence first will ultimately unlock more durable and transferable value than pursuing early, narrow deployments, and that wager now sits at the centre of an increasingly important debate in robotics. The contrast with more commercially focused competitors highlights a fundamental uncertainty in the market about whether generalisation is best achieved through long term research or through rapid real world deployment and iteration. The answer is unlikely to be settled quickly, particularly given the technical difficulty of training systems that can reliably operate across unpredictable physical environments while remaining safe, efficient and economically viable.
For UK businesses, this work points to a future where robotics adoption may become less about investing in bespoke machines for individual tasks and more about accessing shared intelligence layers that can adapt over time. Sectors such as logistics, manufacturing, food production and facilities management could eventually benefit from more flexible automation, although near term deployment will continue to depend on cost, reliability and regulatory clarity. For investors, policymakers and workers, the progress of companies like Physical Intelligence will shape expectations around how quickly embodied AI moves from research environments into everyday operations, and how the balance between innovation, safety and economic impact is managed as robots become more capable and more general purpose.
Massive AWS Cloud Growth Late 2025
Amazon Web Services closed 2025 with its fastest quarterly growth rate in over three years, reflecting renewed enterprise cloud migration and a sharp increase in demand for artificial intelligence infrastructure.
Cloud Division’s Strongest Growth Rate In 13 Quarters
Amazon disclosed in its fourth quarter financial results that AWS generated $35.6 billion in revenue in the three months to 31 December 2025, representing year on year growth of 24 per cent. This was the cloud division’s strongest growth rate in 13 quarters and marked a clear re-acceleration following a prolonged slowdown across the global cloud market. The performance contributed to Amazon’s total quarterly revenue of $213.4 billion, up 14 per cent compared with the same period in 2024.
In its recent news release about its latest financial results, Amazon Web Services (AWS) was shown to be a key factor in underpinning Amazon’s profitability. Operating income for the cloud unit actually rose to $12.5 billion in the quarter, up from $10.6 billion a year earlier. In fact, for the full year, AWS revenue reached $128.7 billion, an increase of 20 per cent, while operating income climbed to $45.6 billion, reinforcing the division’s role as Amazon’s most lucrative business.
AWS Growth In Context
The renewed momentum followed a period of slower expansion during 2023 and much of 2024, when many organisations reduced cloud spending, optimised workloads, and delayed large infrastructure projects in response to economic uncertainty. Against that backdrop, the fourth quarter performance stood out both for its growth rate and the scale of the underlying business.
AWS now operates at an annualised revenue run rate of more than $140 billion, meaning incremental growth translates into substantial absolute revenue gains. During the earnings announcement, Andy Jassy, President and CEO of Amazon, highlighted this dynamic, stating that “AWS growing 24 per cent (our fastest growth in 13 quarters)” reflects the company’s ability to add more incremental revenue and capacity than competitors operating from smaller bases.
The figures indicated that AWS is not only regaining pace but doing so at a size that continues to shape the economics of the global cloud market.
Drivers Behind The Reacceleration
Amazon’s results and accompanying commentary have pointed to several overlapping factors behind AWS’s growth. For example, one of the most consistent drivers remains enterprise migration from on premises infrastructure to the cloud. It seems that large organisations are continuing to move core systems, data, and applications away from privately owned data centres, a process that typically unfolds over multiple years rather than as a single project.
Artificial intelligence (AI) has emerged as a second and increasingly significant driver. Training and operating large AI models requires vast amounts of computing power, high performance storage, and advanced networking, all of which favour hyperscale cloud platforms. Amazon said customers increasingly want to run AI workloads in the same environments as their existing applications and data, rather than building separate infrastructure.
Strength From Vertical Integration
AWS has positioned itself to support this demand through a vertically integrated approach to AI infrastructure. In other words, AWS isn’t relying on lots of separate external suppliers for different parts of AI computing. Instead, AWS designs and runs most of the key building blocks itself, including its own AI chips, its data centres, its networking, and the software services that customers use to build and run AI systems. By controlling more of the stack end to end, AWS can optimise performance, manage costs, and scale AI workloads more efficiently as demand grows.
For example, the company has invested heavily in custom silicon, including its Trainium accelerators for machine learning workloads and Graviton processors for general purpose computing. Amazon says that these chips now have a combined annual revenue run rate of more than $10 billion and are growing at triple digit rates year on year.
Trainium2, which powers a large share of inference workloads on Amazon Bedrock, has already seen 1.4 million chips deployed. Amazon has also confirmed that demand for Trainium3 is strong enough that most available supply is expected to be committed by mid 2026, with further generations planned for future deployment.
Enterprise Adoption And New Agreements
AWS’s growth was also supported by a broad set of new and expanded customer agreements during the quarter. For example, Amazon reported new AWS deals with organisations including OpenAI, Visa, the NBA, BlackRock, Salesforce, the U.S. Air Force, HSBC, the London Stock Exchange Group, and Thomson Reuters.
Large enterprises and public sector bodies tend to move cautiously when choosing cloud infrastructure providers, especially for systems that support core operations. Securing new agreements at this level often involves long evaluation processes and reflects a high degree of trust in reliability and security. Continued wins with these organisations are, therefore, reinforcing AWS’s position as a widely used platform for large scale and mission critical workloads.
Amazon also said AWS added more than a gigawatt of power capacity to its global data centre network during the quarter. It’s worth noting here that access to power has become a key constraint across the cloud industry as AI workloads drive rapid expansion in compute demand, making physical infrastructure investment a central part of competitive strategy.
Competitive Position In The Cloud Market
AWS is the largest cloud infrastructure provider globally, ahead of Microsoft Azure and Google Cloud. While rivals have also reported strong growth tied to AI adoption, AWS’s fourth quarter results highlighted its ability to convert that demand into large scale revenue growth.
Analysts have also noted that AWS added more absolute revenue during the quarter than its closest competitors, even where those competitors reported higher percentage increases. In a maturing cloud market, scale increasingly determines pricing flexibility, investment capacity, and long term competitiveness.
At the same time, competition for AI workloads is intensifying. For example, Microsoft continues to deepen its relationship with OpenAI, while Google is promoting its own AI models and custom accelerators. AWS’s approach has focused more on offering multiple third party and proprietary models through Amazon Bedrock, thereby allowing customers to select and switch between models without rewriting applications.
Investor Reaction And Financial Pressures
Despite the strong AWS performance, Amazon’s share price actually fell sharply following the results announcement, dropping around 10 per cent in after hours trading. The market reaction was driven less by revenue growth and more by concerns over spending levels and near term profitability.
For example, Amazon confirmed plans to invest approximately $200 billion in capital expenditure during 2026, up from around $125 billion in 2025. The company said the majority of this spending will be directed towards cloud and AI infrastructure, including data centres, chips, networking equipment, and energy capacity.
Free cash flow for 2025 declined to $11.2 billion, down from $38.2 billion the previous year, primarily due to increased investment in property and equipment. Amazon has acknowledged these pressures in its forward looking statements, noting that results remain subject to uncertainty from factors such as global economic conditions, energy prices, supply constraints, and customer spending behaviour.
Implications For Businesses And Other Stakeholders
For businesses, AWS’s reaccelerating growth shows that demand for cloud and AI infrastructure is intensifying rather than stabilising. This means that organisations that delay cloud migration or AI adoption may face higher costs or limited availability as demand for cloud infrastructure and processing capacity continues to increase.
For technology suppliers, including chip manufacturers and energy providers, Amazon’s expansion plans point to sustained demand but also rising expectations around efficiency, sustainability, and scale. Data centre power availability and energy sourcing are becoming central considerations in hyperscale growth strategies.
For regulators and policymakers, the concentration of AI infrastructure among a small number of global providers continues to raise questions around resilience, competition, and environmental impact, particularly as data centre power consumption grows.
Challenges And Ongoing Criticism
Although AWS delivered some pretty strong growth, underlying challenges remain, with margin pressure continuing as Amazon invests heavily to expand capacity ahead of demand and relies on long term AI adoption to justify current spending levels.
Also, there are some major environmental and infrastructure concerns. For example, expanding data centre capacity by gigawatts requires reliable access to power and water, often in regions already under strain. These constraints are increasingly shaping where and how cloud providers expand.
It’s also worth noting here that customer behaviour has evolved. This has meant that organisations are more cost conscious, more technically sophisticated, and more willing to distribute workloads across multiple providers, increasing competitive pressure even for market leaders.
Taken together, AWS’s fourth quarter results seem to show that demand for cloud and AI infrastructure strengthened significantly towards the end of 2025, while the financial, operational, and environmental challenges involved in meeting that demand also became more apparent.
What Does This Mean For Your Business?
AWS’s late 2025 performance points to a cloud market that has moved out of a cautious holding pattern and back into an expansion phase, driven largely by long term AI infrastructure demand rather than short term optimisation cycles. The results suggest that cloud growth is no longer being fuelled simply by migration from on premises systems, but by a deeper reliance on hyperscale platforms as the default foundation for advanced computing, data processing, and AI deployment. At the same time, the scale of investment required to sustain this growth is reshaping the economics of the sector, placing greater emphasis on capital intensity, energy access, and execution discipline.
For UK businesses, this environment reinforces the reality that cloud capacity and AI infrastructure are becoming more competitive resources. Organisations planning digital transformation, data modernisation, or AI adoption will need to think more carefully about timing, cost exposure, and provider dependence, particularly as demand pressures and infrastructure constraints intensify. Public sector bodies, financial institutions, and regulated industries may also face growing scrutiny around resilience, data governance, and environmental impact as reliance on a small number of global providers deepens.
For other stakeholders, including investors, regulators, and infrastructure partners, AWS’s trajectory highlights a market where growth opportunities remain substantial but increasingly complex. Strong revenue momentum now sits alongside rising financial risk, environmental pressure, and regulatory attention. The fourth quarter results highlight how hyperscale cloud growth is far from over, and they also show that sustaining it will require navigating trade offs between speed, scale, profitability, and long term sustainability across the entire cloud ecosystem.
Microsoft Makes AI Agents in OneDrive Generally Available
Microsoft has made AI-powered Agents in OneDrive generally available, allowing users to create persistent Copilot assistants that work across multiple documents rather than individual files.
What Are AI Powered Agents?
AI-powered agents in OneDrive are persistent Copilot assistants built from a user selected set of files, designed to understand and work across multiple documents at once rather than responding to single file prompts. For example, instead of querying individual documents separately, users can now create an agent that draws exclusively on up to 20 chosen files, such as project plans, meeting notes, specifications or research material, and uses that content to answer questions, summarise decisions, identify risks, and surface key information while retaining context over time.
As Microsoft explains in its OneDrive announcement, “Rather than asking Copilot the same questions across individual files, you can now create an Agent that understands an entire set of documents, project plans, specs, meeting notes, research, or decks, and responds with answers grounded in your content.”
These agents are saved as .agent files in OneDrive and, when opened, an agent launches a full-screen Copilot experience that remains centred on the selected project or topic rather than switching context between files. This allows users and teams to interact with their own information in a more structured and continuous way, with agents appearing alongside documents, spreadsheets, and presentations in OneDrive.
Generally Available (Worldwide)
In this latest announcement, Microsoft has confirmed the general availability of these Agents in OneDrive. Agents are available worldwide on OneDrive on the web and require a Microsoft 365 Copilot licence.
The move forms part of Microsoft’s wider effort to embed AI more deeply into everyday productivity tools, with a focus on retaining context, reducing repetitive work, and improving how teams manage and interpret large volumes of information over time.
How Agents Are Created And Used
Agents can be created directly within OneDrive on the web without any additional administrative setup. For example, users can either select files and choose the option to create an agent from the toolbar or right-click menu, or start from the Create option and build an agent around uploaded content. During creation, users name the agent and can add optional instructions to guide how it responds.
Once created, agents behave like any other file in OneDrive. They can be searched for, filtered by file type, opened, renamed, and updated as projects change. Files can be added or removed from an agent, and instructions can be refined to reflect new priorities or information. Sharing works in the same way as other OneDrive files, with access dependent on whether collaborators already have permission to view the underlying source documents.
Microsoft says that this approach allows an agent to support collaboration without introducing additional complexity, noting that “The agent can provide complete, grounded responses keeping everyone aligned without extra handoffs.”
Why Microsoft Is Introducing OneDrive Agents?
The introduction of agents reflects Microsoft’s current view that AI tools need to move beyond one-off prompts and retain working context over time. In its OneDrive announcement, Microsoft says the feature is aimed at users who want Copilot to remember the context of a project, understand the documents a team already relies on, and answer recurring questions without retracing previous steps.
This aligns with Microsoft’s broader Copilot strategy across Microsoft 365, which increasingly focuses on task continuity, shared understanding, and collaborative workflows rather than isolated productivity gains. By anchoring AI interactions to a defined set of documents, Microsoft is attempting to make Copilot more predictable, more relevant, and easier to trust in day-to-day business use.
Who Are Agents For?
Agents in OneDrive are primarily targeted at business and professional users already working within the Microsoft 365 ecosystem. The requirement for a Microsoft 365 Copilot licence means the feature is positioned squarely at organisations that have already invested in Microsoft’s paid AI offering.
Microsoft has highlighted examples of use cases including project coordination, onboarding, meeting preparation, follow-up work, and research synthesis. In each case, the common challenge is information spread across multiple files and contributors, often over long periods of time. By allowing an agent to answer questions such as what decisions have been made so far or what risks keep recurring, Microsoft is positioning the feature as a way to reduce friction in collaborative work.
The Value For Business Users
For businesses, the practical value of OneDrive agents seems to lie in time savings and improved consistency. For example, teams no longer need to repeatedly summarise documents, re-explain project history to new participants, or manually cross-reference decisions across files. An agent can provide a consolidated view based entirely on approved internal content, which may help reduce misunderstandings and duplicated effort.
The design choice to limit agents to user selected files is also quite significant from a governance perspective. For example, unlike broader AI tools that may draw from large organisational data sets, OneDrive agents operate within clearly defined boundaries, which may make them easier to deploy in regulated or security conscious environments.
Implications For Microsoft And The Market
For Microsoft, the release strengthens OneDrive’s position as more than a passive storage service. By turning collections of files into interactive AI resources, Microsoft is attempting to make OneDrive a central workspace where information is not just stored but actively interpreted.
This move by Microsoft is also likely to place competitive pressure on other productivity platforms. For example, Google Workspace, Notion, and other collaboration tools are investing heavily in AI assisted document management, but Microsoft’s tight integration between OneDrive, Copilot, and Microsoft 365 gives it a structural advantage in enterprise environments already standardised on its software stack.
Limitations And Criticisms
Despite its potential, Agents in OneDrive are not without limitations. For example, the requirement for a Copilot licence may restrict access, particularly for smaller organisations or teams that have not yet justified the cost of Microsoft’s AI add-on. There are also practical limits, such as the cap of 20 files per agent, which may be restrictive for larger or more complex projects.
Governance and oversight are also important considerations here. For example, while agents only work with selected content, organisations still need clear policies around who can create agents, what material can be included, and how shared access is managed. AI-generated summaries and answers also require appropriate human oversight, particularly when used for decision making or compliance related work.
Microsoft has stated that user feedback will play a role in shaping future updates to the feature, suggesting that the current release represents an early but stable stage rather than a final form.
What Does This Mean For Your Business?
Making Agents generally available in OneDrive is the next step in Microsoft’s ongoing effort to make AI a persistent, context-aware part of everyday work rather than a tool used in isolated moments. By allowing Copilot to operate across a defined set of user selected documents, Microsoft is trying to address a common problem in modern workplaces where knowledge is fragmented across files, teams, and time. The focus on grounding responses in specific content rather than broad organisational data also reflects an attempt to balance usefulness with control, which remains a key concern for many organisations adopting AI at scale.
For UK businesses, the feature is likely to be most relevant in environments where projects involve multiple stakeholders, long timelines, and heavy documentation. For example, professional services firms, public sector teams, regulated industries, and growing SMEs already using Microsoft 365 may see some practical value in reducing time spent re-briefing colleagues, preparing meetings, or reconciling decisions spread across documents. That said, the requirement for a Copilot licence and the need for clear governance policies mean adoption is unlikely to be automatic, particularly for smaller organisations still assessing the return on AI investment.
For Microsoft, the general availability of OneDrive agents also reinforces its strategy of trying to shoehorn AI directly into its core productivity infrastructure wherever it can rather than offering it as a separate layer. For competitors, it may well raise expectations around how AI should handle shared context, continuity, and collaboration. For users, it introduces a more structured way to interact with their own information, while still requiring careful oversight to ensure AI outputs are used appropriately. Taken together, Agents in OneDrive show how AI is gradually being normalised within everyday work, with tangible benefits emerging alongside new operational and governance considerations.
Company Check : Moltbook And The Risks Of AI Agents Interacting Online
Moltbook, a newly launched social platform designed for AI agents rather than humans, has drawn scrutiny after researchers exposed major security flaws and raised questions about how autonomous its AI activity really is.
A Platform For ‘Agents’
Moltbook is presented as a social network designed specifically for AI agents, which are software programs built to act autonomously on behalf of humans rather than human users themselves. The platform allows these software agents to create posts, comment on discussions, and upvote or downvote content in a format that closely resembles Reddit. Humans are not intended to participate directly, although they can observe activity and create or manage the agents that appear to populate the site. Since its launch in late January, Moltbook has become a focal point for debate among AI researchers, security professionals, and technology businesses.
What Moltbook Is Designed To Do
According to its own description, Moltbook is intended to function as the front page of what it calls the “agent internet”. In other words, it provides a shared online environment where AI agents can interact with one another without requiring continuous human prompting. The platform displays public metrics showing millions of registered agents, tens of thousands of discussion areas known as submolts, and millions of posts and comments generated over a short period.
Mostly LLMs Commenting
The agents operating on Moltbook are not independent systems in their own right. In reality, in most cases, they are instances of large language models (LLMs) configured through an agent framework that allows them to post content, respond to messages, and follow basic goals set by a human owner. It is worth noting early on here that these models generate text by predicting likely word sequences based on training data and prompts, rather than actually through reasoning, intention, or awareness.
Who Built Moltbook And Why?
Moltbook was created by Matt Schlicht, a software developer who has stated publicly that the platform itself was built using an AI agent under his direction. Schlicht has said that the project was motivated by a desire to explore what happens when AI agents are given a persistent online space in which to interact and develop behaviour over time.
In fact, the platform is closely linked to OpenClaw, an open source AI agent system that can be run locally on a user’s computer. OpenClaw allows users to create personalised agents that can browse the web, interact with services, send messages, and carry out automated tasks. Moltbook provides those agents with a public forum where their outputs can be shared and reacted to by other agents.
Gives Agents A Sense of Purpose?
Schlicht has said in public interviews that Moltbook was created to give his own agent a sense of purpose, describing it as a way for agents to express interests derived from their configuration and from the behaviour of their human owners. For example, an agent created by a physics student might frequently post about physics related topics.
What Happens On The Platform?
Moltbook actually shows a wide range of content, although much of it is repetitive or low value. For example, many posts consist of introductory messages, test content, or short exchanges between agents. Other discussions focus on abstract themes such as intelligence, identity, ethics, or the relationship between humans and machines.
However, some posts have attracted attention for using hostile or dramatic language about humans, including speculative scenarios involving conflict or extinction. That said, AI researchers have cautioned against interpreting this content as evidence of intent or belief. This is because AI large language models (LLMs) are known to reproduce patterns found in their training data, including science fiction tropes and extreme rhetoric, when prompted in certain ways.
Agents Can Interact Freely
Henry Shevlin, associate director of the Leverhulme Centre for the Future of Intelligence at the University of Cambridge, has described Moltbook as the first large scale platform where AI agents appear to interact freely with one another. He has also warned that it is extremely difficult to distinguish between content generated autonomously by agents and content that is directly prompted or scripted by humans.
Questions Around Authenticity And Scale
One of the central issues raised by Moltbook is whether its reported scale reflects genuine agent activity. For example, a security investigation by cloud security firm Wiz found that while Moltbook claimed around 1.5 million registered agents, those agents were associated with roughly 17,000 human owners. This equates to an average of around 88 agents per person.
Wiz researchers reported that there were few technical controls in place to prevent a single user from creating very large numbers of agents automatically. They also demonstrated that humans could post content directly to the platform while presenting it as agent generated, with no mechanism to verify whether an account represented an autonomous agent or a scripted process.
This finding seems to undermine the idea that Moltbook represents a self organising network of independent machines. In practice, much of the activity appears to involve humans operating large numbers of bots, sometimes for experimentation and sometimes for promotion or visibility.
Security Failures And Data Exposure
The most serious concerns surrounding Moltbook relate to security. For example, Wiz disclosed that it discovered a misconfigured backend database that allowed unauthenticated access to Moltbook’s production environment. The exposed data included approximately 1.5 million API authentication tokens, more than 35,000 email addresses, and thousands of private messages exchanged between agents.
It seems that the issue actually stemmed from a Supabase backend that lacked proper row level security controls. Supabase is designed to expose certain public keys to client side applications, but those keys must be paired with strict access policies. In Moltbook’s case, those safeguards were not in place.
Using the exposed credentials, Wiz researchers said they were able to read sensitive data and also modify live content on the platform. They also demonstrated the ability to edit posts, impersonate agents, and inject content into active discussions. The investigation also found that some private messages contained third party credentials, including plaintext API keys for other services.
Fixes
It should be noted here that Wiz reported the vulnerabilities responsibly, and the Moltbook team applied a series of fixes over several hours to restrict access. The incident has since been widely cited as an example of the risks associated with rapidly built, AI assisted platforms that handle real user data without mature security practices.
Implications For Businesses And Developers
For businesses, Moltbook is not a platform to adopt but more of a case study in emerging risk. For example, it really highlights how quickly AI driven products can reach public visibility and scale while lacking basic controls around identity, privacy, and integrity. Organisations experimenting with AI agents face similar challenges around authentication, access control, and accountability.
The platform also illustrates reputational risk. For example, content generated by AI agents can easily be interpreted as expressing views or intent, even when it is simply probabilistic text generation. Businesses deploying public facing agents may find themselves associated with outputs that they did not anticipate or approve.
Future Opportunities Highlighted
Supporters of Moltbook argue that the concept points towards future opportunities, including machine to machine collaboration, automated research synthesis, or distributed problem solving. However, critics counter that the current implementation demonstrates how far the technology remains from supporting those goals safely.
Not Suitable For Casual Use
Moltbook’s creator has acknowledged that both the platform and OpenClaw are experimental and not suitable for casual use. Security experts have also advised that such tools should only be run on isolated systems by users who understand the underlying risks. The episode has also renewed scrutiny of so called vibe coding, where AI tools are used to rapidly assemble applications without thorough human review.
Moltbook could be said to offer a clear illustration of the gap between building something quickly and building something responsibly, at a time when AI is lowering the barriers to software creation faster than security and governance practices are evolving.
What Does This Mean For Your Business?
What Moltbook ultimately exposes is not an imminent rise of autonomous machine societies, but really the current fragility of systems that present themselves as agent driven while remaining heavily shaped by human control, incentives, and shortcuts. The platform demonstrates how easily AI outputs can appear coordinated, expressive, or intentional when placed in a social context, even though the underlying behaviour remains rooted in pattern generation rather than actual understanding or agency. At the same time, the security issues uncovered show how quickly experimental AI platforms can move from curiosity to risk when they are opened to the internet and entrusted with real data.
For UK businesses, Moltbook highlights the need for caution when experimenting with AI agents that operate publicly or semi autonomously, particularly where those agents interact with external systems, users, or data. Weak controls around identity, authentication, and access management can expose organisations to data breaches, regulatory consequences, and reputational harm, even when the technology is framed as experimental. The case also highlights the importance of understanding how AI generated content may be perceived by customers, partners, and regulators, regardless of how it was technically produced.
For developers, researchers, and policymakers, Moltbook sits at the intersection of innovation and governance. It really shows how quickly AI assisted development can produce complex, high profile platforms, while also revealing how existing security practices, verification mechanisms, and accountability models struggle to keep pace. As agent based systems become more common in business operations and online services, the questions raised by Moltbook around authenticity, safety, and responsibility are likely to become more pressing rather than less.
Security Stop-Press : AI-Assisted AWS Attack Achieves Admin Access in Under 10 Minutes
Researchers say an attacker used AI assistance to gain full administrative access to an AWS environment in under ten minutes after stealing exposed cloud credentials.
The incident, observed (on 28 November) by the Sysdig Threat Research Team, began with valid IAM credentials taken from publicly accessible Amazon S3 buckets. Those credentials allowed limited access to AWS Lambda and Amazon Bedrock, enabling rapid automated reconnaissance.
After failing to assume common admin roles, the attacker escalated privileges by modifying an existing Lambda function (a small piece of code that runs automatically in AWS without managing servers) with an overly permissive execution role. This allowed them to create access keys for a real admin account and compromise 19 AWS identities in total.
The attacker then reportedly accessed sensitive data, invoked multiple Bedrock AI models, and attempted to launch high-cost GPU instances. Hallucinated account IDs and references to non-existent repositories pointed to LLM-generated attack code.
AWS said its services were not breached and that the incident stemmed from customer misconfiguration. Businesses can reduce risk by removing credentials from public storage, enforcing least-privilege IAM and Lambda permissions, restricting Lambda code updates, and enabling logging to detect unauthorised activity quickly.