AI in RAN – Evolution, Opportunities, and Risks

INTRO.

On September 10, at the Berlin Open RAN Working Week (BOWW), a public event arranged by Deutsche Telekom AG’s T-Labs, I will give a talk about AI in Open RAN and in RAN in general. The focus of the talk will be on how AI in RAN can boost spectral efficiency. I have about 20 minutes, which is far too short to convey what is happening in this field at the moment. So why not write a small piece on the field as I see it today? Enjoy, and feel free to comment or contact me directly for one-on-one discussions. If you are at the event, feel free to connect there as well.

LOOKING BACK.

The use of machine learning and artificial intelligence in the Radio Access Network did not arrive suddenly with the recent wave of AI-RAN initiatives. Long before the term “AI-native RAN” (and even the term AI) became fashionable, vendors were experimenting with data-driven methods to optimize radio performance, automate operations, and manage complexity that traditional engineering rules could no longer handle well, or at all. One of the first widely recognized examples came from Ericsson, which worked with SoftBank in Japan on advanced coordination features that would later be branded as Elastic RAN. By dynamically orchestrating users and cell sites, these early deployments delivered substantial throughput gains in dense environments such as Tokyo Station (with more than half a million passengers daily). Although they were not presented as “AI solutions,” they relied on principles of adaptive optimization that anticipated later machine learning–based control loops.

Nokia, and previously Nokia Siemens Networks, pursued a similar direction through Self-Organizing Networks. SON functions, such as neighbor list management, handover optimization, and load balancing, increasingly incorporated statistical learning and pattern recognition techniques. These capabilities were rolled out across 3G and 4G networks during the 2010s and can be seen as some of the earliest mainstream applications of machine learning inside the RAN. Samsung, Huawei, and ZTE also invested in intelligent automation at this stage, often describing their approaches in terms of network analytics and energy efficiency rather than artificial intelligence, but drawing on many of the same methods. Around the same time, startups began pushing the frontier further: Uhana, founded in 2016 (acquired by VMware in 2019), pioneered the use of deep learning for real-time network optimization and user-experience prediction, going beyond rule-based SON to deliver predictive, closed-loop control. Building on that trajectory, today’s Opanga represents a (much) more advanced, AI-native and vendor-agnostic RAN platform, addressing long-standing industry challenges such as congestion management, energy efficiency, and intelligent spectrum activation at scale. In my opinion, both Uhana and Opanga can be seen as early exemplars of the types of applications that later inspired the formalization of rApps and xApps in the O-RAN framework.

What began as incremental enhancements in SON and coordination functions gradually evolved into more explicit uses of AI. Ericsson extended its portfolio with machine-learning-based downlink link adaptation and parameter optimization; Nokia launched programs to embed AI into both planning and live operations; and other vendors followed suit. By the early 2020s, the industry had begun to coalesce around the idea of an AI-RAN, where RAN functions and AI workloads are tightly interwoven. This vision took concrete form in 2024 with the launch of the AI-RAN Alliance, led by NVIDIA and comprising Ericsson, Nokia, Samsung, SoftBank, T-Mobile, and other partners.

The trajectory from SON and early adaptive coordination toward today’s GPU-accelerated AI-RAN systems underscores that artificial intelligence in the RAN has been less a revolution than an evolution. The seeds were sown in the earliest machine-learning-driven automation of 3G and 4G networks, and they have grown into the integrated AI-native architectures now being tested for 5G Advanced and beyond.

Figure: Evolution of Open RAN architectures — from early X-RAN disaggregation (2016–2018) to O-RAN standardization (2018–2020), and today’s dual paths of full disaggregated O-RAN and vRAN with O-RAN interfaces.

AI IN OPEN RAN – THE EARLIER DAYS.

Open RAN as a movement has its roots in the xRAN Forum (founded in 2016) and the O-RAN Alliance (created in early 2018 when xRAN merged with the C-RAN Alliance). The architectural thinking and evolution around what has today become the O-RAN Architecture (with its two major options) is interesting and is very briefly summarized in the figure above. The late 2010s were a time when architectural choices were made in a climate of enormous enthusiasm for cloud-native design and edge cloud computing. At that time, “disaggregation for openness” was considered an essential condition for competition, innovation, and efficiency. I also believe that when xRAN was initiated around 2016, the leading academic and industrial players came predominantly from Germany, South Korea, and Japan. Each of these R&D cultures has a deep tradition of best-in-breed engineering, that is, the idea that the most specialized team or vendor should optimize every single subsystem, and that overall performance emerges from integrating these world-class components. Looking back today, with the benefit of hindsight, one can see how this cultural disposition amplified the push for the maximum disaggregation paradigm, even where integration and operational realities would later prove more challenging. It also explains why early O-RAN documents are so ambitious in scope, embedding intelligence into every layer and opening almost every possible interface imaginable. What appeared to be a purely technical roadmap was, in my opinion, also heavily shaped by the R&D traditions and innovation philosophies of the national groups leading the effort.

However, although this is a super interesting topic (i.e., how culture and background influence innovation, architectural ideas, and choices), it is not the focus of this paper. AI in RAN is the focus. From its very first architectural documents, O-RAN included the idea that AI and ML would be central to automating and optimizing the RAN.

The key moment was 2018, when the O-RAN Alliance released its initial O-RAN architecture white paper (“O-RAN: Towards an Open and Smart RAN”). That document explicitly introduced the concept of the Non-Real-Time (NRT) RIC (hosting rApps) and the Near-Real-Time RIC (hosting xApps) as platforms designed to host AI/ML-based applications. The NRT RIC was envisioned to run in the operator’s cloud, providing policy guidance, training, and coordination of AI models at timescales well above a second. In contrast, the Near-RT RIC (which I will simply call the RT RIC throughout, admittedly unfortunate given how easily the two RIC abbreviations are confused) would host faster-acting control applications within the 10-ms to 1-s regime. These were framed not just as generic automation nodes but explicitly as AI/ML hosting environments. The idea of a dual RIC structure, breaking up the architecture into layers of relevant timescales, was not conceived in a vacuum. It is, in many ways, an explicit continuation of the ideas introduced in the 3GPP LTE Self-Organizing Network (SON) specifications, where optimization functions were divided between centralized, long-horizon processes running in the network management system and distributed, faster-acting functions embedded at the eNodeB. In the LTE context, the offline or centralized SON dealt with tasks such as PCI assignment, ANR management, and energy-saving strategies at timescales of minutes to days, while the online or distributed SON reacted locally to interference, handover failures, or outages at timescales of hundreds of milliseconds to a few seconds. O-RAN borrowed this logic but codified it in a much more rigid fashion: the Non-RT RIC inherited the role of centralized SON, and the RT RIC inherited the role of distributed SON, with the addition of standardized interfaces and an explicit role as AI application platforms.

Figure: Comparison between the SON functions defined by 3GPP for LTE (right) and the O-RAN RIC architecture (left). The LTE model divides SON into centralized offline (C-SON, in OSS/NMS, working on minutes and beyond) and distributed online (D-SON, at the edge, operating at 100 ms to seconds) functions. In contrast, O-RAN formalized this split into the Non-RT RIC (≥1 s) and Near-RT RIC (10 ms–1 s), embedded within the SMO hierarchy. The figure highlights how O-RAN codified and extended SON’s functional separation into distinct AI/ML application platforms.

The choice to formalize this split also had political dimensions. Vendors were reluctant to expose their most latency-critical baseband algorithms to external control, and the introduction of an RT RIC created a sandbox where third-party innovation could be encouraged without undermining vendor control of the physical layer. At the same time, operators sought guarantees that policy, assurance, and compliance would not be bypassed by low-latency applications; therefore, the Non-RT RIC was positioned as a control-tower layer situated safely above the millisecond domain. In this sense, the breakup of the time domain was as much a governance and trust compromise as a purely technical necessity. By drawing a clear line between “safe and slow” and “fast but bounded,” O-RAN created a model that felt familiar to operators accustomed to OSS hierarchies, while signaling to regulators and ecosystem players that AI could be introduced in a controlled and explainable manner.

Figure: Functional and temporal layering of the O-RAN architecture — showing the SMO with embedded NRT-RIC for long-horizon and slow control loops, the RT-RIC for fast loops, and the CU, DU, and RU for real-time through instant reflex actions, interconnected via standardized O-, A-, E-, F-, and eCPRI interfaces.

The figure above shows the O-RAN reference architecture with functional layers and interfaces. The Service Management and Orchestration (SMO) framework hosts the Non-Real-Time RIC (NRT-RIC), which operates on long-horizon loops (greater than 1 second) and is connected via the O1 interface to network elements and via O2 to cloud infrastructure (e.g., NFVI and MANO). Policies, enrichment information, and trained AI/ML models are delivered from the NRT-RIC to the Real-Time RIC (RT-RIC) over the A1 interface. The RT-RIC executes closed-loop control in the 10-ms to 1-s domain through xApps, interfacing with the CU/DU over E2. The 3GPP F1 split separates the CU and DU, while the DU connects to the RU through the open fronthaul (eCPRI/7-2x split). The RU drives active antenna systems (AAS) over largely proprietary interfaces (AISG for RET, vendor-specific for massive MIMO). The vertical time-scale axis highlights the progression from long-horizon orchestration at the SMO down to instant reflex functions in the RU/AAS domain. Both RU and DU operate on a transmission time interval (TTI) between 1 ms and 625 microseconds.

The O-RAN vision for AI and ML is built directly into its architecture from the very first white paper in 2018. The alliance described two guiding themes: openness and intelligence. Openness was about enabling multi-vendor, cloud-native deployments with open interfaces, which was supposed to provide for much more economical RAN solutions, while intelligence was about embedding machine learning and artificial intelligence into every layer of the RAN to deal with growing complexity (i.e., some of it self-inflicted by architecture and system design).

The architectural realization of this vision is the hierarchical RAN Intelligent Controller (RIC), which separates the control into different time domains and couples each to appropriate AI/ML functions:

  • Service Management and Orchestration (SMO, timescale > 1 second) – The Control Tower: The SMO provides the overarching management and orchestration framework for the RAN. Its functions extend beyond the Non-RT RIC, encompassing lifecycle management, configuration, assurance, and resource orchestration across both network functions and the underlying cloud infrastructure. Through the O1 interface (see above figure), the SMO collects performance data, alarms, and configuration information from the CU, DU, and RU, enabling comprehensive FCAPS (Fault, Configuration, Accounting, Performance, Security) management. Through the O2 interface (see above), it orchestrates cloud resources (compute, storage, accelerators) required to host virtualized RAN functions and AI/ML workloads. In addition, the SMO hosts the Non-RT RIC, meaning it not only provides operational oversight but also integrates AI/ML governance, ensuring that trained models and policy guidance align with operator intent and regulatory requirements.
  • Non-Real-Time RIC (NRT RIC, timescale > 1 second) – The Policy Brain: Directly beneath, embedded in the SMO, lies the NRT-RIC, described here as the “policy brain.” This is where policy management, analytics, and AI/ML model training take place. The NRT-RIC collects large volumes of data from the network (spatial-temporal traffic patterns, mobility traces, QoS (Quality of Service) statistics, massive MIMO settings, etc.) and uses them for offline training and long-term optimization. Trained models and optimization policies are then passed down to the RT RIC via the A1 interface (see above). A central functionality of the NRT-RIC is the hosting of rApps (e.g., Python or Java code), which implement policy-driven use cases such as energy savings, traffic steering, and mobility optimization. These applications leverage the broader analytic scope and longer timescales of the NRT-RIC to shape intent and guide the near-real-time actions of the RT-RIC. The NRT-RIC is traditionally viewed as an embedded entity within the SMO (although in theory, it could be a standalone entity).
  • Real-Time RIC (RT RIC, 10 ms – 1 second timescale) – The Decision Engine: This is where AI-driven control is executed in closed loops. The real-time RT-RIC hosts xApps (e.g., Go or C++ code) that run inference on trained models and perform tasks such as load balancing, interference management, mobility prediction, QoS management, slicing, and per-user (UE) scheduling policies. It maintains a Radio Network Information Base (R-NIB) fed via the E2 interface (see above) from the DU/CU, and uses this data to make fast control decisions in near real-time.
  • Centralized Unit (CU): Below the RT-RIC sits the Centralized Unit, which takes on the role of the “shaper” in the O-RAN architecture. The CU is responsible for higher-layer protocol processing, including PDCP (Packet Data Convergence Protocol) and SDAP (Service Data Adaptation Protocol), and is therefore the natural point in the stack where packet shaping and QoS enforcement occur. At this level, AI-driven policies provided by the RT-RIC can directly influence how data streams are prioritized and treated, ensuring that application- or slice-specific requirements for latency, throughput, and reliability are respected. By interfacing with the RT-RIC over the E2 interface, the CU can dynamically adapt QoS profiles and flow control rules based on real-time network conditions, balancing efficiency with service differentiation. In this way, the CU acts as the bridge between AI-guided orchestration and the deterministic scheduling that occurs deeper in the DU/RU layers. The CU operates on a real-time but not ultra-tight timescale, typically in the range of tens of milliseconds up to around one second (similar to the RT-RIC), depending on the function.
  • DU/RU layer (sub-1 ms down to hundreds of microseconds) – The Executor & Muscles: The Distributed Unit (DU), located below the CU, is referred to as the “executor.” It handles scheduling and precoding at near-instant timescales, measured in sub-millisecond intervals. Here, AI functions take the form of compute agents that apply pre-trained or lightweight models to optimize resource block allocation and reduce latency. At the bottom, the Radio Unit (RU) represents the “muscles” of the system. Its reflex actions happen at the fastest time scales, down to hundreds of microseconds. While it executes deterministic signal processing, beamforming, and precoding, it also feeds measurements upward to fuel AI learning higher in the chain. Here reside the tightest loops, on a Transmission Time Interval (TTI) time scale (i.e., 1 ms – 625 µs), such as baseband PHY processing, HARQ feedback, symbol scheduling, and beamforming weights. These functions require deterministic latencies and cannot rely on higher-layer AI/ML loops. Instead, the DU/RU executes control at the L1/L2 level, while still feeding measurement data upward for AI/ML training and adaptation.
Figure: AI’s hierarchical chain of command in O-RAN — from the SMO as the control tower and NRT-RIC as the policy brain, through the RT-RIC as the decision engine and CU as shaper, down to DU as executor and RU as muscles. Each layer aligns with guiding timescales, agentic AI roles, and contributions to spectral efficiency, balancing perceived SE gains, overhead reductions, and SINR improvements.

The figure above portrays the Open RAN as a “chain of command” where intelligence flows across time scales, from long-horizon orchestration in the cloud down to sub-millisecond reflexes in the radio hardware. To make it more tangible, I have annotated a spectral efficiency optimization use case on the right side of the figure. The cascading structure, shown above, highlights how AI and ML roles evolve across the architecture. For instance, the SMO and NRT-RIC increase perceived spectral efficiency through strategic optimization, while the RT-RIC reduces inefficiencies by orchestrating fast loops. Additionally, the DU/RU contribute directly to signal quality improvements, such as SINR gains. The figure thus illustrates Open RAN not as a flat architecture, but as a hierarchy of brains, decisions, and muscles, each with its own guiding time scale and AI function. Taken together, the vision is that AI/ML operates across all time domains, with the non-RT RIC providing strategic intelligence and model training, the RT RIC performing agile, policy-driven adaptation, and the DU/RU executing deterministic microsecond-level tasks, while exposing data to feed higher-layer intelligence. With open interfaces (A1, E2, open fronthaul), this layered AI approach allows multi-vendor participation, third-party innovation, and closed-loop automation across the RAN.

From 2019 onward, O-RAN working groups such as WG2 (Non-RT RIC & A1 interface) and WG3 (RT RIC & E2 interface) began publishing technical specifications that defined how AI/ML models could be trained, distributed, and executed across the RIC layers. By 2020–2021, proofs of concept and plugfests showcased concrete AI/ML use cases, such as energy savings, traffic steering, and anomaly detection, running as xApps (residing in the RT-RIC) and rApps (residing in the NRT-RIC). Following the first O-RAN specifications and proofs of concept, it becomes helpful to visualize how the different architectural layers relate to AI and ML. You will find many of the standardization documents in the reference list at the end of the document.

rApps AND xApps – AN ILLUSTRATION.

In the Open RAN architecture, the system’s intelligence is derived from the applications that run on top of the RIC platforms. The rApps exist in the Non-Real-Time RIC and xApps in the Real-Time RIC. While the RICs provide the structural framework and interfaces, it is the apps that carry the logic, algorithms, and decision-making capacity that ultimately shape network behavior. rApps operate at longer timescales, often drawing on large datasets and statistical analysis to identify trends, learn patterns, and refine policies. They are well-suited to classical machine learning processes such as regression, clustering, and reinforcement learning, where training cycles and retraining benefit from aggregated telemetry and contextual information. In practice, rApps are commonly developed in high-level languages such as Python or Java, leveraging established AI/ML libraries and data processing pipelines. In contrast, xApps must execute decisions in near-real time, directly influencing scheduling, beamforming, interference management, and resource allocation. Here, the role of AI and ML is to translate abstract policy into fast, context-sensitive actions, with an increasing reliance on intelligent control strategies, adaptive optimization, and eventually even agent-like autonomy (more on that later in this article). To meet these latency and efficiency requirements, xApps are typically implemented in performance-oriented languages like C++ or Go, although Python is often used in prototyping stages before critical components are optimized. Together, rApps and xApps represent the realization of intelligence in Open RAN: one set grounded in long-horizon learning and policy shaping (the Non-RT RIC and rApps), the other in short-horizon execution and reflexive adaptation (the RT-RIC and xApps). Their interplay is not only central to energy efficiency, interference management, and spectral optimization but also points toward a future where classical ML techniques merge with more advanced AI-driven orchestration to deliver networks that are both adaptive and self-optimizing. Let us have a quick look at examples that illustrate how these applications work in the overall O-RAN architectural stack.

Figure: Energy efficiency loop in Open RAN, showing how long-horizon rApps set policies in the NRT-RIC, xApps in the RT-RIC execute them, and DU/RU translate these into scheduler and hardware actions with continuous telemetry feedback.

One way to understand the rApp–xApp interaction is to follow a simple energy efficiency use case, shown in the figure above. At the top, an energy rApp in the Non-RT RIC learns long-term traffic cycles and defines policies such as ‘allow cell muting below 10% load.’ These policies are then passed to the RT-RIC, where an xApp monitors traffic every second and decides when to shut down carriers or reduce power. The DU translates these decisions into scheduling and resource allocations, while the RU executes the physical actions such as switching off RF chains, entering sleep modes, or muting antenna elements. The figure above illustrates how policy flows downward while telemetry and KPIs flow back up, forming a continuous energy optimization loop. Another similarly layered logic applies to interference coordination, as shown in the figure below. Here, an interference rApp in the Non-RT RIC analyzes long-term patterns of inter-cell interference and sets coordination policies — for example, defining thresholds for ICIC, CoMP, or power capping at the cell edge. The RT-RIC executes these policies through xApps that estimate SINR in real time, apply muting patterns, adjust transmit power, and coordinate beam directions across neighboring cells. The DU handles PRB scheduling and resource allocation, while the RU enacts physical layer actions, such as adjusting beam weights or muting carriers. This second loop shows how rApps and xApps complement each other when interference is the dominant concern.

Figure: Interference coordination loop in Open RAN, where rApps define long-term coordination policies and xApps execute real-time actions on PRBs, power, and beams through DU/RU with continuous telemetry feedback.
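Before turning to what can go wrong, here is a minimal sketch of how the energy efficiency loop described above could look in code. It is purely illustrative: the field names, thresholds, and function names are my own inventions and not taken from any O-RAN specification, but the division of labor mirrors the description above, with the rApp emitting a long-horizon policy (“allow cell muting below 10% load”) and the xApp applying it every second against live load telemetry, subject to a simple guardrail.

# Illustrative sketch only -- field names and thresholds are invented, not O-RAN-specified.

# 1) The energy rApp (Non-RT RIC) has learned long-term traffic cycles and emits a policy:
energy_policy = {
    "policy_id": "es-cell-muting-001",
    "scope": ["cell-A", "cell-B", "cell-C"],
    "mute_below_prb_load": 0.10,       # "allow cell muting below 10% load"
    "min_active_cells": 1,             # guardrail: never mute everything
}

def energy_xapp_step(policy: dict, prb_load: dict) -> dict:
    """Toy xApp (RT-RIC): runs every second on fresh telemetry and decides per-cell actions."""
    candidates = [c for c in policy["scope"] if prb_load.get(c, 1.0) < policy["mute_below_prb_load"]]
    active = [c for c in policy["scope"] if c not in candidates]
    # Respect the guardrail: keep at least min_active_cells awake.
    while len(active) < policy["min_active_cells"] and candidates:
        active.append(candidates.pop())
    return {c: ("mute_carrier" if c in candidates else "keep_active") for c in policy["scope"]}

if __name__ == "__main__":
    telemetry = {"cell-A": 0.04, "cell-B": 0.31, "cell-C": 0.07}   # PRB load per cell
    print(energy_xapp_step(energy_policy, telemetry))
    # -> {'cell-A': 'mute_carrier', 'cell-B': 'keep_active', 'cell-C': 'mute_carrier'}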

Yet these loops do not always reinforce each other. If left uncoordinated, they can collide. An energy rApp may push the system toward contraction, reducing Tx power, muting carriers, and blanking PRBs, while an interference xApp simultaneously pushes for expansion, raising Tx power, activating carriers, and dynamically allocating PRBs. Both act on the same levers inside the CU/DU/RU, but in opposite directions. The result can be oscillatory behaviour, with power and scheduling thrashing back and forth, degrading QoS, and wasting energy. The figure below illustrates this risk and underscores why conflict management and intent arbitration are critical for a stable Open RAN.

Figure: Example of conflict between an energy-saving rApp and an interference-mitigation xApp, where opposing control intents on the same CU/DU/RU parameters can cause oscillatory behaviour.

Beyond the foundational description of how rApps and xApps operate, it is equally important to address the conflicts and issues that can arise when multiple applications are deployed simultaneously in the Non-RT and RT-RICs. Because each app is designed with a specific optimization objective in mind, it is almost inevitable that two or more apps will occasionally attempt to act on the same parameters in contradictory ways. While the energy efficiency versus interference management example is already well understood, there are broader categories of conflict that extend across both timescales.

Conflicts between rApps occur when long-term policy objectives are not aligned. For instance, a spectral efficiency rApp may continuously push the network toward maximizing bits per Hertz by advocating for higher transmit power, more active carriers, or denser pilot signaling. At the same time, an energy-saving rApp may be trying to mute those very carriers, reduce pilot density, and cap transmit power to conserve energy. Both policies can be valid in isolation, but when issued without coordination, they create conflicting intents that leave the RT-RIC and lower layers struggling to reconcile them. Even worse, the oscillatory behavior that results can propagate into the DU and RU, creating instability at the level of scheduling and RF execution. The xApps, too, can easily find themselves in conflict when they react to short-term KPI fluctuations with divergent strategies. An interference management xApp might impose aggressive PRB blanking patterns or reduce power at the cell edge, while a mobility optimization xApp simultaneously widens cell range expansion parameters to offload traffic. The first action is designed to protect edge users, while the second may flood them with more load, undoing the intended benefit. Similarly, an xApp pushing for higher spectral efficiency may keep activating carriers and pushing toward higher modulation and coding schemes, while another xApp dedicated to energy conservation is attempting to put those carriers to sleep. The result is rapid toggling of resource states, which wastes signaling overhead and disrupts user experience.

The O-RAN Alliance has recognized these risks and proposed mechanisms to address them. Architecturally, conflict management is designed to reside in the RT-RIC, where a Conflict Mitigation and Arbitration framework evaluates competing intents from different xApps before they reach the CU/DU. Policies from the Non-RT RIC can also be tagged with priorities or guardrails, which the RT-RIC uses to arbitrate real-time conflicts. In practice, this means that when two xApps attempt to control the same parameter, the RT-RIC applies priority rules, resolves contradictions, or, in some cases, rejects conflicting commands entirely. On the rApp side, conflict resolution is handled at a higher abstraction level by the Non-RT RIC, which can consolidate or harmonize policies before they are passed down through the A1 interface.
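As a thought experiment, a priority-based arbitration step in the RT-RIC could look like the sketch below. This is not the O-RAN-specified Conflict Mitigation function, only a minimal illustration of the idea that competing xApp requests on the same parameter are resolved by policy priority and clamped to guardrails before anything reaches the CU/DU.

# Minimal, illustrative arbitration sketch -- not the actual O-RAN Conflict Mitigation framework.

GUARDRAILS = {"tx_power_dbm": (30.0, 46.0)}   # assumed safe tuning range for this example

def arbitrate(requests: list[dict]) -> dict:
    """Resolve competing xApp requests per (cell, parameter) by highest priority, then clamp."""
    resolved = {}
    for req in requests:
        key = (req["cell"], req["parameter"])
        if key not in resolved or req["priority"] > resolved[key]["priority"]:
            resolved[key] = req
    # Clamp winning values to guardrails (the "shock absorbers" mentioned later in the text).
    for (cell, parameter), req in resolved.items():
        lo, hi = GUARDRAILS.get(parameter, (float("-inf"), float("inf")))
        req["value"] = min(max(req["value"], lo), hi)
    return {k: (v["xapp"], v["value"]) for k, v in resolved.items()}

if __name__ == "__main__":
    competing = [
        {"xapp": "energy_saver", "cell": "cell-17", "parameter": "tx_power_dbm", "value": 28.0, "priority": 1},
        {"xapp": "interference", "cell": "cell-17", "parameter": "tx_power_dbm", "value": 44.0, "priority": 2},
    ]
    print(arbitrate(competing))
    # -> {('cell-17', 'tx_power_dbm'): ('interference', 44.0)}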

The layered conflict mitigation approach in O-RAN provides mechanisms to arbitrate competing intents between apps. It can reduce the risk of oscillatory behavior, but it cannot guarantee stability completely. Since rApps and xApps may originate from different sources and vary in design quality, careful testing, certification, and continuous monitoring will remain essential to ensure that application diversity does not undermine network coherence. Equally important are policies that impose guardbands, buffers, and safety margins in how parameters can be tuned, which serve as a hedge against instabilities when apps are misaligned, whether the conflict arises between rApps, between xApps, or across the rApp–xApp boundary. These guardbands provide the architectural equivalent of shock absorbers, limiting the amplitude of conflicting actions and ensuring that, even if multiple apps pull in different directions, the network avoids catastrophic oscillations.

Last but not least, the risks may increase as rApps and xApps evolve beyond narrowly scoped optimizers into more agentic forms. An agentic app does not merely execute a set of policies or inference models. It can plan, explore alternatives, and adapt its strategies with a degree of autonomy (and agency). While this is likely to unlock powerful new capabilities, it also expands the possibility of emergent and unforeseen interactions. Two agentic apps, even if aligned at deployment, may drift toward conflicting behaviors as they continuously learn and adapt in real time. Without strict guardrails and robust conflict resolution, such autonomy could magnify instabilities rather than contain them, leading to system behavior that is difficult to predict or control. In this sense, the transition from classical rApps and xApps to agentic forms is not only an opportunity but also a new frontier of risk that must be carefully managed within the O-RAN architecture.

IS AI IN RAN ALL ABOUT “ChatGPT”?

I want to emphasize that when I address AI in the RAN, I generally do not refer to generative language models, such as ChatGPT, or other large-scale conversational systems built upon a human language context. Those technologies are based on Large Language Models (LLMs), which belong to the family of deep learning architectures built on transformer networks. A transformer network is a type of neural network architecture built around the attention mechanism, which allows the model to weigh the importance of different parts of an input sequence simultaneously rather than processing it step by step. They are typically trained on enormous human-based text datasets, utilizing billions of parameters, which requires immense computational resources and lengthy training cycles. Their most visible purpose today is to generate and interpret human language, operating effectively at the scale of seconds or longer in user interactions. In the context of network operations, I suspect that GPT-like LLMs will have a mission in the frontend where humans will need to interact with the communications network using human language. That said, the notion of “generative AI” is not inherently limited to natural language. The same underlying transformer-based methods can be adapted to other modalities (information sources), including machine-oriented languages or even telemetry sequences. For example, a generative model trained on RAN logs, KPIs, and signaling traces could be used to create synthetic telemetry or predict unusual event patterns. In this sense, generative AI could provide value to the RAN domain by augmenting datasets, compressing semantic information, or even assisting in anomaly detection. The caveat, however, is that these benefits still rely on heavy models with large memory footprints and significant inference latency. While they may serve well in the Non-RT RIC or SMO domain, where time scales are relaxed and compute resources are more abundant, they are unlikely to be terribly practical for the RT RIC or the DU/RU, where deterministic deadlines in the millisecond or microsecond range must be met.

By contrast, the application of AI/ML in the RAN is fundamentally about real-time signal processing, optimization, and control. RAN intelligence focuses on tasks such as load balancing, interference mitigation, mobility prediction, traffic steering, energy optimization, and resource scheduling. These are not problems of natural human language understanding but of strict scheduling and radio optimization. The time scales at which these functions operate are orders of magnitude shorter than those typical of generative AI, ranging from long-horizon analytics in the Non-RT RIC (greater than one second) to near-real-time inference in the RT-RIC (10 ms–1 s), and finally to deterministic microsecond loops in the DU/RU. This stark difference in time scales and problem domains explains why it appears unlikely that the RAN can be controlled end-to-end by “ChatGPT-like” AI. LLMs, whether trained on human language or telemetry sequences, are (today at least) too computationally heavy, too slow in inference, and optimized for open-ended reasoning rather than deterministic control. Instead, the RAN requires a mix of lightweight supervised and reinforcement learning models, online inference engines, and, in some cases, ultra-compact TinyML implementations that can run directly in hardware-constrained environments.

In general, AI in the RAN is about embedding intelligence into control loops at the right time scale and with the right efficiency. Generative AI may have a role in enriching data and informing higher-level orchestration. It is difficult to see how it can efficiently replace the tailored, lightweight models that drive the RAN’s real-time and near-real-time control.

As O-RAN (and RAN in general) evolves from a vision of open interfaces and modular disaggregation into a true intelligence-driven network, one of the clearest frontiers is the use of Large Language Models (LLMs) at the top of the stack (i.e., frontend/human-facing). The SMO, with its embedded Non-RT RIC, already serves as the strategic brain of the architecture, responsible for lifecycle management, long-horizon policy, and the training of AI/ML models. This is also the one domain where time scales are relaxed, measured in seconds or longer, and where sufficient compute resources exist to host heavier models. In this environment, LLMs can be utilized in two key ways. First, they can serve as intent interpreters for intent-driven network operations, bridging the gap between operator directives and machine-executable policies. Instead of crafting detailed rules or static configuration scripts, operators could express high-level goals, such as prioritizing emergency service traffic in a given region or minimizing energy consumption during off-peak hours. An LLM, tuned with telecom-specific knowledge, can translate those intents into precise policy actions distributed through the A1 interface to the RT RIC. Second, LLMs can act as semantic compressors, consuming the vast streams of logs, KPIs, and alarms that flow upward through O1, and distilling them into structured insights or natural language summaries that humans can easily grasp. This reduces cognitive load for operators while ensuring (at least we should hope so!) that the decision logic remains transparent, possibly explainable, and auditable. In both roles, LLMs do not replace the specialized ML models running lower in the architecture. Instead, they enhance the orchestration layer by embedding reasoning and language understanding where time and resources permit.
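A hedged sketch of the intent-interpreter idea is shown below. The prompt, schema, and call_llm stub are hypothetical placeholders (no particular LLM product or API is assumed); the point is only the pattern: a natural-language intent goes in, a machine-checkable A1-style policy comes out, and it is validated against a schema before it is ever allowed onto the A1 interface.

import json

# Hypothetical sketch -- the schema, prompt, and LLM call are placeholders, not a real product API.

POLICY_SCHEMA_KEYS = {"policy_id", "scope", "objective", "guardrails"}

PROMPT_TEMPLATE = (
    "Translate the operator intent into a JSON policy with keys "
    "policy_id, scope, objective, guardrails.\nIntent: {intent}\nJSON:"
)

def call_llm(prompt: str) -> str:
    """Placeholder for a telecom-tuned LLM call; here it returns a canned response for the demo."""
    return json.dumps({
        "policy_id": "intent-0001",
        "scope": {"region": "coastal-north"},
        "objective": "prioritize_emergency_traffic",
        "guardrails": {"max_latency_ms": 50, "min_reliability": 0.999},
    })

def intent_to_policy(intent: str) -> dict:
    """Turn a natural-language intent into a validated, A1-style policy dict."""
    raw = call_llm(PROMPT_TEMPLATE.format(intent=intent))
    policy = json.loads(raw)
    missing = POLICY_SCHEMA_KEYS - policy.keys()
    if missing:
        raise ValueError(f"LLM output failed schema validation, missing keys: {missing}")
    return policy

if __name__ == "__main__":
    print(intent_to_policy("Prioritize emergency service traffic in the coastal-north region."))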

WHAT AI & ML ARE LIKELY TO WORK IN RAN?

This piece assumes a working familiarity with core machine-learning concepts, models, training and evaluation processes, and the main families you will encounter in practice. If you want a compact, authoritative refresher, most of what I reference is covered, clearly and rigorously, in Goodfellow, Bengio, and Courville’s Deep Learning (Adaptive Computation and Machine Learning series, MIT Press). For hands-on practice, many excellent Coursera courses walk through these ideas with code, labs, and real datasets; they are a fast way to build the intuition you will need for the examples discussed in this section. Feel free to browse through my certification list, which includes over 60 certifications, with the earliest ML and AI courses dating back to 2015 (the list should have been updated by now), and perhaps find some inspiration.

Throughout the article, I use “AI” and “ML” interchangeably for readability, but formally, they should be regarded as distinct. Artificial Intelligence (AI) is the broader field concerned with building systems that perceive their environment, reason about it, and act to achieve goals, encompassing planning, search, knowledge representation, learning, and decision-making. Machine Learning (ML) is a subset of AI that focuses specifically on data-driven methods that learn patterns or policies from examples, improving performance on a task through experience rather than explicit, hand-crafted rules, and it is where most of the interesting progress for the RAN is currently happening.

Figure: Mapping of AI roles, data flows, and model families across the O-RAN stack — from SMO and NRT-RIC handling long-horizon policy, orchestration, and training, to RT-RIC managing fast-loop inference and optimization, down to CU and DU/RU executing near-real-time and hardware-domain actions with lightweight, embedded AI models.

Artificial intelligence in the O-RAN stack exhibits distinct characteristics depending on where it is deployed. Still, it is helpful to see it as one continuous flow, from intent at the very top to deterministic execution at the very bottom. So, let’s go with the flow.

At the level of the Service Management and Orchestration, AI acts as the control tower for the entire system. This is where business or human intent must be translated into structured goals, and where guardrails, audit mechanisms, and reversibility are established to ensure compliance with regulatory oversight. Statistical models and rules remain essential at this layer because they provide the necessary constraint checking and explainability for governance. Yet the role of large language models is increasing rapidly, as they provide a bridge from human language into structured policies, intent templates, and root-cause narratives. Generative approaches are also beginning to play a role by producing synthetic extreme events to stress-test policies before they are deployed. While synthetic data for rare events offers a powerful tool for training and stress-testing AI systems, it may carry significant statistical risks. Generative models can fail to represent the very distributions they aim to capture, bias inference, or even introduce entirely artificial patterns into the data. Their use therefore requires careful anchoring in extremes-aware statistical methods, rigorous validation against real-world holdout data, and safeguards against recursive contamination. When these conditions are met, synthetic data can meaningfully expand the space of scenarios available for training and testing. Without the appropriate control mechanisms, decisions or policies based on synthetic data risk becoming a source of misplaced confidence rather than resilience. With all that considered, the SMO should be the steward of safety and interpretability, ensuring that only validated and reversible actions flow down into the operational fabric. If agentic AI is introduced here, it could reshape how intent is operationalized. Instead of merely validating human inputs, agentic systems might proactively and autonomously propose actions, refine intents into strategies, or initiate self-healing workflows on their own. While this promises greater autonomy and resilience, it also raises new challenges for oversight, since the SMO would become not just a filter but a creative actor in its own right.
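On the caveat about synthetic telemetry, a minimal validation gate might compare each synthetic KPI distribution against a real-world holdout before it is allowed into any training set. The sketch below uses a two-sample Kolmogorov–Smirnov test plus a crude tail check as one simple example of such a gate; in practice, extremes-aware statistics would be layered on top. It assumes NumPy and SciPy are available; the KPI, the generators, and the thresholds are invented for illustration.

import numpy as np
from scipy.stats import ks_2samp

# Illustrative gate: reject synthetic telemetry whose distribution drifts from a real holdout set.

def validate_synthetic(real_holdout: np.ndarray, synthetic: np.ndarray, alpha: float = 0.01) -> bool:
    """Return True if the synthetic sample looks statistically consistent with the real holdout."""
    statistic, p_value = ks_2samp(real_holdout, synthetic)
    # Also check the extreme tail explicitly, since generative models often miss it.
    q_real, q_synth = np.quantile(real_holdout, 0.99), np.quantile(synthetic, 0.99)
    tail_ok = abs(q_real - q_synth) < 0.15 * q_real
    return (p_value > alpha) and tail_ok

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    real = rng.lognormal(mean=1.0, sigma=0.5, size=5000)        # e.g., a heavy-tailed KPI
    good_synth = rng.lognormal(mean=1.0, sigma=0.5, size=5000)  # faithful generator: should pass
    bad_synth = rng.normal(loc=3.0, scale=0.5, size=5000)       # misses the heavy tail: should fail
    print(validate_synthetic(real, good_synth))
    print(validate_synthetic(real, bad_synth))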

At the top level, rApps (which reside in the NRT-RIC) are indirectly shaped by SMO policies, as they inherit intent, guardrails, and reversibility constraints. For example, when the SMO utilizes LLMs to translate business goals into structured intents, it essentially sets the design space within which rApps can train or re-optimize their models. The SMO also provides observability hooks, allowing rApp outputs to be audited before being pushed downstream.

The Non-Real-Time RIC can be understood as the long-horizon brain of the RAN. Its function is to train, retrain, and refine models, conduct long-term analysis, and transform historical and simulated experience into reusable policies. Reinforcement learning in its many flavors is the cornerstone here, particularly offline or constrained forms that can safely explore large data archives or digital twin scenarios. Autoencoders, clustering, and other representation learning methods uncover hidden structures in traffic and mobility, while supervised deep networks and boosted trees provide accurate forecasting of demand and performance. Generative simulators extend the scope by fabricating rare but instructive scenarios, allowing policies to be trained for resilience against the unexpected. Increasingly, language-based systems are also being applied to policy generation, bridging between strategic descriptions and machine-enforceable templates. The NRT-RIC strengthens AI’s applicability by moving risk away from live networks, producing validated artifacts that can later be executed at speed. If an agentic paradigm is introduced here, it would mean that the NRT-RIC is not merely a training ground but an active planner, continuously setting objectives for the rest of the system and negotiating trade-offs between coverage, energy, and user experience. This shift would make the Non-RT RIC a more autonomous planning organ, but it would also demand stronger mechanisms for bounding and auditing its explorations.

Here, at the NRT-RIC, rApps that are native to this RIC level are the central vehicle for model training, policy generation, and scenario exploration. They consume SMO intent and turn it into reusable policies or models for the RT-RIC. For example, a mobility rApp could use clustering and reinforcement learning to generate policies for user handover optimization, which the RT-RIC then executes in near real time. Another rApp might simulate mMIMO pairing scenarios offline, distill them into simplified lookup tables or quantized policies, and hand these artifacts down for execution at the DU/RU. Thus, rApps act as policy factories: their outputs cascade into xApps at the RT-RIC, CU parameter sets, and lightweight silicon-bound models deeper down.
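As a toy illustration of the policy-factory idea, the sketch below trains a small decision tree offline on simulated load and cell-edge SINR features and then distills it into a plain threshold lookup table that a downstream xApp or DU could evaluate with negligible compute. The feature names, labels, and scenario generator are invented for the example; scikit-learn and NumPy are assumed to be available.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy "policy factory": an rApp-style offline step that learns a policy and distills it
# into a tiny lookup table. Features, labels, and the scenario generator are invented.

rng = np.random.default_rng(0)
n = 20_000
load = rng.uniform(0.0, 1.0, n)            # normalized PRB load
edge_sinr = rng.uniform(-5.0, 25.0, n)     # cell-edge SINR in dB (simulated)

# Synthetic "ground truth" policy the rApp is trying to learn: mute only when the cell is
# lightly loaded AND edge users still see decent SINR from neighbors.
label = ((load < 0.15) & (edge_sinr > 5.0)).astype(int)   # 1 = mute carrier, 0 = keep active

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(
    np.column_stack([load, edge_sinr]), label
)

# Distill the learned policy into a coarse lookup table over quantized feature bins.
load_bins = np.linspace(0.0, 1.0, 11)       # 10 load bins
sinr_bins = np.linspace(-5.0, 25.0, 7)      # 6 SINR bins
grid = np.array([[l, s] for l in (load_bins[:-1] + 0.05) for s in (sinr_bins[:-1] + 2.5)])
lookup_table = tree.predict(grid).reshape(len(load_bins) - 1, len(sinr_bins) - 1)

def fast_policy(load_value: float, sinr_value: float) -> int:
    """What a DU/xApp would actually run: two bin lookups, no model inference."""
    i = min(int(load_value * 10), 9)
    j = min(int((sinr_value + 5.0) / 5.0), 5)
    return int(lookup_table[i, j])

if __name__ == "__main__":
    print(fast_policy(0.08, 12.0))   # lightly loaded, good edge SINR -> typically 1 (mute)
    print(fast_policy(0.60, 12.0))   # busy cell -> 0 (keep active)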

The Real-Time RIC is where planning gives way to fast, local action. At timescales between ten milliseconds and one second, the RT-RIC is tasked with run-time inference, traffic steering, slicing enforcement, and short-term interference management. Because the latency budget is tight, the model families that thrive here must balance accuracy with compact, predictable execution. Shallow neural networks (simple feedforward models capturing non-linear patterns), recurrent models (RNNs that retain memory of past inputs), and hybrid convolutional–recurrent (CNN–RNN) models (combining spatial feature extraction with temporal sequencing) are well suited for processing fast-evolving time series, such as traffic load or interference, delivering near-future predictions with low latency and translating context into rapid actions. Decision trees (rule-based classifiers that split data hierarchically) and ensemble methods (collections of weak learners, such as random forests or boosting) remain attractive because of their lightweight, deterministic behavior and interpretability, making them reliable for regulatory oversight and stable actuation. Online reinforcement learning, in which an agent interacts with its environment in real time and updates its policy based on rewards or penalties, together with contextual bandits, a simplified variant that optimizes single-step decisions from observed contexts, enables adaptation in small, incremental steps while minimizing the risk of destabilization. In specific contexts, lightweight graph neural networks (GNNs), streamlined versions designed to model relationships between entities at low computational cost, can capture the topological structure between neighboring cells, supporting coordination in handovers or interference management while remaining efficient enough for real-time use. The RT-RIC thus embodies the point where AI policies become immediate operational decisions, measurable in KPIs within seconds. When viewed through the lens of agency, this layer becomes even more dynamic. An agentic RT-RIC could weigh competing goals, prioritize among multiple applications, and negotiate real-time conflicts without waiting for external intervention. Such agency might significantly improve efficiency and responsiveness, but it would also blur the boundary between optimization and autonomous control, requiring new arbitration frameworks and assurance layers.

At this level, xApps, native to the RT-RIC, execute policies derived from rApps and adapt them to live network telemetry. An xApp for traffic steering might combine a policy from the Non-RT RIC with local contextual bandits to adjust routing in the moment. Another xApp could, for example, use lightweight GNNs to coordinate interference management across adjacent cells, directly influencing DU scheduling and RU beamforming. This makes xApps the translators of long-term rApp insights into second-by-second action, bridging the predictive foresight of rApps with the deterministic constraints of the DU/RU.
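To show how a contextual-bandit-style xApp could work at this layer, here is a minimal epsilon-greedy sketch for steering traffic between two carriers. Everything in it (context features, carrier names, the reward model) is invented for illustration; a real xApp would draw its context from E2 telemetry and act only within rApp-provided guardrails.

import random

# Minimal epsilon-greedy contextual bandit sketch for traffic steering (illustrative only).
# Contexts, carriers, and the reward model are invented; a real xApp would use E2 telemetry.

CARRIERS = ["carrier_low_band", "carrier_mid_band"]
CONTEXTS = ["low_load", "high_load"]

class SteeringBandit:
    def __init__(self, epsilon: float = 0.1):
        self.epsilon = epsilon
        self.counts = {(c, a): 0 for c in CONTEXTS for a in CARRIERS}
        self.values = {(c, a): 0.0 for c in CONTEXTS for a in CARRIERS}

    def choose(self, context: str) -> str:
        if random.random() < self.epsilon:
            return random.choice(CARRIERS)                                   # explore
        return max(CARRIERS, key=lambda a: self.values[(context, a)])        # exploit

    def update(self, context: str, action: str, reward: float) -> None:
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]   # running mean

def simulated_reward(context: str, action: str) -> float:
    """Stand-in for the observed KPI (e.g., normalized user throughput after steering)."""
    base = 0.8 if (context == "high_load" and action == "carrier_mid_band") else 0.5
    return base + random.gauss(0.0, 0.05)

if __name__ == "__main__":
    random.seed(1)
    bandit = SteeringBandit()
    for step in range(2000):
        ctx = random.choice(CONTEXTS)
        act = bandit.choose(ctx)
        bandit.update(ctx, act, simulated_reward(ctx, act))
    print(bandit.values)   # mid-band should dominate under high load in this toy setup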

The Centralized Unit occupies an intermediate position between near-real-time responsiveness and higher-layer mobility and bearer management. Here, the most useful models are those that can both predict and pre-position resources before bottlenecks occur. Long Short-Term Memory networks (LSTMs, recurrent models designed to capture long-range dependencies), Gated Recurrent Units (GRUs, simplified RNNs with fewer parameters), and temporal Convolutional Neural Networks (CNNs, convolution-based models adapted for sequential data) are natural fits for forecasting user trajectories, mobility patterns, and session demand, thereby enabling proactive preparation of handovers and early allocation of network slices. Constrained reinforcement learning (RL, trial-and-error learning optimized under explicit safety or policy limits) methods play an important role at the bearer level, where they must carefully balance Quality of Service (QoS) guarantees against overall resource utilization, ensuring efficiency without violating service-level requirements. At the same time, rule-based optimizers remain well-suited for more deterministic processes, such as configuring Packet Data Convergence Protocol (PDCP) and Radio Link Control (RLC) parameters, where fixed logic can deliver predictable and stable outcomes in real-time. The CU strengthens applicability by anticipating issues before they materialize and by converting intent into per-flow adjustments. If agency is introduced at this layer, it might manifest as CU-level agents negotiating mobility anchors or bearer priorities directly, without relying entirely on upstream instructions. This could increase resilience in scenarios where connectivity to higher layers is impaired. Still, it also adds complexity, as the CU would need a framework for coordinating its autonomous decisions with the broader policy environment.
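A sketch of the forecasting idea at the CU level is shown below: a tiny GRU trained to predict the next sample of a per-cell load series so that handovers or slice resources can be pre-positioned before demand arrives. It assumes PyTorch is available; the synthetic daily-cycle data and the model size are chosen purely for illustration.

import torch
import torch.nn as nn

# Tiny GRU load forecaster (illustrative). Assumes PyTorch; data is a synthetic daily cycle.

class LoadForecaster(nn.Module):
    def __init__(self, hidden: int = 16):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, seq_len, 1)
        out, _ = self.gru(x)
        return self.head(out[:, -1, :])        # predict the next sample from the last state

def make_series(steps: int = 2000) -> torch.Tensor:
    t = torch.arange(steps, dtype=torch.float32)
    return 0.5 + 0.4 * torch.sin(2 * torch.pi * t / 96) + 0.02 * torch.randn(steps)  # 15-min samples

if __name__ == "__main__":
    series = make_series()
    window = 32
    x = torch.stack([series[i:i + window] for i in range(len(series) - window)]).unsqueeze(-1)
    y = series[window:].unsqueeze(-1)

    model = LoadForecaster()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(5):                     # a few quick full-batch epochs suffice for the toy signal
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
        print(f"epoch {epoch}: mse={loss.item():.4f}")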

Both xApps and rApps can influence CU functions as they relate to bearer management and PDCP/RLC configuration. For example, a QoS balancing rApp might propose long-term thresholds for bearer prioritization, while a short-horizon xApp enforces these by pre-positioning slice allocations or adjusting bearer anchors in anticipation of predicted mobility. The CU thus becomes a convergence point, where rApp strategies and xApp tactics jointly shape mobility management and session stability before decisions cascade into DU scheduling.

At the very bottom of the stack, the Distributed Unit and Radio Unit function under the most stringent timing constraints, often in the realm of microseconds. Their role is to execute deterministic PHY and MAC functions, including HARQ, link adaptation, beamforming, and channel state processing. Only models that can be compiled into silicon, quantized, or otherwise guaranteed to run within strict latency budgets are viable in this layer of the Radio Access Network. Tiny Machine Learning (TinyML), Quantized Neural Networks (QNN), and lookup-table distilled models enable inference speeds compatible with microsecond-level scheduling constraints. As RU and DU components typically operate under strict latency and computational constraints, TinyML and low-bit QNNs are ideal for deploying functions such as beam selection, RF monitoring, anomaly detection, or lightweight PHY inference tasks. Deep-unfolded networks and physics-informed neural models are particularly valuable because they can replace traditional iterative solvers in equalization and channel estimation, achieving high accuracy while ensuring fixed execution times. In advanced antenna systems, neural digital predistortion and amplifier linearization enhance power efficiency and spectral containment. At the same time, sequence-based predictors can cut down channel state information (CSI) overhead and help stabilize multi-user multiple-input multiple-output (MU-MIMO) pairing. At this level, the integration of agentic AI must, in my opinion, be approached with caution. The DU and RU domains are all about execution rather than deliberation. Introducing agency here could compromise determinism. However, carefully bounded micro-agents that autonomously tune beams or adjust precoders within strict envelopes might prove valuable. The broader challenge is to reconcile the demand for predictability with the appeal of adaptive intelligence baked into hardware.
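To give a flavor of what quantized, lookup-table-friendly inference means in practice, the sketch below takes the weights of a tiny (randomly initialized) beam-scoring layer and quantizes them to int8, so the forward pass becomes integer multiply-accumulate plus a single rescale. This is a generic post-training quantization illustration in NumPy, not a vendor DU/RU implementation.

import numpy as np

# Generic post-training int8 quantization sketch (illustrative, not a vendor DU/RU pipeline).

rng = np.random.default_rng(7)
W = rng.normal(0.0, 0.3, size=(8, 16)).astype(np.float32)   # e.g., 16 features -> 8 beam scores

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: float32 -> int8 plus a scale factor."""
    scale = np.max(np.abs(x)) / 127.0
    return np.round(x / scale).astype(np.int8), scale

W_q, w_scale = quantize_int8(W)

def beam_scores_float(features: np.ndarray) -> np.ndarray:
    return W @ features

def beam_scores_int8(features: np.ndarray) -> np.ndarray:
    f_q, f_scale = quantize_int8(features.astype(np.float32))
    acc = W_q.astype(np.int32) @ f_q.astype(np.int32)         # integer MAC, as done in fixed-point HW
    return acc * (w_scale * f_scale)                          # single rescale back to float

if __name__ == "__main__":
    feats = rng.normal(0.0, 1.0, size=16).astype(np.float32)  # e.g., compressed CSI features
    ref, q = beam_scores_float(feats), beam_scores_int8(feats)
    print("argmax beam (float):", int(np.argmax(ref)), "| argmax beam (int8):", int(np.argmax(q)))
    print("max abs error:", float(np.max(np.abs(ref - q))))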

At this layer, most intelligence is “baked in” and must respect microsecond determinism timescales. Yet, rApps and xApps may still indirectly shape the DU/RU environment. The DU/RU do not run complex agentic loops themselves, but they inherit distilled intelligence from the higher layers. Micro-agents, if used, must be tightly bound. For example, an RU micro-agent may autonomously choose among two or three safe precoding matrices supplied by an xApp, but never generate them on its own.

Taking all the above together, the O-RAN stack can be seen as a continuum of intelligence, moving from the policy-heavy, interpretative functions at the SMO to the deterministic, silicon-bound execution at the RU. Agentic AI has the potential to change this continuum by shifting layers from passive executors to active participants. An agentic SMO might not only validate intents but generate them. An agentic Non-RT RIC might become an autonomous planner. An agentic RT-RIC could arbitrate between conflicting goals independently. And even the CU or DU might gain micro-agents that adjust parameters locally without instruction. This greater autonomy promises efficiency and adaptability but raises profound questions about accountability, oversight, and control. If agency is allowed to propagate too deeply into the stack, the risk is that millions of daily inferences are taken without transparent justification or the possibility of reversal. This situation is unlikely to be considered acceptable by regulators and would violate core provisions of the European Artificial Intelligence Act (EU AI Act). The main risks are a lack of adequate human oversight (Article 14), inadequate record-keeping and traceability (Article 12), failures of transparency (Article 13), and the inability to provide meaningful explanations to affected users (Article 86). Together, these gaps would undermine the broader lifecycle obligations on risk management and accountability set out in Articles 8–17. To mitigate that, openness becomes indispensable: open policies, open data schemas, model lineage, and transparent observability hooks allow agency to be exercised without undermining trust. In this way, the RAN of the future may become not only intelligent but agentic, provided that its newfound autonomy is balanced by openness, auditability, and human authority at the points that matter most. However, I suspect that reaching that point may be a much bigger challenge than developing the agentic AI framework and autonomous processes.
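One concrete building block for the record-keeping and traceability obligations mentioned above is an append-only decision log that records, for every automated action, which app and model produced it, under which policy, on what inputs, and how it can be reversed. The sketch below is a minimal, generic illustration of such a record (the fields are my own choice, not taken from the AI Act or any O-RAN specification), using a hash chain so that tampering with past entries becomes detectable.

import hashlib
import json
import time
from dataclasses import dataclass, asdict, field

# Minimal decision-audit-log sketch (fields are illustrative, not mandated by any standard).

@dataclass
class DecisionRecord:
    app_id: str                 # which rApp/xApp acted
    model_version: str          # model lineage
    policy_id: str              # the governing policy/intent
    inputs: dict                # the telemetry the decision was based on
    action: dict                # what was actually commanded
    rollback: dict              # how to reverse it
    timestamp: float = field(default_factory=time.time)

class AuditLog:
    def __init__(self):
        self.entries: list[dict] = []
        self._last_hash = "genesis"

    def append(self, record: DecisionRecord) -> str:
        payload = json.dumps(asdict(record), sort_keys=True)
        entry_hash = hashlib.sha256((self._last_hash + payload).encode()).hexdigest()
        self.entries.append({"hash": entry_hash, "prev": self._last_hash, "record": asdict(record)})
        self._last_hash = entry_hash
        return entry_hash

if __name__ == "__main__":
    log = AuditLog()
    rec = DecisionRecord(
        app_id="xapp-energy-saver", model_version="1.4.2", policy_id="es-cell-muting-001",
        inputs={"cell-17": {"prb_load": 0.06}}, action={"cell-17": "mute_carrier"},
        rollback={"cell-17": "unmute_carrier"},
    )
    print(log.append(rec)[:16], "...")   # chained hash of the first decision record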

While the promise of AI in O-RAN is compelling, it is equally important to recognize where existing functions already perform so effectively that AI has little to add. At higher layers, such as the SMO and the Non-RT RIC, the complexity of orchestration, policy translation, and long-horizon planning naturally creates a demand for AI. These are domains where deterministic rules quickly become brittle, and where the adaptive and generative capacities of modern models unlock new value. Similarly, the RT-RIC benefits from lightweight ML approaches because traffic dynamics and interference conditions shift on timescales that rule-based heuristics often struggle to capture. As one descends closer to execution, however, the incremental value of AI begins to diminish. In the CU domain, many bearer management and PDCP/RLC functions can be enhanced by predictive models, but much of the optimization is already well supported by deterministic algorithms that operate within known bounds. The same is even more pronounced at the DU and RU levels. Here, fundamental PHY and MAC procedures such as HARQ timing, CRC checks, coding and decoding, and link-layer retransmissions are highly optimized, deterministic, and hardware-accelerated. These functions have been refined over decades of wireless research, and their performance approaches the physical and information-theoretical limits. Beamforming and precoding illustrate this well. Linear algebraic methods such as zero-forcing and MMSE are deeply entrenched, efficient, and predictable. AI and ML can sometimes enhance them at the margins by improving CSI compression, reducing feedback overhead, or stabilizing non-stationary channels. Yet they are unlikely to displace the core mathematical solvers that already deliver excellent performance. Link adaptation is similar. While machine learning may offer marginal gains in dynamic or noisy conditions, conventional SINR-based thresholding remains highly effective and, crucially, deterministic. It is worth remembering that simply and arbitrarily applying AI or ML functionality to an architectural element does not necessarily mean it will make a difference or even turn out to be beneficial.

This distinction becomes especially relevant when considering the implications of agentic AI. In my opinion, agency is most useful at the top of the stack, where strategy, trade-offs, and ambiguity dominate. In the SMO or Non-RT RIC, agentic systems can propose strategies, negotiate policies, or adapt scenarios in ways that humans or static systems could never match. At the RT-RIC, carefully bounded agency may improve arbitration among competing applications. But deeper in the stack, particularly at the DU and RU, agency adds little value and risks undermining determinism. At microsecond timescales, where physics rules and deadlines are absolute, autonomy may be less of an advantage and more of a liability. The most practical role of AI here is supplementary, enabling anomaly detection, parameter fine-tuning, or assisting advanced antenna systems in ways that respect strict timing constraints. This balance of promise and limitation underscores a central point. AI is not a panacea for O-RAN, nor should it be applied indiscriminately.

Figure: Comparative view of how AI transforms RAN operations — contrasting classical vendor-proprietary SON approaches, Opanga’s vendor-agnostic RAIN platform, and O-RAN implementations using xApps and rApps for energy efficiency, spectral optimization, congestion control, anomaly detection, QoE, interference management, coverage, and security.

The Table above highlights how RAN intelligence has evolved from classical vendor-specific SON functions toward open O-RAN frameworks and Opanga’s RAIN platform. While Classical RAN relied heavily on embedded algorithms and static rules, O-RAN introduces rApps and xApps to distribute intelligence across near-real-time and non-real-time control loops. Opanga’s RAIN, however, stands out as a truly AI-native and vendor-agnostic platform that is already commercially deployed at scale today. By tackling congestion, energy reduction, and intelligent spectrum on/off management without reliance on DPI (which is, anyway, a losing strategy as QUIC becomes increasingly used) or proprietary stacks, RAIN directly addresses some of the most pressing efficiency and sustainability challenges in today’s networks. It also appears straightforward for Opanga to adapt its AI engines into rApps or xApps should the Open RAN market scale substantially in the future, reinforcing its potential as one of the strongest and most practical AI platforms in the RAN domain today.

A NATIVE-AI RAN TEASER.

Native-AI in the RAN context means that artificial intelligence is not just an add-on to existing processes, but is embedded directly into the system’s architecture, protocols, and control loops. Instead of having xApps and rApps bolted on top of traditional deterministic scheduling and optimization functions, a native-AI design treats learning, inference, and adaptation as first-class primitives in the way the RAN is built and operated. This is fundamentally different from today’s RAN system designs, where AI is mostly externalized, invoked at slower timescales, and constrained by legacy interfaces. In a native-AI architecture, intent, prediction, and actuation are tightly coupled at millisecond or even microsecond resolution, creating new possibilities for spectral efficiency, user experience optimization, and autonomous orchestration. A native-AI RAN would likely require heavier hardware at the edge of the network than today’s Open (or “classical”) RAN deployments. In the current architecture, the DU and RU rely on highly optimized deterministic hardware such as FPGAs, SmartNICs, and custom ASICs to execute PHY/MAC functions at predictable latencies and with tight power budgets. AI workloads are typically concentrated higher up in the stack, in the NRT-RIC or RT-RIC, where they can run on centralized GPU or CPU clusters without overwhelming the radio units. By contrast, a native-AI design pushes inference directly into the DU and even the RU, where microsecond-scale decisions on beamforming, HARQ, and link adaptation must be made. This implies the integration of embedded accelerators, such as AI-optimized ASICs, NPUs, or small-form-factor GPUs, into radio hardware, along with larger memory footprints for real-time model execution and storage. The resulting compute demand and cooling requirements could increase power consumption substantially beyond today’s SmartNIC-based O-RAN nodes, an effect that would be multiplied across millions of cell sites worldwide should such a design be chosen. This may (should!) raise concerns regarding both CapEx and OpEx due to higher costs for silicon and more demanding site engineering for power and heat management.
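
To give a feel for the “multiplied across millions of cell sites” point, the back-of-envelope sketch below scales an assumed per-site power increment across a hypothetical global footprint. Every input (per-site wattage, site count, electricity tariff) is a placeholder assumption chosen for illustration only, not a measurement or vendor figure.

  # Back-of-envelope: network-wide impact of adding AI accelerators at the DU/RU.
  # All inputs are illustrative assumptions, not operator or vendor data.
  extra_power_per_site_w = 300.0       # assumed added draw per site (accelerator + cooling)
  sites_worldwide = 7_000_000          # assumed global macro-site count (order of magnitude)
  hours_per_year = 8_760
  electricity_eur_per_kwh = 0.15       # assumed average tariff

  added_energy_twh = extra_power_per_site_w * sites_worldwide * hours_per_year / 1e12
  added_opex_beur = added_energy_twh * 1e9 * electricity_eur_per_kwh / 1e9

  print(f"Added energy: ~{added_energy_twh:.0f} TWh per year")
  print(f"Added energy OpEx: ~EUR {added_opex_beur:.1f} billion per year")

Even under these fairly conservative assumptions, a few hundred extra watts per site translates into tens of TWh and billions of euros annually at a global scale, which is why the hardware question cannot simply be waved away.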

Figure: A comparison of the possible differences between today’s Open RAN and the AI-Native RAN Architecture. I should point out that the AI-Native RAN architecture is my own depiction and may not reflect how such a design eventually turns out.

A native-AI RAN promises several advantages over existing architectures. By embedding intelligence directly into the control loops, the system can achieve higher spectral efficiency through ultra-fast adaptation of beamforming, interference management, and resource allocation, going beyond the limits of deterministic algorithms. It also allows for far more fine-grained optimization of the user experience, with decisions made per device, per flow, and in real-time, enabling predictive buffering and even semantic compression without noticeable delay. Operations themselves become more autonomous, with the RAN continuously tuning and healing itself in ways that reduce the need for manual intervention. Importantly, intent expressed at the management layer can be mapped directly into execution at the radio layer, creating continuity from policy to action that is missing in today’s O-RAN framework. Native-AI designs are also better able to anticipate and respond to extreme conditions, making the system more resilient under stress. Finally, they open the door to 6G concepts such as cell-less architectures, distributed massive MIMO, and AI-native PHY functions that cannot be realized under today’s layered, deterministic designs.

At the same time, the drawbacks of the Native-AI RAN approach may also be quite substantial. Embedding AI at microsecond control loops makes it almost impossible to trace reasoning steps or provide post-hoc explainability, creating tension with regulatory requirements such as the EU AI Act and NIS2. Because AI becomes the core operating fabric, mistakes, adversarial inputs, or misaligned objectives can cascade across the system much faster than in current architectures, amplifying the scale of failures. Continuous inference close to the radio layer also risks driving up compute demand and energy consumption far beyond what today’s SmartNIC- or FPGA-based solutions can handle. There is a danger of re-introducing vendor lock-in, as AI-native stacks may not interoperate cleanly with legacy xApps and rApps, undermining the very rationale of open interfaces. Training and refining these models requires sensitive operational and user data, raising privacy and data sovereignty concerns. Finally, the speed at which native-AI RANs operate makes meaningful human oversight nearly impossible, challenging the principle of human-in-the-loop control that regulators increasingly require for critical infrastructure operation.

Perhaps not too surprisingly, NVIDIA, a founding member of the AI-RAN Alliance, is a leading advocate for AI-native RAN, with strong leadership across infrastructure innovation, collaborative development, tooling, standard-setting, and future network frameworks. Their AI-Aerial platform and broad ecosystem partnerships illustrate their pivotal role in transitioning network architectures toward deeply integrated intelligence, especially in the 6G era. The AI-Native RAN concept, and the gap it opens compared to existing O-RAN and classical RAN approaches, will be the subject of a follow-up article I am preparing based on my current research into this field.

WHY REGULATORY AGENCIES MAY END THE AI PARTY (BEFORE IT REALLY STARTS).

Figure: Regulatory challenges for applying AI in critical telecom infrastructure, highlighting transparency, explainability, and auditability as key oversight requirements under European Commission mandates, posing constraints on AI-driven RAN systems.

We are about to “let loose” advanced AI/ML applications and processes across all aspects of our telecommunication networks, from the core all the way through to access and out to the consumers and businesses that rely on what is today regarded as highly critical infrastructure. At the orchestration layers, AI assistance reduces the cognitive load on operators while aiming to keep decision logic transparent, explainable, and auditable. In these roles, LLMs do not replace the specialized ML models running lower in the architecture. Instead, they enhance the orchestration layer by embedding reasoning and language understanding where time and resources permit. Yet it is here that one of the sharpest challenges emerges: the regulatory and policy scrutiny that inevitably follows when AI is introduced into critical infrastructure.

In the EU, the legal baseline now treats many network-embedded AI systems as high-risk by default whenever they are used as safety or operational components in the management and operation of critical digital infrastructure, a category that squarely encompasses modern telecom networks. Under the EU AI Act, such systems must satisfy stringent requirements for risk management, technical documentation, transparency, logging, human oversight, robustness, and cybersecurity, and they must be prepared for conformity assessment and market surveillance. If the AI used in RAN control or orchestration cannot meet these duties, deployment can be curtailed or prohibited until compliance is demonstrated. The same regulation now also imposes obligations on general-purpose AI (foundation/LLM) providers, including additional duties when models are deemed to pose systemic risk, to enhance transparency and safety across the supply chain that may support telecom use cases. This AI-specific layer builds upon the EU’s broader critical infrastructure and cybersecurity regime. The NIS2 Directive strengthens security and incident-reporting obligations for essential entities, explicitly including digital and communications infrastructure, while promoting supply-chain due diligence. This means that operators must demonstrate how they assess and manage risks from AI components and vendors embedded in their networks. The EU’s 5G Cybersecurity Toolbox adds a risk-based, vendor-agnostic lens to supplier decisions (applied to “high-risk” vendors), but the logic is general: provenance alone, whether from China, the US, Israel, or any “friendly” jurisdiction, does not exempt AI/ML components from rigorous technical and governance assurances. The Cyber Resilience Act extends horizontal cybersecurity duties to “products with digital elements,” which can capture network software and AI-enabled components, linking market access to secure-by-design engineering, vulnerability handling, and update practices.

Data-protection law also bites. GDPR Article 22 places boundaries on decisions based solely on automated processing that produce legal or similarly significant effects on individuals, a genuine concern as networks increasingly mediate critical services and safety-of-life communications. Recent case law from the Court of Justice of the EU underscores a right of access to meaningful information about automated decision-making “procedures and principles,” raising the bar for explainability and auditability in any network AI that profiles or affects individuals. In short, operators must be able to show their work, not just that an AI policy improved a KPI, but how it made the call. These European guardrails are mirrored (though not identically) elsewhere. The UK Telecoms Security Act and its Code of Practice impose enforceable security measures on providers. In the US, the voluntary NIST AI Risk Management Framework has become the de facto blueprint for AI governance, emphasizing transparency, accountability, and human oversight, principles that regulators can (and do) import into sectoral supervision. None of these frameworks cares only about “who made it”. They also care about how it performs, how it fails, how it is governed, and how it can be inspected.

The AI Act’s human-oversight requirement (i.e., Article 14 in the EU Artificial Intelligence Act) exists precisely to bound such risks, ensuring operators can intervene, override, or disable when behavior diverges from safety or fundamental rights expectations. Its technical documentation and transparency obligations require traceable design choices and lifecycle records. Where these assurances cannot be demonstrated, regulators may limit or ban such deployments in critical infrastructure.

Against this backdrop, proposals to deploy autonomous AI agents deeply embedded in the RAN stack face a (much) higher bar. Autonomy risks eroding the very properties that European law demands.

  • Transparency – Reasoning steps are difficult to reconstruct: Traditional RAN algorithms are rule-based and auditable, making their logic transparent and reproducible. By contrast, modern AI models, especially deep learning and generative approaches, embed decision logic in complex weight matrices, where the precise reasoning steps cannot be reconstructed. Post-hoc explainability methods provide only approximations, not complete causal transparency. This creates tension with regulatory frameworks such as the EU AI Act, which requires technical documentation, traceability, and user-understandable logic for high-risk AI in critical infrastructure. The NIS2 Directive and GDPR Article 22 add further obligations for traceability and meaningful explanation of automated decisions. If operators cannot show why an AI system in the RAN made a given decision, compliance risks arise. The challenge is amplified with autonomous agents (i.e., Agentic AI), where decisions emerge from adaptive policies and interactions that are inherently non-deterministic. For critical infrastructure, such as telecom networks, transparency is therefore not optional but a regulatory necessity. Opaque models may face restrictions or outright bans.
  • Explainability – Decisions must be understandable: Explainability means that operators and regulators can not only observe what a model decided, but also understand why. In RAN AI, this is challenging because deep models may optimize across multiple features simultaneously, making their outputs hard to interpret. The EU AI Act requires high-risk systems to provide explanations that are “appropriate to the intended audience,” meaning engineers must be able to trace the technical logic, while regulators and end-users require more accessible reasoning. Without explainability, trust in AI-driven traffic steering, slicing, or energy optimization cannot be established. A lack of clarity risks regulatory rejection and reduces operator confidence in deploying advanced AI at scale.
  • Auditability – Decisions must be verifiable: Auditability ensures that every AI-driven decision in the RAN can be logged, traced, and checked after the fact. Traditional rule-based schedulers are inherently auditable, but ML models, especially adaptive ones, require extensive logging frameworks to capture states, inputs, and outputs. The NIS2 Directive and the Cyber Resilience Act require such traceability for digital infrastructure, while the AI Act imposes additional obligations for record-keeping and post-market monitoring. Without audit trails, it becomes impossible to verify compliance or to investigate failures, outages, or discriminatory behaviors. In critical infrastructure, a lack of auditability is not just a technical gap but a regulatory showstopper, potentially leading to deployment bans.
  • Human Oversight – The challenge of real-time intervention: Both the EU AI Act and the NIS2 Directive require that high-risk AI systems remain under meaningful human oversight, with the possibility to override or disable AI-initiated actions. In the context of O-RAN, this creates a unique tension. Many RIC-driven optimizations and DU/RU control loops operate at millisecond or even microsecond timescales, where thousands or millions of inferences occur daily. Expecting a human operator to monitor, let alone intervene in real time, is technically infeasible. Instead, oversight must be implemented through policy guardrails, monitoring dashboards, fallback modes, and automated escalation procedures. The challenge is to satisfy the regulatory demand for human control without undermining the efficiency gains that AI brings. If this balance cannot be struck, regulators may judge certain autonomous functions non-compliant, slowing or blocking their deployment in critical telecom infrastructure.

The upshot for telecom is clear. Even as generative and agentic AI move into SMO/Non-RT orchestration for intent translation or semantic compression, the time-scale fundamentals do not change. RT and sub-ms loops must remain deterministic, inspectable, and controllable, with human-governed, well-documented interfaces mediating any AI influence. The regulatory risk is therefore not hypothetical. It is structural. As generative AI and LLMs move closer to the orchestration and policy layers of O-RAN, their opacity and non-deterministic reasoning raise questions about compliance. While such models may provide valuable tools for intent interpretation or telemetry summarization, their integration into live networks will only be viable if accompanied by robust frameworks for explainability, monitoring, and assurance. This places a dual burden on operators and vendors: to innovate in AI-driven automation, but also to invest in governance structures that can withstand regulatory scrutiny.
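
As a purely illustrative sketch of what a “human-governed, well-documented interface mediating any AI influence” could look like in practice, the snippet below clamps an AI-proposed parameter change against an operator-defined policy envelope and writes an audit record for every decision. The parameter name, bounds, and record fields are hypothetical and not taken from any O-RAN specification; a real deployment would integrate with the operator’s own policy and logging frameworks.

  import json, time

  # Hypothetical operator policy envelope for one tunable RAN parameter.
  POLICY = {"param": "dl_tx_power_dbm", "min": 30.0, "max": 46.0, "max_step": 1.0}

  def apply_with_guardrails(current_value, proposed_value, audit_log):
      """Clamp an AI-proposed value to the policy envelope and log the decision."""
      step = max(min(proposed_value - current_value, POLICY["max_step"]), -POLICY["max_step"])
      new_value = min(max(current_value + step, POLICY["min"]), POLICY["max"])
      audit_log.append({
          "timestamp": time.time(),
          "param": POLICY["param"],
          "proposed": proposed_value,
          "applied": new_value,
          "overridden": new_value != proposed_value,   # flag for human review
      })
      return new_value

  audit_log = []
  applied = apply_with_guardrails(current_value=40.0, proposed_value=46.5, audit_log=audit_log)
  print(applied, json.dumps(audit_log[-1], indent=2))

The essential design choice is that the AI only proposes; a deterministic, inspectable layer decides what is actually applied, and every proposal, clamp, and override leaves a record that can be replayed during an audit or incident investigation.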

In a European context, it is unlikely that any AI model will be permitted in the RAN unless it can pass the tests of explainability, auditability, and human oversight that regulators will, and indeed should, demand of functionality residing in critical infrastructure.

WRAPPING UP.

The article charts an evolution from SON-era automation to today’s AI-RAN vision, showing how O-RAN institutionalized “openness + intelligence” through a layered control stack: SMO/NRT-RIC for policy and learning, RT-RIC for fast decisions, and CU/DU/RU for deterministic execution at millisecond to microsecond timescales. It argues that LLMs belong at the top (SMO/NRT-RIC) for intent translation and semantic compression, while lightweight supervised/RL/TinyML models run the real-time loops below. “ChatGPT-like” systems (i.e., founded on human-generated context) are ill-suited to near-RT and sub-ms control. Synthetic data can stress-test rare events, but it demands extreme-aware statistics and validation against real holdouts to avoid misleading inference. Many low-level PHY/MAC primitives (HARQ, coding/decoding, CRC, MMSE precoding, and SINR-based link adaptation) are generally close to optimal, so AI/ML’s gains in these areas may be marginal and, at least initially, not where the focus should be.

Most importantly, pushing agentic autonomy too deep into the stack is likely to collide with both physics and law. Without reversibility, logging, and explainability, deployments risk breaching the EU AI Act’s requirements for human oversight, transparency, and lifecycle accountability. The practical stance is clear. Keep RT-RIC and DU/RU loops deterministic and inspectable, confine agency to SMO/NRT-RIC under strong policy guardrails and observability, and pair innovation with governance that can withstand regulatory scrutiny.

  • AI in RAN is evolutionary, not revolutionary, from SON and Elastic RAN-style coordination to GPU-accelerated AI-RAN and the 2024 AI-RAN Alliance.
  • O-RAN’s design incorporates AI via a hierarchical approach: SMO (governance/intent), NRT-RIC (training/policy), RT-RIC (near-real-time decisions), CU (shaping/QoS/UX, etc.), and DU/RU (deterministic PHY/MAC).
  • LLMs are well-suited for SMO/NRT-RIC for intent translation and semantic compression; however, they are ill-suited for RT-RIC or DU/RU, where millisecond–to–microsecond determinism is mandatory.
  • Lightweight supervised/RL/TinyML models, not “ChatGPT-like” systems, are the practical engines for near-real-time and real-time control loops.
  • Synthetic data for rare events, generated in the NRT-RIC and SMO, is valid but carries some risk. Approaches must be validated against real holdouts and statistics that account for extremes to avoid misleading inference.
  • Many low-level PHY/MAC primitives (HARQ, coding/decoding, CRC, classical precoding/MMSE, SINR-based link adaptation) are already near-optimal. AI may only add marginal gains at the edge.
  • Regulatory risk: Deep agentic autonomy without reversibility threatens EU AI Act Article 14 (human oversight). Operators must be able to intervene/override, which, to an extent, may defeat the more aggressive pursuits of autonomous network operations.
  • Regulatory risk: Opaque/unanalyzable models undermine transparency and record-keeping duties (Articles 12–13), especially if millions of inferences lack traceable logs and rationale.
  • Regulatory risk: For systems affecting individuals or critical services, explainability obligations (including GDPR Article 22 context) and AI Act lifecycle controls (Articles 8–17) require audit trails, documentation, and post-market monitoring, as well as curtailment of non-compliant agentic behavior risks.
  • Practical compliance stance: It may make sense to keep RT-RIC and DU/RU loops deterministic and inspectable, and constrain agency to SMO/NRT-RIC with strong policy guardrails, observability, and fallback modes.

ABBREVIATION LIST.

  • 3GPP – 3rd Generation Partnership Project.
  • A1 – O-RAN Interface between Non-RT RIC and RT-RIC.
  • AAS – Active Antenna Systems.
  • AISG – Antenna Interface Standards Group.
  • AI – Artificial Intelligence.
  • AI-RAN – Artificial Intelligence for Radio Access Networks.
  • AI-Native RAN – Radio Access Network with AI embedded into architecture, protocols, and control loops.
  • ASIC – Application-Specific Integrated Circuit.
  • CapEx – Capital Expenditure.
  • CPU – Central Processing Unit.
  • C-RAN – Cloud Radio Access Network.
  • CRC – Cyclic Redundancy Check.
  • CU – Centralized Unit.
  • DU – Distributed Unit.
  • E2 – O-RAN Interface between RT-RIC and CU/DU.
  • eCPRI – Enhanced Common Public Radio Interface.
  • EU – European Union.
  • FCAPS – Fault, Configuration, Accounting, Performance, Security.
  • FPGA – Field-Programmable Gate Array.
  • F1 – 3GPP-defined interface split between CU and DU.
  • GDPR – General Data Protection Regulation.
  • GPU – Graphics Processing Unit.
  • GRU – Gated Recurrent Unit.
  • HARQ – Hybrid Automatic Repeat Request.
  • KPI – Key Performance Indicator.
  • L1/L2 – Layer 1 / Layer 2 (in the OSI stack, PHY and MAC).
  • LLM – Large Language Model.
  • LSTM – Long Short-Term Memory.
  • MAC – Medium Access Control.
  • MANO – Management and Orchestration.
  • MIMO – Multiple Input, Multiple Output.
  • ML – Machine Learning.
  • MMSE – Minimum Mean Square Error.
  • NFVI – Network Functions Virtualization Infrastructure.
  • NIS2 – EU Directive on measures for a high common level of cybersecurity across the Union.
  • NPU – Neural Processing Unit.
  • NRT-RIC – Non-Real-Time RAN Intelligent Controller.
  • O1 – O-RAN Operations and Management Interface to network elements.
  • O2 – O-RAN Interface to cloud infrastructure (NFVI and MANO).
  • O-RAN – Open Radio Access Network.
  • OpEx – Operating Expenditure.
  • PDCP – Packet Data Convergence Protocol.
  • PHY – Physical Layer.
  • QoS – Quality of Service.
  • RAN – Radio Access Network.
  • rApp – Non-Real-Time RIC Application.
  • RET – Remote Electrical Tilt.
  • RIC – RAN Intelligent Controller.
  • RLC – Radio Link Control.
  • R-NIB – Radio Network Information Base.
  • RT-RIC – Real-Time RAN Intelligent Controller.
  • RU – Radio Unit.
  • SDAP – Service Data Adaptation Protocol.
  • SINR – Signal-to-Interference-plus-Noise Ratio.
  • SmartNIC – Smart Network Interface Card.
  • SMO – Service Management and Orchestration.
  • SON – Self-Organizing Network.
  • T-Labs – Deutsche Telekom Laboratories.
  • TTI – Transmission Time Interval.
  • UE – User Equipment.
  • US – United States.
  • WG2 – O-RAN Working Group 2 (Non-RT RIC & A1 interface).
  • WG3 – O-RAN Working Group 3 (RT-RIC & E2 Interface).
  • xApp – Real-Time RIC Application.

ACKNOWLEDGEMENT.

I want to acknowledge my wife, Eva Varadi, for her unwavering support, patience, and understanding throughout the creative process of writing this article.

FOLLOW-UP READING.

  1. Kim Kyllesbech Larsen (May 2023), “Conversing with the Future: An interview with an AI … Thoughts on our reliance on and trust in generative AI.” An introduction to generative models and large language models.
  2. Goodfellow, I., Bengio, Y., Courville, A. (2016), Deep Learning (Adaptive Computation and Machine Learning series). The MIT Press. Kindle Edition.
  3. Collins, S. T., & Callahan, C. W. (2009). Cultural differences in systems engineering: What they are, what they aren’t, and how to measure them. 19th Annual International Symposium of the International Council on Systems Engineering, INCOSE 2009, 2.
  4. Herzog, J. (2015). Software Architecture in Practice, Third Edition, Written by Len Bass, Paul Clements, and Rick Kazman. ACM SIGSOFT Software Engineering Notes, 40(1).
  5. O-RAN Alliance (October 2018). “O-RAN: Towards an Open and Smart RAN”.
  6. TS 103 982 – V8.0.0. (2024) – Publicly Available Specification (PAS); O-RAN Architecture Description (O-RAN.WG1.OAD-R003-v08.00).
  7. Lee, H., Cha, J., Kwon, D., Jeong, M., & Park, I. (2020, December 1). “Hosting AI/ML Workflows on O-RAN RIC Platform”. 2020 IEEE Globecom Workshops, GC Wkshps 2020 – Proceedings.
  8. TS 103 983 – V3.1.0. (2024) – Publicly Available Specification (PAS); A1 interface: General Aspects and Principles (O-RAN.WG2.A1GAP-R003-v03.01).
  9. TS 104 038 – V4.1.0. (2024) – Publicly Available Specification (PAS); E2 interface: General Aspects and Principles (O-RAN.WG3.E2GAP-R003-v04.01).
  10. TS 104 039 – V4.0.0. (2024) – Publicly Available Specification (PAS); E2 interface: Application Protocol (O-RAN.WG3.E2AP-R003-v04.00).
  11. TS 104 040 – V4.0.0. (2024) – Publicly Available Specification (PAS); E2 interface: Service Model (O-RAN.WG3.E2SM-R003-v04.00).
  12. O-RAN Work Group 3. (2025). Near-Real-time RAN Intelligent Controller E2 Service Model (E2SM) KPM Technical Specification.
  13. Bao, L., Yun, S., Lee, J., & Quek, T. Q. S. (2025). LLM-hRIC: LLM-empowered Hierarchical RAN Intelligent Control for O-RAN.
  14. Tang, Y., Srinivasan, U. C., Scott, B. J., Umealor, O., Kevogo, D., & Guo, W. (2025). End-to-End Edge AI Service Provisioning Framework in 6G ORAN.
  15. Gajjar, P., & Shah, V. K. (n.d.). ORANSight-2.0: Foundational LLMs for O-RAN.
  16. Elkael, M., D’Oro, S., Bonati, L., Polese, M., Lee, Y., Furueda, K., & Melodia, T. (2025). AgentRAN: An Agentic AI Architecture for Autonomous Control of Open 6G Networks.
  17. Gu, J., Zhang, X., & Wang, G. (2025). Beyond the Norm: A Survey of Synthetic Data Generation for Rare Events.
  18. Michael Peel (July 2024), The problem of ‘model collapse’: how a lack of human data limits AI progress, Financial Times.
  19. Decruyenaere, A., Dehaene, H., Rabaey, P., Polet, C., Decruyenaere, J., Demeester, T., & Vansteelandt, S. (2025). Debiasing Synthetic Data Generated by Deep Generative Models.
  20. Decruyenaere, A., Dehaene, H., Rabaey, P., Polet, C., Decruyenaere, J., Vansteelandt, S., & Demeester, T. (2024). The Real Deal Behind the Artificial Appeal: Inferential Utility of Tabular Synthetic Data.
  21. Vishwakarma, R., Modi, S. D., & Seshagiri, V. (2025). Statistical Guarantees in Synthetic Data through Conformal Adversarial Generation.
  22. Banbury, C. R., Reddi, V. J., Lam, M., Fu, W., Fazel, A., Holleman, J., Huang, X., Hurtado, R., Kanter, D., Lokhmotov, A., Patterson, D., Pau, D., Seo, J., Sieracki, J., Thakker, U., Verhelst, M., & Yadav, P. (2021). Benchmarking TinyML Systems: Challenges and Direction.
  23. Capogrosso, L., Cunico, F., Cheng, D. S., Fummi, F., & Cristani, M. (2023). A Machine Learning-oriented Survey on Tiny Machine Learning.
  24. Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., & Bengio, Y. (2016). Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations.
  25. AI Act. The AI Act is the first-ever comprehensive legal framework on AI, addressing the risks associated with AI, and is alleged to position Europe to play a leading role globally (as claimed by the European Commission).
  26. The EU Artificial Intelligence Act. For matters related explicitly to Critical Infrastructure, see in particular Annex III: High-Risk AI Systems Referred to in Article 6(2), Recital 55 and Article 6: Classification Rules for High-Risk AI Systems. I also recommend taking a look at “Article 14: Human Oversight”.
  27. European Commission (January 2020), “Cybersecurity of 5G networks – EU Toolbox of risk mitigating measures”.
  28. European Commission (June 2023), “Commission announces next steps on cybersecurity of 5G networks in complement to latest progress report by Member States”.
  29. European Commission, “NIS2 Directive: securing network and information systems”.
  30. Council of the European Union (October 2024), “Cyber resilience act: Council adopts new law on security requirements for digital products”.
  31. GDPR Article 22, “Automated individual decision-making, including profiling”. See also the following article from Crowell & Moring LLP: “Europe’s Highest Court Compels Disclosure of Automated Decision-Making “Procedures and Principles” In Data Access Request Case”.

The Telco Ascension to the Sky.

It’s 2045. Earth is green again. Free from cellular towers and the terrestrial radiation of yet another G, no longer needed to justify endless telecom upgrades. Humanity has finally transcended its communication needs to the sky, fully served by swarms of Low Earth Orbit (LEO) satellites.

Millions of mobile towers have vanished. No more steel skeletons cluttering skylines and nature in general. In their place: millions of beams from tireless LEO satellites, now whispering directly into our pockets from orbit.

More than 1,200 MHz of once terrestrially-bound cellular spectrum below the C-band had been uplifted to LEO satellites. Nearly 1,500 MHz between 3 and 6 GHz had likewise been liberated from its earthly confines, now aggressively pursued by the buzzing broadband constellations above.

It all works without a single modification to people’s beloved mobile devices. Everyone enjoys the same, or better, cellular service than in those wretched days of clinging to terrestrial-based infrastructure.

So, how did this remarkable transformation come about?

THE COVERAGE.

First, let’s talk about coverage. The chart below tells the story of orbital ambition through three very grounded curves. On the x-axis, we have the inclination angle, which is the degree to which your satellites are encouraged to tilt away from the equator to perform their job. On the y-axis: how much of the planet (and its people) they’re actually covering. The orange line gives us land area coverage. It starts low, as expected; tropical satellites don’t care much for Greenland. But as the inclination rises, so does their sense of duty to the extremes (the poles, that is). The yellow line represents population coverage, which grows faster than land, maybe because humans prefer to live near each other (or they like the scenery). By the time you reach ~53° inclination, you’re covering about 94% of humanity and 84% of land areas. The dashed white line represents mobile cell coverage, the real estate of telecom towers. A constellation at a 53° inclination would cover nearly 98% of all mobile site infrastructure. It serves as a proxy for economic interest; it closely follows the population curve, but adds just a bit of spice, reflecting urban density and tower sprawl.

This chart illustrates the cumulative global coverage achieved at varying orbital inclination angles for three key metrics: land area (orange), population (yellow), and estimated terrestrial mobile cell sites (dashed white). As inclination increases from equatorial (0°) to polar (90°), the percentage of global land and population coverage rises accordingly. Notably, population coverage reaches approximately 94% at ~53° inclination, a critical threshold for satellite constellations aiming to maximize global user reach without the complexity of polar orbits. The mobile cell coverage curve reflects infrastructure density and aligns closely with population distribution.
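
As a crude geometric sanity check rather than a reconstruction of the chart’s underlying data: the fraction of a sphere’s surface lying between latitudes ±φ is sin(φ), so a 53°-inclined constellation whose beams reach a few degrees beyond the sub-satellite latitude sees on the order of 80–85% of the Earth’s surface. That is broadly consistent with the land-area curve, while the population and cell-site curves run higher simply because people and towers cluster at low and mid latitudes. A minimal sketch:

  import math

  def surface_fraction_within_latitude(phi_deg):
      """Fraction of a sphere's surface between latitudes -phi and +phi (= sin(phi))."""
      return math.sin(math.radians(phi_deg))

  # Inclination plus an assumed few degrees of beam reach beyond the ground track.
  for latitude_deg in (53, 56, 60):
      print(latitude_deg, round(surface_fraction_within_latitude(latitude_deg), 3))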

The satellite constellation’s beams have replaced traditional terrestrial cells, providing a one-to-one coverage substitution. They not only replicate coverage in former legacy cellular areas but also extend service to regions that previously lacked connectivity due to low commercial priority from telecom operators. Today, over 3 million beams substitute obsolete mobile cells, delivering comparable service across densely populated areas. An additional 1 million beams have been deployed to cover previously unserved land areas, primarily rural and remote regions, using broader, lower-capacity beams with radii up to 10 kilometers. While these rural beams do not match the density or indoor penetration of urban cellular coverage, they represent a cost-effective means of achieving global service continuity, especially for basic connectivity and outdoor access in sparsely populated zones.

Conclusion? If you want to build a global satellite mobile network, you don’t need to orbit the whole planet. Just tilt your constellation enough to touch the crowded parts, and leave the tundra to the poets. However, this was the “original sin” of LEO Direct-2-Cellular satellites.

THE DEMAND.

Although global mobile traffic growth slowed notably after the early 2020s, and the terrestrial telecom industry drifted toward its “end of history” moment, the orbital network above inherited a double burden. Not only did satellite constellations need to deliver continuous, planet-wide coverage, a milestone legacy telecoms had never reached, despite millions of ground sites, but they also had to absorb globally converging traffic demands as billions of users crept steadily toward the throughput mean.

This chart shows the projected DL traffic across a full day (UTC), based on regions where the local time falls within the evening Busy Hour window (17:00–22:00) and that are within satellite coverage (minimum elevation ≥ 25°). The BH population is calculated hourly, taking into account time zone alignment and visibility, with a 20% concurrency rate applied. Each active user is assumed to consume 500 Mbps downlink in 2045.
This chart shows the uplink traffic demand experienced across a full day (UTC), based on regions under Busy Hour conditions (17:00–22:00 local time) and visible to the satellite constellation (with a minimum elevation angle of 25°). For each UTC hour, the BH population within coverage is calculated using global time zone mapping. Assuming a 20% concurrency rate and an average uplink throughput of 50 Mbps per active user, the total UL traffic is derived. The resulting curve reflects how demand shifts in response to the Earth’s rotation beneath the orbital band.

The radio access uplink architecture relies on low round-trip times for proper scheduling, timing alignment, and HARQ (Hybrid Automatic Repeat Request) feedback cycles. The propagation delay at 350 km yields a round-trip time of about 2.5 to 3 milliseconds, which falls within the bounds of what current specifications can accommodate. This is particularly important for latency-sensitive applications such as voice, video, and interactive services that require low jitter and reliable feedback mechanisms. In contrast, orbits at 550 km or above push latency closer to the edge of what NR protocols can tolerate, which could hinder performance or require non-standard adaptations. The beam geometry also plays a central role. At lower altitudes, satellite beams projected to the ground are inherently smaller. This smaller footprint translates into tighter beam patterns with narrower 3 dB cut-offs, which significantly improves frequency reuse and spatial isolation. These attributes are important for deploying high-capacity networks in densely populated urban environments, where interference and spectrum efficiency are paramount. Narrower beams allow D2C operators to dynamically steer coverage toward demand centers while minimizing adjacent-beam interference. Operating at 350 km is not without drawbacks. The satellite’s ground footprint at this altitude is smaller, meaning that more satellites are required to achieve full Earth coverage. Additionally, satellites at this altitude are exposed to greater atmospheric drag, resulting in shorter orbital lifespans unless they are equipped with more powerful or efficient propulsion systems to maintain altitude. The current design aims for a 5-year orbital lifespan. The shorter lifespan does have an upside, though, as it reduces the long-term risks of space debris: deorbiting occurs naturally and quickly at lower altitudes, making the constellation more sustainable in the long term.
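
A quick sketch of the propagation numbers quoted above, assuming a spherical Earth of radius 6,371 km. The slant range grows quickly as the elevation angle drops, which is why the round-trip time is best quoted as a range rather than the ~2.3 ms zenith value.

  import math

  R_EARTH_KM = 6371.0
  C_KM_PER_S = 299_792.458

  def slant_range_km(altitude_km, elevation_deg):
      """Ground-to-satellite distance for a given elevation angle (spherical Earth)."""
      e = math.radians(elevation_deg)
      r = R_EARTH_KM + altitude_km
      return math.sqrt(r**2 - (R_EARTH_KM * math.cos(e))**2) - R_EARTH_KM * math.sin(e)

  for elevation in (90, 60, 25):
      d = slant_range_km(350, elevation)
      rtt_ms = 2 * d / C_KM_PER_S * 1e3
      print(f"elevation {elevation:>2}°: slant range ≈ {d:4.0f} km, one-hop RTT ≈ {rtt_ms:.1f} ms")

At high elevations the one-hop round trip lands in the 2.3–3 ms range quoted above, stretching toward roughly 5 ms as the elevation drops to the 25° coverage edge.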

THE CONSTELLATION.

The satellite-to-cellular infrastructure has now fully matured into a global-scale system capable of delivering mobile broadband services that are not only on par with, but in many regions surpass, the performance of terrestrial cellular networks. At its core lies a constellation of low Earth orbit satellites operating at an altitude of 350 kilometers, engineered to provide seamless, high-quality indoor coverage for both uplink and downlink, even in dense urban environments.

To meet the evolving expectations of mobile users, each satellite beam delivers a minimum of 50 Mbps of uplink capacity and 500 Mbps of downlink capacity per user, ensuring full indoor quality even in highly cluttered environments. Uplink transmissions utilize the 600 MHz to 1800 MHz band, providing 1200 MHz of aggregated bandwidth. Downlink channels aggregate 1500 MHz of spectrum within the range from 2100 MHz to the upper edge of the C-band. At the network’s busiest hour (e.g., around 20:00 local time) across the most densely populated regions south of 53° latitude, the system supports a peak throughput of 60,000 Tbps for downlink and 6,000 Tbps for uplink. To guarantee reliability under real-world utilization, the system is engineered with a 25% capacity overhead, raising the design thresholds to 75,000 Tbps for DL and 7,500 Tbps for UL during peak demand.

Each satellite beam is optimized for high spectral efficiency, leveraging advanced beamforming, adaptive coding, and cutting-edge modulation. Under these conditions, downlink beams deliver 4.5 Gbps, while uplink beams, facing more challenging reception constraints, achieve 1.8 Gbps. Meeting the adjusted peak-hour demand requires approximately 16.7 million active DL beams and 4.2 million UL beams, amounting to over 20.8 million simultaneous beams concentrated over the peak demand region.

Thanks to significant advances in onboard processing and power systems, each satellite now supports up to 5,000 independent beams simultaneously. This capability reduces the number of satellites required to meet regional peak demand to approximately 4,200. These satellites are positioned over a region spanning an estimated 45 million square kilometers, covering the evening-side urban and suburban areas of the Americas, Europe, Africa, and Asia. This configuration yields a beam density of nearly 0.46 beams per square kilometer, equivalent to one active beam for every 2 square kilometers, densely overlaid to provide continuous, per-user, indoor-grade connectivity. In urban cores, beam radii are typically below 1 km, whereas in lower-density suburban and rural areas, the system adjusts by using larger beams without compromising throughput.

Because peak demand rotates longitudinally with the Earth’s rotation, only a portion of the entire constellation is positioned over this high-demand region at any given time. To ensure 4,200 satellites are always present over the region during peak usage, the total constellation comprises approximately 20,800 satellites, distributed across several hundred orbital planes. These planes are inclined and phased to optimize temporal availability, revisit frequency, and coverage uniformity while minimizing latency and handover complexity.
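
The dimensioning chain in the preceding paragraphs can be reproduced with a few lines of arithmetic. The throughput, per-beam, and per-satellite figures are those quoted in the text; the 20% “share of the constellation over the peak region” at the end is my own assumption, introduced only to reconcile the 4,200 active and roughly 20,800 total satellites.

  # Reproducing the dimensioning chain above. The 20% share over the peak region
  # is an assumption (not stated in the text) used to bridge active vs. total satellites.
  dl_peak_tbps, ul_peak_tbps = 60_000, 6_000
  overhead = 1.25                               # 25% engineering margin
  dl_beam_gbps, ul_beam_gbps = 4.5, 1.8         # per-beam throughput
  beams_per_satellite = 5_000
  region_km2 = 45e6
  share_over_region = 0.20                      # assumption

  dl_beams = dl_peak_tbps * overhead * 1e3 / dl_beam_gbps
  ul_beams = ul_peak_tbps * overhead * 1e3 / ul_beam_gbps
  total_beams = dl_beams + ul_beams
  satellites_over_region = total_beams / beams_per_satellite

  print(f"DL beams ≈ {dl_beams/1e6:.1f} M, UL beams ≈ {ul_beams/1e6:.1f} M")
  print(f"Active satellites over the peak region ≈ {satellites_over_region:,.0f}")
  print(f"Beam density ≈ {total_beams/region_km2:.2f} beams per km²")
  print(f"Total constellation ≈ {satellites_over_region/share_over_region:,.0f} satellites")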

The resulting Direct-to-Cellular satellite constellation and system of today is among the most ambitious communications infrastructures ever created. With more than 20 million simultaneous beams dynamically allocated across the globe, it has effectively supplanted traditional mobile towers in many regions, delivering reliable, high-speed, indoor-capable broadband connectivity precisely where and when people need it.

When Telcos Said ‘Not Worth It,’ Satellites Said ‘Hold My Beam.’ In the world of 2045, even the last village at the end of the dirt road streams at 500 Mbps. No tower in sight, just orbiting compassion and economic logic finally aligned.

THE SATELLITE.

The Cellular Device to Satellite Path.

The uplink antennas aboard the Direct-to-Cellular satellites have been specifically engineered to reliably receive indoor-quality transmissions from standard (unmodified) mobile devices operating within the 600 MHz to 1800 MHz band. Each device is expected to deliver a minimum of 50 Mbps uplink throughput, even when used indoors in heavily cluttered urban environments. This performance is made possible through a combination of wideband spectrum utilization, precise beamforming, and extremely sensitive receiving systems in orbit. The satellite uplink system operates across 1200 MHz of aggregated bandwidth (e.g., 60 channels of 20 MHz), spanning the entire upper UHF and lower S-band. Because uplink signals originate from indoor environments, where wall and structural penetration losses can exceed 20 dB, the satellite link budget must compensate for the combined effects of indoor attenuation and free-space propagation at a 350 km orbital altitude. At 600 MHz, which represents the lowest frequency in the UL band, the free-space path loss alone is approximately 139 dB. When this is compounded with indoor clutter and penetration losses, the total attenuation the satellite must overcome reaches approximately 159 dB or more.

Rather than specifying the antenna system at a mid-band average frequency, such as 900 MHz (i.e., the mid-band of the 600 MHz to 1800 MHz range), the system has been conservatively engineered for worst-case performance at 600 MHz. This design philosophy ensures that the antenna will meet or exceed performance requirements across the entire uplink band, with higher frequencies benefiting from naturally improved gain and narrower beamwidths. This choice guarantees that even the least favorable channels, those near 600 MHz, support reliable indoor-grade uplink service at 50 Mbps, with a minimum required SNR of 10 dB to sustain up to 16-QAM modulation. Achieving this level of performance at 600 MHz necessitated a large physical aperture. The uplink receive arrays on these satellites have grown to approximately 700 to 750 m² in area, and are constructed using modular, lightweight phased-array tiles that unfold in orbit. This aperture size enables the satellite to achieve a receive gain of approximately 45 dBi at 600 MHz, which is essential for detecting low-power uplink transmissions with high spectral efficiency, even from users deep indoors and under cluttered conditions.
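
The sizing above follows from two textbook formulas, free-space path loss and aperture gain, which the sketch below evaluates at the stated worst-case parameters. An ideal aperture efficiency is assumed for simplicity, which is why the computed gain lands slightly above the ~45 dBi quoted; the ~20 dB of indoor penetration and clutter loss is taken directly from the figure given in the text.

  import math

  C = 299_792_458.0                          # speed of light, m/s

  def fspl_db(freq_hz, distance_m):
      """Free-space path loss: 20*log10(4*pi*d*f/c)."""
      return 20 * math.log10(4 * math.pi * distance_m * freq_hz / C)

  def aperture_gain_dbi(area_m2, freq_hz, efficiency=1.0):
      """Gain of an aperture antenna: 10*log10(eta * 4*pi*A / lambda^2)."""
      wavelength = C / freq_hz
      return 10 * math.log10(efficiency * 4 * math.pi * area_m2 / wavelength**2)

  fspl = fspl_db(600e6, 350e3)               # ≈ 139 dB at zenith
  total_loss = fspl + 20                     # plus ~20 dB indoor penetration and clutter
  gain = aperture_gain_dbi(725, 600e6)       # ≈ 45.6 dBi for a ~725 m² array (ideal efficiency)

  print(f"FSPL at 600 MHz over 350 km: {fspl:.1f} dB")
  print(f"Total loss incl. indoor penetration: ≈ {total_loss:.0f} dB")
  print(f"Ideal aperture gain for 725 m²: {gain:.1f} dBi")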

Unlike earlier systems, such as AST SpaceMobile’s BlueBird 1, launched in the mid-2020s with an aperture of around 900 m² and challenged by the need to acquire indoor uplink signals, today’s Direct-to-Cellular (D2C) satellites optimize the uplink and downlink arrays separately. This separation allows each aperture to be custom-designed for its frequency and link budget requirements. The uplink arrays incorporate wideband, dual-polarized elements, such as log-periodic or Vivaldi structures, backed by high-dynamic-range low-noise amplifiers and a distributed digital beamforming backend. Assisted by real-time AI beam management, each satellite can simultaneously support and track up to 2,500 uplink beams, dynamically allocating them across the active coverage region.

Despite their size, these receive arrays are designed for compact launch configurations and efficient in-orbit deployment. Technologies such as inflatable booms, rigidizable mesh structures, and ultralight composite materials allow the arrays to unfold into large apertures while maintaining structural stability and minimizing mass. Because these arrays are passive receivers, thermal loads are significantly lower than those of transmit systems. Heat generation is primarily limited to the digital backend and front-end amplification chains, which are distributed across the array surface to facilitate efficient thermal dissipation.

The Satellite to Cellular Device Path.

The downlink communication path aboard Direct-to-Cellular satellites is engineered as a fully independent system, physically and functionally separated from the uplink antenna. This separation reflects a mature architectural philosophy that has been developed over decades of iteration. The downlink and uplink systems serve fundamentally different roles and operate across vastly different frequency bands, each with its own power, thermal, and antenna constraints. The downlink system operates in the frequency range from 2100 MHz up to the upper end of the C-band, typically around 4200 MHz. This is significantly higher than the uplink range, which extends from 600 to 1800 MHz. Due to this disparity in wavelength, a factor of up to seven between the lowest uplink and highest downlink frequencies, a shared aperture is neither practical nor efficient. It is widely accepted today that integrating transmit and receive functions into a single broadband aperture would compromise performance on both ends. Instead, today’s satellites utilize a dual-aperture approach, with the downlink antenna system optimized exclusively for high-frequency transmission and the uplink array designed independently for low-frequency reception.

In order to deliver 500 Mbps per user with full indoor coverage, each downlink beam must sustain approximately 4.5 Gbps, accounting for spectral reuse and beam overlap. At an orbital altitude of 350 kilometers, downlink beams must remain narrow, typically covering no more than a 1-kilometer radius in urban zones, to match uplink geometry and maintain beam-level concurrency. The antenna gain required to meet these demands is in the range of 50 to 55 dBi, which the satellites achieve using high-frequency phased arrays with a physical aperture of approximately 100 to 200 m². Because the downlink system is responsible for high-power transmission, the antenna tiles incorporate GaN-based solid-state power amplifiers (SSPAs), which deliver hundreds of watts per panel. This results in an overall effective isotropic radiated power (EIRP) of 50 to 60 dBW per beam, sufficient to reach deep indoor devices even at the upper end of the C-band. The power-intensive nature of the downlink system introduces thermal management challenges (described in the next section), which are addressed by physically isolating the transmit arrays from the receiver surfaces. The downlink and uplink arrays are positioned on opposite sides of the spacecraft bus or thermally decoupled through deployable booms and shielding layers.
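
The same aperture-gain formula explains the downlink numbers, and it also lets one back out the per-beam transmit power implied by the quoted EIRP. The aperture size, mid-band frequency, EIRP mid-point, and beam count below are taken from the text; the amplifier efficiency is my own assumption, so the resulting heat figure is an order-of-magnitude check rather than a design value.

  import math

  C = 299_792_458.0

  def aperture_gain_dbi(area_m2, freq_hz, efficiency=1.0):
      wavelength = C / freq_hz
      return 10 * math.log10(efficiency * 4 * math.pi * area_m2 / wavelength**2)

  gain_dbi = aperture_gain_dbi(150, 3e9)     # ~150 m² array at a mid-band ~3 GHz ≈ 52.8 dBi
  eirp_dbw = 55                              # mid-point of the quoted 50–60 dBW per beam
  tx_power_w = 10 ** ((eirp_dbw - gain_dbi) / 10)
  beams = 2_500                              # simultaneous DL beams per satellite
  pa_efficiency = 0.45                       # assumed GaN SSPA efficiency

  dc_input_kw = beams * tx_power_w / pa_efficiency / 1e3
  heat_kw = dc_input_kw * (1 - pa_efficiency)

  print(f"DL aperture gain ≈ {gain_dbi:.1f} dBi")
  print(f"Per-beam TX power ≈ {tx_power_w:.1f} W")
  print(f"DL PA chain: DC input ≈ {dc_input_kw:.1f} kW, rejected as heat ≈ {heat_kw:.1f} kW")

Depending on where in the 50–60 dBW range a beam operates, the DL chain alone draws on the order of 3 to 30 kW of DC power, which is broadly consistent with the double-digit-kilowatt thermal problem described in the satellite architecture section further below.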

The downlink beamforming is fully digital, allowing real-time adaptation of beam patterns, power levels, and modulation schemes. Each satellite can form and manage up to 2,500 independent downlink beams, which are coordinated with their uplink counterparts to ensure tight spatial and temporal alignment. Advanced AI algorithms help shape beams based on environmental context, usage density, and user motion, thereby further improving indoor delivery performance. The modulation schemes used on the downlink frequently reach 256-QAM and beyond, with spectral efficiencies of six to eight bits per second per Hz in favorable conditions.

The physical deployment of the downlink antenna varies by platform, but most commonly consists of front-facing phased array panels or cylindrical surfaces fitted with azimuthally distributed tiles. These panels can be either fixed or mounted on articulated platforms that allow active directional steering during orbit, depending on the beam coverage strategy, an arrangement also called gimballed.

No Bars? Not on This Planet. In 2045, even the polar bears will have broadband. When satellites replaced cell towers, the Arctic became just another neighborhood in the global gigabit grid.

Satellite System Architecture.

The Direct-to-Cellular satellites have evolved into high-performance, orbital base stations that far surpass the capabilities of early systems, such as AST SpaceMobile’s Bluebird 1 or SpaceX’s Starlink V2 Mini. These satellites are engineered not merely to relay signals, but to deliver full-featured indoor mobile broadband connectivity directly to standard handheld devices, anywhere on Earth, including deep urban cores and rural regions that have been historically underserved by terrestrial infrastructure.

As described earlier, today’s D2C satellite supports up to 5,000 simultaneous beams, enabling real-time uplink and downlink with mobile users across a broad frequency range. The uplink phased array, designed to capture low-power, deep-indoor signals at 600 MHz, occupies approximately 750 m². The DL array, optimized for high-frequency, high-power transmission, spans 150 to 200 m². Unlike early designs, such as Bluebird 1, which used a single, large combined antenna, today’s satellites separate the uplink and downlink arrays to optimize each for performance, thermal behavior, and mechanical deployment. These two systems are typically mounted on opposite sides of the satellite and thermally isolated from one another.

Thermal management is one of the defining challenges of this architecture. While AST’s Bluebird 1 (i.e., from mid-2020s) boasted a large antenna aperture approaching 900 m², its internal systems generated significantly less heat. Bluebird 1 operated with a total power budget of approximately 10 to 12 kilowatts, primarily dedicated to a handful of downlink beams and limited onboard processing. In contrast, today’s D2C satellite requires a continuous power supply of 25 to 35 kilowatts, much of which must be dissipated as heat in orbit. This includes over 10 kilowatts of sustained RF power dissipation from the DL system alone, in addition to thermal loads from the digital beamforming hardware, AI-assisted compute stack, and onboard routing logic. The key difference lies in beam concurrency and onboard intelligence. The satellite manages thousands of simultaneous, high-throughput beams, each dynamically scheduled and modulated using advanced schemes such as 256-QAM and beyond. It must also process real-time uplink signals from cluttered environments, allocate spectral and spatial resources, and make AI-driven decisions about beam shape, handovers, and interference mitigation. All of this requires a compute infrastructure capable of delivering 100 to 500 TOPS (tera-operations per second), distributed across radiation-hardened processors, neural accelerators, and programmable FPGAs. Unlike AST’s Bluebird 1, which offloaded most of its protocol stack to the ground, today’s satellites run much of the 5G core network onboard. This includes RAN scheduling, UE mobility management, and segment-level routing for backhaul and gateway links.

This computational load compounds the satellite’s already intense thermal environment. Passive cooling alone is insufficient. To manage thermal flows, the spacecraft employs large radiator panels located on its outer shell, advanced phase-change materials embedded behind the DL tiles, and liquid loop systems that transfer heat from the RF and compute zones to the radiative surfaces. These thermal systems are intricately zoned and actively managed, preventing the heat from interfering with the sensitive UL receive chains, which require low-noise operation under tightly controlled thermal conditions. The DL and UL arrays are thermally decoupled not just to prevent crosstalk, but to maintain stable performance in opposite thermal regimes: one dominated by high-power transmission, the other by low-noise reception.

To meet its power demands, the satellite utilizes a deployable solar sail array that spans 60 to 80 m². These sails are fitted with ultra-high-efficiency solar cells capable of exceeding 30–35% efficiency. Mounted on articulated booms that track the sun independently of the satellite’s Earth-facing orientation, they provide enough current to sustain continuous operation during daylight periods, while high-capacity batteries, likely based on lithium-sulfur or solid-state chemistry, handle nighttime and eclipse coverage. Compared to the Starlink V2 Mini, which generates around 2.5 to 3.0 kilowatts, and the Bluebird 1, which operates at roughly 10–12 kilowatts, today’s system requires roughly three times the generation and five times the thermal rejection capability of the initial satellites of the mid-2020s.
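
A quick check on the power budget: above the atmosphere the solar constant is about 1,361 W/m², so the quoted sail area and cell efficiency neatly bracket the stated 25–35 kW demand, ignoring pointing losses, degradation, and the eclipse duty cycle that the batteries are there to cover.

  SOLAR_CONSTANT_W_M2 = 1361.0     # irradiance above the atmosphere

  for area_m2 in (60, 80):
      for cell_efficiency in (0.30, 0.35):
          power_kw = SOLAR_CONSTANT_W_M2 * area_m2 * cell_efficiency / 1e3
          print(f"{area_m2} m² at {cell_efficiency:.0%} efficiency: ≈ {power_kw:.1f} kW")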

Structurally, the satellite is designed to support this massive infrastructure. It uses a rigid truss core (i.e., lattice structure) with deployable wings for the DL system and a segmented, mesh-based backing for the UL aperture. Propulsion is provided by Hall-effect or ion thrusters, with 50 to 100 kilograms of inert propellant onboard to support three to five years of orbital station-keeping at an altitude of 350 kilometers. This height is chosen for its latency and spatial reuse advantages, but it also imposes continuous drag, requiring persistent thrust.

The AST Bluebird 1 may have appeared physically imposing in its time due to its large antenna, but in thermal, computational, and architectural complexity, today’s D2C satellite, 20 years later, far exceeds anything imagined two decades earlier. The heat generated by its massive beam concurrency, onboard processing, and integrated network core makes its thermal management challenge not only more severe than Bluebird 1’s but also one of the primary limiting factors in the satellite’s physical and functional design. This thermal constraint, in turn, shapes the layout of its antennas, compute stack, power system, and propulsion.

Mass and Volume Scaling.

AST’s Bluebird 1, launched in the mid-2020s, had a launch mass of approximately 1,500 kilograms. Its headline feature was a 900 m² unfoldable antenna surface, designed to support direct cellular connectivity from space. However, despite its impressive aperture, the system was constrained by limited beam concurrency, modest onboard computing power, and a reliance on terrestrial cores for most network functions. The bulk of its mass was dominated by structural elements supporting its large antenna surface and the power and thermal subsystems required to drive a relatively small number of simultaneous links. Bluebird’s propulsion was chemical, optimized for initial orbit raising and limited station-keeping, and its stowed volume fit comfortably within standard medium-lift payload fairings. Starlink’s V2 Mini, although smaller in physical aperture, featured a more balanced and compact architecture. Weighing roughly 800 kilograms at launch, it was designed around high-throughput broadband rather than direct-to-cellular use. Its phased array antenna surface was closer to 20–25 m², and it was optimized for efficient manufacturing and high-density orbital deployment. The V2 Mini’s volume was tightly packed, with solar panels, phased arrays, and propulsion modules folded into a relatively low-profile bus optimized for rapid deployment and low-cost launch stacking. Its onboard compute and thermal systems were scaled to match its more modest power budget, which typically hovered around 2.5 to 3.0 kilowatts.

In contrast, today’s satellites occupy an entirely new performance regime. The dry mass of the satellite ranges between 2,500 and 3,500 kilograms, depending on the specific configuration, thermal shielding, and structural deployment method. This accounts for its large deployable arrays, high-density digital payload, radiator surfaces, power regulation units, and internal trusses. The wet mass, including onboard fuel reserves for at least 5 years of station-keeping at 350 km altitude, increases by up to 800 kilograms, depending on the propulsion type (e.g., Hall-effect or gridded ion thrusters) and orbital inclination. This brings the total launch mass to approximately 3,000 to 4,500 kilograms, or more than double AST’s old Bluebird 1 and roughly five times that of SpaceX’s Starlink V2 Mini.

Volume-wise, the satellites require a significantly larger stowed configuration than either AST’s Bluebird 1 or SpaceX’s Starlink V2 Mini. While both of those earlier systems were designed to fit within traditional launch fairings, Bluebird 1 utilized a folded hinge-based boom structure, and Starlink V2 Mini was optimized for ultra-compact stacking. Today’s satellite demands next-generation fairing geometries, such as 5-meter-class launchers or dual-stack configurations. This is driven by the dual-antenna architecture and radiator arrays, which, although cleverly folded during launch, expand dramatically once deployed in orbit. In its operational configuration, the satellite spans tens of meters across its antenna booms and solar sails. The uplink array, built as a lightweight, mesh-backed surface supported by rigidizing frames or telescoping booms, unfolds to a diameter of approximately 30 to 35 meters, substantially larger than Bluebird 1’s ~20–25 meter maximum span and far beyond the roughly 10-meter unfolded span of Starlink V2 Mini. The downlink panels, although smaller, are arranged for precise gimballed orientation (i.e., a pivoting mechanism allowing rotation or tilt along one or more axes) and integrated thermal control, which further expands the total deployed volume envelope. The volumetric footprint of today’s D2C satellite is not only larger in surface area but also more spatially complex, as its segregated UL and DL arrays, thermal zones, and solar wings must avoid interference while maintaining structural and thermal equilibrium, in contrast to the simplified flat-pack layout of Starlink V2 Mini and the monolithic boom-deployed design of Bluebird 1.

The increase in dry mass, wet mass, and deployed volume is not a byproduct of inefficiency, but a direct result of the very substantial performance improvements that were required to replace terrestrial mobile towers with orbital systems. Today’s D2C satellites deliver an order of magnitude more beam concurrency, spectral efficiency, and per-user performance than their 2020s predecessors. This is reflected in every subsystem, from power generation and antenna design to propulsion, thermal control, and computing. As such, they represent the emergence of a new class of satellite altogether: not merely a space-based relay or broadband node, but a full-featured, cloud-integrated orbital RAN platform capable of supporting the global cellular fabric from space.

CAN THE FICTION BECOME A REALITY?

From the perspective of 2025, the vision of a global satellite-based mobile network providing seamless, unmodified indoor connectivity at terrestrial-grade rates (50 Mbps uplink, 500 Mbps downlink) appears extraordinarily ambitious. The technical description from 2045 outlines a constellation of 20,800 LEO satellites, each capable of supporting 5,000 independent full-duplex beams across massive bandwidths, while integrating onboard processing, AI-driven beam control, and a full 5G core stack. To reach such a mature architecture within two decades demands breakthrough progress across multiple fronts.

The most daunting challenge lies in achieving indoor-grade cellular uplink at frequencies as low as 600 MHz from devices never intended to communicate with satellites. Today, even powerful ground-based towers struggle to achieve sub-1 GHz uplink coverage inside urban buildings. For satellites at an altitude of 350 km, the free-space path loss alone at 600 MHz is approximately 139 dB at nadir, and higher still at lower elevation angles. When combined with clutter, penetration, and polarization mismatches, the system must close a link budget approaching 153–160 dB, from a smartphone transmitting just 23 dBm (200 mW) or less. No satellite today, including AST SpaceMobile’s BlueBird 1, has demonstrated indoor uplink reception at this scale or consistency. To overcome this, the proposed system assumes deployable uplink arrays of 750 m² with gain levels exceeding 45 dBi, supported by hundreds of simultaneously steerable receive beams and ultra-low-noise front-end receivers. From a 2025 lens, the mechanical deployment of such arrays, their thermal stability, calibration, and mass management pose nontrivial risks. Today’s large phased arrays are still in their infancy in space, and adaptive beam tracking from fast-moving LEO platforms remains unproven at the required scale and beam density.
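
For readers who want to see where these numbers come from, here is a minimal uplink link-budget sketch in Python. The handset power (23 dBm), array area (750 m²), altitude (350 km), and carrier frequency (600 MHz) are taken from the text; the aperture efficiency, penetration and pointing losses, receiver noise figure, and uplink bandwidth are my own illustrative assumptions, not part of the described design.

```python
import math

C = 299_792_458.0  # speed of light [m/s]

def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss: 20*log10(4*pi*d/lambda)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C)

def aperture_gain_dbi(area_m2: float, freq_hz: float, efficiency: float = 0.6) -> float:
    """Gain of an aperture antenna: G = eta * 4*pi*A / lambda^2."""
    lam = C / freq_hz
    return 10.0 * math.log10(efficiency * 4.0 * math.pi * area_m2 / lam**2)

freq_hz  = 600e6
slant_m  = 350e3                              # near nadir; slant range grows at low elevations
path_db  = fspl_db(slant_m, freq_hz)          # ~139 dB
gain_dbi = aperture_gain_dbi(750.0, freq_hz)  # ~44 dBi at 60% aperture efficiency

# Assumed (not from the article): 20 dB building penetration + clutter,
# 3 dB polarization/pointing loss, 2 dB receiver noise figure, 360 kHz uplink allocation.
tx_dbm     = 23.0
extra_loss = 20.0 + 3.0
rx_dbm     = tx_dbm - path_db - extra_loss + gain_dbi
noise_dbm  = -174.0 + 2.0 + 10.0 * math.log10(360e3)  # thermal floor + NF + bandwidth

print(f"FSPL {path_db:.1f} dB, satellite array gain {gain_dbi:.1f} dBi")
print(f"Received {rx_dbm:.1f} dBm vs noise {noise_dbm:.1f} dBm -> SNR {rx_dbm - noise_dbm:.1f} dB")
# Fading margin, interference, Doppler, and scan loss still have to come out of this SNR,
# which is why closing an indoor uplink at scale remains the hard part.
```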

Thermal constraints are also vastly more complex than anything currently deployed. Supporting 5,000 simultaneous beams and radiating tens of kilowatts from compact platforms in LEO requires heat rejection systems that go beyond current radiator technology. Passive radiators must be supplemented with phase-change materials, active fluid loops, and zoned thermal isolation to prevent transmit arrays from degrading the performance of sensitive uplink receivers. This represents a significant leap from today’s satellites, such as Starlink V2 Mini (~3 kW) or BlueBird 1 (~10–12 kW), neither of which operates with a comparable beam count, throughput, or antenna scale.
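
To give a feel for the radiator problem, here is a rough Stefan–Boltzmann sizing sketch. The radiator temperature, emissivity, and the assumption that essentially all of the electrical power ends up as waste heat at the radiator are my own simplifications for illustration.

```python
SIGMA = 5.670e-8  # Stefan-Boltzmann constant [W/(m^2*K^4)]

def ideal_radiator_area_m2(power_w: float, temp_k: float = 320.0, emissivity: float = 0.9) -> float:
    """Radiator area needed to reject power_w to deep space, ignoring Earth IR/albedo loading."""
    return power_w / (emissivity * SIGMA * temp_k**4)

for label, power_kw in (("Starlink V2 Mini-class", 3), ("BlueBird 1-class", 12), ("2045 D2C node", 30)):
    area = ideal_radiator_area_m2(power_kw * 1e3)
    print(f"{label:22s} ~{power_kw:2d} kW waste heat -> ~{area:.0f} m^2 of ideal radiator")
```

Even under these generous idealizations, the 2045-class platform needs several tens of square meters of radiator surface, which is why passive panels alone are unlikely to suffice.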

The required onboard compute is another monumental leap. Running thousands of simultaneous digital beams, performing real-time adaptive beamforming, spectrum assignment, HARQ scheduling, and AI-driven interference mitigation, all on-orbit and without ground-side offloading, demands 100–500 TOPS of radiation-hardened compute. This is far beyond anything flying in 2025. Even state-of-the-art military systems rely heavily on ground computing and centralized control. The 2045 vision implies on-orbit autonomy, local decision-making, and embedded 5G/6G core functionality within each spacecraft, a full software-defined network node in orbit. Realizing such a capability requires not only next-gen processors but also significant progress in space-grade AI inference, thermal packaging, and fault tolerance.
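
The compute claim is easier to appreciate on a per-beam basis. The split between beam processing and everything else (core functions, AI inference, housekeeping) is an assumption of mine purely for illustration.

```python
beams = 5_000
for total_tops in (100, 500):
    # Assume ~20% of the budget is reserved for core functions, AI inference, and housekeeping.
    per_beam_gops = total_tops * 1_000 * 0.8 / beams
    print(f"{total_tops:3d} TOPS on board -> ~{per_beam_gops:.0f} GOPS per beam for PHY and beamforming")
```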

On the power front, generating 25–35 kW per satellite in LEO using 60–80 m² solar sails pushes the boundary of photovoltaic technology and array mechanics. High-efficiency solar cells must achieve conversion rates exceeding 30–35%, while battery systems must maintain high discharge capacity even in complete darkness. Space-based power architectures today are not yet built for this level of sustained output and thermal dissipation.
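
A simple photovoltaic sizing sketch illustrates why 30–35% cells are needed for this power class. The packing factor, eclipse duration, and the sustained-load figure used for the battery estimate are illustrative assumptions of mine.

```python
SOLAR_CONSTANT = 1361.0  # mean solar irradiance at 1 AU [W/m^2]

def array_power_kw(area_m2: float, cell_efficiency: float, packing_factor: float = 0.85) -> float:
    """Sun-pointed flat-array ballpark; degradation, temperature, and pointing losses ignored."""
    return SOLAR_CONSTANT * area_m2 * cell_efficiency * packing_factor / 1e3

for area_m2 in (60, 80):
    for eff in (0.30, 0.35):
        print(f"{area_m2} m^2 at {eff:.0%} cells -> ~{array_power_kw(area_m2, eff):.1f} kW")

# Eclipse storage: at 350 km the orbit is ~91 minutes with up to ~36 minutes of eclipse.
sustained_load_kw, eclipse_h = 30.0, 36.0 / 60.0
print(f"Sustaining {sustained_load_kw:.0f} kW through eclipse needs "
      f"~{sustained_load_kw * eclipse_h:.0f} kWh of usable battery capacity")
```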

Even if the individual satellite challenges are solved, the constellation architecture presents another towering hurdle. Achieving seamless beam handover, full spatial reuse, and maintaining beam density over demand centers as the Earth rotates demands near-perfect coordination of tens of thousands of satellites across hundreds of planes. No current LEO operator (including SpaceX) manages a constellation of that complexity, beam concurrency, or spatial density. Furthermore, scaling the manufacturing, testing, launch, and in-orbit commissioning of over 20,000 high-performance satellites will require significant cost reductions, increased factory throughput, and new levels of autonomous deployment.
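
To get a rough sense of the constellation geometry, the sketch below estimates how many of the 20,800 satellites would be above a given elevation mask at any moment, assuming a single, uniformly populated shell at 350 km. Real Walker-type shells concentrate coverage by latitude, and the elevation masks are my own illustrative choices.

```python
import math

RE_KM, ALT_KM, N_SATS = 6371.0, 350.0, 20_800

def mean_sats_in_view(min_elev_deg: float) -> float:
    """Average number of satellites above an elevation mask for a uniformly
    populated spherical shell (a simplification of a real Walker constellation)."""
    elev = math.radians(min_elev_deg)
    # Earth-central angle from the observer to the edge of visibility.
    lam = math.acos(RE_KM / (RE_KM + ALT_KM) * math.cos(elev)) - elev
    cap_fraction = (1.0 - math.cos(lam)) / 2.0  # fraction of the shell that is visible
    return N_SATS * cap_fraction

for elev_deg in (10, 25, 40):
    print(f"elevation mask {elev_deg:2d} deg -> ~{mean_sats_in_view(elev_deg):.0f} satellites in view on average")
```

The point is not the exact numbers but the coordination burden: even with tens of thousands of satellites, only a few dozen are usefully overhead at the elevation angles that matter for indoor links, and those few dozen change every couple of minutes.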

Regulatory approval and spectrum allocation are equally formidable barriers. The vision entails the massively complex undertaking of a global reallocation of terrestrial mobile spectrum, particularly in the sub-3 GHz bands, to LEO operators. As of 2025, such a reallocation is politically and commercially fraught, with entrenched mobile operators and national regulators unlikely to cede prime bands without extensive negotiation, incentives, and global coordination. The use of 600–1800 MHz from orbit for direct-to-device is not yet globally harmonized (and may never be), and existing terrestrial rights would need to be either vacated or managed via complex sharing schemes.

From a market perspective, widespread device compatibility without modification implies that standard mobile chipsets, RF chains, and antennas must evolve to handle Doppler compensation, extended RTT timing budgets, and tighter synchronization tolerances. While this is not insurmountable, it requires updates to 3GPP standards, baseband silicon, and potentially network registration logic, all of which must be implemented without degrading terrestrial service. Although NTN (non-terrestrial network) support has begun to emerge in 5G standards, the level of transparency and ubiquity envisioned for 2045 is not yet backed by practical deployments.

While the 2045 architecture described so far assumes a single unified constellation delivering seamless global cellular service from orbit, the political and commercial realities of space infrastructure in 2025 strongly suggest a fragmented outcome. It is unlikely that a single actor, public or private, will be permitted, let alone able, to monopolize the global D2C landscape. Instead, the most plausible trajectory is a competitive and geopolitically segmented orbital environment, with at least one major constellation originating from China (note: I think it is quite likely we will see two major ones), another from the United States, a possible second US-based entrant, and potentially a European-led system aimed at securing sovereign connectivity across the continent.

This fracturing of the orbital mobile landscape imposes a profound constraint on the economic and technical scalability of the system. The assumption that a single constellation could achieve massive economies of scale, producing, launching, and managing tens of thousands of high-performance satellites with uniform coverage obligations, begins to collapse under the weight of geopolitical segmentation. Each competitor must now shoulder its own development, manufacturing, and deployment costs, with limited ability to amortize those investments over a unified global user base. Moreover, such duplication of infrastructure risks saturating orbital slots and spectrum allocations, while reducing the density advantage that a unified system would otherwise enjoy. Instead of concentrating thousands of active beams over a demand zone with a single coordinated fleet, separate constellations must compete for orbital visibility and spectral access over the same urban centers. The result is likely to be a decline in per-satellite utilization efficiency, particularly in regions of geopolitical overlap or contested regulatory coordination.

2045: One Vision, Many Launch Pads. The dream of global satellite-to-cellular service may shine bright, but it won’t rise from a single constellation. With China, the U.S., and others racing skyward, the economics of universal LEO coverage could fracture into geopolitical silos, making scale, spectrum, and sustainability more contested than ever.

Finally, the commercial viability of any one constellation diminishes when the global scale is eroded. While a monopoly or globally dominant operator could achieve lower per-unit satellite costs, higher average utilization, and broader roaming revenues, a fractured environment reduces ARPU (average revenue per user) and raises the breakeven threshold for each deployment. Satellite throughput that could have been centrally optimized now risks duplication and redundancy, increasing operational overhead and potentially slowing innovation as vendors attempt to differentiate on proprietary terms. In this light, the architecture described earlier must be seen as an idealized vision, a convergence point that may never be achieved in pure form unless global policy, spectrum governance, and commercial alliances move toward more integrated outcomes. While the technological challenges of the 2045 D2C system are significant, the fragmentation of market structure and geopolitical alignment may prove an equally formidable barrier to realizing the full systemic potential.

Heavenly Coverage, Hellish Congestion. Even a single mega-constellation turns the sky into premium orbital real estate … and that’s before the neighbors show up with their own fleets. Welcome to the era of broadband traffic … in space.

Despite these barriers, incremental paths forward exist. Demonstration satellites in the late 2020s, followed by regional commercial deployments in the early 2030s, could provide real-world validation. The phased evolution of spectrum use, dual-use handsets, and AI-assisted beam management may mitigate some of the scaling concerns. Regulatory alignment may emerge as rural and unserved regions increasingly depend on space-based access. Ultimately, the achievement of the 2045 architecture relies not only on engineering but also on sustained cross-industry coordination, geopolitical alignment, and commercial viability on a planetary scale. As of 2025, the probability of realizing the complete vision by 2045, in terms of indoor-grade, direct-to-device service via a fully orbital mobile core, is perhaps 40–50%, with a higher probability (~70%) for achieving outdoor-grade or partially integrated hybrid services. The coming decade will reveal whether the industry can fully solve the unique combination of thermal, RF, computational, regulatory, and manufacturing challenges required to replace the terrestrial mobile network with orbital infrastructure.

POSTSCRIPT – THE ECONOMICS.

The Direct-to-Cellular satellite architecture described in this article would reshape not only the technical landscape of mobile communications but also its economic foundation. The very premise of delivering mobile broadband directly from space, bypassing terrestrial towers, fiber backhaul, and urban permitting, undermines one of the most entrenched capital systems of the 20th and early 21st centuries: the mobile infrastructure economy. Once considered irreplaceable, the sprawling ecosystem of rooftop leases, steel towers, field operations, base stations, and fiber rings would gradually be rendered obsolete by a network that floats above geography.

The financial implications of such a shift are enormous. Before the orbital transition described in this article, the global mobile industry invested well over 300 billion USD annually in network CapEx and OpEx, with a large share dedicated to the site infrastructure layer: construction, leasing, energy, security, and upkeep of millions of base stations and their associated land or rooftop assets. Tower companies alone have become multi-billion-dollar REITs (i.e., Real Estate Investment Trusts), profiting from site tenancy and long-term operating contracts. As of the mid-2020s, the global value tied up in the telecom industry’s physical infrastructure is estimated to exceed 2.5 to 3 trillion USD, with tower companies like Cellnex and American Tower collectively managing hundreds of billions of dollars in infrastructure assets. An estimated 300–500 billion USD of pension capital invested in mobile infrastructure represents approximately 0.75% to 1.5% of total global pension assets and accounts for 15% to 30% of pension fund infrastructure investments. This real estate-based infrastructure model defined mobile economics for decades and has generally been regarded as a reasonably safe haven for investors.

In contrast, the 2045 D2C model front-loads its capital burden into satellite manufacturing, launch, and orbital operations. Rather than being geographically bound, capital is concentrated into a fleet of orbital base stations, each capable of dynamically serving users across vast and shifting geographies. This not only eliminates the need for millions of distributed cell sites, but also breaks the historical tie between infrastructure deployment and national geography. Coverage no longer scales with trenching crews or urban permitting delays but with orbital plane density and beamforming algorithms.

Yet, such a shift does not necessarily mean lower cost, only different economics. Launching and operating tens of thousands of advanced satellites, each capable of supporting thousands of beams and running onboard compute environments, still requires massive capital outlay and ongoing expenditures in space traffic management, spectrum coordination, ground gateways, and constellation replenishment. The difference lies in utilization and marginal reach. Where terrestrial infrastructure often struggles to achieve ROI in rural or low-income markets, orbital systems serve these zones as part of the same beam budget, with no new towers or trenches required.

Importantly, the 2045 model would likely collapse the mobile value chain. Instead of a multi-layered system of operators, tower owners, fiber wholesalers, and regional contractors, a vertically integrated satellite operator could deliver the full stack of mobile service from orbit, owning the user relationship end-to-end. This disintermediation has significant implications for revenue distribution and regulatory control, and challenges legacy operators to either adapt or exit.

The scale of economic disruption mirrors the scale of technical ambition. This transformation could rewrite the very economics of connectivity. While the promise of seamless global coverage, zero tower density, and instant-on mobility is compelling, it may also signal the end of mobile telecom as a land-based utility.

If this little science fiction story comes true, and there are many good and bad reasons to doubt it, Telcos may not Ascend to the Sky, but take the Stairway to Heaven.

Graveyard of the Tower Titans. This symbolic illustration captures the end of an era, depicting headstones for legacy telecom giants such as American Tower, Crown Castle, and SBA Communications, as well as the broader REIT (Real Estate Investment Trust) infrastructure model that once underpinned the terrestrial mobile network economy. It serves as a metaphor for the systemic shift brought on by Direct-to-Cellular (D2C) satellite networks. What’s fading is not only the mobile tower itself, but also the vast ancillary industry that has grown around it, including power systems, access rights, fiber infrastructure, maintenance firms, and leasing intermediaries, as well as the telecom business model that relied on physical, ground-based infrastructure. As the skies take over the signal path, the economic pillars of the old telecom world may no longer stand.

FURTHER READING.

Kim K. Larsen, “Will LEO Satellite Direct-to-Cellular Networks Make Traditional Mobile Networks Obsolete?” A John Strand Consult Report (January 2025). This has also been published in full on my own Techneconomyblog.

Kim K. Larsen, “Can LEO Satellites close the Gigabit Gap of Europe’s Unconnectables?” Techneconomyblog (April 2025).

Kim K. Larsen, “The Next Frontier: LEO Satellites for Internet Services.” Techneconomyblog (March 2024).

Kim K. Larsen, “Stratospheric Drones & Low Earth Satellites: Revolutionizing Terrestrial Rural Broadband from the Skies?” Techneconomyblog (January 2024).

Kim K. Larsen, “A Single Network Future.” Techneconomyblog (March 2024).

ACKNOWLEDGEMENT.

I would like to acknowledge my wife, Eva Varadi, for her unwavering support, patience, and understanding throughout the creative process of writing this article.