Generative AI Is Expensive, and Often Solving the Wrong Layer of the Problem


Generative AI has rapidly become the default interface for asking questions of data. Its fluency and flexibility make it attractive, particularly in environments where questions are ill-formed or evolving. However, beneath this convenience lies a material cost: generative models are computationally intensive, energy-hungry, and operationally expensive. At scale, they represent a fundamentally inefficient way to answer many of the questions organizations are actually asking.

In a significant number of practical engineering and analytics scenarios, the desired outcome of a generative interaction is not novel reasoning or creative synthesis. It is a concrete artifact: a SQL query, a filter, a classification, a correlation, a threshold, or a decision rule applied to data the system already possesses. The generative model often acts merely as a translator, converting human language into a deterministic operation that could have been executed directly, repeatedly, and at negligible marginal cost.
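To make the pattern concrete, here is a minimal sketch: a single generative interaction yields a SQL query as an artifact, which is then stored and executed directly on every subsequent request, with no model on the hot path. The table, column names, and compliance rule are hypothetical, chosen purely for illustration.

```python
import sqlite3

# Artifact from a one-time generative session: a plain SQL query.
# (Schema and the "last_patched before cutoff" rule are hypothetical.)
COMPLIANCE_QUERY = """
SELECT asset_id FROM assets
WHERE last_patched < :cutoff
"""

def assets_out_of_compliance(conn, cutoff):
    """Execute the saved query deterministically -- no model call needed."""
    return [row[0] for row in conn.execute(COMPLIANCE_QUERY, {"cutoff": cutoff})]

# Demo against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (asset_id TEXT, last_patched TEXT)")
conn.executemany(
    "INSERT INTO assets VALUES (?, ?)",
    [("a1", "2023-01-01"), ("a2", "2024-06-01")],
)
print(assets_out_of_compliance(conn, "2024-01-01"))  # ['a1']
```

The query string is the operationalized knowledge: it can be reviewed, version-controlled, and run millions of times at database speed.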

This pattern reveals a mismatch between tool and task. Generative AI excels at exploration, abstraction, and synthesis under uncertainty. It is far less appropriate as a perpetual runtime dependency for answering stable, repetitive, or well-bounded questions. Using a large language model to continuously re-derive the same logic is analogous to running a compiler every time a program executes, rather than compiling once and deploying an optimized binary.

The Enduring Relevance of Classical Machine Learning and Systems

Long before generative models, engineers built systems designed to answer questions efficiently and reliably. Expert systems encoded domain knowledge as rules and decision trees. Bayesian classifiers modeled uncertainty explicitly and updated beliefs as new evidence arrived. Finite state automata captured protocol behavior, workflows, and lifecycle transitions with mathematical precision. Neural networks, long predating today’s foundation models, were trained for narrowly scoped inference tasks with predictable performance and cost profiles.
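As a reminder of how compact these classical techniques are, here is a minimal finite state automaton for a lifecycle of the kind described above. The states, events, and transitions are illustrative, not drawn from any particular system.

```python
# Transition table for a hypothetical asset lifecycle: a mapping from
# (current state, event) to the next state. Anything not listed is invalid.
TRANSITIONS = {
    ("provisioned", "deploy"): "active",
    ("active", "flag"): "quarantined",
    ("quarantined", "remediate"): "active",
    ("active", "retire"): "decommissioned",
}

def advance(state, event):
    """Apply one event; undefined transitions fail loudly and immediately."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"invalid transition: {state} --{event}-->")

state = "provisioned"
for event in ("deploy", "flag", "remediate"):
    state = advance(state, event)
print(state)  # active
```

The entire behavior is enumerable and verifiable by inspection, which is precisely the property a perpetual generative loop gives up.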

These techniques are not obsolete. On the contrary, they are often better aligned with the operational realities of production systems. They are interpretable, testable, bounded in resource consumption, and amenable to formal verification. Most importantly, they are designed to answer the same class of question repeatedly without re-computation of the underlying logic.

When an organization asks, “Which assets are drifting out of compliance?”, “Which users exhibit anomalous behavior?”, or “Which records meet this evolving set of criteria?”, the answer rarely requires open-ended language generation. It requires a model of the domain, a representation of state, and a deterministic or probabilistic method of inference applied to known data.
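The "anomalous behavior" question, for instance, is classically answered with explicit probabilistic inference rather than language generation. The sketch below applies Bayes' rule repeatedly; the base rate and likelihoods are invented numbers for illustration only.

```python
def bayes_update(prior, likelihood_if_anomalous, likelihood_if_normal):
    """One step of Bayes' rule: posterior P(anomalous | evidence)."""
    num = prior * likelihood_if_anomalous
    den = num + (1 - prior) * likelihood_if_normal
    return num / den

# Hypothetical figures: a 1% base rate of anomalous users, and an observed
# signal 20x more likely for anomalous users (0.6) than normal ones (0.03).
p = 0.01
for _ in range(3):  # three independent observations of the signal
    p = bayes_update(p, 0.6, 0.03)
print(round(p, 3))  # 0.988
```

Each update costs a few arithmetic operations, the belief is auditable at every step, and the model of uncertainty is explicit rather than implicit in model weights.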

Generative AI as a Design Tool, Not a Runtime Crutch

This suggests a more sustainable role for generative AI in engineering contexts: not as the system that answers questions indefinitely, but as the system that helps design the machinery that does.

Generative models are exceptionally well suited to assisting engineers in discovering patterns, proposing feature transformations, sketching decision logic, and even generating initial implementations of queries, classifiers, or state machines. Used this way, generative AI functions as a high-leverage design assistant, accelerating human understanding and system construction, rather than as an always-on inference engine.

In this model, a generative interaction produces an artifact: a SQL query, a scoring function, a rule set, or a lightweight model that is then implemented, reviewed, tested, and deployed as a standalone system. Once deployed, that system answers the question at machine speed, with predictable cost and behavior. The generative model is invoked again only when the question itself changes.
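One such artifact might be a weighted rule set: proposed in a design-time generative session, then reviewed and deployed as plain data plus a trivial evaluator. Every field name, predicate, and weight below is hypothetical.

```python
# Design-time artifact: a reviewable rule set (field, predicate, weight).
# The specific rules and weights here are invented for illustration.
RULES = [
    ("failed_logins", lambda v: v > 5, 3.0),
    ("countries_24h", lambda v: v >= 3, 2.0),
    ("mfa_enabled", lambda v: not v, 1.5),
]

def risk_score(user):
    """Sum the weights of every rule the user record triggers."""
    return sum(w for field, pred, w in RULES if pred(user.get(field)))

user = {"failed_logins": 9, "countries_24h": 1, "mfa_enabled": False}
print(risk_score(user))  # 4.5
```

Once this is in production, "which users look risky?" is answered at machine speed; the generative model is consulted again only when the rules themselves need rethinking.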

This approach mirrors established engineering discipline. We do not continuously simulate architectures when serving production traffic. Instead, we design them, validate them, and then operate them efficiently. Treating generative AI as a design-time capability rather than a perpetual runtime dependency restores this discipline.

Cost, Sustainability, and Architectural Responsibility

The economic and environmental implications of indiscriminate generative AI usage cannot be ignored. Large models consume vast amounts of compute, memory, and energy, often to answer questions whose structure is already well understood. At scale, this becomes not just expensive but irresponsible, particularly when simpler, more targeted solutions exist.

Architecturally, systems should strive to move intelligence as close as possible to the data and execution layer. If the final action is a query, the system should learn the query. If the final action is a classification, the system should learn the classifier. Persisting in a loop where a generative model is repeatedly asked to “think” its way to the same answer is a failure to operationalize knowledge.
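"Learning the classifier" can be remarkably small. Below is a sketch of a nearest-centroid classifier: fit once at design time from labeled examples, then queried at negligible cost. The two-dimensional feature vectors and labels are fabricated for illustration.

```python
from statistics import mean

def fit(samples):
    """samples: list of (feature_tuple, label). Returns per-label centroids."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(mean(col) for col in zip(*xs)) for y, xs in by_label.items()}

def predict(centroids, x):
    """Assign x to the label whose centroid is nearest (squared distance)."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda y: dist(centroids[y], x))

# Illustrative training data: two clusters of user-behavior features.
model = fit([((0, 0), "normal"), ((1, 1), "normal"),
             ((8, 9), "anomalous"), ((9, 8), "anomalous")])
print(predict(model, (7, 8)))  # anomalous
```

The intelligence lives next to the data as a tiny lookup over centroids, not in a model that must re-derive the boundary on every question.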

Toward Intentional Use of Generative AI

None of this diminishes the transformative potential of generative AI. Rather, it reframes its role. Generative models are most valuable when they help us reason about problems, not when they substitute for systems we already know how to build.

For engineering organizations, the challenge is not to eliminate generative AI from their stacks, but to use it intentionally: as a catalyst for better system design, not as an expensive oracle answering the same mundane questions over and over. In many cases, the most powerful outcome of a generative interaction is not the answer itself, but the realization that what we needed all along was a smarter query, a better model, or a clearer representation of the problem, and the discipline to implement it.
