The building that doesn't exist yet

The cheapest moment to change a building is before it exists. The most expensive is after it's built. Every architectural process of the last 2,000 years has been an argument about how to compress that gap, to let people judge a building before they commit to it: drawings, models, renderings, BIM, walkthroughs, mock-ups.

Each compression costs something. Hand drawings were fast to produce but required imagination to read. Physical models were communicative but expensive to change. Photo-realistic renders produced beautiful stills that were useless for spatial decision-making. BIM walkthroughs required specialist hardware and a tolerance for early-2000s graphics.

Conventional tools have run out of room to compress the gap any further. The next move is spatial AI: the point at which an unbuilt building can be seen, walked through, and dimensionally tested before a single piece of it is procured.

This essay is about which parts of that promise are real today, which parts are still demoware, and what the gap means for how buildings get designed and sold in the next five years.

What's actually shipping

Photogrammetry

Mature, commodity, and mostly used in surveying rather than design. You photograph an existing site from many angles, the software reconstructs a dense point cloud, and you have a dimensionally accurate spatial record of what exists. Matterport industrialised a closely related capture workflow for real estate, pairing imagery with depth sensing. Construction firms use photogrammetry for progress documentation. The technology is solid; the design applications are underexplored.

The limitation is directionality: photogrammetry is good at capturing what is, not at enabling you to see what will be.

NeRFs

Neural Radiance Fields (NeRFs) represented a genuine leap in reconstruction quality. Rather than producing a point cloud, a NeRF learns an implicit 3D representation of a scene from a set of 2D photographs: a neural network that maps a position and viewing direction to colour and density. The rendered output is striking: continuous, with view-dependent lighting, and substantially more realistic than a mesh reconstruction.

The practical constraint is compute cost. Training a high-quality NeRF on a meaningful scene can still take hours of GPU time, although accelerated variants have cut that to minutes for smaller scenes. Rendering is interactive at low resolution but slow at production quality. This is a technology in the "works in the lab, expensive in production" phase, though compute costs are falling.

Gaussian splatting

The breakout of 2024–2025, built on a technique first published in 2023. Where NeRFs represent a scene as a continuous volumetric function, Gaussian splatting represents it as a set of small, overlapping 3D Gaussians (essentially coloured, semi-transparent ellipsoidal blobs) trained to reproduce the appearance of the scene from any viewpoint.

The result is training in minutes rather than hours and real-time rendering at quality that matches or beats NeRF for most scene types. You can now train a Gaussian splat of a room on a laptop with a capable GPU in minutes, ship it into a game engine or a web viewer, and have someone walk through it at 60fps on consumer hardware.

This is the technology that makes photo-realistic spatial capture practical outside a research lab. It will be in the standard architecture, engineering and construction (AEC) workflow within three years, in the same way that LiDAR scanning is now.
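
To make the representation concrete, here is a minimal Python sketch of how a single pixel might be shaded from a handful of splats. It is an illustration under simplifying assumptions, not a production renderer: the `Splat` class, its fields, and `render_pixel` are hypothetical names, each Gaussian is treated as isotropic and already projected into screen space, and real implementations use anisotropic covariances, view-dependent colour, and GPU rasterisation.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Splat:
    """Hypothetical, simplified splat: an isotropic Gaussian in screen space."""
    x: float          # projected centre, pixels
    y: float
    depth: float      # distance from the camera, used only for sorting
    sigma: float      # screen-space standard deviation, pixels
    opacity: float    # peak opacity in [0, 1]
    colour: Tuple[float, float, float]  # RGB, each in [0, 1]


def splat_alpha(s: Splat, px: float, py: float) -> float:
    """Opacity the splat contributes at pixel (px, py): Gaussian falloff from its centre."""
    d2 = (px - s.x) ** 2 + (py - s.y) ** 2
    return s.opacity * math.exp(-0.5 * d2 / (s.sigma ** 2))


def render_pixel(splats: List[Splat], px: float, py: float) -> Tuple[float, float, float]:
    """Blend splats front to back; each one is dimmed by what already covers it."""
    colour = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for s in sorted(splats, key=lambda s: s.depth):
        a = splat_alpha(s, px, py)
        for i in range(3):
            colour[i] += transmittance * a * s.colour[i]
        transmittance *= 1.0 - a
    return (colour[0], colour[1], colour[2])


# A half-transparent red splat in front of an opaque blue one.
front = Splat(0.0, 0.0, 1.0, 2.0, 0.5, (1.0, 0.0, 0.0))
back = Splat(0.0, 0.0, 2.0, 2.0, 1.0, (0.0, 0.0, 1.0))
print(render_pixel([back, front], 0.0, 0.0))
```

At the shared centre the two splats blend to an even mix of red and blue. That front-to-back blending is what lets millions of overlapping blobs read as continuous surfaces, and because it is simple arithmetic per splat it maps well onto consumer GPUs.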

AI-driven 3D model generation

Text-to-3D is mostly still toy-grade. The geometry quality of current text-to-mesh models is not suitable for architectural use. Objects tend toward blobby approximations of their textual description — recognisable but dimensionally unreliable.

Image-to-3D is the working frontier. Given a photograph of an object from one or more angles, current models can reconstruct a mesh of useful quality, particularly for furniture-scale objects. This is the technology Imersian uses to convert a furniture retailer's 2D product catalogue into a deployable spatial asset library. For furniture, it works well enough to ship. For buildings, it is not there yet: the geometry is too complex and the dimensional accuracy requirements are too high.

The trajectory is clear. Image-to-3D will reach architectural quality. The question is timeline, not direction.

What's still demo

One-prompt building generation. "Design me a three-storey mixed-use building for this site" remains a demo. The multimodal models can produce images that look like architectural proposals. They cannot produce geometry that a structural engineer can assess, a contractor can price, or a building control officer can approve. The gap between appearance and data is the entire gap.

High-fidelity AR pre-walkthroughs on consumer hardware. Seeing a designed building at full scale in situ before it is built is the most valuable single thing spatial AI could deliver to architectural practice. A client standing on a cleared site, looking through their phone or glasses at the building that will be there — not a 2D render, not a flythrough, but an in-context, dimensionally accurate augmented view — would compress the decision cycle more than anything else in the design process.

This is not shipping. ARKit and ARCore can place objects in space. They cannot anchor a 40-metre-tall building to a site with the dimensional accuracy that makes spatial decision-making meaningful. Spatial anchoring at that scale, stable across sessions and users, is an unsolved engineering problem.
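
A rough back-of-envelope calculation shows why scale is the problem. The sketch below is illustrative only: the function name and the half-degree error are assumptions chosen for the arithmetic, not measured figures for ARKit or ARCore. It estimates how far a point on a virtual object drifts from its true position when the device's estimated orientation is off by a small angle.

```python
import math


def drift_from_angular_error(distance_m: float, error_deg: float) -> float:
    """Approximate displacement (metres) of a point `distance_m` from the
    anchor when the estimated orientation is wrong by `error_deg` degrees."""
    return distance_m * math.tan(math.radians(error_deg))


# Hypothetical example values, chosen only to show how error scales with size.
for label, distance_m in [("1 m chair", 1.0), ("40 m roofline", 40.0)]:
    drift = drift_from_angular_error(distance_m, error_deg=0.5)
    print(f"{label}: {drift * 100:.1f} cm of drift at 0.5 degrees of error")
```

Under these assumed values, the same half-degree error that moves a chair by under a centimetre moves a 40-metre roofline by roughly 35 centimetres, which is enough to undermine any judgment about sightlines, setbacks, or overshadowing.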

Real-time generative variation in a live client meeting. The dream is: client says "can we see that roof form but softer?" and you change it while they watch. Parametric tools allow this within a pre-built model. Generative AI cannot yet do it with the fidelity and speed that a live meeting requires. The bottleneck is latency and geometric coherence under edit — you need something that changes consistently with the prompt, not something that generates a new building from scratch.

Multi-modal building models. True simulation of a building integrates visual, thermal, acoustic, structural, and code-compliance data into a single navigable model. This does not exist as a unified product. It exists as a suite of disconnected specialist tools that talk to each other badly, if at all. Spatial AI has not yet attacked this integration problem in a commercially meaningful way.

What this changes about pre-construction

The parts that are shipping now are already changing the economics of the pre-construction phase — slowly, and mostly in the expensive residential and high-end commercial sectors, but the direction is visible.

Client decisions shift from approve/reject to iterate. When a client can walk through a photorealistic spatial representation of a design and understand it intuitively, the approval dynamic changes. Instead of reviewing a render and saying yes or no, they begin to direct. "Move the kitchen island back half a metre" becomes a live design conversation rather than a change order three weeks later. This is better for the project and worse for firms that monetise every iteration as a billable revision.

Contractor pricing happens earlier because the visual is dimensionally honest sooner. A contractor pricing from a spatially accurate representation makes fewer assumptions than one pricing from 2D drawings. Contingency drops. The schedule shortens. The risk allocation between design team and builder changes. This is not a design benefit; it is a project delivery benefit, and it is the lever most likely to drive procurement pressure toward practices that can produce it.

The fee structure inside firms re-prices toward the up-front phase. If the value of spatial AI is front-loaded into pre-design and design development — the phases where the compressed decision cycle matters — then the fee curve should shift to match. Practices that start charging significant pre-design retainers for spatial visualisation capability will capture the efficiency; practices that treat visualisation as a loss-leader to win the downstream work will continue to subsidise the client's decision-making at their own expense.

Junior staff shift from drafting to editing. The model is not produced manually; it is produced by AI and edited by humans. The skill set this requires is different from the one architectural education has been producing: less manual drafting fluency, more visual judgment, more ability to evaluate AI output against design intent, more comfort directing tools rather than operating them.

Why furniture is the canary

Imersian is one data point on this, so I should be precise about what I think it proves and what I do not think it proves.

The furniture case has a specific set of properties that made it tractable before the building case:

The object is smaller. A sofa is bounded. Its reconstruction problem is finite. A building is not.

The purchase cycle is shorter. A furniture buying decision can be made and executed in a single session. The visualisation needs to close a sale in minutes, not inform a planning application over years. The feedback loop is fast enough to iterate on.

The asset pipeline is the problem, not the rendering. What we learned building Imersian is that the hard work is not producing a beautiful render. It is getting 30,000 SKUs from a retailer's legacy catalogue into a state where they can be spatially rendered with dimensional accuracy. The model is not the moat. The ingestion pipeline is the moat. This will be equally true for the building case: whoever solves the "get this building into spatial AI" pipeline problem owns the workflow.

Willingness to pay is high per visualisation. A retailer will pay to convert their catalogue because each visualisation is proximate to a transaction. The value is measurable and immediate.
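
The ingestion point above can be made concrete with a small sketch. One plausible gate in such a pipeline (an assumption for illustration, not a description of Imersian's actual system) is to compare each reconstructed mesh's bounding box against the dimensions listed in the retailer's catalogue and reject assets that drift beyond a tolerance. The function names, the axis-aligned assumption, and the 1 cm default tolerance are all hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def bounding_box_size(vertices: Sequence[Vec3]) -> Vec3:
    """Width, depth, height of the axis-aligned box around a non-empty mesh."""
    xs, ys, zs = zip(*vertices)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


def dimension_errors(vertices: Sequence[Vec3], catalogue_dims_m: Vec3) -> Vec3:
    """Absolute per-axis difference between the mesh and the catalogue, in metres."""
    size = bounding_box_size(vertices)
    return tuple(abs(m - c) for m, c in zip(size, catalogue_dims_m))


def passes_dimension_check(
    vertices: Sequence[Vec3],
    catalogue_dims_m: Vec3,
    tolerance_m: float = 0.01,  # assumed tolerance; a real pipeline would tune this
) -> bool:
    """True if every axis of the mesh is within tolerance of the catalogue entry."""
    return all(e <= tolerance_m for e in dimension_errors(vertices, catalogue_dims_m))


# Hypothetical sofa: catalogue says 2.0 m x 0.9 m x 0.8 m.
catalogue = (2.0, 0.9, 0.8)
good_mesh = [(0.0, 0.0, 0.0), (2.0, 0.9, 0.8)]
stretched_mesh = [(0.0, 0.0, 0.0), (2.05, 0.9, 0.8)]
print(passes_dimension_check(good_mesh, catalogue))
print(passes_dimension_check(stretched_mesh, catalogue))
```

The check itself is trivial. The hard part, in this reading, is everything around it: getting trustworthy catalogue dimensions in the first place, orienting each mesh consistently, and doing so across tens of thousands of SKUs.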

The building equivalent is bigger and slower: larger objects, longer decisions, more stakeholders, more regulatory friction, less direct transaction proximity. But the direction is the same. Spatial AI will compress the pre-construction gap for buildings. It will take longer than the furniture case and require solving harder problems.

The practices and technology companies that start building toward that compression now — on the asset pipeline, the spatial anchoring, the multi-modal integration — are investing in something that will matter significantly in the next five to ten years.

The practices that wait for the technology to be finished before they engage with it will find it already embedded in their competitors' workflows.

Ven Iyer co-founded Imersian, a spatial commerce platform, and led computational product development at Bollinger Grohmann. He writes about AI in spatial practice at vencolab.com. Get in touch if you are building in this space.