This is a seriously impressive demonstration of Intel Labs' Enhancing Photorealism Enhancement project, which looks to me like a major advance in realistic 3D graphics. Very roughly summarized, it uses a machine learning model trained on real-world footage to re-render 3D graphics so they look photorealistic -- in this case, Grand Theft Auto Online gameplay and video footage of German city streets. (Notably not Los Angeles, even though GTA V is almost a direct mirror image of the LA cityscape.)
Or to put it in more academic terms:
We present an approach to enhancing the realism of synthetic images. The images are enhanced by a convolutional network that leverages intermediate representations produced by conventional rendering pipelines. The network is trained via a novel adversarial objective, which provides strong supervision at multiple perceptual levels. We analyze scene layout distributions in commonly used datasets and find that they differ in important ways. We hypothesize that this is one of the causes of strong artifacts that can be observed in the results of many prior methods. To address this we propose a new strategy for sampling image patches during training. We also introduce multiple architectural improvements in the deep network modules used for photorealism enhancement. We confirm the benefits of our contributions in controlled experiments and report substantial gains in stability and realism in comparison to recent image-to-image translation methods and a variety of other baselines.
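For readers who want a concrete picture of what "leverages intermediate representations produced by conventional rendering pipelines" means in practice, here is a minimal, hypothetical sketch in PyTorch. This is not the authors' code; all names, channel counts, and layer choices are my own assumptions. The idea it illustrates: a network takes a rendered frame plus auxiliary render buffers (G-buffers such as depth, normals, albedo) and predicts a correction that nudges the frame toward the look of real photographs. The real system is much larger and is trained with the adversarial objective described in the abstract.

```python
# Hypothetical sketch (not the authors' code): an enhancement network that
# conditions on auxiliary G-buffers from a conventional rendering pipeline.
import torch
import torch.nn as nn

class GBufferEncoder(nn.Module):
    """Encodes auxiliary buffers (e.g. depth, normals, albedo) into features."""
    def __init__(self, gbuffer_channels: int, features: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(gbuffer_channels, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(features, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, gbuffers: torch.Tensor) -> torch.Tensor:
        return self.net(gbuffers)

class EnhancementNetwork(nn.Module):
    """Takes a rendered frame plus G-buffer features and predicts a residual
    that pushes the frame toward the statistics of real photographs."""
    def __init__(self, gbuffer_channels: int, features: int = 64):
        super().__init__()
        self.gbuffer_encoder = GBufferEncoder(gbuffer_channels, features)
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * features, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(features, 3, kernel_size=3, padding=1),
        )

    def forward(self, frame: torch.Tensor, gbuffers: torch.Tensor) -> torch.Tensor:
        feats = torch.cat(
            [self.image_encoder(frame), self.gbuffer_encoder(gbuffers)], dim=1
        )
        return frame + self.decoder(feats)  # residual enhancement

if __name__ == "__main__":
    # Toy data: one 256x256 rendered frame and 8 channels of G-buffers.
    frame = torch.rand(1, 3, 256, 256)
    gbuffers = torch.rand(1, 8, 256, 256)
    enhanced = EnhancementNetwork(gbuffer_channels=8)(frame, gbuffers)
    print(enhanced.shape)  # torch.Size([1, 3, 256, 256])
```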
Much more here. I wonder how dependent this process is on the reliable, predictable patterns of modern city streets, versus, say, 3D graphics depicting an alien planet. Anyway, I've reached out to project lead Stephan Richter at Intel Labs and hope to hear back. I'm also curious whether the graphics have been made too real, i.e. too detail-rich for the human brain to comfortably process -- or stripped of the subtle artistic choices that make a virtual world like GTA Online so memorable.
Hat tip: /SecondLife, where Redditors wonder if such a process would work to enhance graphics in a virtual world like Second Life. I'd (very recklessly?) speculate not, unless the re-rendering is all done before the content is published to the servers, or possibly if the world is delivered via cloud streaming.
Darn, I'd been hoping they were using it to generate the 3D content. Given the regular pattern of cities, I wonder how hard it would be to semi-automatically generate a 3D city from Street View data? I know Google has 3D models of buildings already, but my goal isn't to get an accurate representation, just a plausible city without too much repetition, particularly in textures.
Posted by: Sean R. Lynch | Thursday, May 27, 2021 at 12:29 PM