AI helps convert photos into 3D models

Article By: Purdue University

Developed by Purdue University Professor Karthik Ramani's team, the SurfNet method can create 3D content for virtual or augmented reality by simply using standard 2D images.

Researchers at Purdue University have developed a new technique that uses machine learning and deep learning methods to create 3D shapes from 2D images.

When fully developed, the method, called SurfNet, could have significant applications in 3D search on the Internet and could help robots and autonomous vehicles better understand their surroundings, according to the team.

Karthik Ramani, Purdue's Donald W. Feddersen Professor of Mechanical Engineering, said the "magical" capability of AI deep learning is that it is able to learn abstractly.

"If you show it hundreds of thousands of shapes of something such as a car, if you then show it a 2D image of a car, it can reconstruct that model in 3D," said Ramani. "It can even take two 2D images and create a 3D shape between the two, which we call 'hallucination.'"

The researcher said SurfNet could be used to create 3D content for virtual reality and augmented reality by simply using standard 2D photos.

"You can imagine a movie camera that is taking pictures in 2D, but in the virtual reality world everything is appearing magically in 3D," he said. "Pretty soon we will be at a stage where humans will not be able to differentiate between reality and virtual reality."

Figure 1: Computers using a new artificial intelligence technique developed at Purdue University can create 3D shapes from 2D images, such as these photographs of airplanes. The technique could help technologies such as virtual reality, augmented reality and robotics. (Source: Purdue University)

The computer system learns from pairs of 3D shapes and 2D images, and can then predict other, similar 3D shapes from just a 2D image, according to Ramani.

"This is very similar to how a camera or scanner uses just three colours, red, green and blue—known as RGB—to create a colour image, except we use the XYZ coordinates," he explained.
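The RGB analogy can be made concrete with a small sketch. The Python below is an illustration only and is not SurfNet's code: it assumes a plain grid of "pixels" whose three channels hold X, Y and Z surface coordinates instead of red, green and blue, and it uses a sphere as a stand-in shape. The names `sphere_geometry_image` and `to_point_cloud` are hypothetical helpers invented for this example.

```python
import math


def sphere_geometry_image(height, width, radius=1.0):
    """Return a height x width grid whose 'pixels' are (x, y, z) surface points.

    Hypothetical helper for illustration; a sphere stands in for a real shape.
    """
    image = []
    for row in range(height):
        # Polar angle runs from 0 (north pole) to pi (south pole).
        theta = math.pi * row / (height - 1)
        pixels = []
        for col in range(width):
            # Azimuth runs from 0 up to, but not including, 2*pi.
            phi = 2.0 * math.pi * col / width
            x = radius * math.sin(theta) * math.cos(phi)
            y = radius * math.sin(theta) * math.sin(phi)
            z = radius * math.cos(theta)
            # Three channels per pixel, like RGB, but holding coordinates.
            pixels.append((x, y, z))
        image.append(pixels)
    return image


def to_point_cloud(image):
    """Flatten the grid back into a list of 3D surface points."""
    return [point for row in image for point in row]


geometry_image = sphere_geometry_image(height=16, width=32)
points = to_point_cloud(geometry_image)
print(len(points), "surface points stored in a 16 x 32 three-channel grid")
```

Because the shape is laid out as an ordinary image-like grid, it can in principle be handled by the same kind of tools that process 2D pictures, which is the point of the comparison.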

Ramani said the technique also allows for greater accuracy and precision than current 3D deep learning methods, which mostly operate on volumetric pixels (or voxels).

"We use the surfaces instead since it fully defines the shape. It's kind of an interesting offshoot of this method. Because we are working in the 2D domain to reconstruct the 3D structure, instead of doing 1,000 data points like you would otherwise with other emerging methods, we can do 10,000 points. We are more efficient and compact."
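A rough way to see why a surface representation can be more compact is to count stored values. The sketch below is not taken from the SurfNet work and does not reproduce the figures Ramani cites; it assumes a cubic voxel grid with one value per cell and a square surface grid with three coordinates per point, both at the same per-axis resolution. The function names are hypothetical.

```python
def voxel_cells(resolution):
    """Cells in a cubic voxel grid: grows with the cube of the resolution."""
    return resolution ** 3


def surface_values(resolution):
    """Values in a square surface grid storing X, Y, Z per point: grows with the square."""
    return resolution * resolution * 3


for resolution in (16, 32, 64, 128):
    print(
        f"resolution {resolution}: "
        f"{voxel_cells(resolution)} voxel cells vs "
        f"{surface_values(resolution)} surface values"
    )
```

Under these assumptions the voxel count grows cubically while the surface count grows quadratically, so the gap widens as resolution increases.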

One significant outcome of the research could be in robotics, object recognition and self-driving cars: in the future, they would only need to be fitted with standard 2D cameras, yet would still be able to understand the 3D environment around them.
