OpenAI has just released Shap-E, a generative model for creating 3D assets based on text prompts and images.

It’s capable of generating both textured meshes¹ and neural radiance fields², allowing for a diverse range of 3D outputs.

In this tutorial, we’ll walk you through setting up Shap-E on Google Colab (free)³ and running the code to generate 3D objects from a text prompt and from an image. Thanks to Google Colab, you don’t need a powerful GPU of your own because we’ll use the one provided by Google.

The code we’re running can be found here (taken from the openai/shap-e GitHub repository):

sample_text_to_3d.ipynb – code to generate a 3D model from text
sample_image_to_3d.ipynb – code to generate a 3D model from an image

Quick Demo
Setting up Shap-E on Google Colab
1. Enable GPU on Google Colab
2. Install Shap-E
Generate 3D Objects from Text with Shap-E
1. Saving the Generated 3D Objects as Meshes
Generate 3D Objects from Images with Shap-E
Conclusion
Resources

Quick Demo

In this short demo we’re installing and running Shap-E on Google Colab.


Setting up Shap-E on Google Colab

Open Google Colab by visiting https://colab.research.google.com/.

Click on File > New notebook to create a new Colab notebook.

Enable GPU on Google Colab

Next, we need to enable a Graphics Processing Unit (GPU) for our notebook. A GPU is often required for resource-intensive tasks like deep learning.

To enable GPU in Google Colab, follow these steps:

Make sure your new Colab notebook is open.
Click on the “Runtime” menu in the top toolbar.
Select “Change runtime type” from the dropdown menu.
In the “Runtime type” dialog, choose “GPU” from the “Hardware accelerator” dropdown.
Click “Save” to apply the changes.
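To confirm that the GPU runtime is actually active, you can run a quick check in a cell. The sketch below is one possible way to do it, assuming the runtime exposes the standard nvidia-smi tool; the helper name gpu_available is ours and is not part of Colab or Shap-E.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if nvidia-smi exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # no NVIDIA driver tools found on this machine
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per GPU, each starting with "GPU".
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU available:", gpu_available())
```

If this prints False, go back to the “Runtime” menu and check the hardware accelerator setting before continuing.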

Install Shap-E

In Google Colab, we need to first clone the Shap-E repository from GitHub and then install the required packages. To do this, follow these steps:

Step 1. In the first cell of your Colab notebook, paste the following code:

!git clone https://github.com/openai/shap-e.git

This command clones the Shap-E repository from GitHub to your Colab environment. It downloads the code, examples, and required files for you to use Shap-E.

Run the cell by clicking on the play button or pressing Shift + Enter.

Step 2. In a new cell, paste the following code:

%cd shap-e

This command changes the current working directory to the shap-e folder, which is where we cloned the Shap-E repository in the previous step. We need to be inside this folder to install the required packages.

Run the cell by clicking on the play button or pressing Shift + Enter.

Step 3. In another new cell, paste the following code:

!pip install -e .

This command installs the necessary packages for Shap-E in your Colab environment. The -e flag installs the package in “editable” mode, which means that any changes made to the package files will be reflected in the installed package without needing to reinstall it.

Run the cell to complete the installation.

Now that the Shap-E repository is cloned and the required packages are installed, you can proceed with generating 3D objects using the code in the following sections.
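Before moving on, it can help to check that Python can find the installed package. This is a minimal sketch using only the standard library; the helper name is_installed is ours, and the module name shap_e is taken from the imports used later in this tutorial.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


print("shap_e importable:", is_installed("shap_e"))
```

If this prints False, re-run the installation cell and make sure the working directory is still the shap-e folder.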

Generate 3D Objects from Text with Shap-E

To generate a 3D object based on a text prompt, follow these steps:

Step 1. In a new cell in your Colab notebook, paste the following code:

import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
xm = load_model('transmitter', device=device)
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))

batch_size = 4
guidance_scale = 15.0
prompt = "a shark"

latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=guidance_scale,
    model_kwargs=dict(texts=[prompt] * batch_size),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)

render_mode = 'nerf'  # you can change this to 'stf'
size = 64  # this is the size of the renders; higher values take longer to render.

cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
    images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
    display(gif_widget(images))

This code sets up the necessary imports, loads the Shap-E models, and configures the generation parameters such as the text prompt and rendering options. The text prompt in this example is a shark, but you can change it to any object you’d like to generate.

Step 2. Run the cell to generate 3D objects based on the text prompt. The output will be displayed as animated GIFs, showing the generated 3D objects from different angles.

You can experiment with different text prompts and rendering options by changing the prompt, render_mode, and size variables in the code.
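The sampler settings use_karras, karras_steps, sigma_min, and sigma_max control the noise levels visited during sampling. As a rough illustration, the sketch below computes the noise schedule described by Karras et al. (2022) for the values used above. It assumes that Shap-E follows this formula with rho = 7; both the rho value and the helper name karras_sigmas are our assumptions, not something taken from the notebook.

```python
def karras_sigmas(steps, sigma_min, sigma_max, rho=7.0):
    """Noise levels from sigma_max down to sigma_min (Karras et al. style)."""
    min_inv = sigma_min ** (1.0 / rho)
    max_inv = sigma_max ** (1.0 / rho)
    sigmas = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 to 1.0
        # Interpolate in sigma^(1/rho) space, then raise back to the power rho.
        sigmas.append((max_inv + t * (min_inv - max_inv)) ** rho)
    return sigmas


sigmas = karras_sigmas(steps=64, sigma_min=1e-3, sigma_max=160)
print(f"first={sigmas[0]:.3f} last={sigmas[-1]:.5f} count={len(sigmas)}")
```

Under these assumptions, the schedule starts at sigma_max and shrinks to sigma_min, with more steps spent at low noise levels. Reducing karras_steps should make sampling faster at some cost in quality.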

Saving the Generated 3D Objects as Meshes

To save the generated 3D objects as mesh files (in PLY format), follow these steps:

Step 1. In a new cell, paste the following code:

from shap_e.util.notebooks import decode_latent_mesh

for i, latent in enumerate(latents):
    with open(f'example_mesh_{i}.ply', 'wb') as f:
        decode_latent_mesh(xm, latent).tri_mesh().write_ply(f)

Step 2. Run the cell to save the generated 3D objects as PLY files. These files will be saved in the shap-e folder in your Colab environment.

With batch_size = 4, they’ll be saved as example_mesh_0.ply through example_mesh_3.ply.

Step 3. To download the generated PLY files to your local machine, click on the folder icon on the left sidebar in Colab, navigate to the shap-e folder, and right-click on the PLY files you want to download. Select ‘Download’ to save them to your local machine.
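If you’d rather download all meshes in one go, you can bundle them into a single archive first. This is a small sketch using only the standard library; the helper name zip_meshes and the archive name meshes.zip are our own choices.

```python
import glob
import os
import zipfile


def zip_meshes(pattern="example_mesh_*.ply", archive="meshes.zip"):
    """Pack all files matching `pattern` into `archive`; return the file list."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        return []  # nothing to archive, so no zip file is created
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            # Store only the file name, not the full directory path.
            zf.write(path, arcname=os.path.basename(path))
    return paths


print("Archived:", zip_meshes())
```

The resulting meshes.zip appears in the same file browser and can be downloaded with a single right-click. In Colab, calling files.download("meshes.zip") after from google.colab import files should also trigger a browser download.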

Now you can use these generated 3D objects in any 3D modeling software that supports PLY files.
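To sanity-check a saved file without opening a 3D program, you can read its header. Every PLY file starts with a plain-text header that lists the storage format and the number of vertices and faces. The sketch below parses just that header with the standard library; the helper name read_ply_header is ours, and it does not decode the mesh data itself.

```python
import os


def read_ply_header(path):
    """Return (format, {element_name: count}) from a PLY file's header."""
    fmt = None
    counts = {}
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} is not a PLY file")
        for raw in f:
            parts = raw.decode("ascii").split()
            if not parts:
                continue
            if parts[0] == "end_header":
                break  # the mesh data starts after this line
            if parts[0] == "format":
                fmt = parts[1]
            elif parts[0] == "element":
                counts[parts[1]] = int(parts[2])
    return fmt, counts


# Only inspect the first mesh if it has already been generated.
if os.path.exists("example_mesh_0.ply"):
    print(read_ply_header("example_mesh_0.ply"))
```

A mesh with zero vertices or faces usually means something went wrong during decoding.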

Generate 3D Objects from Images with Shap-E

You can also generate 3D objects from images with Shap-E.

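We’ll start with the sample image provided in the Shap-E examples, corgi.png.

Download that image, then upload it to your Colab environment: hover over the target directory in the file browser on the left, click the three-dot menu that appears, choose Upload, and select corgi.png.

The code below loads the image from example_data/corgi.png, relative to the shap-e directory. Either upload the file into an example_data folder inside shap-e, or change the path passed to load_image to match where you uploaded it (for example, "corgi.png" if you uploaded it directly into shap-e).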
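A wrong path is an easy mistake to make at this step, so it’s worth checking where the image ended up before running the generation code. The sketch below tries a few candidate locations and reports the first one that exists. The helper name first_existing is ours, and the last candidate is an assumption about where the cloned repository keeps its sample images.

```python
from pathlib import Path


def first_existing(candidates):
    """Return the first candidate path that is an existing file, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return path
    return None


image_path = first_existing([
    "example_data/corgi.png",                  # path used by the sample code
    "corgi.png",                               # uploaded directly into shap-e
    "shap_e/examples/example_data/corgi.png",  # assumed location in the repo
])
print("Image found at:", image_path)
```

If this prints a path, pass it to load_image as str(image_path). If it prints None, upload the image again and check the file browser.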

Next, assuming you enabled GPU and installed Shap-E, run the following code:

import torch

from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget
from shap_e.util.image_util import load_image

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

xm = load_model('transmitter', device=device)
model = load_model('image300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))

batch_size = 4
guidance_scale = 3.0

image = load_image("example_data/corgi.png")

latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=guidance_scale,
    model_kwargs=dict(images=[image] * batch_size),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)

render_mode = 'nerf' # you can change this to 'stf' for mesh rendering
size = 64 # this is the size of the renders; higher values take longer to render.

cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
    images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
    display(gif_widget(images))

The result isn’t great in this case. With some tweaking of the parameters, or with other input images, you may get better results.
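One parameter worth tweaking is guidance_scale, which is 3.0 here and 15.0 in the text example. The sketch below illustrates the standard classifier-free guidance formula on toy numbers, assuming guidance_scale acts as that kind of weight in Shap-E, as the name suggests. The helper apply_guidance and the sample values are ours and are not taken from the Shap-E code.

```python
def apply_guidance(cond, uncond, scale):
    """Push the prediction away from the unconditional one by `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]    # toy prediction given the prompt or image
uncond = [0.0, 1.0]  # toy prediction without any conditioning

for scale in (1.0, 3.0, 15.0):
    print(scale, apply_guidance(cond, uncond, scale))
```

With a scale of 1.0 the conditional prediction is used unchanged. Larger values follow the input more strongly, typically at the cost of variety between the samples in a batch.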

Conclusion

OpenAI’s Shap-E is a powerful tool that enables users to generate 3D objects from text and images.

By leveraging Google Colab, you can easily set up and run Shap-E without the need for any complicated installations or powerful hardware.

Resources

https://github.com/openai/shap-e – The official OpenAI Shap-E repository, with the code, useful examples, and inspiration for what you can generate with Shap-E.
https://github.com/openai/shap-e/blob/main/samples.md – many GIF samples of what you can generate.

¹ Meshes represent 3D objects using a collection of vertices, edges, and faces. The faces, usually triangles or quadrilaterals, are combined to form the object’s surface. Meshes are commonly used in 3D modeling and animation.
² Neural Radiance Fields (NeRF) represent 3D objects as continuous functions mapping 3D coordinates to color and density values. A neural network learns this representation from 2D images of the object. NeRF can then generate high-quality images of the object from new viewpoints.
³ Google Colab is an online platform for writing, running, and sharing Python notebooks, similar to how Google Docs is an online platform for writing and sharing documents. It offers built-in access to computing resources, including GPUs and TPUs, which makes it a popular choice for machine learning and data science projects. As with Google Docs, you can collaborate with others in real time and easily share your work.

8 Comments

Alan

2 years ago

This was really helpful, thanks

Author

EdXD

Reply to Alan

Hi, Alan. Thanks for commenting! Glad to help! If you have any suggestions, or would have preferred that I cover some other aspects, please let me know. It would be of great help!

Gary

Nice introduction to the topic!
After giving it a shot, Colab shows this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
in ()
      6
      7 device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
----> 8 xm = load_model('transmitter', device=device)
      9 model = load_model('text300M', device=device)
     10 diffusion = diffusion_from_config(load_config('diffusion'))

3 frames
/content/shap-e/shap_e/models/download.py in check_hash(path, expected_hash)
     86     actual_hash = hash_file(path)
     87     if actual_hash != expected_hash:
---> 88         raise RuntimeError(
     89             f"The file {path} should have hash {expected_hash} but has {actual_hash}. "
     90             "Try deleting it and running this call again."

RuntimeError: The file /content/shap-e/shap_e_model_cache/transmitter.pt should have hash af02a0b85a8abdfb3919584b63c540ba175f6ad4790f574a7fef4617e5acdc3b but has 03622d595efd68c5cedf8f1633937fd40ed9a7db900a4c2a3d70493542182878. Try deleting it and running this call again.

april

When I run the file locally with Python, the download of https://openaipublic.azureedge.net/main/shap-e/transmitter.pt gets interrupted. How can I solve this?

Mark

Great article! Is Shap-e still your favorite text to 3D? Thank you!

Sher

1 year ago

Hi!
Thank you very much for this!! As someone who has no background in coding at all, and I have visited other articles, I’ve been facing so many errors, almost gave up! Trying your steps, made it work for the very first time! thank you!! I’m very thankful! Maybe also add that this is only doable in Linux environment for dummies like me. Thanks!

Reply to Sher

Oh, wow! Thank you for the kind words. I’m happy you found it useful!

I’ll look into how to do these things on Linux. I’ve been a bit hesitant because often performance of AI tools depends on each user’s GPU, and it may be frustrating for some who don’t have a powerful GPU.