Recent developments in AI have highlighted the importance of scale in driving progress across many domains. Large models, with their greater number of learnable parameters, have shown significant improvements in language comprehension, generation, representation learning, multimodal tasks, and image generation. For example, models like Chinchilla and LLaMA, trained on trillions of web-crawled tokens, have surpassed earlier models like GPT-2 on benchmarks and show broader capabilities.
In computer vision, scaling datasets from millions to billions of images through web crawling has produced powerful visual representations. Web-crawled datasets like LAION-5B, with billions of images, have allowed models like CLIP to outperform earlier models trained on ImageNet, a dataset of roughly one million images. In 3D computer vision, however, scaling has been harder. Tasks like 3D object generation and reconstruction still rely on small handcrafted datasets, because 3D assets are difficult to crowdsource and therefore difficult to collect at scale.
To address this limitation, researchers from various institutions have introduced Objaverse-XL, a large-scale web-crawled dataset of 3D assets. With over 10 million 3D objects, it offers a wider variety and higher quality of data than previous efforts. Models trained on Objaverse-XL, including Zero123-XL and a PixelNeRF model, have demonstrated strong zero-shot generalization and improved performance on tasks like novel view synthesis.
Objaverse-XL has implications beyond just 3D models. It opens up opportunities in computer vision, graphics, augmented reality, and generative AI. For instance, researchers can now explore text-to-3D generation using the vast and diverse dataset. The release of Objaverse-XL marks a significant milestone in the field of 3D datasets and provides a foundation for large-scale training and groundbreaking research.
While Objaverse-XL is still much smaller than billion-scale image-text datasets, it sets the stage for further exploration and scaling of 3D datasets. Future work can focus on choosing which data points to train on and on extending Objaverse-XL to benefit other tasks like 3D segmentation and detection.