San Francisco-based artificial intelligence startup OpenAI has unveiled the latest version of its AI image generator, DALL-E. According to the company, the new model, named DALL-E 3, lets users generate images with significantly more “nuance and detail” than previous iterations.
OpenAI researcher Aditya Ramesh says the system is now far better at understanding and representing what a user asks for, adding that the technology was built to have a more precise grasp of the English language.
ChatGPT Can Now Generate Images And Also Provide Picture Descriptions
The Microsoft-backed company is integrating DALL-E into ChatGPT, turning the large language model (LLM)-based chatbot into a generative AI hub that can produce text, images, sounds, software, and other digital media on its own.
Although both are developed and owned by OpenAI, DALL-E and ChatGPT have so far operated as separate applications. With the latest update, ChatGPT users can produce an image simply by describing what they want to see, or generate images from descriptions written by the chatbot itself.
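For readers who want a sense of what this looks like outside the chat interface, here is a minimal sketch of turning a plain-language description into an image with OpenAI’s Python SDK. It is illustrative only, not the ChatGPT integration itself; the “dall-e-3” model name and API access for the account are assumptions.

# Minimal sketch: one image from one plain-language description.
# Assumes `pip install openai`, an OPENAI_API_KEY environment variable,
# and that the account has access to the assumed "dall-e-3" model name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

result = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor illustration of a fox reading a bedtime story to its cubs",
    size="1024x1024",
    n=1,
)

print(result.data[0].url)  # link to the generated image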
OpenAI CEO Sam Altman shared a video on X (formerly Twitter) showing how the two capabilities can be used together. He said the language model can now write and illustrate a children’s bedtime story from simple prompts.
Meanwhile, Gabriel Goh, another researcher at the company, showed how ChatGPT can generate detailed descriptions that can later be used to create matching images.
The bot was able to produce logo descriptions for a restaurant called Mountain Ramen and then proceeded to generate several images by referring to the descriptions, all in a matter of seconds.
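That two-step flow can be sketched programmatically: first ask a chat model to draft the description, then feed the description to the image endpoint. This is an illustrative sketch, not Goh’s actual demo; the model names and the Mountain Ramen prompt are assumptions for the example.

# Illustrative sketch of the chained workflow described above:
# 1) have a chat model draft a detailed logo description,
# 2) pass that description to the image-generation endpoint.
# Model names ("gpt-4", "dall-e-3") are assumptions for this example.
from openai import OpenAI

client = OpenAI()

chat = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": "Write a detailed logo description for a restaurant called Mountain Ramen.",
    }],
)
description = chat.choices[0].message.content

image = client.images.generate(model="dall-e-3", prompt=description, n=1)
print(image.data[0].url)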
Goh added that DALL-E 3 can handle multi-paragraph descriptions and follow instructions laid out by users in minute detail. He also warned that the image generator, like other text-to-image AI systems currently on the market such as Midjourney and OpenAI’s own DALL-E 2, is prone to mistakes and can misinterpret prompts.
This has given rise to prompt engineers, who specialize in crafting and refining the instructions given to generative artificial intelligence tools like ChatGPT and DALL-E to get better results from them.
AI Image Generators Are Widely Used To Spread Disinformation Online, Experts Say
Experts have criticized image-generating technologies for allowing the spread of large amounts of online disinformation.
To combat misinformation, DALL-E 3 will incorporate tools that prevent the generation of images related to problematic subjects, such as sexually explicit images and portrayals of public figures. OpenAI is also working to limit DALL-E’s ability to mimic specific artistic styles.
OpenAI’s safety and policy lead Sandhini Agarwal said DALL-E tends to generate images that are more stylized than photorealistic. She acknowledged, however, that the model could still be prompted to produce convincing images, such as a scene that appears to have been captured by a security camera.
According to Agarwal, the company does not plan to block every piece of potentially problematic content DALL-E 3 produces, calling such an approach “just too broad,” because whether an image is dangerous often depends on the context in which it appears and how it is used.
Speaking on an episode of “The Diary of a CEO” podcast earlier this month, Mustafa Suleyman MBE, a British artificial intelligence researcher and co-founder of AI company DeepMind (now owned by Alphabet), warned of a dark scenario in which bad actors misuse AI to experiment with dangerous pathogens, deliberately creating synthetic viruses that are more transmissible or lethal and could trigger a pandemic.
He is calling for limits on AI’s access to the tools and systems that could carry out that kind of experimentation.
OpenAI will release DALL-E 3 in October. The text-to-image model will be exclusive to ChatGPT Plus and ChatGPT Enterprise users.
Read More: Microsoft AI and Surface Event: Everything You Need To Know