Carnegie Mellon's AI-Powered FRIDA Robot Collaborates With Humans To Create Art

Aaron Aupperlee | Tuesday, February 7, 2023

FRIDA, a robotic arm with a paintbrush taped to it, uses artificial intelligence to collaborate with humans on works of art. Here, it works on a portrait of Ruth Bader Ginsburg.

Carnegie Mellon University's Robotics Institute has a new artist-in-residence.

FRIDA, a robotic arm with a paintbrush taped to it, uses artificial intelligence to collaborate with humans on works of art. Ask FRIDA to paint a picture, and it gets to work putting brush to canvas.

"There's this one painting of a frog ballerina that I think turned out really nicely," said Peter Schaldenbrand, a School of Computer Science Ph.D. student in the Robotics Institute working with FRIDA and exploring AI and creativity. "It is really silly and fun, and I think the surprise of what FRIDA generated based on my input was really fun to see."

FRIDA, named after Frida Kahlo, stands for Framework and Robotics Initiative for Developing Arts. The project is led by Jean Oh, an associate research professor in the RI and the head of the Bot Intelligence Group (BIG), and Jim McCann, an assistant RI professor who runs the Textiles Lab. Schaldenbrand has been the project's driving force, and students and researchers across CMU have collaborated on research involving FRIDA.

Users can direct FRIDA by inputting a text description, submitting other works of art to inspire its style, or uploading a photograph and asking it to paint a representation of it. The team is experimenting with other inputs as well, including audio. They played ABBA's "Dancing Queen" and asked FRIDA to paint it.

"FRIDA is a robotic painting system, but FRIDA is not an artist," Schaldenbrand said. "FRIDA is not generating the ideas to communicate. FRIDA is a system that an artist could collaborate with. The artist can specify high-level goals for FRIDA and then FRIDA can execute them."

The robot uses AI models similar to those powering tools like OpenAI's ChatGPT and DALL-E 2, which generate text or an image, respectively, in response to a prompt. FRIDA simulates how it would paint an image with brush strokes and uses machine learning to evaluate its progress as it works.

FRIDA's final products are impressionistic and whimsical. The brushstrokes are bold. They lack the precision sought so often in robotic endeavors. If FRIDA makes a mistake, it riffs on it, incorporating the errant splotch of paint into the end result.

"FRIDA is a project exploring the intersection of human and robotic creativity," McCann said. "FRIDA is using the kind of AI models that have been developed to do things like caption images and understand scene content and applying it to this artistic generative problem."

FRIDA taps into AI and machine learning several times during its artistic process. First, it spends an hour or more learning how to use its paintbrush. Then, it uses large vision-language models trained on massive data sets that pair text and images scraped from the internet, such as OpenAI's Contrastive Language-Image Pre-Training (CLIP), to understand the input. AI systems use these models to generate new text or images based on a prompt.
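To make the idea of pairing text and images concrete: CLIP-style models place text and images in a shared embedding space and compare them with cosine similarity. The sketch below shows only that comparison step. The three-number embeddings and the names `sketch_a` and `sketch_b` are made up for illustration; a real model produces much longer vectors, and this is not FRIDA's code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings; a real vision-language model would compute these.
text_embedding = [0.9, 0.1, 0.3]          # stands in for the user's text prompt
candidate_images = {
    "sketch_a": [0.8, 0.2, 0.4],          # stands in for one candidate image
    "sketch_b": [0.1, 0.9, 0.2],          # stands in for another candidate image
}

# Pick the candidate whose embedding points most nearly the same way as the text.
best = max(
    candidate_images,
    key=lambda name: cosine_similarity(text_embedding, candidate_images[name]),
)
print(best)
```

A higher score means the image is a closer match to the text, which is the kind of signal a system can use to judge whether a planned painting fits the prompt.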

Other image-generating tools, such as OpenAI's DALL-E 2, use large vision-language models to produce digital images. FRIDA takes that a step further and uses its embodied robotic system to produce physical paintings. One of the biggest technical challenges in producing a physical image is reducing the simulation-to-real gap, the difference between what FRIDA composes in simulation and what it paints on the canvas. FRIDA uses an idea known as real2sim2real. The robot's actual brush strokes are used to train the simulator to reflect and mimic the physical capabilities of the robot and painting materials.
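The real2sim2real idea can be illustrated with a deliberately tiny example. The sketch assumes, purely for illustration, that a stroke is described by a single number (its length) and that the real stroke length is a linear function of the commanded length. The measurements, the function names and the linear model are all hypothetical; FRIDA's actual stroke model is far richer.

```python
def fit_stroke_model(commanded, observed):
    """Real -> sim: least-squares fit of observed = scale * commanded + offset."""
    n = len(commanded)
    mean_c = sum(commanded) / n
    mean_o = sum(observed) / n
    cov = sum((c - mean_c) * (o - mean_o) for c, o in zip(commanded, observed))
    var = sum((c - mean_c) ** 2 for c in commanded)
    scale = cov / var
    offset = mean_o - scale * mean_c
    return scale, offset

def simulate_stroke(commanded_length, scale, offset):
    """Sim: predict how long a stroke will really be on the canvas."""
    return scale * commanded_length + offset

def command_for(target_length, scale, offset):
    """Sim -> real: invert the fitted model to choose what to command."""
    return (target_length - offset) / scale

# Made-up calibration data: lengths the robot was told to paint vs. what it painted.
commanded_cm = [2.0, 4.0, 6.0, 8.0]
observed_cm = [1.5, 3.1, 4.7, 6.3]

scale, offset = fit_stroke_model(commanded_cm, observed_cm)
command = command_for(5.0, scale, offset)
print(f"To paint a 5 cm stroke, command {command:.3f} cm")
```

The point of the loop is that the simulator is fitted to what the hardware actually does, so plans made in simulation transfer to the canvas with a smaller gap.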

FRIDA's team also seeks to address some of the limitations in current large vision-language models by continually refining the ones they use. The team fed the models the headlines from news articles to give them a sense of what was happening in the world and further trained them on images and text more representative of diverse cultures to avoid an American or Western bias. This multicultural collaboration effort is led by Zhixuan Liu and Beverley-Claire Okogwu, first-year RI master's students, and Youeun Shin and Youngsik Yun, visiting master's students from Dongguk University in Korea. Their efforts include training data contributions from China, Japan, Korea, Mexico, Nigeria, Norway, Vietnam and other countries.

Once FRIDA's human user has specified a high-level concept of the painting they want to create, the robot uses machine learning to create its simulation and develop a plan to make a painting to achieve the user's goals. FRIDA displays a color palette on a computer screen for a human to mix and provide to the robot. Automatic paint mixing is currently being developed, led by Jiaying Wei, a master's student in the School of Architecture, with Eunsu Kang, faculty in the Machine Learning Department.

Armed with a brush and paint, FRIDA will make its first strokes. Every so often, the robot uses an overhead camera to capture an image of the painting. The image helps FRIDA evaluate its progress and refine its plan, if needed. The whole process takes hours.
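The paint, photograph, re-plan cycle described above can be sketched as a simple feedback loop. Everything in this example is a stand-in: the canvas is a short list of gray values, `execute_strokes` imitates an imperfect robot by applying only part of each intended change, and the error measure replaces the overhead camera and the learned evaluation. None of these names come from the FRIDA project.

```python
def plan_strokes(canvas, target, batch_size):
    """Choose the cells furthest from the target and the change each one needs."""
    errors = [(abs(t - c), i) for i, (c, t) in enumerate(zip(canvas, target))]
    errors.sort(reverse=True)
    return [(i, target[i] - canvas[i]) for _, i in errors[:batch_size]]

def execute_strokes(canvas, strokes, fidelity=0.7):
    """Stand-in for the physical robot: each stroke lands only partially."""
    for i, delta in strokes:
        canvas[i] += fidelity * delta

def total_error(canvas, target):
    """Stand-in for the camera check: how far the canvas is from the goal."""
    return sum(abs(t - c) for c, t in zip(canvas, target))

def paint(target, rounds=10, batch_size=2):
    canvas = [0.0] * len(target)
    history = [total_error(canvas, target)]
    for _ in range(rounds):
        # Re-plan from the latest observation rather than following a fixed script.
        strokes = plan_strokes(canvas, target, batch_size)
        execute_strokes(canvas, strokes)
        history.append(total_error(canvas, target))
    return canvas, history

canvas, history = paint([0.9, 0.1, 0.5, 0.7])
print(f"error before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

Because each round plans from the observed canvas, an imperfect stroke is corrected or absorbed in later rounds instead of compounding.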

"People wonder if FRIDA is going to take artists' jobs, but the main goal of the FRIDA project is quite the opposite. We want to really promote human creativity through FRIDA," Oh said. "For instance, I personally wanted to be an artist. Now, I can actually collaborate with FRIDA to express my ideas in painting."

More information about FRIDA is available on its website. The team will present its latest research from the project, "FRIDA: A Collaborative Robot Painter With a Differentiable, Real2Sim2Real Planning Environment" at the 2023 IEEE International Conference on Robotics and Automation this May in London. FRIDA resides in the RI's Bot Intelligence Group (BIG) lab in the Squirrel Hill neighborhood of Pittsburgh.

For More Information

Aaron Aupperlee | 412-268-9068 | aaupperlee@cmu.edu