Fireside Chat
Roope Rainisto speaks with Elman Mansimov in a fireside chat moderated by Alejandro Cartagena, co-founder of Fellowship.
Alejandro Cartagena: So, Roope, in your artistic journey, how do you balance artistic intuition with the algorithmic logic provided by AI?
Roope Rainisto: It's an interesting question. As an artist, I've been through multiple ideas, and AI tools work great for some of them but not for all, at least today. They're one path you can use to create something. But who knows what you can do in five years. I think, as an artist, it's a new tool for a new art form. It's slightly a black box, a box whose capabilities nobody completely knows.
It's really important, as an artist, to keep an open mind, to explore, and not have a 100% preset, preconceived notion of the thing that you want to create. You should have some idea but stay open. It's a hybrid interplay between me and the AI tool that I'm using. The best results are certainly not the first attempts; they come when you start to go deeper and deeper, and I think that's the way to create something.
Cartagena: In this process of going deeper, where do you draw the line between these creative inputs and the AI's contribution in your art?
Rainisto: On drawing the line, as I've sometimes said before: with AI, you are able to create all sorts of fantastical art and fantastical visions. But at the end of the day, it comes back to what the artist wants the viewer or the consumer of the art to feel. What do I want to communicate, or what do I want to make somebody feel? It's easy to create fantastical art that doesn't evoke any particular feeling, so I often use that as the ultimate discriminator.
So if I want to evoke a particular feeling related to my goal, and the artwork evokes it, that feeds back into me, and I'm pleased. That's irrespective of how aesthetically beautiful the art is or how complex or simple it was to create; it's all about this feeling. Sometimes it takes a long time to create something that actually makes me feel something, the thing I want the artwork to make somebody else feel. I'm now using that as my personal guideline. Irrespective of how beautiful or not something is, it's about whether it's able to evoke a feeling.
Elman Mansimov: It's very interesting that you mentioned that, because from the computer science perspective, I'm the one developing those AI models most of the time. The metric when you developed these models was to make something photorealistic, something where humans cannot distinguish a real photo from an AI-generated one. What I felt back in the day was a failure of the models. Of course, now they're much more realistic than before, but I interpreted it as a failure that there were certain artifacts. If you take deepfakes, the faces generated with GANs, they would have something off about the eyes, something off about the skin or lips. I'm pretty sure the same holds for other things, like actual objects or styles. I felt it was in some way a failure that we couldn't get it photorealistic. But you mentioned that it evokes feelings, right? And while it does that for you and for other people when they look at it, it seems like a failure from a scientific perspective. Not a failure, exactly, but it's not there, right? We haven't reached that 100% level of realism.
But for me, it was still a pleasant surprise that it evokes feelings in people. When they look at it, it's something very interesting. Do you remember Deep Dream back in 2015, those swirls in the images? I felt like it was the first time people were looking at those representations of hidden units in the neural network and were impressed by how artistic they were. It was a global phenomenon back then. I don't think people talk about it as much now, but it's pleasantly impressive to me that those things catch on in a way that you don't expect.
Rainisto: Yeah, and I'd like to continue on that, because I think it also relates to texts and prompts. Different words have different meanings. If I say "a cat" or "a dog," we have an idea from the real world of what a cat looks like, even though cats don't all look alike. But take the word "sadness"; there are so many words that don't have a physical meaning. So why wouldn't it be right for "sadness" to create an abstract picture? Overall, some words describe reality, and some words are about feelings or concepts that have no specific right or wrong.
So if you prompt for sadness and the tool creates something that is not realistic, does that mean the tool failed, or is it actually successful because it understands that the result shouldn't look realistic? I think that's an extremely interesting question.
Cartagena: Okay, so, Elman, also a similar question from your perspective, how does AI influence the creative decision-making process in art?
Mansimov: It's very interesting from my perspective because I'm not an artist, right? I'm a computer scientist; that's my background, and that's what I do. What I find impressive, as Roope mentioned at the end, is how you work with the prompts. The AI model is not just the model itself and the output it generates, but also what you're feeding into it and how you're using it to generate those images.
People like me, who are more into training these models, think about algorithms like backpropagation, architectures, loss functions, and making sure that things are running properly. We don't really think about prompts, but creatives and artists do, and they create fantastic things with them. To the point you mentioned about sadness: you put in what you feel, you generate something, you iterate, you use tools like Stable Diffusion over that latent space, and then you get to something that inspires you. It's pretty incredible. What's also impressive to me is that some artists not only use prompts but also train the models themselves.
I know things are not very clean: there are Colab notebooks, hardware like GPUs, coding in Python, PyTorch, and all these packages. But it's impressive to me that artists, who may not have the same background as we do, figure out how to tweak it and run it, and then run it frugally. I'm very impressed with how people use these tools, and I encourage them to keep doing it because I want to discover things that I don't see, and how AI influences artists in ways that I would never have imagined before.
Cartagena: Yes, I see. And what are the most significant technological challenges you face when engineering AI to meet artistic needs?
Mansimov: I wouldn't necessarily start with artistic needs first. I would just generally say that engineering is the challenge. Many years ago, seven or eight years ago, at the beginning of text-to-image synthesis, we didn't have enough hardware at the university; we had limited memory. The software wasn't as good, and we had to write backpropagation ourselves. Now, with open source, many things are handled for you, and there are better packages and access to powerful machines. I feel like hardware and software got better simultaneously. Commercially, this led to rapid progress in the field. Access to better hardware, more datasets, and improved software made it all come together in an incredible way.
Also, the way artists commercialize their work, like with NFTs, is significant. It's not just images in a notebook; people turn them into pieces they sell and distribute. Engineering challenges include software, hardware, access, etc., but it all worked together to make things better and more accessible.
Rainisto: From my perspective as the end user/artist, I started in 2021 using Colab, which is an extremely easy way to deploy AI. The tools make it accessible to people who may not be technically sophisticated. While traditional artists spent years thinking about these questions, new artists may have a different perspective. Copying or replicating someone else's style is a common starting point for artists, whether they draw, write music, or create AI artwork.
However, as they gain experience, they start to develop their own style and voice, which is a natural progression. AI tools are increasingly customizable, making it easier for artists to personalize their work and find their own distinctive style. It's a shame to judge new artists if they start by replicating someone else's style, as it takes time and experience to develop their own unique approach.
Mansimov: I agree with Roope that artists often begin by replicating others' styles and get inspired by each other. They start with something familiar and gradually add their own voice to it. It's not just in art but also in science, where we reference previous work and build upon it. Copyright and compensation issues arise when artists use AI to quickly generate works that they profit from without significant personal input. Commercialization and distribution become key concerns in this context. Compensating artists for their work and data input into the model is a challenge that requires better solutions.
Cartagena: As this realm is growing and becoming easier to enter, how are you navigating the ethical considerations that come with using AI in your art? Roope, regarding originality and copyright issues, how do you handle them?
Rainisto: Copyright laws and ethics haven't changed, in my opinion. Whether you create an artwork using AI or traditional methods, the end result is what matters. Artists have always faced questions about originality and copying, and these questions are older than AI itself. AI tools have made it easier to replicate styles, but over time, artists naturally develop their own voice and style. Most artists want to be recognized for their distinct style. It's not a good idea to shame new artists who start by replicating others' styles; finding their own voice is a natural progression.
Mansimov: I agree that artists often begin by getting inspired by each other and gradually adding their own voice to their work. AI copyright and compensation issues arise when artists quickly generate works and profit from them without significant personal input. The challenge is how to properly compensate and credit artists for their work and the data they contribute to the model. It's not an easy answer, and we need better ways to address these concerns.
Cartagena: Do you think that this question of intellectual property is shifting now because of the volume of content we are producing, and because it's very easy to copy and reproduce? Do you think this is influencing the future of AI in arts, in the art fields?
Rainisto: I'm certain that these questions are very much at the forefront. And I think we are still far from reaching mass use. Probably only a few percent of the world's population is currently using these tools. So, if you try to imagine when it's ten times more, twenty times more, or fifty times more, these questions will become more prominent. I do think that if someone were to copy a successful artist's style or art and put it online, they wouldn't automatically become successful because of that.
Even though there are tens of millions of Midjourney users, not all of them are successful, even though they create art that looks fantastic. I think the style and the technical aspect are important, but by themselves, they are insufficient. It's like a cookbook from a famous chef. Their cookbook sells, and if I make a cookbook, I need to have great authority because my cookbook won't sell by itself. It's a broader question of what we are creating and why someone would prefer our version of it. So, why would somebody want my recipe for spaghetti instead of a famous chef's? Even if I don't copy the famous chef's recipe into my cookbook, why would someone buy my book? This question applies to photos as well. Nowadays, we all have beautiful cameras, and we can all take fantastic photos. Yet, professional photographers still exist, and most people can't sell their photos. The ease of creation alone doesn't eliminate the need for professional artists.
Mansimov: I think branding matters a lot. Branding comes from having a unique voice, a unique perspective, and being unique yourself. I agree that it's increasingly important now, as it has been throughout our existence. People should have something unique of their own. Additionally, in terms of ethical usage, people should be transparent about when and how they use AI. This applies even in the context of academic writing, where students might use ChatGPT for assistance. I was talking to a professor who encourages students to use ChatGPT but wants them to explain how they used it. If they used it to edit sentences or brainstorm, that's fine. But if they simply copy-paste without adding anything of their own, it becomes a problem. It's not just about grades; it's about showing the world something unique that reflects your thoughts, even if you used AI to assist in some parts.
Rainisto: That's an interesting point. As these AI-generated images become more realistic, there's a desire to add a watermark to indicate that it's fake. Realism is intriguing. Even today, people use AI to enhance their photos, making it increasingly hard to distinguish real from fake. Personally, as an artist, I find it more engaging to explore the contradictions rather than striving for 100% realism. Sometimes, it's harder to do that, as creating something truly unique requires more effort. It's a bit like the joke that once a computer learns to draw a hand with five fingers, it's harder to make it draw a hand with six fingers. As an artist, I choose to embrace transparency in my art, making it clear how it's created, so I don't need to put a stamp in the corner to say, "This was created by AI." But the majority of AI art seems to lean towards a realistic style, either due to how the tools are designed or what people desire.
Rainisto: If you think back to 2015 and compare it to today, did you have an estimate in 2015 of how many years it would take to reach this stage we are at now? Did you think it would take more or less time? Do you have a feeling of how long it would take to become completely photorealistic after seeing the first results?
Mansimov: Back in 2015, when I was working on it, I knew it would get better and more photorealistic over time. The timeline initially felt like it would take at least ten years, so optimistically, 2025 or even into the 2030s for more realistic results. I knew it would happen. What surprised me was how quickly it happened. It's like you go about your daily life with all your ups and downs, and suddenly something incredible happens, even if you weren't thinking about it. For me, that moment was when OpenAI released DALL·E-2 and showcased an avocado chair, which I believe was in early 2022 or late 2021. I didn't expect to open my Twitter feed and see such an interesting, funky creation from the model. Another surprise was how mainstream it became, with Midjourney gaining over ten million subscribers. It's impressive how fast and far we've come, and there's still room for further growth. It's also amazing to see how people use and discover it, even those who aren't experts in machine learning, using prompts and Colab notebooks in ways I didn't expect five to seven years ago. It's becoming more and more mainstream.
And then, in a similar vein, I'm very curious. You started doing this relatively recently, in 2021. I saw that you did make art before, and I know you wrote some music and played instruments, so it's not a totally different world for you. What attracted you to this space? Was it something you saw on the internet, or had you always thought about it and decided that now was the time in your life to try something different and go for it? I'm always very curious to ask people how they got attracted to it. How did they get magnetized into it?
Rainisto: Yeah, I think it's a combination of multiple aspects that drew me here. I've been doing photography for about 30 years. I got my first camera when I was 13, thanks to my dad. The first ten years were with film, and I spent my teenage years in the darkroom, working on film processing, which was a fantastic experience. Then, of course, the digital era came along. I've also been in some bands, written music, made short films, and so on, but these were always side projects. In my professional life, I started as a web designer in 1998 and went on to UI design, concept design, product design, worked for a virtual reality company, and even at Microsoft on smartphones. I had a wide range of professional activities. About two and a half years ago, I had what some might call a midlife crisis at the age of 40.
I realized I didn't want to grow old without trying to create something. I've always had aspirations to create something, and the year 2021 happened to be the COVID year, so we were stuck inside our houses. I heard about this amazing AI, which I thought of as a virtual camera. I love taking photos, but I couldn't go outside. So I thought, why not create art indoors? I found it fascinating to explore large-scale models, as they seem a distilled version of the human experience. Take, for example, the Stable Diffusion model with 5 billion inputs, which is a sort of distilled version of humanity. I find it extremely fascinating to explore a virtual camera that is the opposite of a typical camera. It offers a shared viewpoint, multiple viewpoints, multiple moments in time, and multiple subjects.
I enjoy identifying what remains the same, what's recognizable, and what these models can reveal about us as humans when we create art. It's about discovering the hidden aspects of ourselves, things that we may not think about when photographing or portraying. AI doesn't have feelings, and it's not trying to hide certain aspects or portray things in a certain light, which makes it fascinating. It's an artist from a different planet. I can ask it to draw a picture of New York, and it creates a warped version of reality. I believe there's a lot of conflict and contrast that can be created, and conflict and contrast make for interesting art, at least for me.
Mansimov: It's very interesting that you mentioned your camera. Can you tell me more about your camera tech stack? What are the things you fine-tune in the models? Do you use prompts, or is it a mix of these approaches? I'd be very interested to hear some technical details about how you work with your camera.
Rainisto: I've been quite involved in fine-tuning the models, especially with Stable Diffusion and DreamBooth fine-tuning, along with various embeddings in different versions. This approach made it easier for me to take the base version and adapt it to create the style I wanted. Initially, like most AI artists, I emphasized the importance of the prompt. Some people write incredibly long prompts that feel like elaborate spells.
However, I've shifted away from relying on prompts heavily in my recent work. I've been experimenting with using fewer prompts, and I even created three video artworks using prompts from a 2015 paper. The challenge for me was to see if I could create art without using prompts and still find it personally satisfying. Even without prompts, there's still much more to explore and discover. While prompts remain a critical part, there's so much more you can do, and I find it intriguing to see how much art can be created even without using prompts. It's a fun challenge.
Mansimov: It's still very impressive to me. I'm impressed that you mentioned techniques like fine-tuning and embeddings. I'm curious, do you feel that your success is partly due to going deeper into the details rather than just using prompts on the surface level of the models? By fine-tuning and customizing the models, do you think you create new images that others can't and capture people's imagination and feelings with them?
Rainisto: I'd probably say yes. From an artistic perspective, collectors often look for art that is distinct, recognizable, and rare. If millions of people use the same AI tool and create art using only prompts without any alterations, their art may appear beautiful but indistinguishable. When you see a piece of art, you can't attribute it to a specific creator because many others can replicate it. Going deeper into the technical aspects allows me to customize and personalize my art. While prompts remain an essential part, my approach is distinct, and it would be challenging for someone else to replicate my style. I'm sure this holds true for many artists. Once you develop your unique techniques or themes, your art becomes more distinct, which likely contributes to your success.
However, artists are more than just their technical tricks. With AI art tools, there's an opportunity to be more versatile in terms of style. Traditionally, learning specific styles takes a long time, but with AI, artists can switch styles relatively easily. This versatility is fascinating to me. Yet, just because something can be done doesn't mean it should be done. As a commercial artist, you need to consider the artistic integrity and what serves your vision best. It's about striking a balance between the capabilities of AI and your creative expression.
As for the future, given what we know now, do you have any educated guesses? It might be entirely wrong, but what do you think might be the next big thing that hasn't been discovered yet? In terms of AI development, what do you think is on the horizon that we're not aware of at the moment?
Mansimov: That's a very interesting question. The way I personally think about it is that current AI models are like statistical pattern recognition machines on steroids. They process vast amounts of data, such as billions of images, in ways our human minds cannot comprehend. They interpolate between these data points in fascinating ways. However, what they do is interpolation, not true creation. AI is still limited by the data it processes, and it's all about interpolating between existing data. The true breakthrough will come when AI can make decisions on its own reliably.
We've seen glimpses of this with AlphaGo and a few other examples. Still, AI's ability to make decisions independently is far from being a reality. In many areas like self-driving cars, we're not quite there yet. Elon Musk has been promising fully self-driving Teslas for years, but it remains elusive. The real game-changer will come when AI has true agency and can execute, try out ideas, and discover entirely new concepts and things we haven't even thought of yet. This will be a significant turning point when AI moves beyond being a mere assistive technology to being a true innovator and discoverer.
Rainisto: I agree, and that's a fascinating perspective. For AI to achieve real discovery, it needs not only to guess but also to test and prove its hypotheses. Currently, AI can't make guesses based on hypotheses, and it's a significant limitation. When AI can generate insights that we didn't even provide as raw data, it will indeed be a game-changer. It has to be capable of that next level of creativity and discovery, similar to how humans make groundbreaking scientific discoveries. It's an exciting yet potentially scary prospect. AI discovering things that we couldn't have predicted is a paradigm shift waiting to happen.
Mansimov: Indeed, AI making true discoveries outside of known patterns is an exciting yet challenging concept. While AI can interpolate and generate novel things based on existing data, making discoveries beyond what is currently known is a completely different level of achievement. It's akin to the groundbreaking discoveries of the past, like understanding gravity or that the Earth is round. The ultimate goal is for AI to become an agent of discovery, finding new concepts, ideas, and solutions that are currently inconceivable to us. Once we reach that point, it will indeed be a "Wow" moment, and a new era in AI will begin.