AI loves naughty drawings and it’s all our fault

Added 2023-04-19 05:02:30 +0000 UTC

As a lot of you probably already know, AI-generated illustrations have become much more than a "passing fad." I had the opportunity to experiment with them a while ago and wanted to share my impressions. Please note that this article may be somewhat outdated since I wrote it last month. Nevertheless, I decided to leave it as a memorandum; a reminder of how rapidly the technology behind AI is advancing.

The Setup

There are various AI illustration services, but I decided to try Stable Diffusion, which seemed to be the most open and unrestricted option (if you know what you’re doing). It's still experimental, so I wanted to see how it feels.

The AI loves naughty drawings.

I tried to generate a picture along the lines of "Selphine + Rughzenhaide Castle interior."

▼This generated image captures the ambiance of the castle's interior, although many of the background choices feature distorted perspective lines, making it challenging to find a suitable option. Additionally, the girl in the picture appears to be either under the staircase or buried in the ground due to the floor's height...

Most of the compositions were either landscape-oriented with a background or portrait-oriented with a single character.

This model is a pretty girl generation model, so it always squeezes in a pretty girl, even if you just want the background.

It never listened to my request for "No Character."

So, since I would have to remove the character afterward, I specified "bald" to minimize the area the character takes up...

The background characters (mobs) went bald... That's not what I meant...

As a temporary measure, I decided to specify short hair.

Also, if you don't specify otherwise, it will keep creating erotic, nearly nude images with large breasts. Even when I told it to exclude NSFW content, it ignored me and produced erotic images anyway, which made me sense a persistent obsession with eroticism on its part.
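For anyone unfamiliar with it, exclusions like this are normally passed as a separate "negative prompt." Below is a minimal sketch, assuming the Hugging Face diffusers library (not necessarily the front end used for this article). The helper name and the tags are made up for illustration.

```python
def build_generation_args(wanted, unwanted, steps=30, guidance_scale=7.5):
    """Assemble keyword arguments for a text-to-image call.

    The key names mirror parameters accepted by the diffusers
    StableDiffusionPipeline call; the tag lists are illustrative only.
    """
    return {
        "prompt": ", ".join(wanted),             # what we want to see
        "negative_prompt": ", ".join(unwanted),  # what we want to avoid
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


args = build_generation_args(
    wanted=["castle interior", "grand staircase", "scenery", "no humans"],
    unwanted=["nsfw", "nude", "1girl"],
)
print(args["prompt"])
print(args["negative_prompt"])

# Hypothetical usage (needs diffusers, a GPU and a downloaded checkpoint):
#   from diffusers import StableDiffusionPipeline
#   pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID)  # MODEL_ID is a placeholder
#   image = pipe(**args).images[0]
```

As this article shows, a negative prompt only nudges the model away from something. It is not a guarantee.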

Eroticism often accompanies groundbreaking technological advancements, so I understand, but I couldn't help but laugh when it generated a picture with breasts growing from the feet.

AI prioritizes "feel"

Image generation produces many variations from a prompt (the "spell"), but its basic stance seems to be that structure and the like are secondary as long as the picture looks good.

It might be a matter of usage, but sometimes it ignores detailed color specifications, which I think is because the AI is deciding how to make the image look good.

▼It doesn't matter if the door on the right is too narrow. If the atmosphere is good, everything is OK

It doesn't care about things like buried legs or strange heights. It generates images with an attitude like "Wouldn't it be beautiful if there was light here!!!", "Let's cut this pillar in half because it's in the way (it's hidden by the character anyway)", or "I don't know if it's a window or a door, but isn't this frame line great!?"

▼This is an example with significant distortions. The eye level and perspective are completely mixed up

Although there are various points to criticize, I think it's fair to say that the accuracy of color coordination is quite high.

Since it seems to understand "how to make it look good," it's good at adjusting complementary colors, contrast, and balance as part of the composition, which are the basics of illustration techniques.

In fact, if it can output something close to what I imagine, I can use it as a reference as is, so it seems useful enough for adjustments and client consultations at the rough stage.

Alternatively, I felt it was very useful for simulating the finished form: checking at the concept stage, before the rough, whether it's okay to finish a piece a certain way, or having it apply temporary colors to check that the result matches the image I have in mind.

Summoning Ritona

After playing around with it for a while, we decided to have the AI generate Ritona herself. To generate a specific character like Ritona, the AI needs to learn Ritona’s character design.

One problem with this learning process is that the illustrations of Ritona prepared for learning are not consistent. Usually, it is best to prepare several images of the same character design with "different angles," "with backgrounds," etc. for learning, but unfortunately there are fewer illustrations of Ritona with a unified design than I thought.

There are illustrations of Ritona in Guardian's outfit, travel outfit, StP design, those with added arrangements in Patreon illustrations, etc. Moreover, since my drawing style has changed from before, even my bad old habits are included in the learning targets.

Reproducing "Ritona drawn by Konatsu Hare" is quite challenging, and I couldn't get the Ritona I wanted on the first try, so I had to repeatedly pick the closest Ritona from the gacha and teach the AI, "This is Ritona."

I needed to strongly say "Please make her flat-chested" because the AI always wants to make the chest bigger, but after repeating it many times, it started to generate images with an average score of 50-60.

Unfortunately, the facial expressions rarely suit the character, so only about one image out of 100 generated is usable. Generating over and over eats up time quickly, so in the end I wonder if it would be faster to draw it myself... but if it's properly taught, it can be a powerful weapon.

▼ A Ritona that was finally created from hundreds of generated images

Retouching

Ritona's hair features seem to have been learned beautifully in general, so there is not much to fix about the hair drawing, which is convenient.

However, the facial features are not stable because my style has changed over the years. The expressions are also not what I imagined, so it should be assumed that the face needs to be redrawn.

▼ The red part (mainly the entire face) needs to be redrawn because it is a mess. The green part is not as messed up, but still needs to be polished up as an illustration.

There is still room for improvement because it looks like AI's art rather than mine, so I will fill in that part.

▼ Various modifications and additions

So, I added some touches as if I were doing my usual finishing work.

Through the retouching work, I realized that even if I just want to change one part, I can't change it much because I have to consider the balance of the whole. Sometimes, I have to adjust it to match the AI's art style. I would have preferred Ritona's chest to be a little smaller.

Regarding color schemes, AI aims to produce illustrations that have an appealing look, so when the training data is insufficient or incomplete, it may be necessary to generate images with low contrast and less intense colors to allow for further adjustment. However, the best approach depends on your individual preference and the style of the artwork. Therefore, there is no single correct answer.

Lastly

Although AI-generated illustrations have raised some concerns regarding copyright permission, I cannot imagine a future where this technology entirely disappears due to opposition. While these legal and ethical issues are crucial, they must be resolved as part of the technological advancement process. In the meantime, artists and designers can and probably should leverage this technology and its vast potential to create something unique and inspiring.

Putting aside legal issues, I find this technology very interesting and feel the potential of it assisting in my creations.

When I can't express myself well in English, or when I cannot make a drawing due to time constraints or when I lack a certain idea...the power of AI can help me fill those gaps, and I feel relieved knowing that I can depend on it.

Although copying someone's art style can be demoralizing for the artist, technological innovation continues to push boundaries and create new opportunities. While ethical and legal issues need attention, we should appreciate the benefits of AI-generated art and other technological advancements.

That's why, in the not-too-distant future, it may become a very challenging world for illustrators who are only commissioned to draw requested illustrations.

On the other hand, for small teams like AiD, AI technology can speed up the production process and help us reach the desired quality level faster.

Having more options, such as the use of AI, is truly a blessing, particularly for small teams struggling to find helpers. When working with a small team and striving to achieve something significant, the workload can become overwhelming and unmanageable, leading to delays or even to tasks being deemed impossible. In game development, where multiple tasks such as implementation and animation need to be completed, AI-generated materials can be a valuable asset, making it an appealing option to consider.

I feel both excitement and fear for what the future holds in this world. I hope that there is a future where we can coexist, but we cannot predict what will happen. I will continue to think carefully and observe this trend.

*The article above has been translated to English by ChatGPT 3.5 and has been lightly edited by Munisix.

Addendum: About AI-generated art

Munisix here, writing this part of the article much closer to the publish date. It's been a while since we started experimenting with AI technology and I have a few random thoughts I wanted to post here as well:

1) It looks easy... but it’s kinda not.

Above: Broken Ritonas where proper prompts aren't being used or are missing.

While generating random, aesthetically pleasing images is pretty straightforward, creating specific artworks/materials that meet particular criteria (such as those used in video games) can be pretty challenging. It's possible that I just still suck and every experimental step seems like a chore. However, given that job titles like "prompt engineer" have started to crop up, it's clear that making AI do precisely what we want can present a significant challenge.

You can get ridiculously high quality results in an instant... but they just aren’t Ritona.

2) It all of a sudden forces you to know exactly what you want, right this moment (and this is more of a problem than it sounds)

This is kind of the existential crisis aspect that I’ve experienced about AI Art (and it's not about stealing other people's work or whatever). Because it’s now easier than ever to produce above-average art pieces with AI, knowing exactly what you want is more important than ever. This means that you need the tenacity and ability to say “no” to the hundreds if not thousands of 7~9 quality results... and a lot of them might even be better than your initial idea. At first you can dismiss it with a whimsical “oh I never thought of that, that actually sounds cool”... but when this happens over and over, what are you even making anymore? Traditionally speaking, you hone your craft over thousands of hours of practicing and training. When you’re drawing, writing, singing, whatever, you’re not just leveling up your craft, you’re honing the ability to know what’s you and, more specifically, what isn’t you.

There's a certain sense of irony and romance in how AI-generated art is born of noise, honed from an infinite range of possibilities.

No matter how advanced technology becomes, rising above the sea of mediocrity will always remain a challenge, and AI cannot do that for you — at least, not yet. If you're able to succeed, it's because you were meant to and your efforts have paid off.

I recall a time when artists did what they loved because they simply wanted to. They could be likened to a tree falling in the forest when no one is around, but they continue to create noise because damn it, that's what they want to do. However, this way of thinking never mixed well with capitalism. Society will always place a monetary value on individuality and artistic expression in relation to supply and demand which is understandable for now. Unfortunately, as social media has completely taken over the internet, people's perception of art has cemented; numerical representation of value has become the measure of success. But we probably shouldn't forget that artists, at their core, should do what they want to because they fucking want to. Not because AI can be faster or better. And if you have this mentality, AI will never be an enemy, only a tool to get what you want out of yourself in whatever manner you please.

I don't know... this became more of a ramble than a coherent thought. I just wanted to reiterate that this is an amazing tool that can potentially change the way small teams create things. And with that, I leave a few more Ritonas made in Stable Diffusion that line up with the actual Ritona. Enjoy!

And that's kind of about it... There are dozens of other kinda cool ones but it’s just not Ritona and I think it’s our responsibility as the first party to not show things that are too far off from the character. One of the biggest problems I'm having is that she keeps fucking smiling. She smiles too god damn much when Ritona is a character that rarely should smile. It also keeps spitting out crazy amounts of borderline porn even when we tell it not to. The model should be able to treat NSFW as a negative prompt but it just doesn't want to, and it's all our (humanity's) fault. The power of porn is too strong, another thing I find hilarious and ironic.

But in time... even these issues should be taken care of.

Next is creating animations with them...! Hopefully with 0 flickering that we can actually use in game. More detailed articles to come.

Jesus, this became one long ass post. If you made it this far, thank you for reading.

-Munisix