Week 7


In this study, the focus is not on the gender of the avatar but rather on the style of its representation. The avatars explored fall into four distinct categories based on their visual and behavioural characteristics:

  • hyper-realistic avatar: a near-exact representation of a human, with details so precise that it is hard to tell apart from a real person.
  • digital human: an animated, human-looking avatar that mimics the appearance and behaviour of a real person.
  • robot avatar: a mechanical, robot-like representation that retains anthropomorphic features, especially facial features such as the eyes and mouth.
  • abstract entity: an avatar that conveys anthropomorphism primarily through its voice rather than through visual cues.

Avatar Design and User Personas:

  • For Olivia, Halo’s non-human design helps prevent false assumptions of empathy or emotional understanding. Its neutral appearance and functional tone align with her expectations of a reliable assistant, focused on clarity and usefulness rather than emotional performance.
  • Lea’s human-like design reinforces Mark’s trust and belief in AI’s emotional capacity and intelligence, creating an interaction that feels intuitive and engaging to a user unaware of the system’s underlying limitations.
  • Sam’s cartoonish, friendly, and emotionally expressive design aligns closely with Emily’s desire for a nurturing and empathetic interaction. By resembling a trusted friend, Sam supports Emily’s expectations of emotional connection, making the AI feel approachable, supportive, and personally engaging.

To create videos of the talking avatars from static images, I used three different online tools. Three factors drove this choice:

  • Usage limitations: Most tools were only available for one-time or limited free use, which required me to combine multiple platforms to complete all avatar videos.
  • Variety of visual representations: Each tool offered different avatar styles, such as hyper-realistic, cartoon, or robotic.
  • Voice matching: I needed to find voices that aligned naturally with each avatar’s appearance and behaviour. Different tools provided different voice options, helping me achieve a better visual–auditory match.

Each avatar was made with a different tool:

  • Sam (cartoon-style) was created using HeyGen.
  • Squidji (robot) was made using Vidnoz.
  • Lea (human-like avatar) was developed using VisionStory.

I went through several rounds of editing to get each avatar’s look and behaviour just right, especially to make sure the lip-syncing was accurate and the voice felt natural for the face.

Initially, I generated the image of Sam using DALL-E; however, I couldn’t animate it with the online tools, so I used an image available in HeyGen instead.

Similarly, I generated an image of a friendly robot using ChatGPT and tried to animate it. However, the tool detected the wrong area as the mouth of the robot. This is an example of the limitations of these online tools.

I then generated another image of a robot using Image Playground on Mac, and animated it for the final avatar.

I built the fourth avatar, Halo, which represents an abstract and fictional AI entity, using p5.js. I used a text-to-speech model on the Groq playground to generate the AI’s voice.

Rather than relying on geometric rigidity, I designed an organic flow by layering multiple distorted rings. The shapes respond in real time to the voice’s volume, making the form pulse and shift in a fluid, lifelike way. This gives the impression of a live, breathing entity rather than something predictable or robotic.
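The volume response described above can be sketched in plain JavaScript. This is a minimal illustration, not the code of the actual sketch: the helper names `smoothLevel` and `radiusForLevel`, the smoothing factor, and the pixel values are all assumptions. In a p5.js sketch the raw level would typically come from the p5.sound library (for example `getLevel()` on a `p5.Amplitude` object, which returns a value between 0 and 1); here a short array of made-up levels stands in for the audio.

```javascript
// Hypothetical helper: exponential smoothing of a raw volume level (0..1).
// A small factor makes the shape react slowly; a large one makes it jumpy.
function smoothLevel(previous, current, factor = 0.2) {
  return previous + (current - previous) * factor;
}

// Hypothetical helper: map a smoothed level to a ring radius in pixels.
// baseRadius and maxGrowth are illustrative values, not taken from the sketch.
function radiusForLevel(level, baseRadius = 80, maxGrowth = 60) {
  const clamped = Math.min(Math.max(level, 0), 1);
  return baseRadius + clamped * maxGrowth;
}

// Simulate a few animation frames with made-up raw levels:
// silence, a loud burst for two frames, then a quieter tail.
const rawLevels = [0, 0.8, 0.8, 0.1, 0.1];
let smoothed = 0;
const radii = rawLevels.map((raw) => {
  smoothed = smoothLevel(smoothed, raw);
  return radiusForLevel(smoothed);
});

console.log(radii);
```

Because each frame only moves part of the way toward the new level, the radius swells during the loud frames and then eases back, which is what produces a pulsing rather than a flickering shape.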

The first iteration was too vague, so I decided to add a specific shape instead of rings to make the avatar unique and memorable. https://editor.p5js.org/lamita/sketches/Sxi-1vVql

For the final design, I created more of a sine wave shape with the help of this online sketch:

https://editor.p5js.org/lamita/sketches/U2q220Wlp
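As a rough illustration of a voice-driven sine wave, the following plain JavaScript computes the points of a wave whose height follows the volume. It is only a sketch under assumptions and does not reproduce the linked sketch: the function name `wavePoints` and all of its parameters (width, number of cycles, maximum amplitude) are hypothetical.

```javascript
// Hypothetical helper: points of a sine wave whose height follows the volume.
// level is a volume value in 0..1; phase can be advanced each frame to animate.
function wavePoints(width, level, phase = 0, cycles = 3, maxAmplitude = 50, steps = 100) {
  const amplitude = Math.min(Math.max(level, 0), 1) * maxAmplitude;
  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps; // position along the wave, 0..1
    points.push({
      x: t * width,
      y: amplitude * Math.sin(2 * Math.PI * cycles * t + phase),
    });
  }
  return points;
}

// A silent frame gives a flat line; a louder frame gives a taller wave.
const quiet = wavePoints(400, 0);
const loud = wavePoints(400, 0.5);
console.log(quiet.length, loud.length);
```

In a p5.js sketch, points like these could be drawn with `beginShape()`, one `vertex()` call per point, and `endShape()`.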

Key visual elements:

  • Amplitude-driven movement: The radius of the rings scales with the smoothed audio volume, giving the avatar a voice-driven presence.
  • Perlin noise distortion: I applied noise-based displacement to each ring’s points, making them appear soft and organic rather than mathematically perfect.
  • Gradient colouring: A soft gradient shifts across the rings, reinforcing the non-linear, evolving feel of the avatar.
  • Multiple rings: Using concentric shapes adds visual depth and reinforces the idea that this AI has complexity and emotional nuance.
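The elements listed above can be combined roughly as follows. This is a minimal sketch under stated assumptions, not the original sketch's code: `ringPoints`, `ringColour`, the colour values, and the ring spacing are hypothetical. So that it runs outside the p5.js editor, it uses a simple sine-based stand-in where a p5.js sketch would pass the built-in `noise()` function, which returns values between 0 and 1.

```javascript
// Stand-in for p5.js noise(): smooth, deterministic, returns values in 0..1.
// This is NOT Perlin noise; inside a p5.js sketch, pass the global noise instead.
const standInNoise = (x, y, t) => 0.5 + 0.5 * Math.sin(1.7 * x + 2.3 * y + t);

// Hypothetical helper: points of one ring, pushed in or out by a noise value.
function ringPoints(radius, ringIndex, time, noiseFn = standInNoise, steps = 64, wobble = 20) {
  const points = [];
  for (let i = 0; i < steps; i++) {
    const angle = (i / steps) * 2 * Math.PI;
    // Sample the noise along a circle so the ring closes without a seam.
    const n = noiseFn(Math.cos(angle) + 1 + ringIndex, Math.sin(angle) + 1, time);
    const r = radius + (n - 0.5) * 2 * wobble; // displacement in [-wobble, +wobble]
    points.push({ x: r * Math.cos(angle), y: r * Math.sin(angle) });
  }
  return points;
}

// Hypothetical helper: linear gradient between two RGB colours across the rings.
function ringColour(ringIndex, ringCount, from = [120, 80, 255], to = [80, 220, 255]) {
  const t = ringCount > 1 ? ringIndex / (ringCount - 1) : 0;
  return from.map((c, k) => Math.round(c + (to[k] - c) * t));
}

// Build four concentric rings; in the avatar the base radius (80 here)
// would come from the smoothed audio volume instead of being fixed.
const ringCount = 4;
const rings = [];
for (let k = 0; k < ringCount; k++) {
  rings.push({
    colour: ringColour(k, ringCount),
    points: ringPoints(80 + k * 25, k, 0),
  });
}

console.log(rings.map((ring) => ring.colour));
```

Advancing the `time` argument a little each frame makes the distortion drift, so the rings never settle into a fixed outline.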

This combination of real-time responsiveness and organic motion was intended to evoke a sense of intelligence and emotion without relying on facial features.

