
Tuesday, August 05, 2014

Comments


Metacam Oh

Color me impressed.

GoSpeed Racer

It looks pretty rough right now. Her head is tilted back a bit too far, so we don't see her face full on. The eye movement was disconcerting too. But overall, it was impressive compared to SL. Given time this will look awesome.

melponeme_k

Still looks awful. Doesn't matter how advanced it is. Awful avatar. Awful expression capture with goggle eyes and a slack jaw.

Ciaran Laval

Looks a tad funny, especially the eyes.

However, bloody marvellous voice.

Stroker Serpentine

I can make that body move!

Jo Yardley

I was more impressed with the avatar singing Queen.

Tracy RedAngel

Was the avatar singing Bohemian Rhapsody?
The avatar is in the rough stages (although Stroker seems quite attracted to it *cough*) but it's pretty impressive nonetheless.
I think Linden Labs has their work cut out for them.

ColeMarie Soleil

As a vocalist, I think this technology would be amazing if the real-time capture were truly real-time. Even the slightest animation delay makes it feel more like watching an avatar sing karaoke than an avatar singing in sync with the music. If you want it to be immersive, you're going to have to eliminate that delay altogether for a truly worthwhile, believable experience. We're a long way from that at the moment, but give it two years. Yay Oculus teams!

A.J.

Impressive technology.

But is it what people really want from their virtual world experience?

High Fidelity and Oculus Rift both seem very demanding of your attention and very invasive of your real-life existence.

Right now, we're still at the stage of wearing handicapping contraptions and animating ugly cartoons.

Missing the "wow" factor.

DrFran

We went from "crank the car, remove the crank, get in, and hope for the best" to "press a button on your key, the car starts, get in and drive away."

Technology is rough; then smooth. The folks at HiFi are visionaries with an attainable goal: create an immersive, low-latency world that raises the bar from what's available now. Better yet, from my standpoint, it will still allow users to create.

Carlos Loff

If HF avis are all going to look like this, I will surely not join in. Where is the FIDELITY???

joe

chuckle

Arcadia Codesmith

I really don't mind the cartoony, stylized approach... as proof of concept. Get that same latency with a realistic, detailed contemporary avatar, and my interest will be more than academic.

On a side note, EverQuest II did facial capture a few years back with the SOEmote feature. I never played with it much, so I don't know how well they did with the lip-sync. Put a voice morpher in the loop, and I would imagine that the timing gets a little dicey. I would trade a little more latency for a rock-solid sync.

2014

Great voice; the avi is about two steps below IMVU.

Metacam Oh

Here's a thought: why couldn't HiFi delay the audio stream by a fraction of a second or two to compensate for the minor delay in lip sync?

Debs Regent

I hear all the naysaying here. But what you're really looking at is advanced tech. The only reason the avatar looks this way is because this is what the creator chose for it. Think liquid mesh, full materials, great lighting, true gravity. For a platform still in alpha, I've seen far worse.
It's not what you see - because that's restricted to the people involved. It's what's possible (for both good and ill unfortunately) - like Second Life, where the good go to create & build, the bad follow to steal & destroy.

Arcadia Codesmith

Metacam, I can't say for certain, but I think that'd only work if you could precisely measure the gap between audio and video processing... which is probably not consistent from second to second (hence Philip throwing a lot of qualifiers like "roughly" and "about").

I can think of approaches to analyze and match the mouth shape with the phonemes from the audio stream, but I don't know how much processing overhead that would add when you're measuring in milliseconds. "Good enough" might be as good as it gets.
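
As a rough sketch of the delay-compensation idea Metacam and Arcadia are discussing: buffer the outgoing audio by a smoothed estimate of the capture-to-render latency, so the voice stays lined up with the face even as that latency wobbles. This is illustrative Python, not anything from High Fidelity's actual code; the latency measurement and all the names are assumptions.

```python
import collections
import time

class AudioDelayBuffer:
    """Hold audio frames back by a smoothed estimate of the
    face-capture-to-render latency so lips and voice stay aligned.
    All names here are illustrative, not High Fidelity's API."""

    def __init__(self, smoothing=0.1):
        self.queue = collections.deque()   # (release_time, audio_frame)
        self.latency_estimate = 0.0        # seconds, exponentially smoothed
        self.smoothing = smoothing

    def report_latency(self, measured_latency):
        # Measured gap between capturing a video frame and rendering its
        # animation. It varies frame to frame, so smooth it instead of
        # chasing every jitter spike.
        self.latency_estimate = ((1 - self.smoothing) * self.latency_estimate
                                 + self.smoothing * measured_latency)

    def push(self, audio_frame):
        # Schedule this audio frame for release once the matching facial
        # animation should have caught up.
        self.queue.append((time.monotonic() + self.latency_estimate, audio_frame))

    def pop_ready(self):
        # Return all frames whose hold time has elapsed.
        now = time.monotonic()
        ready = []
        while self.queue and self.queue[0][0] <= now:
            ready.append(self.queue.popleft()[1])
        return ready
```

The catch, as Arcadia points out, is getting a trustworthy latency measurement in the first place; if the estimate is off, you've just traded lip lag for voice lag.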

Roblem Hogarth

Clearly you can get some good expression translation with good lighting conditions and some input smoothing from the raw camera data. Previous live demos suffered from lighting issues and unbuffered output making everyone want to barf into the uncanny valley.

As far as the comparison with SL lip sync goes, there isn't any. The Vivox software SL uses for voice only hands off a very basic "energy" variable via the ParticipantPropertiesEvent. The best you are ever going to get with that is a basic indication of how much you should morph the mouth open.

I know people go on about phoneme detection and puppeting, and when you can't see the lips, it's pretty good. But for a true performance you want as accurate a translation of the face as possible. For Alpha this is looking really good.
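
To illustrate the difference Roblem is describing, here is a toy Python sketch of roughly all you can do with a single loudness-style "energy" value: open the mouth more when the speaker is louder, close it when they go quiet. The function and parameter names are mine for illustration, not from Vivox or any viewer codebase.

```python
def mouth_open_from_energy(energy, prev_weight, attack=0.6, release=0.2):
    """Map a 0..1 speech 'energy' value to a mouth-open morph weight.

    With only a loudness value there is no information about *which*
    mouth shape to make, so every phoneme looks the same; the best you
    can do is smooth the jaw so it doesn't flap on every amplitude spike.
    Names and constants are illustrative, not from any real viewer.
    """
    target = max(0.0, min(1.0, energy))
    # Open quickly on loud input, close a little more slowly on silence.
    rate = attack if target > prev_weight else release
    return prev_weight + rate * (target - prev_weight)


# Example: a burst of loud speech followed by silence.
weight = 0.0
for energy in [0.1, 0.8, 0.9, 0.7, 0.0, 0.0]:
    weight = mouth_open_from_energy(energy, weight)
    print(round(weight, 2))
```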

Compcat

Something about it is vaguely creepy. I think it does skirt the edges of the uncanny valley a bit. I'd like to see a side-by-side comparison with her actual face, to see how well it's being tracked. We can see a bit of this in Philip Rosedale's demonstration at SVVR9: http://youtu.be/gaWacrQuEcI?t=42m40s It seems like the avatar has a tendency to smile too much, a lot more than the source, which I think might also be going on here.

Also, I don't really understand how this is supposed to work with the Rift? Will it primarily just track the mouth?

Partytime

Real-time facial animation was presented for the Unity game engine a year ago. The exact same tech, only more advanced. It's shown in this video at minute 18:30: https://www.youtube.com/watch?v=vjgSbX28Qz0

What is not there is the streaming part, but that could be added by two or three student coders from a tech university in about a month or two. Streaming voice or streaming the captured animation data is really not that hard: you just send the packets with the data from the server to the client. The Unity presentation is worth watching, as the tech shown there gives you a deeper insight.
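
As a very rough illustration of "just send the packets": a Python sketch that packs per-frame blendshape weights into a small binary message and pushes it over UDP. The wire format, port, and blendshape list are all made up for the example; a real system would also handle timestamps, dropped packets, and interpolation on the receiving side.

```python
import socket
import struct

# A made-up wire format for one frame of facial capture data:
# a frame counter followed by a fixed list of blendshape weights.
BLENDSHAPES = ["jaw_open", "smile_l", "smile_r", "brow_up", "blink_l", "blink_r"]
FRAME_FORMAT = "!I" + "f" * len(BLENDSHAPES)   # network byte order

def pack_frame(frame_number, weights):
    """Serialize one captured frame (dict of blendshape -> 0..1 weight)."""
    values = [weights.get(name, 0.0) for name in BLENDSHAPES]
    return struct.pack(FRAME_FORMAT, frame_number, *values)

def unpack_frame(payload):
    """Deserialize a frame back into a dict on the receiving client."""
    unpacked = struct.unpack(FRAME_FORMAT, payload)
    return unpacked[0], dict(zip(BLENDSHAPES, unpacked[1:]))

# Sender side: push each captured frame out as it arrives.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
frame = pack_frame(42, {"jaw_open": 0.7, "smile_l": 0.2, "smile_r": 0.2})
sock.sendto(frame, ("127.0.0.1", 9000))   # address/port are just examples

# The receiver would call unpack_frame() on each datagram and feed the
# weights to the avatar's face rig, interpolating between frames.
```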

Looking a bit deeper, this showed up: https://www.youtube.com/watch?v=NFBv_ypyhiA
Live motion capture data was being streamed into Second Life back in 2010, and nobody blinked when it became possible; nobody used it or saw potential in it.

There are a lot of really impressive gadgets and tech around these days, particularly in Unity, because Unity is free and all the students experiment with it.

Uccie Poultry

Very nice. What about Sign Language support? Text-to-speech? The deaf/HOH/mute community could be well-served by this technology.

