Check if a speech file is below a certain threshold in dB/volume

Started by AndreasBlack, Mon 25/09/2023 12:35:07


AndreasBlack

What I mean is when the speech file is "silent" or almost silent, since most people probably aren't obsessed audio freaks like me. Is there such a thing? I want a 'pseudo lip sync' module, because it would look much better for pixel art. The goal is to have it detect the slight pauses between sentences in the speech file, immediately switch to a closed-mouth frame via repeatedly_execute, and then restart the speech animation once there's audible speech in the audio file again.

I know AGS recognises that an audio file is playing and that you can stop it, resume it, etc., but I can't seem to find anything other than that in the manual. Something like "IsAudioSilent" sounds stupid, but you get the point! (laugh)


eri0o

I don't think there is any way to read the volume level in AGS Script, of the sort that would let you build something like an equalizer. But for the lip sync you want, have you tried using Rhubarb paired with the TotalLipSync module from Snarky?

AndreasBlack

Quote from: eri0o on Mon 25/09/2023 13:09:03
I don't think there is any way to read the volume level in AGS Script, of the sort that would let you build something like an equalizer. But for the lip sync you want, have you tried using Rhubarb paired with the TotalLipSync module from Snarky?

Unfortunately, yes! It looks OK, and I'd probably use it in the future if my feature request (I guess that's what this is) isn't easy to implement any time soon. I really don't think I need the full lip sync feature. Well, it "kinda would be a lip sync feature", but don't get me wrong: what I want is a continuously looping mouth movement that is only closed, aka "synced", when there's no sound (no sound = closed-mouth frame). What I've done so far is manually "lip sync" parts by creating various speech views and trying to line them up with the sound in a dark room with the character. It takes ages, although it looks much better to my eyes! (nod)

Do you think it's possible to add this to AGS in the future? Audio detection, perhaps using some form of freeware noise gate behind the scenes. This would improve AGS's own speech function.

Then the user could perhaps set their own threshold: AudioIsBelowSpeechlevel(-20); would mean -20 dB, since voices in games, for example, seem to live around -12 dB to, say, -6 dB (I'm pulling these numbers out of my ass, so they might not be 100% correct). Perhaps I should have posted this as a feature request, but anyway, here's the dream in pseudocode:

Code: ags


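// Pseudocode: AudioIsBelowSpeechlevel() and KeepSpeaking() do not exist in AGS,
// they are the functions I'm wishing for.
function repeatedly_execute_always()
{
  if (player.Speaking && AudioIsBelowSpeechlevel(-20)) // -20 = threshold in dB
  {
    player.Frame = 0; // closed-mouth frame of the speech view
  }
  else
  {
    player.KeepSpeaking(); // something like that...
  }
}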

Crimson Wizard

Adding a property that returns the current level of the audio being played may be feasible as a low-level API. In fact, even returning the sound wave as an array may make sense (similar to the suggestion of returning the pixel array of a sprite).
(EDIT: if I were designing the API, I'd probably add this to AudioClip. Although, since audio playback calculates position in ms while levels are per sample, there would likely have to be a distinct property/function that returns an average of all samples for the given millisecond, or something like that.)

But I'm generally against adding complex built-in behavior based on this.

However, there may be another solution: use some tool to generate a sound timeline with these levels, put it in your game as custom data, and use that in script instead. That way no additions to the engine would be necessary.
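
A minimal sketch of such a tool in Python, under stated assumptions: the voice clips are available as 16-bit PCM WAV files, and the function name `level_timeline`, the 50 ms window, and the -20 dB threshold (the guess floated earlier in the thread) are illustrative choices, not anything AGS or an existing module provides. It computes the RMS level of each window in dB relative to full scale and flags the windows that fall below the threshold.

```python
import math
import struct
import wave


def level_timeline(path, window_ms=50, threshold_db=-20.0):
    """Return a list of (start_ms, level_db, is_silent), one entry per window.

    Assumes a 16-bit PCM WAV file; other formats need converting first.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        frames_per_window = max(1, rate * window_ms // 1000)
        timeline = []
        start_frame = 0
        while True:
            raw = wav.readframes(frames_per_window)
            if not raw:
                break
            count = len(raw) // 2  # number of 16-bit samples, all channels
            samples = struct.unpack("<%dh" % count, raw)
            # RMS over the window, relative to full scale (32768)
            mean_square = sum(s * s for s in samples) / count
            rms = math.sqrt(mean_square) / 32768.0
            level_db = 20.0 * math.log10(rms) if rms > 0 else float("-inf")
            start_ms = start_frame * 1000 // rate
            timeline.append((start_ms, level_db, level_db < threshold_db))
            start_frame += len(raw) // (2 * channels)
    return timeline
```

The resulting list could be written out as a plain text file (one line per window) and shipped with the game, so that script only has to look up whether the current playback position falls in a silent window. The right threshold and window length would need tuning per recording.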

Snarky

Quote from: Crimson Wizard on Mon 25/09/2023 18:03:24
However, there may be another solution: use some tool to generate a sound timeline with these levels, put it in your game as custom data, and use that in script instead. That way no additions to the engine would be necessary.

That is essentially what eri0o suggested, using Rhubarb as the tool to identify when speech is happening in the voice clips and save this in a data file, and TotalLipSync as the script to read the data and control the animation.

Both are perhaps "overkill" in the sense that they do a lot more than is necessary for this particular case, but they should work. (One could make some fairly simple changes to TLS to make it play a looping animation instead of tracking individual frames whenever the mouth isn't closed. Or of course write the code from scratch.)

AndreasBlack

Wizard, I agree about AudioClip too! Perhaps you could call something like "getAudioLevel" and check the objects at the same time, for easier syncing, because it does take a bit of time to try and sync stuff up. I sat for hours trying to figure out a train effect and a *swoosh* sound for when a figure is dragging a rope, for example.

I have a big gate that opens, and for now I just move its parts behind a wall on opposite sides. It's a bit of a guessing game, judging how long the gate should drag along the floor before it's fully open and the sound ends.

It could save a couple of minutes. (nod)

Snarky, I'm going to try your tool again later tonight! I'm so fed up with my stupid manual syncing; it must be easier to create what I want with your tool. I have to try what I tried before but failed at: edit a dat file and figure out how the various speech animation timings that AGS offers would be simulated in it, in what order they should all come for the full animation to play through like it normally does without lip syncing, and of course their specific timings.

Then I could use the tool first, look for all the silences that are synced, take away the rest, and just copy/paste in the stored "full animation running" settings. Thoughts? Edit: The lip sync tool looks great for high-res games; I'm only talking about low-res pixel art games here, and yes, having full-on lip sync IS overkill.





AndreasBlack

Highly important update: actually, maybe it's case closed now! I'm trying Papagayo and it works WAY BETTER than I remembered. It was awful, and I got so pissed when I tried some years ago to line stuff up with the spectrum wave. I must have tried a buggy version, because it works fine now! Now it's actually possible to do the syncing all right (nod). Still, that "GetAudioLevel", and being able to set a value for the engine to look for, would be very useful indeed!

Snarky

Quote from: AndreasBlack on Tue 26/09/2023 10:34:24
Then I could use the tool first, look for all the silences that are synced, take away the rest, and just copy/paste in the stored "full animation running" settings. Thoughts? Edit: The lip sync tool looks great for high-res games; I'm only talking about low-res pixel art games here, and yes, having full-on lip sync IS overkill.

What I'd probably do to minimize the work would be to keep the Rhubarb files as-is without any manual editing, but map all the phoneme codes to the same frame number in AGS, namely 1 (so, whenever the character is not speaking, the frame is 0, and when it is speaking the frame is 1, regardless of which phoneme/mouth shape it has). Then I'd make a slight change to the TLS code, so that instead of setting the displayed frame directly, the frame number controls whether the speech animation is playing or not.

This would also work with Papagayo files, if you find that Rhubarb's automatic analysis is not sufficiently accurate.
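
The "every mouth shape maps to frame 1" idea can be illustrated offline with a short Python sketch. It assumes Rhubarb's TSV export, where each line holds a start time in seconds and a mouth-shape letter, with X as the rest shape; the function name `speaking_frames` and the sample cue data are made up for illustration. In the actual setup the mapping would be done inside AGS via TotalLipSync, which is not shown here.

```python
def speaking_frames(tsv_text, rest_shapes=("X",)):
    """Collapse Rhubarb-style cues into (start_seconds, frame) pairs.

    frame is 0 for rest shapes (mouth closed) and 1 for any other shape.
    Consecutive cues that map to the same frame are merged.
    """
    cues = []
    for line in tsv_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        start, shape = float(parts[0]), parts[1]
        frame = 0 if shape in rest_shapes else 1
        if not cues or cues[-1][1] != frame:
            cues.append((start, frame))
    return cues


# Hypothetical cue data in the assumed "time<TAB>shape" layout
example = "0.00\tX\n0.05\tD\n0.27\tC\n0.31\tB\n0.43\tX\n0.47\tX\n"
print(speaking_frames(example))
```

Each resulting pair marks a point where the speech animation should switch between "looping" (frame 1) and "mouth closed" (frame 0), which is all the information this pseudo lip sync needs.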
