Optimal preset selection and Transpose use for voice transformation

How to work with SoundID VoiceAI models and choose the optimal preset for voice transformation in your project.



Preset preview

SoundID VoiceAI features 23 voice and 21 creative (instrument) models at launch. Each model offers a quick description of the voice or instrument, along with its best input pitch.



Selecting the voice preset

  1. Click on Voices or Creative to select the target voice or instrument preset
  2. Click on '▶' ("play") to preview how the preset sounds at its best vocal range
  3. If your source pitch is similar to the preset preview, proceed to Start processing
  4. If the results sound too high or too low, use Transpose to adjust the output pitch by semitones, and process again
  5. Use the AI voice button to Enable/Disable the transformation on the track


Before committing to process the entire track, it's a good idea to highlight and process a smaller section of the track first and ensure the results sound good. Processing takes approximately 2.5x the time of the captured audio duration. It is possible to Reprocess the results for free (limited to 10 times per hour) to minimize excessive artifacts.


1 minute of audio processing costs 600 tokens. The token amount needed for processing will always be displayed on the Start processing button. You can check your balance in the plugin, or in your Sonarworks Account. Learn more about tokens below.
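The figures above (600 tokens per minute of audio, and processing taking roughly 2.5x the audio duration) can be turned into a quick estimate before you commit to a long track. This is a minimal sketch only: the function names are hypothetical, and it assumes the cost scales linearly with duration and is rounded up to a whole token, which the article does not confirm. The Start processing button remains the authoritative source for the actual cost.

```python
import math

TOKENS_PER_MINUTE = 600       # from the article: 1 minute of audio costs 600 tokens
PROCESSING_TIME_FACTOR = 2.5  # from the article: processing takes ~2.5x the audio duration


def estimate_tokens(duration_seconds: float) -> int:
    """Estimate the token cost of a selection.

    Assumption: linear pricing, rounded up to a whole token.
    """
    return math.ceil(duration_seconds * TOKENS_PER_MINUTE / 60)


def estimate_processing_seconds(duration_seconds: float) -> float:
    """Estimate the approximate processing time for a selection."""
    return duration_seconds * PROCESSING_TIME_FACTOR


if __name__ == "__main__":
    selection = 15  # seconds of audio highlighted for a test pass
    print(estimate_tokens(selection), "tokens")
    print(estimate_processing_seconds(selection), "seconds of processing")
```

Under these assumptions, a short test selection costs a fraction of the full-track price, which is one reason to process a small section first.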


Note: Learn more about optimal preset selection and Transpose use below.





Input pitch

Each preset lists its best input pitch. If that pitch matches your input source, you can leave the Transpose setting at 0, as no correction should be needed. Otherwise, apply the pitch correction before processing the audio for the best match.


As an example, suppose the original input is in F#3 and you want to use Rebecca's voice preset, whose suggested best pitch is F#4. Because F#3 is one octave below F#4, you would need to transpose up 12 semitones (one octave) to reach F#4.
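The same arithmetic works for any pair of note names. The sketch below is illustrative and not part of the plugin: the helper names are hypothetical, and it assumes the common convention where C4 maps to semitone number 60. Only the difference between the two notes matters, so the result does not depend on that convention.

```python
# Semitone offset of each natural note above C within one octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_semitone(note: str) -> int:
    """Convert a note name such as 'F#3' or 'Bb4' to a semitone number (C4 = 60)."""
    letter = note[0].upper()
    rest = note[1:]
    accidental = 0
    # Consume any sharps (#) or flats (b) that follow the letter.
    while rest and rest[0] in "#b":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    return 12 * (octave + 1) + NOTE_OFFSETS[letter] + accidental


def transpose_for(input_note: str, best_input_note: str) -> int:
    """Semitones to add so the input pitch lands on the preset's best input pitch."""
    return note_to_semitone(best_input_note) - note_to_semitone(input_note)


if __name__ == "__main__":
    print(transpose_for("F#3", "F#4"))  # the example from the text
```

A positive result means transposing up, and a negative result means transposing down.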






The primary use case for SoundID VoiceAI is transforming a singing voice into a realistic singing voice of another human being. Ideally, the original input should match the best input pitch - see the preset descriptions for what recorded audio pitch will generate the best results. If the natural vocal range difference is significant between the input audio and the applied preset, pitch adjustments can be made with the Transpose feature. 


Transpose adjusts the pitch of the generated audio in semitones (half steps). Twelve steps of the Transpose parameter correspond to one octave. Transpose can be adjusted by up to +/- 4 octaves (48 steps up or down). If the Transpose value is left unaltered, the pitch remains the same.
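To make the relationship between steps and octaves concrete, the sketch below converts a Transpose value into a frequency ratio and keeps it within the +/- 48 range described above. It assumes standard equal temperament, where each semitone multiplies the frequency by the twelfth root of 2. The function names are hypothetical, and this does not describe how the plugin shifts pitch internally.

```python
MAX_TRANSPOSE = 48  # from the article: +/- 4 octaves, i.e. 48 steps up or down


def clamp_transpose(semitones: int) -> int:
    """Keep a requested Transpose value within the supported +/- 48 range."""
    return max(-MAX_TRANSPOSE, min(MAX_TRANSPOSE, semitones))


def frequency_ratio(semitones: int) -> float:
    """Frequency multiplier for a shift in semitones (equal temperament assumed)."""
    return 2.0 ** (semitones / 12)


if __name__ == "__main__":
    for steps in (0, 12, -12, 24):
        print(steps, "semitones ->", frequency_ratio(steps), "x frequency")
```

A value of 12 doubles the frequency (one octave up), -12 halves it, and 0 leaves it unchanged.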



Achieving optimal results becomes more straightforward and efficient when certain parameters are considered, particularly when a project is fixed to a specific key. Before processing a vocal track, we recommend taking the following steps:

  • Preview the preset by clicking on '▶' (play button).
  • Evaluate the best input pitch to find a suitable preset without Transposing the output pitch.
  • Use Transpose according to the preset model's vocal range:
    • If the target preset sings in a higher pitch than your input voice track, increase the value of the Transpose parameter.
    • If the target preset sings in a lower pitch than your input voice track, decrease the value of the Transpose parameter.
  • Process a small section and evaluate the results before committing to process the entire track.


Note: Transpose values below -12 or above +12 might produce unexpected results. Using Transpose with Drums has little impact on the overall sound and is not advised.



By default, an additional Auto-transpose feature is enabled. When it is active, the Transpose knob is unavailable for adjustments, and the plugin automatically detects and applies the optimal Transpose value for the combination of the captured audio and the applied preset.

  • For Voice presets, the auto-transpose values can be -12, 0, or +12
  • For Creative (instrument) presets, the auto-transpose values can be -24, -12, 0, +12, or +24
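The article does not describe how the plugin detects the optimal value, so the following is only an illustrative sketch of one plausible approach: from the allowed octave shifts listed above, pick the one that brings the input pitch closest to the preset's best input pitch. The function name, the use of semitone numbers (C4 = 60) as inputs, and the tie-breaking rule (prefer the smaller shift) are all assumptions.

```python
VOICE_AUTO_VALUES = (-12, 0, 12)              # from the article, Voice presets
CREATIVE_AUTO_VALUES = (-24, -12, 0, 12, 24)  # from the article, Creative presets


def pick_auto_transpose(input_pitch: int, best_pitch: int, allowed) -> int:
    """Pick the allowed shift that moves input_pitch closest to best_pitch.

    Pitches are semitone numbers (assumed convention: C4 = 60).
    Ties are broken in favour of the smaller shift (an assumption).
    """
    return min(
        allowed,
        key=lambda shift: (abs(input_pitch + shift - best_pitch), abs(shift)),
    )


if __name__ == "__main__":
    # Input around F#3 (54), preset best pitch F#4 (66), Voice preset.
    print(pick_auto_transpose(54, 66, VOICE_AUTO_VALUES))
```

For finer adjustments than whole octaves, manual Transpose is the option the article provides.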


To switch back to manual Transpose adjustments, disable the 'Auto' checkbox - manual adjustments will become available again (by default, the last set value of Auto-transpose will be retained).






Creative presets

With the Creative presets you can transform humming and beatboxing into tracks that sound like instruments, discover new ways of generating sounds and melodies, and create demo songs quickly. Here are some ideas to consider:

  • Mimic instruments with your voice and transform vocal inputs into realistic instruments, either to transfer melodic ideas quickly into a DAW or for creative sound generation.
  • Turn beatboxing into drums. Record a few bars of beatboxing to create a drum track.
  • Transform existing instrument tracks. Convert your guitar solo into a saxophone solo, use your guitar to create a realistic bass guitar track, use a trumpet track to harmonize and create an entire brass section of various instruments, and much more.
  • Use virtual instruments for creative AI processing.



Can I create my own voice and instrument presets?

No, it is not possible to create your own presets with the SoundID VoiceAI plugin. 
