I have now run the main.py file in Microsoft VS Code and got the following error:
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. (1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. (2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message. WeightsUnpickler error: Unsupported global: GLOBAL TTS.tts.configs.xtts_config.XttsConfig was not an allowed global by default. Please use `torch.serialization.add_safe_globals([XttsConfig])` or the `torch.serialization.safe_globals([XttsConfig])` context manager to allowlist this global if you trust this class/function.Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
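For option (2) in that message, a minimal sketch of the allowlisting call might look like the following. It assumes a PyTorch version that provides `torch.serialization.add_safe_globals` and an installed Coqui TTS package that exposes `XttsConfig` at the path named in the error. The helper name `allowlist_xtts_config` is made up for illustration and is not part of the tool. It would have to run before the model is loaded, and only makes sense if you trust the checkpoint.

```python
def allowlist_xtts_config() -> bool:
    """Allowlist XttsConfig for torch.load(weights_only=True).

    Hypothetical helper: returns True if the class was registered, False if
    torch or TTS is not installed, or if this torch has no add_safe_globals.
    """
    try:
        import torch
        from TTS.tts.configs.xtts_config import XttsConfig
    except ImportError:
        return False

    # Older PyTorch releases do not have this function at all.
    add_safe_globals = getattr(torch.serialization, "add_safe_globals", None)
    if add_safe_globals is None:
        return False

    # Tell the weights-only unpickler that this class is trusted.
    add_safe_globals([XttsConfig])
    return True
```

If a later run reports another "Unsupported global", the class named in that new message can be added to the same list in the same way.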
FileNotFoundError: [Errno 2] No such file or directory: 'Morrowind.json'
What am I doing wrong? I can't extract dialogue.
This is what I do:
1. Run Dialogue Extractor
2. Choose morrowind.esm
3. File "C:\Users\Lev\Desktop\MorrowindSpeechGenerator-main\tes3convsettings.py", line 24, in load_file
with open(f'{output_name}.json', 'r', encoding='utf-8') as json_file:
FileNotFoundError: [Errno 2] No such file or directory: 'Morrowind.json'
Do you have any lines appearing before that in the cmd window? They could tell us more about why Morrowind.json was not generated. If not, make sure you have the tes3conv.exe file in your folder.
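As a rough way to check both things at once, here is a small sketch. The function name `check_extraction_inputs` and both of its parameters are hypothetical; where tes3conv.exe and the JSON file actually live depends on your install, so pass the paths you expect.

```python
from pathlib import Path


def check_extraction_inputs(exe_path, json_path):
    """Return a list of problems; an empty list means both files exist.

    Hypothetical helper: exe_path is where you expect tes3conv.exe,
    json_path is where you expect the generated JSON (e.g. Morrowind.json).
    """
    problems = []
    if not Path(exe_path).is_file():
        problems.append(f"converter not found: {exe_path}")
    if not Path(json_path).is_file():
        problems.append(f"JSON was not generated: {json_path}")
    return problems
```

If the converter is reported missing, that would explain why the JSON never appears and `open()` then fails with the FileNotFoundError above.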
It generates all dialogue lines as audio files, including those not assigned to a particular NPC; those are generated for each race and sex. Particular NPCs with a unique voice, such as Vivec, are supported. Otherwise, the default voice is used for the generated audio files, for example for the merchant crab's dialogue.
Yes. I can't ignore it on my side: since the last version it sometimes happens on long sentences, and I still have to find out where the problem comes from. It has little impact on the quality of the generation, though.
ValueError: [!] Looks like you are using a multi-speaker model. You need to define either a `speaker_idx` or a `speaker_wav` to use a multi-speaker model. How do I fix this? When I try to launch Speech Generation I get this, even after having done Audio and Dialogue extraction.
Check whether you have .wav files in the folders under Speakers Path. This error means that no .wav files were loaded; the script retrieves them according to this tree structure: "Speakers Path/Race/Sex_Initial/*.wav". Check the contents of the folder specified for speakers path in the settings.
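To see quickly what is in that tree, a small sketch like the one below can help. It only assumes the "Speakers Path/Race/Sex_Initial/*.wav" layout described above; the function name `count_speaker_wavs` is made up for illustration and is not part of the tool.

```python
from pathlib import Path


def count_speaker_wavs(speakers_path):
    """Map 'Race/Sex_Initial' to the number of .wav files directly inside it."""
    root = Path(speakers_path)
    counts = {}
    if not root.is_dir():
        return counts  # wrong or missing speakers path: nothing to report
    for race_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for sex_dir in sorted(p for p in race_dir.iterdir() if p.is_dir()):
            wavs = [
                f for f in sex_dir.iterdir()
                if f.is_file() and f.suffix.lower() == ".wav"
            ]
            counts[f"{race_dir.name}/{sex_dir.name}"] = len(wavs)
    return counts
```

An empty result, or a count of 0 for the race and sex being generated, would match the situation in which this error appears.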
Do you know if there's a way to further improve audio quality? I've tried several popular TTS solutions, and your tool already produces some of the best audio quality among free TTS software. However, I recently came across this mod: https://www.nexusmods.com/morrowind/mods/54454?tab=description
I reached out to the mod author to understand how they achieved the audio quality in their updated version from August. It sounds almost as good as ElevenLabs, but they clarified that it's not, which makes sense: it would be prohibitively expensive to generate hundreds of hours of audio using ElevenLabs. Somehow, they managed to produce exceptional audio quality quickly and efficiently. If it were possible to integrate the same approach with your tool, it would be incredible! Do you or anyone else have any ideas on how the mod author might have done this?
Oh, and by the way, the description of their approach on the mod page hasn't changed since the original version, which had significantly lower audio quality. So simply following those directions (Applio + edge-tts) doesn't give the desired results, unfortunately.
They used Applio and edge-tts. This method requires you to create a specific model for each voice and train it before generation (about 10 h per voice on my computer). (Because my tools are automatic, samples containing onomatopoeia may be used and could cause minor voice problems in sentences.) This could explain the better results. This would be possible with coqui-ai's TTS, but I didn't implement it because I found it a very time-consuming process.
Got it. So it seems they likely didn’t implement a completely new approach in their updated version but rather made a few adjustments. Thanks for sharing your perspective! As for creating the models, it’s a one-time process for each voice, right? If we could share the models afterward, it would save other users from having to create them themselves. However, since we don’t own the copyright for the original voice files, we might not be allowed to share a model trained on them publicly, hmm.
Edit: the author gave an outline of the process used in their new version here. It was AllTalk TTS + Applio and, like you said, a model (actually 2) per voice.
Thank you for this information. I am currently looking into implementing DeepSpeed (which is used in AllTalk TTS, which also uses the XTTS model); it could greatly reduce the generation time (by a factor of 2 to 5 from what I have seen).
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 928k 100 928k 0 0 842k 0 0:00:01 0:00:01 --:--:-- 3210k
\Desktop\Franciskus Standerius\Games\TES III Morrowind\!Mods\!@new order folders\7Gameplay\MorrowindSpeechGenerator\tes3conv\tes3conv.exe
1 File(s) copied
Download Python 3.10...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 27.0M 100 27.0M 0 0 3078k 0 0:00:08 0:00:08 --:--:-- 3663k
Install Python 3.10...
Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases.
Install Dependencies
Installation Succesful !
Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases.
Press any key to continue . . .
When I press a key it closes with no effect. Any idea what's wrong?
Everything works except when I launch speech generation:
ValueError: [!] Looks like you are using a multi-speaker model. You need to define either a `speaker_idx` or a `speaker_wav` to use a multi-speaker model.
To use the audio files, either use the audio extractor tool and point speakers_path to the folder it created, or use your own .wav audio samples arranged in the folder layout expected under speakers_path: "Race\sex_initial" (m or f).
This error occurs because no .wav file was found in the “Race\sex_initial” tree. Also check that the .wav files are present, in your case for example in “Data Files/Sound/Dark Elf/m” for Dark Elf male files.
I already did that back when it gave me that error; all the folders have .wav files in their respective race and gender. Then I moved the folders from /data files/ to /data files/sound/, which is what I thought you meant in your previous comment, but it gives the same error. Thank you for answering all the errors though, I appreciate it.
Did you use the "Extract Audio" function and direct it to your Morrowind Data Files path? Did the program extract the audio files correctly? E.g. you should see folders like /Dark Elf/m in the folder where you created the extracted audio, and these folders should contain .wav files.
Hmm, weird. I got the same error message when the audio files weren't there. Somewhere above the error there should be a log message listing the audio files it was trying to use. It starts with "Load Wav Files:". Does it actually list files there?
Is there another application that needs to be launched?
Thanks.
The text length exceeds the character limit of 250 for language 'en', this might cause truncated audio.
Has anyone else gotten this message? Can it be ignored?
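For anyone who wants to avoid the 250-character warning when feeding their own text to the model, one possible workaround is to split long lines into shorter pieces first. The sketch below is only an illustration under that assumption. It is not how the tool itself handles text, and `split_text` is a made-up name.

```python
import re


def split_text(text, limit=250):
    """Split text into chunks of at most `limit` characters.

    Sentence ends are preferred as split points; a single sentence that is
    still too long is cut at the last space before the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Break up a sentence that is longer than the limit on its own.
        while len(sentence) > limit:
            cut = sentence.rfind(" ", 0, limit + 1)
            if cut <= 0:
                cut = limit  # no space available: hard cut
            piece, sentence = sentence[:cut], sentence[cut:].lstrip()
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece)
        if not sentence:
            continue
        # Append to the running chunk while it still fits.
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be generated separately and the audio joined afterwards; whether that sounds better than one slightly truncated clip is something to try out.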
Happy New Year.
Adding Python to the path during installation and then rebooting your PC should solve the problem.
This should help: https://images.app.goo.gl/G9fJEevXikvB2Mqm6
[Path]
csv_path = D:/
speaker_path = C:/Users/Administrator/Desktop/Franciskus Standerius/Games/TES III Morrowind/Morrowind/Data Files/MorrowindSpeechGenerator/data files/sound/Vo
output = C:/Users/Administrator/Desktop/Franciskus Standerius/Games/TES III Morrowind/Morrowind/Data Files/MorrowindSpeechGenerator/out
speaker_default = Dark Elf
[Language]
speaker_language = en
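To double-check which folders a settings file like this points at, here is a small sketch using Python's standard `configparser`. The helper name `default_speaker_dirs` is hypothetical, and it only assumes the "Race/sex_initial" layout mentioned in this thread; how the tool itself resolves these paths may differ.

```python
import configparser
from pathlib import Path


def default_speaker_dirs(config_text):
    """Return the m and f folders of the default speaker named in the settings.

    Expects the text of a settings file with a [Path] section containing
    speaker_path and speaker_default, as in the example above.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    base = Path(parser["Path"]["speaker_path"])
    default = parser["Path"]["speaker_default"]
    # One folder per sex initial under the default race.
    return [base / default / sex for sex in ("m", "f")]
```

If the folders it returns do not exist or contain no .wav files, the speaker path in the settings is probably one level too high or too low.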