File information

Created by

Fiestag

Uploaded by

Fiestag

80 comments

  1. KolincaRN
    KolincaRN
    • member
    • 7 kudos
    Hello, Fiestag.

    You mentioned that it will be faster on NVIDIA GPUs thanks to CUDA, but can it run on AMD GPUs? If so, can you explain how in the description?
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      It will work on a computer with an AMD GPU, but performance will be significantly reduced because only the CPU will be used. The program automatically detects your GPU. Follow the instructions given in the description. You'll just get a message warning you that CUDA is not being used.
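
A minimal sketch of how this kind of automatic detection can work, assuming the backend is PyTorch (the error log further down this thread suggests it is). `pick_device` is a hypothetical helper, not the tool's actual code, and the printed warning is only an approximation of the real message.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable CUDA GPU, else "cpu"."""
    try:
        import torch  # assumption: the speech backend runs on PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
if device == "cpu":
    # Illustrative wording only; the tool's own warning may differ.
    print("CUDA is not being used; generation will run on the CPU and be slower.")
```

On a machine without a CUDA-capable NVIDIA GPU, this falls back to `"cpu"` rather than failing.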
  2. kickbuttkristen123
    kickbuttkristen123
    • supporter
    • 0 kudos
    Awesome Idea for a mod brother! Keen to get it working!

    I am having some issues though - After I "Extract Audio" all of the wav files are generated successfully without issue but I have no CSV that is generated to point to. Can you tell me where this should be located? 

    Everything seems to have processed fine so I'm not sure what I'm missing

    Thank you heaps!
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      Have you run Dialogue Extractor? The generated files are located in the csv folder of Morrowind Speech Generator.
  3. Durv329
    Durv329
    • supporter
    • 1 kudos
    Great idea for a mod, but your vague install instructions have created a barrier to entry for your mod that doesn't need to be there.
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      Thank you.
      Can you tell me which installation instructions are vague?
    2. Durv329
      Durv329
      • supporter
      • 1 kudos
      A program like this could use a thorough, step-by-step guide on how to use it. I feel like it would be easier to show what I mean. I hope you don't mind, but I wrote up a guide using your install instructions as a jumping off point. I tried to be very specific, as well as include examples of proper directories to minimize user error.

      Installation
      ************* This app needs approximately 10 GB to install all dependencies (Python 3.10, XTTS model and CUDA toolkit)    *************

      1.  Extract Morrowind Speech Generator and launch 'SetupPython.bat' (Don't close window!). Once it's complete, it will prompt you to restart your computer. Type in 'o' for yes. 
      2.  Once you've restarted, go back into the same folder and launch 'setup.bat'. Once that is complete, it will once again prompt you to restart your computer. Type in 'o' for yes.
      3.  Once installation is complete, launch 'MorrowindSpeechGenerator.bat' to open the app.

      Usage
      ************* It is highly recommended to use an NVIDIA GPU for generating speech; it will speed up the process considerably. *************

      1.  Open 'Settings'.
      1a.  The top directory shown is the Speaker Path. This is where the program will draw audio from. Set it to your audio directory of choice (Example:\Morrowind\Data Files\Sound\Vo).
      1b.  The second directory is the Output Path, and as it sounds, this is where the program will save the created audio files. Set this to wherever you like.
      2.  Exit settings and run 'Dialogue Extractor' from the main menu. This will extract the dialogue script. Point it to the ESM file you want to export a dialogue script from. (Example:\Morrowind\Data Files\Morrowind.esm)
      3.  Run 'Extract Audio' from the main menu. This will convert the mp3 audio files in your Vo folder into wav files needed for speech generation.
      4.  Run 'Launch Speech Generation' from the main menu. In the new window, point it to where the extracted script is located (Default location:\Morrowind\csv). This process will take time.
      5.  Once that is complete, take the contents of your output folder, and copy/paste them into the proper audio folder (Morrowind\Data Files\Sound\Vo\AIV)
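
For anyone who prefers to script step 5 rather than copy/paste by hand, here is a rough sketch. `install_generated_audio` is a hypothetical helper and both paths are whatever you chose in Settings; nothing here is part of the tool itself.

```python
import shutil
from pathlib import Path


def install_generated_audio(output_dir: Path, target_dir: Path) -> int:
    """Copy everything under output_dir into target_dir.

    Returns the number of files present in target_dir afterwards.
    """
    if not output_dir.is_dir():
        raise FileNotFoundError(f"Output folder not found: {output_dir}")
    # dirs_exist_ok merges into an existing folder instead of failing (Python 3.8+).
    shutil.copytree(output_dir, target_dir, dirs_exist_ok=True)
    return sum(1 for p in target_dir.rglob("*") if p.is_file())
```

A call might look like `install_generated_audio(Path(r"D:\tts_output"), Path(r"C:\Morrowind\Data Files\Sound\Vo\AIV"))`, where the first path is only an example of an output folder.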
    3. Fiestag
      Fiestag
      • member
      • 1 kudos
      Thanks for your help, I'll add it.
  4. CoolioManMan
    CoolioManMan
    • premium
    • 0 kudos
    When I press "Launch Speech Generation" the terminal just says "Load TTS..." and nothing else happens.

    Is there another application that needs to be launched?

    Thanks. 
    1. CoolioManMan
      CoolioManMan
      • premium
      • 0 kudos
      I have now run the main.py file in Microsoft VS Code and got the following error: 

      _pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint.        (1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.        (2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message.        WeightsUnpickler error: Unsupported global: GLOBAL TTS.tts.configs.xtts_config.XttsConfig was not an allowed global by default. Please use `torch.serialization.add_safe_globals([XttsConfig])` or the `torch.serialization.safe_globals([XttsConfig])` context manager to allowlist this global if you trust this class/function.Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
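
The error message itself suggests allowlisting `XttsConfig`. A hedged sketch of that suggestion follows, to be run before the model is loaded and only if you trust the checkpoint. `allowlist_xtts_config` is a hypothetical helper, and where to call it in main.py is an assumption. PyTorch may go on to report further unsupported globals, which would need the same treatment.

```python
def allowlist_xtts_config() -> bool:
    """Register XttsConfig as a safe global for torch.load(weights_only=True).

    Returns False when torch or TTS is not installed, or when the installed
    torch does not provide add_safe_globals.
    """
    try:
        import torch
        from TTS.tts.configs.xtts_config import XttsConfig
    except ImportError:
        return False
    add_safe_globals = getattr(torch.serialization, "add_safe_globals", None)
    if add_safe_globals is None:
        return False
    # Only do this for checkpoints from a source you trust.
    add_safe_globals([XttsConfig])
    return True
```

The other option named in the error, loading with `weights_only=False`, can execute arbitrary code from the checkpoint, so the allowlist is the more conservative route.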
  5. mybrogleb23
    mybrogleb23
    • member
    • 0 kudos
    FileNotFoundError: [Errno 2] No such file or directory: 'Morrowind.json'
    What am I doing wrong? I can't extract the dialogue.

    What I did:
    1. Run Dialogue Extractor
    2. Choose morrowind.esm
    3. File "C:\Users\Lev\Desktop\MorrowindSpeechGenerator-main\tes3convsettings.py", line 24, in load_file
        with open(f'{output_name}.json', 'r', encoding='utf-8') as json_file:
    FileNotFoundError: [Errno 2] No such file or directory: 'Morrowind.json'
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      Do any lines appear before that in the cmd window? They could tell us more about why Morrowind.json was not generated. If not, make sure you have the tes3conv.exe file in your folder.
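
A quick way to check both things at once, as a sketch. `check_extractor_files` is a hypothetical helper, and it assumes the JSON is written to the same folder as tes3conv.exe, which the traceback does not confirm.

```python
from pathlib import Path


def check_extractor_files(tool_dir: Path, output_name: str) -> list[str]:
    """Return a list of problems found; an empty list means both files exist."""
    problems = []
    if not (tool_dir / "tes3conv.exe").is_file():
        problems.append(f"tes3conv.exe is missing from {tool_dir}")
    # Assumption: the converter writes <output_name>.json next to the tool.
    if not (tool_dir / f"{output_name}.json").is_file():
        problems.append(f"{output_name}.json was not generated in {tool_dir}")
    return problems
```

For the case above, that would be `check_extractor_files(Path(r"C:\Users\Lev\Desktop\MorrowindSpeechGenerator-main"), "Morrowind")`.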
  6. Sannh
    Sannh
    • premium
    • 8 kudos
    So if I understand correctly, the mod does not generate voice lines on the fly; rather, they have to be generated in advance, per NPC?
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      It generates every dialogue line as an audio file, including lines not tied to a particular NPC; those are generated for each race and sex. NPCs with a unique voice, such as Vivec, are supported. Otherwise the default voice is used, for example for the merchant crab's dialogue.
  7. Spessen5221
    Spessen5221
    • member
    • 0 kudos
    The text length exceeds the character limit of 250 for language 'en', this might cause truncated audio.
    Anyone else gotten this text? Is it ignorable?
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      Yes, you can ignore it. It has happened occasionally on long sentences since the last version. I still need to find out where the problem comes from, but it has little impact on the quality of the generation.
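
One common way to stay under a limit like this is to split long lines into shorter chunks before synthesis. The sketch below shows that idea only. It is not how the tool currently handles long sentences, and `split_for_tts` is a hypothetical helper.

```python
import re
import textwrap


def split_for_tts(text: str, limit: int = 250) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Sentences are packed greedily into chunks; a single sentence longer
    than the limit is wrapped at word boundaries first.
    """
    chunks: list[str] = []
    current = ""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for piece in textwrap.wrap(sentence, width=limit):
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be synthesized separately and the resulting audio joined, at the cost of a possible change in intonation at the joins.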
  8. Spessen5221
    Spessen5221
    • member
    • 0 kudos
    I know we need Voices of Vvardenfell downloaded but does it have to be turned on in the modlist or will that cause overlap?
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      Voices of Vvardenfell must be enabled for the audio files to play.
  9. Spessen5221
    Spessen5221
    • member
    • 0 kudos
    ValueError:  [!] Looks like you are using a multi-speaker model. You need to define either a `speaker_idx` or a `speaker_wav` to use a multi-speaker model.
    How do I fix this? When I try to launch Speech Generation I get this, even after having done Audio and Dialogue extraction.
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      Check whether there are .wav files in the Speakers Path folders. This error means that no .wav files were loaded; the script looks for them using this tree structure: "Speakers Path/Race/Sex_Initial/*.wav".
      Check the contents of the folder set as Speakers Path in the settings.
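
To see at a glance which folders are empty, something like the following can help. It is a sketch that assumes the "Speakers Path/Race/Sex_Initial/*.wav" layout described above; `find_speaker_wavs` is a hypothetical helper, not part of the tool.

```python
from pathlib import Path


def find_speaker_wavs(speakers_path: Path) -> dict[str, int]:
    """Map each "Race/Sex" folder under speakers_path to its .wav file count."""
    counts: dict[str, int] = {}
    # Two levels below the Speakers Path: <Race>/<Sex_Initial>
    for sex_dir in sorted(speakers_path.glob("*/*")):
        if sex_dir.is_dir():
            key = f"{sex_dir.parent.name}/{sex_dir.name}"
            counts[key] = sum(1 for _ in sex_dir.glob("*.wav"))
    return counts
```

An empty result, or folders with a count of 0, would point to the missing .wav files described above, in which case 'Extract Audio' probably needs to be run again with the correct Speakers Path.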
    2. Spessen5221
      Spessen5221
      • member
      • 0 kudos
      Is that under the Morrowind folder or the Morrowind Speech Generator folder? I don't see it in either
    3. Fiestag
      Fiestag
      • member
      • 1 kudos
      Launch Morrowind Speech Generator and click on Settings; you should see the Speakers Path there.
    4. Spessen5221
      Spessen5221
      • member
      • 0 kudos
      You are a saint. How does it feel to be the best?
      Happy New Year.
  10. S9T9K
    S9T9K
    • supporter
    • 7 kudos
    Do you know if there's a way to further improve audio quality? I've tried several popular TTS solutions, and your tool already produces some of the best audio quality among free TTS software. However, I recently came across this mod:
    https://www.nexusmods.com/morrowind/mods/54454?tab=description

    I reached out to the mod author to understand how they achieved the audio quality in their updated version from August. It sounds almost as good as ElevenLabs, but they clarified that it's not, which makes sense: it would be prohibitively expensive to generate hundreds of hours of audio using ElevenLabs. Somehow, they managed to produce exceptional audio quality quickly and efficiently. If it were possible to integrate the same approach with your tool, it would be incredible! Do you or anyone else have any ideas on how the mod author might have done this?

    Oh and btw the description of their approach on the mod page hasn't changed since the original version, which had significantly lower audio quality. So simply following those directions (applio + edge-tts) doesn't give the desired results unfortunately 
    1. Fiestag
      Fiestag
      • member
      • 1 kudos
      They used Applio and edge-tts. That method requires you to create a specific model for each voice and train it before generation (about 10 hours per voice on my computer). Because my tools are automated, samples containing onomatopoeia may be used, which can cause minor voice problems in sentences. This could explain their better results. The same would be possible with coqui-ai's TTS, but I didn't implement it because I found the process very time-consuming.
    2. S9T9K
      S9T9K
      • supporter
      • 7 kudos
      Got it. So it seems they likely didn’t implement a completely new approach in their updated version but rather made a few adjustments. Thanks for sharing your perspective!
      As for creating the models, it’s a one-time process for each voice, right? If we could share the models afterward, it would save other users from having to create them themselves. However, since we don’t own the copyright for the original voice files, we might not be allowed to share a model trained on them publicly, hmm.

      Edit: the author gave an outline of the process used in their new version here
      It was AllTalk TTS + Applio and like you said a model (actually 2) per voice.
    3. Fiestag
      Fiestag
      • member
      • 1 kudos
      Thank you for this information. I am currently looking into implementing DeepSpeed (which is used in AllTalk TTS, which also uses the XTTS model); it could greatly reduce the generation time (by a factor of 2 to 5, from what I have seen).