Speech Note

An app for taking notes with speech-to-text.

Speech Note converts speech to text using the Coqui STT engine (a fork of Mozilla's DeepSpeech) and various acoustic and language models.

    All voice processing is done entirely locally on the device. An Internet connection is required only to download models during the app's initial configuration. Speech Note respects your privacy and provides truly offline speech-to-text capability.

    DeepSpeech models for a particular language can be downloaded directly from the app. The following models are currently configured for download:

    • Catalan / ca
    • Czech / cs
    • English / en
    • German / de
    • Spanish / es
    • Finnish / fi
    • French / fr
    • Italian / it
    • Polish / pl
    • Russian / ru
    • Ukrainian / uk
    • Chinese / zh-CN

    + many experimental models (Estonian, Mongolian, Dutch, Yoruba, Amharic, Basque, Turkish, Thai, Slovenian, Romanian, Portuguese, Latvian, Indonesian, Greek, Hungarian)

    The exact sources are listed here.

    The quality of speech recognition depends strongly on the acoustic model. In general it is not perfect, but for some languages it is surprisingly good. I would be grateful for any feedback on how good the speech transcription is for individual models.

    Known issues:

    • Jolla Tablet: does not work at all because I don't know how to build the STT library for the i486 architecture
    • Jolla 1: speech transcription is slow and the app sometimes crashes due to low-memory errors
    • PinePhone: very unstable and sometimes crashes the PulseAudio server

    Any comments, ideas, translations and issue reports are highly appreciated.

    Tip: To try out a model that is not yet configured for download, add a model description to the `$HOME/.local/share/harbour-dsnote/harbour-dsnote/models.json` file on the device.
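
    Since the file above is edited by hand, it may be worth keeping a copy of the original and checking that the edited file still parses as JSON. The following is only a minimal sketch: the helper names `backup_once` and `is_valid_json` are hypothetical, and it checks JSON syntax only, because the fields the app expects in a model description are not documented here.

```python
import json
import shutil
from pathlib import Path

# Location given in the tip above ($HOME is resolved via Path.home()).
MODELS_PATH = (Path.home() / ".local" / "share" / "harbour-dsnote"
               / "harbour-dsnote" / "models.json")


def backup_once(path: Path) -> Path:
    """Copy `path` to `<name>.bak` unless a backup already exists."""
    backup = path.with_name(path.name + ".bak")
    if not backup.exists():
        shutil.copy2(path, backup)
    return backup


def is_valid_json(path: Path) -> bool:
    """Return True if `path` can be read and parsed as JSON (syntax only)."""
    try:
        json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing/unreadable file, bad encoding, or malformed JSON.
        return False
    return True


if __name__ == "__main__":
    if MODELS_PATH.is_file():
        print("Backup kept at:", backup_once(MODELS_PATH))
    print("Valid JSON:", is_valid_json(MODELS_PATH))
```

    Run it once before editing to create the backup, and again afterwards to confirm that the file is still well-formed.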

    Translations (both Speech Note and Speech Keyboard):
    All translations are very welcome. There are three ways to contribute:
    - [preferred] Transifex project
    - Direct GitHub pull request
    - Translation file sent to me via e-mail: dsnote@mkiol.net

    Source code: https://github.com/mkiol/dsnote
    Bugs, Feature requests: https://github.com/mkiol/dsnote/issues or just email: dsnote@mkiol.net

    Application versions: 
    Attachment                          Size     Date
    harbour-dsnote-1.5.1-1.armv7hl.rpm  1.27 MB  17/11/2021 - 10:00
    harbour-dsnote-1.5.1-1.aarch64.rpm  1.34 MB  17/11/2021 - 19:28
    harbour-dsnote-1.6.0-1.aarch64.rpm  1.39 MB  09/12/2021 - 21:32
    harbour-dsnote-1.6.0-1.armv7hl.rpm  1.31 MB  09/12/2021 - 21:32
    harbour-dsnote-1.6.1-1.armv7hl.rpm  1.31 MB  10/12/2021 - 20:52
    harbour-dsnote-1.6.1-1.aarch64.rpm  1.39 MB  10/12/2021 - 20:52
    harbour-dsnote-1.8.0-1.aarch64.rpm  1.44 MB  02/04/2022 - 19:40
    harbour-dsnote-1.8.0-1.armv7hl.rpm  1.36 MB  02/04/2022 - 19:40
    Changelog: 

    1.8.0

    • New languages: Finnish, Mongolian (experimental), Estonian (experimental)
    • Improved model for Polish language: Polski (mkiol)
    • Experimental German medical model: Deutsch (med)
    • New models for English: English (Coqui Huge Vocabulary), English (Coqui Large Vocabulary)
    • Improved languages browser
    • Support for SFOS 4.4 (sandboxing disabled)

    => I would be very grateful for any feedback on how good the speech transcription is for individual models.

    1.6.1

    • New German language model "Deutsch (Aashish Agarwal)" (experimental). This model might be even better than the currently configured default. I would be grateful for feedback.

    1.6.0

    • New and default listening mode: One sentence (Clicking on the bottom panel starts listening, which ends when the first sentence is recognized)
    • Cover action (When 'One sentence' mode is set, cover displays action to enable/cancel listening.)
    • Improved language viewer
    • Coqui STT lib update (v1.1.0)
    • Bug fixes and performance improvements (e.g. App starts much quicker with multiple languages enabled)

    1.5.1

    • Fix: Language configuration wasn't loaded when the app was installed for the first time

    1.5.0

    • Fix for ARM64: the app should now work
    • Model for Catalan language
    • Many "experimental" models for various languages: Dutch, Yoruba, Amharic, Basque, Turkish, Thai, Slovenian, Romanian, Portuguese, Latvian, Indonesian, Greek, Hungarian. Most of these models have very poor accuracy :(

    1.4.0

    • Russian and Ukrainian models
    • D-Bus API and service for 3rd-party app integration (e.g. Speech Keyboard)

    1.3.0

    • Czech language model and translation (many thanks to Lukáš Karas for the contribution)
    • New additional models: French (Common Voice), Italian (Mozilla Italia)

    1.2.0

    • Option to transcribe audio file
    • Minor UI fixes and improvements

    1.0.1

    • Support for Jolla 1, Jolla C and PinePhone (alpha)
    • Speech recognition accuracy is much improved thanks to the DeepSpeech library update to version '0.10.0-alpha.3'
    • Minor UI fixes