Speech Technology

Categories: Technologies

Language: English

Subscribers: 652

Ratings & Reviews

2.67

3 reviews

The latest Messages 7

2023-04-18 00:44:44 Not sure about the claimed accuracy, but the numbers are interesting

https://blog.deepgram.com/nova-speech-to-text-whisper-api/

A remarkable 22% reduction in word error rate (WER)

A blazing-fast 23-78x quicker inference time

A budget-friendly 3-7x lower cost starting at only $0.0043/min
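For anyone who wants to sanity-check WER claims like these on their own audio, here is a minimal sketch of the usual word-level edit-distance WER plus a relative-reduction helper. The function names are made up for illustration, and reading the 22% as a relative reduction is an assumption; the post does not say how the figure was measured.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer (assumed meaning of 'reduction')."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the blog post.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    print(relative_reduction(0.10, 0.078))
```

Real evaluations normally normalize text first (case, punctuation, numerals), and that choice alone can shift WER by several points, so vendor numbers are only comparable when both systems go through the same normalization.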

2023-04-18 00:30:35 The laugh is nice, but Russian stress is bad, as usual

https://github.com/suno-ai/bark

2023-04-12 16:20:21

Space is closer than you think. Happy Cosmonautics day my friends.

2023-04-10 10:54:51 GPU beam search in PyTorch

https://github.com/pytorch/audio/pull/3096

2023-04-08 15:59:16 NeMo 1.17 is now released and includes a lot of improvements that users have long requested.

This includes a high-level diarization API, PyCTCDecode support for beam search, InterCTC loss support, an AWS SageMaker tutorial, and more!

https://twitter.com/alphacep/status/1644685634404073472

2023-04-04 01:12:27 Learning a model from Whisper

https://github.com/speechcatcher-asr

2023-04-03 04:52:29 https://groups.inf.ed.ac.uk/edacc/

The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR. Ramon Sanabria, Bogoychev, Markl, Carmantini, Klejch, and Bell. ICASSP 2023. Presentation of the EdAcc.

2023-04-02 16:45:37 https://www.openslr.org/136/

EMNS
Identifier: SLR136

Summary: An emotive single-speaker dataset for narrative storytelling. EMNS is a dataset containing transcriptions, emotion, emotion intensity, and descriptions of acted speech.

Category: Speech, text-to-speech, automatic speech recognition

License: Apache 2.0
About this resource:

The Emotive Narrative Storytelling (EMNS) corpus is a single-speaker British English speech dataset with high-quality labelled utterances, tailored to drive interactive experiences with dynamic and expressive language. Each audio-text pair is reviewed for artefacts and quality. Furthermore, we extract critical features using natural language descriptions, including word emphasis, level of expressiveness, and emotion.

EMNS data collection tool: https://github.com/knoriy/EMNS-DCT

EMNS cleaner: https://github.com/knoriy/EMNS-cleaner

2023-04-02 16:36:54 QASR, the largest multi-layer annotated corpus at 2,000 hours, is available at https://arabicspeech.org/qasr/ and is suitable for ASR, dialect ID, punctuation, speaker ID linking, and potentially other NLP modules for spoken data.
#nlproc #speechproc #Arabic #AI
@QatarComputing @qcrialt

https://twitter.com/ArabicSpeech/status/1641402805951815681
