Welcome to Knowledge Base!

KB at your fingertips


This is a one-stop global knowledge base where you can learn about all products, solutions, and support features.


Azure Custom Speech Service


Ensuring Data Privacy and Security with Azure Custom Speech Service

Data Processed by Speech to Text

Azure Custom Speech Service processes several types of data: voice audio input, input transcription text, and transcriptions used for speech translation. It accepts voice audio as input and transcribes it. In pronunciation assessment tasks, the service also assesses pronunciation against the transcribed text. Speech translation transcribes the audio into text and then translates that text into a specified language through the Translator service.

Data Processing by Speech to Text

In real-time speech to text scenarios, the speech recognition engine processes audio input in server memory on Azure and does not store data at rest. All data in transit is encrypted. For batch transcription, customers specify the storage locations for the audio input and the output transcription text files. Customers control data storage and retention, including the retention time for generated transcription files.

Speaker Diarization/Separation

Azure Custom Speech Service offers speaker separation (diarization) for both real-time and batch APIs. When it is enabled, the engine analyzes the audio input to differentiate between speakers. Signals derived from unique voice characteristics are used temporarily to annotate the transcription output with speaker markers, and this signal data is discarded after processing. Speaker separation supports multiple speakers within a single audio file.
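
To make the idea of speaker markers concrete, here is a minimal sketch of how diarized output could be rendered as an annotated transcript. The `Segment` type, the `annotate` helper, and the `Guest-1` style labels are hypothetical and chosen for illustration; they are not the service's output schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One recognized phrase; `speaker` is a hypothetical label such as 'Guest-1'."""
    speaker: str
    start_s: float  # offset of the phrase from the start of the audio, in seconds
    text: str


def annotate(segments):
    """Render segments in time order, merging consecutive phrases by the same speaker."""
    lines = []
    last_speaker = None
    for seg in sorted(segments, key=lambda s: s.start_s):
        if seg.speaker == last_speaker:
            lines[-1] += " " + seg.text
        else:
            lines.append(f"[{seg.speaker}] {seg.text}")
            last_speaker = seg.speaker
    return lines


segments = [
    Segment("Guest-2", 4.2, "Fine, thanks."),
    Segment("Guest-1", 0.0, "Hello."),
    Segment("Guest-1", 1.5, "How are you?"),
]
transcript = annotate(segments)
for line in transcript:
    print(line)
```

Only the speaker labels appear in the rendered transcript; the voice characteristics used to derive them are not part of the output.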

Language Detection and Translation

Language detection in Azure Custom Speech Service calculates the probabilities of mappings between phonemes and languages to identify the languages spoken in the audio input. Speech translation first transcribes the audio by machine and then passes the text to a text translation service. The translated text can also be converted into audio if needed.
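
The phoneme-to-language scoring described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the service's actual algorithm: the probability table, the `FLOOR` value, and the `language_posteriors` helper are all invented for illustration.

```python
import math

# Hypothetical likelihoods P(phoneme | language); the values are made up.
PHONEME_PROBS = {
    "en-US": {"th": 0.30, "r": 0.40, "a": 0.30},
    "es-ES": {"th": 0.05, "rr": 0.35, "a": 0.60},
}
FLOOR = 1e-3  # probability used for phonemes a language table does not list


def language_posteriors(phonemes, candidates):
    """Score each candidate language and normalize the scores to sum to 1."""
    log_scores = {
        lang: sum(math.log(PHONEME_PROBS[lang].get(p, FLOOR)) for p in phonemes)
        for lang in candidates
    }
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_scores.values())
    weights = {lang: math.exp(score - peak) for lang, score in log_scores.items()}
    total = sum(weights.values())
    return {lang: weight / total for lang, weight in weights.items()}


posteriors = language_posteriors(["a", "rr", "a"], ["en-US", "es-ES"])
detected = max(posteriors, key=posteriors.get)
print(detected, posteriors)
```

The sketch assumes a uniform prior over the candidate languages and treats phonemes as independent, which a production recognizer would not do.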



Empower Your Team with Azure Custom Speech Service Training

Enhance Team Skills with Microsoft Learn for Organizations

Microsoft Learn for Organizations offers tailored training to enhance technical skills within teams. By utilizing curated offerings, organizations can address skills gaps, provide tailored learning experiences, and earn Microsoft-verified credentials essential for keeping up with rapidly evolving technology trends. This platform ensures that teams are well-equipped to handle new roles and responsibilities, thereby boosting overall productivity and success.


Evaluating Pronunciation with Azure Custom Speech Service

Using Pronunciation Assessment in Streaming Mode

Pronunciation assessment in Azure Custom Speech Service supports an uninterrupted streaming mode through the Speech SDK, which allows unlimited recording time. Users receive real-time evaluation of pronunciation accuracy and fluency, and because audio is recorded continuously, they can pause and resume the evaluation as needed. For the languages and regions that pronunciation assessment supports, refer to the documentation.


Azure Custom Speech Service: Empowering Batch Transcription for Audio Data Handling

Understanding Batch Transcription

Batch transcription transcribes a large volume of audio data stored in Azure. It is supported by both the Speech to text REST API and the Speech CLI, which makes it practical to submit many audio files for transcription at once.
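
As a rough illustration, a batch transcription job is typically described by a JSON request body. The sketch below only builds such a body and sends nothing. The field names (`contentUrls`, `locale`, `displayName`, `properties.timeToLive`) are assumed to follow version 3.1 of the Speech to text REST API and should be checked against the current API reference; the helper name and the example URL are hypothetical.

```python
import json


def build_transcription_request(content_urls, locale, display_name, ttl_hours=None):
    """Assemble a request body for a batch transcription job (no network call)."""
    body = {
        "contentUrls": list(content_urls),  # locations of the audio files to transcribe
        "locale": locale,
        "displayName": display_name,
        "properties": {},
    }
    if ttl_hours is not None:
        # Assumed retention setting: an ISO 8601 duration after which results are deleted.
        body["properties"]["timeToLive"] = f"PT{ttl_hours}H"
    return body


request_body = build_transcription_request(
    ["https://example.com/audio/call-001.wav"],  # placeholder URL
    "en-US",
    "Example batch job",
    ttl_hours=48,
)
payload = json.dumps(request_body, indent=2)
print(payload)
```

Setting a retention period in the request is one way a customer could keep control over how long generated transcription files are kept.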


Enhancing Text Clarity with Azure Custom Speech Service's Display Text Formatting

Inverse Text Normalization (ITN)

Inverse Text Normalization (ITN) in Azure Custom Speech Service converts spoken words into their written form, ensuring clear and accurate transcriptions. Supported formats include dates, times, decimals, currencies, addresses, emails, and phone numbers. By automatically applying ITN rules, the service enhances readability and ensures the expected text formatting.
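
To show what ITN does in the simplest case, here is a toy sketch that rewrites spoken cardinal numbers as digits. It is not the service's rule set: the `inverse_normalize` helper is hypothetical, it handles only whole numbers below one thousand, and it ignores punctuation, digit sequences, dates, currencies, and the other formats listed above.

```python
UNITS = {word: i for i, word in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {word: 10 * i for i, word in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}


def inverse_normalize(spoken):
    """Replace runs of number words with digits; other words pass through unchanged."""
    out = []
    value = None  # running total of the number phrase being read, if any
    for word in spoken.split():
        key = word.lower()
        if key in UNITS:
            value = (value or 0) + UNITS[key]
        elif key in TENS:
            value = (value or 0) + TENS[key]
        elif key == "hundred" and value is not None:
            value *= 100
        else:
            if value is not None:
                out.append(str(value))  # the number phrase ended; emit its digits
                value = None
            out.append(word)
    if value is not None:
        out.append(str(value))
    return " ".join(out)


written = inverse_normalize(
    "the meeting has one hundred twenty three attendees in room five")
print(written)
```

A real ITN system also has to decide when a number word should stay spelled out, which this sketch does not attempt.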


Empowering Communication with Azure Custom Speech Service

Core Features of Azure Custom Speech Service

Azure Custom Speech Service offers advanced speech to text capabilities for converting audio streams into text. Its core features are real-time transcription, fast transcription for quick results, batch transcription for processing large volumes of audio, and custom speech models for higher accuracy in specific domains.
