Difference Between Base Model and Custom Speech to Text Model
A base speech to text model is pre-trained on Microsoft-owned data and is already deployed in the cloud. A custom model adapts a base model to a specific environment or vocabulary. Acoustic adaptation suits settings with distinctive ambient noise, such as factory floors, cars, or noisy streets. Language adaptation suits domains with specialized vocabulary, such as biology, physics, or custom acronyms. Training a custom model improves recognition by adding domain-specific terms and phrases.
Starting with Base Models
Begin by obtaining an API key and selecting a region in the Azure portal. To make REST calls to a predeployed base model, refer to the REST API documentation. To use WebSockets, download the Speech SDK.
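As a rough illustration of how the key and region fit into a REST call, the sketch below only assembles the URL and headers for a recognition request and sends nothing. The host pattern, path, and header names are assumptions based on the short-audio REST API and should be checked against the REST API documentation. The region "westus" and the key "YOUR_API_KEY" are placeholders.

```python
from urllib.parse import urlencode


def build_recognition_request(region, api_key, language="en-US"):
    """Return (url, headers) for a speech recognition REST request.

    Assumption: the endpoint follows the short-audio REST API pattern.
    Verify the host, path, and headers against the official documentation.
    """
    query = urlencode({"language": language, "format": "detailed"})
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
    headers = {
        # The API key obtained from the Azure portal.
        "Ocp-Apim-Subscription-Key": api_key,
        # Assumed audio format: 16 kHz PCM WAV.
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return url, headers


# Placeholder values, for illustration only.
url, headers = build_recognition_request("westus", "YOUR_API_KEY")
print(url)
```

The audio bytes would then be sent as the body of a POST to that URL with any HTTP client. Keeping request construction separate from sending makes it easy to inspect before any call is made.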
Necessity of Building Custom Models
Custom speech models are worth building when an application runs in a specialized environment or needs higher accuracy than the base model provides. For generic, everyday language in environments with little or no background noise, a base model suffices. Running accuracy tests on both the base model and a custom model helps determine which one suits a specific use case.
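One common way to compare two models is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the recognized text into the reference transcript, divided by the number of reference words. The text above does not name a metric, so WER is an assumption here, and the sample sentences are invented for illustration. A minimal sketch:

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example: the same utterance recognized by two models.
reference = "turn on the coolant pump"
base_output = "turn on the cool and pump"
custom_output = "turn on the coolant pump"

print(word_error_rate(reference, base_output))    # 0.4
print(word_error_rate(reference, custom_output))  # 0.0
```

A lower WER on a representative test set suggests the better model for that scenario. A single sentence, as here, is far too small a sample to decide.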
Checking Completion Status
To find out when processing of a dataset or model has finished, check its status in the table. A status of 'Succeeded' means processing is complete. This status display is currently the only way to tell that processing has finished.
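Because completion is signalled only through a status value, waiting for it amounts to checking repeatedly until a final status appears. The sketch below shows that loop in isolation. The `fetch_status` callable is hypothetical and stands in for however the status is actually read. Only 'Succeeded' comes from the text above; 'Failed', 'NotStarted', and 'Running' are assumed names used for illustration.

```python
import time

# 'Succeeded' is from the text above; 'Failed' is an assumed terminal status.
TERMINAL_STATUSES = {"Succeeded", "Failed"}


def wait_until_done(fetch_status, interval_seconds=30.0, max_checks=120,
                    sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status.

    fetch_status is a hypothetical zero-argument callable returning the
    current status string of a dataset or model.
    """
    for attempt in range(max_checks):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Do not sleep after the final check.
        if attempt < max_checks - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"no terminal status after {max_checks} checks")


# Simulated status sequence so the example runs without any service.
fake_statuses = iter(["NotStarted", "Running", "Succeeded"])
result = wait_until_done(lambda: next(fake_statuses), interval_seconds=0)
print(result)  # Succeeded
```

Capping the number of checks keeps the loop from running forever if processing stalls.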
Creating Multiple Models
There is no restriction on the number of models you can create in your collection. This lets you tailor models to different scenarios and iterate on different adaptations to optimize performance.
Utilizing Detailed Output Results
Although several results are returned for each phrase, take the first result for the best accuracy, even if another result has a higher confidence value. The additional results can be useful in specific situations, such as offering correction choices or handling misrecognized commands.
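The sketch below shows one way to apply this rule to a detailed result: take the first entry as the answer and keep the rest as alternatives, without re-sorting by confidence. The field names (NBest, Confidence, Display) are assumed from the detailed output format and should be checked against an actual response. The sample payload and its values are invented.

```python
import json

# Invented sample: the second entry has the higher confidence value.
SAMPLE_RESPONSE = """
{
  "RecognitionStatus": "Success",
  "NBest": [
    {"Confidence": 0.81, "Display": "Start the coolant pump."},
    {"Confidence": 0.84, "Display": "Start the cool and pump."}
  ]
}
"""


def split_results(response_text):
    """Return (primary, alternatives) from a detailed recognition result.

    The primary result is always the first entry, regardless of the
    confidence values; the remaining entries are returned in order.
    """
    response = json.loads(response_text)
    candidates = response.get("NBest", [])
    if not candidates:
        return None, []
    primary = candidates[0]["Display"]
    alternatives = [entry["Display"] for entry in candidates[1:]]
    return primary, alternatives


primary, alternatives = split_results(SAMPLE_RESPONSE)
print(primary)       # Start the coolant pump.
print(alternatives)  # ['Start the cool and pump.']
```

The alternatives list is what an application might show as correction choices when the user rejects the primary result.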
Importance of Latest Base Model Selection
Selecting the most recent base model when training a custom model generally gives the best accuracy. Older base models remain available for a period after a new one is added, but moving to the latest model is recommended.
Model Update Through Combination
An existing model cannot be updated in place. To incorporate new data, combine the old and new datasets and readapt. When adaptation completes, redeploy the updated model to obtain a new endpoint.
Automatic Model Deployment Updates
Deployments are not updated automatically; a deployed model stays as it is. To benefit from a newer base model, decommission the deployed model, readapt using the newer base model version, and redeploy. Both base and custom models are retired after a certain period (refer to the Model and endpoint lifecycle).
Local Model Execution
Custom models can run locally in a Docker container, which allows them to be used offline or in isolated environments.
Copying and Moving Models and Datasets
Custom models can be copied to another region or subscription using the Models_Copy REST API. Datasets and deployments cannot be copied. A dataset can, however, be imported again in the other subscription, and endpoints can be created there from the copied models.
Request Logging and Throttling
Requests are not logged by default. Logging can be enabled when creating a custom endpoint, in which case the audio and transcription data are stored securely. Requests may be throttled according to the Speech service quotas and limits.
Charging for Dual Channel Audio
For dual channel audio, charges are based on file duration. If each channel is submitted as a separate file, you are charged for the duration of each file. If the channels are multiplexed into a single file, you are charged for the duration of that single file.
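A small worked example may make the difference concrete. The sketch below assumes a hypothetical two-channel recording of 300 seconds and counts only billed duration. Actual rates and any rounding rules are not covered here.

```python
def billed_seconds(file_durations):
    """Total billed audio duration: the sum of each submitted file's length."""
    return sum(file_durations)


# Hypothetical 300-second call recorded on two channels.
call_length = 300.0

# Each channel submitted as its own file: two files, each 300 seconds long.
separate = billed_seconds([call_length, call_length])

# Both channels multiplexed into one file: a single 300-second file.
multiplexed = billed_seconds([call_length])

print(separate)     # 600.0
print(multiplexed)  # 300.0
```

Under these assumptions, submitting the channels separately doubles the billed duration compared with a single multiplexed file.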