Arthur Wang's Blog

Are You Ready to Tap into the Power of Bing Cognitive Service API?

1/22/2017


 
As we move into the era of microservice architecture, Microsoft has been developing a suite of APIs that lets developers consume its cognitive technology. The suite lets you build powerful apps with just a few lines of code, regardless of the device or operating system your app runs on. The latest suite of Bing's Cognitive Services APIs covers five areas: Vision, Speech, Language, Knowledge, and Search. Although the Search service is similar to the old Microsoft Search API released 10 years ago, the other services were developed within the last one to two years and are now available on Microsoft's Azure portal.
Recently, I had the opportunity to develop a Speech-to-Text web application using the Bing Speech API. The application lets the user upload an audio file in .wav format and select the locale of the audio. The Speech-to-Text API then recognizes the audio and returns the transcribed text, which is displayed on the web page. You can find the official documentation for the Speech Recognition service here.
Before you start researching and reading the documentation, please note that many documents and sample projects still use the api.projectoxford.ai endpoint. That endpoint is retiring on January 17, 2017, and you should use the speech.platform.bing.com endpoint instead.
There are three types of Speech Recognition library: the REST API, the Client Library, and the Service Library. You should choose only one of them. Since I was developing a web application, I used the REST API, which means that I get only one result per request. The Client Library allows real-time streaming and returns partial recognition results. With it, the code must be written on the client side and sends its requests directly to the service, which makes it a great fit for mobile apps. Lastly, the Service Library also lets you request partial results and is great for Windows apps.
 
I wish the REST API documentation had been written by someone other than the developers involved with the cognitive service, because many things are not clear until you become experienced with it. For example: “Your application must indicate the end of the audio to determine start and end of speech….” So how do you indicate the end of the audio? How about saying “The End” in the recording? No, I assume that a short moment of silence indicates the end of the audio. With this API, you should break your audio file into segments and send each segment to the REST API for processing. Each segment is a complete .wav file, not a chunk of bytes, and a single request is limited to 10 seconds of audio. This means that each stretch of speech needs to be shorter than 10 seconds. I used 9 seconds for my app; otherwise you will see the unserviceable error. The request also cannot exceed 14 seconds of processing time, and the API aborts the request once it exceeds that limit.
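To make the segmenting idea concrete, here is a minimal sketch that plans fixed-length segments of at most 9 seconds and wraps each one in its own 44-byte PCM WAV header, so that every segment is a complete .wav file. The class and method names (WavSegmenter, PlanSegments, BuildWav) are hypothetical, and the sketch assumes the raw PCM samples have already been read out of the source file's data chunk. It cuts at fixed intervals rather than at pauses, so a word can be split across two segments. Cutting at silence would need extra analysis that is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class WavSegmenter
{
    // Plans (offset, length) byte ranges into raw PCM data so that each range
    // holds at most maxSeconds of audio. The 9-second default follows the
    // article; every other name and parameter here is an assumption.
    public static List<(int Offset, int Length)> PlanSegments(
        int dataBytes, int sampleRate, int channels, int bitsPerSample, int maxSeconds = 9)
    {
        if (dataBytes < 0 || sampleRate <= 0 || channels <= 0 ||
            bitsPerSample <= 0 || bitsPerSample % 8 != 0 || maxSeconds <= 0)
            throw new ArgumentOutOfRangeException();

        int bytesPerSecond = sampleRate * channels * (bitsPerSample / 8);
        int maxBytes = bytesPerSecond * maxSeconds; // always a whole number of sample frames

        var segments = new List<(int Offset, int Length)>();
        for (int offset = 0; offset < dataBytes; offset += maxBytes)
            segments.Add((offset, Math.Min(maxBytes, dataBytes - offset)));
        return segments;
    }

    // Wraps one range of PCM samples in a canonical 44-byte RIFF/WAVE header,
    // producing a complete, standalone .wav file in memory.
    public static byte[] BuildWav(
        byte[] pcm, int offset, int length, int sampleRate, int channels, int bitsPerSample)
    {
        int blockAlign = channels * (bitsPerSample / 8);
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms)) // BinaryWriter writes little-endian, as WAV requires
        {
            w.Write(Encoding.ASCII.GetBytes("RIFF"));
            w.Write(36 + length);               // RIFF chunk size = file size minus 8
            w.Write(Encoding.ASCII.GetBytes("WAVE"));
            w.Write(Encoding.ASCII.GetBytes("fmt "));
            w.Write(16);                        // size of the PCM fmt chunk
            w.Write((short)1);                  // audio format 1 = uncompressed PCM
            w.Write((short)channels);
            w.Write(sampleRate);
            w.Write(sampleRate * blockAlign);   // byte rate
            w.Write((short)blockAlign);
            w.Write((short)bitsPerSample);
            w.Write(Encoding.ASCII.GetBytes("data"));
            w.Write(length);                    // size of the sample data
            w.Write(pcm, offset, length);
            w.Flush();
            return ms.ToArray();
        }
    }
}
```

For 25 seconds of 16 kHz, 16-bit mono audio (800,000 bytes of samples), PlanSegments returns three segments: two of 9 seconds and one of 7 seconds. Each can then be passed to BuildWav and sent as its own request.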

The sample code provided by Microsoft can be found here. In my web application, I used the newer System.Net.Http.HttpClient class instead of the HttpWebRequest class shown in the sample to connect to the API service. But before you make the switch, you should be familiar with asynchronous programming, especially if you are building your own Web API pipeline for your web applications.
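// requestUri and fileContent (the request body holding one .wav segment) are defined elsewhere.
var client = new HttpClient();
var token = MicrosoftAuthentication.GetNewToken();
// Send the access token as a Bearer token on every request made by this client.
client.DefaultRequestHeaders.Add("Authorization", "Bearer " + token);
client.DefaultRequestHeaders.TransferEncodingChunked = true;
client.BaseAddress = new Uri(requestUri);
// Must run inside an async method; ConfigureAwait(false) avoids capturing the request context.
var response = await client.PostAsync(requestUri, fileContent).ConfigureAwait(false);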
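Because the REST API returns one result per request, a full transcript has to be assembled from the per-segment results. The following is a minimal sketch of that loop, not the code from my application: SegmentTranscriber, ToWavContent, and TranscribeAllAsync are hypothetical names, the actual HTTP call is passed in as a delegate so the sketch stays independent of the endpoint details, and the plain "audio/wav" content type is an assumption that should be checked against the official documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class SegmentTranscriber
{
    // Wraps one complete .wav segment as request content.
    // "audio/wav" is a placeholder; the service may expect extra parameters.
    public static ByteArrayContent ToWavContent(byte[] wavBytes)
    {
        var content = new ByteArrayContent(wavBytes);
        content.Headers.ContentType = new MediaTypeHeaderValue("audio/wav");
        return content;
    }

    // Sends the segments one at a time, in order, and joins the recognized
    // text with spaces. recognizeAsync is supplied by the caller and is
    // expected to post the content and return the text for that segment.
    public static async Task<string> TranscribeAllAsync(
        IEnumerable<byte[]> wavSegments,
        Func<HttpContent, Task<string>> recognizeAsync)
    {
        var parts = new List<string>();
        foreach (var segment in wavSegments)
        {
            using (var content = ToWavContent(segment))
            {
                var text = await recognizeAsync(content).ConfigureAwait(false);
                // Skip segments where nothing was recognized (for example, silence).
                if (!string.IsNullOrWhiteSpace(text))
                    parts.Add(text.Trim());
            }
        }
        return string.Join(" ", parts);
    }
}
```

In a real application, the delegate would wrap a PostAsync call like the one above and extract the recognized text from the response. Sending the segments sequentially keeps the transcript in order at the cost of some latency.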
At the time of writing, the Bing Speech API is still in beta and, in my experience, still rough, but I feel it is close to ready for prime time. Now is a good time to get your feet wet with Bing's cognitive technology.

Below is a snapshot of the web application that I developed:
[Picture: screenshot of the web application]

Arthur Wang

Arthur is a software developer at his core and has been developing web applications and leading development projects since 2000. He enjoys learning new technologies and writing about them.


If you would like to read more articles written by Arthur, please subscribe to the weekly newsletter from Uniting Digital, where he is also a content contributor.


© 2014-2020 ArthurWiz.com. All rights reserved.