Ever since Satya Nadella took over as CEO, we’ve been watching developments at Microsoft closely. Here at TEKenable we’re interested for two reasons: firstly because we’re a software company building applications on Azure, .NET, xRM, SharePoint, Office 365, SQL, Xamarin etc. every day, and secondly because we’re intrigued to see how Microsoft will remain competitive in the face of formidable new rivals.

Fortunately for Microsoft, the signs are encouraging. This year, I’m seeing lots of evidence that Microsoft is remaining relevant as it refocuses towards the ‘Cloud First, Mobile First’ world that Satya Nadella envisioned when taking over in 2014. Over the summer I’ve been catching up on new Microsoft ‘stuff’, for want of a better word. Some of this technology is truly exciting. Take Cortana Intelligence Services for example, a new artificial intelligence (AI) toolkit which offers great potential for software developers. It’s one more initiative, like the launch of the HoloLens and the Xamarin acquisition, that tells me Microsoft means business.

Satya Nadella has probably been most quoted for his ‘Cloud First, Mobile First’ vision. Even more important, however, was another remark by Nadella that captured the very essence of the tech sector: ‘Our industry does not respect tradition - it only respects innovation’.

For some reason Microsoft seems compelled to call everything Cortana, even though most people recognise the name as the speech recognition engine. That engine is undoubtedly based on, or part of, the AI platform, but speech recognition is only a small part of what the platform can do.

So here’s a quick look at Cortana Intelligence Services: why it’s innovative, why we are going to be working with it (we are already looking at the Bot Framework for a client) and why I expect it will get lots of attention from the developer community.
Cortana Intelligence Services

At a high level, Cortana Intelligence Services has two components: Microsoft Cognitive Services, a set of intelligent APIs that let computers understand humans in more natural terms, and the Microsoft Bot Framework, which lets developers and companies create and manage bots more easily. Both of these software services are available through Azure, Microsoft's cloud platform, and they will also work alongside Office, Windows, and other Microsoft applications. This means existing Microsoft customers can quickly harness Cortana and bots, increasing their productivity in the process.

From a Microsoft perspective this is very positive, as developers will create a whole new ecosystem of services built on Microsoft technology. And as you will see later on, some of the applications already emerging are very creative. Let’s look at Cognitive Services first in a little more detail.
Microsoft Cognitive Services

Microsoft Cognitive Services is a collection of more than 20 APIs available on the Azure Cloud Platform. Grouped under the five categories of Vision, Speech, Language, Knowledge and Search, these APIs can be used to See, Hear, Speak, Understand, Search and Interpret using natural language communication.

Take the Computer Vision API, for instance, which analyses the visual content of images. Developers can use it to classify photographs, and it’s very impressive. Look at the photograph of cows in a field below, which I ran through the API, and at what it ‘saw’ in the photograph. The Computer Vision API reported it was 99.99% confident the picture contained cows, 99.98% confident there was grass in the image, 99.75% confident that the image was taken outdoors and so on. It also made an overall assessment (with 84.33% confidence) that the photograph was of ‘a herd of cattle standing on top of a wire fence’. Not perfectly accurate, as the cattle were standing behind and a bit above the fence, but still very impressive. If you’d like to test the Computer Vision API with a photograph of your own, you can find a demo here: https://www.microsoft.com/cognitive-services/en-us/computer-vision-api

Face is an API that identifies human faces in an image and collects attributes including age, gender, pose, smile and facial hair. The information from Face can be further analysed using the Emotion API, which detects emotions ranging from contempt, disgust and fear to happiness, sadness and surprise. The screenshot below shows an example. As with the Computer Vision API, you can upload your own images to the Microsoft website to test the service.
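As an aside on the Computer Vision result above, here is a minimal Python sketch of how a developer might work with that kind of output. It makes no network call. The dictionary is a hand-written stand-in that I am assuming resembles the service's JSON (a list of tags plus a captioned description); the tag names `cow`, `grass` and `outdoor` and the helper functions are illustrative, and only the confidence figures and the caption come from the test described above.

```python
# Hypothetical, hand-written response. The field layout is an assumption
# about the service's JSON, not a captured reply from the API.
sample_response = {
    "tags": [
        {"name": "grass", "confidence": 0.9998},
        {"name": "cow", "confidence": 0.9999},
        {"name": "outdoor", "confidence": 0.9975},
    ],
    "description": {
        "captions": [
            {
                "text": "a herd of cattle standing on top of a wire fence",
                "confidence": 0.8433,
            }
        ]
    },
}


def confident_tags(response, threshold=0.9):
    """Return tag names with confidence >= threshold, most confident first."""
    kept = [t for t in response.get("tags", []) if t["confidence"] >= threshold]
    kept.sort(key=lambda t: t["confidence"], reverse=True)
    return [t["name"] for t in kept]


def best_caption(response):
    """Return (text, confidence) for the most confident caption, or None."""
    captions = response.get("description", {}).get("captions", [])
    if not captions:
        return None
    top = max(captions, key=lambda c: c["confidence"])
    return top["text"], top["confidence"]


if __name__ == "__main__":
    print("Tags:", ", ".join(confident_tags(sample_response)))
    caption = best_caption(sample_response)
    if caption is not None:
        text, confidence = caption
        print(f"Caption: {text} ({confidence:.2%})")
```

Filtering on a threshold is a design choice rather than anything the service requires: an application that auto-tags a photo library might only accept very confident tags, while a search index could afford to keep weaker ones.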
The Face API demo is here: https://www.microsoft.com/cognitive-services/en-us/face-api

Uploading my profile picture to this gave an estimated age of 58 years, so clearly this bit is very broken, but all other attributes were accurate :-) From the same link you can try out Face Verification (how alike are two faces?), Face Tagging (find a face or group of faces in images that look like the person or group in your reference image), grouping a collection of images by similar-looking faces, and even detecting and categorising faces in video in near real time (see image below).

Earlier this year, Satya Nadella announced his vision of ‘Conversation as a Platform’. The Cortana ‘Speech’ and ‘Language’ APIs bring this vision closer. The Bing Speech API, for example, allows you to convert spoken audio to text, and the Speech to Text API enables you to build voice-triggered smart apps. I am feeling a little smug as I forecast this convergence in a previous LinkedIn post, “Golden Handcuffs & The Evolution of the Mobile App”: https://www.linkedin.com/pulse/golden-handcuffs-evolution-mobile-app-peter-rose?trk=pulse_spock-articles

The possibilities for developers are limited only by their imagination, and exciting applications are already beginning to emerge. Take for example Seeing AI, built by a blind Microsoft employee in the UK. Seeing AI uses the ‘Vision’ and ‘Speech’ APIs to see objects and people and then describe what is in front of the user via an audio message. It works on both smartphones and Pivothead SMART glasses. In a meeting, for example, it sees other people and can assess their gender, age and even their emotional state. And the use cases extend well outside the office. The smartphone version, for example, can take a picture of a menu and then offer an audio version of that menu. To understand what Seeing AI means from a user perspective take a look at this video:
Bot Framework

The Bot Framework is the second component of Cortana Intelligence Services. It ties into the Microsoft ‘Conversation as a Platform’ vision, and it also reflects an expectation that bots (pieces of smart software that can talk with their user) will gradually replace the ‘apps’ we use today. As Nadella put it, “Human language is the new user interface layer.”

Many of us know about Tay, the artificial intelligence chat-bot that Microsoft launched in March and withdrew within a day of launch after it was corrupted by other Twitter users. Tay was a well-publicised failure; however, Microsoft has also had a big success in the bot arena. Not many people here will have heard of Xiaoice, the Microsoft chat-bot launched on the Chinese instant messaging service WeChat. Since its release last year, Xiaoice has proved hugely popular, attracting 20 million registered users, and it’s now a celebrity in China, even presenting the weather report on TV. The clip below shows Xiaoice interacting with a newsreader on Shanghai’s Dragon TV.

Success with Xiaoice would appear to have given Microsoft confidence that bots will play an important role in the future. Combine this with the fact that between 2 and 3 billion people globally use messaging apps today, meaning ‘conversational UI’ is already the most common way of experiencing technology, and the case for bots gets even stronger.

The Bot Framework is Microsoft’s attempt to anticipate this trend and make it easier to build bots. It has three components. ‘Bot Builder’ is a range of tools and services including APIs, command dialogs and support for Skype calling and rich attachments. The Developer Portal allows developers to connect their bots to text, Skype, Office 365 and other channels. Portal users can also publish and manage a bot through the bot’s dashboard. Finally, the Bot Directory is a register of all bots published using the Framework. Users can try out a bot from the directory via a web chat control.
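To make the idea of a bot as ‘smart software that can talk with its user’ a little more concrete, here is a toy Python sketch. It is emphatically not the Bot Framework SDK and uses none of its APIs: the `INTENTS` table, the `reply` function and the canned responses are all hypothetical, and the ‘intelligence’ is nothing more than keyword matching. A real bot would hand the understanding step to something like the Language APIs described earlier.

```python
import re

# Hypothetical intents for illustration only: each pairs a set of trigger
# words with a canned reply. Earlier entries win when several match.
INTENTS = [
    ({"hello", "hi", "hey"}, "Hello! How can I help?"),
    ({"weather", "forecast"}, "I can't check the forecast yet, but I'm learning."),
    ({"bye", "goodbye"}, "Goodbye!"),
]
FALLBACK = "Sorry, I didn't understand that."


def reply(message: str) -> str:
    """Pick a reply by matching words in the message against each intent."""
    # Lower-case the message and split it into simple word tokens.
    words = set(re.findall(r"[a-z']+", message.lower()))
    for triggers, response in INTENTS:
        if words & triggers:  # any trigger word present in the message
            return response
    return FALLBACK


if __name__ == "__main__":
    for line in ["Hi there", "What's the weather like?", "Tell me a joke"]:
        print(f"user: {line}")
        print(f"bot:  {reply(line)}")
```

The point of the sketch is the shape of the interaction: text in, text out, with no screens or buttons, which is why the same bot logic can sit behind many different messaging channels.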
It’s clear, however, that Microsoft won’t have the market for bots all to itself. Competitors see the opportunity too. Apple, for example, already has Siri, while Facebook recently launched a bot platform of its own, running on the Messenger chat app. (The Guardian recently reported that more than 30,000 bots were developed for Messenger alone in the last six months.) At the same time, Google is rumoured to be launching a new intelligent assistant running inside Allo, a new messaging app, and it already offers Home, a portable speaker that takes voice commands. Finally, Amazon Echo is reportedly installed in more than 1.5 million homes and has added 1,200 "skills" through its API.

But Microsoft has some big advantages. It has a massive installed base of enterprise users (Windows 10 is now installed on more than 400 million devices and growing), it has a strong developer base and it owns the very popular Azure cloud service.

Of course, it’s possible that, in the best traditions of the tech sector, bots may yet prove to be hype. Already a backlash is starting. Last week the Financial Times suggested bots as ‘odds-on favourite for the title of Most over-hyped new technology of 2016’. We will need to wait another year or two before we can separate the high expectations from reality. Right now, however, Microsoft is placing big bets on the future of bots as the next UI and looks well positioned to have a strong role in the new bot landscape.