Augmented reality and machine learning have already quietly begun to change the way we see and interact with the world around us, even if not all of us have noticed, technology users included. For those of you who use Snapchat or Instagram, filters are commonplace: video messages to friends get dressed up with overlays and skin-smoothing filters that use machine learning to jazz up the experience. Quite honestly, if you haven’t tried it, you should. It’s a fun way to get a little silly.
And for those of you who use Google’s apps, the new Maps and Translate apps use both augmented reality and machine learning to boost the experience. In Google Translate, machine learning assists with translation and character/object recognition, and the app now even runs offline. Google Maps uses machine learning to recognize buildings, knows exactly which direction you are looking (or where your phone is pointed), and is able to guide you in the right direction.
At their core, machine learning and augmented reality enable a computer to see, hear, think, make decisions, and move. These capabilities are what is exciting, because they are the groundwork for much more complicated interactions and applications between humans and computers.
Coming down the pipeline are advancements that take what is possible today and vault the experiences into what really seems like black magic. There is software that automatically edits and syncs video without a guiding human hand, and the results are remarkably polished. As for AR glasses, they will blur the line between reality and the digital world by adding or removing objects. Pause and think about the ethical ramifications of a computer being able to alter your “reality” by removing objects that it does not want you to see.
Advancements in hardware components such as Neural Processing Units (NPUs) are also to thank for these advanced capabilities. Have you noticed that your phone now automatically places a square around faces and can even automatically blur out the background? The NPU is what enables facial recognition, as well as things like on-device language translation.
On the software side, advancements are enabling machine learning to run even in toasters and microwaves. Your toaster can now learn what “medium” toast means to you, and your fridge can hear you coughing too much and recommend chicken soup, all without being plugged into a massive GPU or computer. Check out TinyML and Edge Impulse for more info.
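To make the toaster example a little more concrete, here is a toy sketch of the kind of tiny model TinyML makes feasible on a microcontroller: a device collects a handful of labeled examples of user feedback and averages them per label. The data, function name, and numbers below are made up purely for illustration; real TinyML deployments use frameworks like those from Edge Impulse rather than hand-rolled code.

```python
# Toy sketch: a toaster "learns" what each doneness label means to you
# from a few (toast time in seconds, label) feedback samples.
# All names and data here are hypothetical.

def learn_preference(samples):
    """samples: list of (seconds, label) pairs.
    Returns the average observed toast time for each label."""
    totals = {}
    for seconds, label in samples:
        totals.setdefault(label, []).append(seconds)
    return {label: sum(v) / len(v) for label, v in totals.items()}

# Feedback the toaster might have collected over a few breakfasts.
history = [(90, "light"), (120, "medium"), (130, "medium"), (180, "dark")]
model = learn_preference(history)
print(model["medium"])  # 125.0 -> what "medium" means to this user
```

A model this small fits in a few bytes of memory, which is the whole point: no GPU, no cloud, just the appliance itself.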
So what are some other applications and use cases, and who are some of the companies on the leading edge of this revolution? For those of you who want to check out the companies fast, here is the short list:
So What Does the Future Hold?
This WSJ video (from a series called Upstream) does a fantastic job of showing what is possible. While the video doesn’t go into technical details, it covers a wide array of applications and discusses some of the challenges. Which brings us to the point that augmented reality and machine learning are still maturing. We know the machine can now see, hear, think, make a decision, and move, and this can be seen in companies attempting full factory automation (check out for more) as well as in Apple’s AR glasses. If the YouTube concept videos are to be believed, the Apple AR glasses could dramatically change the mobile user interface as we know it. Technically, you wouldn’t need a phone anymore, since the data you want to see could simply be rendered in the glasses. But for now, we sit at the edge of our seats in anticipation of the future.
Augmented Reality + Machine Learning Gaming w/ ForwardGame SCOAAR.
"SCOAAR" is an augmented reality version of air hockey on steroids, and it made its big debut during Apple’s product launch back in 2019. You use your phone to control the air hockey paddle, essentially turning your living room into a massive air hockey table that is viewable from any angle with any number of participants. A fun in-game easter egg is that you can throw fireballs with a flick of the wrist. Check out the video here: https://www.youtube.com/watch?v=wr0Otffr16o
Augmented Reality Platform w/ 1000 Realities.
1000 Realities is a platform that enables developers to create augmented reality applications like Google Maps AR by providing the underlying software for creating a digital twin from photos alone. These applications can be an indoor navigation app, a heavy machinery training app, or even a massive multiplayer game. The base application enables users to create a digital replica of their environment so that digital assets can then be dropped in.
AI Assisted Video w/ GoPro QuikStories / Descript / rawshorts.
GoPro did a great job with their app, which helps users quickly create a summary video from a collection of clips. If you're looking for a quick way to pump out content, this app works well. Another solution with much more control is Descript. Its key feature is that it transcribes what you say into text, and you can then cut out entire portions of the video by simply deleting the text. Finally, rawshorts is kind of the reverse of the first two: it takes your text and adds video to it. This solution lets you create videos much faster by eliminating barriers like people, lighting, and audio.
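The core idea behind transcript-driven editing is simple to sketch: each transcribed word carries a start and end timestamp, so deleting words from the transcript maps directly to cutting those time ranges out of the video. The data and helper below are illustrative only, not Descript's actual API or format.

```python
# Sketch of text-based video editing: words deleted from the transcript
# become cuts in the video timeline. Data here is hypothetical.

words = [
    ("hello", 0.0, 0.5),
    ("um", 0.5, 0.9),
    ("welcome", 0.9, 1.6),
    ("uh", 1.6, 1.9),
    ("everyone", 1.9, 2.6),
]
deleted = {"um", "uh"}  # filler words the editor removed from the transcript

def keep_segments(words, deleted):
    """Merge consecutive kept words into (start, end) video segments."""
    segments = []
    for text, start, end in words:
        if text in deleted:
            continue
        if segments and abs(segments[-1][1] - start) < 1e-9:
            segments[-1] = (segments[-1][0], end)  # extend current segment
        else:
            segments.append((start, end))
    return segments

print(keep_segments(words, deleted))  # [(0.0, 0.5), (0.9, 1.6), (1.9, 2.6)]
```

A video editor would then render only those kept segments back to back, which is why deleting an "um" in the transcript makes it vanish from the footage.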
Realtime Audio Noise Filtering w/ Krisp.
The Krisp app uses machine learning to remove annoying background noise in real time during your voice calls, including paper rustles, chair screeches, and even wind noise. This is perhaps one of the most immediately useful applications of machine learning. And for those who are curious, the technical problem they solved is known as the “cocktail party problem.”
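For contrast, here is the crude classical baseline that deep-learning approaches like Krisp's dramatically improve on: a simple noise gate that mutes any sample whose amplitude falls below a threshold. This is not Krisp's method, and real speech separation (the cocktail party problem) is far harder, but it shows the shape of the task in a few lines.

```python
# A naive noise gate: zero out quiet samples, pass louder ones through.
# This is a classical baseline for illustration, not the ML approach
# Krisp actually uses. Sample values below are hypothetical.

def noise_gate(samples, threshold):
    """Mute samples whose amplitude is below the threshold."""
    return [s if abs(s) >= threshold else 0 for s in samples]

quiet_hiss_with_speech = [0.01, -0.02, 0.6, 0.8, -0.7, 0.015, -0.01]
print(noise_gate(quiet_hiss_with_speech, threshold=0.05))
# [0, 0, 0.6, 0.8, -0.7, 0, 0]
```

The gate fails the moment noise is as loud as speech (a barking dog, a second voice), which is exactly where learned models shine.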
Amazon Dash Cart.
These carts from Amazon have built-in cameras with machine learning that identify the items you toss into the cart in real time. There is no need to interact with a cashier or worry about double-checking the receipt, because the cart already knows what you grabbed, and your credit card is charged as soon as you walk out of the store.
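The checkout-free flow can be sketched as a stream of detection events feeding a running tally, so the total is known the moment you walk out. The prices, item names, and event format below are made up for illustration and have nothing to do with Amazon's actual system.

```python
# Illustrative sketch of a checkout-free cart: camera detections become
# add/remove events against a running tally. All data is hypothetical.

PRICES = {"milk": 3.49, "bread": 2.99, "apples": 4.25}

def run_cart(events):
    """events: list of ('add'|'remove', item). Returns the final total."""
    cart = []
    for action, item in events:
        if action == "add":
            cart.append(item)
        elif action == "remove" and item in cart:
            cart.remove(item)  # shopper put the item back on the shelf
    return round(sum(PRICES[i] for i in cart), 2)

trip = [("add", "milk"), ("add", "bread"), ("add", "apples"),
        ("remove", "bread")]
print(run_cart(trip))  # 7.74 charged at the exit
```

The hard part, of course, is the vision model turning camera frames into those add/remove events reliably; the billing logic itself is trivial.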
Autonomous Grocery Store w/ Nuro.
The next level beyond grocery carts is to jam a grocery store into an autonomous vehicle and have it drive itself around neighborhoods, like the ice cream man used to do. The best part is that these autonomous vehicles can be summoned via an app. Food deserts could become a thing of the past.
Autonomous Construction Surveillance & Vehicles.
Two companies on the bleeding edge of work site surveillance and construction robots are SkyCatch and Built Robotics. SkyCatch recently teamed up with NVIDIA and Komatsu on a project to analyze and predict worksite patterns, enabling work estimations and situational awareness (think large construction vehicles not running over humans). Built Robotics has automated construction machinery, starting with a bulldozer. If the job of clearing a site and building the frame of a building could be automated, humanity could benefit enormously.