Let's look at what will impact consumer electronics in the year ahead, and beyond.
As we begin another year, it’s time for some forecasts on the coming year(s). Like the other year-end pieces I’ve written (see this holiday shopping guide and my retrospective on the past year), the following thoughts are in no particular order.
When you scroll through the categories below, you may sense a pattern. Developments in deep learning will impact a number of applications going forward, including autonomous vehicles, network security, and even elections around the globe.
Sound off in the comments with your thoughts on any or all of them, as well as anything you think I may have overlooked.
Deep learning for images
The ability to pattern-match and extrapolate from already-identified data (“training”) to not-yet-identified data (“inference”) has transformed the means by which many algorithms are developed nowadays, with impact on numerous applications (which explains why the words “deep learning” appear in multiple entries on this list!). Computer vision was one of the first disciplines to enthusiastically embrace deep learning, and for good reason: traditional algorithm development was tedious and narrow in applicability, unable to accurately handle “corner cases” such as off-axis object views, poor lighting conditions, and atmospheric and other distortions and obscurations. Plus, algorithms developed to identify one class of objects often required re-coding from scratch in order to identify a different object class.
Conversely, with deep learning, after you feed a sufficiently sized and robustly labeled data set into a training routine, the resultant deep learning model is able to robustly identify similar objects within that same class. Need to broaden the object suite to be identified? Incrementally train the model with more data. And even if you need to swap out models in order to deliver sufficient identification breadth, the underlying framework can remain the same. One of an increasingly lengthy list of compelling implementation examples, which I came across just the other day while reading a recent issue of Time Magazine, is OrCam’s MyEye 2, which the publication named one of its Best Inventions of 2019. Quoting from the writeup, “The artificial-intelligence device attaches to the frame of any glasses and can identify faces and currency or read text and information from bar codes aloud … OrCam MyEye 2 can also be useful for those with reading difficulties like dyslexia.”
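To make the train-then-infer-then-extend workflow concrete, here's a toy sketch in Python. A nearest-centroid classifier stands in for a real deep learning model (which it emphatically is not), but it illustrates the same pattern: train on labeled data, run inference on new data, then incrementally broaden the object suite without re-coding the framework. All class names and feature values below are invented for illustration.

```python
# Toy stand-in for a deep learning model: train on labeled examples,
# infer on new samples, and extend incrementally with new classes.

class CentroidClassifier:
    """Learns one centroid per label from labeled data ("training"),
    then assigns new samples to the nearest centroid ("inference")."""

    def __init__(self):
        self.centroids = {}  # label -> mean feature vector

    def train(self, samples):
        """samples: iterable of (label, feature_vector). Incremental calls
        with new labels broaden the object suite; the framework is unchanged."""
        by_label = {}
        for label, vec in samples:
            by_label.setdefault(label, []).append(vec)
        for label, vecs in by_label.items():
            n = len(vecs)
            self.centroids[label] = [sum(col) / n for col in zip(*vecs)]

    def predict(self, vec):
        def dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(vec, centroid))
        return min(self.centroids, key=lambda lbl: dist(self.centroids[lbl]))

clf = CentroidClassifier()
clf.train([("cat", [1.0, 0.0]), ("cat", [0.9, 0.1]),
           ("dog", [0.0, 1.0]), ("dog", [0.1, 0.9])])
print(clf.predict([0.95, 0.05]))   # cat

# Need a new class? Train incrementally with more data; same framework.
clf.train([("bird", [1.0, 1.0]), ("bird", [0.9, 1.1])])
print(clf.predict([0.95, 1.05]))   # bird
```

The point isn't the (deliberately simplistic) math; it's that adding "bird" required only more labeled data, not a from-scratch rewrite, which is exactly the contrast with traditional hand-coded vision algorithms drawn above.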
Deep learning for audio
Clusters of image pixels aren’t the only data that deep learning is good at pattern-matching, of course. What about phonemes and other units of sound? Take, for example, Google’s Live Caption, now available on recent models in the company’s Pixel smartphone family. It runs in real time, transcribing both spoken audio and the audio tracks of videos into immediately displayed captions. And equally impressively, it runs entirely on the “edge” smartphone (leveraging deep learning models that can be periodically updated from the “cloud” to improve accuracy and language coverage, of course), meaning that it still works if you’re completely network-disconnected.
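The edge-inference pattern just described can be sketched in a few lines. To be clear, this is not how Live Caption is actually implemented; it's a minimal illustration of the architecture, with a fake token-to-word lookup table standing in for a real speech recognition network, showing how inference stays local while model updates are merely opportunistic.

```python
# Sketch of edge inference with opportunistic cloud model updates.
# The "model" is a fake lookup table standing in for a real network.

class EdgeCaptioner:
    def __init__(self, model):
        self.model = model  # e.g., {audio_token: word}

    def caption(self, audio_tokens):
        """Runs entirely on-device; works even when network-disconnected."""
        return " ".join(self.model.get(t, "[?]") for t in audio_tokens)

    def maybe_update(self, fetch_model, network_up):
        """Periodically refresh the on-device model when a network exists;
        inference never depends on this call succeeding."""
        if network_up:
            self.model = fetch_model()

captioner = EdgeCaptioner({"t1": "hello", "t2": "world"})
print(captioner.caption(["t1", "t2", "t3"]))  # hello world [?]

# Offline: captioning still works; the update is simply skipped.
captioner.maybe_update(lambda: {"t1": "hello", "t2": "world", "t3": "again"},
                       network_up=False)
print(captioner.caption(["t3"]))  # [?]
```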
Deep learning for general-purpose data
Expand your thinking beyond multimedia data, and other compelling deep learning opportunities emerge. My nephew, for example, recently started working for a company called Darktrace, which purports to leverage artificial intelligence in identifying unsanctioned network intrusions, viruses operating on a network, and the like. Admittedly, I don’t understand the company’s products in great detail, but apparently once you install the software, it “learns” what usual network characteristics “look” like and is therefore able to identify and alert IT personnel to deviations from this norm.
Think back to what I said before about conventional versus deep learning-based computer vision algorithms and their respective abilities (or not) to adapt to “corner cases” and identifying new objects. Combine this with your recollection of conventional virus monitoring software you may have used in the past, and the appeal of a deep learning approach to network monitoring becomes obvious. Instead of a “hard-coded” virus scanning approach that doesn’t adapt to new threats until you install a new data patch (and then only retroactively), a deep learning-based approach evolves in real time in response to any variance from the average. Inevitably, I suppose, some degree of “false positives” will come out of this, at least initially, but a robust iterative training scheme will learn from them, too.
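Here's a minimal sketch of that "learn normal, flag deviations" idea, assuming each observation is a single traffic metric (say, bytes per second). Products like Darktrace model far richer features with far more sophisticated methods; this simple z-score detector is purely illustrative, including the step of folding a confirmed false positive back into the baseline.

```python
# Learn what "normal" network behavior looks like, then flag deviations.
import statistics

class AnomalyDetector:
    def __init__(self, threshold=3.0):
        self.baseline = []     # observations of normal behavior
        self.threshold = threshold  # z-score cutoff for an alert

    def learn(self, observations):
        """Absorb observations of normal behavior into the baseline."""
        self.baseline.extend(observations)

    def is_anomalous(self, value):
        mean = statistics.mean(self.baseline)
        stdev = statistics.stdev(self.baseline) or 1.0  # avoid divide-by-zero
        return abs(value - mean) / stdev > self.threshold

det = AnomalyDetector()
det.learn([100, 105, 98, 102, 97, 103, 99, 101])  # typical bytes/sec
print(det.is_anomalous(500))  # True: large deviation from the norm
print(det.is_anomalous(100))  # False: consistent with the baseline

# A confirmed false positive can be folded back into the baseline,
# the iterative retraining mentioned above.
det.learn([500])
```

Note that nothing here is "hard-coded" against a specific threat signature; anything sufficiently unlike the learned norm triggers an alert, which is both the approach's power and the source of its initial false positives.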
For another specific example of the general concept, see a post I recently came across on Slashdot about algorithms capable of identifying death notices in old newspaper pages, then pulling names and other key details into a searchable database useful for folks developing family trees and the like. But lest you get too excited by the possibilities, I also recommend the sanity-check perusal of a recent presentation (PDF), “How to Recognize AI Snake Oil,” from Arvind Narayanan, Associate Professor of Computer Science at Princeton University.
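For contrast with the learned approach behind that death-notice project, here's the sort of brittle, hand-coded extraction it displaces: a regular expression that matches only one rigid notice phrasing. All names and dates below are invented.

```python
# Hand-coded extraction of death-notice records: works only for text
# matching one exact phrasing, unlike a trained model.
import re

NOTICE_PATTERN = re.compile(
    r"(?P<name>[A-Z][a-z]+ [A-Z][a-z]+), aged (?P<age>\d+), "
    r"died (?P<date>\w+ \d{1,2}, \d{4})"
)

def extract_notices(page_text):
    """Pull name/age/date records from OCR'd newspaper text."""
    return [m.groupdict() for m in NOTICE_PATTERN.finditer(page_text)]

page = ("Jane Doe, aged 84, died January 3, 1921, at her home. "
        "John Smith, aged 71, died March 12, 1921, after a long illness.")
for record in extract_notices(page):
    print(record["name"], record["age"], record["date"])

# Any wording the pattern's author didn't anticipate is silently missed,
# exactly the narrowness that motivates a trained model instead.
```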
Autonomous vehicles

Recent (and inevitable future) setbacks aside, numerous established-and-upstart players in the automotive space are continuing to aggressively pursue deployments of increasingly autonomous vehicles. Just recently, for example, Waymo announced significantly expanded public trials of its completely driverless vehicle program in Chandler, Arizona, a suburb of Phoenix. You can now even request a Waymo One robotaxi ride by using an Android or iOS app.
But while the broadly applicable, broadly usable, fully autonomous vehicle remains the “bright shiny object” that continues to capture everyone’s attention, limited implementations of the concept will likely achieve most, if not all, of the nearer-term success. Waymo, for example, is also partnering with AutoNation to deliver parts to nearby auto repair shops, an example of a focused-route application. Already-robustly mapped urban and other similar areas are also likely to see earlier adoptions of autonomy. Yes, eliminating human drivers is fundamentally behind Uber’s ongoing robotaxi investments, and in considering potential customers for such services, I can’t help but think of my father, who died of ALS almost a decade-and-a-half ago. Dad would have welcomed the independence extension that a self-driving vehicle would have afforded, after progressive loss of limb control eventually ended his own driving abilities. The blind, the elderly, the intellectually disabled … the potential candidates are countless.
Starship Technologies launched robot food delivery services at Purdue in September 2019.
Expand your thinking beyond cars, and autonomy’s potential (and near-term reality) expand as well. Mid-last year, for example, we learned that UPS and technology development partner TuSimple had for many months already been conducting self-driving freight-transport truck trials between Phoenix and Tucson. But simplistically speaking, a freight-transport truck is just a really big kind of vehicle. Consider, for example, Garmin’s Autoland system (yes, those guys), which can land a small plane if its human pilot becomes incapacitated: I daresay I don’t need to further explain the value of such a system to the passengers! And then there’s Starship’s delivery robot fleet, which my niece (a sophomore at my alma mater, Purdue University) tells me is in active operation all over campus. She recently gave me an update; they reportedly sometimes get confused, and apparently it’s also fairly easy to snag the food out of one of them en route to its delivery destination (not that she would ever do such a thing, mind you). But the appeal of eliminating costly, unreliable human labor for the delivery task is obvious.
5G

The first wave of 5G-supportive smartphones was launched at and around Mobile World Congress this past spring, and additional (and more optimized) models will inevitably follow in 2020, including, potentially, Apple’s first 5G handsets. But I’m equally excited about the other 5G-enabled devices that may be coming.
In between the development of my earlier 2019 retrospective and this piece, for example, Qualcomm held its annual Snapdragon Tech Summit. In addition to the announcement of the mobile computing and communications device-targeted Snapdragon 765 and 865, the former with an integrated 5G modem, Qualcomm also unveiled its next-generation augmented-plus-virtual reality (“XR”, in Qualcomm terminology) chipset, the XR2. It (like the Snapdragon 865) can be mated with the standalone X55 5G modem for completely untethered AR+VR headset applications. For completeness, I should also note that Qualcomm’s primary competitor, MediaTek, also recently released more complete details on its first 5G chipset, Dimensity 1000, which the company had first “teased” earlier this year.
Special-purpose processors

The tug-of-war between general-purpose processors (whether running software or configured as programmable hardware) and special-purpose hard-wired accelerators in implementing a given function is always interesting to watch, no matter how predictable the outcome often is. Take the MPEG series of video compression algorithms as a legacy example: in each generational case, the decode and even more computationally intensive encode algorithms initially executed on a CPU and/or a programmable logic fabric. However, once the standards firmed up and the market grew to a sufficient size to justify additional development investment, dedicated hardware cores emerged to offload the system processor, at the same time resulting in a much more silicon- and power-efficient ASIC or standard cell approach versus the FPGA-based predecessor.
Much the same thing is now happening with deep learning processing, for example, as a short list of frameworks is beginning to rise to the top of the usage-popularity list and as a common set of features to support them begins to gel. More generally, could you imagine a time when graphics processors didn’t exist and the CPU (in combination with a pixel display engine) instead handled the relevant rendering functions? Conversely, nowadays the GPGPU, as its “general-purpose” moniker suggests, is striving to expand its usefulness beyond graphics, also aspiring to stave off dedicated deep learning accelerators in the process. Look at the image shown above, taken at the Apple A13 SoC unveiling during the launch of the iPhone 11 family, and then consider that many of the functions listed run predominantly-to-completely on special-purpose coprocessor cores versus solely in software on the general-purpose CPU core cluster.
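A quick back-of-envelope calculation shows why this migration keeps happening once a standard stabilizes. Every number below is invented for illustration, not a measured figure for any real chip; the broad point, that a hard-wired block spends far less energy per operation than a general-purpose core fetching and decoding instructions for each one, holds regardless of the exact figures.

```python
# Illustrative (invented-numbers) energy comparison: general-purpose CPU
# versus a dedicated hardware block for the same fixed-function workload.

def energy_per_frame(ops_per_frame, joules_per_op):
    return ops_per_frame * joules_per_op

OPS_PER_FRAME = 2e9  # hypothetical decode workload, operations per frame

cpu_joules = energy_per_frame(OPS_PER_FRAME, 50e-12)  # assumed 50 pJ/op
asic_joules = energy_per_frame(OPS_PER_FRAME, 1e-12)  # assumed 1 pJ/op

print(f"CPU:  {cpu_joules * 1e3:.1f} mJ/frame")
print(f"ASIC: {asic_joules * 1e3:.1f} mJ/frame")
print(f"Advantage: {cpu_joules / asic_joules:.0f}x")  # 50x under these assumptions
```

Multiply a per-frame advantage like this across 30-plus frames per second on a battery-powered device, and the business case for dedicated cores, whether for video codecs yesterday or deep learning inference today, becomes clear.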
Batteries

Although autonomy and an electric powertrain don’t always go hand-in-hand, it’s a safe bet that an increasing percentage of autonomous platforms going forward will be rechargeable battery-powered. Battery technology development and design implementation involve a tricky balancing act between a number of seemingly contradictory variables.
And there’s at least one other notable opportunity to consider as battery technology continues to decrease in price and otherwise enhance its value proposition. Many renewable power generation sources, although not all (witness geothermal, for example), are intermittent rather than constant in their output: the wind doesn’t always blow, the sun doesn’t shine at night (or as strongly in cloudy conditions), tides vary in strength and direction, etc. Intermediary batteries can buffer and therefore smooth out this otherwise inconsistent pattern.
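A toy simulation makes the buffering idea concrete: a battery absorbs surplus solar output by day and covers the shortfall at night, smoothing delivery to a constant load. The generation profile, load, and capacity figures are all invented for illustration.

```python
# Toy grid-storage simulation: a battery smooths intermittent generation
# into (mostly) constant delivery. All figures are invented.

def simulate(generation, load, capacity, soc):
    """generation: energy produced per time slot; load: constant demand
    per slot; capacity: battery size; soc: starting state of charge.
    Returns (energy delivered per slot, final state of charge)."""
    delivered = []
    for gen in generation:
        if gen >= load:
            soc = min(capacity, soc + (gen - load))  # charge with surplus
            delivered.append(load)
        else:
            draw = min(load - gen, soc)  # discharge to cover the shortfall
            soc -= draw
            delivered.append(gen + draw)
    return delivered, soc

# Hypothetical solar output over a day, in kWh per two-hour slot.
solar = [0, 0, 0, 2, 5, 8, 8, 5, 2, 0, 0, 0]
served, final_soc = simulate(solar, load=2.5, capacity=10.0, soc=5.0)
print(served)
```

Run it and the shape of the problem pops out: daytime surplus fills the battery (capped at its capacity), nighttime demand drains it, and delivery only falls short when the battery runs dry, which is why capacity sizing, and hence battery cost, is the crux of the grid-storage value proposition.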
Privacy

I showcase Apple’s products (the iPhone 11 family, in this particular case) in the image for this section because the company has been a particularly strong, longstanding advocate of consumer privacy in the face of pressures from both law enforcement and profit (i.e., figure out who the users are and what they’re doing and interested in, and in response serve them up tailored ads, share and exchange data with other companies, etc.). Unfortunately, Apple’s actions are sometimes seemingly inconsistent with its mission statement, especially when they have the fiscal potential to benefit the company versus a partner such as Facebook. And yes, the subject of encrypted data and communications channels is once again heating up, with federal, state, and other law enforcement agencies demanding “backdoors” and tech companies pushing back. Expect this tension to increase, not lessen, in the year(s) to come.
Deepfakes

“Deepfakes,” i.e., the altering of still image, video, audio, and other data to distort the reality originally captured within, are nothing new, as this frame from the 1994 film Forrest Gump shows. But the phenomenon has exploded in recent years, along with becoming dramatically more realistic as deep learning algorithms are brought to bear to, for example, map one person’s face onto another person’s body, or to generate synthetic speech phrases that may never have been uttered in real life (and then potentially even map them to a person’s mouth movements in a video sequence). The various shenanigans that went on around the 2016 election cycle in the United States are by this point already well documented (and my opinions on them are unnecessary to document), and will inevitably further escalate as the 2020 election cycle further ramps up. And this isn’t a US-only phenomenon, of course; as I was writing this, evidence emerged of Russian meddling in the UK elections. The bottom line, sadly, is that when many folks see something in their social media feed that supports their pre-determined conclusions on a particular person or topic, they won’t even bother sorting out whether it’s truthful or not. They’ll internalize it as “fact,” and pass it along to their online communities.
As I said in the introduction, please sound off in the comments with your thoughts on my thoughts and any additions you think should be on this list. And thanks as always for reading!
—Brian Dipert is Editor-in-Chief of the Embedded Vision Alliance, and a Senior Analyst at BDTI and Editor-in-Chief of InsideDSP, the company’s online newsletter.