The AI Whispered Your Name …


You have probably been the unwitting protagonist in the following social horror movie. A stranger approaches you and begins chatting animatedly, like an old acquaintance or confidant. Meanwhile, you nod vaguely, half-listening while seized by internal panic. You frantically attempt to triangulate a face, a name, and a context. It is a study in cognitive dissonance, intimacy without recognition.

One on one discussion in a larger group

Do I know this person? If so, from where? Is this a neighbor? A former academic colleague I haven’t seen in a decade? Or is it the contractor who repaired my roof in 2012?

As the putative stranger continues speaking, you pray they will offer some clue – any clue – that would help you place them in your social milieu.  You rapidly apply context filters to narrow the sample space. This is a university event, making it unlikely either my roofing contractor or my neighbor is here.

Fair warning: such filters are fallible. I was once approached by a neighbor at a university function; the out-of-context encounter left me chagrined when I failed to recognize them, forcing them to self-identify.

Even worse is the “fading memory” scenario. You vaguely recall the introduction, but the name has evaporated. Too much time has passed to interject, however politely, “Who are you?” All the while, embarrassment and social mores dictate that you keep a smile on your face, pretending you are delighted to see them.

The Coping Heuristics

Over the years, I have developed several survival strategies that can extract lost information without violating social norms.  If I am with my wife or a colleague, I use the “introduction gambit.” I introduce my companion, hoping the unknown individual will respond with a name and social cue. 

Cosmo Spacely from the Jetsons

You know the drill, “I’d like you to meet my wife, Jane Jetson.” 

If lucky, they respond, “Hi, I’m Cosmo, from Spacely Space Sprockets.” 

If less fortunate, the reply may be a terse, “Nice to meet you, Jane.” 

A savvy companion, one who has played this game before, will then rescue you with a follow-up, “My pleasure. You are?”

Now, you may not care about Cosmo or his company, being a major stockholder in the competing company, Cogswell Cogs, but you have now satisfied the social contract and you have the basis for more than mumbled platitudes.

My second coping strategy relies on the comfort and certitude of badges and name tags. Conferences typically provide a lanyard and a badge with the person’s name and organizational affiliation, perhaps augmented with which aspects of the event their registration fee covered. At events where attendees are not registered, there may be some version of the ubiquitous “Hello, My Name Is …” peel-off nametag, on which the wearer has written their name.

Hello, my name is Dan

For the record, and on behalf of those with fading vision and/or prosopagnosia, please print the names on badges and nametags in LARGE, HIGH CONTRAST FONTS. That way, a furtive glance is sufficient to say, “Hello, Cogswell. It’s nice to see you here at the Cogs Conference.” Otherwise, the all-too-obvious lean and squint give your ineptitude away.

However, the most important strategy is a simple one – nothing replaces basic respect. Whether you know the individual or not, they have sought your ear, and they deserve your attention in return.  Meet their gaze; listen thoughtfully and attentively; and respond politely and respectfully. The fact that your feet hurt and your voice is hoarse is not their concern.

Social Networks and Scaling Laws

The inability to recognize individuals and remember names is not just a matter of age and declining vision; it is an evolutionary byproduct. Not surprisingly, evolutionary biologists have long studied social networks in primates. They theorize that humans are only cognitively capable of maintaining about 150 stable relationships (i.e., where an individual knows the identity of each person in the group and how that person relates to every other person in the group).

Dunbar’s Number, roughly 150, marks the cognitive limit for stable social relationships. Based on primate studies, this limit is likely rooted deep in our evolutionary history, suggesting a hard biological constraint on social scalability. The constraint was functional in a hunter-gatherer culture, or even in a rural agrarian society. Today, it highlights the scalability challenges of a technologically connected culture, where social networks are vast and loosely connected at best.
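Dunbar’s definition is stricter than it first appears: a group member must track not only 150 individuals, but every pairwise relationship among them. A quick sketch (illustrative only) shows how fast that bookkeeping grows:

```python
# Pairwise relationships in a fully connected group of n people,
# where everyone knows how everyone else relates: n * (n - 1) / 2.
def pairwise_relationships(n: int) -> int:
    return n * (n - 1) // 2

for n in (15, 50, 150):
    print(n, pairwise_relationships(n))
# At Dunbar's Number of 150, that is 11,175 distinct relationships
# to keep straight, far beyond merely 150 names and faces.
```

The quadratic growth, not the raw head count, is what makes village-scale social bookkeeping feasible and internet-scale bookkeeping hopeless.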

This context game becomes impossible if you are a public figure, no matter how minor. In such cases, hundreds, perhaps thousands of people know who you are, by face, by name, by position, or by actions.  If you are a national figure – politician, actor, musician, or athlete – millions or even billions of people may recognize you when in public. In such cases, personalized reciprocity is impossible; you must rely on generic greetings.

Been There, Did That

As a senior university leader, I have personally experienced the asymmetry of knowing and being known. Random faculty members regularly approached me at university events, eager to opine on some academic issue. Listening and responding was part of the job, no matter how arcane the topic.

Part of my university role has also been to offer reassuring words to concerned and hopeful parents who were leaving their firstborn at the big university. I have also posed for photographs and made small talk with thousands of happy college graduates on commencement day. Both the parents and the students knew who I was – or at least the office I held – but I knew none of them, and it was unlikely that I ever would.

One of my other academic duties was to “work” athletic events, usually football. Before each game, the university development office (i.e., the fundraising arm) would send me a dossier of one-page “cheat sheets” – a biographical sketch of each donor I would be hosting in a luxury box overlooking the field. Each sketch included an image, family relationships, business successes, hobbies, university connections, a net wealth estimate, and possible philanthropic interests. It sounds a bit mercenary, but it is how industrial philanthropy functions. (If you are wondering, I rarely saw much of the game. That was not why I was there.)

Sometimes I knew the individuals by reputation, but more often not, and occasionally I was surprised.  At one football game, I met a former computer company chief technology officer (CTO), one I had interacted with while I was a vice president at Microsoft. We saw one another and each did a double take.  He walked up to me and said, just as I was about to utter the same words, “What are you doing here?” The short answer: he had retired to the area, and I had returned to academia.  Neither expected to see the other in this context.

Protocol Officers and Performative Intimacy

Highly visible public figures are acutely aware of the social network asymmetry: everyone knows them, but they, by comparison, know very few. Hence, they have “handlers” – people who protect them from unwanted intrusions – and “protocol officers” – people who whisper in their ear and introduce vetted individuals – reminding them of critical facts so every greeting feels like the warm embrace of an old friend.

In formal settings, the latter sometimes takes the form of a receiving line, where the public figure greets visitors one by one, while an aide leans in to whisper, “This is Jane Jetson.  She has a daughter, Judy, who just graduated and a son, Elroy.” The figure then beams and says, “Jane! It is so good to see you! You must be so proud of Judy and your boy, Elroy!”

It is the classic “grip and grin,” a performance of intimacy fueled by data. For master politicians, it is an art form of the highest order, the reality distortion field in action.

The Intelligent Memory Assistant: An AI Precursor

Back in 2008, I asked my friend and colleague, Dennis Gannon, to lead a project I called the Intelligent Memory Assistant, as part of the Microsoft eXtreme Computing Group (XCG).  As the XCG moniker suggests, our role was to explore the unusual, whether it be low-power cloud servers, hardware search accelerators, new programming models, or wireless spectrum options.  We were explorers of the future.

The Intelligent Memory Assistant aimed to democratize the protocol officer: an AI that whispers names, events, and context into your ear as an unobtrusive extension of memory. Whether for names and faces or grocery list items, the “deer in the headlights” panic of forgetfulness would be banished forever.

Dan Reed wearing a Microsoft SenseCam

Years earlier, before joining Microsoft, I had worn a Microsoft SenseCam, which recorded a stop-action video of daily life.  It was a project launched by the late Gordon Bell. It raised a plethora of privacy issues and gave me a sense of what might be technically possible. (See Holiday Explanations: So, What Do You Do?)

Our design partitioned the Intelligent Memory Assistant code between a Windows Phone and the cloud, Microsoft Azure. This partition leveraged the smartphone camera for local context and processing, used Bing for search and summary generation, and early AI research from Microsoft Asia for face identification. We envisioned the supporting hardware as an unobtrusive buttonhole camera for image capture, a bone-conduction microphone for subvocal queries, and a small earpiece for audio summaries. All three devices would be connected to the smartphone via Bluetooth.  (We were geeks, but we were trying hard not to build a Borg on a budget.)

Microsoft XCG Intelligent Memory Assistant

The camera would see a face, the cloud would identify it via image classification, and the AI system would whisper in your earpiece, “This is Cosmo. You met him in 2004. Ask him about his sprocket business.” We deliberately avoided augmented reality and heads-up displays, because another group at Microsoft was already working on HoloLens. Had we pursued them, an overlaid heads-up display could have shown recognized individuals and their attributes.
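As a thought experiment, that capture-identify-whisper flow can be sketched in a few lines of Python. Everything here – the face hashes, the dossier fields, the whispered phrasing – is illustrative; it is not the project’s actual code or APIs.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Dossier:
    """Cloud-side summary returned for a recognized face."""
    name: str
    last_met: str
    talking_point: str

# Stand-in for the cloud face database and image classifier.
FACE_DB = {
    "face-hash-42": Dossier("Cosmo", "2004", "his sprocket business"),
}

def identify(face_hash: str) -> Optional[Dossier]:
    """Phone uploads a face signature; the cloud returns a dossier or None."""
    return FACE_DB.get(face_hash)

def whisper(d: Dossier) -> str:
    """Audio summary sent back to the earpiece."""
    return (f"This is {d.name}. You met in {d.last_met}. "
            f"Ask about {d.talking_point}.")

hit = identify("face-hash-42")
if hit:
    print(whisper(hit))
# prints: This is Cosmo. You met in 2004. Ask about his sprocket business.
```

The hard parts in 2008 were everything this toy elides: the face database, the classifier accuracy, the wireless round-trip latency, and the text-to-speech quality.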

The Technical Reality Check

In retrospect, we were simply too early – technically and sociologically. Dennis and the XCG team did phenomenal work, but interaction latency was too high; the face databases were too small; the text-to-speech generation was poor; the AI was not yet ready; and the digital protections were non-existent.  Recognizing these challenges, we ultimately canceled the project, but not before we learned a great deal.

We were still years away from Rick Rashid’s 2012 speech in China, when Microsoft Research (MSR) software, using an early deep neural network, simultaneously translated his closing remarks from English to Mandarin. At the time, it was a feat worthy of international news. (You can watch a video snippet here; note how carefully Rick enunciates and how slowly he speaks.)

As an aside, I have also spoken to international audiences with a human translator.  It is a delicate dance, where you must take care to limit the length of your sound bites, allowing the translator to keep pace and using only technical jargon known to both. Try explaining “out of order execution with scoreboarding” and watch the light die in their eyes.

Jargon is the bane of any translator, which is why they much prefer to have a written transcript of your remarks as a reference.  If you doubt me, turn on the speech-to-text option on your television during a live sporting event and watch the AI struggle with the more arcane terms of art in the sporting world.

The XCG team was also hampered by wireless bandwidth.  4G cellular service was just appearing, and image uploads and downloads were bandwidth constrained.  Recognizing this, we were also pursuing unlicensed white spaces broadband, which used portions of the spectrum being vacated during the analog to digital television transition.  Alas, that white spaces approach never gained serious traction, despite multiple trials and its obvious appeal for rural, underserved areas.  (See White Spaces and Adaptive Communication and White Spaces: Celebrating the Cambridge Trial.)

Then, as now, Bing was a distant second to Google search, with a smaller base of supporting infrastructure. Although search relevance was high – something of which the Bing team, then led by Satya Nadella, was justifiably proud – image search was still difficult and the database of faces was small. As a result, image hit rates were low for all but the famous. (Two other XCG teams, led by Doug Burger and Jim Larus, built FPGA accelerators and a domain-specific language to accelerate Bing, but that is a story for another day.)

However, the killer constraint was the then-nascent state of artificial intelligence. The almost instant, highly accurate image recognition and conversational query responses of today’s AIs were then but an elusive dream, and the deep neural network revolution was still years away.

Remember the context. In the early 2000s, highly accurate image recognition was still an unsolved problem. In 2012, AlexNet stunned the AI world in the ImageNet Large Scale Visual Recognition Challenge, beating the runner-up by over 10 percentage points. It was a vision breakthrough based on convolutional neural networks that launched the modern era of deep learning, a movement that later culminated in today’s generative pre-trained transformer (GPT) models.

There Are No More Strangers

Beyond the technical hurdles, the Intelligent Memory Assistant project provided a concrete framework for discussing our parallel Microsoft work streams in digital privacy and security and the need for ubiquitous broadband.

Suddenly, that small graph of roughly 150 personal acquaintances – your personal Dunbar number – is potentially a globally connected graph of billions of individuals. You would know everyone, even if you had never met them before; equally tellingly, they would all know you. Everyone becomes a potential panopticon guard, the fear of that other prognosticator, George Orwell, though even he might not have predicted that we would happily pay for the cameras that watch one another.

This raises important questions. What do those billions know about you? What do you know about them? What should you know about one another? Do you have any say in the matter? Or is your digital persona an entity created by others, defined only in part by your actions? Are you just an accretion of digital detritus, bread crumbs scattered across the infosphere?

If this sounds a bit like a Philip K. Dick science fiction novel, one that leaves you fearful while reading, even in the bright light of day, you would be right.

In the local village, you knew everyone, and you (and the village elders) could directly shape and reshape their perception of you. For a Wikipedia page, reality is, for better or worse, an evolving community perspective, what science fiction writer Bruce Sterling called “the major consensus narrative.” For AI-synthesized summaries, the data provenance is unknown, as are the model’s trained, implicit biases. Even more worrisome, correcting errors of commission or omission is difficult, if possible at all.

Quantitative change in macroscale data scale and accessibility brings qualitative change in microscale human interactions. When anonymity becomes impossible, the nature of public interaction changes fundamentally. It raises still unresolved questions about what constitutes a public figure and the nature of digital privacy.  If a stranger on the street can pull up your resume, your net worth, and your family tree simply by looking at you, does privacy even exist?

We learned many things from the Intelligent Memory Assistant Project; one of them was the need for a more nuanced perspective on privacy and individual control.  At the time, I argued that data access should no longer be binary, but context sensitive. In particular, I suggested the need for three foundational rules:

  • Claims-based access:  You may use this piece of my data for this purpose only
  • Bounded lifetime: You may use this piece of my data only during this period
  • Limited transitivity:  You may not transfer the rights I grant you to another entity
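The three rules translate naturally into a machine-checkable grant. The sketch below is illustrative only – a toy policy check, not a real access-control system – but it shows how purpose, lifetime, and non-transfer can all be enforced at the point of use:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Grant:
    grantee: str        # the one entity granted rights (limited transitivity)
    purpose: str        # claims-based access: this purpose only
    expires: datetime   # bounded lifetime: usable only before this moment

    def permits(self, requester: str, purpose: str, now: datetime) -> bool:
        # Rights cannot be transferred: only the original grantee may ask,
        # only for the stated purpose, and only within the granted window.
        return (requester == self.grantee
                and purpose == self.purpose
                and now < self.expires)

g = Grant("conference-app", "badge-printing", expires=datetime(2026, 1, 1))
print(g.permits("conference-app", "badge-printing", datetime(2025, 6, 1)))  # True
print(g.permits("data-broker", "badge-printing", datetime(2025, 6, 1)))     # False: transfer
print(g.permits("conference-app", "ad-targeting", datetime(2025, 6, 1)))    # False: purpose
print(g.permits("conference-app", "badge-printing", datetime(2027, 1, 1)))  # False: expired
```

The hard problem, of course, is not expressing such a grant but enforcing it once the data has left your hands.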

See Information Privacy: Changing Norms and Expectations for longer thoughts on these ideas.

Do not despair; there are some protections, one of which I recently encountered inadvertently. I asked Google Gemini to create a caricature of me from public sources. To my surprise, Gemini demurred, even when I insisted that, as Dan Reed – verified via two-factor authentication – I gave it permission. When I asked why, it told me I was a public figure. Surprised, I asked why it considered me a public figure; it cited my public roles in science policy.

A few weeks later, Gemini reversed course, saying it now had a more nuanced understanding of “public figure,” one distinguishing politicians from others, and that it could now generate a caricature. The refusal and subsequent reversal, triggered by the model’s safety or policy heuristics, illustrate how opaque classification rules can shape what AIs will or will not do with our identities, and how little control individuals may have over those decisions.

Telling the Future the Past

In the fall of 2023, the New York Times ran a story about similar internal efforts at Google and Meta (Facebook) that leveraged the same, albeit more mature, technologies we had pursued at Microsoft. The article, entitled “The Technology Facebook and Google Didn’t Dare Release,” highlighted some of the same social, ethical, and legal issues we had struggled with years before; both companies declined to productize the technology for fear of public backlash.

Google had, of course, faced its own public relations fiasco with Google Glass, its augmented reality glasses. Consumer reaction, outside the technorati, was brutal. For a brief time, the portmanteau – glasshole – entered the public vernacular. (An inveterate early adopter, I had a pair; see Through A Google Glass, Darkly for my perspective.)

Lest this seem only a tale of what might have been, anyone can build a version of the Intelligent Memory Assistant we once conceived, using publicly accessible databases of human faces (e.g., PimEyes), any one of several large language models (e.g., ChatGPT, Gemini, Claude, or Copilot), and a smartphone or an augmented reality device such as Meta’s Ray-Ban glasses.

As described in another New York Times article from 2024, a group of Harvard students did just that, using Meta’s Ray-Ban glasses for the camera and heads-up display.  Their goal was to raise awareness of privacy risks, and it only took them four days to write the code.  Today, one of the AIs might generate that code in response to a single query. You can watch a brief video here.

With a few important exceptions, the right to take photographs of individuals in public spaces, without their consent, is a matter of settled law in the United States, and, outside of commercial advertising, there are relatively few legal prohibitions on the non-commercial use of such images. In the European Union, the General Data Protection Regulation includes, under certain circumstances, the “right to be forgotten” (i.e., removal of personal data from certain public sources). Legality, however, is distinct from public acceptance, particularly for the vast majority of people who are not public figures.

Meta Ray-Ban AR glasses

As of early 2026, both trade stories and the mainstream press report that Meta is integrating native facial recognition into its Ray-Ban glasses, ironically under a project reportedly called “Name Tag.” The primary privacy safeguard would likely be a small LED that pulses while recording, though which groups of people the glasses might be allowed to recognize is still a topic of debate. Meanwhile, Apple is reported to be introducing both smart glasses with an AI companion and an AI pendant with cameras, a speaker, and multiple microphones, as extensions to the iPhone.

It is a brave new world when your face itself is now a tracking device. When coupled with augmented reality, image capture and face identification can be as simple as gazing at you.

Nearly twenty years after the Microsoft Intelligent Memory Assistant project, those thorny issues of privacy and social norms remain disconcertingly unresolved. The question is no longer whether we can build the panopticon, but how we will constrain it. Civil liberties advocates are rightly concerned, and both the U.S. Congress and multiple states are considering legislation.

Excuse me, what was your name again?


Discover more from Reed's Ruminations: The Past, Present, and Future

Subscribe to get the latest posts sent to your email.

Please leave a comment …
