Phil Houston knows how to spot a lie. A former career CIA officer who’s sometimes been called the human lie detector, he literally wrote the book on detecting deception. His techniques are used by U.S. intelligence agencies, businesses and billionaires.
Now he’s seeking to harness the power of artificial intelligence to supercharge his system—and market it to hedge funds listening to earnings calls, investigators interviewing suspects, employers considering potential new hires, or anyone else trying to discover duplicity.
There’s a long history of hype around technologies that promised to turn lie detection from an art into a science, only for the techniques to fall far short. And there are plenty of doubts about whether AI will be as transformational as some believe.
But Houston is among those at a handful of startups who say the technology is on the verge of providing a major breakthrough in their corner of forensics. His new company, CyberQ, uses generative AI to parse transcripts and analyze statements for tell-tale signs of deceit. In just seconds, he says, it can detect misleading statements with 92% accuracy.
“Any organization that deals with human beings, which means their employees and others, and faces situations where deception has a negative impact on their business model, on their business operations and so forth, can use this,” he says.
CyberQ is just one of a crop of new companies looking to build AI-powered lie detectors. Others, including Arche AI, Coyote AI and Deceptio.AI, are attempting something similar and claim comparable results.
What makes CyberQ stand out is its executive pedigree. Houston spent 25 years at the CIA as both an investigator and polygrapher and developed the lie-detection methodology that’s still used by government agencies today. His co-founder and COO, Susan Carnicero, taught his system during a more than three-decade stint at the CIA. Simon Frechette, CyberQ’s CEO, previously worked with the two at QVerity—Houston’s earlier startup—which provides deception-detection services and consulting.
Together, they’ve trained “Q”—as the chatbot is known—on hundreds of validated examples of truths and lies. The training material included police interviews with OJ Simpson, testimony from Enron Corp. CEO Jeffrey Skilling’s infamous congressional appearance, and statements from the management of Wirecard, a German company that imploded in scandal.
One of the biggest challenges, Frechette says, was eliminating false positives, or truthful statements that Q labeled as lies.
That’s in part because finding examples of seemingly deceitful behavior that proved to be truthful was tough. Aaron Quinn, whose then-girlfriend was abducted from their California home in 2015, provided a rare example of speech that appeared suspect, but ultimately proved to be true.
On the surface, the software’s operation appears pretty simple: CyberQ’s engineers enter a prompt—Houston won’t reveal what it is exactly (“That’s the secret sauce,” he says)—and upload a text for it to review. Clients get back a marked-up version of the transcript with three potential results: no deceptive behavior indicated, deceptive behavior indicated, or follow-up needed.
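Houston won’t share the actual prompt or the model behind Q, so the sketch below is only a guess at the general shape of such a pipeline, not CyberQ’s implementation: a transcript is wrapped in an instruction prompt, sent to a generative model, and handed back with one of the three labels attached to each statement. The PROMPT text, the call_model stub and the review function are all hypothetical.

```python
from dataclasses import dataclass

# The three possible results CyberQ returns for a reviewed transcript.
LABELS = ("no deceptive behavior indicated",
          "deceptive behavior indicated",
          "follow-up needed")

@dataclass
class MarkedStatement:
    text: str
    label: str  # one of LABELS

# Stand-in for the proprietary instruction prompt Houston won't reveal.
PROMPT = ("Review each statement in the transcript and label it as one of: "
          + "; ".join(LABELS))

def call_model(prompt: str, transcript: str) -> list[MarkedStatement]:
    """Hypothetical stand-in for the generative-AI call. A real pipeline
    would send prompt + transcript to an LLM and parse its reply; here
    every statement is simply marked for follow-up."""
    statements = [s.strip() for s in transcript.split(".") if s.strip()]
    return [MarkedStatement(s, LABELS[2]) for s in statements]

def review(transcript: str) -> list[MarkedStatement]:
    # Clients upload a text and get back a marked-up version.
    return call_model(PROMPT, transcript)

if __name__ == "__main__":
    for item in review("I would never do that. As I told the last guy, I was home."):
        print(f"[{item.label}] {item.text}")
```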
To understand what is driving these determinations behind the scenes, it helps to know a bit about the lie-detection system that forms the backbone of Q’s training. While many think the craft involves zeroing in on a set of physical behaviors—like an averted gaze or a nervous fidget—that’s not the typical approach taken by experts.
“Eye contact for us is not a deceptive behavior,” says Carnicero. “We all know to look somebody in the eye to make you look truthful. And the bad guys know that as well. So they’ll use it against you.”
Instead, Q looks for specific linguistic patterns that are more reliable tells. They include things like referral statements (“As I told the last guy …”), qualifiers (“To be perfectly honest …”) and lies of influence (“Don’t you know who I am?”). Other signposts include a failure to answer a question, providing overly specific answers, a lack of specific denials, and so on.
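Those categories lend themselves to a rough illustration. The Python sketch below is not Q’s model, which is a trained generative-AI system rather than a rule list; it is just a toy keyword screen showing how statements might be flagged for a few of the verbal tells described above, with hypothetical phrase patterns.

```python
import re

# Toy illustration only: example phrase patterns for a few of the
# verbal tells described in the article. The categories come from the
# text; the specific patterns and this screening approach are hypothetical.
TELLS = {
    "referral statement": [r"\bas i (told|said to)\b", r"\blike i said\b"],
    "qualifier":          [r"\bto be (perfectly |completely )?honest\b",
                           r"\bfrankly\b", r"\btruthfully\b"],
    "lie of influence":   [r"\bdo you know who i am\b",
                           r"\bi am an honest person\b"],
}

def flag_statement(statement: str) -> list[str]:
    """Return the tell categories whose patterns appear in the statement."""
    text = statement.lower()
    return [name for name, patterns in TELLS.items()
            if any(re.search(p, text) for p in patterns)]

if __name__ == "__main__":
    for line in [
        "As I told the last guy, I was home all night.",
        "To be perfectly honest, I never saw the money.",
        "I deny taking anything from that office.",
    ]:
        hits = flag_statement(line)
        print(f"{line!r} -> {hits or 'no tells matched'}")
```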
Human behavior is, of course, not entirely predictable and has foiled previous efforts to create a technological truth serum.
Polygraphs, which measure physiological responses like heart rate, blood pressure and sweating, are a case in point. They became ubiquitous in pop culture and were widely deployed. But their use in the private sector was restricted by federal law in 1988, and a decade later the U.S. Supreme Court upheld keeping them out of military proceedings, citing studies concluding the results were “little better” than a coin toss.
Houston, who taught polygraphy at the CIA, acknowledges the technology’s shortcomings. And while there aren’t yet rules restricting the use of AI lie detection in either the courtroom or the employment process, he sees Q as a tool that could help investigators focus their attention in the right place, not necessarily a way of arriving at definitive conclusions.
With polygraphs, “you have to answer every question, either yes or no. It’s a one-word response,” he says. “Now it has great value beyond that in terms of identifying if someone is telling the truth or not to that one-word response. And it also can be used as a little bit of a wedge, a psychological wedge, because [then] you tell someone ‘OK, now I know you’re lying.’”
In theory, Q can go further than the polygraph—not only by providing pressure points for interviewers but also by identifying specific statements that are worth following up on.
“That opens up a whole new world for the interrogator,” Houston says.
“For instance, if someone says ‘Oh, I’m an honest person, I would never do that.’ Well, that’s a deceptive behavior, but it also signals to us that that person may be worried about their reputation as much as anything. So we can incorporate that into our interview model to help give additional information,” Houston adds.
Importantly, Houston claims that Q can evaluate truthfulness based purely on the spoken words of its subject, making it potentially less prone to subconscious prejudice than a human interrogator.
While Q showed some initial signs of bias in the early stages of its development, Houston says, it has become fairer and more accurate over the course of its nine months of training. For proponents of such technology, using a faceless robot to analyze spoken words, without any real idea of who is saying them, is a feature, not a bug.
“The original model at the agency was designed to have global application,” Houston says. “It was designed to transcend culture and age and anything to make it as objective as we possibly can. Having said that, it’s not anywhere near the objectivity that Q has, because we’re human. You know, we’re going to go in with some preconceived notion when we ask the question. As humans, we all have biases about the person we are talking to.”
This article was provided by Bloomberg News.