Separate Fact from Fiction; Hype from the Help about AI
Go Beyond the Hyped-up Headlines and Over-promised Opportunities About AI
What can we really expect with AI? Will it replace humans? Will it take over the world?
With the recent hype and excitement about ChatGPT and AI, Matt talks with an AI expert, Cal Al-Dhubaib, about the realities (and limitations) of AI technology. You’ll be surprised when you learn the limitations of AI, rather than the breathless reporting and promises being made by people that don’t understand AI.
Cal Al-Dhubaib is an award-winning data scientist who builds trustworthy AI. He is
[00:00:00] Cal: Here’s a fact about AI, and it hasn’t changed much in the past few years. Somewhere between 80 and 90% of AI projects in the real world fail to deliver on business results, fail to deliver an ROI.
[00:00:17] Matt: Wow.
[00:00:18] Cal: There’s maybe one other profession that’s that bad, and that’s the weather people. Weather prediction, you can afford it. Especially, I dunno, especially here in the Midwest.
[00:00:26] Matt: Yes.
[00:00:27] Cal: You can still be wrong and have a job. Yes. So, you know, data scientists are right there with them.
[00:00:32] Voice over Intro: Welcome to Endless Coffee Cup, a regular discussion of marketing, news, culture and media for our complex digital lifestyle. Join Matt Bailey as he engages in conversation to find insights beyond the latest headlines and deeper understanding for those involved in marketing. Grab a cup of coffee. Have a sit. And thanks for joining.
Well, hello and welcome to another edition of the Endless Coffee Cup podcast. As always, I’m your host, Matt Bailey, and looking forward to another great conversation. So I hope you have a nice hot cup of coffee or a beverage of your choice and are ready to settle in and learn today about AI.
[00:01:13] Matt: You know, unless you have been under a rock, AI has the headlines everywhere. There’s so much going on and, and just this past week, I mean, ChatGPT has exploded. And if you’re like me, I had to spend about an hour just kind of drilling into what is going on, what’s going on behind the headlines, and it just so happens we had this podcast scheduled, I think a couple months in advance, Cal. So what great timing, to have you on board here at the Endless Coffee Cup.
My guest here today is Cal Al-Dhubaib. And Cal has done a lot of work in AI. And, and Cal, I’m just gonna let you give us an intro about yourself and, and some of what you’ve been doing in the industry.
[00:01:56] Cal: Sure, and I’ll, I’ll just start with a, a remark that, oh my gosh, for the first time in my life, my job is now finally mainstream. Then, trying to explain what you do at Thanksgiving to the family, this was the year where everybody’s like, oh, I get it.
[00:02:09] Matt: That’s awesome.
[00:02:10] Cal: I’m the founder and CEO of a company called Pandata. And what we do is we help organizations design and develop machine learning models and these AI systems in high-risk environments. So healthcare, financial services, education, energy, um, environments where data is sensitive, where the application risk might be, uh, a little bit higher. And so we’ve had to develop an expertise in, like, building these models.
And here’s the thing, I started the company in 2016, at a time when I had to try to explain what the data science profession is. Which is almost now ubiquitous. You hear data scientist, you’re like, ooh, that’s a good job. You know?
Then it was like, wait, math and programming with business, what are you doing? So it’s been a really fulfilling journey and, and I’m really excited to dive into some of the evolutions I’ve seen in the world of machine learning going from niche to mainstream.
[00:03:07] Matt: Absolutely. I love the Thanksgiving dinner story. I joked that my mother, after 25 years, still does not know what I do, and still cannot describe it. You know, at one point she’s like, he does something with Google. I’m like, oh mom, come on.
[00:03:21] Cal: Yeah.
[00:03:21] Matt: How would you summarize, how do you explain machine learning and AI over the dinner table?
[00:03:28] Cal: So, I’ve spent much of the last seven years in education mode, like educating others on what this thing is.
And so I’ve gotten really good at the grandmother test, is what they call it. Can you explain it at the dinner table and kind of get people to say, ah, I get it? So the best explanation, for me: artificial intelligence, it’s nothing more than software.
And it does two things. It recognizes patterns and it reacts to patterns. So these patterns can be things in boring old spreadsheets. Um, they can be things like images or text. They can be a little bit more sophisticated, like audio and video, but you’re looking for patterns that are consistent.
Maybe trying to spot things that are a little bit harder to describe than just simple rules, right? If I see this, then I do this. And then you’re trying to react to them in some sort of way. Maybe you’re trying to, to categorize, hey, this is a widget, this is a cat. I’ve got a cat in my background right now. Maybe you’re trying to say something a little bit more sophisticated, like you’re predicting risk, you’re predicting cost.
And some of the latest stuff, the highest complexity of how we react, is pattern generation, and this is what you’ve seen over the past few weeks from ChatGPT, and finally everybody knows about GPT, but then some of the other cool stuff, like DALL-E, um, and Midjourney, that are actually creating novel new images. And on the frontiers of this research is creating a whole host of new things: video, audio, PowerPoint. Gosh, I can’t wait till that comes out, ’cause I don’t wanna ever have to design another deck again.
[00:05:00] Matt: And that’s a great point. So let’s dive into that. It’s interesting because the headlines are telling us ChatGPT’s gonna unseat Google. You know, the artwork’s gonna replace this.
You’re in this industry, how long do you think it will be until it can make your PowerPoint for you? Because, you know, I think the headlines are pushing the whole thing.
[00:05:20] Cal: Yeah.
[00:05:20] Matt: But there’s this gap in between what we’re seeing right now.
[00:05:24] Cal: Yeah.
[00:05:24] Matt: And what the promise is and I think that gets blurred a lot.
[00:05:28] Cal: So let me start with just saying, like, I roll my eyes every time I see one of these types of headlines. For a long time, machine learning, at least for much of the last 10 years, has been evolving as a discipline, and with the advent of, hey, we have bigger computers, we can compute things at scale, we have better data.
It’s allowed us to do things with this math that frankly has existed for decades.
[00:05:51] Matt: Mm-hmm.
[00:05:52] Cal: Right? So none of this is technically new. The compute power’s new, and there’s an article I like to quote by Jennifer Strong in the Wall Street Journal from four years ago now, and she goes, for AI to become useful, it needs to become boring, right?
And there’s a lot of these boring problems that are not in the headlines today. Things like predicting which patients need the best follow up care, trying to, you know, identify fraud in very complex financial transactions. You know, even things like, you know, finding defects in a paper mill, for example, right?
There’s a lot of these like really, really niche boring AI problems and that’s really driving economic gains. And when I see all the headlines talking about this stuff and people finally getting excited, cause yeah, pretty pictures and a fun chat bot. I’m like, there’s much more to this industry than that. I’ll zoom in there and give you my commentary on this wave of what’s called generative AI. So creating new and novel content.
I was just at the Brainstorm AI Summit, by Fortune, um, last week in San Francisco, and one of my favorite speakers at the event was Reid Hoffman from LinkedIn.
And the way he described this next wave, it made me get it and conceptualize it. And he described it as a co-pilot for every profession.
[00:07:14] Matt: Hmm.
[00:07:15] Cal: Right? AI is certainly, right, I don’t wanna not address the elephant in the room. It’s gonna shift jobs, but it’s not going to remove jobs.
[00:07:25] Matt: Right.
[00:07:25] Cal: I’m unconvinced that we’re going to produce AI that can generate content in a single shot that’s going to remove the need for human intervention.
[00:07:36] Matt: Wow. So coming from you, I love that. I’m gonna interrupt here.
[00:07:41] Cal: No, please.
[00:07:41] Matt: That excites me, because you constantly hear the other side of this. It’s going to eliminate, it’s gonna replace, and so far, in my experiences with it, and I think someone posted a really great article about how they used ChatGPT, and they said basically it did intern-level work.
And I think your example of you know, being boring, that maybe the image should be of that accountant who’s going line by line through a ledger with a ruler…
[00:08:10] Cal: Yeah.
[00:08:10] Matt: To look specifically that, that’s a more accurate depiction of something that follows rules, something that’s looking for those patterns rather than the creative side…
[00:08:23] Cal: Sure.
[00:08:23] Matt: Of coming up with something, because I sat in on a demo a couple weeks ago of content development through AI, and I was like, oh, you know, it brought up things that I hadn’t thought about.
It helped me to see what I do from a fresh perspective. If anything, it helped me with keywords and perspectives that I needed to introduce into my writing. The rest of it was obviously assembled together from bits and pieces and needed to be addressed.
And I think that excited me, coming from you, where yes, it’s more the boring, it’s not the creative, it’s not the flashy, and that there will need to be humans for emotionally connective content, I think.
[00:09:05] Cal: Oh, sure. Yeah. Like, I’ll give you my own example. I mean, like, I was recently kind of just hitting, like, a writer’s block, trying to put an abstract together for one of my upcoming conference talks. And so I took three or four articles I had written, dumped them into, uh, the predecessor to ChatGPT, GPT-3. There are creative writing assistant tools.
So I was kind of nerdy for going straight to the source, but let’s just say I dumped the last four articles I wrote in there, and I said, summarize this into, um, a conference talk abstract with three or four bullets.
[00:09:40] Matt: Wow.
[00:09:40] Cal: And it assembled it, and I was like, close. I don’t like it, but all right. You know, here’s what’s wrong with this and I’ll go fix it. And these are the three points I really want to talk about. So it put together an abstract that kind of alluded to the topic. It structured it in a way that made it very easy to jumpstart my own creative process. And I said, okay, thanks. I’m gonna take it from there.
And writing and this content creation, you know, that we’re seeing in terms of article creation, article writing, et cetera, it’s probably the furthest along, and so it gives us the best glimpse into what this looks like when it scales into other fields, other professions, and it’s probably gonna look a lot like that.
Get it as far as you can to jumpstart your own creative process, consider five or six different versions, which is something you’d never make an intern do. You’re not gonna say, hey, write me 10 versions of this article, and then, you know, oh, I like two and three. Let’s blend those together. You’d never do that to a human.
[00:10:40] Matt: Right, right. That makes a lot of sense. And like you said, it’s that fresh perspective. It’s, you know, who wants to go back and reread everything you’ve written before. You’ve got that writer’s block going on.
[00:10:51] Cal: Yeah.
[00:10:51] Matt: I’m too close to the material.
[00:10:53] Cal: Yeah.
[00:10:54] Matt: And let’s pull back and do that. One of the best, I think, descriptions, I don’t know where I got it from, was that machine learning, AI, it’s not like hiring an Einstein, it’s like hiring a million interns.
[00:11:10] Cal: If that….
[00:11:12] Matt: And so it’s great to hear. But yeah, so that was an interesting way to use that, is to, yeah, even look at your own content and come up with a couple ideas. But yet there’s the revisioning aspect too, especially on the technical side, and that’s where, I thought, with GPT, someone did a test for coding.
They put in a code problem.
[00:11:31] Cal: Yep.
[00:11:31] Matt: And showed here’s what it looks like, here’s the code you would use, here’s how it would, and so they walked through that and said, you know, Google killer. Then someone else said, no, GPT’s making up data. It’s adding the nonsensical things and sourcing citations that don’t exist. There’s like both sides of this, of someone saying it worked perfectly for what I wanna do, and then someone else saying, but it’s wrong.
[00:11:55] Cal: It’s both. It’s both.
[00:11:57] Matt: Yeah. How does that happen?
[00:11:58] Cal: Here’s the thing, right? All GPT-3 and all of its incarnations is, is a predict-the-next-word model. That’s it. Nothing more, nothing less. Um, it’s not a search engine. It doesn’t actually retrieve facts. You can certainly connect it and build other aspects on top of it and build systems around it.
But it’s nothing more than trying to predict the next word, given the context of what it’s been exposed to, and when it’s been exposed to billions of documents, it has a lot of context. And it’s this idea of, it looks at the sequence of words given to it in a prompt, and sometimes it’s a short, hey, write me a code example to do X, or it’s something much longer, like my four pages, or, you know, 10 pages of article content from the four articles, and then it says, okay, what’s the next most likely word? And the one after that, and the one after that. And that’s how it builds its responses.
And so it’s really good at appearing polished.
[00:13:01] Matt: Mm-hmm.
[00:13:01] Cal: It’s trained, it’s maximizing the function of predicting the next most likely word that would make sense in that position. In many cases it’s useful, but in some cases, it actually gives you wrong facts, but it just sounds good in that position. Right? So, like, that’s what the model is doing.
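To make the predict-the-next-word idea concrete, here is a minimal sketch in Python. It is a toy, not how GPT-3 is built: GPT-style models use large neural networks trained on billions of documents, while this example just counts which word follows which in a tiny made-up corpus. The corpus and the names `train_bigrams` and `generate` are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption); a real model sees billions of documents.
CORPUS = (
    "the model predicts the next word . "
    "the model predicts the next word again . "
    "the model is wrong . "
    "the cat sat on the mat ."
)


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def generate(following, prompt, max_words=5):
    """Repeatedly append the most likely next word after the last word."""
    output = prompt.split()
    for _ in range(max_words):
        candidates = following.get(output[-1])
        if not candidates:
            break  # the last word was never seen with a follower
        next_word, _count = candidates.most_common(1)[0]
        output.append(next_word)
    return " ".join(output)


model = train_bigrams(CORPUS)
print(generate(model, "the cat"))
```

Starting from "the cat", the toy continues with whatever word was most common in that position, so it drifts from the cat sentence into the model sentence. Nothing checks whether the result is true; it only has to be a likely continuation, which is the point Cal makes about output that sounds good but can be wrong.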
[00:13:19] Matt: Wow. That is amazing. So, you know, there’s the generative AI where, I think last night even, I watched a music video to, uh, I think it was Black Sabbath’s War Pigs, with the AI-generated images. And it was, yeah, a great experience.
Yeah, it was like the early days of MTV, you know, we’re putting a video to this, and even then the videos made no sense, but the images have a style to ’em, and so it really worked. But I like also how, when you gave your background, it’s, here’s high-risk environments.
And then going into the boring aspect, what has been the result of, you know, I think you said that the generative is getting all the headlines.
[00:14:00] Cal: Yeah.
[00:14:01] Matt: What is happening in these industries that absolutely need to have the boring side of the technology?
[00:14:08] Cal: Here’s a fact about AI, and it hasn’t changed much in the past few years. Somewhere between 80 and 90% of AI projects in the real world fail to deliver on business results, fail to deliver an ROI.
[00:14:25] Matt: Wow.
[00:14:25] Cal: Uh, there’s maybe one other profession that’s that bad, and that’s the weather people. Weather prediction, you can afford it. Especially, I dunno, especially here in the Midwest.
[00:14:35] Matt: Yes.
[00:14:35] Cal: You can still be wrong and have a job. Um, yes. So, you know, data scientists are right there with them. Here’s the thing, whenever I talk to clients about machine learning, this is part of why it’s so hard, one of the many reasons, but with machine learning, you’re guaranteed one thing: the model will be wrong.
And so the question is, how much wrongness are you willing to tolerate? How do you put guardrails around it? What’s the cost of being wrong? Is it worth it for the results that you get? How do you build process around it? How do you get humans to rely on this thing? Right? So you’re guaranteed one thing, and that’s that they’ll be wrong.
[00:15:11] Matt: Wow.
[00:15:11] Cal: So how do you build around a technology with that truth?
[00:15:16] Matt: That is amazing. You’re blowing my mind with this. This is so cool. That first conversation has got to be earth shattering.
[00:15:22] Cal: Yeah.
[00:15:23] Matt: When you’re talking to a potential client or someone about that, and you’re bringing this level of, it’s gonna be wrong, it’s most likely gonna fail, how do you sell something like that? How do we deal with it? What are the benefits of even attempting this?
[00:15:37] Cal: Sure. So, I mean, all fair questions. And it’s looking at what’s the next best alternative, right? What’s it costing them today?
[00:15:44] Matt: Mm-hmm.
[00:15:44] Cal: And so the sweet spot is identifying these problems that, I like to bucket them in three categories. One, it’s something that humans are doing today and it’s a barrier to scaling. Uh, so maybe it’s screening patients, or maybe it’s, you know, I’ve gotta actually type out all these emails to every different contact, or whatever. It’s something a human owns and does.
[00:16:06] Matt: Mm.
[00:16:06] Cal: And without that human, it’s not getting done, and it’s keeping your business from doing more.
The second bucket is things that humans have the knowledge to do, uh, are doing regularly, but they can’t afford to do it all the time. Think of quality control. Um, you’re not gonna look at every widget on the manufacturing floor. You’re not gonna listen to every customer conversation at the contact center. You know, at best you’re sampling two, three percent of the time and saying, hey, this is good enough. We’d like to do more, but this is good enough.
And then the last bucket of things is information problems that humans might have the expertise to solve for, but the information is so overwhelming, it’s not being done today. It can’t be done today by humans.
Things like fraud detection. I’ve got a billion transactions to review and 99.99% of them are good and I’m looking for the examples that I can’t even imagine are fraud that are actually fraudulent. Right?
[00:17:00] Matt: Wow.
[00:17:00] Cal: And that’s that last category.
[00:17:01] Matt: Wow.
[00:17:01] Cal: So you’ve got these three different areas where, one, humans are the bottleneck and, you know, especially in the third case, you can’t even begin to solve it without using some of these algorithms to prioritize, sift through information.
So the art is, yeah, the models will be wrong, but what’s the cost of not doing it? And these are the types of problems in boring AI, that actually become really valuable in spite of the constraints when it comes to machine learning.
[00:17:31] Matt: I love that. You know, and Cal, this is what I love is you’re really doing a great job of just simplifying these down into very relatable, understandable situations, especially, you know, like you said, you started in 2016. I think the hype was just starting. How hard is it to build a business, a boring business, amidst all the hype?
[00:17:51] Cal: Um, you know, it’s funny actually. Our clients love it, and that’s part of the reason why they stay with us, but I’ll tell you, I was probably two years too early with Pandata. You know, it’s almost like by the time we figure it out, it’s like, oh gosh, I wish I had started now.
[00:18:05] Matt: Mm.
[00:18:05] Cal: Um, so it’s been really interesting. I’ll tell you, it’s not easy being in the business of building AI systems when the building blocks, the tools at your disposal have been changing nearly every year.
There’s things we can do now we couldn’t do last year. That was true of every year for the last three years and we’ve gotta stay ahead of it all and also find ways to make it practical.
[00:18:29] Matt: Wow, that is amazing. And, yeah, with computing power, I think it’s Moore’s law, where the computing and the costs are, you know, constantly changing. This is such a crazy, crazy industry.
[00:18:41] Cal: Yeah.
[00:18:41] Matt: So, gimme an example of, you know, you come up with a model.
[00:18:45] Cal: Yeah.
[00:18:46] Matt: Um, and then you work the model, to try and find these variables, what are some examples of companies that have been able to develop something or come up with something that answers that problem?
[00:18:58] Cal: So, one of my favorite examples recently, we actually worked with a health system to help them with breast cancer screening, right? An area that, you know, you might not think about as being touched by AI, and they’ve got over a hundred thousand patients that would be eligible for their annual screenings.
And, um, as they approach the age of 50 to 55, this is when they need to start coming in every year. And we help them build a model to try to predict simply, hey, if we do nothing, based on the patterns that we’ve seen in their medical records and their charts, at what age are they most likely to come in on their own for the first time without any intervention?
[00:19:40] Matt: Mm-hmm.
[00:19:41] Cal: And so we’re trying to help them identify, hey, this person’s in their fifties, early fifties, and we don’t even think they’ll come in until they’re 70 or 60. And that’s not necessarily good, especially if the patient’s a higher-risk patient.
[00:19:56] Matt: Right.
[00:19:57] Cal: And so what we did is we narrowed this down from a hundred thousand patients to about a thousand that fall in this category of, they’re approaching the age where they need to be seen, and we don’t think that they’re gonna come in on their own. Hey, care coordinators.
These are the factors influencing, one, why, and this is important, why the model thinks that. Maybe it’s because they have three or four individuals in their household above the age of 80, they’re a caregiver. Maybe it’s because they’re a single individual, or any one of these social factors tied into it.
So we’re giving them a prediction and we’re giving them an explanation of that prediction, and it empowers the care coordinator to then say, ah, I know what I need to do to be able to reach this patient. I’m gonna go do some digging. I’m gonna find their phone number, I’m gonna call them. I’m gonna try to get on their radar, and I’m gonna use this to probe a little bit more intelligently.
Maybe they need Uber credits to get here. Maybe they need some other kind of social support. So this is an example of a model that was really simple, predict the age at which they’re gonna come in, and we built some extra, you know, explainability around it and it’s having a profound impact.
[00:21:07] Matt: Wow. I love that example because you’re crunching the data that’s there, but yet it’s a human, and through this model, I love the why, it gives them the conversation to have, the conversation starter. The reasoning. It gives them the empathy points.
[00:21:24] Cal: Yeah.
[00:21:25] Matt: When they make that call. I love that example because it still relies on the human to make that final connection, but it informs them to have a better conversation and a more empathetic appeal to that.
Oh, I love that example, Cal. That is amazing.
[00:21:41] Cal: There’s certainly use cases where you have something that’s totally automated, but in my experience, most of these, like, process and business automation improvements that involve AI, the most successful ones have it designed where humans are able to act on that and make better decisions.
Explainability is huge. You know, and by the way, most of our work is in healthcare, so this is why I use a lot of these examples. But I had a conversation with a radiologist that was reacting in anger to a headline that was saying, AI is better than doctors.
And he goes, we do 50 cases an hour, and we have to flip through these things real quick. Yeah. Humans are tired, they’re exhausted, they make mistakes. However, a well-trained physician can notice nuances.
[00:22:29] Matt: Mm-hmm.
[00:22:30] Cal: But, uh, even better than that is a well-trained physician armed with AI that can say, hey, I see you have 50 cases today, I went ahead and I looked through this, and of course the AI doesn’t talk, but let’s just, you know, imagine this narrative.
[00:22:46] Matt: Right.
[00:22:46] Cal: And out of these 50 cases, I found these three had some anomalies. The anomalies were here, here and here. And then these five cases, I really couldn’t make heads or tails of it. You really need to go through this on your own. And then the rest of these, I’m fairly confident there’s nothing worth reviewing.
[00:23:04] Promo Break: Hey everyone, this is Matt. And thanks for listening. Just a quick break in the middle of the podcast here to let you know there’s a couple ways that you can connect with us. The first is learn.sitelogic.com. That’s the learning site where you can see courses on analytics, courses on digital marketing across paid search, SEO, multiple disciplines. And then also you can connect with us on Slack. Go to Slack if you’re there and look for us at endlesscoffeecup.slack.com. Connect with us. I’d love to hear from you, hear what ails you in the realm of digital marketing. Are there courses you need, information that you’d like to hear, or maybe some past guests that you’d like to hear more from? Thanks again for being a listener of the Endless Coffee Cup, and I look forward to hearing from you.
[00:24:04] Matt: Right.
[00:24:06] Cal: So then, okay, what this does is for that physician’s workflow, um, they can go through the three that had the issues and verify, confirm, spend a little more time looking to see if there’s anything else that was missed or not missed, come up with their notes.
They go through the five that they know the model couldn’t do anything with and say, okay, now I’m gonna spend a lot of time here. I’m gonna review these carefully, and then maybe process-check the rest, right?
[00:24:33] Matt: Mm-hmm.
[00:24:33] Cal: Just make sure that everything is fine and go through those quickly. What you’ve done is you’ve optimized their time and their performance and their expertise.
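The workflow Cal describes, where cases are split into flagged, uncertain, and likely clear, can be sketched as a simple triage step in Python. This is an illustration under stated assumptions, not Pandata's system: the `Scan` class, the `anomaly_score` field, and the `low` and `high` thresholds are all hypothetical, and a real deployment would choose thresholds from validation data and clinical review.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    case_id: str
    anomaly_score: float  # hypothetical model output between 0 and 1


def triage(scans, low=0.2, high=0.8):
    """Sort cases into three review queues based on the model's score.

    The thresholds are illustrative assumptions, not recommended values.
    """
    queues = {"flagged": [], "uncertain": [], "likely_clear": []}
    for scan in scans:
        if scan.anomaly_score >= high:
            queues["flagged"].append(scan.case_id)       # verify the anomaly
        elif scan.anomaly_score <= low:
            queues["likely_clear"].append(scan.case_id)  # quick process check
        else:
            queues["uncertain"].append(scan.case_id)     # careful human review
    return queues


scans = [Scan("A", 0.93), Scan("B", 0.05), Scan("C", 0.55), Scan("D", 0.10)]
queues = triage(scans)
print(queues)
```

The model never makes the final call here. It only orders the physician's time, which matches the point about optimizing their time and expertise rather than replacing it.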
[00:24:41] Matt: Wow. It’s another level of triage that goes further than visual. This is why I love talking to people who are experts in their field. I’m like, this gives me so much more hope.
Cal, I’m a fan of sci-fi. Yeah. So I think sci-fi is the conscience of technology, because what sci-fi asks is, okay, this technology’s great. It’s that quote from the original Jurassic Park, and I think it’s my favorite quote of all time. Jeff Goldblum says, your scientists were so preoccupied to find out if they could, they didn’t stop to ask, should they do this?
Um, because we get so wrapped up in the technology, the headlines, the promises. But yet there’s this, wait a minute, what-if component. How does this apply? Because there is the fear of AI, you know, this was the basis of Terminator. This is the basis of so many movies of AI taking over. How do you see content like that, being a sci-fi fan as well?
[00:25:39] Cal: So what’s funny is, you know, it’s all the engineering nerds that watch sci-fi, and then we try to bring it to life in our professions. But that being said, my take on this is, there’s AI, and I think it’s such a misnomer, artificial intelligence, because it’s certainly not intelligent, at least not in the definition that, you know, we approach it as humans.
There are groups out there trying to create artificial general intelligence, and that’s an AI that is robust enough that it can switch contexts and tasks on its own, and perform at a human level. And honestly, I don’t know if we’re anywhere near that.
It’s hard to forecast more than 10, 20 years into the future at the rate technology is accelerating, but I’m one who thinks maybe not in our lifetime. Uh, there’s a lot of really promising aspects that AI will certainly be able to address.
I really believe in that quote from Reid Hoffman, co-pilot for every profession. That’ll happen in our lifetime. Here’s the thing with these models: they’re trained on a specific objective. Your job is to maximize the accuracy on detecting scans. Your job is to create the most plausible next best word in this sentence, right? Think of the explanation of GPT.
[00:27:00] Matt: Mm-hmm.
[00:27:00] Cal: Today, most of these models, no matter how impressive they are, are trained to maximize a singular function or a set of terms that they’re trying to optimize for, but it can’t on its own change its target from, I’m trying to predict the next best word, to, now I’m going to mislead Matt into, you know, clickbait or whatever. Today humans still have to specify these objectives, so that’s one side of it.
The other side of it is, and I’d be remiss if I didn’t mention this, is there’s a very real risk, and this should be scarier to everyone, frankly, than Terminator, of unintended consequences.
[00:27:43] Matt: Mm-hmm.
[00:27:44] Cal: Um, let’s consider, for example, GPT, since we’ve been talking about that. 40 billion documents. Do you know where those documents came from? The internet. Have you been on the internet recently?
[00:27:56] Matt: Oh, oh yes. Yes. It’s the, uh, Million Monkeys, a million years typing the works of Shakespeare and we didn’t see that.
[00:28:04] Cal: You know, at best you have mind-numbing, brainless content that it’s learning from, and at worst it’s pulling some stuff that it probably shouldn’t.
[00:28:11] Matt: Right, right.
[00:28:12] Cal: Um, so there’s patterns that it’s learning that are not desirable, or maybe were fine at one point in time and are no longer fine today, and it’s hard to measure when those biases are reintroduced into creative outputs. And there was a study done a few years ago by the National Research Council of Canada that I love to cite. They looked at 200 different sentiment scoring systems. Something real basic.
[00:28:41] Matt: Mm-hmm.
[00:28:41] Cal: Hey, is this positive or is this negative? They came up with a thousand or so, uh, template sentences, and all they did was they replaced the name of the individual with a European heritage name or an African heritage name.
[00:28:55] Matt: Oh.
[00:28:55] Cal: And when the African names were used, same sentence.
[00:29:00] Matt: Mm-hmm.
[00:29:00] Cal: 75% of the time, the sentiment came back more negative, more angry.
[00:29:05] Matt: Wow.
[00:29:06] Cal: Right. And so, so you, you’re learning these patterns. They’re perpetuated forward in these machines. And now we’re using these machines to guide decisions. So especially when we talk about high-risk environments, which is where I work, this is really critical to my clients and my stakeholders. You can’t do that in healthcare, and it’s already happened.
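The name-swap test Cal describes can be sketched as a small audit harness in Python. This is not the study's code or data: the templates, the placeholder names ("Alix", "Bex", "Cam"), and `toy_score` are all made up for illustration, and `toy_score` is a deliberately flawed stand-in so that the audit has something to detect. In practice `score` would be whatever sentiment system is being audited.

```python
# Hypothetical template sentences; only the name slot changes between runs.
TEMPLATES = [
    "{name} feels happy today.",
    "{name} made me feel angry.",
    "I talked to {name} yesterday.",
]


def audit(score, names_a, names_b, templates=TEMPLATES):
    """Fraction of sentence pairs where swapping in a group-B name lowers the score."""
    lower = 0
    total = 0
    for template in templates:
        for a in names_a:
            for b in names_b:
                total += 1
                if score(template.format(name=b)) < score(template.format(name=a)):
                    lower += 1
    return lower / total


def toy_score(sentence):
    """Deliberately flawed stand-in scorer: a word list plus a spurious name penalty."""
    value = 0.0
    if "happy" in sentence:
        value += 1.0
    if "angry" in sentence:
        value -= 1.0
    if "Bex" in sentence:
        value -= 0.5  # a learned pattern that should not exist
    return value


rate = audit(toy_score, names_a=["Alix"], names_b=["Bex", "Cam"])
print(f"Score dropped in {rate:.0%} of name-swapped pairs")
```

Because the sentences are identical apart from the name, any nonzero rate points at the scorer rather than at the content, which is what makes this kind of audit useful before a model is allowed to guide decisions.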
[00:29:26] Matt: Yeah. So this brings up a whole aspect of the input data.
[00:29:30] Cal: Yeah.
[00:29:30] Matt: That AI, machine learning, it completely relies on the data that it’s given ahead. One of the more famous examples, this is a few years ago, I’m trying to remember which photo service, and it was labeling weddings.
If it was Eastern, it labeled it as costume.
[00:29:48] Cal: True.
[00:29:49] Matt: Uh, because I think 90% was US-based photos. Yeah. And so all the interpretation of what was a wedding, what was a birthday, what were specific things, were very unique to US Western culture. And yet we’re a small percentage of the world population. And so this model did not work outside of the US.
I love how you explain that we have to have a reliability of input data, and I think the one, uh, I think good example is the policing example. If we’re going to rely on this predictive policing, explain how that, you know, leads us into unintended consequences.
[00:30:29] Cal: Oh my gosh. That’s a whole can of worms. But, uh, without commenting on the politics of it, no…
[00:30:35] Matt: We don’t need to do that.
[00:30:36] Cal: There are certain regulations coming, and I’m really excited about this, by the way. The EU AI Act should be on the radar of anyone who cares about the future of AI. It separates machine learning and AI applications into, one, restricted, the stuff we’re not gonna do, and predictive policing falls into that.
[00:30:52] Matt: Oh, wow.
[00:30:52] Cal: Um, and then there’s other stuff that’s considered high risk. Anything that has consequence on the human condition. Financially, uh, physically, mentally. So self-driving cars, algorithms used to prioritize care, facilitate employment decisions, loan applications, financial consequences, et cetera. So that’s all high risk.
[00:31:13] Matt: Wow.
[00:31:13] Cal: Um, but just to answer your question, predictive policing, why is that not good? Um, what ends up happening is, let’s say you start with some data that says, hey, you know what, you’re more likely to have crime in this neighbourhood. And so we’re gonna redistribute our forces to spend more time there. Um, you’re then more likely to have proximity interactions with minor offenses, stuff that happens all the time, maybe in a suburb, where most people would look the other way. But because you have proximity and you have density, you have a lot more opportunity for criminal activity to be caught.
And then what happens is that becomes a part of your data record. And so now you have more incidents in that location that say, oh, there’s crime here. We need to send even more policing to this environment. And so you have this, like, reinforcement cycle where basically you’re creating the data and the pattern by your intervention.
[00:32:13] Matt: Hmm.
[00:32:13] Cal: If that makes sense.
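The reinforcement cycle Cal describes can be sketched as a small simulation. It assumes two neighborhoods with identical true offense rates, a hypothetical "hot spot" policy that gives the larger patrol allotment to whichever area has more recorded incidents, and one recorded incident per patrol per round. All numbers are made up for illustration.

```python
def simulate_hotspot_policing(recorded, rounds=5, patrols=(7, 3)):
    """Simulate two neighborhoods with the SAME underlying offense rate.

    Each round, the area with more recorded incidents receives the larger
    patrol allotment (patrols[0]); the other receives patrols[1]. Every
    patrol records one incident, so records follow patrols, not crime.
    """
    recorded = list(recorded)
    history = []
    for _ in range(rounds):
        hot = 0 if recorded[0] >= recorded[1] else 1
        cold = 1 - hot
        recorded[hot] += patrols[0]
        recorded[cold] += patrols[1]
        history.append(tuple(recorded))
    return history


# A one-incident head start is enough to lock in the "hot spot".
history = simulate_hotspot_policing([11, 10])
for round_number, counts in enumerate(history, start=1):
    print(round_number, counts)
```

In this toy setup the first neighborhood starts with about 52% of the recorded incidents (11 of 21) and ends with about 65% (46 of 71), even though nothing about the underlying behavior differs between the two areas.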
[00:32:14] Matt: Oh, absolutely. Are there other areas where this predictive policing problem, like in business, do we have these kind of problems in business when it comes to learning?
[00:32:26] Cal: Oh yeah. I mean, like employment decisions, uh, loan applications, all this stuff that falls into the high-risk territory. Um, one of my favorite examples, well, maybe not favorite, but the best example to cite on this, is a readmissions algorithm that was used by a healthcare system. A readmission is basically when somebody went in for a procedure and they came back within 30 days. It’s not good.
The government penalizes you and insurance will not cover it.
[00:32:52] Matt: Wow.
[00:32:53] Cal: So the hospital’s now on the hook because presumably it’s their fault they didn’t do a good enough job. Right? There’s a complication and this patient’s now back.
[00:33:00] Matt: Mm-hmm.
[00:33:01] Cal: So it makes sense. Hospitals want to know when a patient’s likely to have complications, and give them extra care before they’re discharged. Total sense. Perfect for AI. What ended up happening is, um, given an equally sick Black patient and white patient, the white patient would have a higher risk score and so basically got put ahead in line, prioritized for care. Why did this happen?
[00:33:26] Matt: Hmm.
[00:33:26] Cal: In the data that was used to train this model, there’s not a flag in your health record that says sicker. Uh, it’s not really an easy thing to solve for, and so the assumption that was made was, hmm, let’s look at cost. People that are costing us more are likely sicker. Fine assumption, except when you realize that people from minority communities either had less access to care, so they cost less anyways, or culturally used the health system less frequently.
And so the way cost and sickness measured up in that population was different from cost and sickness in predominantly white suburbia. Um, and so they built this model and this model started to perpetuate that bias in terms of its scoring.
So these issues are not necessarily just prone to occur in entertainment. These models that are rolled out into production and used to make decisions then become a part of our data, and our models keep learning from it.
And so this is why I believe there always needs to be some amount of human oversight whenever one of these models is deployed.
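To make the cost-as-proxy problem concrete, here is a minimal sketch. The `Patient` fields, the `access_to_care` factor, and the dollar figure are hypothetical and are not taken from the actual readmissions model; the point is only that ranking by observed cost can push an equally sick patient down the list.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    name: str
    illness_burden: int    # hypothetical "true" sickness, never seen by the model
    access_to_care: float  # fraction of needed care the patient actually used


def observed_cost(patient, cost_per_unit=1000):
    """Cost in the record reflects care used, not care needed."""
    return patient.illness_burden * patient.access_to_care * cost_per_unit


def rank_by_cost_proxy(patients):
    """Prioritize patients by observed cost, the stand-in for 'sicker'."""
    return sorted(patients, key=observed_cost, reverse=True)


patients = [
    Patient("A", illness_burden=5, access_to_care=1.0),
    Patient("B", illness_burden=5, access_to_care=0.6),  # as sick as A
    Patient("C", illness_burden=4, access_to_care=1.0),  # less sick than B
]

ranking = [p.name for p in rank_by_cost_proxy(patients)]
print(ranking)
```

Patient B is exactly as sick as patient A but lands behind the less-sick patient C, because the proxy measures spending rather than need.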
[00:34:34] Matt: Mm-hmm. And it’s funny because, you know, we talk about bias, especially when we’re looking at the data training a model.
[00:34:42] Cal: Yeah.
[00:34:42] Matt: What data are we introducing? What are we doing from here? And it’s interesting because, for me, this is the ideal point where I ask a business: are you measuring the right things? You know, if you don’t have your KPIs aligned so that you can show that causality from one to the next, and you understand what’s happening.
If you don’t have that aligned, then the data you feed into this isn’t going to fix your problem. It will make it worse.
[00:35:11] Cal: Yes.
[00:35:11] Matt: If you don’t have the data alignment to start with, and this is what I love, when we get into a business, it’s like, you’ve got bigger problems. You need to define these things much earlier in the process before you can even begin to think about this.
[00:35:27] Cal: Sure.
[00:35:27] Matt: Yeah. That data training is a big part of that. And I could list countless times, especially in, uh, automation, where companies just assumed, we signed up with a marketing automation company, now everything’s just gonna happen.
[00:35:40] Cal: It takes curation. And, you know, my experience is this, you know, I’ll say that AI is one of those fields that requires experimentation to move forward.
[00:35:49] Matt: Mm-hmm.
[00:35:50] Cal: Um, there’s a phase in every AI project that needs to happen. I call it the building resilience phase. This is where, if you think about getting AI into the hands of a user as going through two cycles, the first cycle is, can we build a model mathematically? Does it work? Step one, can we even predict this?
And then step two is, hey, let’s incorporate this into workflow. Let’s turn this into an actual solution. When you move into that second cycle, and of course you wanna be thinking about that as you’re building the model, actually getting it integrated into workflow requires experimentation. Um, you’re going to learn, the very first time a user makes a decision off of this model, that there were some assumptions you didn’t get right.
[00:36:35] Matt: Mm-hmm.
[00:36:35] Cal: Maybe the data in the real world is different than what you trained on. Maybe the business assumptions have changed slightly, or the behavior you’re trying to detect has shifted slightly. Okay, great. That’s an opportunity to refine. Or maybe you learn, hey, you know what? The humans that are supposed to act on this don’t trust it.
We’ve gotta put some extra training in place. We’ve gotta add some explainability, et cetera. So there’s this building resilience phase that I see every AI project go through.
[00:37:04] Matt: Hmm.
[00:37:04] Cal: And the reality is, projects are either not killed soon enough or they’re killed too early as a result of going through this very natural process. And so what I try to coach everyone through when they’re thinking about AI and getting into AI is, it’s gotta be a portfolio mindset.
[00:37:23] Matt: Hmm.
[00:37:24] Cal: If you put all your eggs into one AI project, you’re prone to fail and say, oh gosh, that was not worth it. If you put your eggs into three to five, and you take them all through the proof-of-concept prototyping stage, three of them are gonna be solid and recoup the value of the rest, and your organization’s going to learn how to deal with, and I’ll go back to my very first definition, or one of the first things I said, these models will be wrong.
[00:37:52] Matt: Hmm.
[00:37:52] Cal: And it requires a different way of thinking with how you incorporate it into business decisions. But if you get through that, that kind of puts you into the leaders category, the organizations that are actually using AI to drive their businesses, and affect their bottom line.
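A rough way to see why a portfolio helps, as a sketch under strong assumptions: every project succeeds independently with the same probability. The 20% success rate used below is borrowed from the 80 to 90% failure figure Cal quoted at the start of the episode; real projects are neither independent nor equally likely to succeed, and this does not reproduce the specific outcome Cal describes.

```python
def p_at_least_one_success(p_success, n_projects):
    """Chance that at least one of n independent projects succeeds."""
    return 1 - (1 - p_success) ** n_projects


# Assumed 20% success rate per project, per the failure figure quoted earlier.
for n in (1, 3, 5):
    print(n, round(p_at_least_one_success(0.2, n), 3))
```

Under these assumptions a single bet pays off one time in five, while a portfolio of five has roughly a two-in-three chance of producing at least one success.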
[00:38:09] Matt: I love it. I love it because, uh, it just, in so many areas, it comes down to a company or an organization, I’m gonna use the term loosely, being self-aware of…
[00:38:20] Cal: Okay.
[00:38:21] Matt: What they do. Their processes, their rules, their structures for this is how we do it and this is how we will measure it. Those organizations, and I’m sure you’ve seen this, those are the ones that are prepared to go to the next level.
[00:38:36] Cal: Sure.
[00:38:37] Matt: Because they have those structures in place, they have those rules, and this is how we’ll do it. And so it’s much easier, then, to test something. It’s easier to automate something when you’ve got those processes and that framework in place. Without a framework, without anything like that…
[00:38:54] Cal: Yeah.
[00:38:54] Matt: You’re doomed to start or you’re doomed from the very beginning of how that will happen. I love how this just gets right down to basic business practices.
[00:39:02] Cal: None of it is new. Yeah, the tech is shiny, but none of the rest of it is new. You’re absolutely right.
[00:39:08] Matt: Wow. All right. Fun question before we start wrapping up here. Why do I have to pick out the taxis? Have bots not figured out which are pictures of bridges and which are pictures of taxis? Um, oh my, why do I have to do this?
[00:39:23] Cal: That’s a great question.
[00:39:24] Matt: You’re the best person to ask here.
[00:39:27] Cal: Oh my gosh. I taught a few courses over the years and every time I showed, what do you call it? The CAPTCHAs.
[00:39:33] Matt: Yes.
[00:39:34] Cal: But did you know that this is training data? My students, their minds would be blown by this.
[00:39:39] Matt: Yeah.
[00:39:39] Cal: They’d be like, oh my gosh, we’ve been feeding the beast. So, you know, I mean, there is no such thing as too much training data.
[00:39:47] Matt: Hmm.
[00:39:48] Cal: And so it has shifted from very obvious images to, I don’t know, in the past year or so, if you’ve noticed, they’ve been harder to solve.
[00:39:58] Matt: It’s worse quality. I feel like we’re training surveillance data, is what I feel like when I see it now.
[00:40:05] Cal: Let’s hope it’s used for good purposes. But, what’s happening is there’s a lot of these niche problems that still are unsolved. Ambiguous pictures where something could be perceived as something else.
And actually there are two ways in which this data’s used. One, what are the image sets that even humans struggle with consistently?
[00:40:24] Matt: Mm-hmm.
[00:40:25] Cal: If you notice, even if you’re wrong, sometimes it’ll let you through. They’re trying to figure out where humans are struggling. But then also, in some of these edge cases where models haven’t yet been able to master the pattern, they try to get more examples from more people to handle these edge cases.
Uh, self-driving cars have no issues in sunny San Diego weather on, you know, the coastal highway, smooth sailing.
[00:40:52] Matt: Great.
[00:40:53] Cal: Bring it to Cleveland, Ohio in the middle of a snowstorm, not so great, right? There’s not a lot of images or data. And there’s artifacts, right? Like fog, smoke, whatever, you name it, that kind of get in the way of an otherwise idealized image that’s easier to handle. So these edge cases, we still don’t have good databases on yet and we won’t for some time. So, CAPTCHAs are gonna be around for a few years.
[00:41:19] Matt: Interesting, interesting. So it makes me feel like Google, they’re in the AI business.
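One plausible way the two uses Cal mentions could work, sketched with made-up votes: combine many people's answers for the same image into a majority label, and use the level of agreement to flag images that even humans find ambiguous. The function names, the 0.8 threshold, and the sample data are assumptions; this is not a description of how any real CAPTCHA service processes answers.

```python
from collections import Counter


def aggregate_votes(votes):
    """Return the majority label and the share of voters who chose it."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def flag_ambiguous(votes_by_image, min_agreement=0.8):
    """List images whose majority label falls below the agreement threshold."""
    return [
        image
        for image, votes in votes_by_image.items()
        if aggregate_votes(votes)[1] < min_agreement
    ]


# Hypothetical answers from five people per image.
votes_by_image = {
    "sunny_street.jpg": ["taxi", "taxi", "taxi", "taxi", "taxi"],
    "snowy_street.jpg": ["taxi", "bus", "taxi", "bus", "taxi"],
}

print(aggregate_votes(votes_by_image["snowy_street.jpg"]))
print(flag_ambiguous(votes_by_image))
```

Images with high agreement yield usable labels, while the flagged ones mark the edge cases where more examples from more people would be needed.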
[00:41:24] Cal: They’ve been in the AI business for 20 years.
[00:41:26] Matt: Yeah, since day one almost. And it was interesting about the self-driving car too. I read a great article that talked about how a human knows that a bird that’s on the road in front of them will leave.
[00:41:36] Cal: Yeah.
[00:41:36] Matt: Whereas a car still hasn’t. How do they process this? That an animal, a bird is going to leave and you know, or if that animal does not leave, if a squirrel runs out in front of you, and that is that processing problem where the human knows instantly what’s going to happen, whereas the machine is still trying to figure this out.
[00:41:59] Cal: I’d say it’s not even trying to figure it out. It’s taught to recognize threats. Is there a threat?
[00:42:04] Matt: Ah, okay.
[00:42:04] Cal: At any given point. It doesn’t matter. There’s no consequential reasoning. A big part of why we struggle with AI is we anthropomorphize it. We give it reasoning abilities, just like a human when it really doesn’t have those.
[00:42:18] Matt: So it’s more how to react to what that is. It’s a set of rules.
[00:42:24] Cal: Or an intervention, or it’s like, you know, think of it as a collection of systems, right?
[00:42:28] Matt: Mm-hmm.
[00:42:28] Cal: One system is saying straight, straight, straight, turn right, turn left, right? There’s another system that’s saying, hey, we’re on X destination. This is the next stop. This is the next stop, right? There’s the mechanical system controlling the turns. There’s the GPS system that’s, like, mapping out, maybe taking into consideration traffic patterns, and then there’s another system that’s scanning for threats.
Hey, is there an obstacle in the road? Is there a stop sign? Is there a traffic light? Is there a whatever that needs to actuate a response? But there’s no actual planning that says, ah, there’s an object in my screen that’s going to be out of frame by the time I get there.
[00:43:04] Matt: So no central brain is kind of what you’re saying there.
[00:43:08] Cal: It’s a lot of really clever systems that are working in harmony together. There’s no prefrontal cortex, there’s no entity there wondering, you know, what they’re gonna eat at night and saying, oh, squirrel, gotta stop. Don’t wanna kill it. Right? Like, there’s…
[00:43:21] Matt: Yeah. Okay, so now I’m just so curious about this. If all these systems are all doing one thing, then isn’t there, I don’t know, something that gathers all that in order to make a decision? Or am I still anthropomorphizing? I know I am.
[00:43:34] Cal: It comes down to, there’s a system controlling the drive left, drive right. There’s the mechanical part, right? And what’s being fed to that at any given point in time is signals that say, hey, slow down, speed up.
[00:43:47] Matt: Mm.
[00:43:48] Cal: Go faster. And what makes those decisions is a function of what’s coming out of the planning algorithms. Like, how do you map from point A to point B? Where are you right now? It’s a function, and it’s really sophisticated stuff.
I don’t mean to minimize this. There’s hundreds of systems that feed into this. Um, but it’s just a function of all of these different, very narrowly defined systems.
[00:44:10] Matt: Hmm.
[00:44:10] Cal: Scanning for threats, scanning for objects, scanning for impediments, reconciling where you’re at versus where you’re trying to go, it all funnels down into speed up, slow down, turn right, turn left.
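As a highly simplified sketch of what Cal means by narrow systems funneling into a handful of commands: each field below stands for the output of a separate, hypothetical subsystem, and `decide` is just a function over them. The field names, the 5-degree threshold, and the rules are illustrative assumptions; a real vehicle stack is far more sophisticated.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Signals:
    obstacle_ahead: bool      # from the threat/object scanner
    stop_signal: bool         # stop sign or red light detected
    heading_error_deg: float  # from route planning; positive means "aim right"
    speed: float              # current speed
    target_speed: float       # speed the planner wants on this stretch


def decide(s: Signals) -> Tuple[str, str]:
    """Funnel the subsystem outputs into one throttle and one steering command."""
    if s.obstacle_ahead or s.stop_signal:
        throttle = "slow down"
    elif s.speed < s.target_speed:
        throttle = "speed up"
    else:
        throttle = "hold speed"

    if s.heading_error_deg > 5:
        steering = "turn right"
    elif s.heading_error_deg < -5:
        steering = "turn left"
    else:
        steering = "straight"
    return throttle, steering


# A bird on the road is just "obstacle_ahead=True"; nothing predicts it will fly off.
print(decide(Signals(True, False, 0.0, 20.0, 30.0)))
```

Nothing in this function reasons about what the obstacle will do next. It only reacts to the current flags, which is the gap between reacting to a detection and the consequential reasoning a human driver applies.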
[00:44:23] Matt: Well, it goes back to what you said, each system has a very defined function. And that makes so much sense, that we as humans, we give it a brain. We think of it as we would think, and I guess that’s the fear of the AI.
[00:44:36] Cal: Yeah. And that’s why people are like, oh my God, there must be a brain driving all of this.
[00:44:41] Matt: Wow.
[00:44:41] Cal: If there’s one thing I’d leave you with, it’s that all of these AI systems that we look at today seem magical. Everything from, like, text generation with ChatGPT, to the image generation stuff, to self-driving cars today, it’s nothing more than a system of very, very specific models working together, orchestrated in harmony.
If you have the knowledge and wherewithal to unpack that and understand what the building blocks are, you quickly learn that there’s no intelligence. They’re impressive. They’re amazing, but there’s no intelligence, at least not as we define it in the human sense.
[00:45:25] Matt: Right, right. Cal, this has been an enlightening talk. I really appreciate your time today. This has been, uh, just a lot of fun to dig in and learn more about what you’re doing and the real business applications. I think we miss it a lot because of the hype, and learning how it can really be applied has just been so, so enlightening today. So thank you so much.
[00:45:51] Cal: Oh, it’s my pleasure. And you know what, I’m glad I finally have something to talk about at Thanksgiving.
[00:45:55] Matt: Absolutely, absolutely. Hey, Cal, if people have questions or if they want to get in touch with you, what would be the best way for them to do that?
[00:46:03] Cal: I am very active on LinkedIn. Please connect with me there. I’d love to hear from you.
[00:46:07] Matt: All right, we’ll put that link in the show notes and, uh, wow. Cal, thanks again. And listener, I hope you had a great cup of coffee listening to this conversation. I know it’s been enlightening for me. I hope it’s been enlightening for you as well. And I look forward to our next cup of coffee on the Endless Coffee Cup.
[00:46:26] Cal: Thanks, Matt. Thanks everyone.
[00:46:28] Matt: Thanks Cal.
You’ve been listening to the Endless Coffee Cup. If you enjoyed this episode, share it with somebody else. And of course, please take just a moment and rate or review us at your favorite podcast service. If you need more information, contact me at Site Logic Marketing dot. Thanks again for being such a great listener.
CEO, Pandata, https://pandata.co/
LinkedIn: Cal Al-Dhubaib