Dr. Yossi Keshet on Decoding Speech, AI, Morality, and the Future

In this episode, we explore linguistic and cultural influences on language with Dr. Yossi Keshet—a renowned expert in automated speech recognition.

We cover the intricacies of jargon, code-switching, and the ethical dimensions of artificial intelligence.

Listen to discover how the convergence of linguistics and computer science is revolutionizing our interaction with technology.

Show Notes

05:26 AIOLA targets foundational industries through AI.

07:34 Automatic speech recognition works similarly to the ChatGPT model.

11:17 American English research bias in speech intelligibility.

13:33 Studying foreign languages improved understanding of grammar.

18:35 Passionate about linguistics and cognitive sciences. No AI has this capability.

20:23 Phenomenal correlation between artificial and neural mechanisms.

26:24 Innovating transcription: improving on old industry practices.

27:35 GPT’s influence on various fundamental industries.

31:56 Using multiple languages can enhance comprehension.

35:07 Switching between languages in code-switching research.

40:47 Superego: Freud’s guilt and fear mechanism. Evolutionary.

42:11 Book writing claiming need for non-standard regulations.

46:46 AI movie plot illustrates ethics in robotics.

50:25 GPT discussion focuses on personalized and helpful interaction.

53:20 End of insightful data-driven episode, future technology.

Transcript

Speaker: 00:00:00

Welcome back to another riveting episode of Data Driven.

Speaker: 00:00:03

Joining us today, lakeside and positively glowing from his

Speaker: 00:00:07

Appalachian retreat, is Frank. Meanwhile, the

Speaker: 00:00:11

always astute and ever energetic Andy is here to keep us

Speaker: 00:00:14

grounded. But enough about us. Today, we have

Speaker: 00:00:18

a true luminary in the field of AI, someone who's blending the worlds

Speaker: 00:00:22

of academia and enterprise with seamless finesse. He's an

Speaker: 00:00:25

associate professor at the Technion, has published over 100

Speaker: 00:00:29

research papers on automated speech recognition, and is the chief

Speaker: 00:00:33

scientist at AIOLA. Please welcome Dr. Yossi

Speaker: 00:00:36

Keshet or as he's known to his friends, Yossi.

Speaker: 00:00:47

Alright. Hello, and welcome to Data Driven, the podcast where we explore the

Speaker: 00:00:50

emergent fields of artificial intelligence, data science, and,

Speaker: 00:00:55

and, of course, data engineering, without which the whole world would probably stop turning.

Speaker: 00:00:59

And you know, data engineering is important. That's

Speaker: 00:01:03

basically it. Still working on that that that revamped

Speaker: 00:01:06

monologue, for, for season 8, Andy. Were

Speaker: 00:01:10

you on vacation? You're on vacation. I am on vacation. And

Speaker: 00:01:14

for those of you who can't see on camera who are not who are

Speaker: 00:01:17

listening, not watching, I am literally lakeside,

Speaker: 00:01:22

in the foothills. Well, not the foothills. We are actually in the Appalachian Mountains. Or

Speaker: 00:01:25

is it Appalachian? I I never I I've heard of those. I I never

Speaker: 00:01:29

got a clear read on it. Say either. So, you know When I say either.

Speaker: 00:01:32

Yeah. Yeah. Yeah. Yeah. Yeah. So I am in Deep Creek Lake,

Speaker: 00:01:36

Maryland, which is kind of like, Maryland doesn't really have a Panhandle

Speaker: 00:01:40

per se, but if it did, it would be this is what this would be.

Speaker: 00:01:44

I probably think I'm 5 miles from West Virginia and about

Speaker: 00:01:47

20 miles from Pennsylvania. So it's kind of like this quiet

Speaker: 00:01:51

little corner of the state.

Speaker: 00:01:54

And I've been, you know, reading and studying

Speaker: 00:01:58

today. I hit day 600 on Pluralsight Consecutive. Nice.

Speaker: 00:02:02

So recording this June 17th. And, how

Speaker: 00:02:06

things with you, Andy? Things are good. I'm gonna throw out a plug for

Speaker: 00:02:10

data driven media dot tv because Frank mentioned.

Speaker: 00:02:13

If you're listening, he while he was mentioning that, he was

Speaker: 00:02:17

actually panning the camera over to the lake. But if

Speaker: 00:02:20

you're, subscribing to data driven media dot tv, you get

Speaker: 00:02:24

to see us. You get to see the video, and you

Speaker: 00:02:28

can see, for instance, that I am wearing the, my data is the

Speaker: 00:02:32

new oil t shirt, which you can pick up. I'm just full of

Speaker: 00:02:35

sponsor stuff today. I'm just doing Well, it's self out. It's

Speaker: 00:02:39

self sponsored. And, honestly, we really need to get better at that. Right? We have

Speaker: 00:02:43

data channel. Tv. There is a for listeners to the show, I will give

Speaker: 00:02:47

a preview. There is gonna be data driven academy is is launching soon. You have

Speaker: 00:02:50

a course coming up the end of the month. Actually, yeah, it's fabric.

Speaker: 00:02:55

Today. We're recording this on 17th. It's 24th

Speaker: 00:02:59

of of June, but I'm also doing, 2 more, at

Speaker: 00:03:03

near the ends of July August. And in addition

Speaker: 00:03:07

to that, while we're shameless plugging away here,

Speaker: 00:03:10

before we get to our very interesting guest, now I'm also bringing

Speaker: 00:03:14

back my, day of Azure Data Factory as wildly

Speaker: 00:03:18

popular. I delivered it at a couple of, conferences,

Speaker: 00:03:22

international conferences, 22, 23. And,

Speaker: 00:03:27

yeah. Let's see see if people are interested. What do you do Friday this

Speaker: 00:03:31

afternoon Friday afternoons, Andy? Oh, there's this thing, Frank. Thanks for

Speaker: 00:03:34

mentioning that. Totally free. We we gotta we're trying to get better at this. That's

Speaker: 00:03:37

all. We do. Yeah. Data engineering Fridays. And if you go to data engineering

Speaker: 00:03:41

fridays.com, you can learn more about that. Frank, you're doing a lot

Speaker: 00:03:45

of stuff with I noticed with using the, encore

Speaker: 00:03:49

replay feature in Restream. And it's

Speaker: 00:03:52

right you you shared that with me. I started doing that with data engineering

Speaker: 00:03:56

Fridays as well. But great a great way to,

Speaker: 00:04:00

you know, to get your message out there. And, you

Speaker: 00:04:04

know, I I had no idea replays would help. But my gosh.

Speaker: 00:04:08

They really have. It's just a matter of just hitting the echo of I

Speaker: 00:04:11

can't even talk. Algorithm the right way. Yeah. And Yeah. You know,

Speaker: 00:04:15

maybe we can get the so I think it's a good segue, for our

Speaker: 00:04:19

guest, Dr. Yossi Keshet. He's the chief

Speaker: 00:04:22

scientist at AIOLA, an AI powered tech

Speaker: 00:04:26

company that automates business workflows

Speaker: 00:04:30

by capturing spoken data. Yossi is also

Speaker: 00:04:33

an associate professor at the Faculty of Electrical and Computer

Speaker: 00:04:37

Engineering at the Technion in Israel.

Speaker: 00:04:41

Yossi is an award-winning scholar and has published over 100 research

Speaker: 00:04:44

papers about automated speech recognition and speech

Speaker: 00:04:48

synthesis. Welcome to the show, Yossi. Hi.

Speaker: 00:04:51

Nice for having me. Thank you for having me. Hey. No problem. No

Speaker: 00:04:55

problem. We are very excited to have you. And, you're not just an

Speaker: 00:04:59

academic, but you've also proven yourself in in actual enterprise. So

Speaker: 00:05:04

which sounds really bad as I say that out loud, but I think you knew

Speaker: 00:05:06

there was a compliment.

Speaker: 00:05:12

But, so what is AIOLA?

Speaker: 00:05:16

Can you tell me a little bit about that? Because I'm curious about that and

Speaker: 00:05:19

and and workflows

Speaker: 00:05:23

around spoken data. So

Speaker: 00:05:27

AIOLA is a company that aims to target

Speaker: 00:05:30

the, you know, the very basic and foundational

Speaker: 00:05:34

industries. Maybe if I

Speaker: 00:05:38

may, let's start with the a general scene of the

Speaker: 00:05:42

automatic speech recognition now, and then you will understand where AIOLA stands, because we

Speaker: 00:05:45

have now OpenAI, and everything is like, you

Speaker: 00:05:49

can say we solved the AI problem. So it's not like that.

Speaker: 00:05:53

So we are in amazing shape in

Speaker: 00:05:57

terms of automatic speech recognition. So we have a paper that shows

Speaker: 00:06:01

that Whisper, the model of OpenAI, is as good as humans in

Speaker: 00:06:04

detecting and transcribing language when we speak about

Speaker: 00:06:08

American English with noise, without noise, and

Speaker: 00:06:12

also, L2 speakers. That is, the

Speaker: 00:06:15

non-native speakers of the

Speaker: 00:06:19

language. And the results are that Whisper, the

Speaker: 00:06:23

OpenAI model, is the same as human listeners. And that is

Speaker: 00:06:26

the main thing. But the thing is that

Speaker: 00:06:30

when you come to industries, usually they have jargon, they have special words.

Speaker: 00:06:35

And those words are either rare in

Speaker: 00:06:38

the language or they are

Speaker: 00:06:42

non-words. It's like, I don't know, when I'm a medical doctor and would like

Speaker: 00:06:46

to perform a surgery, and I would like to transcribe what I'm saying during

Speaker: 00:06:49

the surgery, there are words which are not

Speaker: 00:06:53

often used or which are non-English words. And

Speaker: 00:06:57

in that case, those automatic speech recognizers don't

Speaker: 00:07:00

work at all. They don't detect those words. And at AIOLA, this

Speaker: 00:07:04

is our target: to take those words, which are actually the most important words. Those

Speaker: 00:07:08

are the jargon of the industry, of the facility.

Speaker: 00:07:13

So the goal is to help those industries to come

Speaker: 00:07:17

up with automatic speech recognition for

Speaker: 00:07:21

reporting, for transcribing speech.
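
The gap described here, a recognizer that looks accurate overall yet misses the jargon that matters most, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two sentences are invented, the jargon list is hypothetical, and the helpers `word_error_rate` and `jargon_recall` are not taken from the episode or from any AIOLA product.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Standard dynamic-programming edit distance, computed row by row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def jargon_recall(reference, hypothesis, jargon):
    """Fraction of jargon words spoken in the reference that survive in the hypothesis."""
    spoken = [w for w in reference.lower().split() if w in jargon]
    if not spoken:
        return 1.0
    recognized = set(hypothesis.lower().split())
    return sum(1 for w in spoken if w in recognized) / len(spoken)


# Invented example: what was said vs. what a general-purpose recognizer might output.
reference = "insert the trocar and start the cholecystectomy on this patient"
hypothesis = "insert the trucker and start the collection on this patient"
jargon = {"trocar", "cholecystectomy"}  # hypothetical domain vocabulary

wer = word_error_rate(reference, hypothesis)
recall = jargon_recall(reference, hypothesis, jargon)
print(f"word error rate: {wer:.2f}")    # 2 substituted words out of 10
print(f"jargon recall:   {recall:.2f}")  # neither jargon word was recognized
```

In this toy case eight of ten words are right, yet both domain terms are lost, which is why an overall accuracy figure alone can hide the problem.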

Speaker: 00:07:25

I have a question. When you say automatic, what what makes it automatic? Is

Speaker: 00:07:29

it just kinda, what exactly does that mean?

Speaker: 00:07:34

So automatic speech recognition today works very similar

Speaker: 00:07:38

very, very similar to the way ChatGPT works.

Speaker: 00:07:41

ChatGPT works on a model called the transformer. It's a deep

Speaker: 00:07:45

learning architecture, which has, a

Speaker: 00:07:49

history based on previous recurrent architectures.

Speaker: 00:07:53

And it can predict, as as we all know, it can

Speaker: 00:07:56

predict text amazingly. In speech recognition, automatic

Speaker: 00:08:00

speech recognition, it's almost the same thing, but there is another

Speaker: 00:08:04

component to

Speaker: 00:08:08

this transformer, which is called the encoder.

Speaker: 00:08:12

This part takes the speech and actually transforms it into

Speaker: 00:08:15

a great representation that can be used

Speaker: 00:08:19

with, let's call it, the other side, with

Speaker: 00:08:23

this GPT, together. Together, they can

Speaker: 00:08:27

transcribe speech in, as I described, in a very good

Speaker: 00:08:30

way, as good as humans in some

Speaker: 00:08:33

cases. I will say, like,

Speaker: 00:08:37

I've been messing around with the app that's on the phone,

Speaker: 00:08:41

for ChatGPT, and,

Speaker: 00:08:45

I use the the voice interaction feature. It is

Speaker: 00:08:49

amazingly good at getting rid of the umms, the ahs,

Speaker: 00:08:52

the scatterbrain thoughts that I sometimes have when I talk to it.

Speaker: 00:08:56

Like, it it could kinda really distill a lot of

Speaker: 00:09:00

things. Like, I'm impressed with it. It's it's really gotten last time I

Speaker: 00:09:03

did anything serious with speech recognition was probably, like, maybe 4 years

Speaker: 00:09:07

ago, and it's really improved. Like, I mean, orders of magnitude better

Speaker: 00:09:11

than I thought. I mean, it's almost at Star Trek level. You

Speaker: 00:09:14

know? I'm not sure

Speaker: 00:09:18

in those it depends on the company if it's Apple or

Speaker: 00:09:21

Google. And I'm not sure which they don't declare

Speaker: 00:09:25

which models they use. I think, personally, they don't use Whisper or

Speaker: 00:09:29

the latest model that we have for automatic speech recognition that

Speaker: 00:09:32

is transcribing speech. And the goal is a little bit different

Speaker: 00:09:36

in the in the phone. You actually want to maybe Right. Make,

Speaker: 00:09:40

make notes, send an email, send a text message,

Speaker: 00:09:44

and maybe the vocabulary the vocabulary is less

Speaker: 00:09:48

less defined. There is another problem with

Speaker: 00:09:51

the phones. Oh, no. Go ahead. I want to call my

Speaker: 00:09:55

friend. His name is Xi, and

Speaker: 00:09:59

the last name is Chung. How do you pronounce it?

Speaker: 00:10:03

What what do you do with that? I'm gonna say he or chi or

Speaker: 00:10:07

so there is a problem of proper names and how do you

Speaker: 00:10:10

define them. And this is a completely different problem. It's still an open problem, and

Speaker: 00:10:14

the goal is a little bit different. So

Speaker: 00:10:18

it's when we are assessing the quality of those models, it's

Speaker: 00:10:22

a little bit different than the assessment of just spoken language

Speaker: 00:10:26

like what we do now. No. I mean, that's a great point. I mean, my

Speaker: 00:10:30

last name has, you know, technically is Lavin.

Speaker: 00:10:34

But, you know, growing up for for reasons many,

Speaker: 00:10:38

big and small, it became Lavinia. And like, so, like,

Speaker: 00:10:42

the phone, depending on if it's Android or an Apple, it will, it

Speaker: 00:10:46

gets confused pretty easily.

Speaker: 00:10:50

And that is an interesting point. Some names, Andy is lucky to have an

Speaker: 00:10:54

easy name for the, the system.

Speaker: 00:10:58

But not everybody does. So I understand that. Sure.

Speaker: 00:11:02

I also wanna double click on American

Speaker: 00:11:06

English. You you you said that a bunch of times. Like, is there is there

Speaker: 00:11:09

an inherent bias in these model trainings because these are done by American

Speaker: 00:11:13

companies? Yes. There is. Okay. The

Speaker: 00:11:17

data is mostly American English. The research institutes

Speaker: 00:11:21

are mostly American. So the reason maybe I don't know

Speaker: 00:11:24

if you'd call it you call it inherent or implicit bias, but there is a

Speaker: 00:11:28

bias, definitely.

Speaker: 00:11:33

We are investigating, by the way, the the intelligibility

Speaker: 00:11:37

of speech in some cases And what is the intelligibility of

Speaker: 00:11:40

of an American listener versus the intelligibility of

Speaker: 00:11:44

myself, who is not an American listener, but I know English.

Speaker: 00:11:48

What is the best, quote-unquote, speaker? What is the best

Speaker: 00:11:51

listener? How can we transform those

Speaker: 00:11:57

to speech recognizer? How can we transform those to assessing the

Speaker: 00:12:01

quality of speech? What does it mean? What does it mean about the pathologies in

Speaker: 00:12:04

speech? And this is ongoing research on

Speaker: 00:12:08

this on this field. Interesting.

Speaker: 00:12:12

I I often wonder, like, you know, what it's not just English.

Speaker: 00:12:16

Right? Like, you know, if you listen to Spanish, like, there's different dialects of

Speaker: 00:12:19

Spanish. Right? Even even German. You know, I'm sure

Speaker: 00:12:23

there's, you know, plenty of dialects of all these languages and,

Speaker: 00:12:26

like, how do you the training of a

Speaker: 00:12:30

model that where it can get to be as good at

Speaker: 00:12:33

understanding x and x versus x and y versus, you know,

Speaker: 00:12:37

the base language, the base standard. I don't know. That's

Speaker: 00:12:41

fascinating. It seems like it seems like it could be an endless loop of, like,

Speaker: 00:12:45

training. It it is. Indeed, it

Speaker: 00:12:48

is. And when we train, there is another so I'm I'm

Speaker: 00:12:52

working on deep learning and AI. And what we found out

Speaker: 00:12:55

that it it may it may be the case that if you train

Speaker: 00:12:59

on one language, a huge amount of data from one language, let's say

Speaker: 00:13:03

American English, but then train on less data on Spanish,

Speaker: 00:13:07

you actually get you get some advantage of training from

Speaker: 00:13:11

the American English. So, again, in this model, Whisper, of

Speaker: 00:13:14

OpenAI, most of the data is American English, but,

Speaker: 00:13:18

actually, other languages are really great.

Speaker: 00:13:22

Again, Spanish is amazing. So maybe like

Speaker: 00:13:26

humans maybe like humans as we learn more and more languages, it's easier

Speaker: 00:13:29

for us. This is very interesting, point.

Speaker: 00:13:33

No. That's an interesting idea because I know, like, I never

Speaker: 00:13:37

understood American English grammar, American or otherwise,

Speaker: 00:13:41

until I studied a foreign language. And then when I studied it, it was German.

Speaker: 00:13:45

And, you know, German kept a lot of the archaic things that

Speaker: 00:13:49

are in English, and it

Speaker: 00:13:53

continues to keep them important. Like in English, you know, who

Speaker: 00:13:57

and whom used to confuse the you know what out of me.

Speaker: 00:14:01

Right? But when I when I learned in German about different cases and things

Speaker: 00:14:04

like that, I was like, oh, that's why it is. Right? So,

Speaker: 00:14:08

like, all these things that just like you said, like, learning another

Speaker: 00:14:12

having more data or data from another point of view, I suppose,

Speaker: 00:14:16

or another way to look at the world helped me look at my world

Speaker: 00:14:20

a little better. Maybe maybe that's how

Speaker: 00:14:24

AI will work too. I don't know.

Speaker: 00:14:28

Maybe. We don't know. We actually have a guess about that

Speaker: 00:14:32

because those networks actually solve an optimization problem, a

Speaker: 00:14:35

mathematical optimization problem. It's a problem

Speaker: 00:14:40

that we define with an equation, and we need to have

Speaker: 00:14:44

a computer run and solve it. The equation is

Speaker: 00:14:48

over a training set of examples. So it's, one

Speaker: 00:14:51

person says this, another person says something else.
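
As a rough illustration of "an equation over a training set that a computer solves", here is a minimal sketch. It assumes a toy one-feature linear model, a mean squared error loss, and plain gradient descent. The function name `train`, the data, and the hyperparameters are all hypothetical, and real speech models are vastly larger and trained very differently in detail.

```python
def train(examples, steps=2000, learning_rate=0.05):
    """Fit y ~ w * x + b by minimizing mean squared error over the examples."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        # Gradient of the average squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        # Step downhill on the loss surface.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(examples, w, b):
    return sum((w * x + b - y) ** 2 for x, y in examples) / len(examples)


# Toy training set generated from y = 2x + 1 (an assumption for illustration).
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = train(examples)
print(f"w = {w:.3f}, b = {b:.3f}")
print(f"loss before: {mean_squared_error(examples, 0.0, 0.0):.3f}")
print(f"loss after:  {mean_squared_error(examples, w, b):.6f}")
```

The only "knowledge" the procedure receives is the set of examples and the loss to reduce. No rules about the data are written in by hand, which mirrors the point made here about training on examples.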

Speaker: 00:14:55

And what happened is that when, again, when we have

Speaker: 00:14:59

a large amount of data,

Speaker: 00:15:03

it seems that those those networks get to an amazing place.

Speaker: 00:15:07

So this this, algorithm, this whisper or other

Speaker: 00:15:10

algorithms, it's really from the recent years, like 2, 3 years.

Speaker: 00:15:14

That's it. We it's they they perform amazingly

Speaker: 00:15:18

amazingly, with the with the

Speaker: 00:15:22

same with the same mechanism, not with the same amount of

Speaker: 00:15:25

data. Yeah. That's that's that's the

Speaker: 00:15:29

fascinating aspect of all of this. It's just that some of these things just seem

Speaker: 00:15:33

some problems seem harder than they ought to be,

Speaker: 00:15:37

and then some solutions to problems seem way more effective than they

Speaker: 00:15:41

ought to be. It's an interesting also to say

Speaker: 00:15:45

it's always the case that we so Whisper, OpenAI Whisper, was trained

Speaker: 00:15:49

on 600,000 hours,

Speaker: 00:15:53

way, way much more than just a kid learning a language.

Speaker: 00:15:56

A kid learning a language is exposed to far fewer hours of

Speaker: 00:16:00

speech, less less accurate, less,

Speaker: 00:16:04

coherent. And this is something,

Speaker: 00:16:08

Noam Chomsky raised years ago, like, 50 years ago.

Speaker: 00:16:12

And it's still an open question. Like, if we can make those

Speaker: 00:16:16

systems work better, if we know the language,

Speaker: 00:16:22

I guess you learn German faster than any

Speaker: 00:16:25

machine that works today.

Speaker: 00:16:30

That's yeah. It's it's and I'm glad you mentioned Noam

Speaker: 00:16:34

Chomsky because that kinda was like so for those who don't know, Noam

Speaker: 00:16:37

Chomsky is, among other things, a noted linguist scholar.

Speaker: 00:16:42

I highly recommend you do a search on him because that's a that's a

Speaker: 00:16:46

good Wikipedia rabbit hole to fall into. But,

Speaker: 00:16:50

how much does linguistics come up in this? Right? Because I think

Speaker: 00:16:54

what's fascinating about this field for me is a lot

Speaker: 00:16:57

of, my grandfather, my great grandfather

Speaker: 00:17:01

was a linguistics professor. And, you know, as the

Speaker: 00:17:05

family lore goes, I never met him. He died a decade or two before I was

Speaker: 00:17:08

born. He spoke, like, 12 languages. He was a professor of, like, 5

Speaker: 00:17:12

or 6. And, you know, a lot of people in my family

Speaker: 00:17:16

seem to have on that side of the family seem to be gifted in language.

Speaker: 00:17:20

And one of the fields I was tempted to study in

Speaker: 00:17:23

university was linguistics. And I just find

Speaker: 00:17:27

it interesting how there's

Speaker: 00:17:31

a now a Venn diagram now is much larger

Speaker: 00:17:35

than it used to be in terms of linguistics and computer science.

Speaker: 00:17:38

So what are your thoughts on? Like, how much does like,

Speaker: 00:17:42

if you're if you have a

Speaker: 00:17:46

company like AIOLA. Right? Like, how many people are, you know, honest to

Speaker: 00:17:50

goodness, linguists versus computer scientists and and AI engineers?

Speaker: 00:17:55

So there are no linguists there. Oh,

Speaker: 00:17:59

really? Okay. There are no linguists. But I have to tell you, so there was

Speaker: 00:18:02

a professor called Frederick Jelinek. He was the

Speaker: 00:18:06

head of language research at Johns Hopkins University

Speaker: 00:18:10

in Baltimore. He was amazing. He was one of the smartest

Speaker: 00:18:14

people on earth. And he

Speaker: 00:18:18

developed many of the speech recognition algorithms. He said,

Speaker: 00:18:22

every time I fire a linguist, the performance of the speech recognizer goes

Speaker: 00:18:26

up.

Speaker: 00:18:32

And this is, this is embarrassing. But I,

Speaker: 00:18:36

myself, first, really like

Speaker: 00:18:40

linguistics. I really like cognitive sciences, and I really

Speaker: 00:18:44

try to combine it with my work. But it's really

Speaker: 00:18:47

amazing that all those AI systems

Speaker: 00:18:51

don't have any of that. So you don't teach ChatGPT

Speaker: 00:18:55

what is a noun, what is a verb, what is anything. You don't train

Speaker: 00:18:59

speech that this is the

Speaker: 00:19:02

this is the you don't you don't use linguist. You don't use this is

Speaker: 00:19:06

the prominent word. This is the end of the sentence. It just happened

Speaker: 00:19:10

by huge amount of data. And

Speaker: 00:19:14

this is interesting. This somehow contradicts Noam Chomsky, who said that

Speaker: 00:19:17

there is a universal grammar. There is a

Speaker: 00:19:21

we are born with innate language. There is a

Speaker: 00:19:24

maybe some black box in our brain which

Speaker: 00:19:28

is tuned to learn a language. And,

Speaker: 00:19:33

we are not sure about that. There is no direct proof if it's correct or

Speaker: 00:19:37

not. We are born with language. We are as humans, we're

Speaker: 00:19:40

born with language. We this is part of our, human being.

Speaker: 00:19:44

We are not born with written language. So written language was invented.

Speaker: 00:19:48

The spoken language is something like like a zebra

Speaker: 00:19:52

has stripes. This is this is our nature, and this is

Speaker: 00:19:56

interesting. This is not happening not happening in

Speaker: 00:19:59

AI. The biggest successes didn't have linguists; they don't have any

Speaker: 00:20:03

restriction on what should be said or not.

Speaker: 00:20:10

Maybe AI will be a tool to somehow

Speaker: 00:20:15

make linguistic research more effective and

Speaker: 00:20:18

try to understand what happens in the brain, what happens in the cognition part.

Speaker: 00:20:23

But I would like to tell you about another research we are preparing here, which

Speaker: 00:20:27

is really amazing. One of the things is that we have

Speaker: 00:20:31

so there is this ChatGPT. It's a language model.

Speaker: 00:20:35

We also have something in the brain. It's also a neural network.

Speaker: 00:20:38

And when we try to compare them, there is a huge

Speaker: 00:20:42

correlation between what happens in the artificial neural

Speaker: 00:20:46

network of GPT and the

Speaker: 00:20:50

biological neural network in the brain. And, it was

Speaker: 00:20:54

shown, several years ago, and here we

Speaker: 00:20:57

show it again with, with this, with the most modern,

Speaker: 00:21:01

automatic speech recognizers. So this is

Speaker: 00:21:05

a phenomenal correlation between the artificial and the

Speaker: 00:21:09

neural mechanisms. I was gonna ask about that

Speaker: 00:21:13

because I'm I'm familiar with, you know, at least the abstracts of

Speaker: 00:21:17

the research, from a few years ago and now. And

Speaker: 00:21:20

I was curious if there had been any new correlations

Speaker: 00:21:24

or, you know, or new research, new connections that have been made

Speaker: 00:21:28

between machines learning languages

Speaker: 00:21:32

and the way our brains work. It sounds like

Speaker: 00:21:36

that's true.

Speaker: 00:21:39

So we just initiated

Speaker: 00:21:43

research here in my lab about that. There were

Speaker: 00:21:48

some French guys, Jean-Rémi King

Speaker: 00:21:52

and his colleagues at Meta. And

Speaker: 00:21:57

and I forgot the university in France. So they

Speaker: 00:22:01

showed that there are those correlations. They showed simple correlation.

Speaker: 00:22:05

They showed it with LLMs, with language models. What we show is a little bit

Speaker: 00:22:09

different. We show correlation with automatic speech

Speaker: 00:22:12

recognition. So we ask people under fMRI, under MRI.

Speaker: 00:22:16

We scan their brain at some

Speaker: 00:22:19

resolution, and we try to find correlation with their brain activity

Speaker: 00:22:23

during reading and during speaking aloud,

Speaker: 00:22:27

and ask what is the correlation with the the best model we know for

Speaker: 00:22:31

speech recognition. And there are correlations.
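
To make "correlation" concrete, here is a minimal sketch of a Pearson correlation between two aligned series. This is an assumption-laden toy, not the lab's analysis: the `pearson` helper is written here for illustration, and both number lists are invented stand-ins for a model's activation and a brain measurement sampled at the same moments.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Invented numbers: one value per time step for each signal.
model_activation = [0.10, 0.40, 0.35, 0.80, 0.90]
brain_signal = [1.0, 1.9, 1.6, 3.1, 3.2]

r = pearson(model_activation, brain_signal)
print(f"r = {r:.3f}")  # close to 1 means the two signals rise and fall together
```

A value near 1 or -1 indicates a strong linear relationship and a value near 0 indicates none. Correlation alone does not show that the two systems compute the same thing.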

Speaker: 00:22:35

I have to say that there is a mechanism in the transformer, this

Speaker: 00:22:39

architecture of neural network. There is a mechanism called attention. This

Speaker: 00:22:42

mechanism allows those models to have the connection between

Speaker: 00:22:46

words and themselves. So, "I'm eating an

Speaker: 00:22:50

apple. It was delicious." So "it" refers to the apple.
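
The idea that "it" gets linked back to "apple" can be sketched with scaled dot-product attention for a single query word. This is a minimal illustration under strong assumptions: the two-dimensional vectors are hand-picked rather than learned, the `attention` helper is hypothetical, and real transformers use learned projections, many attention heads, and far larger vectors.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    dim = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
              for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    output = [sum(w * value[i] for w, value in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output


# Hand-picked toy embeddings (an assumption): "it" is placed near "apple".
embeddings = {
    "eating": [1.0, 0.0],
    "apple": [0.0, 2.0],
    "it": [0.1, 1.5],
}
context = ["eating", "apple"]  # earlier words that "it" can look back at
keys = [embeddings[word] for word in context]
values = keys

weights, output = attention(embeddings["it"], keys, values)
for word, weight in zip(context, weights):
    print(f"{word}: {weight:.2f}")
```

With these vectors the weight on "apple" is much larger than the weight on "eating", which is the sense in which attention lets a word refer back to another word.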

Speaker: 00:22:54

Okay? So there is an attention mechanism. This is what makes those

Speaker: 00:22:57

models amazing. So there is an attention mechanism, I guess, in the

Speaker: 00:23:01

brain. So we try to correlate the attention mechanism in

Speaker: 00:23:04

the models and compare it to the activity in the brain. We don't have

Speaker: 00:23:08

results yet, but it seems promising. And we also ask

Speaker: 00:23:12

another question. What if you don't read aloud? What if you read

Speaker: 00:23:15

like silent reading? What if you have dyslexia? What if you have,

Speaker: 00:23:19

other types of pathology? What

Speaker: 00:23:23

are the correlations then? So this is fascinating. So and

Speaker: 00:23:27

there is correlation. I don't know yet what's going to happen

Speaker: 00:23:31

with that, with the pathologies, but it's unbelievable, the

Speaker: 00:23:34

correlation. That that is really exciting,

Speaker: 00:23:38

especially when you're examining things like dyslexia,

Speaker: 00:23:41

which is considered, you know, not normal,

Speaker: 00:23:45

or maybe that's not the right term for it, but a

Speaker: 00:23:48

challenge at a minimum. The cool the cool kids call that neurodivergent

Speaker: 00:23:52

now. I think Neurodivergent. Thank you, Frank. So when you're studying, you

Speaker: 00:23:56

know, when you're studying that sort, I'm wondering if there's a place for

Speaker: 00:24:00

that, in in the artificial.

Speaker: 00:24:04

I'm curious. What what do you mean? Can you

Speaker: 00:24:08

So, yeah, is there is is there any benefit

Speaker: 00:24:12

to, I say, transferring the thought processes

Speaker: 00:24:16

of people who are neurodivergent and and automating that

Speaker: 00:24:20

and making that part of the, you know,

Speaker: 00:24:23

the the language model or or speech recognition?

Speaker: 00:24:29

Yeah. I think so. I think so. 1st, it's a it's a tool

Speaker: 00:24:33

to analyze what happens in the

Speaker: 00:24:36

brain. Yeah. What happened

Speaker: 00:24:40

but it's very difficult. So we don't have any debugger for

Speaker: 00:24:44

the brain. We don't see the code of the brain. We don't see that this

Speaker: 00:24:47

function doesn't work. And it's, most of the work

Speaker: 00:24:51

is to design the experiment and

Speaker: 00:24:55

and it's really amazing. In our design, we have the

Speaker: 00:24:58

same, so as I told you, I'm asking people to read aloud

Speaker: 00:25:02

and compare it to what automatic speech recognition

Speaker: 00:25:06

is supposed to do. But I'm

Speaker: 00:25:09

also asking people to read silently, and then I follow

Speaker: 00:25:13

their eyes. I have a machine that follows their eyes, and

Speaker: 00:25:17

I know where, like, I

Speaker: 00:25:20

track their eyes and I see which word they are reading

Speaker: 00:25:24

now. And I can and I can use that to follow

Speaker: 00:25:28

what what they read. But in order to operate that on a speech

Speaker: 00:25:32

recognizer model, I need the speech. So it's during the design of

Speaker: 00:25:35

the experiment, I need artificial speech or I need them to to read aloud

Speaker: 00:25:39

afterwards. It's a it's a big, it's a big question

Speaker: 00:25:43

how to do that properly and how to

Speaker: 00:25:46

make things happen, but definitely working with

Speaker: 00:25:50

people with problems: first, to help them.

Speaker: 00:25:55

And second, to understand them. And third, to maybe

Speaker: 00:26:00

understand the brain and make AI better.

Speaker: 00:26:04

I also think, like, stroke victims, right, could benefit down the line

Speaker: 00:26:07

from a better understanding of language models. Right? Like, maybe there would be some

Speaker: 00:26:11

kind of therapy that could be directed to that. I think I think it's

Speaker: 00:26:15

fascinating. I always love those fields where they touch upon more than 1 thing.

Speaker: 00:26:19

Right? This isn't just math. This isn't just computer science. Like, it's linguistics. But,

Speaker: 00:26:23

you know, it's a little bit of everything. It's like a giant, like, pot of

Speaker: 00:26:26

stew that you just throw a bunch of stuff in, and it all kind of

Speaker: 00:26:28

mixes. And, like, it's kind of like, almost like intellectual gumbo,

Speaker: 00:26:32

I guess, would be the word. Right? But,

Speaker: 00:26:37

what what,

Speaker: 00:26:42

what drove you to make, your your your

Speaker: 00:26:45

your company? Like, what what was the driving force to

Speaker: 00:26:49

say, hey. You know, we have

Speaker: 00:26:54

I remember many, many years ago in an office, and you would always see

Speaker: 00:26:57

doctors talking into these little, like, miniature recorders.

Speaker: 00:27:01

Right? In the olden days, they would go off to

Speaker: 00:27:05

some data center somewhere and somebody would not data center, but, like,

Speaker: 00:27:08

some typing center, call center where people would

Speaker: 00:27:12

transcribe that. You know, obviously, that is now an artifact of

Speaker: 00:27:16

the past as these models have gotten better.

Speaker: 00:27:22

What what was the goal in in in, your

Speaker: 00:27:25

company to say we can do this better? What what was the the that breakthrough

Speaker: 00:27:29

moment of, like, here's here's what the industry already does. Here's how we can do

Speaker: 00:27:33

it better. So there is

Speaker: 00:27:36

so we all know ChatGPT, and it influences our lives. We search now,

Speaker: 00:27:40

instead of Google, we search with GPT and it's amazing. It's unbelievable.

Speaker: 00:27:45

So I thought, what about the very fundamental industries? What

Speaker: 00:27:48

about,

Speaker: 00:27:52

like, when you check an airplane, you

Speaker: 00:27:56

use a special jargon. You cannot touch anything. You cannot

Speaker: 00:28:00

leave even a pen there because otherwise the the plane wouldn't be,

Speaker: 00:28:04

valid for flight. What about industries like the food

Speaker: 00:28:08

industries when you need to report, the process? You

Speaker: 00:28:12

have gloves, you cannot touch an iPad, you can barely

Speaker: 00:28:15

write. And what about, other industries

Speaker: 00:28:19

like, maybe the chip technology, when you make nanotechnologies and

Speaker: 00:28:23

when you make chips, you make, you know,

Speaker: 00:28:26

silicon chips and silicon

Speaker: 00:28:30

wafers. So you are in coveralls.

Speaker: 00:28:34

You are with gloves. You need to report the process. All

Speaker: 00:28:38

those industries have special jargon. They use special

Speaker: 00:28:41

terms to describe what they're doing. They don't have access to

Speaker: 00:28:46

to write something,

Speaker: 00:28:51

and they are very limited in the way they report. And on the other

Speaker: 00:28:54

hand, we had speech recognition, but speech recognition doesn't work on

Speaker: 00:28:58

those jargon words. Those jargon words are actually the

Speaker: 00:29:02

most important to those industries, and this was the goal for

Speaker: 00:29:05

aiOla. So what we do is we operate

Speaker: 00:29:08

automatic speech recognition, the best automatic speech recognition,

Speaker: 00:29:12

but we also operate something else. We also operate something called keyword spotting.

Speaker: 00:29:16

It's another deep network, which is focused

Speaker: 00:29:20

on detecting only the jargon words. So you can define those jargon

Speaker: 00:29:24

words in advance. You don't need to train them. You you can

Speaker: 00:29:28

define them, and they all work together. They work as a

Speaker: 00:29:31

complementary couple to make a

Speaker: 00:29:36

very robust prediction, and we can detect those,

Speaker: 00:29:41

jargon words and make reporting on the

Speaker: 00:29:44

process just by speaking. So it

Speaker: 00:29:48

can be used in any industry,

Speaker: 00:29:51

any, industry that doesn't

Speaker: 00:29:55

have access to the most modern AI system, the speech

Speaker: 00:29:59

recognizer wouldn't work there. They have problems, like,

Speaker: 00:30:03

writing and formulating their reports.

Speaker: 00:30:06

Yeah. So I'm curious how those work together. You mentioned

Speaker: 00:30:10

that you've got the speech recognizer. You've got the keyword,

Speaker: 00:30:15

engine. Are they 2 separate engines that are just always running

Speaker: 00:30:18

maybe agents, running at the same time or are

Speaker: 00:30:22

they encapsulated, say, is the speech

Speaker: 00:30:25

recognizer does the speech recognizer have a, you know, a

Speaker: 00:30:29

subset or a a function built into it to do the

Speaker: 00:30:33

keyword recognition? So just to

Speaker: 00:30:37

be sure, those keywords in some industries are not are

Speaker: 00:30:40

not are not English words. So it can be a word which nobody

Speaker: 00:30:44

knows about. It was not shown in the in

Speaker: 00:30:47

the, like, on the Internet. Like, ChatGPT is trained on the data over the

Speaker: 00:30:51

Internet. There are some words that are not there. This is

Speaker: 00:30:55

your proprietary company. You have invented a word to

Speaker: 00:30:58

describe what this part of the engine is. So

Speaker: 00:31:02

Yeah. So we have this keyword spotting. It

Speaker: 00:31:06

is trained to detect keywords in general. They are defined

Speaker: 00:31:10

by text, and it operates. We have two modes of operation. One of them

Speaker: 00:31:13

works on the encoder part of

Speaker: 00:31:17

the automatic speech recognition, and then it guides,

Speaker: 00:31:20

it steers the speech recognition towards the correct

Speaker: 00:31:25

transcription. And there is another mode, which is,

Speaker: 00:31:29

our own encoder, our own representation of

Speaker: 00:31:32

speech, and then it also guides the automatic speech

Speaker: 00:31:36

recognition to a better, location and to detect those

Speaker: 00:31:39

words. And, actually, we can show that you can combine

Speaker: 00:31:43

any words, they can be from different languages, and we can

Speaker: 00:31:47

detect them, like, almost 100% correct, those jargon

Speaker: 00:31:50

words. That was that was going sorry. Go ahead.
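
Editor's sketch: The idea described above, a list of jargon terms defined as text that steers the recognizer toward the right transcription, can be illustrated with a toy post-processing step. This is not the keyword-spotting network discussed in the episode, which is described as working on the recognizer's encoder. It is a much simpler stand-in based on string similarity, and the function name `bias_transcript`, the threshold value, and the sample words are all hypothetical.

```python
from difflib import SequenceMatcher


def bias_transcript(words, jargon, threshold=0.75):
    """Replace each recognized word with the closest jargon term
    when the two strings are similar enough (assumed threshold)."""
    corrected = []
    for word in words:
        best_term, best_score = None, 0.0
        for term in jargon:
            # ratio() returns a similarity score between 0.0 and 1.0
            score = SequenceMatcher(None, word.lower(), term.lower()).ratio()
            if score > best_score:
                best_term, best_score = term, score
        # Keep the recognizer's word unless a jargon term is a close match
        corrected.append(best_term if best_score >= threshold else word)
    return corrected


# Hypothetical recognizer output with one misrecognized aviation term
hypothesis = ["check", "the", "ayleron", "hinge"]
print(bias_transcript(hypothesis, ["aileron", "pitot"]))
```

The jargon list is plain text supplied in advance, which mirrors the point made in the conversation that the terms are defined rather than trained. A real system would make this decision on audio features, not on spelled-out words.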

Speaker: 00:31:55

No. No. No. Sorry. That no. That's okay. That that makes perfect

Speaker: 00:31:58

sense now, what you just said about the languages using

Speaker: 00:32:02

multiple languages, you know, English plus all of the

Speaker: 00:32:06

other languages because sometimes

Speaker: 00:32:09

people will struggle if they're English as a second

Speaker: 00:32:13

language speakers. They'll struggle to find the right

Speaker: 00:32:16

English word, and they'll substitute a word from their native language.

Speaker: 00:32:20

And in other cases, they'll be perhaps teaching

Speaker: 00:32:25

on a topic, and they may revert back

Speaker: 00:32:28

to an older language, Greek, Latin, something

Speaker: 00:32:32

like that. That may be part of the, the

Speaker: 00:32:36

lecture or, you know, I could see that in

Speaker: 00:32:39

medicine. I could see it in, you know, all all sorts

Speaker: 00:32:43

of literature studies. I could see a lot of that. And that

Speaker: 00:32:47

that kinda clicked for me as you were saying that that makes sense that you

Speaker: 00:32:50

would have additional languages. Yeah. I also wonder, like, in in

Speaker: 00:32:54

also conversational context. Right? Like, you know, Spanglish is a

Speaker: 00:32:57

thing. Franglais is the French and

Speaker: 00:33:01

English kinda mashed together, and I know that other language

Speaker: 00:33:05

whenever you have 2 groups of people kinda come together, like, you know, there's always

Speaker: 00:33:08

some kind of weird mix of language that that kinda

Speaker: 00:33:12

just evolves either naturally or forced. I mean, that's Right. That's another

Speaker: 00:33:16

debate. Are you thinking Belter Creole? Belter, you know, I

Speaker: 00:33:20

wasn't going there, but that that's a that's an excellent example.

Speaker: 00:33:24

So, Yossi looks very confused. So there's a series of

Speaker: 00:33:27

books, called The Expanse. It was an excellent TV show

Speaker: 00:33:31

for about 6 seasons, and it's basically set, 2,

Speaker: 00:33:35

300 years in the future.

Speaker: 00:33:38

And as humans colonize the asteroid belt,

Speaker: 00:33:42

there, people from all over the world kinda all end up living

Speaker: 00:33:46

together. So, like, the Belter Creole language is a

Speaker: 00:33:49

creole of, you know, literally dozens of languages. Right?

Speaker: 00:33:53

So, like, it'll switch from, you know, Hindi to Arabic to,

Speaker: 00:33:57

English to French to there's even some German in there. I've heard some of that.

Speaker: 00:34:01

Like, and there are these kind of these weird mixes of things. Right? So they'll

Speaker: 00:34:05

say the the word for the Belter people, like,

Speaker: 00:34:08

people who live in the Belt, is Beltalowda. Belt obviously comes from, you

Speaker: 00:34:12

know, the asteroid belt, English. Lowda, I think, is a Hindi term. I

Speaker: 00:34:16

think. Don't hate on me in the comments. Don't hate on me in the comments.

Speaker: 00:34:19

But, I know wallah is a Hindi term. Right? So

Speaker: 00:34:23

they'll, you know, when they talk to people who live on Earth or

Speaker: 00:34:26

Mars, they refer to them as well wallahs, gravity well

Speaker: 00:34:30

wallahs. Right? Like so it's like, and I only know wallah because

Speaker: 00:34:34

of dish wallahs, and Wired Magazine did a whole story about dish wallahs in

Speaker: 00:34:38

the nineties. Anyway, but I mean, I think, like, you know, I

Speaker: 00:34:42

I suppose that approach could work for something like a creole. Right? Like, we have

Speaker: 00:34:46

multiple languages kinda mixed together. Or is that not really a

Speaker: 00:34:50

massive business case?

Speaker: 00:34:54

Creole is really complicated. It's a language. It's like a

Speaker: 00:34:57

real language, and it's complicated. The more

Speaker: 00:35:01

delicate case of that is what we call in research code-switching. When

Speaker: 00:35:05

I'm Right. When I speak Hebrew, for example, I don't have a

Speaker: 00:35:08

word for the, you know, the Internet router. So I say the router in

Speaker: 00:35:12

English. Or I say email, or I will say

Speaker: 00:35:17

I don't know. There are so many words in English that are used especially

Speaker: 00:35:21

in technology that you use worldwide in other languages, and this

Speaker: 00:35:24

is code switching. There is another case. I think Andy pointed it

Speaker: 00:35:28

out that sometimes when you are stressed

Speaker: 00:35:32

or let's say your L1 is Spanish, but your L2 is American

Speaker: 00:35:36

English or you're bilingual. And sometimes when you are

Speaker: 00:35:39

stressed, you just switch the one

Speaker: 00:35:43

word, and this is an amazing phenomenon. This is research with Tamar Gollan

Speaker: 00:35:47

from UC San Diego and Matt Goldrick from Northwestern

Speaker: 00:35:51

University. And I provide, again, a mechanism to detect

Speaker: 00:35:55

that and to do research on that. And the key question is,

Speaker: 00:35:58

like, why do you do that? Why and when do you do that? Is

Speaker: 00:36:01

it stress? What is the state of

Speaker: 00:36:05

describing those? Are you gonna describe it in the American

Speaker: 00:36:09

way, the Spanish word, or is it gonna be vice

Speaker: 00:36:13

versa? And this is really interesting.
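
Editor's sketch: To make the notion of a code-switch point concrete, here is a toy text-level illustration. The detector described in the episode works on speech, and nothing here reflects how it is built. The tiny word lists, the language labels, and the function name `find_switch_points` are assumptions made up for this example.

```python
def find_switch_points(tokens, lexicons, default="unknown"):
    """Tag each token with a language from small word lists and
    return the tags plus the indices where the language changes."""
    tags = []
    for tok in tokens:
        lang = default
        for name, vocab in lexicons.items():
            if tok.lower() in vocab:
                lang = name
                break
        tags.append(lang)
    # A switch is a change between two known languages at adjacent tokens
    switches = [
        i for i in range(1, len(tags))
        if tags[i] != tags[i - 1] and default not in (tags[i], tags[i - 1])
    ]
    return tags, switches


# Hypothetical Spanish sentence with one English technology word
lexicons = {"es": {"necesito", "el", "ahora"}, "en": {"router"}}
tags, switches = find_switch_points(["necesito", "el", "router", "ahora"], lexicons)
print(tags)      # ['es', 'es', 'en', 'es']
print(switches)  # [2, 3]
```

The two switch indices mark the move into English at "router" and the move back to Spanish right after it, which is the single-word borrowing pattern described in the router and email examples.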

Speaker: 00:36:18

It's not my field of research. I just know how to detect them

Speaker: 00:36:22

and, and Interesting. To detect them really well,

Speaker: 00:36:26

but I don't know why it happens and what is the mechanism

Speaker: 00:36:29

behind that. I could definitely see,

Speaker: 00:36:35

the opportunity with starting with being

Speaker: 00:36:38

able to detect, you know, these I

Speaker: 00:36:42

don't I don't know the right word for them. I'll I'll call them modes. You

Speaker: 00:36:46

know, a mode of speech where someone is mixing 2

Speaker: 00:36:49

languages. And I'm sure those vary.

Speaker: 00:36:53

So Like when I go Jersey on you. Right? That's we we

Speaker: 00:36:57

can't we can't say any more about that, Frank. We're trying to keep our

Speaker: 00:37:00

clean rating. But yes. Exactly. But,

Speaker: 00:37:05

that's, sorry, inside joke. But the,

Speaker: 00:37:08

but, yeah, I could see modes of speaking where someone who is

Speaker: 00:37:12

more familiar with English as a second language.

Speaker: 00:37:16

And and they've still you know, of course, they know their native language. They'll always

Speaker: 00:37:20

know that. But as they I don't I don't wanna use the wrong word

Speaker: 00:37:23

here, but I'm thinking experience is probably the best word is they get more

Speaker: 00:37:27

experience, gain more experience with their second language.

Speaker: 00:37:31

They may switch words less or switch languages

Speaker: 00:37:35

less. And detecting that, I think, is the

Speaker: 00:37:38

is key. I understand now more about what what you're doing, what

Speaker: 00:37:42

you're accomplishing. And that that's the

Speaker: 00:37:46

very first step to then being able to produce speech

Speaker: 00:37:50

in those different modes. And that would be a

Speaker: 00:37:53

fascinating, you know, a fascinating accomplishment.

Speaker: 00:37:58

The more we can have machines

Speaker: 00:38:01

speak to us in the language that we're most familiar with, that,

Speaker: 00:38:05

of course, you know, is is almost there now, mostly

Speaker: 00:38:09

there right now, but have it be able to to speak to us in these

Speaker: 00:38:13

different modes where we where the machine switches where it's

Speaker: 00:38:17

back to our first language, you know, based

Speaker: 00:38:20

on some algorithmic calculation. That sounds

Speaker: 00:38:24

fascinating. Yeah. It is.

Speaker: 00:38:27

I'm not sure we are there yet. It's we have a long way to go

Speaker: 00:38:31

there. But, Sure. Yeah. Makes

Speaker: 00:38:34

sense. Fascinating. Well, this is how it starts, though. Right?

Speaker: 00:38:41

This is fascinating. This is, yeah, this is,

Speaker: 00:38:45

somehow there is an elephant in the room. We may have to say

Speaker: 00:38:48

something about AI and its regulation and what happens now.

Speaker: 00:38:53

And, if I may, I would like to say something about this because I have

Speaker: 00:38:56

a totally different point of view about that.

Speaker: 00:39:01

Please. So everybody is speaking about

Speaker: 00:39:05

regulation and it might be a catastrophic situation

Speaker: 00:39:10

if those machines are connected

Speaker: 00:39:13

together and they start to train themselves. They try to

Speaker: 00:39:17

build a meta architecture and try to train themselves,

Speaker: 00:39:21

and then they come up with something which is better than human. Some some people

Speaker: 00:39:24

call it the singularity point. So this is frightening. They're smarter

Speaker: 00:39:28

than us. Maybe they're gonna kill us all. And

Speaker: 00:39:33

people speak about regulation now, and there are

Speaker: 00:39:36

several institutes in Europe and in the US

Speaker: 00:39:40

trying to tackle that. And that

Speaker: 00:39:44

is amazing. That is really important, but I think we missed something here.

Speaker: 00:39:49

And I'll tell you why. So the so there is a book. It's here.

Speaker: 00:39:53

You know, Isaac Asimov, I, Robot. You probably

Speaker: 00:39:56

know that. So he, like, the first page of this book is like the 3

Speaker: 00:40:00

laws of robotics. A robot may not injure a

Speaker: 00:40:04

human being or, through inaction, allow a human being to come to harm.

Speaker: 00:40:08

A robot must obey orders, and so on. So we have, let's say

Speaker: 00:40:12

we have the regulation. AI cannot hurt humans. Okay?

Speaker: 00:40:16

But that isn't enough. It's not good enough because if the AI is smart

Speaker: 00:40:20

enough, it will not do the I mean, it will

Speaker: 00:40:23

show us humans that it really obeys

Speaker: 00:40:27

the laws, but it wouldn't. And this is frightening.

Speaker: 00:40:31

And here I suggest we look a little bit at human morality

Speaker: 00:40:35

and why humans have laws. So we need to

Speaker: 00:40:39

think about, if I may, think about the

Speaker: 00:40:43

human psychology. In human psychology, we have a mechanism to obey law.

Speaker: 00:40:47

It's called the superego. It was embedded or defined by

Speaker: 00:40:50

Freud. So we have a mechanism that if we

Speaker: 00:40:55

don't obey a law, we feel either

Speaker: 00:40:58

guilt or fear. And this mechanism was evolutionary.

Speaker: 00:41:02

So say we have a group of monkeys. They obey

Speaker: 00:41:07

the alpha monkey because they're frightened of him. They have some kind of

Speaker: 00:41:10

primitive superego. We obey the law because either we are frightened of the

Speaker: 00:41:15

police or we feel the guilt. We,

Speaker: 00:41:18

we it's like the

Speaker: 00:41:23

those experiments that show that, there is, somebody,

Speaker: 00:41:26

left something on the table, and we don't take it because we feel guilt or

Speaker: 00:41:30

we feel something. So this is this mechanism, what

Speaker: 00:41:33

I claim, should be transferred to the

Speaker: 00:41:37

AI machine. This should be the regulation. So what is the superego? The superego

Speaker: 00:41:41

is an infrastructure for being moral,

Speaker: 00:41:45

and we need a digital version of that. This is the regulation we

Speaker: 00:41:48

need. We need the infrastructure to be moral in machines. And what

Speaker: 00:41:52

does it mean? So superego means that it's a little bit like

Speaker: 00:41:56

self harm, if I may. It's like we feel guilt. We feel something bad if

Speaker: 00:42:00

we do something not okay, if we don't obey the law.

Speaker: 00:42:04

So it's like a self-destruction for an AI machine. So an AI machine,

Speaker: 00:42:07

if it doesn't obey the law, should feel something. It

Speaker: 00:42:11

cannot feel, so... Right. It will destroy itself. So this is my

Speaker: 00:42:15

claim. This is a book I'm writing, and this is something very fundamental.

Speaker: 00:42:19

We we all speak about this regulation, but I think it

Speaker: 00:42:22

it doesn't help just to to do standard

Speaker: 00:42:26

regulation. And if you if I may say another thing, the last thing is that

Speaker: 00:42:30

if you read I, Robot carefully,

Speaker: 00:42:34

there are several short stories there, and he speaks about robots that

Speaker: 00:42:37

obey the law. And if you look carefully about those robots that

Speaker: 00:42:41

obey the law, those robots,

Speaker: 00:42:45

all of them have a superego. They feel guilt.

Speaker: 00:42:48

The first story is about a robot that plays with a girl,

Speaker: 00:42:52

and he feels guilt about winning all the time. So he lets her win.

Speaker: 00:42:56

So he feels guilt. It means that he has a superego.

Speaker: 00:43:00

And then he feels frightened of the mother of the girl. And it's

Speaker: 00:43:04

really amazing. So I think, so

Speaker: 00:43:08

this book I'm trying to describe the psychological concept of superego

Speaker: 00:43:11

and then describe why it needs to be moral and how we can

Speaker: 00:43:16

find a way to put it in regulation, like the the infrastructure

Speaker: 00:43:19

itself and not just laws.

Speaker: 00:43:23

That is a very interesting problem you're trying to solve.

Speaker: 00:43:27

Very important problem at that. Agreed. And

Speaker: 00:43:31

culturally, we speak, in the US, we have a saying that you

Speaker: 00:43:35

cannot legislate morality, which

Speaker: 00:43:38

legislate, regulate would be, you know,

Speaker: 00:43:42

synonyms. Exactly. Right? So Right. Right. And and legal code

Speaker: 00:43:46

is code. I I

Speaker: 00:43:49

definitely get what you're what you're saying. And I think it's super

Speaker: 00:43:53

important. You mentioned you were writing a book about this. Now

Speaker: 00:43:57

now now you have to tell me more because I wanna read this book.

Speaker: 00:44:00

Same. I'm in the process of looking

Speaker: 00:44:04

for an agent and it's, it's complicated. It's supposed

Speaker: 00:44:08

to be a popular book trying to explain the psychology of Freud.

Speaker: 00:44:12

What is, superego, ego, and the id,

Speaker: 00:44:16

and then describe what is the pathology? So we all have a pathology. So

Speaker: 00:44:20

you have the pathology of, it's called,

Speaker: 00:44:29

the, criminal personality disorder. This

Speaker: 00:44:33

person will not have a superego. It's like Richard the

Speaker: 00:44:37

Third from Shakespeare. He didn't have a superego. He killed

Speaker: 00:44:40

his family and didn't feel guilt. So this is what's

Speaker: 00:44:44

going to happen with those machines. And then I

Speaker: 00:44:48

give some literature examples of,

Speaker: 00:44:51

what a superego is like, from Crime and

Speaker: 00:44:55

Punishment, where the guy killed the

Speaker: 00:44:59

old lady, but nobody

Speaker: 00:45:02

caught him killing the lady. He murdered her. Nobody caught him, but he

Speaker: 00:45:06

still feels guilt. So he has a very big

Speaker: 00:45:10

superego. And then I describe what happens in

Speaker: 00:45:13

other moral theories of human being, all of them connected to the

Speaker: 00:45:17

superego. And then I tried to describe a little bit how machine

Speaker: 00:45:21

learning is trained. Again, solving an optimization problem. And then I try

Speaker: 00:45:24

to describe how we can do a superego, how we can have

Speaker: 00:45:28

a digital superego if we can? No.
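
Editor's sketch: The book outline above frames training as solving an optimization problem and asks what a digital superego could look like. One possible reading, offered purely as an assumption and not as the speaker's proposal, is a penalty term that makes rule-breaking costly inside the objective itself. The toy objective, the rule `x <= 1`, the weight `lam`, and the function name `minimize` are all hypothetical.

```python
def minimize(lam, lr=0.004, steps=5000):
    """Gradient descent on (x - 3)^2 + lam * max(0, x - 1)^2, starting at x = 0.

    The first term is the task: get x as close to 3 as possible.
    The second term is an assumed 'guilt' penalty, active only when
    the rule x <= 1 is broken.
    """
    x = 0.0
    for _ in range(steps):
        task_grad = 2.0 * (x - 3.0)                  # pull toward the task optimum
        guilt_grad = 2.0 * lam * max(0.0, x - 1.0)   # zero while the rule is obeyed
        x -= lr * (task_grad + guilt_grad)
    return x


x_free = minimize(lam=0.0)     # no penalty: the optimizer ignores the rule
x_guilt = minimize(lam=100.0)  # strong penalty: it settles near the boundary
print(round(x_free, 3), round(x_guilt, 3))
```

With no penalty the optimizer lands on the task optimum at 3 and breaks the rule. With a heavy penalty it stops just past 1, at (3 + lam) / (1 + lam). That small overshoot is a reminder that a soft penalty discourages violations without strictly forbidding them.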

Speaker: 00:45:32

It's like you're giving it a conscience of of sorts. Exactly.

Speaker: 00:45:36

Yeah. And I I just wanted to, to add, we

Speaker: 00:45:40

may be able to help you. Maybe not find an

Speaker: 00:45:44

agent, but find a publisher. Both Frank and I are

Speaker: 00:45:47

published. And we, you know, we know Andy has a lot of

Speaker: 00:45:51

Andy's got a lot of connections in the publishing. Well That would be

Speaker: 00:45:54

great. I am I am not, I just wrote a lot of books

Speaker: 00:45:58

for different, publishing houses, and I know some people that if

Speaker: 00:46:02

they can't help you directly, they can probably point you to someone who

Speaker: 00:46:05

can. And, again, I am wholly motivated by wanting to

Speaker: 00:46:09

read this book. Same. Like, I think it's important

Speaker: 00:46:13

because I live in the Washington DC area. Right?

Speaker: 00:46:16

So so, like, there's a lot of people there who they're policy

Speaker: 00:46:20

makers. Right? Like, and they just assume

Speaker: 00:46:24

and I think a lot of humans fall for this. Right? You you see this

Speaker: 00:46:27

when the European Union passed their AI regulation act.

Speaker: 00:46:31

They assume that regulation's gonna solve all their problems.

Speaker: 00:46:34

And I think regulations prove that 1 of the fundamental forces

Speaker: 00:46:38

in the universe is is unintended consequences.

Speaker: 00:46:42

And, you know, when you regulate something, you don't end

Speaker: 00:46:46

the problem. You change the way people will route around it. Right? Like,

Speaker: 00:46:50

and I think a good example of this in AI is the movie Megan, which

Speaker: 00:46:53

I don't know if you've seen, or M3GAN. I'm not sure how to pronounce

Speaker: 00:46:56

it, where I think she was about to torture

Speaker: 00:47:00

she was I don't wanna give the plot away, but the the robot

Speaker: 00:47:04

child, Chucky, kinda goes evil, Like, this is the

Speaker: 00:47:07

basic kind of plot line, and the the the person who created her

Speaker: 00:47:11

was like, you can't kill me because it's against your programming. He goes, oh, I

Speaker: 00:47:14

said nothing about killing you. I was gonna put you in a coma, and you'll

Speaker: 00:47:16

live, you know, however many years. Like, it was just like I mean,

Speaker: 00:47:20

that's a great example of, like, she you know, don't kill. Right? Seems like a

Speaker: 00:47:23

pretty reasonable instruction to give a robot, particularly a child's toy.

Speaker: 00:47:28

Don't kill anyone. But, you know, she realized, like, well, kill

Speaker: 00:47:32

equals death. So if I don't kill you, if I just hospitalize you or

Speaker: 00:47:35

incapacitate you, that doesn't conflict with rule number 1.

Speaker: 00:47:38

Right? Which I think is no. Obviously, as, you

Speaker: 00:47:42

know, humans, we're like, well, it's not really the spirit of the

Speaker: 00:47:46

law, or the rule. But clearly,

Speaker: 00:47:50

the robot or the AI in this case, kind of figured it

Speaker: 00:47:53

out. Like, I don't know. I think you're right. Like and any regulations like that

Speaker: 00:47:57

too. Right? How many loopholes do people discover, whether it's

Speaker: 00:48:01

tax laws or, you know, this. It's like, well, technically, it's

Speaker: 00:48:05

legal. Is it actually, you know,

Speaker: 00:48:09

what the law intended? No. Like, it's Yeah. You need a you need

Speaker: 00:48:13

almost something like a nuance engine,

Speaker: 00:48:16

you see, to... Yeah. To get

Speaker: 00:48:20

the machine to interpret

Speaker: 00:48:24

the laws. And that's... I've read Asimov as well,

Speaker: 00:48:28

big fan. And that's what happens downstream of

Speaker: 00:48:31

the three laws, as they begin to fail, because the

Speaker: 00:48:35

robots are doing exactly what they're programmed to

Speaker: 00:48:39

do. And they're

Speaker: 00:48:43

finding ways that in our opinion, human opinion,

Speaker: 00:48:46

circumvents the 3 laws, but really doesn't

Speaker: 00:48:50

break the robot's programming. And it's all about, you know,

Speaker: 00:48:54

how do you define harm? Like, Frank's example is a great, you know,

Speaker: 00:48:58

great example of that. So, yeah,

Speaker: 00:49:01

fascinating stuff. Yeah. We gotta Awesome stuff. We gotta help you write this

Speaker: 00:49:05

book. I wanna read this book. Yeah. I want to raise

Speaker: 00:49:09

another point, the opposite of the point that you raised. Like, what happens with

Speaker: 00:49:12

the autonomous car, for example, or people say,

Speaker: 00:49:18

let's let's let's focus on autonomous cars. So so there will be

Speaker: 00:49:21

autonomous cars. Who is responsible for a car accident?

Speaker: 00:49:25

Accidentally, somebody was killed. You are the

Speaker: 00:49:29

owner. Somebody is the owner of the car. He sits

Speaker: 00:49:33

there. He bought the car, but the car killed

Speaker: 00:49:36

somebody. So

Speaker: 00:49:40

who? This is an open problem. This is, again, a

Speaker: 00:49:43

moral problem. So what I suggest here is

Speaker: 00:49:47

maybe it will take time,

Speaker: 00:49:51

I guess. Maybe the car, if we can build the

Speaker: 00:49:54

superego, the mechanism for morality, you know, just

Speaker: 00:49:58

the infrastructure for morality can take the

Speaker: 00:50:02

morality of the human. And if somehow it

Speaker: 00:50:05

inherits the driver's morality, you

Speaker: 00:50:09

can blame the driver. I'll give you another example, which will be much

Speaker: 00:50:13

more concrete, maybe. So we say now that there will be a ChatGPT for

Speaker: 00:50:17

every person, for every laptop and iPhone and whatever.

Speaker: 00:50:21

You will have your own GPT with your own life, following

Speaker: 00:50:24

your own history. And the

Speaker: 00:50:28

discussion with this GPT will be very personalized and

Speaker: 00:50:32

very helpful. What happens in that case? So in that

Speaker: 00:50:35

case, if this, GPT

Speaker: 00:50:39

will take your responsibilities and morality, somehow we

Speaker: 00:50:43

can copy your morality and be part of it. So if you're moral, it

Speaker: 00:50:47

will be moral. If you're not, you're not, but this is

Speaker: 00:50:50

your responsibility as a human. And I think this

Speaker: 00:50:54

is the way to to go with that. We need just the infrastructure and not

Speaker: 00:50:57

the law. Anybody can define the law, and anybody

Speaker: 00:51:01

can break the law. We just need the infrastructure to know that,

Speaker: 00:51:06

at least for the machine to know that it broke the law.

Speaker: 00:51:11

And and this is really important. I I think

Speaker: 00:51:16

Oh, I totally agree. Totally agree. Well, we're

Speaker: 00:51:20

gosh. We're coming up on time, Frank. Yeah. This was

Speaker: 00:51:23

awesome. So we'll just any

Speaker: 00:51:27

book recommendations? Obviously, I, Robot, I think, would be good reading

Speaker: 00:51:30

in this space. You also mentioned Shakespeare too,

Speaker: 00:51:34

Richard the Third. So, a book

Speaker: 00:51:38

which I'm reading now, which is

Speaker: 00:51:41

Vernon Subutex. It's, it's

Speaker: 00:51:45

amazing. It's amazing. It's 3 books, and it's actually

Speaker: 00:51:49

discussing whatever is not AI. Anything which cannot be solved with

Speaker: 00:51:52

AI. It speaks about a person who has a vinyl shop, a

Speaker: 00:51:57

shop to sell vinyl and then CD-ROMs, and now he cannot sell

Speaker: 00:52:00

anything. So this shop is closed, and then

Speaker: 00:52:04

he tries to somehow manage, but he ends up on the street. He's, like,

Speaker: 00:52:08

homeless, and he meets many people. And the way like,

Speaker: 00:52:12

every chapter is a different, person or

Speaker: 00:52:15

or a group or pair of people, and it's really

Speaker: 00:52:19

fascinating. It's all those things that you cannot solve with AI. It's all

Speaker: 00:52:22

the human interaction, the very, very basic human interaction. Amazing.

Speaker: 00:52:26

Won the Booker Prize in 2018.

Speaker: 00:52:32

Nice. Where can folks find out more about

Speaker: 00:52:35

you? So I have a website

Speaker: 00:52:39

under Joseph Keshet, and, and they

Speaker: 00:52:43

can find me there. Excellent.

Speaker: 00:52:47

Any parting thoughts, Andy? No. Just great great

Speaker: 00:52:50

interview. I appreciate that. 1, I would ask if you repeat the name of

Speaker: 00:52:54

the book you just mentioned about the the different stories.

Speaker: 00:52:58

What's the name of that book? It's not it's a it's a single

Speaker: 00:53:01

story. It's by Despentes,

Speaker: 00:53:06

Vernon Subutex. It's from French. Oh, okay.

Speaker: 00:53:11

Amazing. Amazing. Amazing. Awesome. Excellent. That's it. That's

Speaker: 00:53:15

it for me. But that's great talk. Thank you. Excellent talk. Thank you.

Speaker: 00:53:18

And we'll let Bailey finish the show. Well, folks, that brings us to the end

Speaker: 00:53:22

of another enlightening episode of Data Driven. We've

Speaker: 00:53:26

navigated the fascinating intricacies of automatic speech

Speaker: 00:53:29

recognition, explored the moral quandaries of AI, and

Speaker: 00:53:33

pondered the future of technology with none other than one of the best minds

Speaker: 00:53:37

in the field, Dr. Yossi Keshet. Remember, if you

Speaker: 00:53:40

enjoyed today's conversation, don't forget to subscribe to data

Speaker: 00:53:44

driven media TV for exclusive video content.

Speaker: 00:53:48

You can also grab some fantastic merch like the my data is the

Speaker: 00:53:52

new oil t shirt Andy's sporting today. And while Frank is

Speaker: 00:53:56

basking in the Appalachian sunshine, you can bet we're already cooking up the

Speaker: 00:53:59

next episode to keep your data driven minds engaged and entertained.

Speaker: 00:54:04

Until next time, stay curious, stay informed, and

Speaker: 00:54:08

always keep questioning. Cheerio.