Exploring Machine Learning, AI, and Data Science

Advanced Fraud Prevention in the Age of Artificial Intelligence

In this episode, Andy and Frank sit down with Pavel Goldman-Kalaydin, head of Artificial Intelligence and Machine Learning at Sumsub, a global company specializing in KYC, AML, and anti-fraud technologies.

They explore the challenges in verifying identities remotely, the rise of deep fakes for fraud, and the use of AI and machine learning to combat these threats. From discussing the impact of technology on security measures to Pavel’s journey in the field of computer science and AI, this episode offers insights into the evolving landscape of fraud detection and the intersection of technology, AI, and security.

Join us as we delve into the complexities of anti-fraud measures and the fascinating world of AI and machine learning.

Show Notes

00:00 Securing customer journey from onboarding to verification.

04:44 2 years ago, typical attack to open account.

06:58 German video identification process prolongs account opening.

12:16 Analyze data patterns to make informed decisions.

13:34 Questioning deep fake implications for customer data.

17:42 Advancing technology makes image manipulation easier.

22:32 Financial fraud: creating deepfakes for unexpected reasons.

25:53 Fascinating progress in beta software development.

29:23 Sumsub creates its own products, understands customers’ needs.

29:58 Problem with deepfakes, educate and ensure understanding.

34:01 Interest in drug development and AI technology.

38:57 Audible sponsors Data Driven with free audiobook.

41:05 Please rate and review our podcast.

Transcript
Speaker:

In this 349th episode of Data Driven, we are pleased

Speaker:

to interview Pavel Goldman-Kalaydin, the head of artificial

Speaker:

intelligence and machine learning at Sumsub.

Speaker:

Sumsub isn't your average AI startup. They're

Speaker:

globally recognized for their work in KYC, AML,

Speaker:

and anti-fraud technologies. Our guest is the

Speaker:

wizard behind the curtain, crafting tech to outsmart financial

Speaker:

fraudsters and deepfake artists. Quite the

Speaker:

digital Sherlock Holmes, if you will. Now here are

Speaker:

Frank, Andy, and Pavel.

Speaker:

Hello, and welcome to Data Driven, the podcast where we explore the emergent

Speaker:

fields of data science, artificial intelligence, and,

Speaker:

of course, data engineering, which is basically the underpinning of it

Speaker:

all. And with me on this journey is my favorite data

Speaker:

engineer of them all, Andy Leonard. How's it going, Andy? Good, Frank.

Speaker:

How are you? I'm doing alright. We're recording this the day

Speaker:

after we did a 2 hour show,

Speaker:

kinda by accident, and I see our guest kinda had this

Speaker:

look of, uh-oh. No. It's not gonna turn. I can't do that today.

Speaker:

But we are very excited to be here, in spite of our issues with Microsoft

Speaker:

Bookings, in spite of our crazy hectic schedules, and in

Speaker:

spite of your allergies and, really tasty jelly jam and

Speaker:

biscuits. Really sorry about that. No.

Speaker:

I don't know what it is on the East Coast this week, man. It's

Speaker:

it's well below freezing, and I'm sneezing. Oh, that rhymed.

Speaker:

Allergy season should be over for me. I don't know what's going on. For real.

Speaker:

But our guest is actually from Berlin,

Speaker:

one of my favorite cities in the world. In fact, we were saying in the

Speaker:

virtual green room. Had I lived in Berlin instead of Frankfurt, I probably

Speaker:

never would have come back to New York,

Speaker:

or the US, but he is

Speaker:

our guest today is Pavel Goldman-Kalaydin. Hopefully, I said that

Speaker:

right. He is the head of AI and ML

Speaker:

at Sumsub, a global know-your-customer,

Speaker:

anti-money-laundering, anti-fraud company, and,

Speaker:

we're happy to have him. Although, I don't think he's in

Speaker:

Berlin today. I think he's somewhere a bit warmer. Welcome to the show, Pavel.

Speaker:

Yeah. Hi, guys. Happy to be here. Good. Good. So I have

Speaker:

a lot of questions. You know,

Speaker:

first off,

Speaker:

I think I can kinda see the map, but what's the

Speaker:

connection between know your customer, KYC,

Speaker:

anti-money laundering, and anti-fraud? I think

Speaker:

I see it, but I wanna hear you kinda walk me through it because

Speaker:

I haven't had enough coffee either today. So so what's the, like, what's

Speaker:

the common thread? Because, like, because I I've not seen those 3

Speaker:

kinda put together in kinda 1,

Speaker:

sentence, but I can kinda see why. But

Speaker:

I can try to explain. But the thing is, and we actually,

Speaker:

this is what we focus on. So we try to secure as a company.

Speaker:

We try to secure the whole customer journey from

Speaker:

onboarding. So this is the first step: when, for instance, like, I'm a

Speaker:

bank. So I want to onboard some of my customers, and I want to

Speaker:

make sure that these are real persons, for instance, that they are not fraudsters.

Speaker:

So I want to onboard them, make sure they are,

Speaker:

the person they actually claim to be. And then, and

Speaker:

here's the thing. If I am, for instance, like, a genuine

Speaker:

person, but a month later, there could

Speaker:

be some, you know, strange patterns of, you know,

Speaker:

financial transactions happening. So probably, there is some sort of a pattern of

Speaker:

money laundering. So this is where transaction monitoring comes in.
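
A minimal sketch of the transaction-monitoring idea just mentioned: compare a new transaction against the customer's own history and raise a flag when it looks unusual. The function name `flag_unusual`, the median-based rule, and the factor of 10 are illustrative assumptions, not a description of Sumsub's system.

```python
from statistics import median


def flag_unusual(history: list[float], new_amount: float, factor: float = 10.0) -> bool:
    """Flag a transaction far larger than the customer's typical one.

    Hypothetical rule: "far larger" means more than `factor` times the
    median of the customer's previous transaction amounts.
    """
    if not history:
        return False  # nothing to compare against yet
    return new_amount > factor * median(history)


if __name__ == "__main__":
    past = [20.0, 35.0, 50.0]           # median is 35.0
    print(flag_unusual(past, 60.0))     # small relative to history
    print(flag_unusual(past, 5000.0))   # far above 10 x median
```

A flagged transaction would then be handed to a person or a further check for review; the rule above only raises the flag.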

Speaker:

So you can actually this is a person. So this is but knowing customers are

Speaker:

very simple. You can actually I mean, you

Speaker:

can... So the basic attack is to just pretend to be

Speaker:

a person you are not, basically. But then even if I'm

Speaker:

not, I'm just a real person, I can actually, yeah, come up with some sort

Speaker:

of, you know, few things to

Speaker:

do. And then we just try to monitor it, and then from a permit,

Speaker:

make sure that, okay, we can actually flag the transaction and

Speaker:

then make sure it's getting looked at. And then, I mean, there is a

Speaker:

flag raised, and then, probably, we can do

Speaker:

something about that. This is just, like

Speaker:

this. If we're talking about anti-fraud, here's the

Speaker:

thing. Sometimes it's very easy to see that something fishy is

Speaker:

happening. So for instance, like, 2 years ago, it

Speaker:

was a very typical attack. So I tried to, you know, open a bank

Speaker:

account, like, remotely, and actually, I live somewhere

Speaker:

else, or I use a stolen document. What

Speaker:

I can do to do that, I can actually just print out the

Speaker:

image of a person and just try to make sure that actually the

Speaker:

KYC provider like us... try to make us

Speaker:

believe that I'm a real person. That was a very, you know, typical attack 2

Speaker:

years ago. Now it's very easy to detect. Still, some people use

Speaker:

it. And that's it. And that's that for us. It is very easy to do

Speaker:

that. But probably, I mean, this is not a real person. Somebody is trying

Speaker:

to use the printed out images. This is

Speaker:

fraud. We can actually either reject it or ask the person, can

Speaker:

you... well, I mean, we need your real image.

Speaker:

Or we can just tell our customers that, you have to take a look because

Speaker:

there was something fishy going on. And then it goes and goes and goes. And the

Speaker:

whole customer journey, we try to make sure that the fraud is not happening. This

Speaker:

is basically it. So

Speaker:

fraud is kind of, I think, Cyber fraud or whatever the cool

Speaker:

kids call it, I think has infected

Speaker:

every industry. I mean, if I just I

Speaker:

mean, I get two-factor authentication logging in to

Speaker:

Roblox, like, for my kids. Right. And I'm like,

Speaker:

they'll get in front of their device, and they'll be like, can

Speaker:

you tell me what the passcode is that they texted you? Like, sometimes,

Speaker:

some days it's the only way I see 1 of my kids. But,

Speaker:

I wonder, like, has the pandemic kind of accelerated

Speaker:

kind of virtual fraud, or is that just independent?

Speaker:

I think it has. Because, right now, it's... but it's not

Speaker:

related to fraud exactly. But the thing is that now

Speaker:

people are used to actually working

Speaker:

remotely, so it's not that common for you

Speaker:

to go to a bank in person. So you just call there. You just, I

Speaker:

mean, use it over the internet, basically. It's, like, easier.

Speaker:

So now there is no way you can actually verify

Speaker:

that this is a living person. Right. Yep. And this is a funny thing because

Speaker:

for instance, in Germany, where I reside, most of the

Speaker:

time, there is a regulation. It's called

Speaker:

VideoIdent. So in Germany, in order for

Speaker:

me to open an account anywhere, I really have to call

Speaker:

a person, a live in-person operator, and talk to him, and he makes

Speaker:

sure, or she makes sure, that I'm a living person. But everybody

Speaker:

does not like it, basically. Because, I mean, it takes time. You have to

Speaker:

talk to a person. I just want to open an account. So it's

Speaker:

not as fast. But except for Germany, in all of the rest

Speaker:

of the European Union, and I think across the world as well, it's, I mean, you

Speaker:

just send your image or video, some of your documents,

Speaker:

and then the account is up. So it's very easy. And people are

Speaker:

getting used to it. And that's why it's easier to actually

Speaker:

do fraud because it's, I mean, it's a sort of trade-off:

Speaker:

make it easier, and then it's easier for fraudsters to actually do their business. So

Speaker:

that's that's the thing. Gotcha. Do you see,

Speaker:

you mentioned you see, like, new scams people

Speaker:

are running as well. And you also mentioned a lot

Speaker:

of what I thought would be pretty effective ways to

Speaker:

combat those scams, without really

Speaker:

giving anybody any ideas. Are there, like, brand new

Speaker:

scams that have happened maybe in in the very recent past

Speaker:

that, you're still working on ways to combat?

Speaker:

I must say that there will always

Speaker:

be some sort of, you know, arms, right.

Speaker:

Competition? Yeah. So you have to say or. There will always be,

Speaker:

like, a new fraud appears.

Speaker:

And then we have to actually deal with that. But I can tell you a

Speaker:

story. So for instance, like, we at Sumsub are not a big company. Yeah.

Speaker:

The technology team is not that big, so we have to move fast. But

Speaker:

in my team, the AI slash ML, it's not

Speaker:

anti money laundering, but artificial intelligence slash machine learning. We have

Speaker:

a very small department aimed at creating deepfakes.

Speaker:

So we do not detect deepfakes. We have to actually learn how to create

Speaker:

them, so you actually know how, I mean, how people actually... Oh,

Speaker:

that makes sense. So synthetic data. Interesting. Yeah. Yeah.

Speaker:

And this is at and I can also tell you that I mean, and this

Speaker:

is, for me, it was, like, sort of, you know, a surprise because

Speaker:

most of the... like, let's talk about deepfakes. So, yes, then what is, like, the

Speaker:

recent type of fraud? Deepfakes, for sure. We had a report. I

Speaker:

think we published it 3, 2 days ago, or like

Speaker:

yesterday, on trends. So what's actually happening right now?

Speaker:

And the thing is that deepfakes, the usage of

Speaker:

deepfakes for fraud, it maybe rose

Speaker:

like 5 times. So like 2 years ago, like nobody actually

Speaker:

knew about deepfakes. But now it's very easy to craft. It's

Speaker:

very easy to craft. I mean, people like I mean, you are a fraudster. You

Speaker:

have to actually, it's very rare

Speaker:

for you to craft just 1 deepfake. It's usually something

Speaker:

we call serial fraud. You create, like, hundreds of deepfakes. So now it's easy,

Speaker:

very easy to create them. So now I can craft, like, hundreds

Speaker:

of identities. And then I try to bypass your security checks. So that's why this

Speaker:

is like the recent trend. I mean, as so it's on the news,

Speaker:

basically. And then we have to actually try to make sure that our solution,

Speaker:

can detect it. And it's not sometimes, it's not that easy. Well,

Speaker:

it sounds like, you know, there's there's stuff that people used

Speaker:

years ago, and you've got that figured out. And it's probably not being used

Speaker:

as much, at least alone. But now you've got,

Speaker:

people coming up with, first, new ideas, and then second, they're

Speaker:

doing combinations new plus older ideas. Is that

Speaker:

accurate? But but, it is actually. And the thing is Okay. So,

Speaker:

these are also like, Okay. Just imagine. We have a very

Speaker:

sophisticated deep fake detector. So I I'm pretty sure that our, like,

Speaker:

models are more or less, good. So, like,

Speaker:

I mean, it's not 100% for sure. Mhmm. But what

Speaker:

happens next? So can I actually, I mean, combat deepfakes 5

Speaker:

years later? Maybe they'll be so advanced. So, like,

Speaker:

our customers, like, ask us about it, like, once in a

Speaker:

month. So what is your plan to tackle deepfakes

Speaker:

in 2 years? Right. Because now, you know, AI is like, it's a very hard problem

Speaker:

to solve. But here's also a problem. There is a thing

Speaker:

called mules. Have you heard about mules or money

Speaker:

mules? This is, the the thing is

Speaker:

that you actually go, hire a person.

Speaker:

Usually, you pay some €50. And then

Speaker:

actually this person passes a KYC check for you.

Speaker:

And then... Oh, wow. The person just sells his or her

Speaker:

account to you. And then this is a real person. I mean, it's not a

Speaker:

deepfake. I can't find it with a deepfake detector. Wow. It's not obvious, and it's not a deepfake.

Speaker:

Yeah. But that well, this is that looks suspicious. But

Speaker:

but if I'm a bank, for me,

Speaker:

it's like a real person just trying to open a bank account. Yeah.

Speaker:

And now we actually have to look around. So that's why so I

Speaker:

like working with deepfakes. I mean, it's very, you know, cool technology. You have to,

Speaker:

like Yeah. It's technology. But Now you actually have

Speaker:

to look around. You have to make sure what is, I mean, the

Speaker:

pattern. What devices do you use? It's like lots

Speaker:

of small features or signals. You have to actually

Speaker:

combine or merge them all together and then make a decision. Is it, like,

Speaker:

suspicious or not? And this is like, but this is fun. This is

Speaker:

like, you have to really look around, collect lots of data, and then try

Speaker:

to find, you know, your way into making a decision.
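
A minimal sketch of the approach described here: many small signals, each weak on its own, are combined into one decision. The signal names, point values, and thresholds below are hypothetical, chosen only for illustration; they are not Sumsub's actual features or model.

```python
from dataclasses import dataclass

# Hypothetical signals and the risk points each one contributes.
SIGNAL_POINTS = {
    "device_seen_on_other_accounts": 40,
    "camera_is_virtual": 30,
    "ip_country_mismatch": 20,
    "low_image_quality": 10,
}


@dataclass
class Decision:
    score: int
    label: str


def assess(signals: dict[str, bool], review_at: int = 30, reject_at: int = 60) -> Decision:
    """Add up the points of every signal that fired and map the total to a label."""
    score = sum(points for name, points in SIGNAL_POINTS.items() if signals.get(name, False))
    if score >= reject_at:
        label = "reject"
    elif score >= review_at:
        label = "manual_review"
    else:
        label = "approve"
    return Decision(score, label)


if __name__ == "__main__":
    print(assess({"ip_country_mismatch": True}))
    print(assess({"device_seen_on_other_accounts": True, "camera_is_virtual": True}))
```

A real system would more likely learn the weights from data than hard-code them; the point of the sketch is only that no single signal decides the outcome.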

Speaker:

Interesting. It's fascinating: the simple things are no

Speaker:

longer simple. Right? Just signing up for an account, You know,

Speaker:

it's just now it's become like this massive multinational worldwide

Speaker:

cybersecurity kind of exercise. It's

Speaker:

fascinating, Yes. For a

Speaker:

customer, it must remain easy. Yes. I don't know, like,

Speaker:

I mean, since, like even, you know, the really, really

Speaker:

typical KYC check includes recording your

Speaker:

video. You usually have to do something like, you know, turn your head

Speaker:

or something. I mean, if you have this experience. People do not like it. For

Speaker:

them, it's like, why do you have to do this? That's it's it looks strange.

Speaker:

I mean, just can I just open an account? And then it's like so it's

Speaker:

also a trade-off, as you have to be simultaneously

Speaker:

secure and easy. And this is... Yeah. Those are,

Speaker:

those are very much contradictory, forces. Yeah.

Speaker:

Well, the other thing too, like, if I'm if I'm If I'm an average

Speaker:

customer or paranoid me. Right? Like, if I go to a

Speaker:

thing and they want me to look this way, look that way, Am I training

Speaker:

their deep fake model of me? Do you know what I mean? Like, I mean,

Speaker:

I'm kinda like, you know, obviously, I've done a lot of live streams and stuff

Speaker:

like that, so I shudder to think, you know, where that could lead.

Speaker:

But, what are your thoughts on that? Like, I mean, are do do you have

Speaker:

people who are Do savvy customers

Speaker:

do they get a little suspicious? Like,

Speaker:

what are your thoughts on I'm not. I I

Speaker:

must say that, I mean, the deepfakes that we see, they

Speaker:

can be crafted from just 1 image. Right. So like,

Speaker:

here's the problem. So so like, there are, none of that, I mean,

Speaker:

you can see them, but Usually, people send, you know,

Speaker:

low quality images. So it's even harder for us to see it. Even harder for

Speaker:

a human to see that this is a problem.

Speaker:

But there is also, I think, a funny story that I

Speaker:

know, that some of our models

Speaker:

actually detect deepfakes better than humans. So

Speaker:

it's actually easier for a fraudster to trick a living

Speaker:

person than a model. The model, like, can look for certain artifacts with

Speaker:

eyes or just, like, some sort of, you know, glitches.

Speaker:

It's easy. But for a person, especially if the quality of the image is bad.

Speaker:

It's like there is no way anybody can actually spot it. This is the

Speaker:

problem. And this is great. It it is a problem. I I I must I

Speaker:

must admit this is, I think, this is what we

Speaker:

actually have to be have to hear

Speaker:

about about creating deep fakes. I know that that

Speaker:

is a very interesting thing. So, you know, about I mean, there are lots of

Speaker:

things happening around AI regulations, especially in the

Speaker:

European Union. Sure. And then so we actually tried to follow and then to

Speaker:

make sure that everything is compliant. And actually, I wanted to say that we touched

Speaker:

upon KYC, KYT, which is know your

Speaker:

transaction. There is also KYB, know your business, which is basically, you

Speaker:

know, how we make sure that the company you work with is

Speaker:

not fraudsters. And there is also a thing called

Speaker:

KYAI, know your AI. And it is about transparency.

Speaker:

So many people out there want to be to know actually how AI is used.

Speaker:

So KYAI, it's a very new trend, I think. You may have never

Speaker:

heard about it because, I mean, it was going to be a week ago. Since

Speaker:

I, like, want to actually know what's happening with all of these models out

Speaker:

there, not just in anti-fraud, but everywhere. But back

Speaker:

to the problem with deepfakes. The thing is,

Speaker:

what to to say that,

Speaker:

Oh, sorry. I lost my train of thought. But this is the all

Speaker:

the time. Yeah. We I was just about to say that. But what you know,

Speaker:

one solution to this, I I think, Pavel, would be

Speaker:

if people did something, you know, like, I don't know, colored their

Speaker:

hair Or grew a cool beard. I'm just

Speaker:

throwing that out and with apologies to people listening and not

Speaker:

watching. No. You know? I'm just

Speaker:

saying. But but if you did but if you did

Speaker:

grow a beard, would would or or or change your hair color or

Speaker:

altered your face? Like, I know that, like, most facial recognition systems

Speaker:

use landmarks on, like, the eye sockets. Right. That's a lot harder to change. I

Speaker:

was joking. Didn't mind. But, like, would it would it would that

Speaker:

I don't know. Like, does that have any impact on these kind of systems or

Speaker:

are they more like facial recognition systems? They are,

Speaker:

it's, so we operate on the... if you're talking about deepfake detectors or

Speaker:

models for deepfake detection. Yeah. There are

Speaker:

some... I can't say that it's face recognition. The

Speaker:

models, they mostly focus on artifacts. So so for

Speaker:

instance, like, a deepfake of a year ago usually

Speaker:

had problems with eyes. The eyes of a deepfake, they usually are

Speaker:

very, you know, not really human.

Speaker:

So it will be changed. It will be like as as as the technology,

Speaker:

is getting more advanced. But like a few years ago, you can actually just crop

Speaker:

the eyes from an image of a person pretending to be a human person, and then

Speaker:

make sure that this is actually a deepfake. Also I must say that

Speaker:

Yeah. So a video is is is easier to detect because you can actually

Speaker:

so, there is a thing called... I don't like the term liveness because

Speaker:

nobody actually knows what liveness is, but liveness is a detection.

Speaker:

Liveness detection is detecting if this is a

Speaker:

living person or not. And before, like, 5 years ago, it was

Speaker:

mostly a distinction between, a video of a person or

Speaker:

a printed out image. Now it's a detection of an image,

Speaker:

a deepfake, and a living person. And at that time,

Speaker:

you actually... there are 2 types of liveness. One is called passive,

Speaker:

and we actually use also sometimes our customers actually ask us for

Speaker:

passive liveness. That's just 1 image. But it's easier for

Speaker:

us and for everybody else to ask a person to actually do something.

Speaker:

And for deepfakes, for instance, like, if I ask them to rotate, sometimes some

Speaker:

artifacts can appear. Some artifact. And then you can actually see that probably. I

Speaker:

mean, this is not a living person. There are some sort of problems with visual

Speaker:

artifacts. So it is it is like this.

Speaker:

Also, I must say that there was also a challenge for us because there

Speaker:

are, certain cameras. They have some sort of a

Speaker:

beautifiers. So I'm pretty sure as I'm calling from my,

Speaker:

my computer, and then my camera actually

Speaker:

enhances my image. So my image is a little bit better

Speaker:

than I'm in the real life. So my my skin is is is a little

Speaker:

bit better. So it is actually embedded into

Speaker:

hardware. And for us, it looks like, some sort of, you know so there is

Speaker:

a signal for us. It does some sort of, you know it's Oh, I see.

Speaker:

So It's hard. You know? And you have to make sure that make sure that,

Speaker:

okay, it's not a deepfake. It's just a person using that camera of my

Speaker:

computer. It's like, you know, you have you have to be really, a

Speaker:

yellow error. Apple, I mean, installs

Speaker:

another camera, and then you have to actually tune your models to make

Speaker:

sure that you actually do not penalize people with...

Speaker:

I think about that. Yeah. The cameras are gonna behave differently if you use different

Speaker:

cameras. So I'm here using my 4K

Speaker:

camera. Kind of an outdated one, but it's still it does the job. But what

Speaker:

if I pick up my Droid, or, you know, my wife,

Speaker:

my wife, you know, she's the the device. She's got an

Speaker:

iPhone. And if I'm trying to log in through her device, That would be different

Speaker:

images, and it may change. You know, it may tell me, nope. That's not

Speaker:

you. Those are gonna be different artifacts. That's fascinating. And I also

Speaker:

think it's funny that you have an old 4K camera, which

Speaker:

is a pretty funny thing to say. Like For for podcasting, I won't

Speaker:

No. I know. I don't wanna throw back to the theme from yesterday's

Speaker:

2 hour show, but I'll just make this note. We we

Speaker:

learned that we're in the top 2 and a half percent of podcasts.

Speaker:

So now I feel like I should have, I don't know, a 16K studio

Speaker:

and Yeah. I should have a lot of time like Joe Rogan has in a

Speaker:

brick wall. Exactly. Right. I don't I need something better than this

Speaker:

old 4K camera. But

Speaker:

if all of a sudden You just want to open a bank account right

Speaker:

now. Yeah. It looks strange because, I mean, a typical person is like you

Speaker:

use your iPhone or, like, a regular computer. Like, with a 4K or 16K

Speaker:

camera, it's, like, very strange. It's something, you know. It's a signal

Speaker:

for every model... It's an outlier. Right? And

Speaker:

it sounds like a big this is still obviously, there's way

Speaker:

more complicated things in what you do, but outliers,

Speaker:

detecting outliers is probably 1 big tool in your tool belt.
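
A minimal sketch of the outlier idea raised here, using the camera example from the conversation: a capture device that almost nobody else uses stands out. The function name `rare_values`, the 5% cutoff, and the sample data are assumptions made up for illustration.

```python
from collections import Counter


def rare_values(observed: list[str], min_share: float = 0.05) -> set[str]:
    """Return the values whose share of all observations is below min_share."""
    counts = Counter(observed)
    total = len(observed)
    return {value for value, n in counts.items() if n / total < min_share}


if __name__ == "__main__":
    # Hypothetical history of capture devices seen during onboarding.
    history = ["phone"] * 70 + ["laptop_webcam"] * 28 + ["4k_studio_camera"] * 2
    print(rare_values(history))
```

A rare value is only one signal among many, not proof of fraud.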

Speaker:

It is. Yeah. That's very hard if you have a genuine person,

Speaker:

and you are an outlier somehow. I mean, everybody can be an

Speaker:

outlier in some sense. It's very hard because, yeah,

Speaker:

so this is hard. So, like, at some point, yeah, colored hair

Speaker:

can be also an outlier. I don't No. It's just interesting. So I imagine, like,

Speaker:

Instagram filters and things like that probably also cause

Speaker:

chaos and things like that. Yeah. Of course. But, yeah, I

Speaker:

mean... So usually the use of, yeah, filters is

Speaker:

a strong signal for us. I mean... Right. And also I must say this about

Speaker:

deepfakes. So going back, the thing with deepfakes is that

Speaker:

they're not, like, specifically used by fraudsters. Here's the

Speaker:

problem. You know, there are lots of cool things for deepfakes. You can press

Speaker:

advertising. Right. I don't know what what else. But, usually,

Speaker:

you can actually adapt a person to, like, replace an

Speaker:

actor in the movie. This is also a deepfake. It's a very cool deepfake, very

Speaker:

sophisticated deepfake, very high-quality deepfake. Still a deepfake. So there

Speaker:

are usages, actually, for that, I mean, not just for fraud.

Speaker:

And then going back to our problems, it's like, I mean, And the

Speaker:

even even that and even that from that, I like this example,

Speaker:

but, the guys from the,

Speaker:

I mean so we focus on financial fraud. Yeah. So it's more or less like

Speaker:

people trying to actually steal money or, like, take over your account, something like

Speaker:

that. But the thing is, deepfakes, they are mostly created

Speaker:

not for that. And this is a very interesting thing, I think. They are created.

Speaker:

And, actually, I didn't know about that, but we actually learned that when we started

Speaker:

to try to create our deepfakes. So we went, you know, to the Internet,

Speaker:

some strange forums, to find out what people actually use,

Speaker:

what they create deepfakes for. And they create

Speaker:

deepfakes for porn. It's like 98%, 89%

Speaker:

of deepfakes are, like, for it. And this is also a problem because there is

Speaker:

a thing called nonconsensual porn. Deepfakes are used for that, and this

Speaker:

is also a problem. So it's not our business, but the thing is that the

Speaker:

same technology is there. And you actually, I mean, if you,

Speaker:

I mean, work in the area, you can actually so the same model can actually

Speaker:

be applied to detect this type of deepfakes. Right. So it's

Speaker:

different, but, I mean yeah. Yes. It's, That was expressed to

Speaker:

me maybe a year ago. It's fascinating how

Speaker:

quickly this space is just Evolving or

Speaker:

devolving, I guess, depending on your point of view. Yeah.

Speaker:

But, no, you're right. Like, most of it is

Speaker:

Those a lot of the deep fake kind of work is done

Speaker:

for adult content. And, you know, and it's there

Speaker:

the legislation around this is gonna vary

Speaker:

widely from place to place. But, like, you know,

Speaker:

revenge porn laws don't apply there. I think that was a big thing

Speaker:

in, and there was a controversy somewhere. I think it

Speaker:

was New Jersey, Where somebody had

Speaker:

created deep fake images of either high

Speaker:

school or middle school girls, which adds an extra level of

Speaker:

legal concern. I have a whole lot of extra

Speaker:

levels of concern. Let's be honest. But, like, you know and and and and

Speaker:

there was this, you know, the big debate. And my first reaction was, I'm

Speaker:

actually kinda surprised it took this long for that to happen,

Speaker:

which is a very cynical take, I'll admit. But I can tell I I can

Speaker:

tell you the reason. The thing is that technology moves so fast. Yes. And

Speaker:

legislation actually is always, like

Speaker:

so even with the EU AI Act,

Speaker:

it mentions deepfakes just a little because they started working on

Speaker:

the regulations 2 years ago. And 2 years ago, it was not a problem.

Speaker:

And now it's, like, all over, you know, the Internet, and then you have to

Speaker:

actually tweak the, wording,

Speaker:

but it takes time. Well, even still, like, you know, like, there's,

Speaker:

a few months ago, they had these fake commercials that were created with a

Speaker:

combination of ElevenLabs and a few other companies whose names I

Speaker:

forget. But, you know, they had a picture of Elon Musk, you

Speaker:

know, eating spaghetti, and it looked weird. But you can easily see,

Speaker:

like, you know, I was messing around with early versions of

Speaker:

VQGANs in early

Speaker:

2022, And that stuff looked

Speaker:

weird, and it it really evolved. And this morning, I saw

Speaker:

Pika AI, I guess, just went Yeah. Yeah. Yeah. Went to a wider beta.

Speaker:

And, yeah, released and and and, like, I'm seeing what's created with that,

Speaker:

and, you know, it still looks weird, it still looks cartoonish,

Speaker:

but it's not The fact that we've gone that far in the span

Speaker:

of, you know, less than 2 years, like, I think says something, like and to

Speaker:

your point, legislation usually takes years to

Speaker:

make. So, like, by the time these laws are written, they may not be valid.

Speaker:

In the case of New Jersey, I think there's some debate over,

Speaker:

what sorts of laws apply to that? Because

Speaker:

the the original, The faces

Speaker:

were mapped on to something else, but that the

Speaker:

something else I'm trying to keep our clean rating here. The something else were

Speaker:

people over 18, but the faces were mapped onto it. So there's

Speaker:

some debate over, do existing laws cover that?

Speaker:

I'm not a lawyer. Don't look at me, and I'm not. But,

Speaker:

it's just fascinating to your point. Like, this is moving quickly.

Speaker:

Yep. It's definitely complicated. So we've

Speaker:

reached the point in our show, Pavel, where we, like to

Speaker:

ask a set of questions. They're in the chat. And

Speaker:

I'll start out, with the, the very first question.

Speaker:

How did you find your way into this field? Did this field find you,

Speaker:

or did you find it? Yeah. I must say I have a

Speaker:

story to tell. I just studied yeah. Studied computer

Speaker:

science at university, and I actually worked as a software engineer

Speaker:

at Motorola. You may remember this company, with

Speaker:

HQ in Chicago back then, for 5 years.

Speaker:

And then it was,

Speaker:

ago, the very first massive

Speaker:

online courses appeared. There was one called AI Class,

Speaker:

and it later turned out to be Udacity. And there

Speaker:

was also one called ML Class. And

Speaker:

this is now Coursera. It was like 10 years ago. And I was like, okay.

Speaker:

Cool. I enrolled, and actually, I pushed through because it was

Speaker:

hard. It was like, you have to really be involved. And

Speaker:

then I felt like, okay, this is a cool thing. This is like the next

Speaker:

big thing for me and, like, for everybody else. It was like

Speaker:

12 years ago. So I quit my job, and I actually, so

Speaker:

at the same time, I started to try to run a small startup with my

Speaker:

friend, and failed miserably. But I took my time, studied

Speaker:

for maybe half a year, and then joined a small data

Speaker:

startup as a data scientist. And then it just

Speaker:

started there. So I think I found my way into

Speaker:

data. But Yeah. I don't know. So You want to

Speaker:

I'm sorry. Go ahead. I was just going to say it sounds like you were very

Speaker:

intentional about finding your way into it. So that's cool. Yeah.

Speaker:

That's cool. And I see you were at VK for a while too,

Speaker:

which, I've never seen VK, but I hear it's like

Speaker:

a Russian language version of Twitter slash Facebook. It used

Speaker:

to be. Yes. Yeah. Yeah. I don't I yeah. Obviously, now things are different, but

Speaker:

yeah. Yeah. Yeah. Yeah. I worked there for 5 years, a long time ago. Oh,

Speaker:

interesting. And, you know, if you're talking about the data, I mean,

Speaker:

the, it's like the place where you can

Speaker:

actually play with data. You can actually do many cool things.

Speaker:

Oh, yeah. Nice. Nice. And he's being modest. According to LinkedIn, he was director

Speaker:

of AI research, so he's super smart.

Speaker:

But, what's your favorite part of

Speaker:

your current job? Oh, I can say I

Speaker:

could create some deepfakes, but it's not

Speaker:

it. I think

Speaker:

no. I mean, I would say that what I like is, they,

Speaker:

Sumsub, Sumsub is a product company. So we

Speaker:

have our own products. We're, like, a technology company, yet we have our

Speaker:

own product. And having that,

Speaker:

actually, our own product, actually helps us, you know, I know what our

Speaker:

customer wants. Wonderful. I know the

Speaker:

data. So it's like, you know, I mean, you have to actually so you have

Speaker:

to look around. Okay. There is a problem with deepfakes. I have to,

Speaker:

like, make sure that, I mean, I actually have to understand this. This

Speaker:

is a problem. And for many of our customers, I mean, I

Speaker:

don't I would not like to say that we have to educate them or actually

Speaker:

make sure that they understand there is a problem with deepfakes. And now,

Speaker:

when they understand, we can actually help them with their

Speaker:

safety and security. I think this is, like, a little bit of a, I

Speaker:

mean, clumsy answer, but I'm sorry.

Speaker:

Yeah. Being closer to the product is fun.

Speaker:

Oh, sorry. Cool. So we have 3 complete the sentence

Speaker:

questions. And the first one is: when I'm not

Speaker:

working, I enjoy blank.

Speaker:

Okay. Okay. Let me think for a while. There are many things I can

Speaker:

say. No. I think this is one I can,

Speaker:

I can share. I run, or I can say jog.

Speaker:

Mhmm. Oh, cool. Cool. I ran in the Berlin Marathon.

Speaker:

This is my... Nice. There are major marathons, like, 5,

Speaker:

6 marathons across the world. So that's New York, Paris,

Speaker:

London, Tokyo, Berlin, and,

Speaker:

London. Nice. Like, 6 so that Very So Berlin was my 1st major

Speaker:

marathon. So I ran it, this this September, and it was great. No. That's

Speaker:

awesome. That's awesome.

Speaker:

When you said Berlin, the first thing that popped in my mind was, Berliner

Speaker:

Kindl, which is like this local kinda drink.

Speaker:

Yeah. Yeah. Yeah. Yeah. I know. That's like Yeah.

Speaker:

Yeah. But I prefer there is a it's a vehicle. It's

Speaker:

like a craft. Oh, yeah. From Berlin.

Speaker:

Right. But talking about Berlin, so I ran. It was

Speaker:

super fun, but, on my finishing picture, so

Speaker:

it's me, running, so close to the Brandenburg Gate. It's

Speaker:

very central. Mhmm. And there is also a guy in a

Speaker:

bottle costume. And I

Speaker:

wasn't it was not slow. I wasn't slow. Yeah. There was a guy in a

Speaker:

huge ball, like, still running, like, finishing with me. Like, so it was, Oh,

Speaker:

that's funny. That's fun. It's that's fun. That's funny. Very cool.

Speaker:

Next, complete the sentence. I think the coolest thing in

Speaker:

technology today is

Speaker:

blank. Oh, it's it's it's hard to say. Let me I'll just

Speaker:

think for a while. But, I mean,

Speaker:

I think that, so my area

Speaker:

of expertise, I personally specialize in natural

Speaker:

language processing. So I know about language models. And,

Speaker:

actually, we had papers on language models, like, before they they

Speaker:

were super big. So, like, on tuning language models. Yes. I

Speaker:

found it really, really exciting that in a

Speaker:

year, it went from, you know, research

Speaker:

prototypes to, like, an everyday product. This is Yeah. This was

Speaker:

compelling. So, like, my parents use ChatGPT. Like, I mean, this

Speaker:

is like this is like a mobile phone. This is I mean, this is what,

Speaker:

like, some sort of a milestone, last year.

Speaker:

I think this is it. And it is the actual unit

Speaker:

for many things. You can build products on language models. And

Speaker:

this is also like. It's wild, isn't it? Like, you know,

Speaker:

and and it's captured everybody's imagination in in good and bad ways.

Speaker:

But, like, my father-in-law, you know, he used to

Speaker:

say Frank works with computers. Now he says Frank works in AI.

Speaker:

Okay. You know? That's good.

Speaker:

But I also like we used to say machine learning. So now you have to

Speaker:

say AI. That's right. That's right. We used to say data mining

Speaker:

or something. So it's like, you know... That's right. It definitely would.

Speaker:

I wonder what it'll be next year. Who knows? Gen AI probably.

Speaker:

Probably. So our next one, complete this Regulate, I think. Oh, that's

Speaker:

right. Regulation. That's right. Regular. Our last complete the

Speaker:

sentence is I look forward to the day when I can use technology

Speaker:

to blank. Uh-huh. I

Speaker:

can't... it's hard to answer because, I mean, like, I

Speaker:

can say it would be cool if I can, you know,

Speaker:

develop drugs. And then there are very cool startups for drug design

Speaker:

with AI. Yet, I mean, just imagine we have

Speaker:

a cure for cancer, but... Right. We have so

Speaker:

many diseases to cure. So let's say, I think, I

Speaker:

hope once we fix anything, then there is gonna

Speaker:

be a next, you know, next milestone for us to look forward. So I'm sorry

Speaker:

if, you know, there's never... I hope there will be

Speaker:

no such day, I can say. Right. Right. That's

Speaker:

a good one. I'm pretty sure you will agree with me. Like Yeah.

Speaker:

Especially work with the technology. I mean So true. For sure.

Speaker:

The next question: share something different about yourself, but remember, it's a family

Speaker:

oriented, well, not family oriented, but we like it so that

Speaker:

you can listen to it with your kids in the car. Right? Like, that's

Speaker:

kind of a Yeah. Yeah. Yeah. And, yeah, and I live in Berlin across, very

Speaker:

close. There's a very, how to say, kinky club, which is Berlin.

Speaker:

Was that the Tiergarten? It's it's

Speaker:

it's a it's a it is family friendly. It's it's like the most family

Speaker:

friendly place in in in Berlin. You got some. Yeah. No. It's it's

Speaker:

called KitKat. Yes. What I can say.

Speaker:

I have, purple hair. Since last month.

Speaker:

I don't know. So I can say that I speak

Speaker:

a few languages, all of that. But, no, I'm I'm

Speaker:

joking. So I speak Japanese. I studied Japanese for a

Speaker:

long time. So I I can speak Japanese. I speak

Speaker:

English, obviously, Russian. My parents are from

Speaker:

Russia. And I also speak German. So I actually studied

Speaker:

German for 2 years. So I'm actually studying right now. So I have, like, my

Speaker:

German classes 3 or 4 times per week, which is

Speaker:

let me just go. Sorry. So I hope in a year, I will be able

Speaker:

to do a podcast in German as well. Oh, wunderschön. That is

Speaker:

not

Speaker:

Yeah. Yeah. And we just lost, like, We we just

Speaker:

looked at our analytics, and, like, most of our listeners are from English language countries.

Speaker:

So I think we just lost them. Maybe we

Speaker:

can attract new listeners. Oh, I like it. I like the way you think. We

Speaker:

wanna we wanna get to the top 2.4% now.

Speaker:

Our new goal. So,

Speaker:

Audible is a sponsor of the show, and I'm not sure if

Speaker:

Audible is big in Europe. I think it is because I've seen a lot of

Speaker:

German language audiobooks. It is a no. Okay.

Speaker:

So do you listen to audiobooks? And if so, do you have a good

Speaker:

recommendation? Otherwise, we'll take a recommendation on a regular, good old-fashioned

Speaker:

paper dead-tree book. No. I have a couple. I think I can

Speaker:

give you a couple of examples. This is like,

Speaker:

I like this was the most, you know so so I'm so my

Speaker:

background is from many places,

Speaker:

Israel, Russia, and Germany to some extent. So

Speaker:

I would recommend, there is a very good book. It is

Speaker:

in my opinion, this is very known, but not many people know about it for

Speaker:

some reason. It's called The Good Soldier Švejk. Okay.

Speaker:

Not heard of it. About the, sort of

Speaker:

First World War, by... Oh, interesting. But this is

Speaker:

it's very good. Like, you can actually learn a lot about

Speaker:

Czech Republic, Germany, Austria in the beginning of

Speaker:

the, Last century. Oh, interesting.

Speaker:

Especially now, it's the very thing. It's called in the park. This is a very

Speaker:

good thing too. And it's very funny. It's like one of the funniest, books

Speaker:

ever written. And also the the second one, I have 2.

Speaker:

This is called Arch of Triumph, by Remarque. Okay.

Speaker:

This is also about the prewar Europe, pre-Second World War

Speaker:

Europe, like, the thirties, years of the

Speaker:

last century. And this is also very, like, you know, you

Speaker:

really you really feel like what what was the I mean, living in

Speaker:

Germany and France during that time, it's very, very interesting.

Speaker:

So one of my favorites. So I can definitely recommend both of these

Speaker:

books. Very cool. So Audible... detecting, I'm sorry. I'm

Speaker:

detecting a history theme. Yes. Yeah.

Speaker:

Yeah. Yeah? Cool. There's a really good book. Since you live in Berlin,

Speaker:

you might like it. It's called Faust's Metropolis, and

Speaker:

it's about the history of Berlin from, like, you know, almost

Speaker:

Stone Age time till... Okay. Cool. You know, the 20...

Speaker:

you know, early 21st century, kind of like... And the basic

Speaker:

gist is, like, you know, a lot has happened in Berlin. Good.

Speaker:

Sure. Yeah. We all know the bad. Right? But, like, some good things have

Speaker:

happened, kinda everything in between. It's kind of it's an interesting look at, like, the

Speaker:

history of the city and how it apparently was built on a swamp or something

Speaker:

like that. Like Yeah. It's just, it's it's

Speaker:

interesting. And Audible is a sponsor of

Speaker:

Data Driven. If you go to the data driven book .com,

Speaker:

I think even the data driven book .com might work. Uh-huh. That was

Speaker:

a pronunciation joke. You'll get one free

Speaker:

audiobook on us, and we'll get a kickback if you sign up for a

Speaker:

subscription. And finally, where can

Speaker:

folks find out about you, more about you, and what you're up to at Sumsub

Speaker:

and some of the other things you're up to.

Speaker:

What's up? My my connection was, Oh, where can folks find out

Speaker:

more about you and what you're up to? Oh, yes. It's,

Speaker:

yes. It's, it's a company. It's called Sumsub. So sumsub.com.

Speaker:

Also, like, what we have is, today is

Speaker:

with anti fraud. And you have to, I mean, it's not all about

Speaker:

the product. It's actually about helping people

Speaker:

learn about, security. So how they can actually navigate the Internet or,

Speaker:

like, their life more safely. So we have a portal called

Speaker:

The Sumsuber where we actually post a lot

Speaker:

of stuff on making your Internet life,

Speaker:

can I say like this, safer? So, actually, I advise you

Speaker:

to take a look, and then probably you'll find something interesting there.

Speaker:

We definitely will. And, any parting thoughts before

Speaker:

we end the show? Any final thoughts? I just

Speaker:

want to say, yeah, just, I was very happy to be here

Speaker:

and hope it was... Cool. Interesting. This is a great show. It's always good to,

Speaker:

it's always good to kinda understand the intersection

Speaker:

of AI, data, and security because some people still see

Speaker:

those as separate things. But I think as time goes on,

Speaker:

we're gonna wonder how we ever saw it as separate

Speaker:

things. There are so many things to talk about that. Yeah. Yeah. Yeah.

Speaker:

Yeah. Well, awesome. Any parting thoughts, Andy?

Speaker:

No. Just a great show. Pavel, thank you for joining us.

Speaker:

It was our honor. Yes. Likewise. And we'll let

Speaker:

Bailey finish the show. That was some show.

Speaker:

We appreciate you listening to Data Driven. We know you're

Speaker:

busy and we appreciate you listening to our podcast. But

Speaker:

we have a favor to ask. Please rate and review our

Speaker:

podcast on iTunes, Stitcher, or wherever you subscribe to

Speaker:

us. You have subscribed to us,

Speaker:

haven't you? Having high ratings and reviews helps us

Speaker:

improve the quality of our show and rank us more favorably with the search

Speaker:

algorithms. That means more people listen to us,

Speaker:

spreading the joy. And, can't the world use a little

Speaker:

more joy these days? So go do your part to

Speaker:

make the world just a little better and be sure to rate and review the

Speaker:

show.

About the author, Frank

Frank La Vigne is a software engineer and UX geek who saw the light about Data Science at an internal Microsoft Data Science Summit in 2016. Now, he wants to share his passion for the Data Arts with the world.

He blogs regularly at FranksWorld.com and has a YouTube channel called Frank's World TV. (www.FranksWorld.TV). Frank has extensive experience in web and application development. He is also an expert in mobile and tablet engineering. You can find him on Twitter at @tableteer.