Exploring Machine Learning, AI, and Data Science

Why ‘Data-Driven Decisions’ Are a Myth

This week, we dive deep into the world of data, decision-making, and uncertainty with Dale Nesbitt, a lecturer at Stanford and principal at Arrowhead Economics.

Drawing on his unique upbringing in a mining town, Dale Nesbitt shares how witnessing raw data collection firsthand shaped his perspective on what it really takes to make informed decisions—hint: it’s not just about having more data.

Together, we explore the pitfalls of relying solely on data for critical choices, the importance of understanding probability and risk, and why data-gathering itself is often a noisy and imperfect process.

From commodity pricing and speculation in oil markets to the real-world impact of data-driven decisions in healthcare, Dale Nesbitt reveals why true analytic power comes from combining rigorous analysis, sound judgment, and the right kind of data—not just more of it.

Join us as we challenge myths around “data-driven” decisions, unpack lessons from COVID-era data science, and discover why wisdom of the crowd, probability, and a healthy respect for uncertainty are key to navigating our data-rich world.

Time Stamps

00:00 Growing up in a mining town

05:44 Data as the New Crude Oil

07:31 Estimating and Understanding Stochastic Processes

12:49 Impact of Strait of Hormuz Closure

14:19 Challenges of AI in Economics

17:05 Betting on events and elections

21:43 Bayesian analysis and hydroxychloroquine data

23:28 Understanding data and judgment

26:38 Analyzing data for better decisions

Transcript
Speaker:

I grew up in a mining town, and I don't recommend

Speaker:

that actually, as a place to grow up. Got

Speaker:

partially homeschooled by my mom and dad because the quality

Speaker:

of the local schools was not what they wanted to see. I

Speaker:

saw resource production up close and personal. I saw data

Speaker:

coming in up close and personal before anybody knew what to do with it

Speaker:

or how to use it or analyze it at all.

Speaker:

Statistics on operations, on interruptions, what

Speaker:

causes machines to go down and be unavailable for all

Speaker:

these kinds of things. Data can help us understand that.

Speaker:

Hello, and welcome back to Data Driven, the podcast. We explore the

Speaker:

emerging industry of AI, data science, and, of course, data

Speaker:

engineering. You may notice that my favorite data engineer in the world,

Speaker:

Andy Leonard, is not here. However, I brought the most quant,

Speaker:

most quantum curious person I know. I don't like calling her curious, although

Speaker:

one could argue. And Candace Cooley. How's it going,

Speaker:

Candace? It's great. Thank you so much. I'm really excited to be part of Data

Speaker:

Driven today. Awesome. We're happy to have you. And we're also very

Speaker:

happy to have Mr. Dale Nesbitt, who is,

Speaker:

in addition to being at Arrowhead Economics, he's also a

Speaker:

lecturer at Stanford, and I guess he has his first summer class today,

Speaker:

and we're happy to have him. Welcome to the show, Dale.

Speaker:

Thank you. I appreciate it. And thanks for the opportunity to speak with you.

Speaker:

No problem. No problem. What exactly does

Speaker:

Arrowhead Economics do? It's implicit in the name

Speaker:

we do economics. We named

Speaker:

ourselves after the second Nobel laureate in economics,

Speaker:

Kenneth Arrow, who was riding his bike around Stanford

Speaker:

happily until his 96th birthday. And then. Then we

Speaker:

lost him about five years ago. So we do

Speaker:

economics in the energy patch, critical materials

Speaker:

patch, any commodity that's. That's

Speaker:

produced or traded. Interesting. And that's a.

Speaker:

That's a very large field. Were it

Speaker:

not for Eddie Murphy's movie Trading Places, most

Speaker:

people probably wouldn't know a thing about it. I was a kid when that came

Speaker:

out, and I was just fascinated. So sorry. I'm sure that's not the first

Speaker:

time you've heard that, and I hate to tell you, it's probably not gonna be

Speaker:

the last time you hear that. Well, yeah, I've heard that.

Speaker:

I had an auspicious start. I grew up in a mining town.

Speaker:

Okay. I don't recommend that, actually, as a place to

Speaker:

grow up. Got partially homeschooled by

Speaker:

my mom and dad because the quality of the local schools was not what

Speaker:

they want. I saw resource production up

Speaker:

close and personal. I saw data coming in up close and personal

Speaker:

before anybody knew what to do with it or how to use

Speaker:

it or analyze it at all. Statistics on

Speaker:

operations, on interruptions, what causes

Speaker:

machines to go down and be unavailable for all these kinds of

Speaker:

things. Data can help us understand that.

Speaker:

Interesting. So the data life really found you, it sounds.

Speaker:

Yeah. Yes and no. Yes. My background is actually in

Speaker:

probability and decision analysis. That's where I did my research.

Speaker:

And what. One of the things that you're going to find the data is really

Speaker:

good for is developing probability distributions over

Speaker:

phenomena that you don't understand the uncertainty about.

Speaker:

We want to understand the uncertainty about certain phenomena.

Speaker:

Data is a way to do that. Interesting. Sorry,

Speaker:

Candace, I cut you off. No, I was curious. So many organizations, they

Speaker:

collect enormous amounts of data, but why

Speaker:

do so few of them seem to be making better

Speaker:

decisions based upon their data? As a professor of decision

Speaker:

analysis, you need more than data to make decisions.

Speaker:

You need alternatives. You need information, probability

Speaker:

distributions, not data. And you need a notion of your

Speaker:

values, your objectives, what do you like and what do you not like. So if

Speaker:

you're coming like profits, there's a lot of uncertainty

Speaker:

in the middle. If you're producing oil, there's a probability distribution

Speaker:

over oil price, over supply chain costs and those things.

Speaker:

You can't make a good decision unless you have those probability

Speaker:

distributions. I get them in part from data.

Speaker:

The data is out there telling you what these probability distributions

Speaker:

are, if you know how to process it. The reason is because

Speaker:

the notion of data-driven decisions is a non

Speaker:

sequitur. It really is. Data is not

Speaker:

enough to drive your decisions. You need intelligence,

Speaker:

you need an understanding of uncertainty. What's the on

Speaker:

average value of these variables that you're going to see?

Speaker:

How much spread is there in those variables? You can't make a decision under

Speaker:

uncertainty unless you understand the uncertainty.

Speaker:

So data driven is not data driven, it's analytic driven. And

Speaker:

the data helps you. If you, some of the techniques that, that

Speaker:

we use in practice get what those uncertainties are

Speaker:

to feed into a decision analysis

Speaker:

framework, then you make good decisions.

Speaker:

Yeah, I mean, that makes sense, right? Because data would be the raw

Speaker:

commodity, if you will, the raw oil. But the thing that

Speaker:

you put in your car, gasoline, petrol, whatever you want to call it,

Speaker:

is the refined product from the raw material, you know, Frank?

Speaker:

Absolutely. Data is the crude oil and

Speaker:

the probability distribution of the finished product, gasoline,

Speaker:

distillate, so forth that you put into your machinery that makes it go

Speaker:

absolutely Data is the raw material.

Speaker:

And one of the problems there, Candace, pursuant to your remark,

Speaker:

is this is Nesbitt's maxim number three, people

Speaker:

only gather data that's easy to gather.

Speaker:

And maxim number two is: the process of gathering

Speaker:

data is itself stochastic.

Speaker:

Watch how data is gathered. It's a noisy, noisy process

Speaker:

to observe anything and to write down the

Speaker:

correct, say, operation of machine. We

Speaker:

don't even know what the price of crude oil is because everybody reports

Speaker:

it differently. So data

Speaker:

gathering itself is noisy. You have to take that into account when you

Speaker:

analyze it. So this idea that data is. It's like

Speaker:

Rumpelstiltskin, right? Spin straw into gold. You can't.

Speaker:

The data is not straw. It has to be scrupulously

Speaker:

worked. Frank, just exactly what you alluded to to get

Speaker:

proper decision making, to get proper

Speaker:

forecasting and those things.

Speaker:

Well, then doesn't that explain where a lot of data projects or

Speaker:

AI projects go wrong, is because they don't take into account the random.

Speaker:

The randomness. Because you said data collection is stochastic. Right. So

Speaker:

it is. It's inherently unpredictable.

Speaker:

Right. And doesn't that kind of echo throughout the training of these models

Speaker:

that we have and cause weird outliers?

Speaker:

Or is there magic that can happen that can cancel that

Speaker:

out? Yeah, there's. Well, there's magic. You try to measure

Speaker:

through observation the intrinsic uncertainty in the data, like

Speaker:

the difference between predicted and actual, maybe due

Speaker:

to measurement error, just observing the data. And

Speaker:

the great statistician Fisher told us that data was

Speaker:

generated by a stochastic process. And what

Speaker:

you're trying to do is figure out what that stochastic process must

Speaker:

have been very subtle. So you,

Speaker:

you, at the same time you're estimating, say, coefficients for your

Speaker:

model or something you want to estimate, you're also estimating what

Speaker:

stochastic process was at work when the data were

Speaker:

gathered. And you can infer that

Speaker:

it's very sophisticated. AI doesn't do any of that.

Speaker:

Regression analysis does that. AI

Speaker:

doesn't deal with uncertainty at all. No,

Speaker:

it actually gets really wonky when it's uncertain.

Speaker:

Yeah, wonky being a technical term, of course.
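The idea above, estimating the coefficients of a model and the noise process behind the data at the same time, can be sketched with ordinary least squares. This is only an illustrative sketch: the data are simulated, and the function name `fit_simple_regression` and every number below are assumptions, not anything from the conversation.

```python
import math
import random


def fit_simple_regression(xs, ys):
    """Fit y = intercept + slope * x by least squares and estimate the noise sigma."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals are the gap between actual and predicted values.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Two parameters were estimated, so divide by n - 2.
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return intercept, slope, sigma


# Simulated (hypothetical) measurements: a straight line plus Gaussian noise.
rng = random.Random(0)
TRUE_INTERCEPT, TRUE_SLOPE, TRUE_SIGMA = 2.0, 0.5, 1.5
xs = [i / 10 for i in range(500)]
ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + rng.gauss(0.0, TRUE_SIGMA) for x in xs]

intercept, slope, sigma = fit_simple_regression(xs, ys)
print(f"intercept={intercept:.2f} slope={slope:.2f} sigma={sigma:.2f}")
```

The fitted `sigma` is the estimate of how noisy the data-generating process was, recovered from the same observations used to fit the line.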

Speaker:

But I think it's really important when you think of data analysis. You want to

Speaker:

be careful to try to measure intrinsically what sort of

Speaker:

stochastic variation, what sort of process by which people

Speaker:

observe data. What was it? When we

Speaker:

do regression analysis, in simple linear regression there's a term,

Speaker:

sigma in there, and that sigma has to do with the random nature

Speaker:

of the data. So

Speaker:

interesting. So, yeah, it really goes back to the fundamentals of

Speaker:

statistics, doesn't it? Yeah. I mean, think of the, you know,

Speaker:

the clock's always right. Suppose you're measuring the time

Speaker:

and you fail to notice that the clock broke.

Speaker:

And it gives you the same time all the time. Your sensor

Speaker:

broke and it gave you the same reading, no matter what the state of the

Speaker:

machinery was. Your data set ain't too good with

Speaker:

that in there. And yet, wow. No matter what, I get the

Speaker:

same answer. That's really predictive. Fact is, it's a broken

Speaker:

data machine. Right. So you get just

Speaker:

industrial measurements. Your, your process of measuring,

Speaker:

be it manual or sometimes be it automated, you have to

Speaker:

be very, very cognizant of

Speaker:

that and the stochastics of that

Speaker:

and take that into account when you try to understand what that data

Speaker:

implies about your probability distribution over something

Speaker:

interesting. Candace looks like she's

Speaker:

going to say something. I'm thinking about uncertainty,

Speaker:

and I'm also thinking about risk. And I'm trying to understand

Speaker:

what's the distinction that matters between risk and

Speaker:

uncertainty. Well, it's, it's kind of interesting. So suppose

Speaker:

that. Let's take the price of gold. You're going to open a gold mine. Price

Speaker:

of gold right now is $

Speaker:

infinity minus three. It's really, really high.

Speaker:

Wars and things like that caused that to happen.

Speaker:

But you're uncertain about it. But if you're going to open

Speaker:

that gold mine and it's profitable at a gold price of

Speaker:

fifteen hundred dollars an ounce, there's uncertainty, but there'

Speaker:

risk has to do with loss. So suppose you were

Speaker:

uncertainty. One day I grew up in Reno. One day I walked into the

Speaker:

nugget, I was underage, and I inadvertently pulled the handle on the

Speaker:

slot machine and it played. You didn't have to put money

Speaker:

in it and it would just play. It was broken.

Speaker:

That's a pretty good lottery. I played it until the security guard came and

Speaker:

kicked me out. Right. So there was

Speaker:

uncertainty. I didn't know if it was going to give me cherries or clowns or

Speaker:

whatever, but there was no risk. I didn't have anything at risk.

Speaker:

Risk happens when you have losses that have to be balanced against

Speaker:

gains. Now all of a sudden, your elementary tech

Speaker:

gets a lot tighter when you have losses. I had to put

Speaker:

a quarter in that machine. And so if it didn't come up cherries

Speaker:

or better, my quarter was gone. So risk has to do

Speaker:

with the probability of loss and how you,

Speaker:

you trade that off with Your preferences against probability of

Speaker:

gain and uncertainty just has to do with how

Speaker:

sure are you. If you had to do an over under on your

Speaker:

profitability, what's the 50, 50 point

Speaker:

risk estimate? That makes a lot of sense. Yeah, that makes a lot of

Speaker:

sense. That's interesting. But

Speaker:

what's your take on. Obviously the commodity that everyone

Speaker:

is most impacted by, at least in the obvious sense, is oil.

Speaker:

Crude oil. Yep, Crude oil. So

Speaker:

I've noticed obviously there's been some, you know, the elephant in the room, there's been

Speaker:

a lot of instability in that, that space, but somebody had said something and

Speaker:

I didn't quite get it, is that the oil

Speaker:

contracts are down a decade out or five to ten years

Speaker:

out. Yeah. So the fluctuation in

Speaker:

prices that we see at the gas pump have more

Speaker:

to do with

Speaker:

speculation. Is that true? No, Did I mishear it?

Speaker:

I don't think it's true. This has to do with the short term supply demand

Speaker:

balance. When the Strait of Hormuz was closed, you had 10

Speaker:

million barrels a day less supply at the gate of

Speaker:

the Strait of Hormuz. So draw yourself a supply

Speaker:

curve and a demand curve and shift that supply curve 10 million

Speaker:

barrels a day to the left. Your price is going to go high.
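The supply-curve shift described here can be worked through with a pair of straight-line curves. This is a rough sketch under stated assumptions: only the 10 million barrels a day comes from the conversation, while the linear form, the function name `equilibrium`, and all intercepts and slopes are hypothetical values picked so the arithmetic is easy to follow.

```python
def equilibrium(demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Find where demand (intercept - slope * P) equals supply (intercept + slope * P)."""
    price = (demand_intercept - supply_intercept) / (demand_slope + supply_slope)
    quantity = demand_intercept - demand_slope * price
    return price, quantity


# Hypothetical short-run curves: quantity in million barrels/day, price in $/barrel.
# Small slopes mean quantities barely respond to price in the short term.
DEMAND_INTERCEPT = 110.0
DEMAND_SLOPE = 0.1
SUPPLY_INTERCEPT = 95.0
SUPPLY_SLOPE = 0.1
SUPPLY_LOSS = 10.0  # the 10 million barrels/day mentioned in the conversation

before = equilibrium(DEMAND_INTERCEPT, DEMAND_SLOPE, SUPPLY_INTERCEPT, SUPPLY_SLOPE)
# Shifting the supply curve to the left lowers its intercept by the lost volume.
after = equilibrium(
    DEMAND_INTERCEPT, DEMAND_SLOPE, SUPPLY_INTERCEPT - SUPPLY_LOSS, SUPPLY_SLOPE
)

print(f"before: price={before[0]:.1f}, quantity={before[1]:.1f}")
print(f"after:  price={after[0]:.1f}, quantity={after[1]:.1f}")
```

With curves this steep, removing supply moves the price a long way while the traded quantity falls much less than the lost volume; flatter long-run curves would give a smaller price move for the same shift.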

Speaker:

Yeah. No matter what we have. And I do this and

Speaker:

there's a different kind of data I'd like to chat about. We have a world

Speaker:

oil model, multi regional world oil model. We forecast this stuff for

Speaker:

the industry short term and long term. Okay. And

Speaker:

so short term is different from the long term. All the oil that's going to

Speaker:

be produced is sitting right there at the wellhead and you have to have

Speaker:

logistics to get it to market. So the supply curve and the demand curve and

Speaker:

the short term are fixed. In the long term they're not fixed.

Speaker:

People can invest capital and go produce some tar sands in

Speaker:

Venezuela or produce some more oil in the Middle East.

Speaker:

And so longer term price effects tend to look a lot different

Speaker:

than shorter term price effects. But both are uncertain.

Speaker:

Both are. So if we gather data on some

Speaker:

phenomenon, if it affects long term prices,

Speaker:

then that data will have a stochastic effect. Right.

Speaker:

It helps you understand them better, helps you reduce your, your

Speaker:

uncertainty of what those prices are going to be. The more data that you gather,

Speaker:

the more definitive, i.e., the lower variance, your outlook.

Speaker:

So a lot of these comments that you hear, they give me

Speaker:

. I thought I was back in the

Speaker:

The stupidity in these comments, number

Speaker:

one, the biggest stupidity in the world, is that the future price

Speaker:

is an extrapolation of the past price. And people do

Speaker:

that statistically, it's dead wrong. Economics teaches us that

Speaker:

the future occurs because of future

Speaker:

supply and demand has absolutely nothing to do with the past.

Speaker:

So AI is going to have a really hard time with any economic

Speaker:

problem because the future price that you're going to

Speaker:

see, say right at the gate of the Strait of Hormuz, it's going to

Speaker:

be a function of what's happening then, not what's happening now.

Speaker:

People will speculate into the future. They will solve. They will sign

Speaker:

a long term buy or sell contract. I'll give you a million

Speaker:

barrels a day in

Speaker:

better have that crude oil in

Speaker:

you'll sign a contract that I'll buy it. They sell, buy and sell

Speaker:

futures. That just makes the price more and more available

Speaker:

and visible to everybody.

Speaker:

Speculation is a good thing. It's a very, very good thing.

Speaker:

The best thing is speculation trading.

Speaker:

Because it takes risk away from the producers, right? Or no,

Speaker:

it shows you the price. If you watch how people make these

Speaker:

trades, if all of them are made at an implicit price of $80 a

Speaker:

barrel, there's your best guess at the price. Speculation

Speaker:

shows us the price. So this idea that speculation

Speaker:

is bad is really stupid. It's so stupid.

Speaker:

You want more and more and more and more trading. So everybody knows the

Speaker:

price. So the price in the fair, the free market is known to all.

Speaker:

Interesting. That's why people

Speaker:

trade, right? So you don't have to, you

Speaker:

don't have to be data driven about the future price. It's transactional.

Speaker:

Right? Okay. We have short

Speaker:

term disruptions, there's not enough trading. And

Speaker:

until a point, because you're trading on the probability

Speaker:

that the strait remains open versus vibrates

Speaker:

between open and closed versus closed.

Speaker:

And people will speculate that. Thank God for

Speaker:

speculators. Interesting.

Speaker:

What's your take on these Polymarket-type

Speaker:

marketplaces? What's your take on that?

Speaker:

People are speculating. You do? Okay, I love it, I love it.

Speaker:

I think people are betting their probability assessment against yours. They

Speaker:

think they got a better probability distribution of something over yours and

Speaker:

they put their money where their mouth is, they trade on it. And if you

Speaker:

look at the volume of trades that people lay down, that gives

Speaker:

you some notion of the consensus probability of those events

Speaker:

occurring. So the what like on the election?

Speaker:

Speculating on the election. I love those things. I don't play them, but I love

Speaker:

them. Sportsbook is that sports books are

Speaker:

great. They're so much fun to play. Too

Speaker:

bad they rake off so much of the pot as their fee.

Speaker:

But you can watch the data on those and I think

Speaker:

the data suggests that they're pretty

Speaker:

predictive of what happens in a probabilistic sense.

Speaker:

No, I remember one of the. There's a paper from a long

Speaker:

time ago and basically discussed how if you remember the show

Speaker:

Who Wants to Be a Millionaire, the crowd, when you ask the crowd a question,

Speaker:

the crowd was more accurate than any of the other mechanisms

Speaker:

that they had, which were phone a friend, pass,

Speaker:

and something else. But the

Speaker:

crowd actually got the answer right. Far and above.

Speaker:

Yeah. That book, the Wisdom of Crowds. That was it.

Speaker:

Yeah. And it's very good. See, the market is the ultimate wisdom of

Speaker:

crowds. The crude oil market is so heavily traded. There's

Speaker:

no misinformation in that price. There's even a

Speaker:

theory which I subscribe to, that there's nothing the matter with insider

Speaker:

trading. It gets more information into the price and

Speaker:

more correct information into the price. It's

Speaker:

illegal because people say it's unfair. It's.

Speaker:

It's capitalizing on. On insider knowledge and all that.

Speaker:

I'm not so sure I buy that. I want the. When I buy somebody's

Speaker:

stock like SpaceX, I want all the insider and

Speaker:

outsider knowledge involved in my decision whether or

Speaker:

not to purchase that stock. And it wasn't; all

Speaker:

the insider knowledge was concealed from the market by law.

Speaker:

Yeah, no, that. It's funny you mentioned that because insider trading

Speaker:

has only been illegal since the Great

Speaker:

Depression. Yeah, I remember one of the

Speaker:

most interesting books. It was called the Patriarch and it was

Speaker:

basically about Joe Kennedy. Oh yeah. And

Speaker:

FDR put him in charge of that. But he was a notorious insider

Speaker:

trader. Oh yeah. As you would point out later on, it wasn't

Speaker:

illegal then. Right? It wasn't illegal. It's not even. It's not

Speaker:

illegal in Congress today. Well, yeah,

Speaker:

exactly. I mean there are people who are financial

Speaker:

geniuses. This is crazy. I mean, really. Yeah. These

Speaker:

guys have confidential insider information, it's

Speaker:

confidentially delivered. Confidentially. And you can go off

Speaker:

and trade on it. Come on, guys. It's

Speaker:

not a practice that should be condoned. Yeah. Shutting down insider

Speaker:

information is about fairness. Then they should be included here.

Speaker:

But we're a little far afield. Getting back to data science, there's a couple of

Speaker:

other things that. One of my. One of my favorite

Speaker:

themes. I saw this during COVID go back to the beginning of COVID

Speaker:

How much data was there on what was curative

Speaker:

or what was ameliorating of symptoms about any treatment

Speaker:

we had, there was no data. Not a lot, but

Speaker:

there was a lot that was suppressed. If it's suppressed, it

Speaker:

doesn't exist. Okay. If you can't analyze with it,

Speaker:

then there is no data. It's like if I flip a coin and put a

Speaker:

piece of paper over it and ask you to call the coin,

Speaker:

it's still 50, 50 to you. I can look under that paper, it's 100 to

Speaker:

me. I know what it is. Your state of information is still 50,

Speaker:

50. You don't have any information when I have it.

Speaker:

So you're going to have to take that into account. So

Speaker:

people were coming out and saying, oh, hydroxychloroquine is going to have

Speaker:

prophylactic effects and curative effects. That's my

Speaker:

subjective judgment. And my brand of data analysis is called

Speaker:

Bayesian, where you come to it with a judgment in

Speaker:

advance and then what data does is

Speaker:

revise your judgment. So,

Speaker:

so people were treating people with hydroxychloroquine and they were

Speaker:

on TV to the extent they could get on. This is great.

Speaker:

60 year old medicine is going to cure it all. As the data began

Speaker:

to come in, if you'd have been a Bayesian,

Speaker:

you'd have said, my initial statistics on

Speaker:

the curative powers of hydroxychloroquine are being

Speaker:

totally not supported by the data. I would have changed

Speaker:

my judgment. That's what data analysis is for,

Speaker:

to get you to change your judgment.
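The Bayesian idea described here, starting from a judgment and letting data revise it, can be sketched with a Beta prior updated by counts. The figures are invented purely for illustration and are not clinical data; the helper names `update_beta` and `beta_mean` are likewise assumptions.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial counts gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical optimistic prior: "this treatment helps about 80% of patients".
prior = (8.0, 2.0)

# Hypothetical observations (not real trial results): 12 improved, 88 did not.
observed_improved = 12
observed_not_improved = 88

posterior = update_beta(*prior, observed_improved, observed_not_improved)

print(f"prior mean:     {beta_mean(*prior):.2f}")
print(f"posterior mean: {beta_mean(*posterior):.2f}")
```

The prior carries the weight of only ten imagined patients, so a hundred real observations pull the estimate most of the way toward what the data show, which is the change of judgment being described.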

Speaker:

So valuable. You know, you start off an AI

Speaker:

program with judgment as to what all the parameters are. Data

Speaker:

stops the judgment process and makes those parameters

Speaker:

consistent with observations. That's powerful, isn't it?

Speaker:

Yeah, very powerful. And so

Speaker:

hydroxychloroquine fairly quickly started

Speaker:

the, the curative and the, and the amelioration

Speaker:

properties of that drug were pretty quickly

Speaker:

contradicted by the data. Okay, Same with

Speaker:

Ivermectin, the horse tranquilizer. All right. Horse

Speaker:

tranquilizer is going to cure everybody. And, and

Speaker:

pretty quickly contradicted when they tried a little bit of that in small

Speaker:

case studies. It, you know, you might as well have been taking a sugar pill.

Speaker:

And so then we get to the shots and all the people said,

Speaker:

these shots, man, they're going to be magic. They're going to stop transmission.

Speaker:

That's my judgment. We're going to stop transmission. We began to gather

Speaker:

data. The shots were extremely valuable. They didn't stop

Speaker:

transmission, but they cut the severity of the symptoms by three

Speaker:

orders of magnitude. And so they changed our judgment.

Speaker:

The point of data science is get the data to give you

Speaker:

the best possible judgment of probability. That's

Speaker:

what data science is all about. Gather the right data that

Speaker:

can have the most effect on your judgment. That's what we want

Speaker:

from data, is our judgment as to how these

Speaker:

processes work. That's cool, isn't

Speaker:

it? So very valuable raw

Speaker:

material to making me a smarter person.

Speaker:

Having better judgment is consistent with a gazillion observations,

Speaker:

each of which is stochastic in its own right, but they're not

Speaker:

completely stochastic. It's not random. There's a lot of systematic

Speaker:

stuff hidden in there. So the idea of data science is, is

Speaker:

gathering data. If you don't just go out and gather the data, that's easiest to

Speaker:

gather. Gather data that you really need to have

Speaker:

to figure out whether the shot was going to

Speaker:

be strictly prophylactic or whether it's going to cut the

Speaker:

transmission. Well, we figured it out after a while, got enough data that

Speaker:

said it's not going to change the transmission. And you've heard all the political

Speaker:

bitching and moaning about that. Oh God, it's a lousy shot. It doesn't cut

Speaker:

the transmission. It saved lives by cutting the severity.

Speaker:

And that's statistically valid. And so

Speaker:

big success. The data gathering in places like Israel where shots were

Speaker:

mandatory was really interesting. Really showed the prophylactic effect.

Speaker:

God, that's valuable data, isn't it? That's

Speaker:

trillion dollar data. Is it allowed people to

Speaker:

make better decisions? Candace, you're right on the mark. Once you know

Speaker:

that, you can make a lot better decisions both collectively

Speaker:

and individually. I decided to take the COVID

Speaker:

shot because I did a decision analysis of it using my

Speaker:

subjective probabilities. It was a no brainer. How could you be an

Speaker:

anti vaxxer when you made that kind of assessment?

Speaker:

I've done that with allergy shots. Great example. I take

Speaker:

allergy shots, I'm a severe allergy patient. I went in to take

Speaker:

these things. Well, you know, they know what you're allergic to and they stick it

Speaker:

directly into your bloodstream. That's what an allergy shot is. If you're allergic to

Speaker:

peanuts, you get peanuts stuck in your body. And so I asked,

Speaker:

well, what are the, what are the problems here? Well, you know, you can die

Speaker:

of anaphylactic shock. You can drop off the bench and be dead in 30

Speaker:

seconds. Good. That's really a good thing to do. And

Speaker:

they might not work. So I looked at the statistics

Speaker:

and as near as I could tell, about 70% of the time

Speaker:

they Provided significant relief and it was about

Speaker:

1 in 10 to the 7th. So one 10

Speaker:

millionth probability that was going to drop dead and so

Speaker:

more likely to win the lottery or find another. You know

Speaker:

I looked at my values and I rolled back my decision tree. It was a

Speaker:

no brainer. 50 years I've been stepping up to the lady and

Speaker:

having her stick peanuts in my arm which I'm deathly allergic to.
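Rolling back a decision tree like the one described can be sketched in a few lines. The two probabilities (about 70% relief, about 1 in 10 million fatal reaction) are the ones quoted in the conversation; the outcome values are hypothetical stand-ins for personal preferences, and the `rollback` and `outcome` helpers are assumptions made for illustration.

```python
def rollback(node):
    """Return (expected value, best option name or None) for a decision-tree node."""
    kind = node["kind"]
    if kind == "outcome":
        return node["value"], None
    if kind == "chance":
        # Probability-weighted average over the branches.
        expected = sum(prob * rollback(child)[0] for prob, child in node["branches"])
        return expected, None
    if kind == "decision":
        # Pick the option with the highest expected value.
        values = {name: rollback(child)[0] for name, child in node["options"].items()}
        best = max(values, key=values.get)
        return values[best], best
    raise ValueError(f"unknown node kind: {kind}")


def outcome(value):
    return {"kind": "outcome", "value": value}


P_FATAL = 1e-7    # quoted in the conversation
P_RELIEF = 0.7    # quoted in the conversation

# Hypothetical values on an arbitrary scale; each person would set their own.
VALUE_RELIEF = 10.0
VALUE_NO_RELIEF = -1.0
VALUE_FATAL = -1_000_000.0
VALUE_DECLINE = 0.0

tree = {
    "kind": "decision",
    "options": {
        "take shots": {
            "kind": "chance",
            "branches": [
                (P_FATAL, outcome(VALUE_FATAL)),
                (1 - P_FATAL, {
                    "kind": "chance",
                    "branches": [
                        (P_RELIEF, outcome(VALUE_RELIEF)),
                        (1 - P_RELIEF, outcome(VALUE_NO_RELIEF)),
                    ],
                }),
            ],
        },
        "decline": outcome(VALUE_DECLINE),
    },
}

expected_value, choice = rollback(tree)
print(f"best option: {choice}, expected value: {expected_value:.2f}")
```

Even with a very large penalty on the fatal outcome, its tiny probability contributes little to the expected value, so under these assumed values the shots come out ahead of declining.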

Speaker:

And so you there also other things too that you're

Speaker:

in the doctor's office when that happens. So like if you do like they could

Speaker:

administer isn't that kind of alter the risk profile too? It's not

Speaker:

like you're. I carry an EpiPen. You wouldn't. I carry. Yeah, like so

Speaker:

you wouldn't go on a rowboat in the middle of the Ocean without your EpiPen

Speaker:

and then take that shot. I did but you know you're not supposed to. All

Speaker:

right, but, but I mean that would also. Yeah but in that

Speaker:

simple example the statistics of 10 to the -7

Speaker:

and the statistics of 70% probability of

Speaker:

amelioration of symptoms, that's the billion dollar data

Speaker:

right there. Simple minded data but

Speaker:

it helps you make the right decision. I got that out

Speaker:

of the data but I also had to analyze it to find out

Speaker:

how that interacted with my preferences. So

Speaker:

data analysis is critical to getting the information to

Speaker:

make a good decision. That's what I think. I didn't even

Speaker:

do any analysis. I just looked in books and saw those probabilities ain't

Speaker:

that great. So every time I go in there, hey, I might never walk out

Speaker:

of here again but the odds are pretty low.

Speaker:

That's a great point. So we want to be respectful of your time. It's

Speaker:

almost 40 after the hour. Where can folks find out more about

Speaker:

you? And I know Stanford does a lot of these lectures. They post them online.

Speaker:

Are any of yours online? They haven't

Speaker:

been. No, they haven't been but that's why I'm talking to you. Maybe I'll do

Speaker:

that. And if you want to continue. I haven't told you much about data

Speaker:

analysis per se. If you want to fire up tomorrow, I'm

Speaker:

glad to do that. Yeah, let's. Let's talk again soon. We

Speaker:

want to know the uncertainty we're looking at. And the more data you

Speaker:

get, the less uncertainty you have. It gets pretty definitive if you

Speaker:

got 60 terabytes of data, but we don't have that,

Speaker:

Candace. The other thing we'll talk about next time is in real decisions you don't

Speaker:

have any or very little data that's relevant to the decision you're

Speaker:

making. You get the data that somebody gathered on a government

Speaker:

grant, like crime statistics. There's

Speaker:

nothing in it. It's just data that's easy to gather. Okay.

Speaker:

You are awesome. I look forward to speaking with you

Speaker:

again. Yeah. With that, we'll let the

Speaker:

outro music play.

About the author, Frank

Frank La Vigne is a software engineer and UX geek who saw the light about Data Science at an internal Microsoft Data Science Summit in 2016. Now, he wants to share his passion for the Data Arts with the world.

He blogs regularly at FranksWorld.com and has a YouTube channel called Frank's World TV. (www.FranksWorld.TV). Frank has extensive experience in web and application development. He is also an expert in mobile and tablet engineering. You can find him on Twitter at @tableteer.