Highest scored questions

1373 votes

27 answers

973k views

Making sense of principal component analysis, eigenvectors & eigenvalues

In today's pattern recognition class my professor talked about PCA, eigenvectors and eigenvalues. I understood the mathematics of it. If I'm asked to find eigenvalues etc. I'll do it correctly like ...

claws

14k

asked Sep 15, 2010 at 20:05

826 votes

10 answers

1.2m views

How to choose the number of hidden layers and nodes in a feedforward neural network?

Is there a standard and accepted method for selecting the number of layers, and the number of nodes in each layer, in a feed-forward neural network? I'm interested in automated ways of building neural ...

Rob Hyndman

58.9k

asked Jul 20, 2010 at 0:15

687 votes

12 answers

512k views

What is the difference between "likelihood" and "probability"?

The Wikipedia page claims that likelihood and probability are distinct concepts. In non-technical parlance, "likelihood" is usually a synonym for "probability," but in statistical usage there is a ...

Douglas S. Stones

7,741

asked Sep 14, 2010 at 3:24

633 votes

5 answers

527k views

Relationship between SVD and PCA. How to use SVD to perform PCA?

Principal component analysis (PCA) is usually explained via an eigen-decomposition of the covariance matrix. However, it can also be performed via singular value decomposition (SVD) of the data matrix ...

amoeba

108k

asked Jan 20, 2015 at 23:47

576 votes

15 answers

254k views

What is the intuition behind beta distribution?

Disclaimer: I'm not a statistician but a software engineer. Most of my knowledge in statistics comes from self-education, thus I still have many gaps in understanding concepts that may seem trivial ...

ffriend

10.1k

asked Jan 15, 2013 at 15:31

571 votes

11 answers

670k views

What is the difference between test set and validation set?

I found this confusing when I use the neural network toolbox in Matlab. It divided the raw data set into three parts: training set, validation set, test set. I notice in many training or learning ...

xiaohan2012

7,269

asked Nov 28, 2011 at 11:05

565 votes

23 answers

333k views

Why square the difference instead of taking the absolute value in standard deviation?

In the definition of standard deviation, why do we have to square the difference from the mean to get the mean (E) and take the square root back at the end? Can't we just simply take the absolute ...

c4il

5,905

asked Jul 19, 2010 at 21:04

500 votes

20 answers

180k views

The Two Cultures: statistics vs. machine learning?

Last year, I read a blog post from Brendan O'Connor entitled "Statistics vs. Machine Learning, fight!" that discussed some of the differences between the two fields. Andrew Gelman responded ...

Community wiki

9 revs, 7 users 47%
Shane

447 votes

5 answers

178k views

How to understand the drawbacks of K-means

K-means is a widely used method in cluster analysis. In my understanding, this method does NOT require ANY assumptions, i.e., give me a dataset and a pre-specified number of clusters, k, and I just ...

KevinKim

6,979

asked Jan 16, 2015 at 4:38

443 votes

9 answers

901k views

What is the difference between fixed effect, random effect in mixed effect models?

In simple terms, how would you explain (perhaps with simple examples) the difference between fixed effect, random effect in mixed effect models?

Andrew

6,348

asked Nov 19, 2010 at 0:03

441 votes

13 answers

288k views

Bayesian and frequentist reasoning in plain English

How would you describe in plain English the characteristics that distinguish Bayesian from Frequentist reasoning?

Daniel Vassallo

4,519

asked Jul 19, 2010 at 19:25

426 votes

11 answers

190k views

Explaining to laypeople why bootstrapping works

I recently used bootstrapping to estimate confidence intervals for a project. Someone who doesn't know much about statistics recently asked me to explain why bootstrapping works, i.e., why is it that ...

Alan H.

5,289

asked Apr 8, 2012 at 21:04

421 votes

17 answers

175k views

What happens if the explanatory and response variables are sorted independently before regression?

Suppose we have data set $(X_i,Y_i)$ with $n$ points. We want to perform a linear regression, but first we sort the $X_i$ values and the $Y_i$ values independently of each other, forming data set $(...

arbitrary user

3,891

asked Dec 7, 2015 at 17:22

418 votes

7 answers

429k views

When conducting multiple regression, when should you center your predictor variables & when should you standardize them?

In some literature, I have read that a regression with multiple explanatory variables, if in different units, needed to be standardized. (Standardizing consists in subtracting the mean and dividing ...

mathieu_r

4,591

asked Jun 4, 2012 at 16:32

414 votes

7 answers

1.7m views

How to normalize data to 0-1 range?

I am lost in normalizing, could anyone guide me please. I have a minimum and maximum values, say -23.89 and 7.54990767, respectively. If I get a value of 5.6878 how can I scale this value on a scale ...

Angelo

4,595

asked Sep 23, 2013 at 15:18
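
The numbers in this excerpt can be run through ordinary min-max scaling, which is one common reading of "normalize to 0-1". The sketch below is an illustration only, not the accepted answer: the function name `min_max_scale` is hypothetical, and it assumes the quoted minimum and maximum are the true bounds of the data.

```python
def min_max_scale(x, lo, hi):
    """Linearly map x from the interval [lo, hi] onto [0, 1]."""
    if hi == lo:
        # A zero-width range would divide by zero; no scaling is defined.
        raise ValueError("lo and hi must differ")
    return (x - lo) / (hi - lo)


# Values quoted in the question excerpt.
minimum = -23.89
maximum = 7.54990767
value = 5.6878

scaled = min_max_scale(value, minimum, maximum)
print(scaled)  # roughly 0.94, since the value sits close to the maximum
```

The endpoints map to 0 and 1 by construction, and a value outside the quoted range would land outside [0, 1], so clipping is a separate decision this sketch does not make.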
