Unanswered Questions

10,108 questions with no upvoted or accepted answers
17 votes
0 answers
2k views

Rademacher complexity of logistic regression

Consider logistic regression. We have the logistic loss function, $\phi: R\rightarrow [0,1], \phi(u)=\log(1+\exp(-u))$, which is Lipschitz, and we have the linear function class $F=\{f_w:R^d \...
13 votes
0 answers
741 views

Help me understand the Bayesian kernel density estimation (Sibisi and Skilling, 1996)

Sibisi and Skilling (1996, also mentioned in the 1997 paper) define Bayesian kernel density as $$ f(x) = \int dx' \,\phi(x')\, K(x, x') \tag{2} $$ Here the kernel $K$ is an assigned smooth ...
12 votes
0 answers
517 views

Official name of a common type of Bayesian simulation study

There is a kind of simulation study that is commonly used to validate an implementation of a Bayesian model: For independent replication $i = 1, ..., n$: Draw a set of "true" parameters ...
12 votes
0 answers
1k views

Why do we really need the concept of "Local" Rademacher complexity?

Recently, I have been studying High-Dimensional Statistics: A Non-Asymptotic Viewpoint written by Martin J. Wainwright. In this book, the author uses a special complexity measure which is called Local ...
12 votes
0 answers
3k views

Fourier transform of a Gaussian process

I would like to discuss and ask a question regarding the Fourier transform of a Gaussian process, if it makes sense. For that purpose, let me describe the following situation. Let $z(s)$ be a ...
11 votes
0 answers
192 views

Pope effect on pizza - Regression with presence absence and similarity data as dependent variables

I'm trying to figure out the right way to set up a regression when the dependent variables are presence absence data (of pizzas), and the similarity between the present pizzas. Bear with the story: ...
11 votes
1 answer
745 views

Hypergeometric: how do I construct a credibility interval around K (population successes) in R?

I have a problem for which I believe I should use the hypergeometric distribution, but I can't figure out how to do it in R. Say I have a bag of marbles with a known number ($N$) of marbles, but the ...
10 votes
3 answers
232 views

How to guess the size of a set?

Assume we have a set of unique words and draw a number $n$ of them using simple-random-sampling without replacement independently in each round. We have several rounds and try to guess the set size ...
10 votes
0 answers
624 views

When using L2 regularization outside of linear regression, do the same MAP estimation assumptions hold?

Some context is shared below, and my question is bolded at the end. MLE from observation noise In the linear regression setting, we learn model weights $\mathbf{w}$ to make scalar predictions $\hat{y}...
10 votes
0 answers
2k views

Difference between Shapley values and SHAP

The paper on SHAP values gives a formula for the Shapley values in (4) and, apparently (?), for SHAP values in (8). Still, I don't really understand the difference between Shapley and SHAP ...
10 votes
0 answers
377 views

Reinforcement *Model* Learning

Classical reinforcement learning (Q- or Sarsa-Learning) can be extended with models of the environment. These models are usually transition tables that contain the probability of arriving at a ...
10 votes
0 answers
160 views

Rationale behind Good–Turing frequency estimation?

Good–Turing frequency estimation is a smoothing estimator for estimating a multinomial distribution. It seems very convoluted. From a mathematical statistics point of view, what is the rationale behind ...
9 votes
1 answer
2k views

PyMC3 implementation of Bayesian MMM: poor posterior inference

Google released a whitepaper on Media Mix Modelling (MMM) in 2017; vanilla MMM (established in the 1960s) uses multivariate regression. It's a decent mechanism to understand which of your marketing ...
9 votes
0 answers
765 views

How come the BART results are this good at the 2016 Atlantic causal inference competition?

The well-known paper Dorie, 2017 shows that BART performs remarkably well in causal inference. In my replication, the MSE of BART can be 40% lower than the MSE of other machine learning methods. But all machine ...
9 votes
1 answer
179 views

Is there an ML or DL tool that can learn to detect periodically occurring patterns in a one-dimensional time series?

I am trying to create a tool that labels refrigerator temperature readings. A reading is taken every 5 minutes, and its label identifies whether or not it was taken while the refrigerator was ...
