Unanswered Questions
8,983 questions with no upvoted or accepted answers
17
votes
0
answers
2k
views
Rademacher complexity of logistic regression
Consider logistic regression. We have the logistic loss function,
$\phi: R\rightarrow [0,1], \phi(u)=\log(1+\exp(-u))$, which is Lipschitz, and we have the linear function class $F=\{f_w:R^d \...
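As an illustrative aside on the Lipschitz property mentioned in this excerpt: the derivative of $\phi(u)=\log(1+\exp(-u))$ is $-1/(1+\exp(u))$, whose magnitude is below 1, so $\phi$ should be 1-Lipschitz. The sketch below checks this numerically on a grid; the function names and the grid are assumptions made for illustration, not part of the question.

```python
import math


def logistic_loss(u: float) -> float:
    """Numerically stable evaluation of log(1 + exp(-u))."""
    if u >= 0:
        # exp(-u) <= 1 here, so log1p cannot overflow.
        return math.log1p(math.exp(-u))
    # For u < 0 use the identity log(1 + exp(-u)) = -u + log(1 + exp(u)).
    return -u + math.log1p(math.exp(u))


def max_secant_slope(points):
    """Largest absolute secant slope between consecutive grid points."""
    return max(
        abs(logistic_loss(b) - logistic_loss(a)) / (b - a)
        for a, b in zip(points, points[1:])
    )


# Hypothetical evaluation grid: u from -10 to 10 in steps of 0.1.
grid = [i / 10 for i in range(-100, 101)]
slope = max_secant_slope(grid)
print(f"largest secant slope on the grid: {slope:.6f}")
```

The largest slope occurs at the most negative end of the grid and stays strictly below 1, which is consistent with a Lipschitz constant of 1.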
14
votes
0
answers
700
views
Convolutional neural network for multi-variate time series?
I want to use CNN architectures for classification of multivariate time-series, where we apply one label to each sequence.
I searched the net for the available designs in the literature and I found ...
13
votes
0
answers
272
views
Logistic regression for classification: are there any analytical solutions for the out-of-sample accuracy?
I run a binary logistic regression, with a binary dependent variable and a continuous independent one.
Now I want to evaluate the out-of-sample performance of the classification algorithm so obtained. ...
12
votes
0
answers
1k
views
Why do we really need the concept of "local" Rademacher complexity?
Recently, I have been studying High-Dimensional Statistics: A Non-Asymptotic Viewpoint written by Martin J. Wainwright. In this book, the author uses a special complexity measure which is called Local ...
12
votes
0
answers
2k
views
Computing a bootstrap confidence interval for the prediction error with the percentile and the BCa method
I have two related questions regarding the computation of a non-parametric bootstrap confidence interval for the prediction error.
Setting: I have a sample S from a data population P and a learner L, ...
10
votes
0
answers
624
views
When using L2 regularization outside of linear regression, do the same MAP estimation assumptions hold?
Some context is shared below, and my question is bolded at the end.
MLE from observation noise
In the linear regression setting, we learn model weights $\mathbf{w}$ to make scalar predictions $\hat{y}...
10
votes
0
answers
2k
views
Difference between Shapley values and SHAP
The paper on SHAP values gives a formula for the Shapley values in (4) and, apparently (?), for SHAP values in (8).
Still, I don't really understand the difference between Shapley and SHAP ...
10
votes
0
answers
377
views
Reinforcement *Model* Learning
Classical reinforcement learning (Q-learning or SARSA) can be extended with models of the environment. These models are usually transition tables that contain the probability of arriving at a ...
10
votes
2
answers
2k
views
Random Forest: Class specific feature importance
I'm using the bigrf R package to analyse a dataset with ca. 50,000 observations × 120 variables, classified into two groups.
After growing a forest of 1000 trees, ...
9
votes
0
answers
765
views
How come the BART results are this good at the 2016 Atlantic causal inference competition?
The well-known paper Dorie (2017) shows that BART performs remarkably well in causal inference. In my replication, the MSE of BART can be 40% lower than that of other machine learning methods.
But all machine ...
9
votes
1
answer
179
views
Is there an ML or DL tool that can learn to detect periodically occurring patterns in a one-dimensional time series?
I am trying to create a tool that labels refrigerator temperature readings. A reading is taken every 5 minutes, and its label identifies whether or not it was taken while the refrigerator was ...
8
votes
0
answers
3k
views
In the attention mechanism, why are there separate weight matrices for the queries and keys?
To perform self-attention over a collection of $n$ vectors stacked up into a matrix $X \in \mathbb{R}^{n \times d}$, we first obtain query, key, and value representations of these vectors via three ...
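To make the setup in this excerpt concrete, here is a minimal sketch of the unnormalised attention scores. It assumes the usual projections $Q = XW_q$ and $K = XW_k$; the names `W_q`, `W_k`, `d_k` and the random toy data are assumptions for illustration, since the excerpt is truncated. It also checks the identity $QK^\top = X(W_qW_k^\top)X^\top$, which is what makes the question about separate matrices natural.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def random_matrix(rows, cols, rng):
    """Matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n, d, d_k = 4, 3, 2              # hypothetical sizes: n vectors, dimension d, projection size d_k

X = random_matrix(n, d, rng)     # input vectors stacked as rows
W_q = random_matrix(d, d_k, rng) # query projection (assumed name)
W_k = random_matrix(d, d_k, rng) # key projection (assumed name)

Q = matmul(X, W_q)               # n x d_k queries
K = matmul(X, W_k)               # n x d_k keys
scores = matmul(Q, transpose(K)) # n x n unnormalised attention scores

# The same scores computed through a single merged d x d matrix W_q W_k^T.
W_qk = matmul(W_q, transpose(W_k))
scores_merged = matmul(matmul(X, W_qk), transpose(X))

max_diff = max(
    abs(s - t)
    for row_s, row_t in zip(scores, scores_merged)
    for s, t in zip(row_s, row_t)
)
print(f"largest difference between the two computations: {max_diff:.2e}")
```

The sketch omits the softmax, the $1/\sqrt{d_k}$ scaling, and the value projection. Note that the merged matrix `W_qk` has rank at most `d_k`, so the two-matrix form acts as a low-rank factorisation of a $d \times d$ matrix whenever `d_k < d`.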
8
votes
0
answers
2k
views
Zero-inflation with sklearn and continuous target?
My current data have quite a large number of zeros (~60%), and I'm thinking of trying to implement a zero-inflated model of sorts with sklearn. While I've used zero-inflated Poisson/negative binomial ...
8
votes
0
answers
1k
views
The extrapolation problem: model selection, performance metrics, and improvement
Machine learning models are fit to a response variable within a given range. This leads to weak and sometimes disastrous performance when it comes to instances with an actual response variable outside ...
8
votes
0
answers
331
views
What machine learning and deep learning models are used for longitudinal studies (panel data)?
As the title suggests, I have a longitudinal database (also called panel data). (I have over 100,000 observations. The time period is X years. This means that for every year I have the values of the ...