Unanswered Questions

8,983 questions with no upvoted or accepted answers
17 votes
0 answers
2k views

Rademacher complexity of logistic regression

Consider logistic regression. We have the logistic loss function, $\phi: R\rightarrow [0,1], \phi(u)=\log(1+\exp(-u))$, which is Lipschitz, and we have the linear function class $F=\{f_w:R^d \...
14 votes
0 answers
700 views

Convolutional neural network for multi-variate time series?

I want to use CNN architectures for classification of multivariate time series, where we apply one label to each sequence. I searched the net for the available designs in the literature and I found ...
13 votes
0 answers
272 views

Logistic regression for classification: are there any analytical solutions for the out-of-sample accuracy?

I run a binary logistic regression, with a binary dependent variable and a continuous independent one. Now I want to evaluate the out-of-sample performance of the classification algorithm so obtained. ...
12 votes
0 answers
1k views

Why we really need the concept of "Local" Rademacher complexity?

Recently, I have been studying High-Dimensional Statistics: A Non-Asymptotic Viewpoint written by Martin J. Wainwright. In this book, the author uses a special complexity measure which is called Local ...
12 votes
0 answers
2k views

Computing a bootstrap confidence interval for the prediction error with the percentile and the BCa method

I have two related questions regarding the computation of a non-parametric bootstrap confidence interval for the prediction error. Setting: I have a sample S from a data population P and a learner L, ...
10 votes
0 answers
624 views

When using L2 regularization outside of linear regression, do the same MAP estimation assumptions hold?

Some context is shared below, and my question is bolded at the end. MLE from observation noise In the linear regression setting, we learn model weights $\mathbf{w}$ to make scalar predictions $\hat{y}...
10 votes
0 answers
2k views

Difference between Shapley values and SHAP

The paper on SHAP values gives a formula for the Shapley values in (4) and, apparently (?), for SHAP values in (8). Still, I don't really understand the difference between Shapley and SHAP ...
10 votes
0 answers
377 views

Reinforcement *Model* Learning

Classical reinforcement learning (Q- or Sarsa-Learning) can be extended with models of the environment. These models are usually transition tables that contain the probability of arriving at a ...
10 votes
2 answers
2k views

Random Forest: Class specific feature importance

I'm using the bigrf R-package to analyse a dataset with ca. 50.000 observations x 120 variables, classified into two groups. After growing a forest of 1000 trees, ...
9 votes
0 answers
765 views

How come the BART results are this good at the 2016 Atlantic causal inference competition?

The famous paper Dorie, 2017 shows that BART performs remarkably well in causal inference. In my replication, the MSE of BART can be 40% lower than the MSE of other machine learning methods. But all machine ...
9 votes
1 answer
179 views

Is there a ML or DL tool that can learn to detect periodically occurring patterns in a one dimensional time series?

I am trying to create a tool that labels refrigerator temperature readings. A reading is taken every 5 minutes, and its label identifies whether or not it was taken while the refrigerator was ...
8 votes
0 answers
3k views

In the attention mechanism, why are there separate weight matrices for the queries and keys?

To perform self attention over a collection of $n$ vectors stacked up into a matrix $X \in \mathbb{R}^{n \times d}$, we first obtain query, key, and value representations of these vectors via three ...
8 votes
0 answers
2k views

Zero-inflation with sklearn and continuous target?

My current data have quite a large amount of zeros (~60%), and I'm thinking of trying to implement a zero-inflated model of sorts with sklearn. While I've used zero-inflated poisson/negative binomial ...
8 votes
0 answers
1k views

The extrapolation problem: model selection, performance metrics, and improvement

Machine learning models are fit to a response variable within a given range. This leads to weak and sometimes disastrous performance when it comes to instances with an actual response variable outside ...
8 votes
0 answers
331 views

What machine learning and deep learning models are used for longitudinal studies (panel data)?

As the title suggests, I have a longitudinal database (also called panel data). (I have over 100.000 observations. The time period is X years. This means that for every year I have the values of the ...
