Unanswered Questions
8,652 questions with no upvoted or accepted answers
17 votes, 0 answers, 2k views
Rademacher complexity of logistic regression
Consider logistic regression. We have the logistic loss function,
$\phi: R\rightarrow [0,1], \phi(u)=\log(1+\exp(-u))$, which is Lipschitz, and we have the linear function class $F=\{f_w:R^d \...
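The Lipschitz claim in the excerpt above can be checked numerically. This is only a minimal sketch: the helper names `logistic_loss` and `max_slope` and the grid over $[-20, 20]$ are illustrative assumptions, not part of the question. It relies on the fact that $\phi'(u) = -1/(1+e^{u})$, so $|\phi'(u)| < 1$.

```python
import math

def logistic_loss(u: float) -> float:
    """Numerically stable evaluation of log(1 + exp(-u))."""
    # For u >= 0, exp(-u) <= 1, so log1p is safe to call directly.
    if u >= 0:
        return math.log1p(math.exp(-u))
    # For u < 0, rewrite as -u + log(1 + exp(u)) to avoid overflow in exp(-u).
    return -u + math.log1p(math.exp(u))

def max_slope(points):
    """Largest |phi(a) - phi(b)| / |a - b| over consecutive grid points."""
    return max(
        abs(logistic_loss(a) - logistic_loss(b)) / abs(a - b)
        for a, b in zip(points, points[1:])
    )

# Hypothetical evaluation grid: u in [-20, 20] with step 0.01.
grid = [-20 + 0.01 * i for i in range(4001)]
print(max_slope(grid))  # empirical slope bound; should not exceed 1
```

The empirical slope approaches 1 only as $u \to -\infty$, which is consistent with the loss being 1-Lipschitz.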
12 votes, 0 answers, 1k views
Why do we really need the concept of "Local" Rademacher complexity?
Recently, I have been studying High-Dimensional Statistics: A Non-Asymptotic Viewpoint by Martin J. Wainwright. In this book, the author uses a special complexity measure called Local ...
12 votes, 0 answers, 788 views
What approaches use multiple eigenvectors in graph spectral clustering?
Background: In Newman's PNAS 2006 paper Modularity and community structure in networks, the first eigenvector splits the graph in two clusters, and then each cluster can be further divided by ...
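The first bisection step mentioned in the excerpt above can be sketched as follows. This is a minimal illustration, assuming the modularity matrix $B_{ij} = A_{ij} - k_i k_j / (2m)$ and a split by the sign of the leading eigenvector. The function name `leading_eigenvector_split` and the toy graph (two triangles joined by one edge) are hypothetical and not taken from the question.

```python
import numpy as np

def leading_eigenvector_split(adjacency: np.ndarray) -> np.ndarray:
    """Bisect an undirected graph by the sign of the leading eigenvector
    of the modularity matrix B = A - k k^T / (2m)."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)          # k_i
    two_m = degrees.sum()            # 2m = sum of degrees
    B = A - np.outer(degrees, degrees) / two_m
    # B is symmetric, so eigh applies; pick the eigenvector of the largest eigenvalue.
    eigenvalues, eigenvectors = np.linalg.eigh(B)
    leading = eigenvectors[:, np.argmax(eigenvalues)]
    # The overall sign of an eigenvector is arbitrary; only the grouping matters.
    return leading >= 0

# Toy graph (an assumption for illustration): triangles {0,1,2} and {3,4,5}
# connected by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

labels = leading_eigenvector_split(A)
print(labels)  # nodes 0-2 share one label, nodes 3-5 the other
```

Applying the same split again inside each cluster gives the repeated-bisection scheme the excerpt describes. Using several eigenvectors at once is the alternative the question asks about, and it is not shown here.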
10 votes, 0 answers, 624 views
When using L2 regularization outside of linear regression, do the same MAP estimation assumptions hold?
Some context is shared below, and my question is bolded at the end.
MLE from observation noise
In the linear regression setting, we learn model weights $\mathbf{w}$ to make scalar predictions $\hat{y}...
10 votes, 0 answers, 2k views
Difference between Shapley values and SHAP
The paper on SHAP values gives a formula for the Shapley values in (4) and, apparently (?), for SHAP values in (8).
Still, I don't really understand the difference between Shapley and SHAP ...
10 votes, 0 answers, 377 views
Reinforcement *Model* Learning
Classical reinforcement learning (Q- or Sarsa-Learning) can be extended with models of the environment. These models are usually transition tables that contain the probability of arriving at a ...
10 votes, 0 answers, 3k views
Cluster analysis vs Factor analysis as a means for "grouping" variables or cases
I've noticed responses that at face value seem to be in contradiction with each other.
For instance, here @peter-flom writes
Short answer: Cluster analysis is about grouping subjects (e.g.
...
9 votes, 0 answers, 765 views
Why are the BART results so good in the 2016 Atlantic causal inference competition?
The well-known paper (Dorie, 2017) shows that BART performs remarkably well in causal inference. In my replication, the MSE of BART can be 40% lower than the MSE of other machine learning methods.
But all machine ...
9 votes, 1 answer, 179 views
Is there an ML or DL tool that can learn to detect periodically occurring patterns in a one-dimensional time series?
I am trying to create a tool that labels refrigerator temperature readings. A reading is taken every 5 minutes, and its label identifies whether or not it was taken while the refrigerator was ...
9 votes, 0 answers, 182 views
Territories from observations
I have a number of animal observations, and want to deduce the number of territories (i.e. the number of individual animals) from this.
More formally, the problem can be stated as follows: Each ...
8 votes, 0 answers, 3k views
In the attention mechanism, why are there separate weight matrices for the queries and keys?
To perform self attention over a collection of $n$ vectors stacked up into a matrix $X \in \mathbb{R}^{n \times d}$, we first obtain query, key, and value representations of these vectors via three ...
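The setup in the excerpt above can be sketched in NumPy. This is a minimal sketch under stated assumptions: the excerpt is truncated, so the three projection matrices `W_q`, `W_k`, `W_v`, the scaled dot-product form with $\sqrt{d_k}$, and all shapes are assumptions taken from the standard formulation rather than from the question itself.

```python
import numpy as np

def softmax(z: np.ndarray, axis: int = -1) -> np.ndarray:
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention over the rows of X."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v        # each is (n, d_k)
    scores = Q @ K.T / np.sqrt(K.shape[1])     # (n, n) attention logits
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ V, scores

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n, d, d_k = 4, 8, 3
X = rng.normal(size=(n, d))
W_q, W_k, W_v = (rng.normal(size=(d, d_k)) for _ in range(3))

out, scores = self_attention(X, W_q, W_k, W_v)

# The logits depend on W_q and W_k only through the product W_q @ W_k.T,
# a (d, d) matrix of rank at most d_k.
W_qk = W_q @ W_k.T
print(np.allclose(scores, X @ W_qk @ X.T / np.sqrt(d_k)))
```

The final check shows the algebraic fact behind the question: the attention logits equal $X (W_Q W_K^\top) X^\top / \sqrt{d_k}$, so the two matrices act together as one low-rank bilinear form. If `W_q` and `W_k` were tied, that form would be symmetric.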
8 votes, 0 answers, 2k views
Zero-inflation with sklearn and continuous target?
My current data have quite a large amount of zeros (~60%), and I'm thinking of trying to implement a zero-inflated model of sorts with sklearn. While I've used zero-inflated poisson/negative binomial ...
8 votes, 0 answers, 1k views
The extrapolation problem: model selection, performance metrics, and improvement
Machine learning models are fit to a response variable within a given range. This leads to weak and sometimes disastrous performance when it comes to instances with an actual response variable outside ...
8 votes, 0 answers, 331 views
What machine learning and deep learning models are used for longitudinal studies (panel data)?
As the title suggests, I have a longitudinal database (also called panel data). (I have over 100,000 observations. The time period is X years. This means that for every year I have the values of the ...
8 votes, 1 answer, 809 views
How to predict routes using clustering data
I've been working on a ship route prediction algorithm such that given the past and current trajectory of a ship I am able to estimate the future one. The trajectories are represented as a sequence of ...