Unanswered Questions

8,651 questions with no upvoted or accepted answers
17 votes
0 answers
2k views

Rademacher complexity of logistic regression

Consider logistic regression. We have the logistic loss function, $\phi: R\rightarrow [0,1], \phi(u)=\log(1+\exp(-u))$, which is Lipschitz, and we have the linear function class $F=\{f_w:R^d \...
12 votes
0 answers
1k views

Why do we really need the concept of "local" Rademacher complexity?

Recently, I have been studying High-Dimensional Statistics: A Non-Asymptotic Viewpoint written by Martin J. Wainwright. In this book, the author uses a special complexity measure which is called Local ...
12 votes
0 answers
788 views

What approaches use multiple eigenvectors in graph spectral clustering?

Background: In Newman's PNAS 2006 paper "Modularity and community structure in networks", the first eigenvector splits the graph into two clusters, and then each cluster can be further divided by ...
10 votes
0 answers
624 views

When using L2 regularization outside of linear regression, do the same MAP estimation assumptions hold?

Some context is shared below, and my question is bolded at the end. MLE from observation noise: In the linear regression setting, we learn model weights $\mathbf{w}$ to make scalar predictions $\hat{y}...
10 votes
0 answers
2k views

Difference between Shapley values and SHAP

The paper regarding the SHAP value gives a formula for the Shapley values in (4) and, apparently (?), for SHAP values in (8). Still, I don't really understand the difference between Shapley and SHAP ...
10 votes
0 answers
377 views

Reinforcement *Model* Learning

Classical reinforcement learning (Q- or Sarsa-Learning) can be extended with models of the environment. These models are usually transition tables that contain the probability of arriving at a ...
10 votes
0 answers
3k views

Cluster analysis vs Factor analysis as a means for "grouping" variables or cases

I've noticed responses that at face value seem to be in contradiction with each other. For instance, here @peter-flom writes Short answer: Cluster analysis is about grouping subjects (e.g. ...
9 votes
0 answers
765 views

How come the BART results are this good at the 2016 Atlantic causal inference competition?

The famous paper Dorie, 2017 shows that BART performs remarkably well in causal inference. In my replication, the MSE of BART can be 40% lower than the MSE of other machine learning methods. But all machine ...
9 votes
1 answer
179 views

Is there an ML or DL tool that can learn to detect periodically occurring patterns in a one-dimensional time series?

I am trying to create a tool that labels refrigerator temperature readings. A reading is taken every 5 minutes, and its label identifies whether or not it was taken while the refrigerator was ...
9 votes
0 answers
182 views

Territories from observations

I have a number of animal observations, and want to deduce the number of territories (i.e. the number of individual animals) from this. More formally, the problem can be stated as follows: Each ...
8 votes
0 answers
3k views

In the attention mechanism, why are there separate weight matrices for the queries and keys?

To perform self attention over a collection of $n$ vectors stacked up into a matrix $X \in \mathbb{R}^{n \times d}$, we first obtain query, key, and value representations of these vectors via three ...
8 votes
0 answers
2k views

Zero-inflation with sklearn and continuous target?

My current data have quite a large amount of zeros (~60%), and I'm thinking of trying to implement a zero-inflated model of sorts with sklearn. While I've used zero-inflated poisson/negative binomial ...
8 votes
0 answers
1k views

The extrapolation problem: model selection, performance metrics, and improvement

Machine learning models are fit to a response variable within a given range. This leads to weak and sometimes disastrous performance when it comes to instances with an actual response variable outside ...
8 votes
0 answers
331 views

What machine learning and deep learning models are used for longitudinal studies (panel data)?

As the title suggests, I have a longitudinal database (also called panel data). (I have over 100,000 observations. The time period is X years. This means that for every year I have the values of the ...
8 votes
1 answer
809 views

How to predict routes using clustering data

I've been working on a ship route prediction algorithm such that given the past and current trajectory of a ship I am able to estimate the future one. The trajectories are represented as a sequence of ...
