Unanswered Questions

10,510 questions with no upvoted or accepted answers
11 votes
0 answers
2k views

AdaBoost implementation and tuning for high dimensional feature space in R

I am trying to apply the AdaBoost.M1 algorithm (trees as base learners) to a data set with a large feature space (~20,000 features) and ~100 samples in R. ...
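
No answer here yet; as a hypothetical sketch of the same setup (shallow trees as weak learners, far more features than samples), the scikit-learn analogue in Python would look roughly like this — the question itself is about R, and all data below is synthetic:

    # Synthetic stand-in: ~100 samples, ~20,000 features, binary labels.
    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 20000))
    y = rng.integers(0, 2, size=100)

    # Decision stumps keep the base learner weak; with p >> n, tune
    # n_estimators and learning_rate by cross-validation.
    clf = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=1),
        n_estimators=200,
        learning_rate=0.5,
        algorithm="SAMME",   # the multiclass generalization of AdaBoost.M1
    )
    print(cross_val_score(clf, X, y, cv=5).mean())
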
10 votes
1 answer
2k views

Why is my Keras model not learning image segmentation?

Edit: as it turns out, not even the model's original creator could successfully fine-tune it. This is most likely an implementation problem, or possibly related to the non-intuitive way in which the ...
9 votes
0 answers
3k views

Python : Feature Matching + Homography to find Multiple Objects

I'm trying to use OpenCV via Python to find multiple objects in a train image and match them with the key points detected in a query image. For my case, I'm trying to detect the tennis courts in the ...
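
A hedged sketch of the standard OpenCV pipeline the title describes (ORB keypoints, brute-force matching, RANSAC homography); file names are placeholders, and finding multiple instances usually means iterating — mask out each found homography's inliers, then search again:

    import cv2
    import numpy as np

    query = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder path
    train = cv2.imread("train.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder path

    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(query, None)
    kp2, des2 = orb.detectAndCompute(train, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # RANSAC recovers the single dominant homography; for multiple objects,
    # drop the inlier matches (mask == 1) and call findHomography again.
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
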
8 votes
0 answers
156 views

Training value neural network AlphaGo style

I have been trying to replicate the results obtained by AlphaGo following their supervised learning protocol. The papers specify that they use a network that has two heads: a value head that predicts ...
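
The two-headed layout the papers describe can be sketched in Keras with the functional API; everything below (input planes, trunk depth, widths) is an illustrative assumption, not the papers' exact architecture:

    from tensorflow import keras
    from tensorflow.keras import layers

    board = keras.Input(shape=(19, 19, 17))       # assumed input feature planes
    x = board
    for _ in range(4):                            # shared convolutional trunk
        x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)

    # Policy head: a distribution over the 19x19 move grid.
    p = layers.Conv2D(2, 1, activation="relu")(x)
    p = layers.Flatten()(p)
    policy = layers.Dense(19 * 19, activation="softmax", name="policy")(p)

    # Value head: a scalar in [-1, 1] predicting the game outcome.
    v = layers.Conv2D(1, 1, activation="relu")(x)
    v = layers.Flatten()(v)
    v = layers.Dense(64, activation="relu")(v)
    value = layers.Dense(1, activation="tanh", name="value")(v)

    model = keras.Model(board, [policy, value])
    model.compile(optimizer="sgd",
                  loss={"policy": "categorical_crossentropy", "value": "mse"})
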
8 votes
3 answers
839 views

Chi-square as an evaluation metric for nonlinear machine learning regression models

I am using machine learning models to predict an ordinal variable (values: 1,2,3,4, and 5) using 7 different features. I posed this as a regression problem, so the final outputs of a model are ...
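
One way to make a chi-square comparison concrete here (a sketch with made-up numbers, not the asker's setup): round the regression outputs to the ordinal levels 1–5 and compare the predicted level counts against the observed ones:

    import numpy as np
    from scipy.stats import chisquare

    y_true = np.array([1, 2, 2, 3, 4, 5, 3, 2, 4, 5])
    y_pred = np.array([1.2, 2.4, 1.8, 3.1, 3.4, 4.6, 2.8, 2.2, 4.3, 4.9])

    levels = np.arange(1, 6)
    rounded = np.clip(np.rint(y_pred), 1, 5)      # map predictions onto 1..5
    pred_counts = np.array([(rounded == k).sum() for k in levels])
    true_counts = np.array([(y_true == k).sum() for k in levels])

    # Chi-square goodness of fit between predicted and observed level counts.
    stat, p = chisquare(f_obs=pred_counts, f_exp=true_counts)
    print(stat, p)
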
8 votes
1 answer
448 views

How to predict advantage value in deep reinforcement learning

I'm currently working on a collection of reinforcement learning algorithms: https://github.com/lhk/rl_gym For deep Q-learning, you need to calculate the Q-values that should be predicted by your network. There ...
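
One widely used advantage estimator (a sketch, not the linked repo's code) is the one-step TD error, A(s_t, a_t) ≈ r_t + γ·V(s_{t+1}) − V(s_t), which lets the network predict only state values:

    import numpy as np

    def advantage_estimates(rewards, values, next_values, dones, gamma=0.99):
        """One-step advantage (TD error) for a batch of transitions."""
        rewards = np.asarray(rewards, dtype=np.float32)
        values = np.asarray(values, dtype=np.float32)
        next_values = np.asarray(next_values, dtype=np.float32)
        dones = np.asarray(dones, dtype=np.float32)
        # Terminal transitions (done == 1) get no bootstrap term.
        targets = rewards + gamma * next_values * (1.0 - dones)
        return targets - values

    print(advantage_estimates([1.0, 0.0], [0.5, 0.2], [0.4, 0.0], [0, 1]))
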
8 votes
1 answer
4k views

Decision Trees - C4.5 vs CART - rule sets

When I read the scikit-learn user manual about Decision Trees, they mentioned that CART (Classification and Regression Trees) is very similar to C4.5, but it differs in that it supports numerical ...
7 votes
2 answers
140 views

Data transformations in hierarchical classification

I am building a hierarchical text classifier using the Local Classifier Per Parent Node (LCPN) approach with the 'siblings' policy, as described in the paper "A survey of hierarchical classification across ...
7 votes
1 answer
9k views

How is WordPiece tokenization helpful to effectively deal with rare words problem in NLP?

I have seen that NLP models such as BERT utilize WordPiece for tokenization. In WordPiece, we split tokens like playing into play and ##ing. It is mentioned that it covers a wider spectrum of Out-Of-...
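
The effect is easy to demonstrate (a sketch assuming the Hugging Face transformers package is available): a word missing from the vocabulary is split into known pieces instead of collapsing to a single [UNK] token:

    from transformers import BertTokenizer

    tok = BertTokenizer.from_pretrained("bert-base-uncased")
    print(tok.tokenize("I have a new GPU!"))
    # ['i', 'have', 'a', 'new', 'gp', '##u', '!']  -- 'gpu' is not in the
    # vocabulary, so it falls back to the known pieces 'gp' and '##u'.
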
7 votes
0 answers
3k views

Tensorflow v1 Dataset API AttributeError with ndim

I'd like to build a pipeline that keeps both the GPU and CPU busy. The dataset is about 10,000 data points with 4 descriptive variables for a regression problem. ...
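
The question targets the TF v1 Dataset API; for reference, a minimal modern tf.data pipeline for a problem of this shape (synthetic data below) looks like this, with prefetch overlapping CPU preprocessing and GPU compute:

    import numpy as np
    import tensorflow as tf

    # Synthetic stand-in: ~10,000 points, 4 descriptive variables.
    X = np.random.randn(10000, 4).astype("float32")
    y = np.random.randn(10000, 1).astype("float32")

    ds = (tf.data.Dataset.from_tensor_slices((X, y))
            .shuffle(10000)
            .batch(64)
            .prefetch(tf.data.AUTOTUNE))  # tf.data.experimental.AUTOTUNE on older TF
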
7 votes
0 answers
2k views

Using the Python Keras multi_gpu_model with LSTM / GRU to predict Timeseries data

I'm having an issue with Python Keras LSTM / GRU layers combined with multi_gpu_model. When I use a single GPU, the predictions work correctly ...
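
multi_gpu_model has since been removed from Keras; its replacement is tf.distribute.MirroredStrategy, sketched here with an illustrative GRU model (input shapes are assumptions):

    import tensorflow as tf
    from tensorflow.keras import layers

    strategy = tf.distribute.MirroredStrategy()
    with strategy.scope():
        model = tf.keras.Sequential([
            layers.GRU(64, input_shape=(30, 8)),   # (timesteps, features) assumed
            layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")
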
7 votes
0 answers
1k views

Multivariate, multistep forecasting with LSTM

I want to use an RNN with LSTM to forecast multiple steps into the future, based on multiple inputs. I have some ideas for different ways to approach this, but I'm afraid I'm missing the "right way" ...
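
The simplest of those approaches, sketched with assumed dimensions, is a direct multi-output model: an LSTM reads (timesteps, features) and a Dense layer emits every forecast step at once. Alternatives include recursive one-step forecasting and encoder-decoder (seq2seq) models.

    from tensorflow import keras
    from tensorflow.keras import layers

    n_steps_in, n_features, n_steps_out = 24, 5, 12   # assumed dimensions

    model = keras.Sequential([
        layers.LSTM(64, input_shape=(n_steps_in, n_features)),
        layers.Dense(n_steps_out),   # one output unit per future step
    ])
    model.compile(optimizer="adam", loss="mse")
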
7 votes
0 answers
2k views

Fine tuning accuracy lower than Raw Transfer Learning Accuracy

I've used transfer learning on Inception V3 with ImageNet weights, using Keras with the TensorFlow backend on Python 2.7, to create an image classifier. I first extracted and saved the bottleneck features from ...
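
The usual two-phase recipe, sketched with today's tf.keras (the question used the older standalone Keras; class count and shapes are assumptions). A common cause of fine-tuning underperforming plain transfer learning is too high a learning rate in phase 2:

    from tensorflow import keras
    from tensorflow.keras import layers

    base = keras.applications.InceptionV3(
        include_top=False, weights="imagenet", pooling="avg")
    base.trainable = False                     # phase 1: frozen feature extractor

    inputs = keras.Input(shape=(299, 299, 3))
    x = base(inputs, training=False)           # keep BatchNorm in inference mode
    outputs = layers.Dense(10, activation="softmax")(x)   # 10 classes assumed
    model = keras.Model(inputs, outputs)
    model.compile(optimizer=keras.optimizers.Adam(1e-3),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    # ...train the head, then phase 2: unfreeze with a much smaller LR.
    base.trainable = True
    model.compile(optimizer=keras.optimizers.Adam(1e-5),
                  loss="categorical_crossentropy", metrics=["accuracy"])
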
7 votes
0 answers
966 views

ALS in Spark: what loss function is it minimizing?

I've been playing with the MovieLens ratings dataset, comparing Spark's ALS against a manual implementation of ALS with the same hyperparameters. I'd like to know the exact loss function in order to make ...
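
As I read the Spark documentation (hedged, since the question never got an answer): explicit-feedback ALS minimizes a regularized squared error over the observed ratings Omega, with the "ALS-WR" scaling that weights each user's and item's penalty by their rating counts n_u and m_i:

    \min_{X,\,Y} \sum_{(u,i) \in \Omega} \left( r_{ui} - x_u^\top y_i \right)^2
      + \lambda \left( \sum_u n_u \lVert x_u \rVert^2 + \sum_i m_i \lVert y_i \rVert^2 \right)
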
7 votes
0 answers
523 views

Differences between LSQR and FTRL when working with very sparse data

I have a dataset of 2M instances with millions of very, very sparse dummy variables created using the hashing trick. ...
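
For context, the "hashing trick" here (sketched with placeholder feature strings) maps arbitrarily many dummy variables into a fixed-width sparse matrix, e.g. with scikit-learn's FeatureHasher:

    from sklearn.feature_extraction import FeatureHasher

    hasher = FeatureHasher(n_features=2**20, input_type="string")
    X = hasher.transform([["user=123", "ad=xyz"],      # placeholder features
                          ["user=456", "ad=abc"]])
    print(X.shape)   # (2, 1048576), stored as a scipy sparse matrix
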
