
Unanswered Questions

327 questions with no upvoted or accepted answers
6 votes
0 answers
66 views

What are some popular but outdated or ineffective practices in data science?

I was taught stepwise feature selection (like forward and backward selection) during college, and at the time, it seemed like a really effective way to pick features. But recently I have been reading ...
6 votes
1 answer
218 views

Predicting change of shapes/coordinates

I'm trying to find a way to predict/calculate how a shape (e.g. outline of a glacier) will change in the future—based on its history (previous shape) and additional factors (e.g. Δtemperature). In my ...
5 votes
2 answers
4k views

Fix first two levels of decision tree?

I am trying to build a regression tree with 70 attributes where the business team wants to fix the first two levels, namely country and product type. To achieve this, I have two proposals: Build a ...
4 votes
2 answers
279 views

Support Vector Regression trained with data sets

I have been searching the internet and papers for a long time for answers to some simple questions. Am I able to train a Support Vector Regression algorithm with different data sets? If yes, how is ...
3 votes
0 answers
31 views

Suppose one category in a variable creates data leakage; can we use the other categories in the same variable as dummies to predict?

We are predicting conversion. Conversion means a customer converted from paying one-off to paying regularly (subscribing). If one feature is a categorical feature "Activity", consisting of 15+ ...
3 votes
0 answers
59 views

How can I link tasks using machine learning / AI based on historical task sequences?

I'm working on an AI model to predict dependency links between tasks for industrial planning, based on historical project data. I have two tables: Task Table (15 sheets, one sheet = one ...
3 votes
0 answers
29 views

History that led to the word "predict" being used for the application of a model on data

Background: The framework scikit-learn uses "predict" for the application of a model on (new) input data, and I have seen many people use that term. In the scientific papers that I have read (...
3 votes
1 answer
615 views

Neural Network regression negative performance

I have a problem with the performance of a multi-layer perceptron regressor (neural network) and I cannot figure out why. Task: I am trying to improve a time series prediction. I have predictions of a ...
3 votes
3 answers
305 views

How to decide whom to market to? Clustering or Decision Tree?

I am working with a dataset that has enough observations and ~10 variables: half of the variables are numeric, the other half are categorical with 2-3 levels (demographics), one ID ...
3 votes
1 answer
125 views

How to incorporate the uncertainty of the model coefficients in the prediction interval of a multiple linear regression

I'm dealing with modeling small experimental data sets. As most experimental work does not generate thousands of samples, but rather a handful, I need to be inventive about how to deal with this small ...
3 votes
0 answers
46 views

Serializing a trained classification model into a set of actionable insights

I'm looking for ways to convert a trained classification model into a list of insights based on the resulting parameters of the model. To make an example, let's assume we trained a decision tree to ...
3 votes
1 answer
127 views

How can I improve the accuracy of my model? (Cab Cancellation Prediction)

I am trying to predict, based on several parameters like trip type, car type, source of booking, start time, lead time (start - book) and a few other params, whether or not a customer will cancel. From ...
3 votes
0 answers
58 views

Improving a simple trig model

I have some data which I know is well approximated as a trig function, and I can fit it with scipy.optimize.curve_fit as follows: ...
3 votes
1 answer
281 views

Model Guardrails

Suppose I am building a machine learning model for an application where I do not need to make a prediction on all new samples, and given a new sample, it is better to make no prediction at all when ...
3 votes
0 answers
56 views

Is linear regression on the trees of XGBoost (rather than taking their mean) useful/popular?

Given training data $(\underline{x}_1, y_1), \ldots, (\underline{x}_N, y_N)$, one can choose from a variety of ensemble methods for trees. These algorithms output a set of trees $T_1, \ldots, T_n$, and then the ...
