Highly scored unanswered questions - Data Science Stack Exchange

6 votes

0 answers

66 views

What are some popular but outdated or ineffective practices in data science?

I was taught stepwise feature selection (like forward and backward selection) during college, and at the time, it seemed like a really effective way to pick features. But recently I have been reading ...

Guna

480

asked Apr 23 at 16:21

6 votes

1 answer

218 views

Predicting change of shapes/coordinates

I'm trying to find a way to predict/calculate how a shape (e.g. outline of a glacier) will change in the future—based on its history (previous shape) and additional factors (e.g. Δtemperature). In my ...

CommunityBot

1

modified Feb 5 at 15:05

5 votes

2 answers

4k views

Fix first two levels of decision tree?

I am trying to build a regression tree with 70 attributes where the business team wants to fix the first two levels, namely country and product type. To achieve this, I have two proposals: Build a ...

CommunityBot

1

modified Feb 10 at 9:03

4 votes

2 answers

279 views

Support Vector Regression trained with data sets

I have been searching the internet and the literature for a long time for answers to some simple questions. Can I train a Support Vector Regression algorithm with different data sets? If yes, how is ...

CommunityBot

1

modified Jan 22 at 6:02

3 votes

0 answers

31 views

Suppose one category in a variable creates data leakage; can we use the other categories of the same variable as dummies to predict?

We are predicting conversion. Conversion means that a customer moved from one-off payments to regular payments (a subscription). If one feature is a categorical feature "Activity", consisting of 15+ ...

user30388975

31

asked Apr 28 at 5:31

3 votes

0 answers

59 views

How can I link tasks using machine learning / AI based on historical task sequences?

I'm working on an AI model to predict dependency links between tasks for industrial planning, based on historical project data. I have two tables: Task Table (15 sheets, one sheet = one ...

user180417

101

modified Apr 15 at 7:35

3 votes

0 answers

29 views

History that led to the word "predict" being used for the application of a model on data

Background: The framework scikit-learn uses "predict" for the application of a model to (new) input data, and I have seen many people use that term. In the scientific papers that I have read (...

Make42

842

modified Mar 8, 2021 at 13:13

3 votes

1 answer

615 views

Neural Network regression negative performance

I have a problem with the performance of a multi-layer perceptron regressor (neural network) and I cannot figure out why. Task: I am trying to improve a time series prediction. I have predictions of a ...

CommunityBot

1

modified Apr 6 at 7:03

3 votes

3 answers

305 views

How to decide who to market? Clustering or Decision Tree?

I am working with a dataset that has enough observations and ~10 variables: half of the variables are numeric, the other half are categorical with 2-3 levels (demographics), one ID ...

CommunityBot

1

modified Feb 21 at 22:03

3 votes

1 answer

125 views

How to incorporate the uncertainty of the model coefficients in the prediction interval of a multiple linear regression

I'm dealing with modeling small experimental data sets. As most experimental work does not generate thousands of samples, but rather a handful, I need to be inventive about how to deal with this small ...

CommunityBot

1

modified May 4 at 1:09

3 votes

0 answers

46 views

Serializing a trained classification model into a set of actionable insights

I'm looking for ways to convert a trained classification model into a list of insights based on the resulting parameters of the model. To make an example, let's assume we trained a decision tree to ...

ozz1k

83

asked Mar 9, 2020 at 0:15

3 votes

1 answer

127 views

How can I improve the accuracy of my model? (Cab Cancellation Prediction)

I am trying to predict, based on several parameters like trip type, car type, source of booking, start time, lead time (start - book) and a few other params, whether or not a customer will cancel. From ...

CommunityBot

1

modified Mar 28 at 3:04

3 votes

0 answers

58 views

Improving a simple trig model

I have some data which I know is well approximated as a trig function, and I can fit it with scipy.optimize.curve_fit as follows: ...

Brian Spiering

22.9k

modified Oct 15, 2020 at 13:10

3 votes

1 answer

281 views

Model Guardrails

Suppose I am building a machine learning model for an application where I do not need to make a prediction on all new samples, and given a new sample, it is better to make no prediction at all when ...

CommunityBot

1

modified May 2 at 3:09
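
The excerpt above is truncated, so the exact condition under which no prediction is preferable is not visible here. As a minimal sketch, assuming the guardrail is "abstain when the model's top class probability is below a threshold", the idea could look like the following. The function name `predict_or_abstain` and the default threshold of 0.8 are hypothetical and not taken from the question.

```python
from typing import Optional, Sequence


def predict_or_abstain(class_probs: Sequence[float],
                       threshold: float = 0.8) -> Optional[int]:
    """Return the index of the most probable class, or None to abstain.

    Assumption: class_probs are the model's predicted class probabilities
    for one sample, and low top probability means "do not predict".
    """
    if not class_probs:
        # Nothing to decide on, so abstain.
        return None
    # Index of the largest probability.
    best = max(range(len(class_probs)), key=lambda i: class_probs[i])
    # Predict only when the model is confident enough.
    return best if class_probs[best] >= threshold else None


confident = predict_or_abstain([0.05, 0.95])   # confident sample
uncertain = predict_or_abstain([0.55, 0.45])   # ambiguous sample
```

With these inputs the first call returns a class index and the second returns `None`. In practice the threshold would be tuned on held-out data, trading coverage (how many samples get a prediction) against accuracy on the covered samples.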

3 votes

0 answers

56 views

Is linear regression on the trees of XGBoost (rather than taking their mean) useful/popular?

Given training data $(\underline{x}_1, y_1), \ldots, (\underline{x}_N, y_N)$, one can choose a variety of ensemble methods for trees. These algorithms output a set of trees $T_1, \ldots, T_n$, and then the ...

Andrew NC

131

asked Oct 30, 2018 at 0:44
