
Tags

A tag is a keyword or label that categorizes your question with other, similar questions. Using the right tags makes it easier for others to find and answer your question.

The relationship between cause and effect.
Requests for datasets are off-topic on this site. Use this tag for questions concerning creating, processing, or maintaining datasets.
1927 questions
Signals situations where one is concerned about achieving intended power and size when more than one hypothesis test is performed.
1902 questions
An area of machine learning concerned with learning hierarchical representations of the data, mainly done with deep neural networks.
1857 questions
Descriptive statistics summarize features of a sample, such as the mean and standard deviation, the median and quartiles, and the maximum and minimum. With multiple variables, may include correlations and crosst…
A machine-learning library for Python. Use this tag for any on-topic question that (a) involves scikit-learn either as a critical part of the question or expected answer, & (b) is not just about how t…
The residuals of a model are the actual values minus the predicted values. Many statistical models make assumptions about the error, which is estimated by the residuals.
1821 questions
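As a minimal sketch of the definition above (actual minus predicted), the snippet below computes residuals for a hypothetical toy dataset. The numbers and the function name `residuals` are illustrative assumptions, not taken from any particular model or library.

```python
# Hypothetical toy data: observed responses and the predictions
# of some already-fitted model (values are made up for illustration).
actual = [3.0, 5.0, 7.5, 9.0]
predicted = [2.5, 5.5, 7.0, 9.5]


def residuals(actual, predicted):
    """Return the residuals: actual value minus predicted value, per observation."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [a - p for a, p in zip(actual, predicted)]


res = residuals(actual, predicted)
print(res)
```

A positive residual means the model under-predicted that observation; a negative one means it over-predicted.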
Cox proportional hazards regression is a semi-parametric method for survival analysis. No distributional form needs to be assumed, only that the effect of one-unit increase in a covariate is a constan…
Covariance is a quantity used to measure the strength and direction of the linear relationship between two variables. The covariance is unscaled, and thus often difficult to interpret; when scaled by th…
Autocorrelation (serial correlation) is the correlation of a series of data with itself at some lag. This is an important topic in time series analysis.
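To make "correlation of a series with itself at some lag" concrete, here is a rough sketch of one common sample estimator: the lag-k autocovariance divided by the lag-0 autocovariance, both centered on the overall mean. The function name and the alternating toy series are assumptions made for illustration; other estimators use slightly different normalizations.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`.

    Uses the overall mean for centering and the full-series sum of
    squares as the denominator (one common convention among several).
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    num = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return num / denom


# Hypothetical series that flips sign at every step.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r0 = autocorrelation(alternating, 0)
r1 = autocorrelation(alternating, 1)
print(r0, r1)
```

For this series the lag-1 value is strongly negative, reflecting that each observation tends to be followed by one on the opposite side of the mean.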
Probability density function (PDF) of a continuous random variable gives the relative probability for each of its possible values. Use this tag for discrete probability mass functions (PMFs) too.
1721 questions
Usage and meaning of specific technical words/concepts in statistics.
1717 questions
Power is a property of a hypothesis testing method: the probability of rejecting the null hypothesis given that it is false, i.e. the probability of not making a type II error. The power of a test depends o…
Missing data arise when a dataset has gaps, i.e., is not complete. It is important to account for this when performing an analysis or test.
Markov Chain Monte Carlo (MCMC) refers to a class of simulation methods for generating samples from a complex target distribution by generating random numbers from a Markov Chain whose stationary dist…
1658 questions
The standard error is the standard deviation of the sampling distribution of a statistic calculated from a sample. Standard errors are often required when forming confidence intervals or testing hypotheses about …
1631 questions
Prediction of unknown random quantities, using a statistical model.
1603 questions
Given a random variable $X$ that arises from a parameterized distribution $F(X;θ)$, the likelihood is defined as proportional to the probability of observed data as a function of $θ$: $\operatorname{L…
1591 questions
Methods focused on contrasting and combining results from different studies, in the hope of increasing precision and external validity.
Use this tag for any on-topic question that (a) involves Stata as a critical part of the question or expected answer, and (b) is not just about how to use Stata.
1529 questions
A regularization method for regression models that shrinks coefficients towards zero, making some of them equal to zero. Thus lasso performs feature selection.
1516 questions
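One way to see why the lasso sets some coefficients exactly to zero: in the special case of an orthonormal design matrix, with the objective written as half the residual sum of squares plus `lam` times the L1 norm, the lasso solution is the soft-thresholded least-squares estimate. The sketch below assumes that special case; the coefficient values and the penalty `lam` are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink `value` toward zero by `lam`; return 0.0 if |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


# Hypothetical least-squares coefficients under an assumed orthonormal design.
ols_coefficients = [2.5, -0.3, 0.8, -1.5]
lam = 1.0  # assumed penalty strength

lasso_coefficients = [soft_threshold(b, lam) for b in ols_coefficients]
print(lasso_coefficients)
```

Coefficients whose magnitude is at most `lam` drop out entirely, which is the feature-selection behavior described above. General (non-orthonormal) designs need an iterative solver such as coordinate descent.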
For statistical topics which involve the assumption of linearity, for example, linear regression or linear mixed models, or for the discussion of linear algebra as applied to statistics.
1515 questions
A family of algorithms combining weakly predictive models into a strongly predictive model. The most common approach is called gradient boosting, and the most commonly used weak models are classificat…
Convolutional Neural Networks are a type of neural network in which only subsets of possible connections between layers exist to create overlapping regions. They are commonly used for visual tasks.
1487 questions
A binary variable takes one of two values, typically coded as "0" and "1".
1477 questions
Inclusion of additional constraints (typically a penalty for complexity) in the model fitting process. Used to prevent overfitting / enhance predictive accuracy.
1445 questions
A strictly stationary process (or time series) is one whose joint distribution is constant over time shifts. A weakly stationary (or covariance stationary) process or series is one whose mean and cova…
1441 questions
A stochastic process describes the evolution of random variables/systems over time and/or space and/or any other index set. It has applications in areas such as econometrics, weather, signal processin…
1436 questions
The science of statistics applied to the analysis of biological or medical data.
A formalization of relationships between stochastically (randomly) related variables in the form of mathematical equations. DO NOT USE THIS TAG BY ITSELF: always include a more specific one.
1425 questions
An outlier is an observation that appears to be unusual or not well described relative to a simple characterization of a dataset. A discomfiting possibility is that these data come from a different po…
1375 questions
Parameters associated with the particular levels of a covariate are sometimes called the “effects” of the levels. If the levels that are observed represent a random sample from the set of all possible…
1372 questions
Using (pseudo-)random numbers and the Law of Large Numbers to simulate the random behavior of a real system.
1344 questions
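A small illustrative simulation, assuming a toy system of two fair six-sided dice: by the Law of Large Numbers, the fraction of simulated rolls summing to 7 should approach the true probability of 1/6 as the number of rolls grows. The function name, seed, and roll count are arbitrary choices for this sketch.

```python
import random


def estimate_probability_of_seven(num_rolls, seed=0):
    """Estimate P(sum of two fair dice == 7) by simulating `num_rolls` rolls."""
    rng = random.Random(seed)  # local pseudo-random generator for reproducibility
    hits = 0
    for _ in range(num_rolls):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / num_rolls


estimate = estimate_probability_of_seven(100_000)
print(estimate, "vs exact", 1 / 6)
```

Increasing `num_rolls` tightens the estimate; its standard error shrinks roughly in proportion to one over the square root of the number of rolls.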
Use this tag for any on-topic question that (a) involves MATLAB either as a critical part of the question or expected answer, and (b) is not just about how to use MAT…
1291 questions
'Classification And Regression Trees', also sometimes called 'decision trees'. CART is a popular machine learning technique, and it forms the basis for techniques like random forests and common implem…
1271 questions
Factor analysis is a dimensionality reduction latent variable technique which replaces inter-correlating variables with a smaller number of continuous latent variables called factors. The factors are …
1241 questions