





















































ISYE 6501 Final Exam Questions and Answers (Solved Papers)
Typology: Exams
1-norm - Correct Answers ✅Similar to rectilinear distance; measures the length of a vector from the origin along the coordinate axes. If $z = (z_1, z_2, \ldots, z_m)$ is a vector in an $m$-dimensional space, then its 1-norm is $|z_1| + |z_2| + \cdots + |z_m| = \sum_{i=1}^{m} |z_i|$.
A/B Testing - Correct Answers ✅Testing two alternatives to see which one performs better.
2-norm - Correct Answers ✅Similar to Euclidean distance; measures the straight-line length of a vector from the origin. If $z = (z_1, z_2, \ldots, z_m)$ is a vector in an $m$-dimensional space, then its 2-norm is the square root of the sum of the squared absolute values: $\sqrt{\sum_{i=1}^{m} |z_i|^2}$.
Accuracy - Correct Answers ✅Fraction of data points correctly classified by a model; equal to $(TP + TN) / (TP + FP + TN + FN)$.
Action - Correct Answers ✅In ARENA, something that is done to an entity.
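To make the two norm definitions concrete, here is a minimal Python sketch. The function names `one_norm` and `two_norm` are illustrative choices, not part of the course material.

```python
import math

def one_norm(z):
    """Sum of absolute values of the components (rectilinear length)."""
    return sum(abs(v) for v in z)

def two_norm(z):
    """Square root of the sum of squared components (straight-line length)."""
    return math.sqrt(sum(abs(v) ** 2 for v in z))

z = [3.0, -4.0]
print(one_norm(z))  # |3| + |-4| = 7.0
print(two_norm(z))  # sqrt(9 + 16) = 5.0
```

For the same vector, the 1-norm is never smaller than the 2-norm, since moving along the axes is at least as long as the straight line.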
Additive Seasonality - Correct Answers ✅Seasonal effect that is added to a baseline value (for example, "the temperature in June is 10 degrees above the annual baseline").
Adjusted R-squared - Correct Answers ✅Variant of $R^2$ that encourages simpler models by penalizing the use of too many variables.
AIC - Correct Answers ✅Akaike information criterion. Model selection technique that trades off between model fit and model complexity. When comparing models, the model with lower AIC is preferred. Generally penalizes complexity less than BIC.
Algorithm - Correct Answers ✅Step-by-step procedure designed to carry out a task.
Analysis of Variance/ANOVA - Correct Answers ✅Statistical method for dividing the variation in observations among different sources.
Approximate dynamic program - Correct Answers ✅Dynamic programming model where the value functions are approximated.
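The AIC entry above says it generally penalizes complexity less than BIC. A small sketch using the standard textbook formulas, $AIC = 2k - 2\ln L$ and $BIC = k \ln n - 2\ln L$, shows why. The formulas are not stated in these notes, and the log-likelihood, parameter count, and sample size below are made-up illustrative values.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln(L), with k fitted parameters."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L), with n data points."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical fit: log-likelihood -50 with 3 parameters on 100 data points.
print(aic(-50.0, 3))       # 106.0
print(bic(-50.0, 3, 100))  # 100 + 3 * ln(100)
```

Each extra parameter costs 2 under AIC and $\ln n$ under BIC, so BIC is the stricter of the two whenever $n \geq 8$.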
interchangeable with "feature", and often with "covariate" or "predictor". In the standard tabular format, a column of data.
Autoregression - Correct Answers ✅Regression technique using past values of time series data as predictors of future values.
Autoregressive integrated moving average (ARIMA) - Correct Answers ✅Time series model that uses differences between observations when data is nonstationary. Also called Box-Jenkins.
Backward elimination - Correct Answers ✅Variable selection process that starts with all variables and then iteratively removes the least-immediately-relevant variables from the model.
Balanced Design - Correct Answers ✅Set of combinations of factor values across multiple factors that has the same number of runs for all combinations of levels of one or more factors.
Balking - Correct Answers ✅An entity arrives at the queue, sees the size of the line (or some other attribute), and decides to leave the system.
Bayes' theorem/Bayes' rule - Correct Answers ✅Fundamental rule of conditional probability: $P(A \mid B) = P(B \mid A)\,P(A) / P(B)$.
Bayesian Information criterion (BIC) - Correct Answers ✅Model selection technique that trades off model fit and model complexity. When comparing models, the model with lower BIC is preferred. Generally penalizes complexity more than AIC.
Bayesian Regression - Correct Answers ✅Regression model that incorporates estimates of how coefficients and error are distributed.
Bellman's Equation - Correct Answers ✅Equation used in dynamic programming that ensures optimality of a solution.
Bernoulli Distribution - Correct Answers ✅Discrete probability distribution where the outcome is binary, either 0 or 1. Often, 1 represents success and 0 represents failure. The probability of the outcome being 1 is $p$ and the probability of the outcome being 0 is $q = 1 - p$, where $p$ is between 0 and 1.
Bias - Correct Answers ✅Systematic difference between a true parameter of a population and its estimate.
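As a worked example of Bayes' rule from the entry above, the sketch below computes $P(A \mid B)$, expanding $P(B)$ with the law of total probability. The function name and the numbers (a 1% prior, a 90% true positive rate, a 5% false positive rate) are hypothetical values chosen for illustration.

```python
def posterior(p_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) = P(B|A) P(A) / P(B), with P(B) from the law of total probability."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical numbers: A is rare (1%), B is evidence that is usually
# present when A holds (90%) but sometimes present otherwise (5%).
print(posterior(0.01, 0.9, 0.05))  # 0.009 / 0.0585, roughly 0.154
```

Even with strong evidence, the posterior stays low here because the prior $P(A)$ is small.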
reasonable ranges of variability ("whiskers"), and points (possible outliers) outside those ranges.
Box-Cox Transformation - Correct Answers ✅Transformation of a non-normally-distributed response so that it more closely follows a normal distribution.
Branching - Correct Answers ✅Splitting a set of data into two or more subsets, to each be analyzed separately.
CART - Correct Answers ✅Classification and regression trees.
Categorical Data - Correct Answers ✅Data that classifies observations without quantitative meaning (for example, colors of cars) or where quantitative amounts are categorized (for example, "0-10, 11-20, ...").
Causation - Correct Answers ✅Relationship in which one thing makes another happen (i.e., one thing causes another).
Chance Constraint - Correct Answers ✅A probability-based constraint. For example, a standard linear constraint might be 𝐴x≤𝑏. A similar chance constraint might be Pr (𝐴x≤𝑏) ≥0.
Change Detection - Correct Answers ✅Identifying when a significant change has taken place in a process.
Classification - Correct Answers ✅The separation of data into two or more categories, or (a point's classification) the category a data point is put into.
Classification tree - Correct Answers ✅Tree-based method for classification. After branching to split the data, each subset is analyzed with its own classification model.
Classifier - Correct Answers ✅A boundary that separates the data into two or more categories. Also (more generally) an algorithm that performs classification.
Clique - Correct Answers ✅A set of nodes where each pair is connected by an arc.
Cluster - Correct Answers ✅A group of points identified as near/similar to each other.
Confusion matrix - Correct Answers ✅Visualization of classification model performance.
Constant - Correct Answers ✅A number that remains the same.
Constraint - Correct Answers ✅Part of an optimization model that describes a restriction on the solution (the values of the variables).
Contextual outlier - Correct Answers ✅A data point that is (uncommonly) far from other data points related to it. For example, in Atlanta, a 90-degree (Fahrenheit) day in winter is an outlier, but a 90-degree day in summer is not.
Continuous-time simulation - Correct Answers ✅A simulation that models a system continuously, at every instant of time; continuous-time simulation models are often based on differential equations.
Control - Correct Answers ✅(1) A variable whose value remains constant for all runs of an experiment, so changes in this variable don't affect the experiment. (2) To design an experiment where some factors ("controls" by definition (1)) are held constant to avoid them affecting the outcome.
Convex function - Correct Answers ✅A function $f$ where, for every two points $x$ and $y$, $f(cx + (1-c)y) \le c\,f(x) + (1-c)\,f(y)$ for all $c$ between 0 and 1. In two dimensions, this means that if the points $(x, f(x))$ and $(y, f(y))$ are connected with a straight line, the line segment lies on or above the function's curve between those two points. If $f$ is convex, then $-f$ is concave.
Convex Hull (of a set of points) - Correct Answers ✅Smallest convex shape that the set of points is contained in.
Convex Optimization model - Correct Answers ✅An optimization model where the objective function is to minimize a convex function (or maximize a concave function) and the constraints define a convex set of feasible solutions.
Convex Quadratic Function - Correct Answers ✅A second-order polynomial function that is convex.
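The convexity inequality in the Convex function entry can be spot-checked numerically. The sketch below is a hypothetical helper, not a standard library routine. It tests $f(cx + (1-c)y) \le c\,f(x) + (1-c)\,f(y)$ on a grid of sample points. Passing the check does not prove convexity; a single violation does disprove it.

```python
def violates_convexity(f, points, weights, tol=1e-12):
    """Return True if some sampled (x, y, c) breaks the convexity inequality."""
    for x in points:
        for y in points:
            for c in weights:
                lhs = f(c * x + (1 - c) * y)      # function value on the segment
                rhs = c * f(x) + (1 - c) * f(y)   # height of the connecting line
                if lhs > rhs + tol:
                    return True
    return False

points = [-2.0, -1.0, 0.0, 1.0, 2.0]
weights = [0.0, 0.25, 0.5, 0.75, 1.0]

print(violates_convexity(lambda x: x ** 2, points, weights))   # False: x^2 is convex
print(violates_convexity(lambda x: -x ** 2, points, weights))  # True: -x^2 is concave
```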
Covariate - Correct Answers ✅A characteristic or measurement that can be used to estimate the value of something, for example a person's height or the color of a car. A "feature" or "attribute"; in the standard tabular format, a column of data.
Cross-validation - Correct Answers ✅Validation technique where a model is tested on data different from what it was trained on.
CUSUM - Correct Answers ✅Change detection method that compares observed distribution mean with a threshold level of change.
Data Point - Correct Answers ✅Observation/record of (perhaps multiple) measurements for a single member of a population or data set. In the standard tabular format, a row of data.
Decision - Correct Answers ✅Choice of action.
Decision Point - Correct Answers ✅Place in a simulation where there is a branch (or decision to be made or observed).
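The CUSUM entry above is brief, so here is a minimal sketch of one common form of the method for detecting an increase. It assumes the recurrence $S_t = \max(0,\; S_{t-1} + (x_t - \mu - C))$ and flags a change when $S_t \ge T$. The baseline mean `mu`, slack `c`, and `threshold` are assumptions the analyst must supply, and the data are invented.

```python
def cusum_increase(xs, mu, c, threshold):
    """Return the first index where the running CUSUM statistic reaches
    the threshold, or None if no increase is detected."""
    s = 0.0
    for t, x in enumerate(xs):
        # Accumulate how far x exceeds the baseline mean, less the slack c;
        # the statistic is never allowed to drop below zero.
        s = max(0.0, s + (x - mu - c))
        if s >= threshold:
            return t
    return None

# Hypothetical series: the mean shifts from 10 to 12 at index 3.
data = [10, 10, 10, 12, 12, 12]
print(cusum_increase(data, mu=10, c=0.5, threshold=4))  # 5
```

A larger `c` or `threshold` makes detection slower but reduces false alarms; smaller values do the opposite.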
Decision Tree - Correct Answers ✅Tree-based method for decision-making. After branching to split the data, each subset is analyzed with its own decision model (or just has its own decision applied).
Deep Learning - Correct Answers ✅Neural network-type model with many hidden layers.
Descriptive Analytics - Correct Answers ✅Loosely speaking, the use of analytics to explain or describe what has happened.
Design of Experiments - Correct Answers ✅Choosing a set of tests to be made to find the effect of input variables on an outcome.
Deterministic Simulation - Correct Answers ✅Simulation with no randomness/uncertainty, so results are the same each run.
Detrending - Correct Answers ✅Removal of trend, such as a change in the mean over time, from time-series data.
Diagnostic odds ratio - Correct Answers ✅Ratio of the odds that a data point in a certain category is correctly classified by a model,
Distribution-fitting - Correct Answers ✅Determining whether a set of data seems to follow a certain probability distribution, or determining which of several distributions the data is close to.
Double exponential smoothing - Correct Answers ✅Two-parameter exponential smoothing technique that incorporates trend.
Dynamic programming - Correct Answers ✅Optimization approach that involves making a sequence of decisions over time, based on the current state of a system.
Earth - Correct Answers ✅Name of many implementations of the multivariate adaptive regression splines (MARS) model, because "MARS" is a trademark.
Edge - Correct Answers ✅Connection between two nodes/vertices in a network. In a network model, there is a variable for each edge, equal to the amount of flow on the edge, and (optionally) a capacity constraint on the edge's flow. Also called an arc.
Eigenvalue - Correct Answers ✅Amount by which an eigenvector gets rescaled in a linear transformation.
Eigenvector - Correct Answers ✅Non-zero vector that does not change direction when a linear transformation is applied to it, but only gets rescaled by the eigenvalue.
Elastic Net - Correct Answers ✅Combination of lasso and ridge regression.
Elbow Diagram - Correct Answers ✅A graph of improvement in function value as something else (e.g., number of clusters) increases or decreases; the "elbow" is the spot where improvement levels out.
EM Algorithm - Correct Answers ✅Expectation-maximization algorithm.
Empirical Bayes Model - Correct Answers ✅Model that uses Bayes' theorem to update an initial guess/distribution based on observed data.
Exploitation - Correct Answers ✅Using known information to get good outcomes.
Exploration - Correct Answers ✅Finding new/better/more information to determine how to optimize output.
Exponential Distribution - Correct Answers ✅A continuous probability distribution of the time between events: $f(x) = \lambda e^{-\lambda x}$. If the number of events in a fixed time follows the Poisson distribution, then the time between them has the exponential distribution. The exponential distribution has the memoryless property.
Exponential smoothing - Correct Answers ✅Data smoothing technique in which older observations are assigned exponentially decreasing weights, so more emphasis is given to recent observations.
Factorial Design - Correct Answers ✅Tests of different combinations of factor values over multiple factors, to find each one's effect, and interaction effects, on the outcome.
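The Exponential smoothing entry can be illustrated with a minimal sketch of the single-parameter form, $S_t = \alpha x_t + (1 - \alpha) S_{t-1}$. Initializing with $S_0 = x_0$ is one common convention and is an assumption here, as are the function name and the sample data.

```python
def exponential_smoothing(xs, alpha):
    """Return the smoothed series S_t = alpha * x_t + (1 - alpha) * S_{t-1},
    starting from S_0 = x_0 (an assumed initialization)."""
    if not xs:
        return []
    smoothed = [xs[0]]
    for x in xs[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

print(exponential_smoothing([10, 20, 30], alpha=0.5))  # [10, 15.0, 22.5]
```

Unrolling the recurrence shows that an observation $k$ steps in the past carries weight $\alpha(1-\alpha)^k$, which is the exponentially decreasing weighting the entry describes. An `alpha` near 1 tracks recent data closely; an `alpha` near 0 smooths more heavily.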
Fall out - Correct Answers ✅Fraction of data points not in a certain category that are incorrectly classified by a model; equal to $FP / (TN + FP)$. Also called false positive rate.
False Negative (FN) - Correct Answers ✅Data point that a model incorrectly classifies as not being in a certain category. ("Negative" means the model classified it as not being in the category, and "False" means the model's classification is incorrect.) Sometimes abbreviated as "FN".
False Negative Rate - Correct Answers ✅Fraction of data points in a certain category that are incorrectly classified by a model; equal to $FN / (TP + FN)$. Also called miss rate.
False Positive (FP) - Correct Answers ✅Data point that a model incorrectly classifies as being in a certain category. ("Positive" means the model classified it as being in the category, and "False" means the model's classification is incorrect.) Sometimes abbreviated as "FP".
False Positive Rate - Correct Answers ✅Fraction of data points not in a certain category that are incorrectly classified by a model; equal to $FP / (TN + FP)$. Also called fall out.
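The rate definitions above, together with accuracy, can all be computed from the four confusion-matrix counts. The sketch below uses a hypothetical helper name and made-up counts.

```python
def classification_rates(tp, fp, tn, fn):
    """Compute accuracy, false positive rate (fall out), and
    false negative rate (miss rate) from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "false_positive_rate": fp / (tn + fp),
        "false_negative_rate": fn / (tp + fn),
    }

# Hypothetical counts for 100 classified data points.
rates = classification_rates(tp=40, fp=10, tn=45, fn=5)
print(rates["accuracy"])             # 85 / 100 = 0.85
print(rates["false_positive_rate"])  # 10 / 55
print(rates["false_negative_rate"])  # 5 / 45
```

Note the different denominators: the false positive rate is taken over points not in the category, while the false negative rate is taken over points in the category.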