
232 Cards in this Set

  • Front
  • Back

Conditional Entropy

Measures the probability that two patterns are similar.

Kaufman's Efficiency Ratio

Calculated by dividing the absolute value of the net change in price movement over n periods by the sum of all component moves, taken as positive numbers over the same n periods. If the ratio approaches the value of 1.0 then the movement is smooth (not chaotic); if the ratio approaches 0, then there is great inefficiency, chaos, or noise.
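
As a minimal sketch of the ratio described above, the function below divides the absolute net change by the sum of absolute period-to-period moves. The function name and the choice to return 0.0 for a completely flat series are assumptions made for illustration.

```python
def efficiency_ratio(prices):
    """Efficiency ratio over the whole price list (n = len(prices) - 1 periods)."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    # Absolute value of the net change from the first to the last price.
    net_change = abs(prices[-1] - prices[0])
    # Sum of all component moves, each taken as a positive number.
    total_path = sum(abs(b - a) for a, b in zip(prices, prices[1:]))
    if total_path == 0:
        return 0.0  # flat series: convention chosen for this sketch
    return net_change / total_path


smooth = [100, 101, 102, 103, 104]   # steady climb, ratio of 1.0
noisy = [100, 102, 99, 103, 101]     # large swings, small net change
print(efficiency_ratio(smooth), efficiency_ratio(noisy))
```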

Fractal dimension

The degree of roughness or irregularity of a structure or system. In many chaotic systems there is a constant fractal dimension; that is, the interval used for measuring will have a predictable impact on the resulting values in a manner similar to a normal distribution.

Fractal geometry

An area of mathematics which uses various geometric shapes to measure chaotic systems; its approach strikes a true note about how the real world of numbers actually works.

Chaos Theory

A method to describe the complex behavior of nonlinear systems, those that cannot be described by a straight line. Also known as nonlinear dynamics.

Chaotic Systems

Appear, at first, to be random but turn out to be "not without any form or method".

Noise (price)

The erratic movement of price, which is, by definition, unpredictable. It has definite statistical attributes. As financial systems mature, more investors participate, and noise increases.

Trend following system

Triggers buy or sell orders as prices rise or fall during the day

Countertrend system

Triggers sells and buys at relative intraday highs or lows.

Elder Ray

Oscillator for separating bullish and bearish sentiment
Bull power = High t - 13 day exponential smoothing
Bear power = Low t - 13 day exponential smoothing
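
The two formulas above can be sketched as follows. The card does not say which series is smoothed or how the smoothing is seeded, so this sketch assumes the exponential smoothing is applied to closing prices, uses a smoothing constant of 2/(n+1), and starts from the first close; the function names are hypothetical.

```python
def ema(values, period):
    """Exponential smoothing with constant 2/(period+1), seeded with the first value."""
    alpha = 2.0 / (period + 1)
    smoothed = [values[0]]
    for value in values[1:]:
        smoothed.append(smoothed[-1] + alpha * (value - smoothed[-1]))
    return smoothed


def elder_ray(highs, lows, closes, period=13):
    """Return (bull_power, bear_power) lists, one value per bar."""
    trend = ema(closes, period)  # assumption: smoothing is taken on closes
    bull_power = [h - e for h, e in zip(highs, trend)]
    bear_power = [l - e for l, e in zip(lows, trend)]
    return bull_power, bear_power


bull, bear = elder_ray([11, 12, 13], [9, 10, 11], [10, 11, 12])
print(bull, bear)
```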

Force Index

Force Index = Volume x (close t - close t-1)
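
A short sketch of this formula, assuming parallel lists of closes and volumes; the first bar has no prior close, so the output is one element shorter than the input.

```python
def force_index(closes, volumes):
    """Volume times the change in close, for every bar after the first."""
    return [
        volume * (close - prev_close)
        for prev_close, close, volume in zip(closes, closes[1:], volumes[1:])
    ]


print(force_index([10, 11, 10.5], [100, 200, 300]))  # [200, -150.0]
```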

Market Sentiment

The collective opinion of investors. It is a driving force in the market, yet it is very difficult to measure, and even harder to deliver those results in a timely fashion. For that reason, analysts often substitute a combination of volume, open interest, and price for true sentiment, hoping that the recorded actions of traders closely relate to what they are thinking.

Volatility ratio

VR = True range (n)/Average true range previous n days, lagged m days

Entropy formula

A more advanced form of conditional entropy which is a measurement of expectations. If the output of the system is known, and there is no uncertainty, the value of the entropy formula is zero. When entropy is low, the state of the system is easy to predict. As the output becomes less certain, the value approaches 1, and there is more disorder.

Neural networks

A technique that offers exceptional ability for discovering nonlinear relationships between any combination of fundamental information, technical indicators, and price data. It is recognized as a powerful tool for uncovering market relationships. Its disadvantage is that it is so powerful that, without proper control, it will find relationships that exist only by chance.

Artificial neural network

A computerized neural network based on the biological functions of the human brain. Its operation can be thought of as a feedback process similar to the Pavlovian response of training a dog. It is very good at finding patterns, whether they are continuous or discrete, appearing at different times. It differs from a least squares regression because it includes linear, nonlinear, and pattern recognition relationships.


Neural network terminology analogous to the cells that compose the brain.


Neural network terminology analogous to groups of neurons


Neural network terminology referring to receivers of information, passing it directly to the neurons.


Neural network terminology referring to pathways that come out of the neuron and allow information to pass from one neuron to another.


Neural network terminology referring to the path between neurons and may inhibit or enhance the flow of information between neurons. They can be considered selectors or weighting factors.

Training or learning process (neural network)

Feedback process which creates weighting factors that would have given the correct results the most often.

Preprocessing (neural network)

Selection and presentation of inputs in the most direct form.

Genetic algorithm

Method of finding weighting factors in a neural network where weighting factors are randomly mutated until the best combination is found using a method referred to as "survival of the fittest", giving preference to the best and discarding the worst. Testing is completed when the results, as measured by the success criteria, cannot be improved. It is particularly valuable when the number of tests or combinations is so large that a test of all combinations is impractical.


Most basic component of a genetic algorithm


Comprised of a number of genes

Chromosome or string

Combination of individuals (and therefore genes). Represents a potential solution, a set of trading rules or parameters where the genes are specific values and calculations.

Trading strategy

Created by a combination of trading rules, or chromosomes.


A procedure which determines which chromosomes will survive and in what manner.


Introduction of new characteristics


Combining genes to give chromosomes with greater potential a better chance for survival.

Sharpe Ratio or Information Ratio

Favored test objective for trading systems.

Sharpe ratio = (Annualized returns - Risk free rate)/Annualized risk

(Expected Return - Risk-free rate)/Standard deviation of returns
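
A rough sketch of the ratio above. The annualization convention is an assumption: periodic (here monthly) returns are annualized by multiplying the mean by the number of periods per year and the sample standard deviation by its square root. The function and parameter names are illustrative.

```python
import math
import statistics


def sharpe_ratio(periodic_returns, risk_free_rate=0.0, periods_per_year=12):
    """(annualized return - annual risk-free rate) / annualized risk."""
    mean = statistics.mean(periodic_returns)
    stdev = statistics.stdev(periodic_returns)  # sample standard deviation
    annual_return = mean * periods_per_year
    annual_risk = stdev * math.sqrt(periods_per_year)
    return (annual_return - risk_free_rate) / annual_risk


monthly = [0.01, 0.03, 0.01, 0.03]
print(sharpe_ratio(monthly, risk_free_rate=0.04))  # approximately 5.0
```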

Parameter (trading system)

Is a value within the strategy that can be changed in order to vary the speed or timing of the system. e.g. Moving average calculation period, exponential smoothing constant, etc.

One parameter optimization

Hold all parameters constant except for one and step through the possible range for that parameter.

Two parameter optimization

An optimization process using a sequential test, selecting the first parameter and then testing all the values for the second parameter, repeating until all combinations of the parameters have been tested.

Continuous parameters

A test parameter in which values can take on any fractional number within a well-defined range.

Discrete parameters

A test parameter in which values are whole numbers or integer values.

Coded parameters or a regime

A test parameter which represents a category of operations, e.g. when the parameter value is A, use a single moving average; when B, use a double moving average; and when C, use a linear regression.

Synthetic data

Has the same characteristics as the data that will be traded, yet is different.

Methods to create synthetic data (2)

1) Monte Carlo method (abandoned)

2) Random Numbers and Distribution

Fraction Same Sign

A process used for verifying that the synthetic data has the same characteristics as the real data. Percentage of synthetic records in which the predicted and actual data have the same sign.

Sequential optimization

Has the advantage of reducing a 3-parameter test case from n1 x n2 x n3 to n1 + n2 + n3. It has the disadvantage, however, of not always finding the best combination of parameter values when the test is very large.
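
To illustrate the test-count trade-off, the sketch below compares an exhaustive search with a sequential one on a toy objective. The objective function, the parameter ranges, and the choice to start each parameter at the first value in its range are all assumptions made for the example, not part of any particular trading system.

```python
from itertools import product


def exhaustive_search(objective, ranges):
    """Test every combination: n1 x n2 x n3 evaluations."""
    tests, best_combo, best_score = 0, None, None
    for combo in product(*ranges):
        score = objective(*combo)
        tests += 1
        if best_score is None or score > best_score:
            best_combo, best_score = combo, score
    return best_combo, tests


def sequential_search(objective, ranges):
    """Optimize one parameter at a time: n1 + n2 + n3 evaluations."""
    current = [values[0] for values in ranges]  # assumed starting point
    tests = 0
    for i, values in enumerate(ranges):
        best_value, best_score = current[i], None
        for value in values:
            trial = current[:i] + [value] + current[i + 1:]
            score = objective(*trial)
            tests += 1
            if best_score is None or score > best_score:
                best_value, best_score = value, score
        current[i] = best_value  # freeze this parameter, move to the next
    return tuple(current), tests


def toy_objective(a, b, c):
    # Single smooth peak at (3, 5, 2); real performance surfaces are rarely this kind.
    return -((a - 3) ** 2) - (b - 5) ** 2 - (c - 2) ** 2


ranges = [list(range(1, 11))] * 3
print(exhaustive_search(toy_objective, ranges))  # ((3, 5, 2), 1000)
print(sequential_search(toy_objective, ranges))  # ((3, 5, 2), 30)
```

Both searches agree here only because the toy objective has one peak and its parameters do not interact; with interacting parameters the sequential search can settle on a different, weaker combination.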

Seeding (trading system design)

Method of choosing parameters by selecting a series of random starting values for each of the parameters, including the primary one, and searching horizontally and vertically from each point. A much faster method of selecting parameters when n1 and n2 are very large.

Optimization by steepest descent

A mathematical technique to locate the point of maximum profit with the minimum steps. It uses both the returns and rate of change of returns to determine the size of the steps taken during the optimization process. Can be suboptimal if there is more than one return peak, but can be solved using seeding.

Profit factor

Ratio of gross profits to gross losses

Return on account

Ratio of net profits to the absolute value of the maximum drawdown.
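
Profit factor and return on account can both be sketched from a list of per-trade profits and losses. The sketch assumes the equity curve is the running sum of trade results starting from zero and measures drawdown in currency units; the function names are illustrative.

```python
def profit_factor(trade_pnls):
    """Gross profits divided by gross losses (losses taken as a positive number)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        return float("inf")  # no losing trades
    return gross_profit / gross_loss


def return_on_account(trade_pnls):
    """Net profit divided by the absolute value of the maximum drawdown."""
    equity, peak, max_drawdown = 0.0, 0.0, 0.0
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)
    if max_drawdown == 0:
        return float("inf")  # equity never fell below a prior peak
    return equity / max_drawdown


trades = [100, -50, 200, -150, 100]
print(profit_factor(trades))      # 2.0
print(return_on_account(trades))  # 200 / 150
```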

Objective function

Ranking score or performance criteria for strings e.g. information ratio. Test objective. What measurement or statistic will show success?

Convergence (trading system development)

Occurs when after a number of passes the objective function does not increase significantly.

Uniform random number

Is a number that has an equal chance of being any of the values between specified values.

Monte Carlo sampling

Used to create a statistically valid subset of tests when the total number of tests is too large. The results of this smaller set of tests can be analyzed as though it were the complete set. Gives information on the depth of success that is needed to determine robustness.

Robust system

One that performs consistently in a wide variety of situations, including patterns that are still unforeseen. From a practical view, this translates into a method that succeeds with the fewest parameters tested over the most data. The best performing systems commonly have four or fewer variables.

The characteristics of a system with the best forecasting ability (3)

1) It must be based on a sound premise.
2) It must adapt to changing market conditions.

3) It must be tested properly.
From time to time, each of the three points above will seem to be the most important. In the long run, they must all be included to create a robust program.


Benchmark

Provides a way of measuring success. It is best if it is a well-documented indicator, e.g. S&P Index, Lehman Brothers Treasury Index, etc.

Idiosyncrasies of published benchmarks (2)

1) Survivorship bias

2) Asymmetric measurement of returns

Popular performance criteria used to evaluate any trading strategy (10)

1) Information ratio and/or profit factor
2) Net profits or losses
3) Number of trades
4) Percentage of profitable trades
5) Average net return per trade
6) Maximum drawdown

7) Annualized rate of return (AROR)

8) Average time-to-recovery

9) Time in market

10) Slope of periodic returns

Percentage of profitable trades

Also called reliability, a value above 60% can be interpreted as a method that captures profits regularly. A trend system will be working correctly if its reliability is between 30% and 40%.

Average net return per trade

Gives you an indication of how difficult it will be to realize the system returns.

Maximum drawdown

The largest equity swing from peak to valley. This measurement can be very erratic and is not likely to be the largest risk seen in the future; however, it gives a rough idea of the minimum capital needed to trade this market. An investor will typically capitalize a trading account with three times the maximum drawdown.

Time to recovery

The time between two successive equity highs. A larger drawdown with a much faster recovery seems to be a better trade-off for most investors.
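
Maximum drawdown and time to recovery can be read off the same pass over an equity curve. This is a sketch under stated assumptions: equity is a list of period-end account values, recovery time is counted in periods between successive new equity highs, and a drawdown that has not yet recovered by the end of the data is not counted as a recovery.

```python
def drawdown_stats(equity):
    """Return (maximum drawdown, longest time between successive equity highs)."""
    peak, peak_index = equity[0], 0
    max_drawdown, longest_recovery = 0, 0
    for i, value in enumerate(equity):
        if value > peak:
            # New equity high: measure how long it took since the previous one.
            longest_recovery = max(longest_recovery, i - peak_index)
            peak, peak_index = value, i
        else:
            max_drawdown = max(max_drawdown, peak - value)
    return max_drawdown, longest_recovery


equity = [100, 110, 105, 95, 108, 112, 111, 115]
max_dd, recovery = drawdown_stats(equity)
print(max_dd, recovery)  # 15 4
print(3 * max_dd)        # rule-of-thumb capitalization from the card: 45
```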

Slope of periodic returns

Derived from drawing a linear regression through quarterly or annualized returns. The slope of that line would indicate whether the performance of the strategy is constant over time. A declining slope should be interpreted as better performance in the past and a rising slope means that the strategy does better under current price patterns than older ones. A horizontal slope is best, indicating consistent performance across all time periods.
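
The slope in question is an ordinary least-squares slope of returns against the period index. A minimal sketch, assuming the periods are evenly spaced and numbered 0, 1, 2, and so on:

```python
def regression_slope(periodic_returns):
    """Least-squares slope of returns regressed on the period index 0..n-1."""
    n = len(periodic_returns)
    if n < 2:
        raise ValueError("need at least two periods")
    mean_x = (n - 1) / 2
    mean_y = sum(periodic_returns) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(periodic_returns))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


declining = [0.05, 0.04, 0.03, 0.02]  # better in the past: negative slope
steady = [0.03, 0.03, 0.03, 0.03]     # consistent: slope near zero
print(regression_slope(declining), regression_slope(steady))
```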

Statistical indication of the likely success of a strategy

Adjusted returns = Average of all tests - 1 standard deviation of all test returns. A system satisfying this requirement should have an 84% chance of achieving adjusted returns.
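
A small sketch of the adjusted-returns calculation. The 84% figure corresponds to the probability that a normally distributed value lies above one standard deviation below its mean, which the snippet checks with the standard library's `NormalDist`. Using the sample standard deviation, and treating test returns as roughly normal, are assumptions.

```python
import statistics


def adjusted_return(test_returns):
    """Average of all test returns minus one standard deviation of those returns."""
    return statistics.mean(test_returns) - statistics.stdev(test_returns)


test_returns = [0.10, 0.20, 0.30]
print(adjusted_return(test_returns))  # approximately 0.10

# Probability of landing above (mean - 1 standard deviation) under a normal distribution.
print(statistics.NormalDist().cdf(1.0))  # approximately 0.84
```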

Sensitivity testing

A way of finding out how much performance changes when the individual and combinations of parameter values are shifted up and down by small amounts. Overall, small changes in parameter values should not jeopardize the profitability of the strategy. If the performance declines for all other parameter values, then the current model is a peak solution, which usually means it was overfit and will not meet expectations.


A term used to describe a system or method that works under many market conditions, one in which we have confidence. In the best scenario, a robust system is not sensitive to moderate changes in parameter values.

Philosophies for system development (2)

1) Vertical

2) Fully integrated

Fully integrated system development

Method that takes advantage of the interrelationship between features. It is similar to a puzzle box where all pieces fit together in just one special way.

Vertical method of system development

Method in which the primary rule is tested first and must be profitable without any other features.

Price shock

An unexpected, unpredictable event. During a price shock, diversification is ineffective. Markets all move together, or all reverse together. Correlations go to 1.

Mean reverting system

A system that takes advantage of nondirectional price movement or a fundamental divergence between two markets. The belief is that prices will revert to the mean. Increases risk and makes the success of the method dependent upon good timing.

Systematic risk

Market or Macroeconomic risk. Risk which cannot be reduced/diversified away. It is measured by the beta coefficient of the stocks included in the fund.

Unsystematic risk

Firm-specific risk. Can be reduced/diversified.

Concept of diminishing marginal utility

A theory that indicates that as wealth becomes greater, the preference for more wealth diminishes.

Efficient frontier

A way of visualizing which choices would be made by any rational investor. The efficient frontier is a curve plotted through those investment alternatives that have the highest returns for a given risk, or the lowest risk for a given return.

Rational investor

One who wants the highest return for the lowest risk.

Treynor Ratio

An additional measure of return to risk calculated as follows:
TR = (Annualized return (AROR) - Risk free return)/Beta
Beta is usually the relative volatility of the current portfolio compared to a benchmark, usually the S&P. Not applicable when stock has a negative beta.

Average minimum retracement

Schwager's favorite measure of return to risk which finds the average equity drawdown, the difference between the current value in the account and the highest past value, by ignoring all days when equity was on new highs. A much simpler computation would use only the low equity day of each month; it would give a rough but good approximation.

Maximum adverse excursion

Largest loss seen over the test period.

Maximum drawdown

Largest historic loss.

Calmar Ratio

Useful measure of performance as a part of delivering value to the firm's clients. Another measure of return to risk utilizing the maximum drawdown as the measure of risk. Calculated as follows:
CR = AROR/Maximum drawdown where max drawdown as of day t is the largest historic drawdown, peak to valley, from the beginning of the data to today. Favored by hedge funds because it reflects gain to pain in the most realistic way, with consideration given to time, the way often impatient investors look at it.
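
A minimal sketch of the Calmar calculation, assuming an equity curve that spans a known number of years, a compounded annualized rate of return, and a maximum drawdown expressed as a fraction of the peak so that both terms are in the same units. These conventions and the function names are assumptions for illustration.

```python
def max_drawdown_fraction(equity):
    """Largest peak-to-valley decline, as a fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def calmar_ratio(equity, years):
    """Compounded annualized return divided by maximum drawdown."""
    aror = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0
    drawdown = max_drawdown_fraction(equity)
    if drawdown == 0:
        return float("inf")  # no drawdown in the sample
    return aror / drawdown


equity = [100, 120, 90, 121]          # two years of data (assumed)
print(calmar_ratio(equity, years=2))  # about 0.10 / 0.25 = 0.4
```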

Sortino Ratio

Another measure of return to risk similar to the Sharpe ratio and semivariance, it includes only the downside risk in the denominator and replaces the risk-free threshold with a minimum acceptable return (MAR) in the numerator. Calculated as follows:
SR = (AROR - MAR)/std dev(PE-E) where PE = peak equity E = current equity.

Ulcer Index

Another measure of return to risk which measures the increased anxiety as current returns drop farther below the highest returns previously achieved. A form of semivariance, similar to Average Minimum Retracement that produces a statistical measure of relative declines on all days that were not new high returns.

Probability of drawdown (DP)

A method to calculate the potential for a loss over n days. The standard deviation of all daily drawdowns, measured from the most recent equity high to today's equity value. More conservative version of Schwager and the Ulcer Index because

Semivariance (SV)

A method to calculate the potential for a loss over n days. Calculate the linear regression of the equity stream or NAV, then find the standard deviation of all the drawdowns, below the corresponding value of the straight line fit. Semivariance will produce a smaller value than DP because the values on the straight line will be lower than the equity peak.

Methods to calculate the potential for loss (2)

1) Probability of a drawdown
2) Semivariance

Drawdown ratio (2)

Alternative measure of return to risk using either the probability of a drawdown (DP) or semivariance (SV) calculated as follows:
These ratios satisfy all three of the original criteria: higher profits are favored because the rate of return is in the numerator, the order of profits and losses will result in larger and smaller net equity drops, and large gains are not penalized because only the drawdowns are used to measure risk.

Annualized volatility

The most common way to express risk. It is simply the annualized standard deviation. Even though we know that returns are not symmetric, this remains the most common measurement of risk. In portfolio analysis, this will be used to determine target volatility.

Value at risk (VAR)

Used in most companies to assess whether current market positions are likely to produce a loss that is unacceptably large over the next few days. It is different from volatility because it attempts to "anticipate risk", even though it is done using historic risk. This is done by finding the probability of loss of the current market positions based on past market movement. The calculation combines the cross-correlations between markets for which there is exposure, the position sizes, the volatility of those markets, the projected time period over which the risk will be forecast, and a confidence interval to determine the risk tolerance.

Three calculation methods for value at risk (VAR)

1) Variance-Covariance
2) Historical

3) Monte Carlo

Stress test

Used to find out how VAR performs under extreme cases. This isolates specific market periods, such as price shocks, or can use simulated data in which all prices reverse direction and make a 2 to 3 standard deviation move with very high correlation.


Kurtosis

Used to measure the degree of a fat tail present in the distribution of returns. A normal distribution has kurtosis of 3 but the actual returns series might have a kurtosis as large as 25, indicating a significant fat tail.
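
Kurtosis on the scale where a normal distribution scores 3 is the fourth central moment divided by the squared variance. The sketch below uses population moments (dividing by n), which is an assumption; statistical packages often apply small-sample corrections or report "excess" kurtosis (this value minus 3).

```python
def kurtosis(values):
    """Fourth central moment over squared variance (normal distribution = 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2)


flat = [1, 2, 3, 4, 5]                 # thin tails
fat_tailed = [0] * 18 + [10, -10]      # mostly quiet, two extreme moves
print(kurtosis(flat), kurtosis(fat_tailed))  # 1.7 and 10.0
```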

Target volatility

The amount of risk you are willing to take expressed as annualized volatility. In the real world, the lowest practical volatility target is 6%, more often 8%.

Trailing stop

A stop that captures an increasing part of the profits as prices move in a favorable direction. Some trailing stops advance, but never retreat, as in Wilder's Parabolic; others, based on volatility and price changes may retreat.

Construction of trailing stops (3)

1) Fixed percentage

2) Volatility

3) Percentage of profits

Standard deviation stop

The basis for traditional risk assessment, and it can be used to determine stop-loss levels.

This method adjusts for volatility using a standard statistical measurement and is applied to the extreme profit of a trade in the manner of a trailing stop. Uses ATR and standard deviation.

Proximity risk

When many stop orders cluster around the same point. One way this can be seen is in a gap as price breaks through a key resistance level. Given the large number of trend-following systems, it would not be surprising to find that many of them are generating the same orders on the close especially following a very volatile price move.

Other factors to use to determine stop loss order levels

1) Percentage of initial margin - loosely related to long-term volatility but lags considerably.
2) Percentage of the portfolio value or total account value - this concept of equalized risk across all markets is very popular; however, it is not sensitive to individual markets, and as with many stops, it imposes artificial overrides.

3) Maximum adverse excursion - Determined by historic evaluation. A stop is placed just beyond the maximum adverse excursion for each trade or 2.5% of the price whichever is smaller.

Efficiency Ratio (ER)

Method to define the amount of noise in a market. Defined by the absolute value of the net price change divided by the sum of the individual price changes taken as positive values, over the same time interval using closing prices. This is not the same as volatility and indicates that noise increases as the ratio gets closer to zero because the divisor increases with the amount of noise.

Point of ruin

In investment terminology and probability theory, the level at which there is no longer enough money to continue trading.

Risk of ruin

The probability of the point of ruin.

Traders advantage

Proportion of winning trades

True range calculation

The maximum of H(t)-L(t), H(t)-C(t-1), C(t-1)-L(t). True range is always a positive number.
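
A short sketch of the true range calculation in its standard form, where both gap terms are measured from the previous close. Argument names are illustrative.

```python
def true_range(high, low, prev_close):
    """Largest of: today's range, gap up from prior close, gap down from prior close."""
    return max(high - low, high - prev_close, prev_close - low)


print(true_range(105, 100, 102))  # 5: prior close inside today's range
print(true_range(105, 100, 98))   # 7: gap up from the prior close
print(true_range(105, 100, 108))  # 8: gap down from the prior close
```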

Directional Movement

Relationship of the price direction to the true range.

Optimal f

The optimal fixed fraction of an account that should be invested at any one time, or the size of the bet to place on any one trade. The amount to be risked is measured as a percentage of portfolio size. The objective is to maximize the amount invested (put at risk) yet avoid the possibility of a loss greater than some target amount.

Two levels of optimal f

1) The part of the total portfolio put at risk compared to that part held in cash equivalents.

2) The individual size of the commitment to each stock or futures contract within that portfolio.
This is particularly important for futures, where the high leverage of individual markets makes it very easy to risk too much on each trade.

Tolerance method

The statistically preferred method of assessing multicollinearity is to calculate the tolerance coefficient for each independent variable. Calculated as 1 - r squared for the regression of each independent variable on all the other independents, ignoring the dependent. There will be as many tolerance coefficients as there are independents.

Geometric mean

The nth root of the product of n data points.
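
A minimal sketch of the geometric mean, computed through logarithms to avoid overflow when many values are multiplied. It assumes all inputs are positive, as is the case for growth factors such as 1.10 for a 10% gain.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive values, computed via logs."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


print(geometric_mean([1, 4, 16]))    # approximately 4.0
print(geometric_mean([1.10, 0.90]))  # below 1.0: +10% then -10% is a net loss
```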

Problems with using raw price data to calculate statistical metrics such as correlation

1) Prices are not normally distributed
2) Relationships between markets are not linear

Relative Strength calculation

Calculated by dividing the price of one security by the benchmark. Usually the ratio is further smoothed by a moving average in order to eliminate the effect of erratic price movements or "noise" from daily price fluctuations.

Disparity Index

Defined as the percentage difference or "disparity" of the latest close to a chosen moving average.
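
As a sketch, assuming a simple moving average of the most recent closes (the card only says "a chosen moving average"):

```python
def disparity_index(closes, period):
    """Percentage difference between the latest close and its simple moving average."""
    if len(closes) < period:
        raise ValueError("not enough closes for the chosen period")
    moving_average = sum(closes[-period:]) / period
    return 100.0 * (closes[-1] - moving_average) / moving_average


print(disparity_index([8, 10, 12], period=3))  # latest close is 20% above its average
```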

Intermarket LRS divergence

Can be used to calculate the divergence of a security from its related intermarket.

Intermarket Regression Divergence

Used to make a prediction of likely values of the dependent variable or the security to be predicted based on values of a correlated market.

Intermarket Momentum Oscillator

Helps identify indicator extreme values by normalizing the divergence on a scale from 1 to 100. Signal interpretation is similar to that of the stochastic oscillator.

Z-Score Divergence

Avoids the price scaling problems associated with two security comparisons by converting prices to Z-scores.
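
The scaling point can be sketched as follows: each series is converted to Z-scores over the same window, and the divergence is taken as the difference between the latest Z-scores. Using the population standard deviation and defining the divergence as a simple difference are assumptions for illustration; the card does not spell out the exact formula.

```python
import statistics


def z_scores(values):
    """Convert a series to Z-scores using its own mean and population std dev."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return [(v - mean) / stdev for v in values]


def z_score_divergence(series_a, series_b):
    """Latest Z-score of series_a minus latest Z-score of series_b."""
    return z_scores(series_a)[-1] - z_scores(series_b)[-1]


# Rescaling a series by 10x leaves its Z-scores unchanged.
print(z_scores([2, 4, 6]))
print(z_scores([20, 40, 60]))
print(z_score_divergence([2, 4, 6], [60, 40, 20]))
```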

Congestion Index

Attempts to identify the market's character by dividing the actual percentage that the market has changed in the past x days by the extreme range.

JSK RS-Ratio

Measures the relative strength of all elements in a universe in such a way that the numerical results are all comparable. Not only does it tell you whether an element of the universe is doing better than the benchmark, but it also tells you if an element is doing better or worse than another element.

JSK RS-Momentum

A uniform measure of relative momentum which can be used to further compare elements in a universe against a benchmark and against each other. Measures the direction and the rate of change of the RS-Ratio line. This basically tells us if relative strength is getting stronger or weaker and whether it is turning up or down.

Relative Rotation Graph

Visualization which shows the user the relative positions of all elements in the universe, not only against a benchmark but also vis-a-vis each other in one picture.

Assumptions of Markowitz's analysis (3)

1) Investors are generally risk averse.
2) They base their portfolio decisions on risk and expected return only.

3) They measure risk as the variance (or standard deviation) of expected returns.

Characteristics of a normal distribution (Bell Curve)

Symmetric around its central value, the arithmetic mean (as well as the median and the mode, as these are all equal for normal distributions).

Standard deviation

The normal distribution's dispersion around the mean, which represents the concept of risk, or volatility, in finance. As it increases, the probability or risk that successive observations will not equal the expected value increases. Represents firm-specific risk.

Total risk equals

Market or Macroeconomic risk/Systematic Risk/Undiversifiable risk

+ Firm-specific risk/Unsystematic Risk/Diversifiable risk

Measured by standard deviation of returns.

Mean/Variance efficient portfolio

An investment portfolio which generates higher returns per unit of risk or has lower volatility per unit of return.

Macroeconomic or Systematic risk

Risk that cannot be reduced via diversification only - regardless of the number of securities in the portfolio, each stock will still be affected by the state of the macro-economy.

Firm-specific or Unsystematic risk

Represents the type of risk that can be reduced by diversifying - the more stocks held in a portfolio, the less effect each stock's individual volatility has on the overall volatility of a portfolio.


Covariance

Measures how two stocks co-vary around their respective means. A portfolio risk metric that reflects how companies' stock returns move together and co-respond to macroeconomic news. Will take a positive value if the returns of two stocks tend to move together, or more precisely, if they tend to be above and below their means at the same time. Can take a wide range of values, which makes it difficult to interpret as a standalone metric.

Correlation coefficient

Covariance between two series (stocks) divided by the product of their standard deviations. "Scaling" covariance in this manner, we take this not particularly well-behaved statistic and force it to lie in a well-defined range from -1 (perfect negative correlation) to +1 (perfect positive correlation). When it is close to the midpoint of the range (zero), the interpretation is that the two series have no reliable statistical relation.
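
A minimal sketch of the scaling step, using sample (n-1) covariance and standard deviations; the function names are illustrative.

```python
import math


def covariance(x, y):
    """Sample covariance of two equal-length series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    std_x = math.sqrt(covariance(x, x))
    std_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (std_x * std_y)


x = [1, 2, 3, 4]
print(correlation(x, [2, 4, 6, 8]))  # approximately +1: moves together
print(correlation(x, [8, 6, 4, 2]))  # approximately -1: moves opposite
```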

Portfolio variance

A nonlinear weighted average of the covariances between each pair of securities in a portfolio.

Portfolio expected return

A simple linear weighted average of the expected returns of each security in the portfolio.
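
The contrast between the linear return formula and the pairwise variance formula can be sketched as below. The weights, expected returns, and covariance matrix are made-up illustrative numbers.

```python
def portfolio_expected_return(weights, expected_returns):
    """Linear weighted average of the securities' expected returns."""
    return sum(w * r for w, r in zip(weights, expected_returns))


def portfolio_variance(weights, cov_matrix):
    """Sum over every pair (i, j) of w_i * w_j * cov(i, j)."""
    n = len(weights)
    return sum(
        weights[i] * weights[j] * cov_matrix[i][j]
        for i in range(n)
        for j in range(n)
    )


weights = [0.5, 0.5]
expected_returns = [0.08, 0.12]
cov_matrix = [[0.04, 0.00],   # variances on the diagonal,
              [0.00, 0.09]]   # covariances off the diagonal
print(portfolio_expected_return(weights, expected_returns))  # about 0.10
print(portfolio_variance(weights, cov_matrix))               # about 0.0325
```

With zero covariance, the portfolio variance (0.0325) is lower than either security's own variance, which is the diversification effect.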


Beta

1) An index that measures how much volatility a stock will contribute to a diversified portfolio. It allows us to rank every stock's potential contribution to portfolio volatility using the same index scale.
2) Similar to the way we scaled covariance and transformed it into the correlation coefficient to make it easier to interpret, beta is also a scaled version of covariance and equally easy to interpret.
3) Portfolio's average beta will be computed as a weighted average of each stock's individual beta.

Main driver of Beta

The covariance of a stock's returns with the market is the main driver of beta.
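
Beta is conventionally computed as the covariance of the stock's returns with the market's returns, divided by the variance of the market's returns. The sketch below follows that convention and also shows a portfolio beta as a weighted average; the return series are made-up numbers.

```python
def beta(stock_returns, market_returns):
    """cov(stock, market) / var(market); the n-1 divisors cancel in the ratio."""
    mean_s = sum(stock_returns) / len(stock_returns)
    mean_m = sum(market_returns) / len(market_returns)
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(stock_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    return cov / var


def portfolio_beta(weights, betas):
    """Weighted average of the individual betas."""
    return sum(w * b for w, b in zip(weights, betas))


market = [0.01, 0.02, -0.01, 0.03]
stock = [2 * m for m in market]           # moves twice as much as the market
print(beta(stock, market))                # approximately 2.0
print(portfolio_beta([0.5, 0.5], [2.0, 1.0]))  # 1.5
```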


Alpha

Annualized excess return over and above the fair expected return based on its volatility versus the market. Across the entire stock market for any given period, the sum of all the positive alpha must equal the sum of all the negative alpha, thus alpha is "zero sum".

Probability density function

Gives the probability of every possible positive or negative deviation from zero. In other words, it shows the degree to which chance can cause a useless rule to generate profits.

Data mining

Testing many rules with the aim of selecting the one with the best performance. Although data mining is an effective research method, testing many rules increases the chance of a lucky performance. Therefore, the threshold of performance needed to reject the null hypothesis must be set higher, perhaps much higher.


All the observations in which we are interested

Stationary population

Population in which statistical characteristics remain stable over time

Nonstationary population

Population in which statistical characteristics change over time.


Subset of the population

Probability experiment

An observation on or a manipulation of our environment that has an uncertain outcome.

Random variable

The quantity or quality observed in a probability experiment.


The most important probability experiment in statistical analysis.

Sample statistic

It is any measurable/computable characteristic/attribute of the sample. The random variable at issue within sample. It sheds light on the population parameter.

Sampling variation or variability

The unpredictable variation in the sample statistic from sample to sample. It is responsible for the uncertainty in statistical conclusions. It is the source of uncertainty that is addressed by statistical inference.

Law of large numbers

Tendency of relative frequencies to converge to theoretical probabilities as the number of observations becomes large. Large samples reduce the role of chance.
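
A small simulation sketch of this tendency, using a simulated fair coin whose theoretical probability of heads is 0.5. The seed and sample sizes are arbitrary choices made so the run is repeatable.

```python
import random


def relative_frequency_of_heads(tosses, seed=0):
    """Fraction of simulated fair-coin tosses that come up heads."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    heads = sum(1 for _ in range(tosses) if rng.random() < 0.5)
    return heads / tosses


for n in (10, 1_000, 100_000):
    # Relative frequency tends to settle near 0.5 as n grows.
    print(n, relative_frequency_of_heads(n))
```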

Distribution of observations

It depicts how a set of observations on a random variable are distributed or sprinkled across the variable's range of possible outcomes.

Elements of Statistical Inference (6)

1) A population

2) A sample consisting of a set of observations randomly selected from the population

3) A population parameter

4) A sample statistic

5) An inference

6) A statement about the reliability of the inference.

Population parameter

A fact or characteristic about the population that we would like to know. It is typically numerical but it need not be.

Statistical inference

The inductive leap from the observed value of a sample statistic, which is known with certainty but which is true only for a specific sample of data, to the value of a population parameter, which is uncertain but which is deemed to hold true for a wide, perhaps infinite, number of observed cases.

Two main areas of the field of statistics

1) Descriptive
2) Inferential

Goal of descriptive statistics

Data reduction, that is, reducing a large set of observed values to a smaller, more intelligible set of numbers and plots. Tells the story of the forest rather than the individual trees.

Descriptive tools (3)

1) Frequency distribution

2) Measures of central tendency

3) Measures of variation

Measures of central tendency

1) Average or mean

2) Median
3) Mode


Used to quantify uncertainty. The relative frequency of an event over the long run--the very long run.

Relative frequency

Number of times the event actually occurred divided by the total number of opportunities on which the event could have occurred.

Kinds of probabilities (2)

1) Theoretical

2) Empirical

Theoretical probabilities

Can be determined with a good degree of confidence purely on logical grounds. They are derived independent of prior experience, most often based on arguments of symmetry. E.G. coin toss, royal flush, etc.

Empirical probabilities

Based on observed frequencies. Technical analysis is concerned with this type of probability.

Sampling distribution

The probability distribution of a random variable that happens to be a sample statistic. It is infinite-sized: it describes the values the statistic would take across all possible samples of a given size.

Data distribution of the population

An infinite-sized distribution comprised of all possible daily rule returns, which we assume extends into the immediate practical future.

Data distribution of the sample

A distribution comprised of a finite number (N) of daily returns from the past.

Central Limit Theorem

States that as the size of a sample gets larger, the sampling distribution of the mean, with some qualifications, converges towards a specific shape irrespective of the shape of the population distribution. In other words, no matter how weirdly shaped the distribution of the data in the parent population, the shape of the sampling distribution approaches a specific shape that statisticians refer to as the normal or Gaussian distribution.

Standard error of the mean

Equal to the standard deviation of the population divided by the square root of the sample size.
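A minimal sketch of the formula above, in Python. The population standard deviation is rarely known, so this example assumes it is approximated by the sample standard deviation. The function name and the return figures are hypothetical, chosen only for illustration.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimate the standard error of the mean from one sample.

    Assumption: the unknown population standard deviation is
    approximated by the sample standard deviation.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical daily returns, in percent (nine observations)
returns = [0.5, -0.2, 0.1, 0.4, -0.3, 0.2, 0.0, 0.3, -0.1]
se = standard_error_of_mean(returns)
```

Because the sample size sits under a square root, quadrupling the number of observations only halves the standard error.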

Statistical hypothesis

A conjecture about the value of a population parameter.

Null hypothesis

Asserts that nothing new has been discovered.

Alternative hypothesis

The hypothesis we would like to prove. It asserts the discovery of important new knowledge.

Occam's Razor

A principle that says if a phenomenon can be explained by more than one hypothesis, then the simplest hypothesis is more likely to be correct.

Factors affecting width of sampling distribution (2)

1) The amount of variation within the parent population which gave rise to the sample - the greater the variability of the data comprising the population, the larger the width of the sampling distribution.
2) The number of observations comprising the sample - the larger the number of observations comprising the sample, the smaller the width of the sampling distribution.

Conditional probability

A probability that is contingent on the existence of a specified condition. A probability that is conditional upon some other fact being true. In a hypothesis test, this conditional probability is given the special name p-value.

P-value or statistical significance of the test

Probability that a test statistic at least as extreme as the observed value could have occurred, conditioned upon (given that) the hypothesis being tested is true. The smaller the p-value, the greater our justification for calling into question the truth of the null hypothesis. If the null hypothesis is rejected on the strength of this p-value, it can also be interpreted as the probability that the null hypothesis is being erroneously rejected when it is in fact true.

Type I Error

Occurs when a low p-value leads us to reject the null hypothesis but in reality it is true. Fooled by randomness. Leads to the use of a useless rule, thus exposing capital to risk without the prospect of compensation. The more serious of the two errors.

Type II Error

Occurs when a high p-value leads us to retain the null hypothesis when it is in fact false. Causes a useful rule to be ignored, resulting in lost trading opportunities. The less serious of the two errors.

Computer based methods for estimating the shape of the sampling distribution of the test statistic (2)

1) Bootstrapping

2) Monte Carlo permutation

Bootstrap method

Asserts that the population distribution of rule returns has an expected value of zero or less.
Utilizes a daily history of rule returns and resampling with replacement. Only holds true if the number of observations in each resample is equal to the number of observations in the original sample.
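The bootstrap test described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name, the default of 5000 resamples, and the fixed seed are all hypothetical choices, and the returns are zero centered first so that they conform to the null hypothesis.

```python
import random


def bootstrap_p_value(daily_returns, n_resamples=5000, seed=0):
    """One-sided bootstrap p-value for the mean daily rule return.

    Null hypothesis: the expected return is zero or less.
    Assumptions: zero-centered data, resampling with replacement,
    and each resample the same size as the original sample.
    """
    rng = random.Random(seed)
    n = len(daily_returns)
    observed_mean = sum(daily_returns) / n
    # Zero centering: shift the returns so their mean is exactly zero.
    centered = [r - observed_mean for r in daily_returns]
    count = 0
    for _ in range(n_resamples):
        resample = rng.choices(centered, k=n)  # with replacement
        if sum(resample) / n >= observed_mean:
            count += 1
    # Fraction of resampled means at least as large as the observed mean
    return count / n_resamples
```

A small returned value means that a mean return as large as the one observed rarely arises when the true expected return is zero.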

Monte Carlo permutation method

Asserts that the rule's output values (+1 and -1) are randomly paired with future market price changes. In other words, it asserts that the rule's output is uninformative noise that could have just as easily been generated by a roulette wheel. Utilizes a daily history of the rule's output values (i.e. a sequence of +1 and -1's) and a daily history of price changes for the market being traded. Pairs output values with market returns without replacement. Tests a claim about the information content of a rule's signal.
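A rough sketch of the permutation test described above. It assumes the rule outputs and the price changes are already aligned, so that each output is paired with the price change it is meant to predict. The function name, the number of permutations, and the seed are hypothetical.

```python
import random


def permutation_p_value(signals, price_changes, n_permutations=5000, seed=0):
    """Monte Carlo permutation p-value for a +1/-1 trading rule.

    Null hypothesis: the rule's outputs are randomly paired with
    market price changes (the rule carries no information).
    """
    if len(signals) != len(price_changes):
        raise ValueError("signals and price_changes must have equal length")
    rng = random.Random(seed)
    n = len(signals)

    def mean_return(changes):
        # Rule return per period = rule output times price change
        return sum(s * c for s, c in zip(signals, changes)) / n

    observed = mean_return(price_changes)
    shuffled = list(price_changes)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # a permutation: pairing without replacement
        if mean_return(shuffled) >= observed:
            count += 1
    return count / n_permutations
```

Shuffling the price changes while holding the rule outputs fixed keeps every observed value in play exactly once, which is what pairing without replacement means here.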

Zero centering adjustment

Makes the mean daily return of the rule equal to zero. In other words, if the rule was able to earn a nonzero return on detrended data, its returns must be zero centered. This serves the purpose of bringing the daily returns into conformity with the null hypothesis, which asserts that their average value is equal to zero.

Noise rule

The random pairing of the rule output values with market changes destroys any predictive power that the rule may have had. The same time series of actual rule output values is paired with (permuted with) numerous scrambled (randomized) versions of market price changes.

Criteria for good estimators (4)

1) Unbiased

2) Consistent

3) Efficient

4) Sufficient

Sufficient estimator criteria

Makes such use of all the available sample data that no other estimator would add any information about the parameter being estimated.

Efficient estimator criteria

Relates to the width of its sampling distribution. The most efficient estimator is the one that produces the narrowest sampling distribution, or smallest standard error. For normally distributed data in large samples, the sampling distribution of the mean is about 80% as wide as that of the median.

Consistent estimator criteria

An estimator is consistent if its value converges to the value of the population parameter as sample size is increased.

Unbiased estimator criteria

An estimator is unbiased if its expected value is equal to the population value, that is, if its deviations from the true population value have an average value of zero.
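The efficiency criterion above can be checked with a small simulation. This sketch assumes normally distributed data and compares the width (standard deviation) of the sampling distribution of the mean with that of the median. The sample size, the number of samples, and the seed are arbitrary illustrative choices.

```python
import random
import statistics


def sampling_width(estimator, sample_size=51, n_samples=4000, seed=0):
    """Standard deviation of an estimator across many simulated samples.

    Assumption: the parent population is standard normal.
    """
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]
    return statistics.stdev(estimates)


# The same seed gives both estimators the same simulated samples.
mean_width = sampling_width(statistics.mean)
median_width = sampling_width(statistics.median)
ratio = mean_width / median_width
```

For normal data, theory puts this ratio near 0.8 in large samples, so the mean is the more efficient of the two estimators. The result can differ for heavy-tailed populations.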

Confidence interval

A range of values that surrounds the point estimate. Combines the information of the point estimate with the information about the estimator's sampling distribution. Accompanied by a probability that tells us how likely it is that the true value of the population parameter falls within the bounds of the confidence interval. The sampling distribution is centered over the sample mean.

Bootstrap percentile method

Method used to derive confidence intervals. Easy to use and generally gives good results.
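A minimal sketch of the percentile method for the mean. The function name, the 95% default, the number of resamples, and the seed are hypothetical choices made for illustration.

```python
import random


def bootstrap_percentile_ci(sample, confidence=0.95, n_resamples=5000, seed=0):
    """Confidence interval for the mean via the bootstrap percentile method.

    Resample with replacement, compute the mean of each resample,
    then read the interval bounds off the sorted resampled means.
    """
    rng = random.Random(seed)
    n = len(sample)
    means = sorted(
        sum(rng.choices(sample, k=n)) / n for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lower_index = round((alpha / 2) * n_resamples)
    upper_index = round((1 - alpha / 2) * n_resamples) - 1
    return means[lower_index], means[upper_index]


# Hypothetical daily returns, in percent
returns = [0.4, -0.1, 0.3, 0.0, 0.6, -0.3, 0.2, 0.5, -0.2, 0.1,
           0.3, -0.4, 0.7, 0.0, 0.2, -0.1, 0.4, 0.1, -0.2, 0.3]
lower, upper = bootstrap_percentile_ci(returns)
```

The bounds are simply the 2.5th and 97.5th percentiles of the resampled means when the confidence level is 95%.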


The tendency of most people to read too much into stereotypes.

Conjunction bias/fallacy

Tendency to judge a smaller subset (a conjunction of two statements) as more likely than the larger set that contains it, when one part of the conjunction is a statement people think is likely. People ignore the conjunction and focus on the part of the sentence they think is likely.

Reading into randomness

This reflects our nature as humans, attempting to find order in a sea of chaos. In the past it has been to our advantage as a species to err on the side of finding patterns in observable data and draw actionable conclusions from them, even when those patterns might not truly exist.

Small sample bias

Individuals tend to put more stock into the observations of a small sample than is warranted. Extreme results are much less likely to occur as the sample size increases. This is because the probability of getting a large number of draws of any one given type goes down as the number of draws increases.

Probability neglect

The tendency of people to overstate the probability of events for which they have a large bank of relevant memories or stories, relative to the probability of other events.

Cognitive illusions or biases

A departure in inference or judgment from objective analysis that leads to a distortion in perception or understanding.

Illusion of talent

People attribute too much of a given observation to an individual's talent and not enough to luck. Closely related to our tendency to see patterns in the outcomes of random processes.

Confirmation bias

The tendency for people to focus on evidence that confirms their beliefs rather than view the entire set of observations objectively.

Illusion of skill

The tendency for people to think they have an ability to execute a particular task when the evidence shows they are not better at it than random chance.

Illusion of superiority

The average person thinks he or she is above average. This illusion affects the process of memory and also affects perceptions.

Illusion of validity

Individuals tend to believe the conclusions they draw from a brief set of observables are more likely to be valid than they actually are, and this tendency holds true even in the face of clear evidence to the contrary. Closely resembles the representativeness heuristic.

Illusion of control

Individuals tend to believe they have more control over the outcome of random events than they actually have, even when the events are explicitly constructed as random.

Explanation based decision making

A process in which we collect evidence (usually in a biased fashion), then we construct a story to explain the evidence. This story (not the original evidence) is then used to reach a decision.

Statistical group

Effectively involves asking a large number of people what they think the answer is and taking the mean. Such groups have a good track record when it comes to forecasting.

Three conditions for a statistical group to be a useful device

1) People must be unaffected by others' decisions (effectively their errors must be uncorrelated).

2) The probability of being correct must be independent of the probability of everyone else being correct.

3) The participants must be unaffected by their own vote possibly being decisive.


A situation whereby an individual's action is potentially independent of his private information, and totally dependent upon the observation of others' actions and/or words. One of the key features of cascades is their tendency to exhibit idiosyncrasy. That is to say, the behavior resulting from signals of just the first few individuals drastically affects the behavior of numerous followers. Effectively, cascades are highly path dependent.

Group polarization

The tendency for members of a group to end up in a more extreme position in line with their original beliefs after talking to one another. The increased confidence in view mentioned earlier begins to create feedback into the extremity of view, generally creating a loop of increased confidence in more and more extreme views.

Characteristics of group think (9)

1) A tendency to examine too few alternatives

2) A lack of critical assessment of each other's ideas

3) A high degree of selectivity in information gathering

4) A lack of contingency plans

5) Rationalizing poor decisions

6) An illusion of invulnerability and shared morality by the group

7) Suppressing true feelings and beliefs

8) Maintaining an illusion of unanimity

9) Appointing mind guards (essentially information sentinels) to protect the group from negative information

Three possible routes to reducing group biases

1) Secret ballots
2) Devil's advocate

3) Respect for other group members

Model/G of bubbles

1) Displacement

2) Credit creation

3) Euphoria

4) Critical stage/financial distress

5) Revulsion


Displacement

Generally an exogenous shock that triggers the creation of profit opportunities in some sectors, while shutting down profit availability in other sectors.

Credit creation

Stage in which the boom is further exacerbated by monetary expansion and/or credit creation. Sooner or later demand for the asset will outstrip supply, resulting in the perfectly natural response of price increases. These price increases give rise to yet more investment (both real and financial). A positive feedback loop ensues; new investment leads to increases in income which, in turn, stimulate further investment.


Euphoria

The term given for when speculation for price increase is added to investment for production and sales. Effectively this is momentum trading or the "greater-fool-theory" of investment.

Critical Stage/Financial Distress

The critical stage is the point where a set of insiders decide to take their profits and cash out. Financial distress usually follows straight on from the critical stage (indeed the two can be hard to separate, hence we tacked them together). The term "financial distress" is borrowed from the finance literature where it refers to a situation in which a firm must contemplate the possibility that it may not be able to meet its liabilities. For an economy as a whole, the equivalent condition is an awareness on the part of a considerable segment of the speculating community that a rush for liquidity (out of assets into cash) may develop. Fraud tends to emerge at this stage of the bubble process.


Revulsion

The final stage of the bubble cycle. Refers to the fact that people are so badly scarred by the events in which they were embroiled that they can no longer bring themselves to participate in the market at all. It is clearly related to that most dreadful of current buzzwords - capitulation. Capitulation is described as degenerate panic. One of the hallmarks of the end of the bubble will be a collapse in volumes, a sign that investors have truly lost their faith in the equity culture.

A bubble is more likely to be found when: (5)

1) The ratio of inexperienced to experienced traders is high

2) The uncertainty over fundamental value is greater

3) The lottery characteristics of the security are high (effectively, a small chance of a big payoff increases the likelihood that people will overpay for an asset)

4) Buying on margin is possible

5) Short selling is difficult

Fallacy of composition

Simply put, what may be true at the micro level does not necessarily hold at the macro level.

Three cross-section strategies that should be among the vanguard of those benefiting in a de-bubbling/deflationary period

1) Balance sheet strength

2) Earnings quality

3) Capital expenditure

Balance sheet strength

Given the risks of firms taking on too much debt during the euphoria stage, it should come as little surprise that we feel that balance sheet strength should be a key component of any stock selection process in a post-bubble world.

Earnings Quality

Earnings can be seen as the summation of two components, cash flows (about which we should care) and accruals (accountant tricks).

Capital expenditure

Firms that carry out high levels of cap ex relative to their sales (or total assets) underperform firms that resist the temptation to splurge on pointless expenditures.


The phenomenon that securities that have performed well relative to peers (winners) on average continue to outperform, and securities that have performed relatively poorly (losers) tend to continue to underperform. Its existence is a well-established empirical fact.

Market sentiment

The collective opinion of investors. It is a driving force in the market, yet it is very difficult to measure, and it is even harder to deliver the results in a timely fashion. For that reason, analysts often substitute a combination of volume, open interest, and price for true sentiment, hoping that the recorded actions of traders closely relate to what they are thinking.


An analyst who lies somewhere between the fundamentalist and the technician, basing actions on the behavior of crowds, in this case the market participants. He/she believes that opportunities always lie in the reverse direction from crowd thinking.

Contrary opinion

This alone is not meant to signal a new entry into a position; it only identifies situations that qualify. It lacks the timing. It is more of a filter than a trading system, a means of avoiding risk and finding opportunity.

Types of estimates (2)

1) Point estimate

2) Interval estimate

Point estimate

A single value that approximates a population parameter, for example the rule has an expected return of 10%.

Interval estimate

A range of values within which the population parameter lies with a given level of probability. The following statement would exemplify this: The rule's expected return lies within the range 5% to 15% with a probability of .95.

Null hypothesis

Asserts that zero new knowledge has been discovered, thus the symbol H(0). Presents a better target for falsification because it can be reduced to a single claim about the value of the parameter.

Alternative hypothesis

The one the scientist would like to prove. It asserts the discovery of important new knowledge and is symbolized by H(A). It represents an infinite number of claims about the parameter's value. With no unique value to shoot at, an infinite number of tests would have to be performed to falsify the alternative hypothesis.

Deliberative groups

The group sits down and debates the issues before arriving at a conclusion. Deliberation tends to reduce variance. After talking together, the members of a group will tend to reach a consensus; hence the variance of views is diminished.

Mutual enhancement

A situation where those who share information that confirms the group's views are seen as competent and more credible by their peers and by themselves.

Correlation strength chart

 .00-.19 “very weak”

 .20-.39 “weak”

 .40-.59 “moderate”

 .60-.79 “strong”

 .80-1.0 “very strong”

Moving Volume Weighted Average Price (MVWAP)

Widely used by traders throughout the years as a benchmark of price. Use MVWAP as a means to determine the market's average trading price during a given period by taking into account both the price and the volume or number of shares being traded during the same period.
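One way to sketch the calculation in Python. Definitions of MVWAP vary between platforms, so treat this as an assumption: here it is the volume-weighted average price over a rolling window of the most recent bars. The function name and parameters are hypothetical.

```python
def moving_vwap(prices, volumes, window):
    """Rolling volume-weighted average price over `window` bars.

    Assumption: each output value is sum(price * volume) / sum(volume)
    taken over the most recent `window` bars. A window with zero total
    volume yields None.
    """
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have equal length")
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for end in range(window, len(prices) + 1):
        p = prices[end - window:end]
        v = volumes[end - window:end]
        total_volume = sum(v)
        if total_volume == 0:
            result.append(None)
        else:
            result.append(sum(pi * vi for pi, vi in zip(p, v)) / total_volume)
    return result
```

Bars with heavier volume pull the average toward their price, which is what separates this benchmark from a simple moving average.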

Parabolic Stop and Reverse (PSAR)

An indicator that can help with both market entries and placing stops in trending markets. It is a technical indicator that was invented by Welles Wilder with the goal of finding points of reversals in trending markets. If price is below a red PSAR dot this would be indicative of a continued down trend as lower lows are created. When price is moving higher the opposite is true.

Total risk

Total Risk = Market or Macroeconomic risk + Firm-Specific Risk

Total Risk = Systematic Risk + Unsystematic Risk

Total Risk = Undiversifiable Risk + Diversifiable Risk

Measured by the standard deviation of returns.

Calculate Implied Range

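Implied Range = Index price x volatility x square root of (n/252), where volatility is the annualized volatility, n is the number of trading days in the period, and 252 is the number of trading days in a year.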
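A short worked sketch of the formula. It assumes volatility is quoted as an annualized decimal (0.20 for 20%) and n is a count of trading days. The function name and the sample figures are hypothetical.

```python
import math


def implied_range(index_price, annual_volatility, n_days, trading_days_per_year=252):
    """Implied price range over n trading days.

    Assumptions: annual_volatility is an annualized decimal and
    n_days counts trading days, not calendar days.
    """
    return index_price * annual_volatility * math.sqrt(n_days / trading_days_per_year)


# Hypothetical index at 1000 with 20% annualized volatility, 63 trading days
quarter_range = implied_range(1000.0, 0.20, 63)
```

With these inputs the square root term is one half, so the implied range is roughly 100 index points over the quarter.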

Money Flow Index

The Money Flow Index (MFI) is an oscillator that uses both price and volume to measure buying and selling pressure. Created by Gene Quong and Avrum Soudack, MFI is also known as volume-weighted RSI. MFI starts with the typical price for each period. Money flow is positive when the typical price rises (buying pressure) and negative when the typical price declines (selling pressure). A ratio of positive and negative money flow is then plugged into an RSI formula to create an oscillator that moves between zero and one hundred. As a momentum oscillator tied to volume, the Money Flow Index (MFI) is best suited to identify reversals and price extremes with a variety of signals.
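The steps in the description above can be sketched as follows. The 14-period default is a commonly used setting rather than something stated on this card, and the handling of a window with no money flow at all (returning 50) is an assumption of this sketch. The function name is hypothetical.

```python
def money_flow_index(highs, lows, closes, volumes, period=14):
    """Money Flow Index values, one per complete `period`-bar window.

    Typical price = (high + low + close) / 3.
    Raw money flow = typical price * volume, counted as positive when
    the typical price rose from the prior bar and negative when it fell.
    """
    typical = [(h + l + c) / 3 for h, l, c in zip(highs, lows, closes)]
    positive, negative = [], []
    for i in range(1, len(typical)):
        raw = typical[i] * volumes[i]
        if typical[i] > typical[i - 1]:
            positive.append(raw)
            negative.append(0.0)
        elif typical[i] < typical[i - 1]:
            positive.append(0.0)
            negative.append(raw)
        else:
            positive.append(0.0)
            negative.append(0.0)
    mfi = []
    for end in range(period, len(positive) + 1):
        pos = sum(positive[end - period:end])
        neg = sum(negative[end - period:end])
        if pos + neg == 0:
            mfi.append(50.0)  # assumption: no flow at all reads as neutral
        else:
            # Same as 100 - 100 / (1 + pos / neg), without dividing by zero
            mfi.append(100.0 * pos / (pos + neg))
    return mfi
```

The result stays between zero and one hundred: all-positive flow gives 100 and all-negative flow gives 0.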

Step-Forward testing

The repeated process of choosing parameter values from test of in-sample data, then applying the results to out-of-sample data. Also called walk-forward testing and blind simulation.
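A small sketch of how the in-sample and out-of-sample windows might be laid out. It assumes fixed-length windows that step forward by the length of the out-of-sample segment. Other schemes, such as an expanding in-sample window, are equally possible, and the function name is hypothetical.

```python
def walk_forward_windows(n_obs, in_sample, out_of_sample):
    """Return (in-sample, out-of-sample) index ranges for step-forward testing.

    Assumption: rolling windows of fixed length, stepping forward by
    the size of the out-of-sample segment each time.
    """
    windows = []
    start = 0
    while start + in_sample + out_of_sample <= n_obs:
        fit = range(start, start + in_sample)
        test = range(start + in_sample, start + in_sample + out_of_sample)
        windows.append((fit, test))
        start += out_of_sample
    return windows


# Hypothetical: 10 observations, fit on 4, then test on the next 2
windows = walk_forward_windows(10, 4, 2)
```

Parameters are chosen on each in-sample range and then applied, untouched, to the out-of-sample range that follows it. Joining the out-of-sample results gives the blind simulation.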

Standardizing test results (3)

1) Annualizing all values
2) Risk adjusting - profits/risk or information ratio

3) Adjusting for standard error - subtract standard error from the current result

One way to reduce the possibility of loss from a price shock is to be out of the market as much as possible. Two trading rules:

1) Do not hold the position any longer than necessary

2) Try to earn as much as possible while investing as little as possible.