Ali Hadi's Research Activities


Areas of Research Interests

I am interested in solving practical problems in statistics and related fields (e.g., applied probability, computer science, mathematics, and engineering). My publications include four books and more than seventy articles. The methods for outlier detection, the influence measure, and the Potential-Residual Plot have been implemented in several statistics packages (e.g., Data Desk, Stata, and SYSTAT). My areas of research interest include:



Robust Statistics and Outlier Detection

Although it is customary to assume that data are homogeneous, in fact they often contain outliers or subgroups. Scientists and philosophers have recognized for at least 380 years that real data are not homogeneous and that the identification of outliers is an important step in the progress of scientific understanding. Methods that deal with robust estimation and outlier detection are presented in the following articles:

Robust Regression Methods:

Detection of Outliers in Large Data Sets:

Detection of Outliers in Multivariate Data:

Detection of Outliers in Regression Data:

Graphical Methods for the Detection of Outliers:



Parameter and Quantile Estimation



Fatigue and Lifetime Data Analysis



Extreme Value Distributions



Perturbed Eigenvalue Problem



Generalized Inverses



Statistical Analysis of Employment Discrimination Data



Probability



Neural and Functional Networks



Bayesian and Markov Networks



Software Available


S-PLUS Code:
function(X) {
# -----------------------------------------------------------------
#  Hadi, Ali S. (1994), "A Modification of a Method for the
#  Detection of Outliers in Multivariate Samples," Journal of the
#  Royal Statistical Society (B), 2, 393-396.
# -----------------------------------------------------------------
  n <- dim(X) [1]
  p <- dim(X) [2]
  h <- trunc((n + p + 1)/2)
  id <- 1:n
  r <- p
  out <- 0
  cf <- (1 + ((p + 1)/(n - p)) + (2/(n - 1 - (3*p))) )^2
# cf <- (1 + ((p + 1)/(n - p)) + (1/(n - p - h)) )^2
  alpha <- 0.05
  tol <- max(10^-(p+5), 10^-12)
# -----------------------------------------------------------------
# **  Compute Mahalanobis distance
# -----------------------------------------------------------------
  C <- apply(X, 2, mean)
  S <- var(X)
  if (det(S) < tol) stop("covariance matrix is singular or nearly singular")
  D <- mahalanobis(X, C, S)
  mah.out <- 0
  cv <- qchisq(1-(alpha/n), p)
  for (i in 1:n) if (D[i] >= cv) mah.out <- cbind(mah.out, i)
  mah.out <- mah.out[-1]
  mah <- sqrt(D)
  Xbar <- C
  Covariance <- S
# ----------------------------------------------------------------
#  **  Step 0
# ----------------------------------------------------------------
#  **  Compute Di(Cm, Sm)
  C <- apply(X, 2, median)
  C <- t(array(C, dim = c(n, p)))
  Y <- X - C
  S <- ((n - 1)^-1)*(t(Y) %*% Y)
  D <- mahalanobis(X, C[1, ], S)
  Z <- sort.list(D)
# ----------------------------------------------------------------
#  **  Compute Di(Cv, Sv)
  repeat {
    Y <- X[Z[1:h], ]
    C <- apply(Y, 2, mean)
    S <- var(Y)
    if (det(S) > tol) {
       D <- mahalanobis(X, C, S)
       Z <- sort.list(D); break }
    else h <- h + 1
    }
# ----------------------------------------------------------------
#  **  Step 1
# ----------------------------------------------------------------
  repeat {
    r <- r + 1
    if ( h < r) break
    Y <- X[Z[1:r],]
    C <- apply(Y, 2, mean)
    S <- var(Y)
    if (det(S) > tol) {
       D <- mahalanobis(X, C, S)
       Z <- sort.list(D) }
    }
# ----------------------------------------------------------------
#  **  Step 3
# ----------------------------------------------------------------
#  **  Compute Di(Cb, Sb)
  repeat {
    Y <- X[Z[1:h],]
    C <- apply(Y, 2, mean)
    S <- var(Y)
    if (det(S) > tol) {
       D <- mahalanobis(X, C, S)
       Z <- sort.list(D)
       if (D[Z[h + 1]] >= (cf*qchisq(1-(alpha/n), p))) {
            out <- Z[(h + 1) : n]
            break }
       else { h <- h + 1
              if (n <= h) break }
       }
    else { h <- h + 1
          if (n <= h) break }
    }
  D <- sqrt(D/cf)
  dst <- cbind(id, mah, D)
  Outliers <- out
  Cb <- C;
  Sb <- S
  Distances <- dst
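  # Return all results as one named list
  return(list(Xbar = Xbar, Covariance = Covariance, mah.out = mah.out,
              Outliers = Outliers, Cb = Cb, Sb = Sb, Distances = Distances))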
}
# ----------------------------------------------------------------
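
The first stage of the function above (the classical Mahalanobis-distance screen with a Bonferroni-adjusted chi-square cutoff) can be tried on its own. The following is a minimal sketch in R, not part of the original code: the simulated data and the names `center` and `flagged` are assumptions made for illustration only.

```r
# Illustrative sketch only: simulated data and hypothetical variable names.
set.seed(1)
n <- 50
p <- 3
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
X[1:3, ] <- X[1:3, ] + 10        # plant three shifted observations in rows 1-3

alpha <- 0.05
center <- apply(X, 2, mean)      # classical mean vector
S <- var(X)                      # classical covariance matrix
D <- mahalanobis(X, center, S)   # squared Mahalanobis distances
cv <- qchisq(1 - alpha/n, p)     # Bonferroni-adjusted chi-square cutoff
flagged <- which(D >= cv)        # observations at or beyond the cutoff
print(flagged)
```

Because the mean and covariance used here are computed from all observations, including any outliers, this classical screen can fail to flag them (masking). The later steps of the function above instead recompute the center and covariance from a subset of observations ordered by distance.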
