Mathematics Installation Art as Mathematics or Performance Art Pins New Trends in Statistics (2010)

 

 

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

First passage time for multivariate jump-diffusion processes in finance and other areas of applications

Authors: Di Zhang and Roderick V. N. Melnik

Abstract. The first passage time (FPT) problem is an important problem with a wide range of applications in science, engineering, economics, and industry. Mathematically, such a problem can be reduced to estimating the probability that a stochastic process first reaches a boundary level. In most important applications in the financial industry, the FPT problem does not have an analytical solution and the development of efficient numerical methods becomes the only practical avenue for its solution. Most of our examples in this contribution are centered around the evaluation of default correlations in credit risk analysis, where we are concerned with the joint defaults of several correlated firms, a task that is reducible to an FPT problem. This task represents a great challenge for jump-diffusion processes (JDP). In this contribution, we further develop our previous fast Monte Carlo method for the case of multivariate (and correlated) JDP. This generalization allows us, among other things, to evaluate the default events of several correlated assets based on a set of empirical data. The developed technique is an efficient tool for a number of financial, economic, and business applications, such as credit analysis, barrier option pricing, macroeconomic dynamics, and the evaluation of risk, as well as for a number of other areas of applications in science and engineering, where the FPT problem arises.
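To make the reduction to a first passage problem concrete, here is a minimal Python sketch of a crude, time-stepped Monte Carlo estimate for two correlated firms. It is not the authors' fast Monte Carlo method. It assumes a Merton-style model (Gaussian diffusion plus normally distributed jumps arriving at a Poisson rate), and the function name, the barrier convention, and every parameter value are illustrative assumptions rather than anything taken from the paper.

```python
import math
import random


def simulate_joint_default(n_paths=2000, n_steps=250, horizon=1.0,
                           mu=(0.05, 0.03), sigma=(0.20, 0.25), rho=0.5,
                           jump_rate=0.5, jump_mean=-0.10, jump_std=0.15,
                           log_barrier=(-0.4, -0.4), seed=0):
    """Estimate default probabilities for two firms by crude time stepping.

    x[i] is the log asset value of firm i relative to its starting value.
    Firm i defaults the first time x[i] <= log_barrier[i] before `horizon`.
    All defaults above are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    count = [0, 0]
    count_both = 0
    for _path in range(n_paths):
        x = [0.0, 0.0]
        defaulted = [False, False]
        for _step in range(n_steps):
            # Correlated Gaussian shocks for the diffusion parts.
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            for i, z in enumerate((z1, z2)):
                if defaulted[i]:
                    continue
                x[i] += (mu[i] - 0.5 * sigma[i] ** 2) * dt + sigma[i] * sqrt_dt * z
                # Poisson arrivals approximated by at most one jump per step.
                if rng.random() < jump_rate * dt:
                    x[i] += rng.gauss(jump_mean, jump_std)
                if x[i] <= log_barrier[i]:
                    defaulted[i] = True
            if defaulted[0] and defaulted[1]:
                break
        count[0] += defaulted[0]
        count[1] += defaulted[1]
        count_both += defaulted[0] and defaulted[1]
    p1, p2, p12 = count[0] / n_paths, count[1] / n_paths, count_both / n_paths
    denom = math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    # Correlation of the two default indicators; undefined if either is degenerate.
    corr = (p12 - p1 * p2) / denom if denom > 0 else float("nan")
    return {"p1": p1, "p2": p2, "p_both": p12, "default_corr": corr}
```

Calling `simulate_joint_default()` returns the two marginal default probabilities, the joint default probability, and the implied correlation of the default indicators. A scheme like this only checks the barrier at grid points, so it can miss crossings between steps, which is one reason more careful methods are of interest.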

Find the paper here

Statistical Analysis Consulting Company Goes Beyond Numbers

Famous Statistics Mathematicians Consult & Dig New Trends in Statistics (2010)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Using the bootstrap to quantify the authority of an empirical ranking

Authors: Peter Hall and Hugh Miller

Abstract. The bootstrap is a popular and convenient method for quantifying the authority of an empirical ordering of attributes, for example of a ranking of the performance of institutions or of the influence of genes on a response variable. In the first of these examples, the number, p, of quantities being ordered is sometimes only moderate in size; in the second it can be very large, often much greater than sample size. However, we show that in both types of problem the conventional bootstrap can produce inconsistency. Moreover, the standard n-out-of-n bootstrap estimator of the distribution of an empirical rank may not converge in the usual sense; the estimator may converge in distribution, but not in probability. Nevertheless, in many cases the bootstrap correctly identifies the support of the asymptotic distribution of ranks. In some contemporary problems, bootstrap prediction intervals for ranks are particularly long, and in this context, we also quantify the accuracy of bootstrap methods, showing that the standard bootstrap gets the order of magnitude of the interval right, but not the constant multiplier of interval length. The m-out-of-n bootstrap can improve performance and produce statistical consistency, but it requires empirical choice of m; we suggest a tuning solution to this problem. We show that in genomic examples, where it might be expected that the standard, “synchronous” bootstrap will successfully accommodate nonindependence of vector components, that approach can produce misleading results. An “independent component” bootstrap can overcome these difficulties, even in cases where components are not strictly independent.
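As a rough illustration of the idea, the sketch below ranks p attributes by their sample means and uses an m-out-of-n bootstrap to form percentile intervals for each rank. It is a simplified reading of the abstract, not the authors' procedure: the function names are hypothetical, m is supplied by the user rather than tuned empirically, and whole rows are resampled together, which appears to correspond to what the abstract calls the "synchronous" bootstrap.

```python
import random


def ranks_desc(values):
    """Return ranks where 1 marks the largest value; ties keep index order."""
    order = sorted(range(len(values)), key=lambda j: -values[j])
    ranks = [0] * len(values)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def bootstrap_rank_intervals(data, m, n_boot=1000, level=0.9, seed=0):
    """Percentile intervals for the rank of each column mean.

    data: list of n rows, each a list of p numbers.
    m:    resample size; m == n gives the standard n-out-of-n bootstrap.
    """
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    draws = [[] for _ in range(p)]
    for _ in range(n_boot):
        # Resample m whole rows with replacement.
        sample = [data[rng.randrange(n)] for _ in range(m)]
        means = [sum(row[j] for row in sample) / m for j in range(p)]
        for j, r in enumerate(ranks_desc(means)):
            draws[j].append(r)
    alpha = (1.0 - level) / 2.0
    intervals = []
    for j in range(p):
        s = sorted(draws[j])
        lo = s[int(alpha * (n_boot - 1))]
        hi = s[int((1.0 - alpha) * (n_boot - 1))]
        intervals.append((lo, hi))
    return intervals
```

With closely matched attributes the intervals become wide, which is the situation the abstract describes as problematic for the standard bootstrap.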

Find the abstract here – download the paper here

Famous Statistics Consultancy Company Consults New Trends in Statistics (2010)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Credible sets for risk ratios in over-reported two-sample binomial data using the double sampling scheme

Authors: Dewi Rahardja and Dean M. Young

Abstract. We consider point and interval estimation for risk ratios based on two independent samples of binomial data subject to false positive misclassification. For such data it is well known that the model is unidentifiable. We consider incorporating training data obtained by a double sampling scheme to make the model identifiable. In this identifiable model, we propose a Bayesian method to make statistical inference. In particular, we derive an easy-to-implement closed-form algorithm to draw from the posterior distributions. The algorithm is illustrated using a real data example and further examined via Monte Carlo simulation studies.

Find the paper here

Statistician Wanted: Famous Statistics Consultant Company Hits New Trends in Statistics (2010)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Prediction in linear mixed models

Authors: Sue Welham, Brian Cullis, Beverley Gogel, Arthur Gilmour, Robin Thompson

Abstract. Following estimation of effects from a linear mixed model, it is often useful to form predicted values for certain factor/variate combinations. The process has been well defined for linear models, but the introduction of random effects into the model means that a decision has to be made about the inclusion or exclusion of random model terms from the predictions. This paper discusses the interpretation of predictions formed including or excluding random terms. Four datasets are used to illustrate circumstances where different prediction strategies may be appropriate: in an orthogonal design, an unbalanced nested structure, a model with cubic smoothing spline terms and for kriging after spatial analysis. The examples also show the need for different weighting schemes that recognize nesting and aliasing during prediction, and the necessity of being able to detect inestimable predictions.

Find the paper here

Statistics Consultancy Firm (Consultant Company) Shoots New Trends in Statistics (2010)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Rank tests in heteroscedastic multi-way HANOVA

Authors: Haiyan Wang, Michael G. Akritas

Abstract. This article develops rank tests for the nonparametric main factor effects and interactions in multi-way high-dimensional analysis of variance when the cell distributions are completely unspecified. The design can be balanced or unbalanced with the cell sample sizes fixed or tending to infinity. An arbitrary number of factors and all types of ordinal data are allowed. This extends the use of rank methods to the Neyman-Scott and triangular array problems. The asymptotic distribution of the rank statistics is obtained by showing their asymptotic equivalence to corresponding expressions based on the asymptotic rank transform. Compared with test procedures based on the original observations, the proposed rank procedures are free of moment conditions, converge to their limiting distribution faster, and have better power when the underlying distributions are heavy tailed or skewed. These advantages are demonstrated by simulations and an application to a real data set.

Find the paper here

Statistics Consulting Firm & Consultant Company Picks New Trends in Statistics (2010)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Cumulative correspondence analysis of ordered categorical data from industrial experiments

Authors: Luigi D’Ambra, Onur Köksoy, Biagio Simonetti

Abstract. Most studies of quality improvement deal with ordered categorical data from industrial experiments. Accounting for the ordering of such data plays an important role in effectively determining the optimal combination of factor levels. This paper uses correspondence analysis to develop a procedure to improve the ordered categorical response in a multifactor state system based on Taguchi’s statistic. Users may find the proposed procedure attractive because it offers a simple and popular statistical tool for graphically identifying the really important factors and determining the levels that improve process quality. A case study for optimizing the polysilicon deposition process in a very large-scale integrated circuit is provided to demonstrate the effectiveness of the proposed procedure.

Find the paper here

Statistical Consulting Firms & Consultants Scoop New Trends in Statistics (2009)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Multivariate Weibull mixtures with proportional hazard restrictions for dwell-time-based session clustering with incomplete data

Authors: Patrick Mair and Marcus Hudec

Abstract. Emanating from classical Weibull mixture models, we propose a framework for clustering survival data with various more parsimonious models by imposing restrictions on the distributional parameters. We show that these restrictions on the Weibull mixtures correspond to different proportional hazard restrictions across mixture components and Web page areas. A parametric cluster approach based on the EM algorithm is carried out on a multivariate data set. Our model set-up encompasses incomplete-data structures as well as censored observations. We apply the methodology to retail data stemming from a global e-commerce company. Sessions are clustered with respect to the dwell times that a user spends on certain page areas. The cluster solution that is found allows for a detailed examination of the navigation behaviour in terms of the hazard and survivor functions within each component.
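For readers who want the link between parameter restrictions and proportional hazards spelled out, here is a short worked equation. It uses the standard shape/scale parameterization of the Weibull distribution, with shape k and scale λ; this notation is an assumption and may differ from the paper's.

```latex
% Survivor and hazard functions of a Weibull dwell time T (shape k, scale \lambda)
S(t) = \exp\!\left\{-\left(\tfrac{t}{\lambda}\right)^{k}\right\}, \qquad
h(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}, \qquad t > 0.

% Two components sharing the same shape k but with scales \lambda_1, \lambda_2:
\frac{h_1(t)}{h_2(t)}
  = \frac{(k/\lambda_1)\,(t/\lambda_1)^{k-1}}{(k/\lambda_2)\,(t/\lambda_2)^{k-1}}
  = \left(\frac{\lambda_2}{\lambda_1}\right)^{k}.
```

The ratio does not depend on t, so restricting components (or page areas) to a common shape parameter makes their hazards proportional. With different shapes, the ratio changes over time and proportionality is lost.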

Find the paper here or read it here

Statistical Analysis Consultants and Company Cut New Trends in Statistics (2009)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Generalized Neyman–Pearson optimality of empirical likelihood for testing parameter hypotheses

Authors: Taisuke Otsu

Abstract. This paper studies the Generalized Neyman–Pearson (GNP) optimality of empirical likelihood-based tests for parameter hypotheses. The GNP optimality focuses on the large deviation errors of tests, i.e., the convergence rates of the type I and II error probabilities under fixed alternatives. We derive (i) the GNP optimality of the empirical likelihood criterion (ELC) test against all alternatives, and (ii) a necessary and a sufficient condition for the GNP optimality of the empirical likelihood ratio (ELR) test against each alternative.

Find the paper here

Statistical Consulting & Analysis Company Digs New Trends in Statistics (2009)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Estimation of regression parameters in the presence of outliers in the response

Authors: Sugata Sen Roy, Sibnarayan Guria

Abstract. In this paper, we consider a regression model with non-spherical covariance structure and outliers in the response. The generalized least squares estimator obtained from the full data set is generally not used in the presence of outliers and an estimator based on only the non-outlying observations is preferred. Here we propose as an estimator a convex combination of the full set and the deleted set estimators and compare its performance with the other two.
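One natural way to write the proposed estimator is sketched below. The symbols are assumptions introduced here for illustration: Σ is the error covariance matrix, the subscript D marks the rows (and the corresponding covariance submatrix) left after deleting the outlying observations, and w is the mixing weight. The paper's own notation and its choice of weight may differ.

```latex
% Full-set and deleted-set generalized least squares estimators
\hat{\beta}_{F} = (X^{\top}\Sigma^{-1}X)^{-1} X^{\top}\Sigma^{-1} y, \qquad
\hat{\beta}_{D} = (X_{D}^{\top}\Sigma_{D}^{-1}X_{D})^{-1} X_{D}^{\top}\Sigma_{D}^{-1} y_{D}.

% Convex combination of the two
\hat{\beta}_{w} = w\,\hat{\beta}_{F} + (1 - w)\,\hat{\beta}_{D}, \qquad 0 \le w \le 1.
```

Setting w = 1 recovers the full-set estimator and w = 0 the deleted-set estimator, so the combination lets the analyst trade off between using all the data and discarding the suspected outliers.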

Find the paper here

Statistical Analysis Consulting Company Mines New Trends in Statistics (2009)

Series: Statistical Analysis Consulting Company Digs New Papers & Trends in Statistics

Today’s paper:

Randomization methods for assessing data analysis results on real-valued matrices

Authors: Markus Ojala, Niko Vuokko, Aleksi Kallio, Niina Haiminen, Heikki Mannila

Abstract. Randomization is an important technique for assessing the significance of data analysis results. Given an input dataset, a randomization method samples at random from some class of datasets that share certain characteristics with the original data. The measure of interest on the original data is then compared to the measure on the samples to assess its significance. For certain types of data, e.g., gene expression matrices, it is useful to be able to sample datasets that have the same row and column distributions of values as the original dataset. Testing whether the results of a data mining algorithm on such randomized datasets differ from the results on the true dataset tells us whether the results on the true data were an artifact of the row and column statistics, or due to some more interesting phenomena in the data. We study the problem of generating such randomized datasets. We describe methods based on local transformations and Metropolis sampling, and show that the methods are efficient and usable in practice. We evaluate the performance of the methods both on real and generated data. We also show how our methods can be applied to a real data analysis scenario on DNA microarray data. The results indicate that the methods work efficiently and are usable in significance testing of data mining results on real-valued matrices.
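The general recipe in the first few sentences (sample randomized datasets, recompute the measure, compare with the original) can be sketched in a few lines of Python. Note the substitution: the null model used here simply shuffles values within each column, which preserves column value distributions exactly but not row distributions. The paper's methods, based on local transformations and Metropolis sampling, aim to preserve both and are not reproduced here. All function names are hypothetical.

```python
import math
import random


def column_shuffled(matrix, rng):
    """Return a copy of the matrix with each column permuted independently."""
    cols = [list(col) for col in zip(*matrix)]
    for col in cols:
        rng.shuffle(col)
    return [list(row) for row in zip(*cols)]


def empirical_p_value(matrix, statistic, n_samples=999, seed=0):
    """One-sided empirical p-value of statistic(matrix) under column shuffling."""
    rng = random.Random(seed)
    observed = statistic(matrix)
    exceed = sum(
        1 for _ in range(n_samples)
        if statistic(column_shuffled(matrix, rng)) >= observed
    )
    # The +1 terms count the original data as one of the samples.
    return (exceed + 1) / (n_samples + 1)


def pearson_first_two_columns(matrix):
    """Example statistic: Pearson correlation between columns 0 and 1."""
    xs = [row[0] for row in matrix]
    ys = [row[1] for row in matrix]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

A small p-value from `empirical_p_value(matrix, pearson_first_two_columns)` suggests the observed correlation is not explained by the column value distributions alone. Any other data analysis result that can be summarized as a number can be passed in as `statistic`.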

Find the paper here or read it here
