Mathematics Installation Art as Mathematics or Performance Art Pins New Trends in Statistics (2010)
Series: Statistics Analysis Consulting Company Digs New Papers & Trends in Statistics
Today’s paper:
First passage time for multivariate jump-diffusion processes in finance and other areas of applications
Authors: Di Zhang and Roderick V. N. Melnik
Abstract. The first passage time (FPT) problem is an important problem with a wide range of applications in science, engineering, economics, and industry. Mathematically, such a problem can be reduced to estimating the probability of a stochastic process first to reach a boundary level. In most important applications in the financial industry, the FPT problem does not have an analytical solution and the development of efficient numerical methods becomes the only practical avenue for its solution. Most of our examples in this contribution are centered around the evaluation of default correlations in credit risk analysis, where we are concerned with the joint defaults of several correlated firms, the task that is reducible to a FPT problem. This task represents a great challenge for jump-diffusion processes (JDP). In this contribution, we develop further our previous fast Monte Carlo method in the case of multivariate (and correlated) JDP. This generalization allows us, among other things, to evaluate the default events of several correlated assets based on a set of empirical data. The developed technique is an efficient tool for a number of financial, economic, and business applications, such as credit analysis, barrier option pricing, macroeconomic dynamics, and the evaluation of risk, as well as for a number of other areas of applications in science and engineering, where the FPT problem arises.
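To make the first passage time idea concrete, here is a minimal sketch of a crude Monte Carlo estimate for a single jump-diffusion path hitting a lower barrier. It is not the authors' fast Monte Carlo method and it is univariate rather than multivariate. The function name, the parameter names, the normal jump sizes, and the fixed time grid are all assumptions made for illustration.

```python
import math
import random


def first_passage_probability(x0, barrier, mu, sigma, jump_rate, jump_mean,
                              jump_std, horizon, n_steps=250, n_paths=2000,
                              seed=0):
    """Estimate P(path falls to `barrier` or below before `horizon`).

    Assumed model (illustrative only): Brownian motion with drift `mu` and
    volatility `sigma`, plus Poisson jumps at rate `jump_rate` whose sizes
    are Normal(jump_mean, jump_std).
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    hits = 0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            # Diffusion increment over one time step.
            x += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            # At most one jump per step, a small-dt approximation.
            if rng.random() < jump_rate * dt:
                x += rng.gauss(jump_mean, jump_std)
            # The barrier is only checked on the grid, so crossings
            # between grid points are missed.
            if x <= barrier:
                hits += 1
                break
    return hits / n_paths


if __name__ == "__main__":
    p = first_passage_probability(x0=0.0, barrier=-0.3, mu=0.0, sigma=0.2,
                                  jump_rate=1.0, jump_mean=-0.1, jump_std=0.1,
                                  horizon=1.0)
    print("estimated first passage probability:", p)
```

Checking the barrier only on the time grid tends to understate the hitting probability, which is one reason the abstract stresses the need for more efficient schemes.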
Find the paper here
Statistics Analysis Consulting Company Goes Beyond Numbers
Famous Statistics Mathematicians Consult & Dig New Trends in Statistics (2010)
Today’s paper:
Using the bootstrap to quantify the authority of an empirical ranking
Authors: Peter Hall and Hugh Miller
Abstract. The bootstrap is a popular and convenient method for quantifying the authority of an empirical ordering of attributes, for example of a ranking of the performance of institutions or of the influence of genes on a response variable. In the first of these examples, the number, p, of quantities being ordered is sometimes only moderate in size; in the second it can be very large, often much greater than sample size. However, we show that in both types of problem the conventional bootstrap can produce inconsistency. Moreover, the standard n-out-of-n bootstrap estimator of the distribution of an empirical rank may not converge in the usual sense; the estimator may converge in distribution, but not in probability. Nevertheless, in many cases the bootstrap correctly identifies the support of the asymptotic distribution of ranks. In some contemporary problems, bootstrap prediction intervals for ranks are particularly long, and in this context, we also quantify the accuracy of bootstrap methods, showing that the standard bootstrap gets the order of magnitude of the interval right, but not the constant multiplier of interval length. The m-out-of-n bootstrap can improve performance and produce statistical consistency, but it requires empirical choice of m; we suggest a tuning solution to this problem. We show that in genomic examples, where it might be expected that the standard, “synchronous” bootstrap will successfully accommodate nonindependence of vector components, that approach can produce misleading results. An “independent component” bootstrap can overcome these difficulties, even in cases where components are not strictly independent.
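The difference between the n-out-of-n and m-out-of-n bootstrap for ranks can be sketched in a few lines. The example below assumes attributes are ranked by their sample means and that whole rows are resampled together (the "synchronous" bootstrap of the abstract). The function names are hypothetical, and the tuning rule for m that the authors suggest is not reproduced here, so m is simply supplied by the caller.

```python
import random
from collections import Counter


def rank_of(means, j):
    """Rank of attribute j (1 = largest mean); ties are broken by index."""
    return 1 + sum(1 for k, m in enumerate(means)
                   if m > means[j] or (m == means[j] and k < j))


def bootstrap_rank_distribution(rows, j, m=None, n_boot=500, seed=0):
    """Bootstrap distribution of the rank of attribute j.

    rows   : list of observations, each a sequence of p attribute values
    m      : resample size; None means the standard n-out-of-n bootstrap
    Returns a dict mapping rank -> bootstrap relative frequency.
    """
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    m = n if m is None else m
    counts = Counter()
    for _ in range(n_boot):
        # Synchronous resampling: draw whole rows with replacement.
        sample = [rows[rng.randrange(n)] for _ in range(m)]
        means = [sum(r[k] for r in sample) / m for k in range(p)]
        counts[rank_of(means, j)] += 1
    return {r: c / n_boot for r, c in sorted(counts.items())}


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Hypothetical data: 40 observations on 5 attributes with close means.
    rows = [[data_rng.gauss(0.1 * k, 1.0) for k in range(5)]
            for _ in range(40)]
    print("n-out-of-n:", bootstrap_rank_distribution(rows, j=4))
    print("m-out-of-n:", bootstrap_rank_distribution(rows, j=4, m=15))
```

An "independent component" variant would resample each column separately instead of whole rows. That is left out here to keep the sketch short.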
Find the abstract here – download the paper here
Famous Statistics Consultancy Company Consults New Trends in Statistics (2010)
Today’s paper:
Credible sets for risk ratios in over-reported two-sample binomial data using the double sampling scheme
Authors: Dewi Rahardja and Dean M. Young
Abstract. We consider point and interval estimation for risk ratios based on two independent samples of binomial data subject to false positive misclassification. For such data it is well known that the model is unidentifiable. We consider incorporating training data obtained by double sampling scheme to make the model identifiable. In this identifiable model, we propose a Bayesian method to make statistical inference. In particular, we derive an easy-to-implement closed-form algorithm to draw from the posterior distributions. The algorithm is illustrated using a real data example and further examined via Monte Carlo simulation studies.
Find the paper here
Statistician Wanted: Famous Statistics Consultant Company Hits New Trends in Statistics (2010)
Today’s paper:
Prediction in linear mixed models
Authors: Sue Welham, Brian Cullis, Beverley Gogel, Arthur Gilmour, Robin Thompson
Abstract. Following estimation of effects from a linear mixed model, it is often useful to form predicted values for certain factor/variate combinations. The process has been well defined for linear models, but the introduction of random effects into the model means that a decision has to be made about the inclusion or exclusion of random model terms from the predictions. This paper discusses the interpretation of predictions formed including or excluding random terms. Four datasets are used to illustrate circumstances where different prediction strategies may be appropriate: in an orthogonal design, an unbalanced nested structure, a model with cubic smoothing spline terms and for kriging after spatial analysis. The examples also show the need for different weighting schemes that recognize nesting and aliasing during prediction, and the necessity of being able to detect inestimable predictions.
Find the paper here
Statistics Consulting Firm & Consultant Company Picks New Trends in Statistics (2010)
Today’s paper:
Cumulative correspondence analysis of ordered categorical data from industrial experiments
Authors: Luigi D’Ambra, Onur Köksoy, Biagio Simonetti
Abstract. Most studies of quality improvement deal with ordered categorical data from industrial experiments. Accounting for the ordering of such data plays an important role in effectively determining the optimal factor level of combination. This paper utilizes the correspondence analysis to develop a procedure to improve the ordered categorical response in a multifactor state system based on Taguchi’s statistic. Users may find the proposed procedure in this paper to be attractive because we suggest a simple and also popular statistical tool for graphically identifying the really important factors and determining the levels to improve process quality. A case study for optimizing the polysilicon deposition process in a very large-scale integrated circuit is provided to demonstrate the effectiveness of the proposed procedure.
Find the paper here
Statistical Consulting Firms & Consultants Scoop New Trends in Statistics (2009)
Today’s paper:
Multivariate Weibull mixtures with proportional hazard restrictions for dwell-time-based session clustering with incomplete data
Authors: Patrick Mair and Marcus Hudec
Abstract. Emanating from classical Weibull mixture models we propose a framework for clustering survival data with various more parsimonious models by imposing restrictions on the distributional parameters. We show that these restrictions on the Weibull mixtures correspond to different proportional hazard restrictions across mixture components and Web page areas. A parametric cluster approach based on the EM algorithm is carried out on a multivariate data set. Our model set-up encompasses incomplete-data structures as well as censoring observations. We apply the methodology on retail data stemming from a global e-commerce company. Sessions are clustered with respect to the dwell times that a user spends on certain page areas. The cluster solution that is found allows for a detailed examination of the navigation behaviour in terms of the hazard and survivor functions within each component.
Find the paper here or read it here
Statistical Analysis Consultants and Company Cut New Trends in Statistics (2009)
Today’s paper:
Generalized Neyman–Pearson optimality of empirical likelihood for testing parameter hypotheses
Authors: Taisuke Otsu
Abstract. This paper studies the Generalized Neyman–Pearson (GNP) optimality of empirical likelihood-based tests for parameter hypotheses. The GNP optimality focuses on the large deviation errors of tests, i.e., the convergence rates of the type I and II error probabilities under fixed alternatives. We derive (i) the GNP optimality of the empirical likelihood criterion (ELC) test against all alternatives, and (ii) a necessary and a sufficient condition for the GNP optimality of the empirical likelihood ratio (ELR) test against each alternative.
Find the paper here
Statistical Consulting & Analysis Company Digs New Trends in Statistics (2009)
Today’s paper:
Estimation of regression parameters in the presence of outliers in the response
Authors: Sugata Sen Roy, Sibnarayan Guria
Abstract. In this paper, we consider a regression model with non-spherical covariance structure and outliers in the response. The generalized least squares estimator obtained from the full data set is generally not used in the presence of outliers and an estimator based on only the non-outlying observations is preferred. Here we propose as an estimator a convex combination of the full set and the deleted set estimators and compare its performance with the other two.
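The convex combination described in the abstract can be illustrated with a deliberately simplified sketch. It uses ordinary least squares on a single predictor (a spherical covariance, unlike the non-spherical setting of the paper) and takes the mixing weight as a user-supplied number. How the authors choose that weight is not stated in the abstract, so `weight`, `ols_line`, and `combined_estimate` are assumptions made for illustration.

```python
def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return (y_bar - slope * x_bar, slope)


def combined_estimate(xs, ys, outlier_idx, weight):
    """Convex combination of the full-set and deleted-set estimates.

    weight = 1 returns the full-data fit, weight = 0 the fit with the
    flagged observations removed.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    flagged = set(outlier_idx)
    keep = [i for i in range(len(xs)) if i not in flagged]
    full = ols_line(xs, ys)
    deleted = ols_line([xs[i] for i in keep], [ys[i] for i in keep])
    return tuple(weight * f + (1.0 - weight) * d
                 for f, d in zip(full, deleted))


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [0.0, 1.0, 2.0, 3.0, 40.0]  # last response is an outlier
    for w in (0.0, 0.5, 1.0):
        print(w, combined_estimate(xs, ys, outlier_idx=[4], weight=w))
```

Intermediate weights trade the robustness of the deleted-set fit against the information kept by the full-set fit, which is the comparison the abstract describes.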
Find the paper here
Statistical Analysis Consulting Company Mines New Trends in Statistics (2009)
Today’s paper:
Randomization methods for assessing data analysis results on real-valued matrices
Authors: Markus Ojala, Niko Vuokko, Aleksi Kallio, Niina Haiminen, Heikki Mannila
Abstract. Randomization is an important technique for assessing the significance of data analysis results. Given an input dataset, a randomization method samples at random from some class of datasets that share certain characteristics with the original data. The measure of interest on the original data is then compared to the measure on the samples to assess its significance. For certain types of data, e.g., gene expression matrices, it is useful to be able to sample datasets that have the same row and column distributions of values as the original dataset. Testing whether the results of a data mining algorithm on such randomized datasets differ from the results on the true dataset tells us whether the results on the true data were an artifact of the row and column statistics, or due to some more interesting phenomena in the data. We study the problem of generating such randomized datasets. We describe methods based on local transformations and Metropolis sampling, and show that the methods are efficient and usable in practice. We evaluate the performance of the methods both on real and generated data. We also show how our methods can be applied to a real data analysis scenario on DNA microarray data. The results indicate that the methods work efficiently and are usable in significance testing of data mining results on real-valued matrices.
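The general shape of such a randomization test can be sketched as follows. As a stand-in null model, the example shuffles values within each column, which preserves every column's distribution exactly but not the row distributions. It is therefore simpler than the local-transformation and Metropolis samplers of the paper and should not be read as the authors' method. The function names and the example statistic are hypothetical.

```python
import random


def shuffle_columns(matrix, rng):
    """Return a copy of `matrix` with each column independently permuted."""
    cols = [list(c) for c in zip(*matrix)]
    for c in cols:
        rng.shuffle(c)
    return [list(r) for r in zip(*cols)]


def empirical_p_value(matrix, statistic, n_samples=999, seed=0):
    """One-sided empirical p-value of `statistic` under column shuffling."""
    rng = random.Random(seed)
    observed = statistic(matrix)
    at_least = sum(
        1 for _ in range(n_samples)
        if statistic(shuffle_columns(matrix, rng)) >= observed
    )
    # Adding 1 to both counts keeps the p-value strictly positive.
    return (1 + at_least) / (1 + n_samples)


def cross_product(matrix):
    """Example statistic: sum over rows of (column 0 * column 1)."""
    return sum(row[0] * row[1] for row in matrix)


if __name__ == "__main__":
    # Hypothetical data in which the two columns move together.
    data = [[float(i), float(i)] for i in range(8)]
    print("empirical p-value:", empirical_p_value(data, cross_product))
```

A small p-value here says the observed statistic is unusual among datasets that share the column distributions, so it is unlikely to be an artifact of those distributions alone.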
Find the paper here or read it here