Anbieter: Books From California, Simi Valley, CA, USA
paperback. Zustand: Very Good. Cover and edges may have some wear.
Anbieter: Ria Christie Collections, Uxbridge, Vereinigtes Königreich
EUR 52,36
Anzahl: Mehr als 20 verfügbar
In den WarenkorbZustand: New. In.
Anbieter: Revaluation Books, Exeter, Vereinigtes Königreich
EUR 75,23
Anzahl: 2 verfügbar
In den WarenkorbPaperback. Zustand: Brand New. 1st edition. 59 pages. 9.25x6.25x0.50 inches. In Stock.
Anbieter: AHA-BUCH GmbH, Einbeck, Deutschland
Taschenbuch. Zustand: Neu. Druck auf Anfrage Neuware - Printed after ordering - This book introduces the gamma-divergence, a measure of distance between probability distributions that was proposed by Fujisawa and Eguchi in 2008. The gamma-divergence has been extensively explored to provide robust estimation when the power index Gamma is positive. The gamma-divergence can be defined even when the power index Gamma is negative, as long as the condition of integrability is satisfied. Thus, the authors consider the gamma-divergence defined on a set of discrete distributions. The arithmetic, geometric, and harmonic means for the distribution ratios are closely connected with the gamma-divergence with a negative Gamma. In particular, the authors call the geometric-mean (GM) divergence the gamma-divergence when Gamma is equal to -1.The book begins by providing an overview of the gamma-divergence and its properties. It then goes on to discuss the applications of the gamma-divergence in various areas, including machine learning, statistics, and ecology. Bernoulli, categorical, Poisson, negative binomial, and Boltzmann distributions are discussed as typical examples. Furthermore, regression analysis models that explicitly or implicitly assume these distributions as the dependent variable in generalized linear models are discussed to apply the minimum gamma-divergence method.In ensemble learning, AdaBoost is derived by the exponential loss function in the weighted majority vote manner. It is pointed out that the exponential loss function is deeply connected to the GM divergence. In the Boltzmann machine, the maximum likelihood has to use approximation methods such as mean field approximation because of the intractable computation of the partition function. However, by considering the GM divergence and the exponential loss, it is shown that the calculation of the partition function is not necessary, and it can be executed without variational inference.
Anbieter: Ria Christie Collections, Uxbridge, Vereinigtes Königreich
EUR 138,90
Anzahl: Mehr als 20 verfügbar
In den WarenkorbZustand: New. In.
Anbieter: Buchpark, Trebbin, Deutschland
Zustand: Hervorragend. Zustand: Hervorragend | Sprache: Englisch | Produktart: Bücher | This book explores minimum divergence methods of statistical machine learning for estimation, regression, prediction, and so forth, in which we engage in information geometry to elucidate their intrinsic properties of the corresponding loss functions, learning algorithms, and statistical models. One of the most elementary examples is Gauss's least squares estimator in a linear regression model, in which the estimator is given by minimization of the sum of squares between a response vector and a vector of the linear subspace hulled by explanatory vectors. This is extended to Fisher's maximum likelihood estimator (MLE) for an exponential model, in which the estimator is provided by minimization of the Kullback-Leibler (KL) divergence between a data distribution and a parametric distribution of the exponential model in an empirical analogue. Thus, we envisage a geometric interpretation of such minimization procedures such that a right triangle is kept with Pythagorean identity in the sense of the KL divergence. This understanding sublimates a dualistic interplay between a statistical estimation and model, which requires dual geodesic paths, called m-geodesic and e-geodesic paths, in a framework of information geometry. We extend such a dualistic structure of the MLE and exponential model to that of the minimum divergence estimator and the maximum entropy model, which is applied to robust statistics, maximum entropy, density estimation, principal component analysis, independent component analysis, regression analysis, manifold learning, boosting algorithm, clustering, dynamic treatment regimes, and so forth. We consider a variety of information divergence measures typically including KL divergence to express departure from one probability distribution to another. An information divergence is decomposed into the cross-entropy and the (diagonal) entropy in which the entropy associates with a generative model as a family of maximum entropy distributions; the cross entropy associates with a statistical estimation method via minimization of the empirical analogue based on given data. Thus any statistical divergence includes an intrinsic object between the generative model and the estimation method. Typically, KL divergence leads to the exponential model and the maximum likelihood estimation. It is shown that any information divergence leads to a Riemannian metric and a pair of the linear connections in the framework of information geometry. We focus on a class of information divergence generated by an increasing and convex function U, called U-divergence. It is shown that any generator function U generates the U-entropy and U-divergence, in which there is a dualistic structure between the U-divergence method and the maximum U-entropy model. We observe that a specific choice of U leads to a robust statistical procedurevia the minimum U-divergence method. If U is selected as an exponential function, then the corresponding U-entropy and U-divergence are reduced to the Boltzmann-Shanon entropy and the KL divergence; the minimum U-divergence estimator is equivalent to the MLE. For robust supervised learning to predict a class label we observe that the U-boosting algorithm performs well for contamination of mislabel examples if U is appropriately selected. We present such maximal U-entropy and minimum U-divergence methods, in particular, selecting a power function as U to provide flexible performance in statistical machine l.
Anbieter: AHA-BUCH GmbH, Einbeck, Deutschland
Buch. Zustand: Neu. Druck auf Anfrage Neuware - Printed after ordering - This book explores minimum divergence methods of statistical machine learning for estimation, regression, prediction, and so forth, in which we engage in information geometry to elucidate their intrinsic properties of the corresponding loss functions, learning algorithms, and statistical models.One of the most elementary examples is Gauss's least squares estimator in a linear regression model, in which the estimator is given by minimization of the sum of squares between a response vector and a vector of the linear subspace hulled by explanatory vectors. This is extended to Fisher's maximum likelihood estimator (MLE) for an exponential model, in which the estimator is provided by minimization of the Kullback-Leibler (KL) divergence between a data distribution and a parametric distribution of the exponential model in an empirical analogue. Thus, we envisage a geometric interpretation of such minimization procedures such that a right triangle is kept with Pythagorean identity in the sense of the KL divergence. This understanding sublimates a dualistic interplay between a statistical estimation and model, which requires dual geodesic paths, called m-geodesic and e-geodesic paths, in a framework of information geometry. We extend such a dualistic structure of the MLE and exponential model to that of the minimum divergence estimator and the maximum entropy model, which is applied to robust statistics, maximum entropy, density estimation, principal component analysis, independent component analysis, regression analysis, manifold learning, boosting algorithm, clustering, dynamic treatment regimes, and so forth. We consider a variety of information divergence measures typically including KL divergence to express departure from one probability distribution to another. An information divergence is decomposed into the cross-entropy and the (diagonal) entropy in which the entropy associates with a generative model as a family of maximum entropy distributions; the cross entropy associates with a statistical estimation method via minimization of the empirical analogue based on given data. Thus any statistical divergence includes an intrinsic object between the generative model and the estimation method. Typically, KL divergence leads to the exponential model and the maximum likelihood estimation. It is shown that any information divergence leads to a Riemannian metric and a pair of the linear connections in the framework of information geometry. We focus on a class of information divergence generated by an increasing and convex function U, called U-divergence. It is shown that any generator function U generates the U-entropy and U-divergence, in which there is a dualistic structure between the U-divergence method and the maximum U-entropy model. We observe that a specific choice of U leads to a robust statistical procedurevia the minimum U-divergence method. If U is selected as an exponential function, then the corresponding U-entropy and U-divergence are reduced to the Boltzmann-Shanon entropy and the KL divergence; the minimum U-divergence estimator is equivalent to the MLE. For robust supervised learning to predict a class label we observe that the U-boosting algorithm performs well for contamination of mislabel examples if U is appropriately selected. We present such maximal U-entropy and minimum U-divergence methods, in particular, selecting a power function as U to provide flexible performance in statistical machine learning.