
Softmax: Neural Networks and Maximum Mutual Information Estimation
About this listen
John S. Bridle's 1989 paper, "Training Stochastic Model Recognition Algorithms as Networks can lead to Maximum Mutual Information Estimation of Parameters," proposes a novel approach to pattern recognition, specifically improving the Hidden Markov Models (HMMs) used in speech recognition. It focuses on discrimination-based training methods within neural networks (NNs). The paper demonstrates that modifying a multilayer perceptron's output layer to yield proper probability distributions (the softmax function), and replacing the standard squared-error criterion with a probability-based score, is equivalent to Maximum Mutual Information (MMI) training. Applied to a specially constructed network that implements a stochastic model-based classifier, this method offers a powerful way to train model parameters, exemplified by an HMM-based word discriminator called an "Alphanet." Ultimately, the research shows how NN architectures can embody the desirable traits of stochastic models and clarifies the relationship between discriminative NN training and MMI training of stochastic models.
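The two ingredients the paper combines can be sketched in a few lines of Python (a minimal illustration with made-up numbers, not the paper's code): the softmax output layer turns arbitrary network activations into a proper probability distribution, and the training score becomes the log probability assigned to the correct class rather than squared error.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability, then normalize
    # the exponentials so the outputs form a probability distribution.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical output-layer activations for a 3-class recognizer.
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)          # sums to 1, each entry in (0, 1)

# Suppose class 0 is correct. The probability-based score the paper
# relates to MMI training is the log probability of the correct class...
log_score = -np.log(probs[0])

# ...in contrast to the standard squared-error criterion against a
# one-hot target, which it replaces.
target = np.array([1.0, 0.0, 0.0])
squared_error = np.sum((probs - target) ** 2)
```

Training to minimize `log_score` (the negative log probability of the correct class) pushes the network to discriminate between classes, which is the link to MMI estimation that the paper formalizes.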
Source: https://proceedings.neurips.cc/paper_files/paper/1989/file/0336dcbab05b9d5ad24f4333c7658a0e-Paper.pdf