I'm following a little tutorial with an example Neural Network implementation from here.
This originally used the MNIST dataset but I've attempted to modify it for text use instead (future goal is to categorize text messages).
Here's what I have to convert a string into an array of double-precision floating point numbers:
func dataFromText(text string) (data []float64) {
	data = make([]float64, 600)
	for position, character := range text {
		data[position] = float64(int(character))
	}
	return data
}
If the string is shorter than 600 characters, the remaining elements of the slice are left at zero.
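To make the encoding concrete, here is a minimal, self-contained sketch of the same conversion. The `inputSize` constant and the bounds check are my additions for illustration (the function above would panic on input longer than 600 bytes). Note also that `range` over a Go string yields byte offsets rather than character counts, so multi-byte characters leave zero-valued gaps.

```go
package main

import "fmt"

// inputSize is a name introduced for this sketch; it matches the
// hard-coded 600 used in the post.
const inputSize = 600

// dataFromText mirrors the conversion described above, with an added
// guard so that input longer than inputSize is truncated, not a panic.
func dataFromText(text string) []float64 {
	data := make([]float64, inputSize)
	// position is the byte offset of each rune, not a character index.
	for position, character := range text {
		if position >= inputSize {
			break
		}
		data[position] = float64(character)
	}
	return data
}

func main() {
	data := dataFromText("Hi!")
	fmt.Println(data[:5])  // [72 105 33 0 0]
	fmt.Println(len(data)) // 600
}
```

The values are raw Unicode code points, so ordinary English text produces inputs roughly in the range 32 to 122, followed by zero padding.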
I've also modified a prediction function:
func predictFromText(net Network, text string) int {
	input := dataFromText(text)
	output := net.Predict(input)
	matrixPrint(output)
	best := 0
	highest := 0.0
	for i := 0; i < net.outputs; i++ {
		if output.At(i, 0) > highest {
			best = i
			highest = output.At(i, 0)
		}
	}
	return best
}
Then how I train the network:
count := 1500
fmt.Println("Training...")
net := CreateNetwork(600, 200, 2, 0.15)
strings := make([]string, count)
target := make([][]float64, count)
for a := 0; a < count; a++ {
	strings[a], target[a] = someRandomPair()
}
for epoch := 0; epoch < 5; epoch++ {
	for index, s := range strings {
		net.Train(dataFromText(s), target[index])
	}
	fmt.Println("Epoch:", epoch+1)
}
This line:
strings[a], target[a] = someRandomPair()
generates a string and a float64 slice, both chosen by a random boolean. If true, it returns the string "This is a test message." with the target {0.01, 0.99}; if false, it returns a random string taken from "/usr/share/dict/words" with the target {0.99, 0.01}.
When I begin to actually make predictions:
fmt.Println("Testing:", os.Args[2])
prediction := predictFromText(net, os.Args[2])
fmt.Println("Prediction:", prediction)
Results:
"This is a test message." -> (0.963765005571003, 0.03361956184092902)
Prediction: 0
"I don't even know." -> (0.963765005571003, 0.03361956184092902)
Prediction: 0
"Why isn't this working" -> (0.963765005571003, 0.03361956184092902)
Prediction: 0
It doesn't matter what text I put in; the result is always the same. Why isn't my neural network predicting anything correctly?
Edit: I have since also tried populating the training set with only one message:
strings[a] = "Message"
target[a] = []float64{0.01, 0.99}
This does make the results vary when I put different messages in:
(0.013864548477560683, 0.9850204703692592)
(0.02411385414797107, 0.971204710177904)
Unfortunately, it still does NOT categorize "Message" correctly. I expected something like this to be returned:
(~0.99, ~0.01)