I am writing a program that emulates the functionality of the following Java program:
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Vector;

public class MFCC {
    // launchProc and launchBatchProc are helper methods that are not shown in this excerpt.
    public static void main(String[] args) throws IOException, InterruptedException, Exception {
        System.out.println("MFCC Coefficient Extractor");
        // Executables
        String sox = "/usr/bin/sox";
        String x2x = "/usr/local/bin/x2x";
        String frame = "/usr/local/bin/frame";
        String window = "/usr/local/bin/window";
        String mcep = "/usr/local/bin/mcep";
        String swab = "/usr/local/bin/swab";
        // Command Line Options
        String WavFile = "/output/audio.wav";
        String RawFile = WavFile + ".raw";
        String frameLength = "400";
        String frameLengthOutput = "512";
        String framePeriod = "80";
        String mgcOrder = "24";
        String mfccFile = WavFile + ".mfc";
        String soxcmd = sox + " " + WavFile + " " + RawFile;
        launchProc(soxcmd, "sox", WavFile);
        // MFCC
        String mfcccmd = x2x + " +sf " + WavFile + " | " + frame + " -l " + frameLength + " -p " + framePeriod + " | " + window
                + " -l " + frameLength + " -L " + frameLengthOutput + " -w 1 -n 1 | " + mcep + " -a 0.42 -m " + mgcOrder
                + " -l " + frameLengthOutput + " | " + swab + " +f > " + mfccFile;
        launchBatchProc(mfcccmd, "getSptkMfcc", WavFile);
        int numFrames;
        DataInputStream mfcData = null;
        Vector<Float> mfc = new Vector<Float>();
        mfcData = new DataInputStream(new BufferedInputStream(new FileInputStream(mfccFile)));
        try {
            // Read 4 bytes at a time as a big-endian float until the file ends.
            while (true) {
                mfc.add(mfcData.readFloat());
            }
        } catch (EOFException e) {
            // End of file reached: all coefficients have been read.
        }
        mfcData.close();
        System.out.println("Coefficient vector length: " + mfc.size());
        System.out.println("The coefficients are: " + mfc);
    }
}
(Run it by cloning this repo and running docker build -t javamfcc:latest . && docker run --name javamfcc --rm -v $PWD/output:/output javamfcc:latest)
The gist of this program is that it runs an audio file through a pipeline of executables provided by the SPTK project, then parses the final output by reading every 4 bytes as a float and appending each value to a single vector.
I more or less have a good idea of how to set up the pipes in Go, but I am having trouble figuring out how to loop through the final io.Reader and read every 4 bytes as a float (as java.io.DataInputStream.readFloat() does in the Java version). This is the code I have so far:
package main

import (
	"bytes"
	"fmt"
	"log"
	"os/exec"
	"strconv"
)

func main() {
	fileName := "/output/audio.wav"
	frameLength := 400
	frameLengthOutput := 512
	framePeriod := 80
	mgcOrder := 24
	// Run the SPTK pipeline wrapper and capture its stdout (the raw float bytes).
	mfcc := exec.Command("mfcc.sh", fileName, strconv.Itoa(frameLength), strconv.Itoa(frameLengthOutput), strconv.Itoa(framePeriod), strconv.Itoa(mgcOrder))
	mfccout, mfccerr := mfcc.Output()
	if mfccerr != nil {
		log.Println("Error executing mfcc.sh")
		panic(mfccerr)
	}
	// Wrap the captured bytes in a reader; parsing them as floats is where I am stuck.
	b := bytes.NewReader(mfccout)
	fmt.Println("bytes to parse:", b.Len())
}
mfcc.sh
#!/bin/bash
filename=$1
frameLength=$2
frameLengthOutput=$3
framePeriod=$4
mgcOrder=$5
# Diagnostics go to stderr so they do not mix with the binary float data on stdout.
echo "filename: $filename" >&2
echo "frameLength: $frameLength" >&2
echo "frameLengthOutput: $frameLengthOutput" >&2
echo "framePeriod: $framePeriod" >&2
echo "mgcOrder: $mgcOrder" >&2
x2x +sf "$filename" | frame -l "$frameLength" -p "$framePeriod" | window -l "$frameLength" -L "$frameLengthOutput" -w 1 -n 1 | mcep -a 0.42 -m "$mgcOrder" -l "$frameLengthOutput" | swab +f