Reposted

(Original) Converting audio files between Speex and WAV

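Our 司信 (Sixin) project has a new requirement: build a conference room. The requirement is an awkward one, though. The conference feature has to keep working the way voice messages already do, and on top of that it needs playback timing, pause, concatenation of voice clips, spectrum drawing, and so on.

If this were wav or mp3, I could splice clips and draw spectrograms with no trouble at all; there are ready-made examples all over the web. But this time I am being asked to do all of it with Speex audio.

So I read the Speex code in 司信's existing voice-message feature. Good grief: recording there encodes in real time as it records, and playback decodes in real time as it plays. How am I supposed to get the full spectrum and the duration from nothing but a Speex file? How am I supposed to pause during playback and then resume on the next tap? This isn't a pitfall, it's a trap.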

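In this setup a Speex file cannot be paused, and its duration and spectrum cannot be read from it directly, so it has to be converted to wav or mp3 first. To implement the features above, we therefore need conversion between Speex files and an ordinary audio format.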

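Some readers may not be familiar with how recording works on Android, so here is a quick introduction (after all this research, let me show off a little).

Recording on Android is done here with AudioRecord (MediaRecorder works too, and does more for you). The recorded data is PCM, that is, raw data with no file header at all; saved to a file as-is, a player cannot play it. Prepend a 44-byte header and it becomes a wav file that players can play.

How do you add the header? The code is below: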

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

// Excerpt from the recorder class. AudioFileFunc.AUDIO_SAMPLE_RATE and
// bufferSizeInBytes are defined elsewhere in the project.

// Produce a playable audio file: WAV header first, then the raw PCM data.
private void copyWaveFile(String inFilename, String outFilename) {
    FileInputStream in = null;
    FileOutputStream out = null;
    long totalAudioLen = 0;
    long totalDataLen = totalAudioLen + 36;
    long longSampleRate = AudioFileFunc.AUDIO_SAMPLE_RATE;
    int channels = 2;
    long byteRate = 16 * AudioFileFunc.AUDIO_SAMPLE_RATE * channels / 8;
    byte[] data = new byte[bufferSizeInBytes];
    try {
        in = new FileInputStream(inFilename);
        out = new FileOutputStream(outFilename);
        totalAudioLen = in.getChannel().size();
        totalDataLen = totalAudioLen + 36;
        WriteWaveFileHeader(out, totalAudioLen, totalDataLen, longSampleRate, channels, byteRate);
        int read;
        while ((read = in.read(data)) != -1) {
            // Write only the bytes actually read; the last chunk is usually short.
            out.write(data, 0, read);
        }
        in.close();
        out.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

/**
 * Writes the header. With these bytes in front, the file becomes playable.
 * As for why it is exactly these 44 bytes, I never dug into it deeply, but
 * open any wav file and you will find the header looks basically the same.
 * Every file format has its own header.
 */
private void WriteWaveFileHeader(FileOutputStream out, long totalAudioLen, long totalDataLen,
        long longSampleRate, int channels, long byteRate) throws IOException {
    byte[] header = new byte[44];
    header[0] = 'R'; // RIFF/WAVE header
    header[1] = 'I';
    header[2] = 'F';
    header[3] = 'F';
    header[4] = (byte) (totalDataLen & 0xff);
    header[5] = (byte) ((totalDataLen >> 8) & 0xff);
    header[6] = (byte) ((totalDataLen >> 16) & 0xff);
    header[7] = (byte) ((totalDataLen >> 24) & 0xff);
    header[8] = 'W';
    header[9] = 'A';
    header[10] = 'V';
    header[11] = 'E';
    header[12] = 'f'; // 'fmt ' chunk
    header[13] = 'm';
    header[14] = 't';
    header[15] = ' ';
    header[16] = 16; // 4 bytes: size of 'fmt ' chunk
    header[17] = 0;
    header[18] = 0;
    header[19] = 0;
    header[20] = 1; // format = 1 (PCM)
    header[21] = 0;
    header[22] = (byte) channels;
    header[23] = 0;
    header[24] = (byte) (longSampleRate & 0xff);
    header[25] = (byte) ((longSampleRate >> 8) & 0xff);
    header[26] = (byte) ((longSampleRate >> 16) & 0xff);
    header[27] = (byte) ((longSampleRate >> 24) & 0xff);
    header[28] = (byte) (byteRate & 0xff);
    header[29] = (byte) ((byteRate >> 8) & 0xff);
    header[30] = (byte) ((byteRate >> 16) & 0xff);
    header[31] = (byte) ((byteRate >> 24) & 0xff);
    header[32] = (byte) (channels * 16 / 8); // block align: bytes per sample frame
    header[33] = 0;
    header[34] = 16; // bits per sample
    header[35] = 0;
    header[36] = 'd';
    header[37] = 'a';
    header[38] = 't';
    header[39] = 'a';
    header[40] = (byte) (totalAudioLen & 0xff);
    header[41] = (byte) ((totalAudioLen >> 8) & 0xff);
    header[42] = (byte) ((totalAudioLen >> 16) & 0xff);
    header[43] = (byte) ((totalAudioLen >> 24) & 0xff);
    out.write(header, 0, 44);
}
```
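Once the header is in place, the duration that was so hard to get from Speex falls out of two header fields. The following is only a sketch: the class and method names are made up for illustration, and it assumes the file starts with exactly the 44-byte layout written by WriteWaveFileHeader (byte rate at offset 28, data length at offset 40, both little-endian). Files with extra chunks before `data` would need a real parser.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the original project.
final class WavHeaderInfo {
    /**
     * Returns the playing time in milliseconds, computed from the first
     * 44 bytes of a wav file laid out like the header written above.
     */
    static long durationMillis(byte[] header) {
        if (header.length < 44) {
            throw new IllegalArgumentException("need at least 44 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        long byteRate = buf.getInt(28) & 0xffffffffL; // bytes of audio per second
        long dataLen = buf.getInt(40) & 0xffffffffL;  // size of the 'data' chunk
        if (byteRate == 0) {
            throw new IllegalArgumentException("byte rate is zero");
        }
        return dataLen * 1000 / byteRate;
    }
}
```

For example, a header declaring a byte rate of 32000 and a data length of 64000 bytes gives 2000 ms.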

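Now that we have a wav file, how do we turn it into a Speex file? The earlier project used gauss's code from Google Code with few changes, and nobody had studied it closely. So I first asked our company's technical expert, Director 天虹 (one of the first in China to work with the Speex library on iOS). He said: strip the header off the wav, feed the PCM data into Speex's encode method, and what comes out is the Speex file.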
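The "strip the header" step can be sketched as follows. This is an illustration, not the project's code: the class name is hypothetical, and it assumes the wav was produced with the plain 44-byte header described earlier, with no extra chunks. It also assumes the PCM is already in whatever sample rate and channel layout the Speex encoder was opened with, which the source does not state.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the original project.
final class WavStripper {
    // Assumption: canonical header with nothing between 'fmt ' and 'data'.
    static final int HEADER_SIZE = 44;

    /** Returns the raw PCM bytes that follow the 44-byte RIFF/WAVE header. */
    static byte[] stripHeader(byte[] wav) {
        if (wav.length < HEADER_SIZE
                || wav[0] != 'R' || wav[1] != 'I' || wav[2] != 'F' || wav[3] != 'F'
                || wav[8] != 'W' || wav[9] != 'A' || wav[10] != 'V' || wav[11] != 'E') {
            throw new IllegalArgumentException("not a canonical RIFF/WAVE file");
        }
        return Arrays.copyOfRange(wav, HEADER_SIZE, wav.length);
    }
}
```

For large recordings it would be kinder to memory to skip 44 bytes on the input stream instead of loading the whole file into an array.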

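It sounded so simple coming from the expert, so why wait? I did exactly that, wrote the code, ran it, and it crashed. Why? I ran it again and it crashed again. The error was: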

JNI WARNING: JNI function SetByteArrayRegion called with exception pending

in Lcom/sixin/speex/Speex;.encode:([SI[BI)I (SetByteArrayRegion)

An array index out of bounds. Good grief, why?!

So I went digging through the speex source:

```cpp
extern "C"
JNIEXPORT jint JNICALL Java_com_sixin_speex_Speex_encode
    (JNIEnv *env, jobject obj, jshortArray lin, jint offset, jbyteArray encoded, jint size) {

    jshort buffer[enc_frame_size];
    jbyte output_buffer[enc_frame_size];
    int nsamples = (size-1)/enc_frame_size + 1;
    int i, tot_bytes = 0;

    if (!codec_open)
        return 0;

    speex_bits_reset(&ebits);

    for (i = 0; i < nsamples; i++) {
        env->GetShortArrayRegion(lin, offset + i*enc_frame_size, enc_frame_size, buffer);
        speex_encode_int(enc_state, buffer, &ebits);
    }
    //env->GetShortArrayRegion(lin, offset, enc_frame_size, buffer);
    //speex_encode_int(enc_state, buffer, &ebits);

    tot_bytes = speex_bits_write(&ebits, (char *)output_buffer,
                     enc_frame_size);
    env->SetByteArrayRegion(encoded, 0, tot_bytes,
                output_buffer);

    return (jint)tot_bytes;
}
```

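I found that enc_frame_size has a constant value: 160.

Looking more closely, it turned out that this encode method can only encode 160 shorts of raw audio per call. Argh, the expert left me a pit to fall into.

No matter, that is easy to handle. If you only accept 160 shorts, then I will read a little at a time and encode a little at a time.

Here is the method: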

```java
import android.util.Log;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Arrays;

// speex and byteArray2ShortArray are members of the enclosing class (not shown).
public void raw2spx(String inFileName, String outFileName) {

    FileInputStream rawFileInputStream = null;
    FileOutputStream fileOutputStream = null;
    try {
        rawFileInputStream = new FileInputStream(inFileName);
        fileOutputStream = new FileOutputStream(outFileName);
        byte[] rawbyte = new byte[320];
        byte[] encoded = new byte[160];
        // Convert the raw data into a Speex-compressed file. encode() handles only
        // 160 shorts (320 bytes) per call, so the input is processed in a loop.
        int readedtotal = 0;
        int size = 0;
        int encodedtotal = 0;
        while ((size = rawFileInputStream.read(rawbyte, 0, 320)) != -1) {
            if (size < 320) {
                // Short read (normally only the last one): pad with silence so
                // stale bytes from the previous frame are not encoded again.
                Arrays.fill(rawbyte, size, 320, (byte) 0);
            }
            readedtotal = readedtotal + size;
            short[] rawdata = byteArray2ShortArray(rawbyte);
            int encodesize = speex.encode(rawdata, 0, encoded, rawdata.length);
            fileOutputStream.write(encoded, 0, encodesize);
            encodedtotal = encodedtotal + encodesize;
            Log.e("test", "readedtotal " + readedtotal + "\n size" + size + "\n encodesize" + encodesize + "\n encodedtotal" + encodedtotal);
        }
        fileOutputStream.close();
        rawFileInputStream.close();
    } catch (Exception e) {
        Log.e("test", e.toString());
    }

}
```

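Note that the first argument of speex.encode is a short array, and it needs to hold 160 shorts, so we read 320 bytes from the file each time (one short is two bytes, no need to explain that again) and convert them to a short array before encoding.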
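The byte-to-short conversion itself is not shown in the post, so here is one way it might look. This is a sketch under an assumption: the PCM is 16-bit little-endian, which is what the wav header above implies. The class and method names are invented for illustration and are not the project's byteArray2ShortArray / shortArray2ByteArray.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the original project.
final class PcmConvert {
    /** Interprets each pair of bytes as one little-endian 16-bit sample. */
    static short[] bytesToShorts(byte[] bytes) {
        short[] out = new short[bytes.length / 2]; // a trailing odd byte is ignored
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(out);
        return out;
    }

    /** Writes each sample as two bytes, low byte first. */
    static byte[] shortsToBytes(short[] samples) {
        ByteBuffer buf = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        buf.asShortBuffer().put(samples);
        return buf.array();
    }
}
```

With this, a 320-byte buffer becomes exactly the 160-element short array that encode expects, and the reverse direction is what the decoder output needs before it is written to a file.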

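After the conversion I found that Speex compresses remarkably well: a 1.30 MB file encoded straight down to 80 KB. Very nice.

That cuts traffic a great deal during transmission. All I can say is that Speex is seriously impressive. I hear it has since been superseded by Opus; I wonder whether that one is even better.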

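With encoding done, next comes decoding. Later testing showed that Speex decoding likewise produces only 160 shorts per call. No wonder I call it a pit.

The method looks like this: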

```java
decsize = speex.decode(inbyte, decoded, readsize);
```

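Since every call has to decode exactly 160 shorts, how many bytes should the inbyte I pass in contain? Nobody tells me!

Fine, I have a way anyway. Didn't we encode 160 shorts at a time? Just check how many bytes come out of each encode call.

Testing showed that 160 shorts encode to 20 bytes, that is, 320 bytes are compressed to 20 bytes. The data shrinks to 1/16 of its original size. Impressive indeed.

Now that we know the number is 20, reading 20 bytes at a time from the compressed Speex file and decoding them should restore the data.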

```java
import android.util.Log;
import java.io.FileInputStream;
import java.io.FileOutputStream;

// speex and shortArray2ByteArray are members of the enclosing class (not shown).
public void spx2raw(String inFileName, String outFileName) {
    FileInputStream inAccessFile = null;
    FileOutputStream fileOutputStream = null;
    try {
        inAccessFile = new FileInputStream(inFileName);
        fileOutputStream = new FileOutputStream(outFileName);
        byte[] inbyte = new byte[20];      // one encoded frame
        short[] decoded = new short[160];  // one decoded frame
        int readsize = 0;
        int readedtotal = 0;
        int decsize = 0;
        int decodetotal = 0;
        while ((readsize = inAccessFile.read(inbyte, 0, 20)) != -1) {
            readedtotal = readedtotal + readsize;
            decsize = speex.decode(inbyte, decoded, readsize);
            // decsize counts shorts, so twice as many bytes are written.
            fileOutputStream.write(shortArray2ByteArray(decoded), 0, decsize * 2);
            decodetotal = decodetotal + decsize;
            Log.e("test", "readsize " + readsize + "\n readedtotal" + readedtotal + "\n decsize" + decsize + "\n decodetotal" + decodetotal);
        }
        fileOutputStream.close();
        inAccessFile.close();
    } catch (Exception e) {
        Log.e("test", e.toString());
    }
}
```
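The 160-shorts-to-20-bytes relationship measured earlier also gives a shortcut for one of the original requirements: the duration can be estimated from the size of the Speex file alone, without decoding it. The sketch below is only that, a sketch. The class name is hypothetical, it assumes the file is a bare sequence of fixed 20-byte frames as written by raw2spx (no container, no variable bitrate), and the sample rate is a parameter because the source never states what the encoder was opened with.

```java
// Hypothetical helper, not part of the original project.
final class SpeexFrameMath {
    static final int SAMPLES_PER_FRAME = 160; // enc_frame_size in the JNI source
    static final int BYTES_PER_FRAME = 20;    // measured for this encoder setup

    /** Number of whole frames in a file of the given size. */
    static long frameCount(long spxBytes) {
        return spxBytes / BYTES_PER_FRAME;
    }

    /** Size of the 16-bit PCM that decoding would produce. */
    static long decodedPcmBytes(long spxBytes) {
        return frameCount(spxBytes) * SAMPLES_PER_FRAME * 2;
    }

    /** Playing time in milliseconds at the given sample rate. */
    static long durationMillis(long spxBytes, int sampleRate) {
        return frameCount(spxBytes) * SAMPLES_PER_FRAME * 1000 / sampleRate;
    }
}
```

If the encoder happened to run at 8000 Hz (an assumption), 50 frames, i.e. 1000 bytes of Speex data, would correspond to one second of audio, and every 20 bytes of Speex data would decode back to 320 bytes of PCM, matching the 1/16 ratio observed above.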