Thursday 12 December 2013

Sphinx4 experience Round 2: Improve accuracy

After running the pocketsphinx transcriber a few times, I realized the speech-to-text accuracy is utter crap. So now I have to find a way to make it understand me better.

As a first example, I ran the following command:

pocketsphinx-0.8/src/programs/pocketsphinx_continuous -infile sound3.wav  -hmm hub4wsj_sc_8k -dict pocketsphinx-0.8/model/lm/en_US/cmu07a.dic -lm pocketsphinx-0.8/model/lm/en_US/wsj0vp.5000.DMP  2>/dev/null

The sound3.wav file is an 8 kHz sample in which I speak the words "take note, this information is confidential". Now, I do have a non-native English accent (being Mexican myself), but I cannot understand how Sphinx comes up with the following:

000000000: her

000000001: her to him crude high in the hot
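
One thing that may be worth ruling out with output like this is a sample-rate mismatch. As far as I know, pocketsphinx assumes 16 kHz input unless told otherwise through its -samprate option, so an 8 kHz file decoded at the default rate could come out as nonsense. Below is a minimal Python sketch, using only the standard wave module, that reports what the WAV header actually says. The names describe_wav and matches_decoder are made up for this example, and the 16000 default is my assumption about the decoder, not something I verified.

```python
import wave


def describe_wav(path):
    """Return the basic header fields of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / rate,
        }


def matches_decoder(info, expected_rate_hz=16000):
    """True if the file is mono 16-bit audio at the assumed decoder rate.

    expected_rate_hz defaults to 16000, which is an assumption about the
    decoder's default; pass 8000 if the decoder is run at 8 kHz.
    """
    return (
        info["sample_rate_hz"] == expected_rate_hz
        and info["channels"] == 1
        and info["bits_per_sample"] == 16
    )
```

Calling matches_decoder(describe_wav("sound3.wav")) would return False for an 8 kHz recording, which is a hint to either resample the file or tell the decoder the real rate.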

Running it several times, I get the same results, so I guess Sphinx has no idea what I am saying. In theory, the sphinxtrain program should be used to train Sphinx to recognize my voice. However, according to the documentation, what I actually need is to adapt the existing acoustic model to my voice:

http://cmusphinx.sourceforge.net/wiki/tutorialadapt

EDIT 1:
Ok, so after reading a bit more I got to the following page:
http://www.jaivox.com/pocketsphinx.html

It compares pocketsphinx with sphinx4, and it seems that sphinx4's accuracy is much better. It occurred to me to use the sphinx4batch program they provide:

http://www.jaivox.com/sites/default/files/downloads/pocket.zip

I wanted to see how it performs with my own recording. To do this, I unzipped the archive into the same folder as the sphinx4 folder, which I had previously downloaded from:

http://downloads.sourceforge.net/project/cmusphinx/sphinx4/1.0%20beta6/sphinx4-1.0beta6-src.zip?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fcmusphinx%2Ffiles%2Fsphinx4%2F1.0%2520beta6%2F&ts=1386876518&use_mirror=colocrossing

And then I ran it with the command:

java -cp "lib/*:." sphinx4batch .

Then I modified the sphinx4batch.java file and compiled it with:

javac -cp "lib/*:." sphinx4batch.java


After modifying sphinx4batch.java to process my own recording, sound1.wav, I got the following (bad) results:

Original: Take note, this information is confidential
Recognized: and are
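
To put a number on "bad", the usual measure is the word error rate (WER): the word-level edit distance between what was said and what was recognized, divided by the number of words actually said. A minimal sketch in plain Python follows; the function names are my own, and the normalization (lowercasing and dropping punctuation) is an assumption that may not match what other tools do.

```python
import re


def _words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = _words(reference), _words(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word was dropped
                curr[j - 1] + 1,     # extra word was inserted
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For the pair above, word_error_rate("Take note, this information is confidential", "and are") gives 1.0: not a single word was recognized. WER can exceed 1.0 when the recognizer outputs more words than were spoken.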


I wasn't very optimistic, given that I had learned that sphinx4batch uses a dictionary specific to their example.

I tried to manually add the words "take", "note", "information" and "confidential" to the dictionary, but that didn't have any effect.

:(
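
Before editing a dictionary by hand, it may help to check which words are really missing from it. The sketch below assumes a CMU-style pronunciation dictionary, with one entry per line made of the word followed by its phonemes, alternate pronunciations written as word(2), and comment lines starting with ";;;". That format is my assumption about the file and missing_words is a made-up name. Also keep in mind that a decoder generally needs a word in its language model or grammar as well, so a dictionary entry alone may not change the output.

```python
def missing_words(dict_path, words):
    """Return the words that have no entry in a CMU-style dictionary file."""
    known = set()
    with open(dict_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            parts = line.split()
            # Skip blank lines and ";;;" comment lines.
            if not parts or line.startswith(";;;"):
                continue
            # "note(2)" is an alternate pronunciation of "note".
            known.add(parts[0].split("(")[0].lower())
    return [word for word in words if word.lower() not in known]
```

For example, missing_words("cmu07a.dic", ["take", "note", "information", "confidential"]) would list whichever of the four words the dictionary lacks.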

Sphinx4 experience: Transcribing a wav file

This post documents the steps I had to perform in order to set up Sphinx4 to transcribe a wav file in English (and later maybe in Spanish).

Basically, I don't know what I am doing. It shouldn't be so complicated to set up a voice recognition system in 2013. Let's see how this goes (I might get bored before being able to achieve my goals).

I chose to use the Sphinx4 platform, which is available at: http://cmusphinx.sourceforge.net/


First, I downloaded the sphinxbase and sphinxtrain files. From the few things I have read, these are used to create a voice model that will make Sphinx understand my voice.

Additionally, the VoxForge site ( http://www.repository.voxforge1.org/downloads/Main/Trunk/AcousticModels/Sphinx/ ) supposedly has some voice models already available for Sphinx4. I downloaded them and ran the "build.sh" script. This script doesn't do anything as is; it seems that you need to modify it by adding a call to the "downloads" function, which seems to download a bunch of wav files.

EDIT 1:

While compiling some of the Sphinx4 stuff (mainly sphinxbase) I found the following blog post:
http://nshmyrev.blogspot.mx/2010/09/voicemail-transcription-with.html which seems to achieve something similar to what I want using pocketsphinx, so now I am downloading pocketsphinx too and following those instructions. Let's hope that it works.


EDIT 2:

So, while trying to build sphinxbase, I ran into a problem (make was doing nothing), so I googled a bit more. I found the page http://www.cs.columbia.edu/~ecooper/CS4706/ps-mac.html which says to use ./autogen.sh. I am doing that now :).

After running ./autogen.sh I ran make. This time I got the following error:
autom4te: m4sugar/m4sugar.m4: no such file or directory

From this page http://stackoverflow.com/questions/6033989/aclocal-autoconf-reports-missing-m4sugar-m4-on-mac-os-x   it seems that we need to run:

sudo ln -s  /Developer/usr/share/autoconf /usr/share

so that autoconf can find its m4 macro files (the missing m4sugar.m4 is one of them).

*sigh*... after that, I get the following error:

libtool: Version mismatch error.  This is libtool 2.2.10, but the
libtool: definition of this LT_INIT comes from libtool 2.2.6b.
libtool: You should recreate aclocal.m4 with macros from libtool 2.2.10
libtool: and run autoconf again.

Let's see how to troubleshoot this...

OK, I ran sphinxbase's ./autogen.sh again, and then make. It seems that now it is doing something. I assume that autogen.sh should be run after doing the ln -s (shown above).


Right, so the "make" command seems kind of stuck. Since I was doing all this on OS X, I decided to try it on a Linux instance. I repeated all the steps on Linux, and sphinxbase compiled without issues, so I will continue with this approach.


I then ran pocketsphinx's ./autogen.sh, which finished without problems, and after that make succeeded too.

EDIT 3:

OK, so I could run pocketsphinx_continuous, but I got the following error:

ERROR: "pocketsphinx.c", line 625: No search module is selected, did you forget to specify a language model or grammar?


I searched a bit more and found the following page:
http://mariangemarcano.blogspot.mx/2012/09/speech-recognition-with-pocketsphinx.html

There they say to run the pocketsphinx_continuous program in the following way:

pocketsphinx_continuous.exe -hmm C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\hmm\en_US\hub4wsj_sc_8k 
-dict C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\lm\en_US\cmu07a.dic 
-lm C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\lm\en_US\wsj0vp.5000.DMP


I followed those instructions, adding the -infile wav/sound1.wav parameter (with a sound I had previously recorded), and the program ran! However, the transcription was completely wrong, haha. I guess now I need to find out how to improve its quality.
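
Since I will probably be re-running this with different models and recordings, the command can also be assembled from a small script. A minimal Python sketch, assuming the pocketsphinx_continuous binary is reachable at the path you pass in and that it prints hypotheses as "utterance-id: text" lines. Only the -infile, -hmm, -dict and -lm flags from the command above are used; build_command, transcribe and parse_hypotheses are names I made up.

```python
import re
import subprocess


def build_command(binary, hmm, dictionary, lm, infile):
    """Assemble the argument list for pocketsphinx_continuous."""
    return [binary, "-infile", infile,
            "-hmm", hmm, "-dict", dictionary, "-lm", lm]


def parse_hypotheses(output):
    """Extract the text of lines shaped like '000000001: some words'."""
    texts = []
    for line in output.splitlines():
        match = re.match(r"\d+:\s*(.*)", line.strip())
        if match:
            texts.append(match.group(1))
    return texts


def transcribe(binary, hmm, dictionary, lm, infile):
    """Run the decoder, discard its log output and return the hypotheses."""
    result = subprocess.run(
        build_command(binary, hmm, dictionary, lm, infile),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # the decoder logs a lot to stderr
        universal_newlines=True,
        check=True,
    )
    return parse_hypotheses(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces or backslashes.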

This will be continued in another post.



Friday 17 February 2012

Using Insync without installing (portable "installation")

I hope this helps. To anyone who tries it: please, please, please do it at your own risk. I cannot be held liable for any problem, including but not limited to loss of data. So far I have had no problems, but YMMV.

- Introduction -

Just recently I started using the great Insync software/service, which allows you to use your Google Docs space in a similar way to Dropbox. The advantage of using Insync/Google is that you can get 20 GB of storage for only $5 USD a year, whereas all the popular cloud-syncing services charge at least the same amount per month.

Currently Insync does not offer a portable version of its client; however, it is not too difficult to create one. By "portable" here I only mean a version which you do not need to install and which can be used without administrator privileges. I have no idea whether the program modifies the registry when running.

- Requirements -

  1. We must download the Windows version of the InSync client (because this tutorial only works for the Windows version).
  2. We also require the amazing "Universal Extractor" (available here). Be sure to download the "portable" version (it wouldn't make sense having to install a program to make another program portable, would it?).

- Process -

  1. Open (unpack/unrar) and run the Universal Extractor program. You should see a screen such as this:
  2. In the "Archive/Installer to extract" field, select the Insync installer that you previously downloaded (it should be named "Insync-0.9.XX.YYYY.exe").
  3. After selecting the installer, the "Destination directory" text field will be automatically filled. If you want to change it do it now, otherwise note the location of this folder.
  4. Press "OK"
  5. Wait for some time until the program has finished extracting the files to the specified folder.
  6. Open the output folder. You should see the following files and folders:

- The tricky part: fixing missing files -

  1. Now, go to the $INSTDIR folder and within this, to the res folder. You should see several icons (image files with .ico extension).
  2. Look for the file named "taskbar-normal.ico". Make a copy of it (select it, press CTRL+C and then CTRL+V). You should have a new file called "taskbar-normal copy.ico" or similar.
  3. Rename the copy you just made to exactly "taskbar-normal-update.ico" (take care not to add spaces, or an additional ".ico" if your Windows is configured to hide extensions).
  4. Repeat steps 2 and 3 for the file "taskbar-syncing-4.ico", copying it and renaming the copy to "taskbar-syncing-4-update.ico".
  5. Repeat steps 2 and 3 for the files:
    taskbar-syncing-3.ico
    taskbar-syncing-2.ico
    taskbar-syncing-1.ico
  6. At the end you should have created the following new files:
    taskbar-normal-update.ico
    taskbar-syncing-4-update.ico
    taskbar-syncing-3-update.ico
    taskbar-syncing-2-update.ico
    taskbar-syncing-1-update.ico
  7. Now go to the folder $PLUGINSDIR (under the main folder where you extracted the contents of the installer). You should see the following files:
  8. Move all those files to the $INSTDIR folder (where the insync.exe program is located).
  9. Finally, rename the $INSTDIR folder to something like InSyncPortable or any other meaningful name you want.
  10. Move this folder to your preferred location. Use the insync.exe file to launch the program.
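
The copy, rename and move steps above can also be done by a short script, which avoids typos in the file names. This is only a sketch under the assumptions of this tutorial: the extracted folder contains $INSTDIR (with a res subfolder holding the five icons) and $PLUGINSDIR. The function name make_portable and the default folder name InsyncPortable are my own choices. Try it on a copy of the extracted folder first.

```python
import shutil
from pathlib import Path

# Icons that need a "-update" twin, as listed in the steps above.
ICON_STEMS = [
    "taskbar-normal",
    "taskbar-syncing-1",
    "taskbar-syncing-2",
    "taskbar-syncing-3",
    "taskbar-syncing-4",
]


def make_portable(extracted_dir, portable_name="InsyncPortable"):
    """Apply steps 1 to 9 to an extracted installer folder.

    Returns the path of the renamed program folder.
    """
    root = Path(extracted_dir)
    inst = root / "$INSTDIR"
    res = inst / "res"

    # Steps 2 to 6: duplicate each icon under its "-update" name.
    for stem in ICON_STEMS:
        target = res / (stem + "-update.ico")
        if not target.exists():
            shutil.copy2(str(res / (stem + ".ico")), str(target))

    # Steps 7 and 8: move everything from $PLUGINSDIR next to insync.exe.
    for item in list((root / "$PLUGINSDIR").iterdir()):
        shutil.move(str(item), str(inst / item.name))

    # Step 9: give the program folder a meaningful name.
    portable = root / portable_name
    inst.rename(portable)
    return portable
```

Step 10 (moving the folder to its final location) is left to do by hand.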

Important: Do not forget to define the folder where you want Insync to download your files before you log in to your Google account.

And that's it. I have been using the InSync client in this way for about 2 weeks without a problem. And to my surprise the client was even automatically updated!


Wednesday 22 June 2011

Internships in Mexico


Summer Internships

There is a lot of talk in the tech circles of the Internet about the summer stays or "internships" offered by companies such as Google (Google Summer of Code). These opportunities give students the chance to do a work placement at a company, working on a project of their choice. These placements often include pay for the student, besides providing a positive experience and one more item for their curriculum vitae.


Internship Opportunities in Mexico

In Mexico there are some similar opportunities, mostly focused on scientific research. Several programs are available, such as the Verano de la Investigación Científica of the Academia Mexicana de Ciencias; the Verano del Delfín, organized by 51 educational institutions in the north of the country; and the Verano Innova (focused more on industry development).

Through these programs, participating students can spend around two months at a research institute, university, or company. During this time, participants work on a real research and development project, under the supervision of a researcher, helping the institution that hosts them.

In exchange for this work, participants receive a monthly stipend that lets them cover the costs of their stay and set some money aside for their own expenses.

Calls for Applications

The calls for applications for the various programs open in the first months of the year (between January and March).

Requirements

To take part in these programs, certain requirements must be met:

· Be a regularly enrolled undergraduate student;
· Have no outstanding failed courses;
· Have completed the sixth semester of the degree, or hold 75% of the credits, by the time the internship starts;
· Show a minimum overall grade average of 8.5 if the degree belongs to the Physical-Mathematical Sciences area, or a minimum overall grade average of 9.0 if the degree belongs to any of the following areas: Biological, Biomedical, or Chemical Sciences, Social Sciences, Humanities, Engineering, or Technology.
(source: the AMC's XXI call for applications, emphasis mine)

It is worth noting that the requirements differ from one program to another. The Verano del Delfín lets Engineering and Industry students apply with an average of 8.5.


Stipend Amounts

Although the amounts vary from one program to another (and change from year to year), it is worth noting the stipends provided by the AMC in the XXI Verano Científico (2011):

STIPENDS:

· $7,000.00 if you do your summer internship outside the state where you study. The cost of the round-trip ground fare between your place of residence and the place where you carry out your internship will be reimbursed once you hand in your signed receipt and the corresponding outbound ticket.
· $3,000.00 if you do your internship in the same state where you study.

Experiences

This author took part in the summer program of the Academia Mexicana de Ciencias about 10 years ago, doing an internship at a prestigious research institute based in the Distrito Federal.

Besides taking an active part in software development (programming in .NET at the time Microsoft had just released it), the internship opened doors to doing a master's and a doctorate abroad. This was possible thanks to the contacts made during the internship.

It is worth noting that some students who take part in these programs in the 6th semester of their degree take part again (at the same institute or at a different one) in the 8th semester. In other words, students come away satisfied with the program.


Opinion

In this author's opinion, these programs should be publicized more widely among young people, especially among students who are starting their degree or finishing high school. This would let interested students keep a grade average high enough to take part in the programs.

In this author's experience, many undergraduates in advanced semesters did not know about the programs. In fact, this author was able to take part in one of the summers after learning of its existence 5 days before the call closed (and consequently could apply 1 day before the deadline, after gathering the required paperwork).

This author also believes that this kind of program should be expanded to include not only scientific development but also technological development. Under the auspices of CONACYT (with a T for Tecnología), summer programs should be created that let undergraduate students take part in projects focused on developing new technologies (at companies in Mexico).


Wednesday 1 June 2011

This is a test post.

Hello world.

Simple test post for this mostly empty blog. Nothing to write about so far.