Saturday, November 23, 2013

AT-SPI news

In previous posts I explained how to install Simon and see it work perfectly, but only on Windows. I also expressed my concern that the AT-SPI plug-in is not available on Windows, while it is available in Fedora. It turns out that AT-SPI is exclusive to Linux operating systems, which leaves us with only two options to continue our project: install Simon on a Linux operating system where it can use the AT-SPI plug-in, or extend Simon's keyboard in Qt so that the keyboard can respond to voice recognition instead of mouse control.

Thursday, November 21, 2013

Dictation on Simon

Over the course of the summer, Simon's developers have been working on bringing dictation capabilities to Simon. Now they're trying to build up a network of developers and researchers who work together to build high-accuracy, large-vocabulary speech recognition systems for a variety of domains (desktop dictation being just one of them). Even before this project started, Simon already had a "Dictation" command plugin. Basically, this plugin would just write out everything that Simon recognizes. But that's far from everything there is to dictation from a software perspective.

First of all, it is necessary to replace the special words used for punctuation, like ".period", with their associated signs. To do that, Simon's developers implemented a configurable list of string replacements in the dictation plugin. Dictation will improve our project with better recognition of the vocabulary shared between Caribou and Simon.
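The idea behind those string replacements can be sketched in a few lines of Python. This is only an illustration of the concept; the rule list and function names are invented here, not Simon's actual code:

```python
# Illustrative sketch of a configurable punctuation-replacement list,
# like the one in Simon's dictation plugin. Rules and names are
# invented for this example.
REPLACEMENTS = [
    (" .period", "."),        # spoken ".period" becomes a literal period
    (" .comma", ","),
    (" .question mark", "?"),
    (" .new line", "\n"),
]

def postprocess(recognized):
    """Apply the configured replacements to raw recognizer output."""
    text = recognized
    for spoken, sign in REPLACEMENTS:
        text = text.replace(spoken, sign)
    return text

print(postprocess("hello world .period how are you .question mark"))
# hello world. how are you?
```

Because the list is configuration rather than code, users could add their own spoken-word-to-sign rules without touching the plugin itself.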

Tuesday, November 19, 2013

Simon Installation 3: Windows: It Works

I just tried to install the compiled version of Simon on Windows and it's working pretty well. Finally, Simon does what I tell it to do. I also made this video: Testing Simon. In the video I was just trying it out and, surprisingly, Simon wrote in the browser the commands that I gave. The "show places" and "programs" commands did not work, because they belong to scenarios that I haven't installed yet.

The thing here is that Simon's scenarios don't have the AT-SPI plug-in that would allow us to connect Simon with another application that also supports AT-SPI. In order to get the connection between these applications, we have to somehow enable this plug-in in the Windows version of Simon.

Also, to turn Simon into a real dictation system, Simon's developers first needed to hack Simond a bit so that it accepts and uses an n-gram based language model, instead of the scenario grammar, whenever one is available. With this little bit of trickery, Simon was already able to use the models they built in the last weeks. More info here.

Simon Installation 2: Fedora 18 & 19: Almost Failed

On Fedora 18 I only tried to find it through yum search simon, and it found nothing. So I tried Fedora 19, and there it works.

Steps to install the compiled application:
  1. search for it with yum search simon 
  2. once the packages are found, run: yum install simon simon-doc simon-l10n 
After the installation, I downloaded some scenarios for Simon and trained it to use them. The scenarios also have the AT-SPI plug-in that we need to connect Simon with Caribou. But when I tried to compile the trained scenarios and connect Simon with its server, I got an error: //screenshot

Simon Installation 1: Ubuntu 13.04: Failed

Steps to install it from source files: 
  1. download the source files: git clone git://anongit.kde.org/simon 
  2. run this command 
          ./build_ubuntu.sh

And I got an error with the Qt installation that would break plugins and linking for some applications.

Steps to install the compiled application: 
  1. find it with apt-cache search simon 
  2. once the packages are found, run: apt-get install simon simon-doc simon-l10n
And I got an error with Simon's scenarios:



I realized that it can't be installed on Ubuntu. I also found a blog post that says the current package may still be broken.

Meet Simon

The speech recognition software that we are using for the research is Simon. It is an open source project and also very complete, in the sense that it has plug-ins like AT-SPI, dictation (in progress) and a keyboard, among other things. See Simon in action. When trying to install Simon, I tried four different operating systems: Ubuntu 13.04, Fedora 18, Fedora 19 and Windows. I tried the two ways that I know: building from the source files and downloading the compiled application. I'll put the attempts here as individual posts, with the steps that I followed and the errors that I got.

Presenting at Carnegie Mellon University



As part of a workshop, I was at CMU this October. I had the opportunity to present our idea to the faculty, graduate students and the students of the OurCS 2013 program. The poster presents the conceptual idea, without implementation or final results, which is fine for a workshop. This is the poster that I presented:



During those three days I was also working on speech recognition research with Alan Black. We worked on automated methods for building new voices for speech synthesis, and I synthesized my own sentences with a Spanish accent. This helped me a lot to understand how Simon works. Simon also uses as a dependency a piece of software that was created at CMU, Sphinx, although Simon users do not have to worry about this process, which happens internally when you train it.

We are currently developing the proof of concept between Simon and Caribou through AT-SPI. Because we are working with open source projects and we are also in mid-semester, we have taken time to prepare the environment and install both projects. By the end of the semester we'll have, at least, a fully functional Simon working with applications that are compatible through the AT-SPI plugin, such as gedit.

Saturday, November 16, 2013

Technical Caribou installation steps in Fedora and Ubuntu

Fedora 19
1. Created a virtual machine with Fedora 19 using VMware
2. Tried to install AT-SPI2 in Fedora, but it broke the yum functionality, so we ended up reinstalling Fedora and running yum update
3. Created a directory in the home folder for the project and cloned the Caribou git repository, which downloaded all the source files
4. Installed the gnome-common package with all its dependencies (yum install gnome-common)


Caribou git

Ubuntu 12.04
1. Used the same virtual environment to create another virtual machine with Ubuntu 12.04
2. Ran the apt-get update command
3. Created a directory in the home folder for the project and cloned the Caribou git repository, which downloaded all the source files
4. Installed the gnome-common package with all its dependencies (apt-get install gnome-common)



Next Steps:
1. Run the ./autogen.sh command to generate the build configuration for Caribou*
2. Install any dependencies required to complete this task successfully
3. Compile the Caribou source code
4. Test the compiled version

*Read the config file and install all dependency packages before running the ./autogen.sh command on each system.



Monday, October 14, 2013

AT-SPI2

The Accessibility ToolKit (ATK) is a development toolkit from GNOME that allows programmers to use common GNOME accessibility features, such as high-contrast visual themes for the visually impaired and keyboard behavior modifiers for those with diminished motor control, to make GNOME applications accessible. Assistive technologies, such as screen readers or magnifiers, can use the logical representation it exposes to enable individuals with disabilities to browse and interact with applications.

The Assistive Technology Service Provider Interface (AT-SPI) is a toolkit-neutral way of facilitating accessibility in applications, by using native accessibility APIs. AT-SPI2 is, as was the intention from the beginning, a platform-neutral framework for providing bi-directional communication between assistive technologies (AT) and applications. Through the use of AT-SPI, an application's components' state, property and role information is communicated directly to the end user's AT, thereby facilitating bi-directional (input and output) user interactivity with, and control over, an application or compound document instance.

Who is Kavita?

Kavita got her B.S. in Computer Science and Mathematics. In 2001, she joined UMBC's graduate program in Computer Science to pursue her Ph.D. Kavita, a student with spinal muscular atrophy (SMA), can no longer physically attend classes and can only type with one finger. Last year, with the help of her mom and her wheelchair, Kavita was able to come to campus and attend classes and our meetings. This year, she is only able to continue her studies and classes via Skype from home. But even from home, Kavita continues to maintain the highest grades, a 4.0 GPA, while working on her research.

Tuesday, September 24, 2013

Proposal: Part 1

We decided to write and submit a proposal to present posters at the Tapia conference, in order to get experience presenting our work. We divided the project in two parts. My part was titled Simon speech recognition integration with Caribou On Screen Keyboard (OSK). Our goal is to integrate GNOME's on-screen keyboard, Caribou, with Simon, KDE's speech recognition software, through the Assistive Technology Service Provider Interface (AT-SPI 2), in order to develop a powerful tool for disabled people that gives them the flexibility to do the same things with almost no disadvantage: a multimodal interface that can be very useful to people with disabilities. Integrating these two platforms through AT-SPI2 can allow us to develop software for physically disabled people that gives them the means to perform many activities with minimal or no assistance. This assistive technology is implemented through two different platforms that are both open source.



In previous work, Simon was integrated with the gedit text editor through the AT-SPI2 interface, and there has also been previous work on training vi for programming using voice recognition (see "python to code by voice"). However, our objective is to create a voice interface to the keyboard, so that the user can use any editor to program without having to write and memorize the two thousand commands that the creator of that interface did. It can be a powerful tool for their daily lives. The overarching goal of our work is to implement this integration with any IDE.

Monday, September 16, 2013

Proposal: Part 2

The Tapia Conference 2014 was accepting proposal submissions, and in our case we wanted to apply for the posters. We had already discussed most of our ideas for the project, but this was the opportunity to present it formally and get feedback on it. Our goal is to create an assistive technology that comprises two open source projects: KDE's Simon voice recognition software and GNOME's Caribou on-screen keyboard. By integrating these and expanding their features, we hope to create a tool for disabled individuals that, while programming, might enhance their performance and give them access to features that might previously have been unfeasible under their circumstances.

My part, Part 2, consisted of explaining how the programming interaction would be implemented and why it would be beneficial. This led us to discuss the technical limitations of the typical user interface and input mechanisms, and how we could enable a compact interface with advanced features. The integration with programming would come directly from a compatible IDE that would port its procedures and useful data through AT-SPI2 to Caribou, and subsequently to Simon's applications. Once the data is ported, it would be managed for efficient display on the interface and for enabling voice commands. To achieve this, the Caribou interface must be extensively modified to create an area that processes the data depending on its usefulness at any point.
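A hypothetical sketch of that dynamic display area might look like the following. Everything here is invented for illustration; neither Caribou nor Simon exposes such an API, and the contexts and commands are placeholders for whatever the IDE would actually report over AT-SPI2:

```python
# Hypothetical sketch: swap the command keys shown in the on-screen
# keyboard's dynamic area based on the UI context reported by the IDE.
# Contexts, commands, and names are invented for this example.
CONTEXT_KEYS = {
    "editor":   ["comment", "collapse", "expand"],
    "menus":    ["open", "close", "save"],
    "terminal": ["run", "stop", "clear"],
}

def keys_for_context(context):
    """Return the command keys to display for the current UI context."""
    return CONTEXT_KEYS.get(context, [])

print(keys_for_context("editor"))
# ['comment', 'collapse', 'expand']
```

Each displayed key would also be registered as a voice command with Simon, so the same context switch updates both the visual layout and the recognition grammar.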

As part of the proposal, a project draft was also required. This draft will become the poster in the final work. The following is the text submitted:

The second phase of this assistive technology research project consists of integrating an on-screen keyboard (OSK) with an Integrated Development Environment (IDE) that implements a modified interface to expand functionality and facilitate interaction for users, specifically physically disabled individuals. Common assistive tools are on-screen keyboards and speech recognition software, such as Caribou, Simon and Dragon Naturally Speaking. Extensive research has been done on the efficacy of these tools, proving that they work well for controlled environments and specific tasks, but may not be useful for conditions such as motor disorders that inhibit the user’s mobility or pronunciation. Integrating their functionality to manage complex dynamic tasks is an aspect that this project explores.

Programming is an arduous task for individuals with physical disabilities who rely on independent tools to interact with their digital environment. Providing a multimodal Integrated Development Environment that accommodates programming's complex syntax and dynamic structure is key to lessening this burden. By modifying the static structure of an on-screen keyboard to dynamically adjust to the criteria of the environment, our project provides flexibility in what type of information is displayed to the users at a given moment. While facilitating input and reducing keystrokes, the application's speech recognition mode can perform any task available through the interface, thereby reducing stress on the extremities by relying on voice commands and taking advantage of convenient features such as word completion, word prediction, embedded application commands (open, close, save), or grammar-specific commands (comment, collapse, expand) that are available through both methods.
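Word completion, one of the keystroke-saving features mentioned above, can be illustrated with a simple prefix lookup. The vocabulary and function here are invented for this example; a real implementation would rank suggestions by frequency or by the surrounding code context:

```python
# Illustrative prefix-based word completion for an on-screen keyboard.
# Vocabulary and ranking are placeholders for this example.
VOCABULARY = ["print", "private", "public", "range", "return"]

def complete(prefix, limit=3):
    """Suggest up to `limit` vocabulary words starting with `prefix`."""
    return [word for word in VOCABULARY if word.startswith(prefix)][:limit]

print(complete("pr"))
# ['print', 'private']
```

Selecting a suggestion by voice or by a single key press then inserts the whole word, replacing several individual keystrokes.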

"Free" vs "Open Source"

Before starting work on the Kavita project, we read about "free" and "open source" software projects, and the difference between them. It's important to understand that "free" software isn't about zero-cost software; the word "free" carries a moral connotation. It's about the freedom to share and modify for any purpose, though the software can't be marketed as proprietary. "Open source" software, in contrast, can be shared and modified and can also be used in proprietary software. The term "open source" emerged as a vocabulary for talking about free software as a business development methodology rather than as a moral stance.

Tuesday, July 16, 2013

Introduction

This blog is a collaboration between an Assistant Professor and me, an undergrad, to develop an assistive technology to help another student at the graduate level complete her dissertation. We are working on creating a multimodal On Screen Keyboard (OSK) that can interface with an Integrated Development Environment (IDE). We are hoping to integrate this technology into one of the Humanitarian Free and Open Source Software (HFOSS) communities.