Reading Source Code Aloud
Posted
by Jon Purdy
on Stack Overflow
See other posts from Stack Overflow
or by Jon Purdy
Published on 2010-01-28T07:49:03Z
Indexed on
2010/04/17
21:33 UTC
Read the original article
Hit count: 376
After seeing this question, I got to thinking about the various challenges that blind programmers face, and how some of them are applicable even to sighted programmers. Particularly, the problem of reading source code aloud gives me pause. I have been programming for most of my life, and I frequently tutor fellow students in programming, most often in C++ or Java.
It is uniquely aggravating to try to verbally convey the essential syntax of a C++ expression. The speaker must give either an idiomatic translation into English, or a full specification of the code in verbal longhand, using explicit yet slow terms such as "opening parenthesis", "bitwise and", et cetera. Neither of these solutions is optimal.
On the one hand, an idiomatic translation is only useful to a programmer who can de-translate back into the relevant programming code—which is not usually the case when tutoring a student. In turn, education (or simply getting someone up to speed on a project) is the most common situation in which source is read aloud, and there is a very small margin for error.
On the other hand, a literal specification is aggravatingly slow. It takes far far longer to say "pound, include, left angle bracket, iostream, right angle bracket, newline" than it does to simply type #include <iostream>
. Indeed, most experienced C++ programmers would read this merely as "include iostream", but again, inexperienced programmers abound and literal specifications are sometimes necessary.
So I've had an idea for a potential solution to this problem.
In C++, there is a finite set of keywords—63—and operators—54, discounting named operators and treating compound assignment operators and prefix versus postfix auto-increment and decrement as distinct. There are just a few types of literal, a similar number of grouping symbols, and the semicolon. Unless I'm utterly mistaken, that's about it.
So would it not then be feasible to simply ascribe a concise, unique pronunciation to each of these distinct concepts (including one for whitespace, where it is required) and go from there? Programming languages are far more regular than natural languages, so the pronunciation could be standardised. Speakers of any language would be able to verbally convey C++ code, and due to the regularity and fixity of the language, speech-to-text software could be optimised to accept C++ speech with a high degree of accuracy.
So my question is twofold: first, is my solution feasible; and second, does anyone else have other potential solutions? I intend to take suggestions from here and use them to produce a formal paper with an example implementation of my solution.
© Stack Overflow or respective owner