may attempt to launch our system to process a graphic
that is not a bar chart. Therefore, when the
CONTROL+Z keystroke is detected, the BHO runs a brief
image processing algorithm to determine whether the
selected graphic has particular attributes that identify
it as a possible bar chart, such as whether the graphic
has 20 or fewer gray levels, and whether the graphic
contains at least two rectangles with a common begin-
ning row or column. If the graphic does not appear
to be a bar chart, the message “The selected graphic
does not appear to be a bar chart,” is read to the user
by JAWS.
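The two checks named above can be sketched as follows. This is only an illustrative sketch, not the BHO's actual image processing code: the function names, the white background value, the minimum rectangle area, and the reading of "common beginning row or column" as a shared bottom row or shared left column are all assumptions made for the example.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]           # rows of gray levels (0-255)
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive


def count_gray_levels(img: Image) -> int:
    """Number of distinct gray levels that occur in the image."""
    return len({pixel for row in img for pixel in row})


def find_filled_rectangles(img: Image, background: int = 255,
                           min_area: int = 4) -> List[Box]:
    """Bounding boxes of same-gray connected regions that exactly fill their box.

    The background value and min_area are assumptions for this sketch.
    """
    height, width = len(img), len(img[0])
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or img[r][c] == background:
                continue
            # Flood-fill the 4-connected region sharing this gray level.
            level = img[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right, area = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and img[ny][nx] == level):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # A region is a rectangle when it covers its whole bounding box.
            box_area = (bottom - top + 1) * (right - left + 1)
            if area == box_area and area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes


def looks_like_bar_chart(img: Image, max_gray_levels: int = 20) -> bool:
    """True if the image passes both heuristic checks described in the text."""
    if count_gray_levels(img) > max_gray_levels:
        return False
    boxes = find_filled_rectangles(img)
    bottoms = [box[2] for box in boxes]   # shared baseline of vertical bars
    lefts = [box[1] for box in boxes]     # shared baseline of horizontal bars
    return len(set(bottoms)) < len(bottoms) or len(set(lefts)) < len(lefts)
```

A graphic that fails either check would trigger the spoken message quoted above. Because the test is deliberately cheap, it can only flag a graphic as a possible bar chart.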
2.2.2 Focusing On Bar Charts
When browsing a web page in Internet Explorer,
JAWS users have the option of “tabbing” through the
content of the page. When the user repeatedly presses the Tab
key, JAWS traverses the elements on the page according
to the browser’s default tab order for that page
or the tab order specified in the HTML code. While the
user is traversing a web page in JAWS using the Tab
key, the focus in Internet Explorer is updated along
with JAWS’ focus. For graphics that are in the tab
order, JAWS will read the alt text (if any) associ-
ated with the graphic when the user traverses to the
graphic. If the user hits CONTROL+Z when JAWS
reaches the graphic, our BHO will be activated.
However, many graphics are not included in the
tab order of the page, since there is typically no action
associated with them (such as a link to follow or
a text box to fill in). To address this issue, when a web
page is opened in Internet Explorer, our BHO scans
all of the graphics on that page, applying
the bar chart detection logic described previously.
If an image appears to be a bar chart, our
BHO will insert that graphic into the tab order of the
page, and will append “This graphic appears to be a
bar chart” to the alt text (if any). This pre-processing
of the web page ensures that the user will not acci-
dentally overlook bar charts that could be processed
by our system while traversing the page.
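The pre-processing pass described above can be sketched as follows. The sketch uses a simplified, hypothetical model of a page's images rather than the Internet Explorer Document object that the BHO works with; `PageImage`, `mark_bar_charts`, and the use of a tab index of 0 to place an image in the tab order are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

NOTICE = "This graphic appears to be a bar chart"


@dataclass
class PageImage:
    """Hypothetical stand-in for an image element on the page."""
    src: str
    alt: Optional[str] = None
    tab_index: Optional[int] = None   # None: not in the tab order


def mark_bar_charts(images: List[PageImage],
                    is_probable_bar_chart: Callable[[PageImage], bool]
                    ) -> List[PageImage]:
    """Put probable bar charts into the tab order and extend their alt text."""
    marked = []
    for image in images:
        if not is_probable_bar_chart(image):
            continue
        if image.tab_index is None:
            # Assumption: index 0 makes the image reachable by tabbing,
            # in document order.
            image.tab_index = 0
        if not image.alt:
            image.alt = NOTICE
        elif NOTICE not in image.alt:
            # Append to existing alt text; skip if already annotated.
            image.alt = f"{image.alt} {NOTICE}"
        marked.append(image)
    return marked
```

The detection predicate is passed in as a parameter so that the same pass can reuse whatever bar chart test is applied when the user presses CONTROL+Z.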
When a web page is opened in Internet Explorer
while JAWS is running, JAWS automatically begins
reading the content of the page from top to bottom
(this is sometimes called “say all” mode). It is generally
more convenient for JAWS users to use
“say all” mode than to tab through the content
of the page. If the BHO’s pre-processing has detected
a potential bar chart, the user will be alerted to its
presence by JAWS reading the alt text, “This graphic
appears to be a bar chart,” that was inserted into the
document by our BHO. Unfortunately, while in the
“say all” mode, the focus in JAWS is not reflected
in Internet Explorer. This poses a problem
because our BHO relies on the Document object in Internet
Explorer to identify the graphic in which the user
is interested. Fortunately, the latest release of JAWS
allows the user to set the application focus to the location
of the JAWS virtual cursor by entering
CONTROL+INSERT+DELETE. After entering this combination while
the JAWS virtual cursor is on the graphic containing the bar
chart, the user may then enter CONTROL+Z
to hear the summary of the bar chart.
2.2.3 Extensibility of the Browser Extension
While the current version of the user interface has
been designed specifically with JAWS and Internet
Explorer in mind, we expect similar solutions to work
for other applications. For example, extensions sim-
ilar to BHOs can be developed for Mozilla’s Firefox
browser using the Cross Platform Component Object
Model (XPCOM). Regarding the use of screen read-
ers other than JAWS, our BHO in Internet Explorer
will work with any screen reader; it is simply a mat-
ter of investigating how the focus of Internet Explorer
and the screen reading software interact and of ensur-
ing that the keystroke combination does not conflict
with existing screen reader functionality. For visually
impaired users who primarily use a screen magnifier
(such as ZoomText), rather than a screen reader, the
text produced by our BHO can be handled in the same
manner as text in any other application.
2.3 Processing the Image
Once the browser component of the SIGHT system
has detected that the user would like to access a particular
graphic, it sends the image to the Visual Extraction
Module (VEM). The VEM is responsible for analyzing the
graphic’s image file and producing an XML represen-
tation containing information about the components
of the information graphic including the graphic type
(bar chart, pie chart, etc.) and the textual pieces of the
graphic (such as its caption). For a bar chart, the rep-
resentation includes the number of bars in the graph,
the labels of the axes, and information for each bar
such as the label, the height of the bar, the color of the
bar, and so forth (Chester and Elzer, 2005). This mod-
ule currently handles only electronic images produced
with a given set of fonts and no overlapping charac-
ters. In addition, the VEM currently assumes standard
placement of labels and axis headings. Work is under-
way to remove these restrictions. But even with these
restrictions removed, the VEM can assume that it is
dealing with a simple bar chart, and thus the problem
of recognizing the entities in a graphic is much more
constrained than typical computer vision problems.
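To make the shape of such a representation concrete, the following sketch builds an XML description of a bar chart from the kinds of information listed above. The element and attribute names, the function name, and any sample values passed to it are hypothetical; the actual schema produced by the VEM is described in (Chester and Elzer, 2005) and is not reproduced here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

Bar = Dict[str, Union[str, int]]   # keys used here: "label", "height", "color"


def build_bar_chart_xml(caption: str, x_label: str, y_label: str,
                        bars: List[Bar]) -> str:
    """Serialize a bar chart description to XML (hypothetical schema)."""
    root = ET.Element("graphic", {"type": "bar_chart"})
    ET.SubElement(root, "caption").text = caption

    # Axis labels.
    axes = ET.SubElement(root, "axes")
    ET.SubElement(axes, "axis", {"name": "x"}).text = x_label
    ET.SubElement(axes, "axis", {"name": "y"}).text = y_label

    # One element per bar, with the bar count recorded on the parent.
    bars_el = ET.SubElement(root, "bars", {"count": str(len(bars))})
    for bar in bars:
        ET.SubElement(bars_el, "bar", {
            "label": str(bar["label"]),
            "height": str(bar["height"]),
            "color": str(bar["color"]),
        })
    return ET.tostring(root, encoding="unicode")
```

A downstream module could then parse this text with `ET.fromstring` and read the bar count, axis labels, and per-bar attributes without needing access to the original image.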
WEBIST 2007 - International Conference on Web Information Systems and Technologies