values of g-means (Kubat and Matwin, 1997), the most
widely used measure for evaluating results on
imbalanced datasets, and the area under the ROC
curve are both high.
Table 6: Results of C. Validation and the studied subset.

                      C. Validation   Studied subset
G-means               0.8071          0.9170
ROC Area (Positive)   0.869           0.941
ROC Area (Negative)   0.869           0.941
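
As a reminder of what the g-means value measures, the following minimal sketch computes it as the geometric mean of the accuracy on the positive class and the accuracy on the negative class, following the definition of Kubat and Matwin (1997). The function name and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the data behind Table 6.

```python
import math


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of the per-class accuracies (hypothetical helper)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes need at least one example")
    acc_positive = tp / (tp + fn)  # accuracy on positive examples
    acc_negative = tn / (tn + fp)  # accuracy on negative examples
    return math.sqrt(acc_positive * acc_negative)


# Illustrative counts only, not taken from the experiments.
print(round(g_mean(tp=90, fn=10, tn=160, fp=40), 4))  # 0.8485

# A classifier that always predicts the majority (negative) class
# scores 0, which is why the measure suits imbalanced datasets.
print(g_mean(tp=0, fn=100, tn=200, fp=0))  # 0.0
```

Because the two accuracies are multiplied, a high value is only possible when both classes are classified well.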
7 CONCLUSIONS
We have presented a learning system that detects
the headers of a table and provides the relationship
between each header and the cells under its scope.
This is an important improvement because the content
of the table is no longer merely a list of elements:
the table recovers its two-dimensional nature, which
allows impaired users to obtain all the information
inside the table.
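
The difference between a flat list of elements and a table with header relationships can be illustrated with a small sketch. It assumes, purely for illustration, that the headers have already been detected and occupy the first row and the first column; the function name, parameters, and sample table are hypothetical, and the code does not reproduce the learning system itself.

```python
from typing import List, Tuple


def cells_with_headers(
    grid: List[List[str]], header_rows: int = 1, header_cols: int = 1
) -> List[Tuple[str, List[str]]]:
    """Pair each data cell with the row and column headers covering it."""
    result = []
    for r in range(header_rows, len(grid)):
        for c in range(header_cols, len(grid[r])):
            row_headers = [grid[r][h] for h in range(header_cols)]
            col_headers = [grid[h][c] for h in range(header_rows)]
            result.append((grid[r][c], row_headers + col_headers))
    return result


# Hypothetical table: first row and first column are headers.
table = [
    ["", "Mon", "Tue"],
    ["Open", "9:00", "10:00"],
    ["Close", "17:00", "18:00"],
]

for value, headers in cells_with_headers(table):
    # A linear reading would give only "9:00"; with headers the cell
    # is read as "Open, Mon: 9:00".
    print(", ".join(headers) + ": " + value)
```

Each data cell is thus presented together with its headers rather than as an isolated element.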
The proposed solution has been tested with a
heterogeneous set of real Web pages. This set was
selected completely at random, which gives us
confidence that the good results are not limited to
one specific situation. The system obtains excellent
results and improves on those of the systems
developed up to now. The study of previous work in
Section 4 has thus allowed us to address some of the
shortcomings detected in this subject.
The next step will be to include this system in the
ACTAW platform. With this application, the feature
can readily offer its help to all kinds of people.
REFERENCES
World Wide Web Consortium (W3C) (1999a). Web
Content Accessibility Guidelines 1.0.
http://www.w3.org/TR/WCAG10/ (Retrieved on June 2007)
World Wide Web Consortium (W3C) (2007a). Web
Accessibility Initiative (WAI). http://www.w3.org/WAI/
(Retrieved on June 2007)
Chen Shan, Hong Dan, Vincent Shen. An experimental
study on Validation Problems with existing HTML
Webpages. In Proceedings of the International Conference
on Internet Computing, pages 373-379, Las Vegas,
USA, 2005.
Quality Assurance Activity (W3C) (2007) The W3C
Markup Validation Service. http://validator.w3.org/
(Retrieved on September 2007)
Benfeng Chen, Vincent Y. Shen. Transforming Web Pages
to Become Standard-Compliant through Reverse
Engineering. In Proceedings of the Workshop Web for All
in International World Wide Web Conference, pages 14-22,
Edinburgh, UK, 2006.
World Wide Web Consortium (W3C) (2000) HTML
Techniques for Web Content Accessibility Guidelines
1.0, http://www.w3.org/TR/WCAG10-HTML-ECHS/
(Retrieved on June 2007)
Creating Accessible Tables – Data Tables (2007) Web
Accessibility in Mind (WebAIM)
http://www.webaim.org/techniques/tables/data.php
(Retrieved on September 2007)
Kubat, M, Matwin, S. Addressing the Curse of Imbalanced
Training Sets: One-Sided Selection. In Proceedings of
the 14th International Conference on Machine
Learning, 1997.
Freedom Scientific (2007). JAWS for Windows.
http://www.freedomscientific.com/fs_products/software_jaws.asp
(Retrieved on June 2007)
Ai Squared (2007). Ai Squared site.
http://www.aisquared.com/index.cfm (Retrieved on June 2007)
Yeliz Yesilada, Robert Stevens, Carole Goble and Shazad
Hussein. Rendering Tables in Audio: The Interaction
of Structure and Reading Styles. In Proceedings of
ASSETS’04, pages 16-23, Atlanta, Georgia, USA,
2004.
Juan Manuel Fernández, Vicenç Soler and Jordi Roig.
Automatic Conversion Tool for Accessible Web. In
Proceedings of the 3rd International Conference on
Web Information Systems and Technologies, pages
459-462. Barcelona, Spain, 2007.
Enrico Pontelli and Tran Cao Son. Planning, Reasoning,
and Agents for Non-visual Navigation of Tables and
Frames. In International ACM SIGCAPH Conference
on Assistive Technologies, pages 73-80. Edinburgh,
UK, 2002.
Robert Filepp, James Challenger and Daniela Rosu.
Improving the accessibility of aurally rendered HTML
tables. In International ACM SIGCAPH Conference
on Assistive Technologies, pages 9-16. Edinburgh, UK,
2002.
Bernhard Krüpl and Marcus Herzog. Visually Guided
Bottom-Up Table Detection and Segmentation in Web
Documents. In Proceedings of the International World
Wide Web Conference, pages 933-934, Edinburgh,
UK, 2006.
K. Kottapally, C. Ngo, R. Reddy, E. Pontelli, T.C. Son and
D. Gillan. Towards the Creation of Accessibility
Agents for Non-visual Navigation of the Web. In ACM
Conference on Universal Usability, pages 134-141,
Vancouver, Canada, 2003.
World Wide Web Consortium (W3C) (2007b).
www-html@w3.org Mail Archives.
http://lists.w3.org/Archives/Public/www-html/2007May/0416.html
(Retrieved on June 2007)
Ian H. Witten and Eibe Frank. Data Mining: Practical
Machine Learning Tools and Techniques, 2nd Edition.
Elsevier, San Francisco, USA, 2005.
Free Software Foundation (2007). GNU General Public
License Version 3. http://www.gnu.org/copyleft/gpl.html
(Retrieved on September 2007)
WEBIST 2008 - International Conference on Web Information Systems and Technologies