consider that we have relevant information on our
system, and the second one where we perform direct
crawling from web stores. The dataset on which the
queries are performed consists of 72 index terms (mostly
technology brands) and 7756 articles.
8.1 Use Case – Relevant Information Exists
In this use case we perform a query with keywords
(index terms) that already exist in our system, so that
we can evaluate the efficiency of our ranking algorithm.
The query we submit is “acer laptop chromebook”
(Figure 9), where all the index terms already exist in
storage and are related to existing articles.
In this case the total time spent from request to
response visualization is about 146.96 ms, and the top
three results are shown in Figure 8, where we see
that the results are accurate.
Figure 8: Search result page for query “acer laptop
chromebook”.
8.2 Use Case – No Relevant Data Exists
In the second use case we perform the query “hugo”,
and the result contains mostly articles about books,
movies and perfumes. The total response time is
about 2.41 s, and the search results show that this
approach does not provide highly relevant data (note
also that the query consists of only one word).
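The two use cases differ only in whether the query keywords are already stored as index terms. A minimal Python sketch of that dispatch follows, under stated assumptions: the names `INDEX`, `search`, and `crawl_web_stores` are hypothetical, the sample data is made up, and the ranking (number of matched index terms) is a placeholder rather than the ranking algorithm evaluated here.

```python
import time

# Hypothetical in-memory index: index term -> set of article ids.
INDEX = {
    "acer": {1, 2, 3},
    "laptop": {2, 3, 4},
    "chromebook": {3, 5},
}


def crawl_web_stores(terms):
    """Placeholder for direct crawling; a real crawler would fetch store pages."""
    return []


def search(query, index=INDEX, crawl=crawl_web_stores):
    """Return (source, ranked article ids, elapsed milliseconds)."""
    start = time.perf_counter()
    terms = query.lower().split()
    if terms and all(t in index for t in terms):
        # Use case 8.1: every keyword is already an index term.
        scores = {}
        for t in terms:
            for article in index[t]:
                scores[article] = scores.get(article, 0) + 1
        # Placeholder ranking: most matched terms first, ties by article id.
        ranked = sorted(scores, key=lambda a: (-scores[a], a))
        source = "index"
    else:
        # Use case 8.2: no relevant data stored, fall back to crawling.
        ranked = crawl(terms)
        source = "crawl"
    elapsed_ms = (time.perf_counter() - start) * 1000
    return source, ranked, elapsed_ms
```

Calling `search("acer laptop chromebook")` takes the indexed path, while `search("hugo")` falls through to the crawling stub; timing both calls in this way mirrors how the request-to-response times in the two use cases could be compared.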
9 CONCLUSIONS
In this work we have developed an environment
that adapts general search engines to the needs of a
specific domain. We have addressed the main functions
of search engines (crawling, indexing, query
processing) and tried to develop original methods
for each of these processes. We have also developed a
ranking algorithm that can be adapted to any dataset
with simple modifications.
However, we have not discussed how the system
could be distributed across multiple machines, which
may be addressed in future work. Other problems that
could be addressed include developing stemming
techniques for our system, making recommendations
based on geolocation, and improving the featured
functions of the warehouse algorithm. Additionally, in
future work we may evaluate the system with larger
datasets, which may be acquired by gathering data from
different queries entered by users.