Undermining
Social Engineering using Open Source Intelligence Gathering
Leslie Ball, Gavin Ewan and Natalie Coull
School of Engineering, Computing and Applied Mathematics, University of Abertay Dundee, Bell Street, Dundee, U.K.
Keywords: Social Engineering, Cognitive Hacking, Open Source Intelligence, Phishing, Data Mining, Visualisation.
Abstract: Digital deposits are undergoing exponential growth. These may in turn be exploited to support cyber
security initiatives through open source intelligence gathering. Open source intelligence itself is a double-
edged sword as the data may be harnessed not only by intelligence services to counter cyber-crime and
terrorist activity but also by the perpetrator of criminal activity who use them to socially engineer online
activity and undermine their victims. Our preliminary case study shows how the security of any company
can be surreptitiously compromised by covertly gathering the open source personal data of the company’s
employees and exploiting these in a cyber attack. Our method uses tools that can search, drill down and
visualise open source intelligence structurally. It then exploits these data to organise creative spear phishing
attacks on the unsuspecting victims who unknowingly activate the malware necessary to compromise the
company’s computer systems. The entire process is the covert and virtual equivalent of overtly stealing
someone’s password ‘over the shoulder’. A more sophisticated development of this case study will provide
a seamless sequence of interoperable computing processes from the initial gathering of employee names to
the successful penetration of security measures.
1 INTRODUCTION
The ubiquity of online social networking sites and
the blogosphere provides a reservoir of relatively
untapped data in terms of extracting value from their
analysis. Many opportunities are thus presented by
these data. Specific to security, the data may act as
open source intelligence (OSINT) for the benefit of
intelligence services and military strategists to
counter cyber-crime and terrorist activity. Indeed,
effective counter-terrorism has been deemed
unsuccessful “without the adequate exploitation of
open source information” (Borchgrave et al. 2006).
Other work has integrated OSINT into the wider
intelligence cycle in terms of crowd sourcing and the
empowerment of the public (Steele, 2007). More
broadly, automatic analyses have been made on
other forms of open source information such as the
natural language processing of social media to
model the prediction of social tension (Vybornova et
al., 2011) and to detect emergent conflict through
web mining (Johansson et al., 2011). These are only
a few examples of literature that report on the
benefits of social media analysis to those involved in
counter-terrorism, military strategy and social and
political stability. Contrary to the societal benefits of
OSINT, it can also be harnessed by those intent on
perpetrating crime. In the remainder of this paper we
focus on this issue and, in particular, on the coupled
and covert processes of data mining and social
engineering employed by the cyber-criminal to
deceive the user into disclosing security data for
access to computer systems.
The next section introduces the phenomenon of
social engineering and how it can be a very effective
tool for breaching security and complementary to the
technical hacking techniques, which are a direct
attack on a computer system and do not involve the
behavioural aspects of the user. Thereafter we
present a preliminary case study, which illustrates
the effectiveness of a covert process of intelligence
gathering integrated into social engineering to
compromise computer security.
2 SOCIAL ENGINEERING
Social engineering is “the art of gaining access to
buildings, systems or data by exploiting human
psychology, rather than by breaking in or using
technical hacking techniques” (CSO Magazine,
2012). As technology becomes more sophisticated
275
Ball L., Ewan G. and Coull N..
Undermining - Social Engineering using Open Source Intelligence Gathering.
DOI: 10.5220/0004168802750280
In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR-2012), pages 275-280
ISBN: 978-989-8565-29-7
Copyright
c
2012 SCITEPRESS (Science and Technology Publications, Lda.)
and users more inter-connected as a result, virtual
social interaction has inevitably followed on a huge
scale. Facebook accounts for approximately 3 in 4
minutes spent on social networking sites and 1 in
every 7 minutes spent online around the world (The
New Age, 2011). Moreover, digital deposits now go
far beyond the superficial “toast and coffee for
breakfast” type of blog. Website forums, Facebook,
Twitter, Tumblr, Wordpress.com are all examples of
social networking media that mobilise social
networks and allow the expression of opinions and
the disclosure of personal data. These data are, to
varying degrees, public and therefore open to
exploitation.
Our inter-connected virtual society has presented
opportunities to the malicious hacker, not only in
terms of direct brute force attack but also in terms of
psychological manipulation, and both contribute
towards what is known as the vector attack in
computer security terminology. Security of
Information Technology has thus become a major
concern for companies and governments. In 2010 in
the UK, cyber-terrorism was prioritised as a Tier
One threat to national security by the government.
The term cyber security is widely adopted to define
this phenomenon.
In the literature, the terms social engineering and
cognitive hacking appear to be synonymous, though
the latter has appeared less recently since it was
coined within a body of work by Cybenko et al.
(2002) and Giani and Thompson (2007). Enrici et al.
(2010) offer a discourse on the cognitive profiling of
a computer hacker and the psychological effects of
human factors in terms of usability and of human
errors in terms of failure, all within the context of IT
security.
Stech (2011) confirms that there have been few
publications that map the social and behavioural
aspects of cyber-deception to the classical denial and
deception tactics adopted in conventional warfare.
Rather, the focus has been on recognising that a
social engineering attack incorporates both technical
and social considerations that feed on the lethargy of
the user regarding security and the aggression of the
malicious hacker (Abraham and Chengalur-Smith,
2010). This combination is further endorsed by
Maan and Sharma (2012). A framework of feedback
loops has also been considered to model the
manoeuvres of the attacker against those of the
organisational countermeasures, where they
postulate that an organisation’s technical defences
are superior to their human equivalents (Gonzalez et
al., 2006). The same authors argue that the key for
the social engineer is to make the countermeasures
transparent so that they can be incorporated into the
main attack feedback loop, which measures the
outcomes of each attack, in order to evaluate the
next action to take.
With specific reference to social media content,
the use of natural language processing has been used
to measure information assurance (Raskin et al.,
2010). This technique applies to monitoring
suspicious activity at social networking sites, where
postings may exhibit inconsistency and therefore
expose the possibility of uncovering insider threats
to social engineering attacks. Linked to this is
research implementing an automated social
engineering bot attack on social media sites such as
Twitter ad Facebook (Huber et al., 2009). In a recent
review, Heikkinen (2010) states how the user can be
lulled into a false sense of security knowing that the
company implement firewall strategies and virus
detection, and emphasise the importance of user
training. The focus of our paper encapsulates the
spirit of Heikkinen’s work as well as encompassing
the notion of the partial technical and social attack of
other authors’ research already outlined.
The next section presents our case study to
illustrate the creative ideas behind the processes of
social engineering to compromise security measures
on a computer system.
3 CASE STUDY
The focus of the case study is on the proposed attack
of a company with whom we have previously
consulted. For privacy, we refer to the company as X
hereafter. The key to unlocking the security
measures on X’s computer system is its employees
by exposing them to a vector attack. All employee
data have also been made anonymous. The full
process of how the employees may be deceived to
disclose the necessary information to breach security
is revealed.
3.1 Aim
The purpose of the case study is to demonstrate
show how a malicious attacker, coupled with the
appropriate use of software tools can harness and
integrate open intelligence gathering into the social
engineering process to bring about a successful
vector attack.
3.2 The Procedure
We adopt a sequence of events to illustrate how a
KDIR2012-InternationalConferenceonKnowledgeDiscoveryandInformationRetrieval
276
socially engineered vector attack can be organised
around a target. The plan of attack is as follows and
encompasses the key stages of searching for
employee profiles, drilling down for employee
interests and targeting spear phishing attacks:
1. Prepare the software platform to perform the
strategic searches.
2. Implement a covert search on company
employees and extract detail of interest.
3. Construct a spear phishing attack targeted at
vulnerable employees.
4. Propose countermeasures to the social
engineering process.
Each of the stages is considered in the remainder of
this subsection. The first three stages illustrate the
attack, while the last stage considers how a company
might counter such an attack in the real world.
3.2.1 Acquisition of Tools
Software tools were used to effectively search
company X’s website and affiliated online content.
The tools were able to extract employee names,
identify some personal interests of these employees
and to craft a spear phishing attack based on these
data. We used Maltego to perform the first two data
gathering exercises and the Simple Phishing Toolkit
to construct the email attack. Maltego is an open
source intelligence and forensics application
program, which is capable of mining internet
websites, Twitter feeds and other social media
content. The Simple Phishing Toolkit is a relatively
new addition to the social engineer’s arsenal and
allows the construction of a spear phishing
campaign, which is essentially the confidence trick
of the operation. Most importantly, the tool provides
a website ‘scraper’ facility. This facility can
effectively extract details from a website that is
deemed of interest to a targeted individual and
design an email template that appears to have been
sent from this website. This is the crucial stage of
deception.
3.2.2 Covert Intelligence Gathering
This stage proceeds with the processes of
intelligence gathering. At this prototype stage the
entire sequence of events is not automated and
requires intermittent manual intervention. Company
X’s website contains a list of employee names and
each member of staff displays various degrees of
personal and work-related information. We selected
as targets only those employees who displayed
detailed information about themselves (i.e. both
personal and work information). The vetted list was
then input to Maltego, using manual intervention.
Figure 1, shows the full anonymised list of staff
email addresses.
Figure 1: The initial search on company X’s website
extracts employee emails (anonymised).
Having assimilated a target list of employees the
next stage was to run what are termed transforms.
These perform the drilling down process and there
are many options from which to choose. This case
study made the assumption that other emails
associated with the employees in Figure 1 would be
a good transform to perform. While the results for
this transform are potentially large, Figure 2
illustrates the returned information for one of the
employees only.
Figure 2: The anonymised transform results for emails
associated with a single employee.
While the bulk of these returns did not show
anything much of interest, other transforms based on
common interests did. With this type of transform
we found that many employees linked to the same
interest such as, for example, hill walking. This
information is invaluable to the social engineer
planning a vector attack. Maltego offers a
Undermining-SocialEngineeringusingOpenSourceIntelligenceGathering
277
visualisation module to enhance the display of
complex data. Using what is termed an ‘edge-
weighted view’ the visualisation in Figure 3 shows
which of the employees link to which common
interest. The more links to an interest the bigger the
bubble representation. Hill walking is seen as the
most common interest in this group of employees,
followed by Badminton and Travel. These items are
those that the social engineer will use to plan their
vector attacks.
Figure 3: The bubble representation that links employees
(anonymised) to common interests.
Armed with these data, the social engineer can now
proceed to the next stage of designing the spear
phishing attack, which could specifically target, say,
all those employees interested in hill walking.
3.2.3 The Vector Attack
The objective of the spear phishing attacks is to use
the covertly gathered information from previous
processes described in order to increase the
likelihood of successfully penetrating the security
system offsite.
Phishing is a confidence trick to steal personal
information, usually by email and, whilst it is not a
new phenomenon, spear phishing is a relatively
recent tool addition for the malicious attacker.
Though email phishing attacks still succeed, we are
generally much more aware in terms of recognising
these types of attack as they adopt typical designs
such as a generic greeting, a sense of urgency or a
direct request for personal information. Furthermore,
automated spam filters use these same criteria to
intercept suspicious looking emails. However, the
same measures fall short when encountering a spear
phishing attack. This is because, unlike generic
phishing attacks, they are not issued widely and
randomly but rather they target individuals and so
adopt a more socially aware design into their emails.
The goal is, however, the same in that they will ask
the receiver to click a link, which may indeed appear
to be urgent, though related to their interests
directly. It is the contextualising of the request
within a personalised email that increases the
probability of success. By targeting the individual in
this way the attack seeks to abuse the relationship of
trust and this falls categorically into the repertoire of
the social engineer (Williams, 2011).
Using this approach, the generic emails of
phishing attacks have essentially been replaced with
emails from seemingly trusted sources and by
displaying the recipient’s name as part of the
personalisation. The spear phisher thrives on
familiarity by knowing your name, your address and
a little about your personal interests (Norton, no
date). So, while users are now more educated on
phishing attacks, how likely are they to not click on
a link that has come from an apparently trusted
source such as a friend or a website used by the user
for leisure activity?
The Simple Phishing Toolkit allows the social
engineer to construct a list of target individuals
including their name, email address and groups them
into categories related to their interests as in Figure
4.
Figure 4: The Simple Phishing Toolkit generates lists of
individuals to target (anonymised).
The software can then generate a personalised email
for each of the groups by using the text associated
with an identified website template. The flavour of
the email is personal and friendly and invites the
targeted individual to visit their website, a bogus
website, for further information (Figure 5). Once the
individual has clicked the linked to the website
malware is activated that can steal security by
keystroke monitoring or from user activity on
Company X’s server. In this respect there is never a
request to the individual to disclose their security
KDIR2012-InternationalConferenceonKnowledgeDiscoveryandInformationRetrieval
278
information directly as the malware achieves this in
lieu of the request. The victim has thus been
undermined.
Figure 5: The spear phishing attack takes the form of a
personalised email inviting the recipient to visit a website
for further information (anonymised).
3.2.4 Proposal for Countermeasures
So what could any company or other venture do to
secure their computing infrastructure from these
types of creative vector attacks?
Firstly, it is clear from this initial case study that
those staff members leaking professional and
personal information are more vulnerable to a social
engineering attack. A company should therefore
provide the necessary education to its staff on the
security threats (Abraham and Chengalur-Smith,
2010; Heikkinen, 2010) that can be engineered from
public data and to always question requests to click
on links that were unexpected or unsolicited. While
this study only simulated an attack on those
interested in hill walking, any other interest
highlighted in Figure 3 could have been used to
similar effect. Even less suspicious would be those
email requests that are work related and seem to
follow the natural course of everyday working life
rather than specific to personal interests.
Taken further, a company could design and
implement policies that prevent their staff from
posting personal details. Without such a policy in
place, the ‘humanising’ effect ensues, which plays
straight into the hands of the social engineer who is
studying the psychological behaviour of its targets in
order to mimic them in the attack.
Lastly the company could take a more aggressive
approach by actively spear phishing their employees
explicitly in a harmless attack in order to test their
awareness. It in effect becomes the company drill of
a cyber-attack as a preventative measure rather than
a fire drill exercise, for example.
4 CONCLUSIONS
This paper has focused on the creative process of
cyber-attacks using the surreptitious techniques of
social engineering. Ultimately, however, the social
engineering attack has been identified as the top
information security threat in 2012 (Trend Micro,
2012). A case study was designed to simulate such
an attack on company X’s computing infrastructure
in order to highlight the vulnerability of disclosing
too much data on publicly available websites.
Our spear phishing email demonstrates how
using appropriate search and mining tools and
manual interventions, a company’s computer
security could be compromised by exploiting
personal details posted by the company’s staff
members.
By extracting an initial list of staff names and
their associated interests it was possible, using open
source intelligence and phishing software, to craft a
personalised email that engendered trust in the user
but was in fact a confidence trick to get the user to
click on a link to a bogus website. By clicking on a
malicious link, malware can be easily downloaded to
the victim’s machine, in spite of existing security
measures such as firewalls and anti-virus, which
could steal user credentials and other valuable
information.
The work illustrates that spear phishing as
opposed to normal phishing, is likely to be much
more effective as they target the individual in a more
socially aware design than the latter, which issues a
random and blanket email attack.
The case study illustrates the various stages
required to search, extract and visualise intelligence
data, which are invaluable to designing the spear
phishing attack. The future requirement from this
work is to devise an intelligent bot that can perform
the entire sequence of events seamlessly without the
manual intervention of the attacker. The
achievement of this with a bot would require the
integration of pattern recognition algorithms for text
analytics as well as decision-making capabilities on
who to target and how to automate the email attack
effectively.
REFERENCES
Abraham, S. and Chengalur-Smith, I., 2010. An Overview
Undermining-SocialEngineeringusingOpenSourceIntelligenceGathering
279
of Social Engineering Malware: Trends, Tactics, and
Implications. Technology in Society, 32(3): 183-196.
Borchgrave de, A., Sanderson, T. and MacGaffin J., 2006.
Open Source Information: The Missing Dimension of
Intelligence. Report of the CSIS Transnational Threats
Project.
CSO Magazine, 2012. The Ultimate Guide to Social Engi-
neering. [Online] Accessed 12/06/2012 at http://assets.
csoonline.com/documents/cache/pdfs/Social-Enginee
ring-Ultimate-Guide.pdf
Cybenko, G., Giani, A. and Thompson, P., 2002.
Cognitive Hacking: A Battle for the Mind. IEEE
Computer, 35(8), 50-56.
Enrici, I., Ancilli, M. and Lioy, A., 2010. A Psychological
Aproach to Information Technology Security. In 3
rd
Conference on Human System Interaction, 459-466.
Giani and P. Thompson. Detecting Deception in the
Context of Web 2.0. In Web 2.0 Security & Privacy,
2007.
Gonzalez, J., Sarriegi, J. and Gurrutxaga, A., 2006. A
Framework for Conceptualizing Social Engineering
Attacks. CRITIS 2006, LNCS 4347, 79-90.
Heikkinen, S., 2010. Social Engineering in the World of
Emerging Communication Technologies. In
Proceedings of Wireless World Research Forum
meeting #17, Nov 2006.
Huber, M., Kowalski, S., Nohlberg, M. and Tjoa, S., 2009.
Towards Automating Social Engineering Using Social
Networking Sites. In International Conference on
Computational Science and Engineering, 3:117-124.
Johansson, F., Brynielsson, J., Hörling, P., Malm, M.,
Mårtenson, C., Truvé, S. and Rosell, M., 2011.
Detecting Emergent Conflicts Through Web Mining
and Visualization. In European Intelligence and
Security Informatics Conference 2011, 346-353.
Maan, P. and Sharma, M., 2012. Social Engineering: A
Partial Technical Attack. International Journal of
Computer Science Issues, 9(2), 1694-0814.
The New Age, 2011. Social Networking is the most Popu-
lar Online Activity. [Online] Accessed 12/6/2012 at
http://www.thenewage.co.za/38836-1021-53-Social_
networking_is_the_most_popular_online_activity
Norton., [no date]. Spear Phishing: Scam, not Sport.
[Online] Accessed 12/06/2012 at http://uk.norton.com
/spear-phishing-scam-not-sport/article
Raskin, V., Taylor, J. and Hempelmann, C., 2010.
Ontological Semantic Technology for Detecting
Insider Threat and Social Engineering. NSPW’10, 21-
23 Sept. 2010, Concord, MA, 115-127.
Stech, F., Heckman, K., Hilliard, P. and Ball, R., 2011.
Scientometrics of Deception, Counter-deception, and
Deception Detection in Cyber-space. PsychoNology
Journal, 9(2), 79-122.
Steele, R., 2007. Open Source Intelligence. In Johnson, L.
(ed.) Strategic Intelligence: The Intelligence Cycle,
Praeger. Westport CT, 96-122.
Trend Micro, 2012. Social Engineering Remains Top
Security Threat in 2012. [Online] Accessed 12/06/
2012 at http://www.newswit.com/.it/2012-04-05/bf95
43225f9137e29c7a64af58a75c2b/
Vybornova, O., Smirnov, I., Sochenkov, I., Kiselyov, A.
and Tikhomirov, I., 2011. Social Tension Detection
and Intention Recognition Using Natural Language
Semantic Analysis. European Intelligence and
Security informatics Conference 2011, 277-281.
Williams, C., 2011. Google Cyber Attacks: What is Spear
Phishing? [Online] Accessed 12/06/2012 at http://
www.telegraph.co.uk/technology/news/8552297/Goog
le-cyber-attacks-what-is-spear-phishing.html
KDIR2012-InternationalConferenceonKnowledgeDiscoveryandInformationRetrieval
280