In view of the fact that for process mining purpose a
specific format of event log would be used, having
direct access to change histories would be helpful.
Unfortunately, reaching these logs needs digging
into data tables of WSS content database. Although
WSS content database contains many tables, this
section focuses on only AllUserData, AllLists and
UserInfo which hold necessary data for process
mining tools. Table 2 shows brief information about
tables used in this paper (Ethan, 2007).
Table 2: Inspecting the SharePoint content database.
AllUserD
ata
Holds information about all the list items
for each list.
AllLists Holds information about lists for each site.
UserInfo Holds information about all the users for
each site collection.
Table 3: AllUserData table.
tp_Id
Identifier for the list item, uniquely
identifying it within the AllUserData
table.
tp_ListId
List Identifier of the list or document
library containing the list item.
tp_SiteId
Identifier of the site collection containing
the list item.
tp_Version
A counter incremented any time a change
is made to the list item, used for internal
conflict detection. Due to the mapping of
application properties to the generic
columns schema in this table, changes to
application schema as well as property
values can affect a version increment.
tp_Author
Identifier for the user who created the list
item.
tp_Editor
Identifier for the user who last edited the
list item.
tp_Modified
A date and time value specifying when
this list item was last modified.
tp_Created
A date and time value specifying when
this list item was created.
nvarchar#
Columns for application-defined fields
that hold values of type nvarchar. The 64
columns are named nvarchar1 to
nvarchar64. If the column does not
contain data, this value MUST be NULL.
int#
Columns for application-defined fields
that hold values of type int. The 16
columns are named int1 to int16. If the
column does not contain data, this value
MUST be NULL.
float#
Columns for application-defined fields
that hold values of type float. The 12
columns are named float1 to float12. If
the column does not contain data, this
value MUST be NULL.
The detail description of whole content database
is beyond the scope of this article. Instead, let’s
focus on the AllUserData table as a place where all
list item data is stored. This important table has 192
columns and all list items including their history are
stored in this single table. Table3 describes the more
important columns. For a complete list of columns in
the AllUserData table, see MSDN documentations
at: http://msdn.microsoft.com/en-us/library/
dd358229(v=prot.13).aspx.
As can be seen, there are group of columns,
including nvarchar#, ntext#, int#, in which the
values of SharePoint lists are stored. Every time you
create a new column in a list, it is automatically
mapped to a “free” column of the desired type in the
AllUserData table. If you add column “Order No” of
type single line to your list for instance, an unused
database column in the range from nvarchar1 to
nvarchar 64 will be assigned to store this field data
(Krause et al., 2010).
The first thing to do is to identify the relevant
data in AllUserData table by query the table for ID
of the list needed. Since lists information are stored
in AllLists table, as shown in listing 2, filtering this
table could return the GUID of the lists.
SELECT tp_ID
FROM AllLists
WHERE tp_Title = 'You lists title'
Listing 2: Query AllLists to get list GUID.
Then, using returned tp_Id from the AllLists
table in query criterion of AllUserData can fetch the
list item data (See listing 3).
SELECT tp_ID AS [Item ID], tp_ListId AS
[List ID], tp_Version AS [Version],
tp_Author AS [CreatedBy], tp_Editor AS
[Modified By], tp_Modified AS
[Modified], tp_Created AS [Created],
nvarchar1 AS [Order No], nvarchar9 AS
[Description], nvarchar23 AS [Status]
FROM AllUserData
WHERE (tp_ListId = '39158042-39dc-4951-
9400-03f8f45893b3')
Listing 3: Query list item data.
The only remaining challenge is to get the name
of the mapped database columns from SharePoint
list. To overcome this issue, you need to filter
AllUserData for a List ID and then find related
columns. For example, Status filed data of Purchase
Order Processing list is mapped to nvarchar23.
After applying Listing 3 commands, the results
may not be completely satisfactory. Because, this
ICEIS 2011 - 13th International Conference on Enterprise Information Systems
416