respectively. The chromosome compositions of the
five components in terms of features are shown in
Table 1.
The last component is constructed by
preprocessing the user profile data pertaining to the
movie attributes; it has a set of 7 unique features.
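To make the role of a chromosome concrete, the following is a minimal sketch, assuming each gene is a non-negative weight on one profile feature and that two users are compared with a weighted Euclidean distance. The function name, the weight values and the 7-element vectors are hypothetical illustrations; the exact encoding and similarity measure used in the experiments may differ.

```python
import math

def weighted_distance(weights, user_a, user_b):
    """Weighted Euclidean distance between two equally long feature vectors.

    weights plays the role of a chromosome: one non-negative gene per feature.
    """
    if not (len(weights) == len(user_a) == len(user_b)):
        raise ValueError("weights and feature vectors must have equal length")
    return math.sqrt(
        sum(w * (a - b) ** 2 for w, a, b in zip(weights, user_a, user_b))
    )

# Hypothetical 7-gene chromosome and two hypothetical normalized profiles.
chromosome = [0.9, 0.1, 0.5, 0.0, 0.3, 0.7, 0.2]
profile_a = [1.0, 0.0, 0.5, 0.2, 0.0, 1.0, 0.4]
profile_b = [0.0, 1.0, 0.5, 0.9, 0.0, 1.0, 0.4]

print(weighted_distance(chromosome, profile_a, profile_b))
```

A gene of 0.0 removes its feature from the comparison entirely, which is one way a GA could learn to ignore uninformative features.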
The prediction fitness of GA is clearly
much higher than that of Pearson. The average
fitness of Pearson is about 67.9%, whereas the
average fitness of ‘GA’ with various combinations
of features ranges from 83.51% to 83.93%. In
particular, the fitness of ‘GA UserPro12’ is the
highest, about 23.61% better than that of Pearson
in relative terms.
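The 23.61% figure can be checked with a short calculation. The sketch below assumes that ‘GA UserPro12’ sits at the upper end of the reported range (83.93%) and that the improvement is measured relative to the Pearson fitness; the function and variable names are illustrative only.

```python
def relative_improvement(baseline, value):
    """Improvement of value over baseline, as a percentage of the baseline."""
    return (value - baseline) / baseline * 100.0

pearson_fitness = 67.9        # average fitness of Pearson, in percent
ga_userpro12_fitness = 83.93  # assumed: top of the reported GA range

print(round(relative_improvement(pearson_fitness, ga_userpro12_fitness), 2))  # 23.61
```

The absolute gap is about 16 percentage points; the 23.61% arises only when that gap is divided by the Pearson baseline.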
The experiment also measures the process time for
running GA on different features. ‘GA UserPro7’,
that is, GA with 7 features, outperforms the other 4 in
terms of speed. Its average process time is 19
seconds, a 67.53% reduction over ‘GA’.
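The relationship between the two timing figures can be sketched as follows. The average process time of ‘GA’ is not restated here, so the code back-computes an implied baseline from the 19 seconds and the 67.53% reduction; that baseline is an inference from rounded figures, not a reported measurement, and the function names are hypothetical.

```python
def percent_reduction(baseline_seconds, new_seconds):
    """Reduction of new_seconds relative to baseline_seconds, in percent."""
    return (baseline_seconds - new_seconds) / baseline_seconds * 100.0

def implied_baseline(new_seconds, reduction_percent):
    """Baseline time that would yield the given percentage reduction."""
    return new_seconds / (1.0 - reduction_percent / 100.0)

# Inferred, not reported: the 'GA' time implied by 19 s and a 67.53% reduction.
ga_seconds = implied_baseline(19.0, 67.53)
print(ga_seconds)
print(percent_reduction(ga_seconds, 19.0))
```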
In the experiment, the longest time is taken
by ‘GA UserPro37’ and the shortest by ‘GA
UserPro7’. This reinforces the belief that the more
features are fed into the recommender, the longer
the processing takes.
3.3 Experiment 3 - Neighbor Set
For a particular user, we tested the performance with
neighbor sets of different sizes, from 10 to 100.
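As an illustration of what a neighbor set of size k means, the following minimal sketch ranks candidate users by Pearson correlation with the active user and keeps the top k. It assumes ratings are stored as dictionaries keyed by item; the function names and the toy ratings are hypothetical, and the GA variants would rank neighbors with their own evolved similarity measure rather than plain Pearson.

```python
import math

def pearson(ratings_a, ratings_b):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    xs = [ratings_a[i] for i in common]
    ys = [ratings_b[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def top_k_neighbors(active, others, k):
    """Return the ids of the k users most correlated with the active user."""
    scored = [(pearson(active, ratings), uid) for uid, ratings in others.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [uid for _, uid in scored[:k]]

# Toy data, invented for illustration only.
active = {"m1": 5, "m2": 3, "m3": 1}
others = {
    "u1": {"m1": 4, "m2": 3, "m3": 2},
    "u2": {"m1": 1, "m2": 3, "m3": 5},
    "u3": {"m1": 5, "m2": 5, "m3": 5},
}
print(top_k_neighbors(active, others, 2))
```

Varying k from 10 to 100 in such a routine corresponds to the group sizes tested in this experiment.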
3.3.1 Process Time vs. Neighbor Set
This experiment shows the process time of GA running
on different features with different neighbor set sizes.
At the beginning, their performances are close. As
the neighbor set size increases, the process times for
‘GA’, ‘GA Merge’ and ‘GA UserPro37’ increase quite
sharply. The additional features they have in
common are the 18 movie genres.
As we can see, for ‘GA UserPro37’, with 37 features,
the process time increases steadily as the neighbor
set expands, whereas for ‘GA UserPro7’, with 7
features, the process time increases slowly.
3.3.2 Fitness vs. Neighbor Set
In the experiment, the fitness for the different feature
sets rises gradually as the neighbor set grows.
Interestingly, once the neighbor set size exceeds
35, the fitness stays constant. As
indicated by the dotted line in the chart, the fitness
approaches 95% at the turning point. Further
increases in the neighbor set size have no effect on it.
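A turning point of this kind can be located programmatically. The sketch below is one simple way to do so, assuming a plateau is defined as the earliest size from which all later fitness values stay within a tolerance band. The function name, the tolerance and the sample curve are invented for illustration and are not the measured results.

```python
def plateau_start(sizes, fitness, tolerance=0.5):
    """Smallest size from which fitness varies by at most tolerance points."""
    if len(sizes) != len(fitness):
        raise ValueError("sizes and fitness must have equal length")
    for i in range(len(sizes)):
        tail = fitness[i:]
        if max(tail) - min(tail) <= tolerance:
            return sizes[i]
    return None  # only reached for empty input

# Invented curve with the general shape described in the text.
sizes = [10, 20, 30, 40, 50]
fitness = [80.0, 88.0, 93.0, 95.0, 95.1]
print(plateau_start(sizes, fitness))
```

Knowing where the plateau begins allows the neighbor set to be capped at that size, avoiding process time that brings no fitness gain.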
4 CONCLUSION
Our proposed GA-based hybrid CF model combines
correlation analysis between the active user and other
users with the contents of the items rated by peer users.
The information in user profiles and item attributes is
encoded into GA chromosomes. As our
experiments show, this GA model offers more
accurate recommendations than the Pearson
algorithm. When the GA is applied to user profile
features, which are more valuable than other features
such as movie genres, the similarity measure finds
neighbors with tastes similar to the user's; as a result,
the user's preferences can be better predicted. Moreover,
when user profile features are used alone, the
process speeds up. In essence, this hybrid approach
exploits the merits of CF techniques by selectively
encoding both the user profiles and the product
information into the same chromosomes in a Genetic
Algorithm. Our experiments also show that the GA
fitness stays constant once the neighbor set size
exceeds about 35. This is a positive sign for the
scalability and speed of GA-based online
recommendation systems.
REFERENCES
Breese, J., Heckerman, D., and Kadie, C., 1998. Empirical
Analysis of Predictive Algorithms for Collaborative
Filtering. Proceedings of the 14th Conference on
Uncertainty in Artificial Intelligence.
Ujjin, S., Bentley, P., 2002. Evolving Good
Recommendations. Genetic and Evolutionary
Computation Conference (GECCO).
GALib, http://lancet.mit.edu/ga/
A HYBRID GA-BASED COLLABORATIVE FILTERING MODEL FOR ONLINE RECOMMENDERS