
 
 
Digital television will offer a large number of
digital broadcast data services, such as the
Electronic Program Guide (EPG) and digital
Teletext, as well as TV channels. People can access
digital television services on the move using mobile
devices such as mobile phones and PDAs.
Users are often frustrated in their efforts to
access these services with the limited number of
input buttons on their mobile devices. Also, some
mobile devices, such as a mobile digital television
set in a car, demand less user input. Thus, a user
interface that requires minimal or no user
intervention is an advantage. It could save the user
time, effort, and frustration.
Our goal is to find an intelligent solution that
creates “zero-input” browsing of services. In more
detail, whenever a user switches on his or her mobile
digital television set or is at any state of a TV
service, the system should be able to recommend
the next button to be pressed and, if the user does
not want to give his or her own input, execute the
function represented by this button automatically
instead of requiring manual browsing. The number
of button presses can thereby be significantly
reduced.
To avoid confusion: this paper is not about
helping novice users or compensating for a poor
original user interface, but about addressing the
design problem with a novel learning algorithm. We
also argue that multiple users are unlikely to access
the same device and interrupt the agent.
It is very difficult to predict the real intentions of
a user because there are no examples to guide the
agent’s learning, and most of the time the
interactions are stochastic. Interactions are dynamic,
occur in parallel with learning, and demand
real-time reaction. An unexpected button-press
recommendation is unacceptable.
The agent’s learning in this problem depends
heavily on the user’s interactions or experiences
with the environment, as well as on changes in the
broadcast environment itself. Consequently, we use
experience-based and reinforcement learning
techniques from machine learning, in particular the
standard Q-learning algorithm. Section 3 describes
the reinforcement learning technique used in this
paper and our approach.
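As an illustration of the standard Q-learning update
mentioned above, the following minimal Python sketch
treats each service state (for example, a Teletext
page) as a state and each button as an action. The
class name QTable, the state and button labels, and
the values of the learning rate alpha and the discount
factor gamma are assumptions made for this example;
they are not taken from the system described in this
paper.

```python
from collections import defaultdict


class QTable:
    """Tabular Q-learning over (service state, button) pairs.

    Hypothetical sketch: names and default parameters are
    illustrative assumptions, not the paper's implementation.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.actions = list(actions)   # buttons available on the device
        self.alpha = alpha             # learning rate (assumed value)
        self.gamma = gamma             # discount factor (assumed value)
        self.q = defaultdict(float)    # (state, action) -> estimated value

    def update(self, state, action, reward, next_state):
        """Apply one standard Q-learning update for an observed button press."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def recommend(self, state):
        """Return the button with the highest estimated value in this state."""
        return max(self.actions, key=lambda a: self.q[(state, a)])
```

In such a sketch, update would be called after each
observed button press, and recommend would supply the
button to suggest when the user gives no input.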
2  DESIGN OF LEARNING AGENT 
Mobile digital television is estimated to consist
of tens of TV channels and 800 digital Teletext
pages [Peng, 2002]. A navigation agent in a mobile
device is more personalized toward the individual
user’s interests and dynamic behavior [Lieberman,
2001]. Every time the user presses a button on a
mobile device, that is an expression of interest in
the subject of the service.
The design goal of the learning agent in this paper
is to learn and infer the user’s intentions and
interests by tracking interactions between the user
and the device (i.e., the history of the user’s
behavior) over the long term, and to provide a
continuous, real-time display of recommendations
for the user. The agent keeps any significant history
of interaction. Browsing history, after all, is a rich
source of input about the user’s interests
[Lieberman, 2001].
A further goal of the learning agent is to track
and understand users’ patterns of service browsing.
In this paper, a reward function in the Q-learning
algorithm is used to match the user’s behavior in the
current situation with the past behavior whose
browsing pattern fits most closely, and to return its
prediction.
The agent performs reconnaissance by tracking
the user’s browsing history to recommend new
actions. This concept is not new in user interface
design [Lieberman, 2001]. Given enough time, the
agent becomes good at predicting which button the
user is likely to choose next.
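The idea of predicting the next button from browsing
history can be sketched as a simple frequency count.
This is only an illustrative baseline, not the
Q-learning approach used in this paper; the class name
HistoryPredictor and the state and button labels are
hypothetical.

```python
from collections import Counter, defaultdict


class HistoryPredictor:
    """Predict the next button from how often it followed a given state.

    Hypothetical baseline for illustration only.
    """

    def __init__(self):
        # state -> Counter of buttons pressed while in that state
        self.counts = defaultdict(Counter)

    def record(self, state, button):
        """Store one observed interaction from the browsing history."""
        self.counts[state][button] += 1

    def predict(self, state):
        """Return the most frequent button for this state, or None if unseen."""
        seen = self.counts.get(state)
        if not seen:
            return None
        return seen.most_common(1)[0][0]
```

A state with no recorded history yields None, which
corresponds to the case in which the agent has nothing
to base a recommendation on.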
This paper also considers further questions: What
if there is little or no history information, or the
services are changing? How does the agent deal with
this kind of situation? How should it deal with
recommendations that the user might not be
interested in? Users might have many interests, and
these change over time. Also, users have a rich
browsing history over potentially many months that
can be exploited to better understand their true
interests. The agent finds functions of a service of
interest that the user might overlook because of the
complexity of the service.
The designed agent runs simultaneously with the
mobile digital television services and automates
interactions between the user and the mobile device.
Over time, it learns a pattern of the user’s
interests by recording and analyzing the user’s
browsing activity in real time, and provides a
continuous stream of recommendations.
When the user is actively browsing and attention
is focused on the current service or function, the
user need not pay any attention to the agent.
However, if the user is unsure where to go next, or
dissatisfied with the current offerings, he or she can
glance over to the recommendation window to look
at the agent’s suggestions.
 
 
 
 
 
 
AUTOMATIC NAVIGATION AMONG MOBILE DTV SERVICES