hydrophones include multipath effects (and reverberations), which create secondary peaks in the Cross-Correlation (CC) function that the generalized CC methods cannot eliminate.
2.1 Material
Table 1: Hydrophone positions. D = dataset, Dist = distance to barycenter (m).

D    Hydro   Dist (m)   X (m)    Y (m)     Z (m)
D1   H1      5428       18501     9494     -1687
D1   H2      4620       10447     4244     -1677
D1   H3      2514       14119     3034     -1627
D1   H4      1536       16179     6294     -1672
D1   H5      3126       12557     7471     -1670
D1   H6      4423       17691     1975     -1633
D2   H7      1518       10658   -14953     -1530
D2   H8      4314       12788   -11897     -1556
D2   H9      2632       14318   -16189     -1553
D2   H10     3619        8672   -18064     -1361
D2   H11     3186       12007   -19238     -1522
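As a quick consistency check on the Dist column, the following minimal sketch recomputes the distance of each D1 hydrophone to the barycenter of the D1 array. The function names are illustrative, and the sketch assumes that the barycenter is the unweighted mean of the hydrophone coordinates of one dataset.

```python
import math

# Hydrophone coordinates of dataset D1 (X, Y, Z in metres), copied from Table 1.
D1 = {
    "H1": (18501, 9494, -1687),
    "H2": (10447, 4244, -1677),
    "H3": (14119, 3034, -1627),
    "H4": (16179, 6294, -1672),
    "H5": (12557, 7471, -1670),
    "H6": (17691, 1975, -1633),
}


def barycenter(points):
    """Unweighted mean of a list of 3D points (assumed definition)."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def distances_to_barycenter(hydros):
    """Euclidean distance of every hydrophone to the array barycenter."""
    center = barycenter(list(hydros.values()))
    return {name: math.dist(pos, center) for name, pos in hydros.items()}


if __name__ == "__main__":
    for name, dist in distances_to_barycenter(D1).items():
        print(f"{name}: {dist:.1f} m")
```

Under this assumption the computed distances agree with the tabulated D1 values to within about one metre.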
The signals were recorded on the ocean floor near Andros Island, Bahamas (Tab. 1), in March 2002, and are provided with sound speed (celerity) profiles. The datasets are sampled at 48 kHz and contain MM clicks and whistles, together with background noise such as distant boat engines. Dataset 1 (D1) is a 20 min recording on hydrophones 1 to 6, while dataset 2 (D2) is a 25 min recording on hydrophones 7 to 11. We use either a constant sound speed $c = 1500\ \mathrm{m\,s^{-1}}$ or a linear profile $c(z) = c_0 + gz$, where $z$ is the depth, $c_0 = 1542\ \mathrm{m\,s^{-1}}$ is the sound speed at the surface and $g = 0.051\ \mathrm{s^{-1}}$ is the gradient. Sound source tracking is performed by continuous 3D localization using Time Delays Of Arrival (T) estimated from four hydrophones.
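The two sound speed models can be sketched as follows. This is only an illustration: the function name is hypothetical, and the example assumes that the depth of a hydrophone is the opposite of its Z coordinate in Table 1.

```python
C_CONST = 1500.0  # constant sound speed (m/s), from the text
C0 = 1542.0       # sound speed at the surface (m/s), from the text
G = 0.051         # gradient of the linear profile (1/s), from the text


def sound_speed(depth_m, linear=True):
    """Sound speed at a given depth (m, positive downwards).

    linear=True  -> c(z) = c0 + g * z
    linear=False -> constant profile
    """
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    return C0 + G * depth_m if linear else C_CONST


if __name__ == "__main__":
    # Hydrophone H1 has Z = -1687 m, taken here as a depth of 1687 m (assumption).
    print(sound_speed(1687.0))                # linear profile
    print(sound_speed(1687.0, linear=False))  # constant profile
```

With these values the linear profile gives roughly 1628 m/s at the depth of H1, noticeably above the constant value of 1500 m/s.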
2.2 Signal Filtering
A sperm whale click is a transient increase of signal energy lasting about 20 ms (Fig. 1-a). Therefore, we use the Teager-Kaiser (TK) energy operator on the discrete data:
$$\Psi[x(n)] = x^2(n) - x(n+1)\,x(n-1), \tag{1}$$
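A minimal sketch of Eq. (1) in pure Python is given below. The function name is illustrative, and the handling of the two boundary samples (simply dropped here) is an assumption, since the text does not specify it.

```python
import math


def teager_kaiser(x):
    """Teager-Kaiser energy Psi[x(n)] = x(n)^2 - x(n+1) * x(n-1).

    The output has len(x) - 2 samples: the first and last input samples
    have no two-sided neighbourhood and are dropped (assumption).
    """
    return [x[n] * x[n] - x[n + 1] * x[n - 1] for n in range(1, len(x) - 1)]


if __name__ == "__main__":
    # An isolated impulse is mapped to an isolated energy peak.
    print(teager_kaiser([0.0, 0.0, 1.0, 0.0, 0.0]))

    # For a pure tone A*cos(w*n), the operator is constant: A^2 * sin(w)^2.
    tone = [2.0 * math.cos(0.3 * n) for n in range(50)]
    print(teager_kaiser(tone)[:3], 4.0 * math.sin(0.3) ** 2)
```

Only three consecutive samples enter each output value, which makes the operator well suited to short transients such as clicks.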
where $n$ denotes the sample number. An important property of TK is that it is nearly instantaneous, since only three samples are required to compute the energy at each time instant. We model the raw signal $s(n)$ as
$$s(n) = x(n) + u(n),$$
where $x(n)$ is the signal of interest (clicks) and $u(n)$ is an additive noise, considered as a realization of a wide sense stationary (WSS) Gaussian process over a short time. Applying TK to $s(n)$ gives
$$\Psi[s(n)] \approx \Psi[x(n)] + w(n),$$
where $w(n)$ is a random Gaussian process (Kandia and Stylianou, 2006). The output is dominated by the click energy. We then reduce the sampling frequency to 480 Hz by averaging each block of 100 adjacent samples, which reduces both the variance of the noise and the data size. We apply Mallat's algorithm (Mallat, 1989) with the Daubechies wavelet of order 3, chosen for its close similarity to the shape of a decimated click (Giraudet and Glotin, 2006b; Giraudet and Glotin, 2006a). The signal is denoised
with a universal thresholding (Donoho, 1995) defined as $D(u_k, \lambda) = \mathrm{sgn}(u_k)\,\max(0, |u_k| - \lambda)$, with $u_k$ the wavelet coefficients, $\lambda = \sqrt{2\log_e(Q)}\;\sigma_N\,\tilde{\sigma}_N$, and $Q$ the length of the signal resolution level to denoise.
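The thresholding rule can be sketched as follows. The function names are illustrative, and the sketch reads the threshold as the product of the square root of 2 ln Q and the two standard deviations, which is an assumption about how the factors combine. The wavelet decomposition itself is not shown.

```python
import math


def universal_threshold(q, sigma_n, sigma_tilde):
    """lambda = sqrt(2 * ln(Q)) * sigma_N * sigma_tilde_N (assumed product form)."""
    return math.sqrt(2.0 * math.log(q)) * sigma_n * sigma_tilde


def soft_threshold(coeffs, lam):
    """D(u_k, lambda) = sgn(u_k) * max(0, |u_k| - lambda), applied to each coefficient."""
    out = []
    for u in coeffs:
        sign = (u > 0) - (u < 0)          # sgn(u): -1, 0 or +1
        out.append(sign * max(0.0, abs(u) - lam))
    return out


if __name__ == "__main__":
    level = [3.0, -2.5, 0.5, -0.2]        # toy wavelet coefficients of one level
    print(soft_threshold(level, 1.0))     # small coefficients are set to zero
    print(universal_threshold(len(level), 1.0, 1.0))
```

Coefficients whose magnitude is below the threshold are set to zero, and the remaining ones are shrunk towards zero by the threshold value.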
The noise standard deviation $\sigma_N$ is computed on the raw signal over each 10 s window with a maximum likelihood criterion. $\tilde{\sigma}_N$ is the standard deviation of the wavelet coefficients, at a given resolution level, of a generated zero-mean, unit-variance (reduced) Gaussian noise. This filtering step is very fast and has no free parameter. Fig. 1-d shows the filtered signal for multiple emitting MMs (Fig. 1-c).
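The decimation step of this section, from 48 kHz to 480 Hz by averaging 100 adjacent samples, can be sketched as below. The function name is illustrative, and dropping an incomplete trailing block is an assumption not stated in the text.

```python
def block_mean_decimate(x, factor=100):
    """Reduce the sampling rate by `factor` by averaging adjacent samples.

    Each output sample is the mean of `factor` consecutive input samples.
    A trailing block shorter than `factor` is dropped (assumption).
    """
    n_blocks = len(x) // factor
    return [
        sum(x[k * factor:(k + 1) * factor]) / factor
        for k in range(n_blocks)
    ]


if __name__ == "__main__":
    fs_in = 48000                      # input sampling frequency (Hz), from the text
    one_second = [0.0] * fs_in         # one second of TK output (toy data)
    decimated = block_mean_decimate(one_second)
    print(len(decimated))              # 480 samples per second
```

Averaging rather than plain subsampling is what reduces the noise variance: for uncorrelated noise, the mean of 100 samples has a variance 100 times smaller than that of a single sample.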
2.3 Rough $\tilde{T}$ Estimation
First, T estimates are based on MM click realignment only. Every 10 s, and for each pair of hydrophones $(i, j)$, the difference between the arrival times $t_i$ and $t_j$ of a click train on hydrophones $i$ and $j$ is denoted $T(i, j) = t_j - t_i$. Its estimate $\tilde{T}(i, j)$ is computed by CC of 10 s chunks (shifted by 2 s) of the filtered signals of hydrophones $i$ and $j$ (Giraudet and Glotin, 2006b; Giraudet and Glotin, 2006a). We keep the 35 highest peaks of each CC to determine the corresponding $\tilde{T}(i, j)$. The filtered signals give a very fast rough estimate of T (precision ± 2 ms). Fig. 1-e shows the CC computed on the raw signal and Fig. 1-f the CC computed on the filtered signal. Without filtering, CC generates spurious delay estimates and the tracks are not correct. In D1, the maximum ranks (Fig. 2) of the $\tilde{T}$ that yield the source localization are high among the 35 $\tilde{T}$ kept on each CC, which justifies this number.
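The peak selection can be illustrated with the sketch below, which cross-correlates two short filtered chunks and keeps the lags of the highest CC peaks. The function names, the brute-force CC, the local-maximum definition of a peak and the toy signals are all assumptions made for illustration. The sign convention follows $T(i, j) = t_j - t_i$, and lags are converted to seconds at the decimated rate of 480 Hz, where one sample lasts about 2.08 ms, the same order as the stated precision.

```python
FS = 480.0  # sampling frequency after decimation (Hz), from the text


def cross_correlation(a, b, max_lag):
    """CC(lag) = sum_n a[n] * b[n + lag] for lag in [-max_lag, max_lag].

    With a from hydrophone i and b from hydrophone j, a positive peak lag
    means the click train reaches j after i (T(i, j) = t_j - t_i > 0).
    """
    cc = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n, a_n in enumerate(a):
            m = n + lag
            if 0 <= m < len(b):
                total += a_n * b[m]
        cc[lag] = total
    return cc


def top_peaks(cc, k=35):
    """Lags of the k highest local maxima of the CC, best first."""
    lags = sorted(cc)
    peaks = [
        lags[i]
        for i in range(1, len(lags) - 1)
        if cc[lags[i]] > cc[lags[i - 1]] and cc[lags[i]] >= cc[lags[i + 1]]
    ]
    peaks.sort(key=lambda lag: cc[lag], reverse=True)
    return peaks[:k]


def lag_to_seconds(lag, fs=FS):
    """Convert a lag in samples to a delay in seconds."""
    return lag / fs


if __name__ == "__main__":
    # Toy chunks: two clicks on hydrophone i, the same clicks 3 samples later on j.
    a = [0.0] * 20
    b = [0.0] * 20
    a[5], a[12] = 1.0, 0.5
    b[8], b[15] = 1.0, 0.5
    cc = cross_correlation(a, b, max_lag=15)
    best = top_peaks(cc, k=3)
    print(best, lag_to_seconds(best[0]))
```

In this toy case the highest peak sits at the true delay, while the two lower peaks come from pairing different clicks of the train, which is one reason for keeping several candidate delays per CC rather than only the maximum.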
2.4 $\tilde{T}$ Selection and Localization with a Constant Profile
Each signal shows echoes for each click (Fig. 1-b), possibly due to the reflection of the click train off the
SIGMAP 2008 - International Conference on Signal Processing and Multimedia Applications