Seadragon 2 - User Guide

1. Introduction

This document is the user guide for the Leafy Seadragon software from http://c2h.sourceforge.net/.

This open source software is designed to support interactive, two-way acoustic communication research with dolphins and eventually larger cetaceans. It is intended as a tool to help determine the characteristics of the acoustic communication abilities of cetaceans in a scientific manner.

In order to get started with Seadragon and experiment with listening to and emitting underwater whistles with dolphins, all you need to do is:

  1. Download and install Seadragon (see Section 2 below). Most current laptops are adequate.
  2. Design your own whistles (Section 3) and try them in air by using headphones and microphone.
    • Do not wear the headphones for these tests.
    • Sample whistles are included.
  3. Use your whistles at sea with dolphins.
    • Replace the headphones and microphone with 2 transducers (one of which must be designed to emit and is sometimes called a projecting transducer).
    • You may need to insert a small battery powered amplifier between the computer and the emitting transducer (for example, an Altec Lansing model for iPods).
    • Make sure that you use safe sound pressure levels (Section 2.3).
    • Make sure that you know and respect the laws and regulations that apply to your nationality and your location. For example, it is possible that American citizens require a permit from the National Marine Fisheries Service (NMFS), NOAA, in order to use Seadragon in any waters, and such a permit would be required for anyone in US waters.
  4. Interpret the whistles emitted by dolphins in apparent relation with your emissions and reply to them (Section 2.5 will help).
  5. When you close the application, the acquired signals are written in a text file (xml) and you can use these in subsequent sessions (Section 4). All emissions and acquisitions are also written in a session report file (Section 5).
Exchange your data with other users. Support the replication of your discoveries by others. Publish your work.

2. Install and Run

2.1. Install

Summary: install Seadragon, install Java, and run one of the batch files (under Windows) in seadragon2\run\standalone.


To install Seadragon, download and unzip file seadragon2_build_20060401a.zip from http://c2h.sourceforge.net/ (follow the *Downloads* links).

Seadragon requires that you have previously installed Java 5 on your computer.

To install Java 5, also called Java SE 5 runtime environment (and also called JRE 1.5, for historical reasons), on Microsoft Windows, one easy way is to get it from http://java.com/ and use the installation wizard from this site.

To install the Java 5 JRE for Solaris or Linux, go to http://java.sun.com/.

As of April 2006, Mac users still needed to consult Apple about upgrading their system to Java 5. As of approximately June 2006, Java 5 is the default version for Mac OS X 10.4, but it still has to be installed by the user because Macintoshes do not ship with it. Installing Java 5 and Seadragon 2 on a Mac may require some help from Apple tech support. Once Java 5 is installed, unzip the Seadragon 2 download file from c2h and modify the run.bat file to launch the application (replace ; with : and make the file executable).

2.2. To run Seadragon under Windows:

Use either file run10.bat or file run40.bat from folder seadragon2\run\standalone.

File run10.bat is for using a frequency sampling rate of 10 frequency samples per second and file run40.bat is for using 40 frequency samples per second. In both cases the voltage sampling rate is normally at 48,000 voltage samples per second, and 1024 voltage samples are required to calculate one frequency sample, 10 or 40 times per second.
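The relationship between the two kinds of sampling rate can be checked with a little arithmetic. The sketch below uses only the figures quoted above (48,000 voltage samples per second, 1024 voltage samples per frequency sample). The class and method names are made up for illustration and are not part of Seadragon, and the coverage figure assumes that the blocks of voltage samples do not overlap, which this guide does not state.

```java
class SamplingRates {
    static final int VOLT_SAMPLES_PER_SEC = 48_000;
    static final int VOLT_SAMPLES_PER_FREQ_SAMPLE = 1024;

    /** Milliseconds of audio behind one frequency sample (1024 voltage samples). */
    static double blockMillis() {
        return 1000.0 * VOLT_SAMPLES_PER_FREQ_SAMPLE / VOLT_SAMPLES_PER_SEC;
    }

    /**
     * Fraction of the incoming audio that is analysed at a given frequency
     * sampling rate, ASSUMING the 1024-sample blocks do not overlap.
     */
    static double fractionOfAudioUsed(int freqSamplesPerSec) {
        return (double) freqSamplesPerSec * VOLT_SAMPLES_PER_FREQ_SAMPLE
                / VOLT_SAMPLES_PER_SEC;
    }

    public static void main(String[] args) {
        System.out.println("one block lasts (ms): " + blockMillis());
        System.out.println("coverage at 10/s: " + fractionOfAudioUsed(10));
        System.out.println("coverage at 40/s: " + fractionOfAudioUsed(40));
    }
}
```

Each frequency sample therefore summarizes about 21 ms of sound. At 40 frequency samples per second most of the incoming audio is analysed, while at 10 per second only about a fifth of it is, which is why the lower rate is lighter on the processor.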

It is often best to use the rate of 40 frequency samples per second but your PC may not be fast enough, and in this case, you should use 10 frequency samples per second. So if you are using a slower PC, e.g., less than 1 GHz processor, then use the run10.bat file.

Seadragon is very demanding in processor time (CPU intensive), therefore it is recommended to run Seadragon by itself, i.e., ensure that no other application is running at the same time as Seadragon. You may also set the -Xmx and -Xms switches in the java command line in the batch file to approx. 80% of your RAM for optimized performance.

Once you have installed Java 5, then execute one of these two batch files, for example, by double clicking on the file icon.

If you are running at 10 frequency samples per second and would like to run at 40 frequency samples per second, or vice versa, then you must stop the application and re-launch it with the other batch file.

For the Macintosh, Solaris, and Linux, the runXX.bat files can be edited by changing the "\" slashes and ";" characters to their Apple (and Uni*) equivalent, "/" and ":".
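The character substitution described above is mechanical. As a rough illustration, the sketch below applies it to a java command line. The command shown in `main` is a hypothetical example, not the actual content of the Seadragon batch files, and the class name is invented.

```java
class BatToUnix {
    /** Replaces Windows path and classpath separators with their Unix equivalents. */
    static String toUnix(String windowsCommand) {
        return windowsCommand.replace('\\', '/').replace(';', ':');
    }

    public static void main(String[] args) {
        // Hypothetical command line, for illustration only.
        String windows = "java -cp lib\\a.jar;lib\\b.jar demo.Main";
        System.out.println(toUnix(windows));
    }
}
```

After saving the converted file, it still has to be made executable on the target system.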

2.3. Sound Pressure Level (SPL)

Attention: Seadragon can emit loud sounds (at your command), so if you are using headphones for testing in air, adjust the volume to a safe level prior to putting them on. Failure to do so may result in hearing damage.

Seadragon software does not have its own controls to amplify or reduce the sound pressure level being emitted by the projector transducer (output hydrophone). You control the emitted sound volume levels using the PC controls. If you are using an optional amplifier between the PC and the projector transducer or speakers or headphones, then you can also use the controls on the amplifier, if any.

Dangerous Underwater SPL = 146 dB

US Navy divers are not allowed to be exposed to underwater sound pressure above 146 dB (referenced to 1 microPascal).

A Blue Whale can produce sounds at up to 180 dB (re 1 µPa).

Military high power sonars can produce bursts above 200 dB and these are considered very dangerous for mammals, including humans, cetaceans, and other species.
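Because decibels are logarithmic, the gap between these figures is larger than it looks. The sketch below uses the standard definition of sound pressure level, SPL = 20 log10(p / p_ref), with p_ref = 1 microPascal as in the figures above. The class and method names are illustrative only.

```java
class SoundPressure {
    /** Reference pressure for underwater SPL, in microPascal. */
    static final double REF_MICROPASCAL = 1.0;

    /** Pressure in microPascal for a level in dB re 1 microPascal. */
    static double pressureMicroPascal(double db) {
        return REF_MICROPASCAL * Math.pow(10.0, db / 20.0);
    }

    /** Level in dB re 1 microPascal for a pressure in microPascal. */
    static double levelDb(double pressureMicroPascal) {
        return 20.0 * Math.log10(pressureMicroPascal / REF_MICROPASCAL);
    }

    /** How many times larger the pressure at dbA is than the pressure at dbB. */
    static double pressureRatio(double dbA, double dbB) {
        return Math.pow(10.0, (dbA - dbB) / 20.0);
    }

    public static void main(String[] args) {
        System.out.println("180 dB vs 146 dB pressure ratio: " + pressureRatio(180, 146));
    }
}
```

The 34 dB difference between 146 dB and 180 dB corresponds to a sound pressure roughly 50 times higher, and every additional 20 dB multiplies the pressure by 10.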

In the current version, the projector transducer (output hydrophone) is connected to the headphone jack of a PC or to an amplifier connected to the headphone jack. An amplifier may be required in order to communicate with dolphins at sea. For underwater emissions, the headphone jack of most PCs is assumed to produce a low and safe sound level when used without an amplifier. This assumption holds only underwater: in air, even a PC without an amplifier may produce sound levels that can be damaging. The user is responsible for monitoring the sound pressure level being emitted, either underwater or in air.

The Spectrogram & SPL Display:

Seadragon now supports the monitoring of sound pressure levels (SPL). This function must be calibrated by the user (human). See SPL Calibration below.

To see the Spectrogram and SPL Measurements:

  1. select the Controls tab,

  2. select the checkbox for Water or unselect for Air,

  3. enter the SPL calibration value for your equipment, for water or air (see below),

  4. select the Spectrogram & SPL Enabled checkbox,

  5. select the Spectrogram tab.

For better performance, the spectrogram is not updated when it is not visible.

To stop the Spectrogram & SPL function:

  1. select the Controls tab,

  2. deselect the Spectrogram & SPL Enabled checkbox.

SPL Calibration: The SPL calibration values for water and air can be set by running the Spectrogram & SPL function in a quiet environment. The value to use in the calibration controls should be the negative of the minimum SPL measured in the quiet environment. For example, if the minimum SPL shown by Seadragon in a very quiet environment is 105 and you determine that the displayed level should be 0 dB for this environment, then enter -105 in the appropriate calibration control field (one for air, the other for water). You may need to use a pre-calibrated SPL meter instrument in order to determine the appropriate level in your test environment.
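The calibration arithmetic can be written out as follows. This is a minimal sketch that assumes the calibration value is simply added to the uncalibrated reading, which is consistent with the example above but is not confirmed by this guide. The names are invented for illustration.

```java
class SplCalibration {
    /**
     * Value to enter in the calibration field so that a quiet-environment
     * reading of rawMinimum is displayed as trueLevel.
     */
    static double calibrationValue(double rawMinimum, double trueLevel) {
        return trueLevel - rawMinimum;
    }

    /** Displayed level, ASSUMING the calibration value is added to the raw reading. */
    static double calibrated(double rawReading, double calibrationValue) {
        return rawReading + calibrationValue;
    }

    public static void main(String[] args) {
        double cal = calibrationValue(105, 0);   // the example from the text
        System.out.println("enter: " + cal);
        System.out.println("a raw 130 is shown as: " + calibrated(130, cal));
    }
}
```

If a pre-calibrated SPL meter shows that the quiet environment is not at 0 dB, pass the meter's reading as `trueLevel` instead of 0.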

You'll notice that you can resize the Seadragon window and the contents will resize automatically. The spectrogram window should be resized with the spectrogram function turned off in order to avoid possible spurious graphic errors.

2.4. Turn on the FILTER SELF control: Stop Seadragon from listening to itself

After starting the application, it is recommended to disable the Self-Filtering function and emit a few whistles (e.g., s1, s2) to verify that the system is working properly: with the function disabled, Seadragon listens to what it is emitting and you can see what it is recognizing. For normal operation, you should enable the Self-Filtering function, which stops Seadragon from displaying the whistles that it emits. You enable and disable the Self-Filtering function by going to the Controls tab and selecting or de-selecting the Self-Filtering check box. When selected, Seadragon filters itself and does not display the signals that it emits.

The Spectrogram display can be used to observe more details about the emitted and incoming sounds. The whistles emitted by Seadragon are displayed in the Spectrogram window even when the Self-Filtering function is enabled.

2.5. The syntax of signal names used in the msg window:

((( s2 ))) = Seadragon recognized an incoming signal as matching a signal named s2, the ((())) characters are used to mean recognized; signal s2 may be either man-made or cetacean-made; the human user can define her own naming technique to distinguish between the two categories, e.g., in the downloaded version all signals are man-made and they all start with s.

((( *19 ))) = this is a signal that matches the previously acquired unknown signal # 19 in this session; the asterisk * means that this unknown signal is new (i.e., its first appearance is in the current session).

((( 200503262213_19 ))) = this is a signal that matches the unknown signal # 19 from session 200503262213 (March 26, 2005, 10:13 PM).

*9~3L67%*7 = unrecognized signal number 9; 3L means that it has 3 frequency values (L=length); the best matching score is 67% with the signal named *7, which is unrecognized # 7 in the current session.

*5~13L31%s10.5 = unrecognized signal number 5; it has 13 samples; its best score is 31% with man-made signal s10.5.
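If you post-process session reports, the two message shapes described above can be told apart with regular expressions. The sketch below is based only on the five examples in this section. The class name and the wording of the descriptions are invented, and real reports may contain variations not covered here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MsgSyntax {
    // ((( name ))) : an incoming signal recognized as "name"
    private static final Pattern RECOGNIZED =
            Pattern.compile("^\\(\\(\\(\\s*(\\S+)\\s*\\)\\)\\)$");
    // *N~LENLSCORE%NAME : unrecognized signal N, its length, best score and best match
    private static final Pattern UNRECOGNIZED =
            Pattern.compile("^\\*(\\d+)~(\\d+)L(\\d+)%(\\S+)$");

    /** Returns a plain-English description of one signal name from the msg window. */
    static String describe(String msg) {
        String s = msg.trim();
        Matcher m = RECOGNIZED.matcher(s);
        if (m.matches()) {
            return "recognized as " + m.group(1);
        }
        m = UNRECOGNIZED.matcher(s);
        if (m.matches()) {
            return "unrecognized #" + m.group(1) + ", " + m.group(2)
                    + " frequency values, best score " + m.group(3)
                    + "% with " + m.group(4);
        }
        return "not a signal name";
    }

    public static void main(String[] args) {
        System.out.println(describe("((( s2 )))"));
        System.out.println(describe("*9~3L67%*7"));
        System.out.println(describe("*5~13L31%s10.5"));
    }
}
```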

3. Define a New Whistle - Manually

To create a whistle manually, write a set of xml elements in the whistle file to be read by the application the next time that it is launched. This file is: seadragon2\run\standalone\lib\signals\signals_to_read.xml.

An example of a set of xml elements defining a whistle:

<object class="org.leafyseadragon.j2se.signal.StoredSignal">
<void property="hz10ps">
<array class="java.lang.Double" length="5">
<void index="0">
<void index="1">
<void index="2">
<void index="3">
<void index="4">
<void property="signalType">
<void property="text">
<void property="uid">

To save time, it is common to copy and modify an existing set of xml elements.
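The element layout shown above (object, void property, array, void index) resembles the output of Java's standard java.beans.XMLEncoder. Assuming that is how these files are produced, which this guide does not state, the sketch below shows what a complete, well-formed element set of that general shape looks like. DemoSignal is a hypothetical stand-in bean, not the real org.leafyseadragon.j2se.signal.StoredSignal class, and the frequency values and name are arbitrary.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class WhistleXmlSketch {

    /** Hypothetical bean with two of the properties named in the example above. */
    public static class DemoSignal {
        private Double[] hz10ps;
        private String text;

        public DemoSignal() { }

        public Double[] getHz10ps() { return hz10ps; }
        public void setHz10ps(Double[] hz10ps) { this.hz10ps = hz10ps; }
        public String getText() { return text; }
        public void setText(String text) { this.text = text; }
    }

    /** Builds a demo whistle with arbitrary values. */
    static DemoSignal demo() {
        DemoSignal s = new DemoSignal();
        s.setHz10ps(new Double[] {8000.0, 8500.0, 9000.0, 8500.0, 8000.0});
        s.setText("s21");
        return s;
    }

    /** Serializes the bean with the JDK's XMLEncoder and returns the XML text. */
    static String toXml(DemoSignal signal) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(out)) {
            encoder.writeObject(signal);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(toXml(demo()));
    }
}
```

Running it prints one object element whose void elements each wrap a value element and are properly closed. Compare its output with an entry in your own signals_to_read.xml before relying on it, since the real files may differ.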

4. Use a Whistle Previously Acquired By Seadragon

You can easily use a whistle that was acquired by Seadragon during a previous session. This is basically a copy and paste operation on text. At the end of each session, i.e., when the user closes the application, Seadragon writes a text file containing the whistles it has in memory, including whistles it acquired during the session. Instances of this file, e.g., signals_saved_1118369700921.xml, are located in seadragon2\run\standalone\results\signals and the 1118369700921 number is a timestamp in milliseconds used to make the filename unique.
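To find out when a given signals file was written, the number in its name can be converted to a date. The sketch below assumes the timestamp counts milliseconds since January 1, 1970 UTC, as returned by Java's System.currentTimeMillis(). The guide only says it is a timestamp in milliseconds. The class name is illustrative.

```java
import java.time.Instant;

class SavedSignalsName {
    /** Extracts the number between the last '_' and the extension and reads it as epoch milliseconds. */
    static Instant timestampOf(String fileName) {
        int start = fileName.lastIndexOf('_') + 1;
        int end = fileName.lastIndexOf('.');
        long millis = Long.parseLong(fileName.substring(start, end));
        return Instant.ofEpochMilli(millis);
    }

    public static void main(String[] args) {
        System.out.println(timestampOf("signals_saved_1118369700921.xml"));
    }
}
```

Under that assumption, the example file name above corresponds to June 10, 2005, 02:15 UTC.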

This type of file is a text file with xml formatting, and you can use it to manually select one or more whistles (aka signals) and include them in the whistle file to be read by the application the next time that it is launched. The file read at startup is seadragon2\run\standalone\lib\signals\signals_to_read.xml. So essentially you copy a whistle from the signals_saved file to the signals_to_read file.

The whistles could be from your own communication sessions or from the sessions of someone else. A whistle written in xml is defined by the lines starting from this line:

<object class="org.leafyseadragon.j2se.signal.StoredSignal">

all the way to the next line containing this tag: </object>

When you select a whistle to be copied and pasted, you should change the name of the whistle by changing the content of the element whose property is called "text", such as in:

<void property="text">

and change it to this:

<void property="text">

So now Seadragon will write s21 in the Messages window (Emitted and Acquired) when it acquires the signal and you type s21 to emit this whistle. You may wish to classify the copied signal as a LEX_SIGNAL by including this element in its xml:

<void property="signalType">

You may also wish to edit the frequency values of the whistle to suit your experimental design. The frequency values for rate 40 per second are written in the element with property="hz40ps", and the frequency values for rate 10 per second are in element with property="hz10ps". You can either edit both sets of frequency samples in a consistent manner or delete one set and edit the other set, and Seadragon will calculate the other set if needed. You may also remove the xml tags from the acquired whistle that are not needed for the whistle to be read. These unneeded tags are:

<void property="creationMillis">
<void property="histInitialHzSamplingPerSec">
<void property="histVoltSamplingPerSec">
<void property="score">
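If you choose to edit both sets of frequency samples by hand, they need to describe the same contour. One simple way to derive a 10-per-second set from a 40-per-second set is to keep every fourth value, as sketched below. This is only an illustration for manual editing: how Seadragon itself calculates a missing set is not described in this guide, and the class and method names are invented.

```java
class RateConversion {
    /** Keeps every 4th value of a 40-per-second series, starting with the first. */
    static double[] every4th(double[] hz40ps) {
        int n = (hz40ps.length + 3) / 4;   // number of values at indices 0, 4, 8, ...
        double[] hz10ps = new double[n];
        for (int i = 0; i < n; i++) {
            hz10ps[i] = hz40ps[i * 4];
        }
        return hz10ps;
    }

    public static void main(String[] args) {
        double[] hz40 = {8000, 8100, 8200, 8300, 8400, 8500, 8600, 8700, 8800};
        for (double hz : every4th(hz40)) {
            System.out.println(hz);
        }
    }
}
```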

5. Files to Document Your Communication Sessions

Seadragon writes 3 types of xml files that are important to document your communication session, particularly the Session Report file, which contains the messages of the session, and the Signals files, which contain entire signal data sets. These files are written in different folders in folder seadragon2\run\standalone\results. For example, Report files are written in folder reports in folder seadragon2\run\standalone\results.

Session Report, Properties, and Signals files can easily be shared with other researchers as they are text files with xml tags. Log files are not meant to be shared but they can be.

Properties files are not fully implemented in the current version of the software.

6. Files Housekeeping

For some types of files, Seadragon writes a new file at the end of each session in the corresponding folder, and these must be cleaned up once in a while so that your hard disk does not get full. These files are in the folders in seadragon2\run\standalone\results.

Log files, in folder results\logs, do not need to be cleaned up because Seadragon does the housekeeping automatically for this type of file. To do this, Seadragon keeps the total size of all log files below a preset value by deleting the oldest file when the maximum space is reached. You may copy any of these log files to another location if you wish to keep them permanently.
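The same size-capped policy can be applied by hand or by script to the other results folders. The sketch below only decides which files would be deleted, oldest first, until the total size fits under a limit. It is not Seadragon's code, and the LogFile type, the names, and the limit are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LogHousekeeping {

    /** Hypothetical description of one file: name, size, and last-modified time. */
    static class LogFile {
        final String name;
        final long sizeBytes;
        final long lastModifiedMillis;

        LogFile(String name, long sizeBytes, long lastModifiedMillis) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.lastModifiedMillis = lastModifiedMillis;
        }
    }

    /** Returns the names of the oldest files to delete so the total size is at most maxTotalBytes. */
    static List<String> filesToDelete(List<LogFile> files, long maxTotalBytes) {
        List<LogFile> oldestFirst = new ArrayList<>(files);
        oldestFirst.sort(Comparator.comparingLong((LogFile f) -> f.lastModifiedMillis));

        long total = 0;
        for (LogFile f : files) {
            total += f.sizeBytes;
        }

        List<String> toDelete = new ArrayList<>();
        for (LogFile f : oldestFirst) {
            if (total <= maxTotalBytes) {
                break;               // already within the limit
            }
            toDelete.add(f.name);
            total -= f.sizeBytes;
        }
        return toDelete;
    }
}
```

For example, with three 100-byte files and a 150-byte limit, the two oldest files are selected and the newest is kept.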

7. Main Features:

  • Live spectrogram display - since v. 2.

  • Live Sound Pressure Level measurement - since v. 2.

  • Predefined whistles in editable text file (xml) - since 1.0

  • Session whistles stored in editable Signals text file (xml) - since 1.0

  • Session Report text file (xml) written during each communication session, contains all messages

  • Improved displayed whistles names - since 1.0

  • Single machine (standalone) or multiple machines configurations (only the backbone configuration is supported in the current version)

  • Standalone configuration successfully tested under Windows XP laptops and desktops (including a laptop with AMD Athlon 64, 512 MB); the current release package is configured for standalone operation (on a single computer).

  • Entirely written in Java - requires Java SE 5 (which is free from Sun Microsystems for Windows, Unix, and Linux systems and from Apple for Macintosh computers). Java SE 5 Runtime Environment (JRE) for Windows can be installed from http://java.com.

  • Uses common built-in audio interface (e.g., Windows Direct Audio, Microsoft Advance AC97 Audio); normally no need for additional audio hardware.

  • Input hydrophone in microphone jack, output hydrophone in headphone jack (you purchase your hydrophones from a third party, not from us; we don't sell anything)

  • Maximum effective whistle frequency: 11 kHz (could be increased with special audio hardware in future version)

  • Adjustable minimum whistle frequency, e.g., 400 Hz or 1 kHz. Signals at a lower frequency than this are considered noise and filtered out.

  • 48,000 voltage samples per second (fixed in this version)

  • Choice of two frequency sampling rates: 10 and 40 frequency samples per second (fsps) - new since 1.0. The 10 fsps rate is for slower PCs.

  • Filters short signals as background noise: less than 2/10 second long (adjustable) - since 1.0 (0.9.5)

  • Optionally filters whistles that it emitted so that emissions are not echoed in the display window - since 1.0

  • Extensive auto diagnostics

  • EMAIL SUPPORT: sergemasse1 a-t yahoo dot com

  • Apple Macintosh: This version probably works with Apple's current release of its Java Runtime and that requires Mac OS X 10.4. Seadragon has not been tested with it yet. The Seadragon runtime package may need to be adapted to fit in Apple's system.

  • License: CPL - Common Public License - commercial use is allowed without fee.

8. Resources

An excellent forum on bioacoustics can be joined using the instructions from http://cetus.pmel.noaa.gov/AB/ABbioList.html

Java Runtime Environment for Windows (free): http://java.com

C2h project for Seadragon: http://c2h.sourceforge.net/

A manufacturer of an emitting hydrophone, the H1 model (not free, but not expensive): http://www.aquarianaudio.com/

The links on the right side of this web page:

  • Dolphin diary - The interaction of an Australian with wild dolphins.

  • ORCINUS ORCA COLLECTIVE - A blog mostly on North American West Coast Orcas.

  • Cetaceans Log


Seadragon design is supported by recent research

Extract from http://www.sarasotadolphin.org/Social/WhatVoice.asp

*An earlier set of playback experiments (2003-4) showed that dolphins are capable of recognizing synthetic signature whistle contours, suggesting that contour is the most important feature of the whistle for individual recognition.*

By Laela Sayigh, PhD, and Vincent Janik, PhD
University of North Carolina at Wilmington and University of St. Andrews

Also this abstract from 2000: http://www.sciencemag.org/cgi/content/abstract/289/5483/1355

Whistle Matching in Wild Bottlenose Dolphins (Tursiops truncatus)

Vincent M. Janik

Dolphin communication is suspected to be complex, on the basis of their call repertoires, cognitive abilities, and ability to modify signals through vocal learning. Because of the difficulties involved in observing and recording individual cetaceans, very little is known about how they use their calls. This report shows that wild, unrestrained bottlenose dolphins use their learned whistles in matching interactions, in which an individual responds to a whistle of a conspecific by emitting the same whistle type. Vocal matching occurred over distances of up to 580 meters and is indicative of animals addressing each other individually.

School of Biology, University of St. Andrews, Bute Building, Fife KY16 9TS, UK, and Lighthouse Field Station, Aberdeen University, Cromarty, Ross-shire IV11 8YJ, UK. Present address: Woods Hole Oceanographic Institution, Biology Department, Woods Hole, MA 02543, USA.




Processing dolphin whistles in Seadragon v. 1.0

Here is a summary of the process and data flows from cetacean sound to human interface, i.e., the c-to-h flow (c2h), and some references to the h-to-c flow (h2c), in Seadragon; c2h involves underwater signal acquisition and recognition, while h2c involves human input and emission of signals underwater.

There are 3 major nodes in the backbone subsystem in Seadragon: c, c2h, h. Each node can be deployed on a single host (e.g., a PC) or on the same host as another node. In version 1.0, these 3 nodes are deployed on a single host. The nodes exchange data using text (UTF8) containing xml tags. When the nodes are deployed on different hosts, then the data exchange takes place over TCP/IP sockets, and when two nodes are on the same host then the data exchange takes place within Java objects and between different threads, not involving sockets.

The 3 backbone nodes design, chosen a few years ago, allows us to easily have an underwater system composed of two or three hosts, for example one host for the handheld human interface (hosting the h node) and another device hosting the c2h node and the c node, as described in another post on this blog. Such a system is feasible with off-the-shelf parts today (or shortly, with some testing and debugging). The Seadragon software is configured to run on multiple hosts by using properties in files that it reads at startup. For the proposed underwater system, the same software would be used for an h node on its own host and for the c and c2h nodes on another host; the two installations just use different properties at startup. This feature, among others, is given by the generic Leafy API. This multi-host design will also allow us to easily use the more powerful processors that will be required when we process signals that are more complex than tonal whistles, such as the complex signals used by larger cetaceans (and other species such as Elephants), and also the mixtures of whistles and clicks used by smaller cetaceans. This design will also be useful when we add complex multi-signal structure recognition (e.g., real-time grammatical theories analysis) on top of single signal recognition (e.g., converting a whistle to a textual identifier).

The c2h flow between backbone nodes is: c to c2h to h.

The h2c flow is: h to c2h to c.

The flow between backbone nodes is the same whether the nodes are hosted on different hosts or on the same host. Seadragon also supports nodes other than backbone nodes, and these are for peer-to-peer networks, either human peer-to-peer or cetacean peer-to-peer networks. The cetacean p2p has not been fully implemented but a human-side p2p has been tested, including hosts which are cell phones.

The c node is in charge of the cetacean interface: it emits underwater sounds to cetaceans and it acquires underwater sounds. In the h2c flow, it receives data from c2h (in text form), converts it to a voltage level representation (numbers) and then to an actual (analog) voltage, which goes to a hydrophone (e.g., a piezo-electric crystal) that converts the voltages to vibrations (sound). In the c2h flow, it acquires sounds as voltage levels (numbers); using FFT, it converts 1024 voltage numbers to a single frequency value (Hz or cycles per second) and sends the data (a single frequency value) to the c2h node for processing, i.e., an attempt at recognition.
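To make the step from 1024 voltage numbers to a single frequency value concrete, here is a minimal sketch. It uses a plain discrete Fourier transform, which gives the same spectrum as an FFT but more slowly, and it assumes that the single value is the frequency of the strongest spectral bin. That choice, like the class and method names, is an assumption for illustration and is not taken from the Seadragon source.

```java
class PeakFrequency {
    /** Frequency (Hz) of the strongest bin in a block of voltage samples. */
    static double dominantHz(double[] samples, double sampleRate) {
        int n = samples.length;
        int bestBin = 0;
        double bestPower = -1.0;
        // Only bins below the Nyquist frequency are meaningful for real input.
        for (int k = 1; k < n / 2; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            double power = re * re + im * im;
            if (power > bestPower) {
                bestPower = power;
                bestBin = k;
            }
        }
        return bestBin * sampleRate / n;   // each bin is sampleRate / n wide
    }

    /** Generates n samples of a pure tone, used here only as test input. */
    static double[] sine(double hz, double sampleRate, int n) {
        double[] out = new double[n];
        for (int t = 0; t < n; t++) {
            out[t] = Math.sin(2.0 * Math.PI * hz * t / sampleRate);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] block = sine(3000.0, 48000.0, 1024);
        System.out.println(dominantHz(block, 48000.0));
    }
}
```

With 1024 samples at 48,000 samples per second, each bin is 46.875 Hz wide, so this approach cannot resolve frequency more finely than that without interpolation.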

The c2h flow summary: sound --> c node: hydrophone --> voltage --> analog-to-digital --> voltage levels --> FFT --> frequency value --> send text (single frequency value) --> c2h node: assembly of frequency values into a series (i.e., a whistle) --> pattern matching --> signal object in lexicon (unrecognized acquired whistle) --> send text --> h node: writing the text to the human user in the msg window.

The other flow is h-to-c, human to cetacean (h2c), and it is similar to the reverse of the c2h flow but does not involve frequency pattern matching because the human user can only emit a whistle which is already present in the lexicon and the human user has to use the unique text name of the whistle. One could say that there is the simpler text name matching in this flow, but no frequency matching.

The most complex part, as far as code structure is concerned, is the *assembly of frequency values into a series (i.e., a whistle)* in the c2h flow. This involves, for example, the recognition of the start and the end of a whistle and the completion of the whistle, prior to comparing it with signals in the lexicon (a fancy name for Seadragon's whistles database). This process must be extremely efficient and it took me many months to fine tune it because the quantity of this data in real time is huge (10 and 40 per second now) and I would like to process even more, ideally maybe 100 frequency values per second, when off-the-shelf PCs are fast enough.
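A heavily simplified sketch of the assembly step is shown below. It assumes that a frequency value below the minimum whistle frequency means "no whistle", that a whistle is an unbroken run of values at or above that minimum, and that runs shorter than a minimum length are dropped as noise. These rules are guesses based on the feature list in this guide, and the real implementation is certainly more involved. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WhistleAssembler {
    /**
     * Splits a stream of frequency values into whistles.
     *
     * @param hz         frequency values in arrival order
     * @param minHz      values below this are treated as silence or noise
     * @param minSamples runs shorter than this are discarded
     */
    static List<double[]> segment(double[] hz, double minHz, int minSamples) {
        List<double[]> whistles = new ArrayList<>();
        int start = -1;                                   // -1 means "not inside a whistle"
        for (int i = 0; i <= hz.length; i++) {
            boolean tonal = i < hz.length && hz[i] >= minHz;
            if (tonal && start < 0) {
                start = i;                                // whistle starts here
            } else if (!tonal && start >= 0) {
                if (i - start >= minSamples) {            // whistle ended and is long enough
                    whistles.add(Arrays.copyOfRange(hz, start, i));
                }
                start = -1;
            }
        }
        return whistles;
    }
}
```

At 10 frequency values per second, a minimum length of 2 values corresponds to the 2/10 second filter mentioned in the feature list; at 40 per second the same duration is 8 values.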

An overview of the pattern matching (aka signal recognition): It is performed in the c2h node. Once an incoming whistle's start and end have been determined and we have all the intermediate frequency values (not trivial), the frequencies of the incoming whistle are compared with the frequencies of the signals present in the lexicon. A score is tallied for each comparison. If the score is outside the acceptable limit, then this match is abandoned and the incoming whistle is compared with another whistle in the lexicon. Out of this process, either a best match or no match is obtained. If we have a match, then the name of the matching whistle from the lexicon is sent to the h node and displayed in the *msg* window. If there is no match, then the incoming whistle is given a unique name (by the system), the whistle is added to the lexicon, and the name is sent to the h node and displayed.
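The control flow of this overview can be sketched as follows. The scoring rule used here (the percentage of frequency values that lie within a fixed tolerance of the lexicon entry) and the two thresholds are placeholders invented for illustration, since the actual scoring has not been published. The class and method names are hypothetical too.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LexiconMatcher {
    static final double TOLERANCE_HZ = 100.0;   // placeholder value
    static final int MIN_ACCEPTABLE_SCORE = 80; // placeholder value, in percent

    private final Map<String, double[]> lexicon = new LinkedHashMap<>();
    private int unknownCount = 0;

    void add(String name, double[] hz) {
        lexicon.put(name, hz.clone());
    }

    /** Placeholder score: percent of positions where the two whistles are close in frequency. */
    static int score(double[] a, double[] b) {
        int shorter = Math.min(a.length, b.length);
        int longer = Math.max(a.length, b.length);
        if (longer == 0) {
            return 0;
        }
        int hits = 0;
        for (int i = 0; i < shorter; i++) {
            if (Math.abs(a[i] - b[i]) <= TOLERANCE_HZ) {
                hits++;
            }
        }
        return 100 * hits / longer;
    }

    /** Returns the text to display: a recognized name, or a new name for an unknown whistle. */
    String recognize(double[] incoming) {
        String bestName = null;
        int bestScore = -1;
        for (Map.Entry<String, double[]> entry : lexicon.entrySet()) {
            int s = score(incoming, entry.getValue());
            if (s >= MIN_ACCEPTABLE_SCORE && s > bestScore) {
                bestScore = s;
                bestName = entry.getKey();
            }
        }
        if (bestName != null) {
            return "((( " + bestName + " )))";
        }
        String newName = "*" + (++unknownCount);   // unique name given by the system
        lexicon.put(newName, incoming.clone());    // unknown whistle joins the lexicon
        return newName;
    }
}
```

Because an unmatched whistle is added to the lexicon, a second occurrence of the same contour in the session is reported as recognized, which mirrors the ((( *19 ))) notation described in Section 2.5. The real display also includes the length and best score for unrecognized signals, which this sketch omits.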

A detailed description of the pattern matching process between two whistles will be published later.



DIY Surface Dolphin Communication System ~$2,500.00

Do-it-yourself: a laptop, 2 hydrophones, and Seadragon version 1.0 (available today for free).

The Seadragon software can be downloaded from http://c2h.sourceforge.net/.

Seadragon requires Java SE version 5 or above, available for free from http://java.com/ or from http://java.sun.com/ .

Seadragon has been tested under Windows. It may also work under Mac OSX with the valid version of Java from Apple (v. 5), or under Uni*, Solaris, Linux, given the valid Java version (v. 5).

I use 2 inexpensive and quality hydrophones from http://www.aquarianaudio.com/ - They have a male RCA connector that fits in the microphone and earphone jacks of the laptop (or amplifier). Model H1 for output and H2 for input. We are not affiliated with this or any other manufacturer.

Optional amplifier between the earphone jack and the output hydrophone. I use a very small battery-operated device from Altec Lansing, inMotion (built for iPods). Make sure that the output intensity is much less than 100 dB (re 1 microPascal) near the dolphins.

You get your own hardware. We do not sell anything.

Btw, the laptop should be running at 2 GHz minimum.