Working with Time-Based Media
Any data that changes meaningfully with respect to time can be characterized as time-based media.
Examples:
- Audio clips
- MIDI sequences
- Movie clips
- Animations
Such media data can be obtained from a variety of sources, such as:
- Local or network files
- Cameras
- Microphones

Streaming Media
- A key characteristic of time-based media is that it requires timely delivery and processing.
- Once the flow of media data begins, there are strict timing deadlines that must be met, both in terms of receiving and presenting the data.
- For this reason, time-based media is often referred to as streaming media -- it is delivered in a steady stream that must be received and processed within a particular timeframe to produce acceptable results.
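The timing constraint above can be sketched numerically: each frame of a stream has a presentation deadline derived from the frame rate, and a frame that arrives after its deadline is too late to be useful. All names here (`frame_deadlines`, `is_on_time`) are illustrative, not from any real media API.

```python
def frame_deadlines(start_time, frame_rate, n_frames):
    """Presentation deadline (in seconds) for each of n_frames frames."""
    period = 1.0 / frame_rate
    return [start_time + i * period for i in range(n_frames)]

def is_on_time(arrival_time, deadline):
    """A frame is usable only if it arrives before it must be presented."""
    return arrival_time <= deadline

# A 25 fps stream must present a new frame every 40 ms:
deadlines = frame_deadlines(start_time=0.0, frame_rate=25, n_frames=3)
```

A real player would additionally buffer a few frames ahead so that brief delivery hiccups do not immediately miss a deadline.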
Content Type:
The format in which the media data is stored is referred to as its content type (for example, QuickTime, MPEG, or WAV).
- Most time-based media is audio or video data that can be presented through output devices such as speakers and monitors. Such devices are the most common destination for media data output.
- Media streams can also be sent to other destinations -- for example, saved to a file or transmitted across the network.
- An output destination for media data is sometimes referred to as a data sink.
Presentation Control:
While a media stream is being presented, VCR-style presentation controls are often provided to enable the user to control playback.
For example, a control panel for a movie player might offer buttons for:
- stopping
- starting
- fast-forwarding
- rewinding
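The controls listed above can be sketched as a toy player object (not any real player API) that tracks a playhead and responds to each command:

```python
class ToyPlayer:
    """Illustrative player with VCR-style controls; all names are made up."""

    def __init__(self, duration):
        self.duration = duration   # media length in seconds
        self.position = 0.0        # current playhead position in seconds
        self.playing = False

    def start(self):
        self.playing = True

    def stop(self):
        self.playing = False

    def fast_forward(self, seconds):
        # Clamp so the playhead never runs past the end of the media.
        self.position = min(self.duration, self.position + seconds)

    def rewind(self, seconds):
        # Clamp so the playhead never runs before the start.
        self.position = max(0.0, self.position - seconds)

player = ToyPlayer(duration=120.0)
player.start()
player.fast_forward(30.0)
player.rewind(10.0)
# playhead is now at 20.0 seconds, still playing
```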
Latency:
In many cases, particularly when presenting a media stream that resides on the network, the presentation of the media stream cannot begin immediately. The time it takes before presentation can begin is referred to as the start latency.
Multimedia presentations often combine several types of time-based media into a synchronized presentation. For example:
- background music might be played during an image slide-show
- animated text might be synchronized with an audio or video clip
When the presentation of multiple media streams is synchronized, it is essential to take into account the start latency of each stream -- otherwise the playback of the different streams might actually begin at different times.
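One simple way to account for start latency, sketched below with an illustrative function name: delay each stream by the difference between the largest start latency and its own, so that all streams begin presenting at the same moment.

```python
def start_delays(latencies):
    """Extra delay (per stream, in seconds) so all streams start together."""
    worst = max(latencies)
    return [worst - lat for lat in latencies]

# Three streams with start latencies of 0.2 s, 1.5 s and 0.8 s:
delays = start_delays([0.2, 1.5, 0.8])
# The slowest stream starts immediately; the faster ones wait.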
Presentation Quality:
The quality of the presentation of a media stream depends on several factors, including:
- The compression scheme used
- The processing capability of the playback system
- The bandwidth available (for media streams acquired over the network)
Traditionally, the higher the quality, the larger the file size and the greater the processing power and bandwidth required.
Bandwidth is usually represented as the number of bits that are transmitted in a certain period of time -- the bit rate.
To achieve high-quality video presentations, the number of frames displayed in each period of time -- the frame rate -- should be as high as possible. Movies at a frame rate of 30 frames per second are usually considered indistinguishable from regular TV broadcasts or video tapes.
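A back-of-the-envelope calculation shows why bit rate, frame rate, and compression are linked: the raw (uncompressed) bit rate of a video stream is simply pixels-per-frame times bits-per-pixel times frames-per-second. The function name below is illustrative.

```python
def raw_video_bit_rate(width, height, bits_per_pixel, frame_rate):
    """Uncompressed video bit rate, in bits per second."""
    return width * height * bits_per_pixel * frame_rate

# 640x480 pixels, 24-bit color, 30 frames per second:
bps = raw_video_bit_rate(640, 480, 24, 30)
# roughly 221 Mbit/s uncompressed -- which is why compression matters
```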
Media Processing
In most instances, the data in a media stream is manipulated before it is presented to the user:
- If the stream is multiplexed, the individual tracks are extracted.
- If the individual tracks are compressed, they are decoded.
- If necessary, the tracks are converted to a different format.
- If desired, effect filters are applied to the decoded tracks.
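The four steps above can be sketched as a chain of functions applied to a stream before presentation. Everything here (the dict-based "stream", the stage names) is an illustration of the processing model, not a real media API.

```python
def demultiplex(stream):
    """Extract the individual tracks from a multiplexed stream."""
    return stream["tracks"]

def decode(track):
    """Pretend decode: mark the track as uncompressed."""
    return {**track, "compressed": False}

def convert(track, fmt):
    """Convert a track to the format required for presentation."""
    return {**track, "format": fmt}

def apply_effect(track, effect):
    """Record an effect filter applied to the decoded track."""
    return {**track, "effects": track.get("effects", []) + [effect]}

stream = {"tracks": [{"kind": "audio", "compressed": True},
                     {"kind": "video", "compressed": True}]}

# extract -> decode -> convert -> apply effects, for every track:
processed = [apply_effect(convert(decode(t), "raw"), "gain")
             for t in demultiplex(stream)]
```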
Demultiplexers and Multiplexers:
- A demultiplexer extracts individual tracks of media data from a multiplexed media stream.
- A multiplexer performs the opposite function: it takes individual tracks of media data and merges them into a single multiplexed media stream.
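A minimal illustration of this pair of operations: here a "multiplexed stream" is modeled as a flat list of (track id, sample) pairs interleaved in time, which is only a sketch of how real container formats interleave tracks.

```python
def multiplex(tracks):
    """Interleave samples from several named tracks into one stream."""
    stream = []
    for i in range(max(len(samples) for samples in tracks.values())):
        for track_id, samples in tracks.items():
            if i < len(samples):
                stream.append((track_id, samples[i]))
    return stream

def demultiplex(stream):
    """Recover the individual tracks from the interleaved stream."""
    tracks = {}
    for track_id, sample in stream:
        tracks.setdefault(track_id, []).append(sample)
    return tracks

tracks = {"audio": ["a0", "a1"], "video": ["v0", "v1"]}
muxed = multiplex(tracks)
# the round trip recovers the original tracks
assert demultiplex(muxed) == tracks
```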
Codecs:
- A codec performs media-data compression and decompression.
- When a track is encoded, it is converted to a compressed format suitable for storage or transmission.
- When it is decoded, it is converted to a non-compressed (raw) format suitable for presentation.
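Run-length encoding can stand in for the encode/decode pair described above. Real media codecs (MP3, H.264, and so on) are far more sophisticated; this toy only illustrates the round trip -- encode for storage or transmission, decode back to raw data for presentation.

```python
def encode(samples):
    """Compress by collapsing runs of repeated samples into [value, count]."""
    out = []
    for s in samples:
        if out and out[-1][0] == s:
            out[-1][1] += 1
        else:
            out.append([s, 1])
    return out

def decode(encoded):
    """Expand [value, count] pairs back into raw samples."""
    return [value for value, count in encoded for _ in range(count)]

raw = [0, 0, 0, 5, 5, 0]
compressed = encode(raw)      # [[0, 3], [5, 2], [0, 1]]
assert decode(compressed) == raw
```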
Effect Filters:
- An effect filter modifies the track data in some way, often to create special effects such as blur or echo.
- Typically, effect filters are applied to uncompressed (raw) data.
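As a sketch of an effect filter operating on raw data, here is a simple echo that mixes each audio sample with a delayed, attenuated copy of the signal. The parameter names are illustrative, not from any real effects API.

```python
def echo(samples, delay, decay):
    """Mix each sample with the sample `delay` positions earlier, scaled by `decay`."""
    out = list(samples)
    for i in range(delay, len(out)):
        out[i] = out[i] + decay * samples[i - delay]
    return out

# An impulse followed by silence picks up one echo two samples later:
wet = echo([1.0, 0.0, 0.0, 0.0], delay=2, decay=0.5)
# wet == [1.0, 0.0, 0.5, 0.0]
```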
Renderers:
- A renderer is an abstraction of a presentation device.
- For audio, the presentation device is typically the computer's hardware audio card that outputs sound to the speakers.
- For video, the presentation device is typically the computer monitor.
Compositing:
- Certain specialized devices support compositing.
- Compositing time-based media is the process of combining multiple tracks of data onto a single presentation medium.
- For example, overlaying text on a video presentation is one common form of compositing.
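The text-over-video example can be sketched in miniature: a video frame is modeled as a grid of characters, and compositing replaces a region of the frame with the text track's content. This is purely illustrative; real compositing works on pixels and often blends rather than replaces.

```python
def overlay_text(frame, text, row, col):
    """Return a new frame with `text` composited over it at (row, col)."""
    out = [list(line) for line in frame]
    for i, ch in enumerate(text):
        out[row][col + i] = ch
    return ["".join(line) for line in out]

frame = ["....", "....", "...."]
composited = overlay_text(frame, "Hi", row=1, col=1)
# composited == ["....", ".Hi.", "...."]
```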
Media Capture
- Time-based media can be captured from a live source for processing and playback. For example, audio can be captured from a microphone, or a video capture card can be used to obtain video from a camera.
- Capturing can be thought of as the input phase of the standard media processing model.
Capture Devices:
- Capture devices can be characterized as either push or pull sources.
- A still camera is a pull source -- the user controls when to capture an image.
- A microphone is a push source -- the live source continuously provides a stream of audio.
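The push/pull distinction above can be sketched as two tiny source types: a pull source yields data only when asked, while a push source delivers data to a consumer callback on its own schedule. All class and method names here are illustrative.

```python
class StillCamera:
    """Pull source: the user decides when to capture."""
    def __init__(self):
        self.count = 0

    def capture(self):
        self.count += 1
        return f"image-{self.count}"

class Microphone:
    """Push source: continuously delivers data to a consumer callback."""
    def __init__(self, on_data):
        self.on_data = on_data

    def deliver(self, chunk):
        # In a real device this would be driven by the hardware, not the user.
        self.on_data(chunk)

camera = StillCamera()
snap = camera.capture()               # pull: explicit request for one image

received = []
mic = Microphone(received.append)
mic.deliver("pcm-chunk-0")            # push: the source drives the data flow
```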
Capture Controls:
A capture control panel might enable the user to specify:
- data rate,
- encoding type for the captured stream, and
- start and stop of the capture process.
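The capture parameters listed above can be gathered into a single settings object, sketched below; none of these field names come from a real capture API.

```python
from dataclasses import dataclass

@dataclass
class CaptureSettings:
    """Illustrative bundle of the controls a capture panel might expose."""
    data_rate: int          # bits per second
    encoding: str           # encoding type for the captured stream
    capturing: bool = False

    def start(self):
        self.capturing = True

    def stop(self):
        self.capturing = False

settings = CaptureSettings(data_rate=128_000, encoding="pcm")
settings.start()
```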