[MultiMedia] DirectShow Note

Vince Huang
4 min read · Jun 27, 2017

These are study notes for DirectShow. They summarize the key concepts of DirectShow and a study roadmap. Current progress: Building the Filter Graph (2010/03/08).

DirectShow

  • The Microsoft DirectShow application programming interface (API) is a media-streaming architecture for the Microsoft Windows platform.
  • DirectShow simplifies media playback, format conversion, and capture tasks.
  • DirectShow is based on the Component Object Model (COM).
  • DXVA: DirectX Video Acceleration.
  • EVR: Enhanced Video Renderer.
  • VMR: Video Mixing Renderer.
  • DMOs: Microsoft DirectX Media Objects.

Filters and Filter Graphs

The building block of DirectShow is a software component called a filter, which performs some operation on a multimedia stream. For example, DirectShow filters can:

  • read files
  • get video from a video capture device
  • decode various stream formats, such as MPEG-1 video
  • pass data to the graphics or sound card

Filters receive input and produce output.

In DirectShow, an application performs any task by connecting chains of filters together, so that the output from one filter becomes the input for another. A set of connected filters is called a filter graph.
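To make the chaining idea concrete, here is a minimal sketch in portable C++. It is a toy model only: real DirectShow filters are COM objects connected through pins, and the names `ToyStage`, `run_chain`, and `make_playback_chain` are hypothetical and not part of the DirectShow API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a filter: takes a sample and returns the processed sample.
// This only models the data flow, not the real COM-based filter interfaces.
using ToyStage = std::function<std::string(const std::string&)>;

// Runs a sample through the stages in order, so that the output of one stage
// becomes the input of the next.
std::string run_chain(const std::vector<ToyStage>& chain, std::string sample) {
    for (const ToyStage& stage : chain) {
        sample = stage(sample);
    }
    return sample;
}

// Hypothetical three-stage chain: file reader -> decoder -> renderer.
// Each stage appends a tag so the path through the chain is visible.
std::vector<ToyStage> make_playback_chain() {
    return {
        [](const std::string& in) { return in + ">read"; },
        [](const std::string& in) { return in + ">decode"; },
        [](const std::string& in) { return in + ">render"; },
    };
}
```

Calling `run_chain(make_playback_chain(), "file")` passes the sample through all three stages in order, which mirrors how data flows from a source filter to a renderer in a filter graph.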

How To Play a File

See the example and sample code on MSDN.

The DirectShow Solution

See the DirectShow component relationship diagram on MSDN.

About DirectShow Filters

Filters are COM objects. Their connection points are also COM objects, called pins. Filters use pins to move data from one filter to the next. In DirectShow, a set of connected filters is called a filter graph.

Filters have three possible states: running, stopped, and paused. When a filter is running, it processes media data. When it is stopped, it stops processing data. The paused state is used to cue data before running.
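The three states can be sketched as a small state machine. This is a simplified model under stated assumptions, not the real filter interface: `ToyState` and `StatefulToyFilter` are hypothetical names, samples are plain integers, and "cueing" is modeled as holding samples until the filter runs.

```cpp
#include <vector>

enum class ToyState { Stopped, Paused, Running };

// Simplified model of a filter that reacts to the three states.
class StatefulToyFilter {
public:
    // Stopping discards anything that was cued but not yet processed.
    void Stop() { state_ = ToyState::Stopped; cued_.clear(); }

    void Pause() { state_ = ToyState::Paused; }

    // Running first processes the samples that were cued while paused.
    void Run() {
        state_ = ToyState::Running;
        for (int sample : cued_) processed_.push_back(sample);
        cued_.clear();
    }

    // Returns false when the sample is rejected because the filter is stopped.
    bool Receive(int sample) {
        switch (state_) {
            case ToyState::Stopped: return false;
            case ToyState::Paused:  cued_.push_back(sample); return true;
            case ToyState::Running: processed_.push_back(sample); return true;
        }
        return false;
    }

    ToyState state() const { return state_; }
    const std::vector<int>& processed() const { return processed_; }

private:
    ToyState state_ = ToyState::Stopped;
    std::vector<int> cued_;       // samples held while paused
    std::vector<int> processed_;  // samples processed while running
};
```

In this model a sample received while paused is held and only processed once `Run` is called, which is the sense in which the paused state cues data.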

The primary function of most filters is to process and deliver media data. How that occurs depends on the type of filter:

  • A push source has a worker thread that continuously fills samples with data and delivers them downstream.
  • A pull source waits for its downstream neighbor to request a sample. It responds by writing data into a sample and delivering the sample to the downstream filter. The downstream filter creates the thread that drives the data flow.
  • A transform filter has samples delivered to it by its upstream neighbor. When it receives a sample, it processes the data and delivers it downstream.
  • A renderer filter receives samples from upstream, and schedules them for rendering based on the time stamps.
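The difference between the push and pull models comes down to who drives the data flow. The sketch below illustrates that in portable C++. It is a single-threaded toy (a real push source uses a worker thread), and `push_all`, `ToyPullSource`, and `pull_all` are hypothetical names rather than DirectShow APIs.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Push model: the source drives the flow and hands each sample downstream.
void push_all(const std::vector<int>& data,
              const std::function<void(int)>& downstream) {
    for (int sample : data) {
        downstream(sample);
    }
}

// Pull model: the source does nothing until it is asked for a sample.
class ToyPullSource {
public:
    explicit ToyPullSource(std::vector<int> data) : data_(std::move(data)) {}

    // Writes the next sample into `out`; returns false at end of stream.
    bool Request(int& out) {
        if (next_ >= data_.size()) return false;
        out = data_[next_++];
        return true;
    }

private:
    std::vector<int> data_;
    std::size_t next_ = 0;
};

// The downstream side drives the flow by requesting samples until the
// source reports end of stream.
std::vector<int> pull_all(ToyPullSource& source) {
    std::vector<int> received;
    int sample = 0;
    while (source.Request(sample)) {
        received.push_back(sample);
    }
    return received;
}
```

In `push_all` the loop lives in the source, while in `pull_all` the loop lives downstream. That matches the description above, where the downstream filter creates the thread that drives a pull source.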

About the Filter Graph Manager

The Filter Graph Manager is a COM object that controls the filters in a filter graph. It performs many functions, including the following:

  • State changes: Coordinating state changes among the filters.
  • Reference clock: Establishing a reference clock.
  • Graph events: Communicating events back to the application.
  • Graph-building methods: Providing methods for applications to build the filter graph.

About Media Types

The media type is a universal and extensible way to describe digital media formats. When two filters connect, they agree on a media type. The media type identifies what kind of data the upstream filter will deliver to the downstream filter, and the physical layout of the data. If two filters cannot agree on a media type, they will not connect.

Media types are defined using the AM_MEDIA_TYPE structure. This structure contains the following information:

  • Major type: The major type is a GUID that defines the overall category of the data. Major types include video, audio, unparsed byte stream, MIDI data, and so forth.
  • Subtype: The subtype is another GUID, which further defines the format. For example, within the video major type, there are subtypes for RGB-24, RGB-32, UYVY, and so forth. Within audio, there is PCM audio, MPEG-1 payload, and others. The subtype provides more information than the major type, but it does not define everything about the format. For example, video subtypes do not define the image size or the frame rate. These are defined by the format block, described below.
  • Format block: The format block is a block of data that describes the format in detail. The format block is allocated separately from the AM_MEDIA_TYPE structure. The pbFormat member of the AM_MEDIA_TYPE structure points to the format block. The pbFormat member is a generic byte pointer (declared as BYTE* in the structure) because the layout of the format block changes depending on the media type. For example, PCM audio uses a WAVEFORMATEX structure. Video uses various structures, including VIDEOINFOHEADER and VIDEOINFOHEADER2. The formattype member of the AM_MEDIA_TYPE structure is a GUID that specifies which structure is contained in the format block. Each format structure is assigned a GUID. The cbFormat member specifies the size of the format block. Always check formattype and cbFormat before dereferencing the pbFormat pointer.
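The "check before dereferencing" rule can be sketched as follows. To keep the example portable, it uses simplified stand-in types (`ToyFormatType`, `ToyVideoInfo`, `ToyMediaType`, all hypothetical) instead of the real Windows definitions. In real DirectShow code the same pattern would compare formattype against a GUID such as FORMAT_VideoInfo and cbFormat against sizeof(VIDEOINFOHEADER).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified stand-ins, NOT the real Windows definitions.
enum class ToyFormatType { None, VideoInfo, WaveFormatEx };

struct ToyVideoInfo {
    std::int32_t width;
    std::int32_t height;
};

struct ToyMediaType {
    ToyFormatType formattype;       // which structure the format block holds
    std::size_t cbFormat;           // size of the format block in bytes
    const unsigned char* pbFormat;  // untyped pointer to the format block
};

// Copies the video format into `out` only if the format block has the
// expected type, is present, and is large enough. Returns false otherwise.
bool get_video_info(const ToyMediaType& mt, ToyVideoInfo& out) {
    if (mt.formattype != ToyFormatType::VideoInfo) return false;
    if (mt.pbFormat == nullptr) return false;
    if (mt.cbFormat < sizeof(ToyVideoInfo)) return false;
    std::memcpy(&out, mt.pbFormat, sizeof(ToyVideoInfo));
    return true;
}
```

All three checks run before the format block is read, so a media type that carries a different format structure or a truncated block is rejected instead of being misinterpreted.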

About Media Samples and Allocators

Filters deliver data across pin connections. Data moves from the output pin of one filter to the input pin of another filter. The most common way for the output pin to deliver the data is by calling the IMemInputPin::Receive method on the input pin, although a few other mechanisms exist as well. For more detail, see MSDN.

How Hardware Devices Participate in the Filter Graph

Wrapper Filters

All DirectShow filters are user-mode software components. In order for a kernel-mode hardware device, such as a video capture card, to join a DirectShow filter graph, the device must be represented as a user-mode filter. This function is performed by specialized “wrapper” filters provided with DirectShow. DirectShow also provides a filter called KsProxy, which can represent any type of Windows Driver Model (WDM) streaming device. Hardware vendors can extend KsProxy to support custom functionality by providing a KsProxy plug-in, which is a COM object aggregated by KsProxy.

The wrapper filters expose COM interfaces that represent the capabilities of the device. The application uses these interfaces to pass information to and from the filter. The filter translates the COM method calls into device driver calls, passes that information to the driver in kernel mode, and then translates the result back to the application. Some filters support custom driver properties through the IKsPropertySet interface.

For application developers, wrapper filters enable applications to control devices just as they control any other DirectShow filter. No special programming is required; the details of communicating with the kernel-mode device are encapsulated within the filter.

Originally published at vincewiki.blogspot.com on June 27, 2017.
