Showing posts with label video. Show all posts

Tuesday, November 2, 2010

IQ Modulation

I: the in-phase component.
Q: the quadrature component.
IQ modulation is a fundamental concept in communications and belongs to the category of coherent modulation.
IQ modulation splits the data into two paths and applies coherent carrier modulation to each. (Note that the two paths are not always strictly "orthogonal": that description fits BPSK and QPSK, but is less apt for 8PSK and higher-order phase modulations.) A quadrature amplitude modulated (QAM) signal uses two carriers of the same frequency that are mutually orthogonal, i.e. 90 degrees (a quarter period) out of phase. One is called the I signal, the other the Q signal. The two paths are then summed, placing the modulating information in the amplitude, phase or frequency of the carrier.
From a transmission-line point of view, an I/Q signal pair is a two-wire transmission mode: the energy is concentrated between the two wires, with little coupling to the outside world, which helps reject common-mode interference. The loop area between the wires should, of course, be kept small.

IQ signals themselves have nothing to do with interference rejection.
To achieve higher spectral efficiency, modern communication systems use many kinds of vector modulation, such as BPSK, QPSK and QAM.
Digital circuitry by itself has no notion of whether a signal is a vector, so IQ modulation is used to bridge the vector gap between the digital and analogue domains.
The most basic benefit of I/Q modulation is single-sideband output.
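The modulate-then-recover cycle described above can be sketched numerically. This is a minimal illustration with hypothetical tone frequencies and a crude moving-average low-pass filter, not a real transmitter or receiver:

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
fs = 1_000_000.0                   # sample rate, Hz
fc = 10_000.0                      # carrier frequency, Hz
t = np.arange(0, 1e-3, 1 / fs)    # 1 ms of samples

# Baseband I and Q components: two arbitrary low-frequency tones.
i_bb = np.cos(2 * np.pi * 1000 * t)
q_bb = np.sin(2 * np.pi * 2000 * t)

# IQ modulation: mix I with cos, Q with -sin, and sum into one passband signal.
s = i_bb * np.cos(2 * np.pi * fc * t) - q_bb * np.sin(2 * np.pi * fc * t)

def lowpass(x, n=50):
    # Two cascaded moving averages; n = 50 samples is one period of the
    # 2*fc (20 kHz) product term, so that term is nulled exactly.
    k = np.ones(n) / n
    return np.convolve(np.convolve(x, k, mode="same"), k, mode="same")

# Coherent demodulation: multiply by the same carriers, low-pass, rescale.
i_rec = 2 * lowpass(s * np.cos(2 * np.pi * fc * t))
q_rec = 2 * lowpass(s * -np.sin(2 * np.pi * fc * t))
```

Away from the edges of the window, `i_rec` and `q_rec` closely track the original baseband tones, which is the point of coherent demodulation: both components travel on one passband signal yet separate cleanly at the receiver.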

Thursday, September 16, 2010

Format of a Transport Stream Packet

Each MPEG-2 TS packet carries 184 bytes of payload data prefixed by a 4-byte (32-bit) header.


The header has the following fields:
The header starts with a well-known Synchronisation Byte (8 bits), which has the value 0x47 (0100 0111).
  1. A set of three flag bits indicates how the payload should be processed.
    • The first flag indicates a transport error.
    • The second flag indicates the start of a payload (payload_unit_start_indicator).
    • The third flag is the transport priority bit.
  2. The flags are followed by a 13 bit Packet Identifier (PID). This is used to uniquely identify the stream to which the packet belongs (e.g. PES packets corresponding to an ES) generated by the multiplexer. The PID allows the receiver to differentiate the stream to which each received packet belongs. Some PID values are predefined and are used to indicate various streams of control information. A packet with an unknown PID, or one with a PID which is not required by the receiver, is silently discarded. The particular PID value of 0x1FFF is reserved to indicate that the packet is a null packet (and is to be ignored by the receiver).
  3. Two scrambling control bits, used by conditional access procedures to encrypt the payload of some TS packets.
  4. Two adaptation field control bits, which may take four values:
    • 01 – no adaptation field, payload only
    • 10 – adaptation field only, no payload
    • 11 – adaptation field followed by payload
    • 00 – reserved for future use
  5. Finally, there is a half-byte Continuity Counter (4 bits).
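The field layout above can be unpacked with a few shifts and masks; a minimal sketch:

```python
# Parse the 4-byte MPEG-2 TS packet header described above.
def parse_ts_header(packet: bytes) -> dict:
    if len(packet) < 4 or packet[0] != 0x47:
        raise ValueError("not a TS packet (missing 0x47 sync byte)")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error":          bool(b1 & 0x80),
        "payload_unit_start":       bool(b1 & 0x40),
        "transport_priority":       bool(b1 & 0x20),
        "pid":                      ((b1 & 0x1F) << 8) | b2,   # 13 bits
        "scrambling_control":       (b3 >> 6) & 0x3,
        "adaptation_field_control": (b3 >> 4) & 0x3,
        "continuity_counter":       b3 & 0x0F,
    }

# Example: a null packet (PID 0x1FFF, payload only, counter 0).
hdr = parse_ts_header(bytes([0x47, 0x1F, 0xFF, 0x10]) + b"\xFF" * 184)
```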

Two options are possible for inserting PES data into the TS packet payload:
  1. The simplest option, from both the encoder and receiver viewpoints, is to send only one PES (or a part of single PES) in a TS packet. This allows the TS packet header to indicate the start of the PES, but since a PES packet may have an arbitrary length, also requires the remainder of the TS packet to be padded, ensuring correct alignment of the next PES to the start of a TS packet. In MPEG-2 the padding value is the hexadecimal byte 0xFF.
  2. In general a given PES packet spans several TS packets, so that the majority of TS packets contain continuation data in their payloads. When a PES packet starts, however, the payload_unit_start_indicator bit is set to '1', which means the first byte of the TS payload contains the first byte of the PES packet header. Only one PES packet can start in any single TS packet. The TS header also contains the PID, so that the receiver can accept or reject PES packets at a high level without burdening the receiver with too much processing. This alignment has an efficiency cost for short PES packets.
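The splitting described above can be sketched as follows. This is a simplified toy (real multiplexers use adaptation-field stuffing rather than 0xFF payload padding, which the text's option 1 describes for the last packet; the function name is invented here):

```python
# Split one PES packet across 188-byte TS packets on a given PID,
# setting payload_unit_start_indicator on the first packet and padding
# the final partial payload with 0xFF (simplified, per option 1 above).
def pes_to_ts(pes: bytes, pid: int) -> list:
    packets, cc = [], 0
    for off in range(0, len(pes), 184):
        chunk = pes[off:off + 184]
        start = off == 0                       # payload_unit_start_indicator
        hdr = bytes([
            0x47,                              # sync byte
            (0x40 if start else 0x00) | (pid >> 8) & 0x1F,
            pid & 0xFF,
            0x10 | cc,                         # payload only + continuity counter
        ])
        packets.append(hdr + chunk + b"\xFF" * (184 - len(chunk)))
        cc = (cc + 1) & 0x0F
    return packets
```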

video transmission (mpeg2)

(excerpted from http://www.abdn.ac.uk/erg/research/future-net/digital-video/mpeg2-trans.html)
The MPEG-2 standards define how to format the various component parts of a multimedia programme (which may consist of MPEG-2 compressed video, compressed audio, control data and/or user data). They also define how these components are combined into a single synchronous transmission bit stream. The process of combining the streams is known as multiplexing.

The multiplexed stream may be transmitted over a variety of links; standards/products are (or will soon be) available for:
  • Radio Frequency Links (UHF/VHF)
  • Digital Broadcast Satellite Links
  • Cable TV Networks
  • Standard Terrestrial Communication Links (PDH, SDH)
  • Microwave Line of Sight (LoS) Links (wireless)
  • Digital Subscriber Links (ADSL family)
  • Packet / Cell Links (ATM, IP, IPv6, Ethernet)


Building the MPEG Bit Stream

To understand how the component parts of the bit stream are multiplexed, we need to first look at each component part. The most basic component is known as an Elementary Stream (ES) in MPEG. A programme (perhaps most easily thought of as a television programme, or a DVD track) contains a combination of elementary streams (typically one for video, one or more for audio, control data, subtitles, etc).

Each Elementary Stream (ES) output by an MPEG audio, video or (some) data encoder contains a single type of (usually compressed) signal. There are various forms of ES, including:
  • Digital Control Data
  • Digital Audio (sampled and compressed)
  • Digital Video (sampled and compressed)
  • Digital Data (synchronous, or asynchronous)

For video and audio, the data is organised into access units, each representing a fundamental unit of encoding. For example, in video, an access unit will usually be a complete encoded video frame.

Each ES is input to an MPEG-2 processor (e.g. a video compressor or data formatter) which accumulates the data into a stream of Packetised Elementary Stream (PES) packets. A PES packet may be a fixed- or variable-sized block, with up to 65536 bytes per block, and includes a 6-byte protocol header. A PES is usually organised to contain an integral number of ES access units.

The PES header starts with a 3 byte start code, followed by a one byte stream ID and a 2 byte length field.
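Reading that fixed 6-byte portion is a one-liner with `struct`; a minimal sketch:

```python
import struct

# Parse the fixed part of a PES header: 3-byte start code prefix
# (0x000001), 1-byte stream ID, 2-byte big-endian length field.
def parse_pes_header(data: bytes):
    if data[:3] != b"\x00\x00\x01":
        raise ValueError("missing PES start code prefix")
    stream_id = data[3]
    (pes_length,) = struct.unpack(">H", data[4:6])
    return stream_id, pes_length

# Example: stream ID 0xE0 (first video stream), 300 bytes to follow.
sid, length = parse_pes_header(b"\x00\x00\x01\xE0\x01\x2C" + b"\x00" * 300)
```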

PES Indicators provide additional information about the stream to assist the decoder at the receiver. The following indicators are defined:
  • PES_Scrambling_Control - Defines whether scrambling is used, and the chosen scrambling method.
  • PES_Priority - Indicates priority of the current PES packet.
  • data_alignment_indicator - Indicates if the payload starts with a video or audio start code.
  • copyright information - Indicates if the payload is copyright protected.
  • original_or_copy - Indicates if this is the original ES.
A one byte flags field completes the PES header. This defines the following optional fields which, if present, are inserted before the start of the PES payload.
  • Presentation Time Stamp (PTS) and possibly a Decode Time Stamp (DTS) - For audio/video streams, these time stamps may be used to synchronise a set of elementary streams and to control the rate at which they are replayed by the receiver.
  • Elementary Stream Clock Reference (ESCR)
  • Elementary Stream rate - Rate at which the ES was encoded.
  • Trick Mode - indicates the video/audio is not the normal ES, e.g. after DSM-CC has signalled a replay.
  • Copyright Information - set to 1 to indicate a copyrighted ES.
  • CRC - this may be used to monitor errors in the previous PES packet
  • PES Extension Information - may be used to support MPEG-1 streams.

The PES packet payload includes the ES data. The information in the PES header is, in general, independent of the transmission method used.

MPEG-2 Multiplexing

The MPEG-2 standard allows two forms of multiplexing:
MPEG Program Stream
A group of tightly coupled PES packets referenced to the same time base. Such streams are suited for transmission in a relatively error-free environment and enable easy software processing of the received data. This form of multiplexing is used for video playback and for some network applications.
MPEG Transport Stream
Each PES packet is broken into fixed-sized transport packets, forming a general-purpose way of combining one or more streams, possibly with independent time bases. This form is suited for transmission where there may be packet loss or corruption by noise, and/or where there is a need to send more than one programme at a time.

MPEG Transport Streams

A transport stream consists of a sequence of fixed-size transport packets of 188 bytes. Each packet comprises 184 bytes of payload and a 4-byte header. One of the items in this header is the 13-bit Packet Identifier (PID), which plays a key role in the operation of the Transport Stream.

The format of the transport stream is described with an example: two elementary streams sent in the same MPEG-2 transport multiplex. Each packet is associated with a PES through the PID value in the packet header. The audio packets have been assigned PID 64, and the video packets PID 51 (arbitrary, but different, values). As is usual, there are more video than audio packets, but note also that the two types of packets are not evenly spaced in time. The MPEG-TS is not a time division multiplex; packets with any PID may be inserted into the TS at any time by the TS multiplexer. If no packets are available at the multiplexer, it inserts null packets (denoted by a PID value of 0x1FFF) to retain the specified TS bit rate. Nor does the multiplexer synchronise the two PESs; indeed the encoding and decoding delay for each PES may differ (and usually does). A separate process is therefore required to synchronise the two streams.
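A receiver's side of this example can be sketched as a toy demultiplexer: route packets by PID (64 for audio, 51 for video, as in the example above) and silently drop null packets. The function name is invented here:

```python
# Toy TS demultiplexer: group payload bytes by PID, ignoring null packets.
def demux(ts: bytes) -> dict:
    streams = {}
    for off in range(0, len(ts) - 187, 188):
        pkt = ts[off:off + 188]
        assert pkt[0] == 0x47, "lost sync"
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        if pid == 0x1FFF:                 # null packet: ignore
            continue
        streams.setdefault(pid, bytearray()).extend(pkt[4:])
    return streams
```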


Single and Multiple Program Transport Streams

A TS may correspond to a single TV programme, or multimedia stream (e.g. with a video PES and an audio PES). This type of TS is normally called a Single Programme Transport Stream (SPTS).

An SPTS contains all the information required to reproduce the encoded TV channel or multimedia stream. It may contain only an audio PES and a video PES, but in practice there will be other types of PES as well. Each PES shares a common timebase. Although some equipment outputs and uses SPTS, this is not the normal form transmitted over a DVB link.

In most cases one or more SPTS streams are combined to form a Multiple Programme Transport Stream (MPTS). This larger aggregate also contains all the control information (Program Specific Information (PSI)) required to co-ordinate the DVB system, and any other data which is to be sent.

Signalling Tables

For a user to receive a particular transport stream, the user must first determine the PID being used, and then filter packets which have a matching PID value. To help the user identify which PID corresponds to which programme, a special set of streams, known as Signalling Tables, is transmitted with a description of each programme carried within the MPEG-2 Transport Stream. Signalling tables are sent separately from the PESs and are not synchronised with the elementary streams (i.e. they form an independent control channel).


The tables (called Program Specific Information (PSI) in MPEG-2) consist of a description of the elementary streams which need to be combined to build programmes, and a description of the programmes themselves. Each PSI table is carried in a sequence of PSI Sections, which may be of variable length (but are usually small compared with PES packets). Each section is protected by a CRC (checksum) to verify the integrity of the table being carried. The length of a section allows a decoder to identify the next section in a packet. A PSI section may also be used for downloading data to a remote site. Tables are sent periodically by including them in the transmitted transport multiplex.
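The CRC protecting each section is the MPEG-2 variant of CRC-32 (polynomial 0x04C11DB7, initial value 0xFFFFFFFF, no bit reflection, no final XOR). Because the CRC is appended to the section, a decoder can validate a section by computing the CRC over the whole section, trailing CRC included, and checking that the result is zero. A bit-by-bit sketch (slow but clear; real decoders use tables or hardware):

```python
# CRC-32/MPEG-2: MSB-first, poly 0x04C11DB7, init 0xFFFFFFFF, no final XOR.
def crc32_mpeg2(data: bytes) -> int:
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc

def section_ok(section: bytes) -> bool:
    # An intact section (payload + appended CRC) yields a zero remainder.
    return crc32_mpeg2(section) == 0
```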

Thursday, July 8, 2010

run level coding (codec)

Scan
After quantization, the DCT coefficients for a block are reordered to group together nonzero coefficients, enabling efficient representation of the remaining zero-valued quantized coefficients. The optimum reordering path (scan order) depends on the distribution of nonzero DCT coefficients. For a typical frame block a suitable scan order is a zigzag starting from the DC (top-left) coefficient.  Nonzero coefficients tend to be grouped together at the start of the reordered array, followed by long sequences of zeros.

The zigzag scan may not be ideal for a field block because of the skewed coefficient distribution, and a modified scan order may be more effective.

Run-Level Encoding
The output of the reordering process is an array that typically contains one or more clusters of nonzero coefficients near the start, followed by strings of zero coefficients. The large number of zero values may be encoded to represent them more compactly, for example by representing the array as a series of (run, level) pairs where run indicates the number of zeros preceding a nonzero coefficient and level indicates the magnitude of the nonzero coefficient.

Higher-frequency DCT coefficients are very often quantized to zero, and so a reordered block will usually end in a run of zeros. A special case is required to indicate the final nonzero coefficient in a block.
Example:
Input array:        16,0,0, -3,5,6,0,0,0,0, -7, . . .
Output values:    (0,16),(2, -3),(0,5),(0,6),(4, -7). . .

If so-called "two-dimensional" run-level encoding is used, each run-level pair is encoded as above and a separate code symbol, "last", indicates the end of the nonzero values.

If "three-dimensional" run-level encoding is used, each symbol encodes three quantities: run, level and last. In the example above, if -7 is the final nonzero coefficient, the 3D values are:
(0, 16, 0), (2, -3, 0), (0, 5, 0), (0, 6, 0), (4, -7, 1)
The 1 in the final code indicates that this is the last nonzero coefficient in the block.

Quantization (codec)

A quantiser maps a signal with a range of values X to a quantised signal with a reduced range of values Y. It should be possible to represent the quantised signal with fewer bits than the original since the range of possible values is smaller.

A scalar quantiser maps one sample of the input signal to one quantised output value and a vector quantiser maps a group of input samples (a "vector") to a group of quantised values.

A simple example of scalar quantisation is the process of rounding a fractional number to the nearest integer, i.e. the mapping is from R to Z. The process is lossy (not reversible).

2 types of quantisers are listed here: a linear quantiser (with a linear mapping between input and output values) and a nonlinear quantiser that has a "dead zone" about zero (in which small-valued inputs are mapped to zero).


In image and video compression CODECs, the quantisation operation is usually made up of two parts: a forward quantiser FQ in the encoder and an "inverse quantiser" or (IQ) in the decoder.

A critical parameter is the step size QP between successive re-scaled values. If the step size is large, the range of quantised values is small and can therefore be efficiently represented (highly compressed) during transmission, but the re-scaled values are a crude approximation to the original signal.

 The forward quantiser in an image or video encoder is designed to map insignificant coefficient values to zero whilst retaining a reduced number of significant, nonzero coefficients.
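The two quantiser types named above, plus the decoder-side rescaling, can be sketched in a few lines (function names are invented; real codecs fold scaling factors into the transform):

```python
# Linear quantiser: uniform steps of size qp, rounding to the nearest level.
def quantize_linear(x, qp):
    return round(x / qp)

# Dead-zone quantiser: truncation toward zero means every |x| < qp maps
# to level 0, suppressing insignificant coefficients.
def quantize_deadzone(x, qp):
    return int(x / qp)

# Inverse quantiser ("rescaling") in the decoder.
def rescale(level, qp):
    return level * qp
```

Note how `quantize_deadzone(3, 4)` gives 0 while `quantize_linear(3, 4)` gives 1: the dead zone is what zeroes out small coefficients before run-level coding.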


Vector Quantisation
A vector quantiser maps a set of input data (such as a block of image samples) to a single value (codeword) and, at the decoder, each codeword maps to an approximation to the original set of input data (a "vector").

A typical application of vector quantisation to image compression [5] is as follows:
1.  Partition the original image into regions (e.g. M × N pixel blocks).
2.  Choose a vector from the codebook that matches the current region as closely as possible.
3.  Transmit an index that identifies the chosen vector to the decoder.
4.  At the decoder, reconstruct an approximate copy of the region using the selected vector.
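The four steps above can be sketched with a tiny hand-made codebook of 2x2 blocks (a real codebook would be trained, e.g. with the LBG/k-means algorithm; the entries here are invented for illustration):

```python
import numpy as np

# A hypothetical 4-entry codebook of 2x2 pixel blocks.
codebook = np.array([
    [[0, 0], [0, 0]],          # flat dark
    [[255, 255], [255, 255]],  # flat bright
    [[0, 255], [0, 255]],      # vertical edge
    [[0, 0], [255, 255]],      # horizontal edge
], dtype=float)

def encode_block(block):
    # Steps 2-3: index of the closest codebook vector (squared error).
    errors = ((codebook - block) ** 2).sum(axis=(1, 2))
    return int(np.argmin(errors))

def decode_block(index):
    # Step 4: reconstruct an approximation from the chosen vector.
    return codebook[index]

# A block that looks like a vertical edge picks codeword 2.
idx = encode_block(np.array([[10, 240], [5, 250]], dtype=float))
```

Only the index is transmitted, so the compression ratio is set by the block size and codebook size, and the reconstruction error by how well the codebook covers typical blocks.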

Wednesday, July 7, 2010

Transform coding (codec)

Transform coding is at the heart of the majority of video coding systems and standards.
Spatial image data (image samples or motion-compensated residual samples) are transformed into a different representation, the transform domain.

The two most widely used image compression transforms are the discrete cosine transform (DCT) and the discrete wavelet transform (DWT). The DCT is usually applied to small, regular blocks of image samples (e.g. 8 x 8 squares) and the DWT is usually applied to larger image sections ("tiles") or to complete images
  • DCT
The DCT has proved particularly durable and is at the core of most of the current generation of image and video coding standards, including JPEG, H.261, H.263, H.263+, MPEG-l, MPEG-2 and MPEG-4. The DWT is gaining popularity because it can outperform the DCT for still image coding and so it is used in the new JPEG image coding standard (JPEG-2000) and for still "texture" coding in MPEG-4.

DCT become the most popular transform for image and video coding. There are two main reasons for its popularity: first, it is effective at transforming image data into a form that is easy to compress and second, it can be efficiently implemented in software and hardware.

The forward DCT (FDCT) of an N × N sample block is given by:
     Y = AXA(T)
and the inverse DCT (IDCT) by:
     X = A(T)YA

The transform matrix A for a 4 × 4 DCT is:
A =

0.5      0.5      0.5      0.5
0.653    0.271   −0.271   −0.653
0.5     −0.5     −0.5      0.5
0.271   −0.653    0.653   −0.271

The forward DCT (FDCT) transforms a set of image samples (the "spatial domain") into a set of transform coefficients (the "transform domain"). The transform is reversible: the inverse DCT (IDCT) transforms a set of coefficients into a set of image samples.
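The pair of formulas Y = AXA(T) and X = A(T)YA translates directly into code. The sketch below builds A from the standard DCT basis (a_ij = c_i·cos((2j+1)iπ/8), with c_0 = 1/2 and c_i = sqrt(1/2) otherwise), which reproduces the matrix values shown above:

```python
import numpy as np

# Build the 4x4 DCT transform matrix A from the cosine basis.
c = np.array([np.sqrt(1 / 4)] + [np.sqrt(2 / 4)] * 3)
A = np.array([[c[i] * np.cos((2 * j + 1) * i * np.pi / 8) for j in range(4)]
              for i in range(4)])

def fdct(X):
    # Forward DCT: Y = A X A^T
    return A @ X @ A.T

def idct(Y):
    # Inverse DCT: X = A^T Y A
    return A.T @ Y @ A
```

Because A is orthonormal (A·A^T = I), the inverse transform recovers the original block exactly, up to floating-point rounding.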

The DCT has two useful properties for image and video compression: energy compaction (concentrating the image energy into a small number of coefficients) and decorrelation (minimising the interdependencies between coefficients).

A reasonable approximation to the original image block can be reconstructed from just these most significant coefficients.

 The DCT becomes increasingly complex to calculate for larger block sizes.

  • DWT
 a wavelet transform is typically applied to a complete image or to a large rectangular region ("tile") of the image.


A single-stage wavelet transformation consists of a filtering operation that decomposes an image into four frequency bands. The top-left corner of the transformed image ("LL") is the original image, low-pass filtered and subsampled in the horizontal and vertical dimensions. The top-right corner ("HL") consists of residual vertical frequencies.
 The bottom-left corner ("LH") contains residual horizontal frequencies.
 The bottom-right corner ("HH") contains residual diagonal frequencies.

This decomposition process may be repeated for the "LL" component to produce another set of four components: a new "LL" component that is a further subsampled version of the original image, plus three more residual frequency components.

The wavelet decomposition has some important properties. First, the number of wavelet "coefficients" (the spatial values that make up the decomposed image) is the same as the number of pixels in the original image, and so the transform is not inherently adding or removing information.
Second, many of the coefficients of the high-frequency components ("HH", "HL" and "LH" at each stage) are zero or insignificant. This reflects the fact that much of the important information in an image is low-frequency. Third, the decomposition is not restricted by block boundaries (unlike the DCT) and hence may be a more flexible way of decorrelating the image data (i.e. concentrating the significant components into a few coefficients) than the block-based DCT.
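A single decomposition stage can be sketched with the simplest filter pair, the Haar wavelet (practical codecs such as JPEG-2000 use longer biorthogonal filters, but the band structure is the same):

```python
import numpy as np

# One stage of a 2D Haar wavelet decomposition: filter and subsample
# horizontally, then vertically, yielding the LL, HL, LH and HH bands.
def haar_stage(img):
    img = img.astype(float)
    # Horizontal: low band = pairwise mean, high band = pairwise difference.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2
    hi = (img[:, 0::2] - img[:, 1::2]) / 2
    # Vertical: repeat on each horizontal band.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2   # low-pass both directions
    lh = (lo[0::2, :] - lo[1::2, :]) / 2
    hl = (hi[0::2, :] + hi[1::2, :]) / 2
    hh = (hi[0::2, :] - hi[1::2, :]) / 2
    return ll, hl, lh, hh
```

For a flat (constant) image every high-frequency band is exactly zero, illustrating the second property above: most of the energy lands in "LL", and the four quarter-size bands together hold exactly as many coefficients as the input had pixels.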

Wavelet-based compression performs well for still images (particularly in comparison with DCT-based compression) and can be implemented reasonably efficiently.

Monday, June 28, 2010

video demystified(summary on mpeg2/mpeg4/h264)

  • MPEG-2
MPEG-2 uses the YCbCr color space, supporting 4:2:0, 4:2:2 and 4:4:4 sampling. The 4:2:2 and 4:4:4 sampling options increase the chroma resolution over 4:2:0, resulting in better picture quality.

There are three types of coded pictures.
I (intra) pictures are fields or frames coded as stand-alone still images.
P (predicted) pictures are fields or frames coded relative to the nearest previous I or P picture, resulting in forward prediction processing.
B (bidirectional) pictures are fields or frames that use the closest past and future I or P picture as a reference, resulting in bidirectional prediction.

A group of pictures (GOP) is a series of one or more coded pictures intended to assist in random accessing and editing. The GOP value is configurable during the encoding process. The smaller the GOP value, the better the response to movement (since the I pictures are closer together), but the lower the compression. In the coded bitstream, a GOP must start with an I picture and may be followed by any number of I, P, or B pictures in any order. In display order, a GOP must start with an I or B picture and end with an I or P picture.

An open GOP, identified by the broken_link flag, indicates that the first B pictures (if any) immediately following the first I picture after the GOP header may not be decoded correctly (and thus not be displayed) since the reference picture used for prediction is not available due to editing.



Macroblocks
Three types of macroblocks are available in MPEG-2.
The 4:2:0 macroblock consists of four Y blocks, one Cb block, and one Cr block.
The 4:2:2 macroblock consists of four Y blocks, two Cb blocks, and two Cr blocks.
The 4:4:4 macroblock consists of four Y blocks, four Cb blocks, and four Cr blocks.

Macroblocks in P pictures are coded using the closest previous I or P picture as a reference, resulting in two possible codings:
 - intra coding: no motion compensation
 - forward prediction: closest previous I or P picture is the reference
Macroblocks in B pictures are coded using the closest previous and/or future I or P picture as a reference, resulting in four possible codings:
 - intra coding: no motion compensation
 - forward prediction: closest previous I or P picture is the reference
 - backward prediction: closest future I or P picture is the reference
 - bi-directional prediction: two pictures are used as the reference, the closest previous I or P picture and the closest future I or P picture

Block size: 8x8 for MPEG2

Video Bitstream
a hierarchical structure with seven layers.
From top to bottom the layers are:
 - Video Sequence
 - Sequence Header
 - Group of Pictures (GOP)
 - Picture
 - Slice
 - Macroblock (MB)
 - Block

Sequence Header
A sequence header should occur about every one-half second.
 - Sequence_header_code
 - Horizontal_size_value
 - Vertical_size_value
 - Aspect_ratio_information
 - Frame_rate_code
 - Bit_rate_value
 - Vbv_buffer_size_value
 - Constrained_parameters_flag
 - Load_intra_quantizer_matrix
 - Intra_quantizer_matrix
 - Load_non_intra_quantizer_matrix
 - Non_intra_quantizer_matrix
 - Sequence Extension
 - Extension_start_code
 - User Data
   --- User_data_start_code
   --- User Data

Data for each group of pictures consists of a GOP header followed by picture data. A GOP header should occur about every two seconds.
Data for each picture consists of a picture header followed by slice data. If a sequence extension is present, each picture header is followed by a picture coding extension.
Data for each slice layer consists of a slice header followed by macroblock data.
Data for each macroblock layer consists of a macroblock header followed by motion vector and block data
Data for each block layer consists of coefficient data.


The program stream, used by the DVD and SVCD standards, is designed for use in relatively error-free environments. It consists of one or more PES packets multiplexed together and coded with data that allows them to be decoded in synchronization. Program stream packets may be variable in length, and relatively long.

Data for each pack consists of a pack header followed by an optional system header and one or more PES packets.
The program stream map (PSM) provides a description of the bitstreams in the program stream, and their relationship to one another. It is present as PES packet data if stream_ID = program stream map.

A transport stream combines one or more programs, with one or more independent time bases, into a single stream. Each program in a transport stream may have its own time base. The time bases of different programs within a transport stream may be different.
The transport stream consists of one or more 188-byte packets. The data for each packet is from PES packets, PSI (Program Specific Information) sections, stuffing bytes, or private data.
At the start of each packet is a Packet IDentifier (PID) that enables the decoder to determine what to do with the packet.
Data for each packet consists of a packet header followed by an optional adaptation field and/or one or more data packets.


  • MPEG-4
MPEG-4 visual is divided into two sections.
MPEG-4 Part 2 includes the original MPEG-4 video codecs discussed in this section. MPEG-4 Part 10 specifies the "advanced video codec" also known as H.264, and is discussed at the end of this chapter.
Like H.263 and MPEG-2, the MPEG-4 Part 2 video codecs are also macroblock, block and DCT-based.

Instead of the video "frames" or "pictures" used in earlier MPEG specifications, MPEG-4 uses natural and synthetic visual objects.
Instances of video objects at a given time are called visual object planes (VOPs).

MPEG-4 Part 2 supports many visual profiles and levels. Only the natural visual profiles are currently of interest in the marketplace.

Visual layers: (from top to bottom)
VS -> VO -> VOL -> GOV -> VOP
An MPEG-4 visual scene consists of one or more video objects.
Each video object may have one or more layers to support temporal or spatial scalable coding.

Each video object can be encoded in scalable (multi-layer) or nonscalable form (single layer), depending on the application, represented by the video object layer (VOL).

Video object planes can be grouped together to form a group of video object planes.

  • H.264
Rather than a single major advancement, H.264 employs many new tools designed to improve performance. These include:
- Support for 8-, 10- and 12-bit 4:2:2 and 4:4:4 YCbCr
- Integer transform
- UVLC, CAVLC and CABAC entropy coding
- Multiple reference frames
- Intra prediction
- In-loop de-blocking filter
- SP and SI slices
- Many new error resilience tools

H.264 originally supported three profiles. The Baseline profile is designed for progressive video. Its tools include:
- I and P slice types
- 1/4-pixel motion compensation
- UVLC and CAVLC entropy coding
- Arbitrary slice ordering
- Flexible macroblock ordering
- Redundant slices
- 4:2:0 YCbCr format
Main profile is designed for a wide range of broadcast applications. Additional tools over baseline profile include:
- Interlaced pictures
- B slice type
- CABAC entropy coding
- Weighted prediction
- 4:2:2 and 4:4:4 YCbCr, 10- and 12-bit formats
- Arbitrary slice ordering not supported
- Flexible macroblock ordering not supported
- Redundant slices not supported
Extended profile is designed for mobile and Internet streaming applications. Additional tools over baseline profile include:
- B, SP and SI slice types
- Slice data partitioning
- Weighted prediction

H.264 uses the YCbCr color space, supporting 4:2:0, 4:2:2 and 4:4:4 sampling.
With H.264, the partitioning of the 16x16 macroblocks has been extended. Such fine granularity leads to a potentially large number of motion vectors per macroblock (up to 32) and of blocks that must be interpolated (up to 96).

H.264 adds an in-loop de-blocking filter. It removes artifacts resulting from adjacent macroblocks having different estimation types and/or different quantizer scales.

The slice has greater importance in H.264 since it is now the basic independent spatial element. This prevents an error in one slice from affecting other slices.

When motion estimation is not efficient, intra prediction can be used to eliminate spatial redundancies. This technique attempts to predict the current block based on adjacent blocks. The difference between the predicted block and the actual block is then coded. This tool is very useful in flat backgrounds where spatial redundancies often exist.

H.264 adds support for multiple reference frames. This increases compression by improving the prediction process, and increases error resilience by allowing another reference frame to be used in the event that one is lost.

H.264 uses a simple 4x4 integer transform. An additional 2x2 transform is applied to the four CbCr DC coefficients. Intra-16x16 macroblocks have an additional 4x4 transform performed on the sixteen Y DC coefficients.

For everything but the transform coefficients, H.264 uses a single Universal VLC (UVLC) table based on an infinite-extent codeword set (Exponential Golomb).

For transform coefficients, which consume most of the bandwidth, H.264 uses Context Adaptive Variable Length Coding (CAVLC).  Based upon previously processed data, the best VLC table is selected.

Additional efficiency (5-10%) may be achieved by using Context Adaptive Binary Arithmetic Coding (CABAC). CABAC continually updates the statistics of incoming data and real-time adaptively adjusts the algorithm using a process called context modeling.


NAL
The NAL facilitates mapping H.264 data to a variety of transport layers including:
- RTP/IP for wired and wireless Internet services
- File formats such as MP4
- H.32X for conferencing
- MPEG-2 systems
The data is organized into NAL units, packets that contain an integer number of bytes.
The first byte of each NAL unit indicates the payload data type and the remaining bytes contain the payload data. The payload data may be interleaved with additional data to prevent a start code prefix from being accidentally generated.
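That first NAL unit byte packs three fields: a forbidden zero bit, a 2-bit nal_ref_idc and a 5-bit nal_unit_type. A minimal sketch of unpacking it:

```python
# Decode the first byte of an H.264 NAL unit.
def parse_nal_byte(b: int) -> dict:
    return {
        "forbidden_zero_bit": (b >> 7) & 0x1,   # must be 0 in a valid stream
        "nal_ref_idc":        (b >> 5) & 0x3,   # reference importance
        "nal_unit_type":      b & 0x1F,         # payload data type
    }

# Example: 0x67 = 0b0110_0111 -> ref_idc 3, type 7 (sequence parameter set).
info = parse_nal_byte(0x67)
```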

Monday, May 10, 2010

DVB related

excerpted from http://en.wikipedia.org/wiki/Digital_Video_Broadcasting etc.

DVB
Digital Video Broadcasting (DVB) is a suite of internationally accepted open standards for digital television. DVB standards are maintained by the DVB Project, an international industry consortium with more than 270 members, and they are published by a Joint Technical Committee (JTC) of European Telecommunications Standards Institute (ETSI), European Committee for Electrotechnical Standardization (CENELEC) and European Broadcasting Union (EBU). The interaction of the DVB sub-standards is described in the DVB Cookbook.[1] Many aspects of DVB are patented, including elements of the MPEG video coding and audio coding.

DVB systems distribute data using a variety of approaches, including by satellite (DVB-S, DVB-S2 and DVB-SH; also DVB-SMATV for distribution via SMATV); cable (DVB-C, DVB-C2); terrestrial television (DVB-T, DVB-T2) and digital terrestrial television for handhelds (DVB-H, DVB-SH); and via microwave using DTT (DVB-MT), MMDS (DVB-MC) and/or MVDS standards (DVB-MS).

These standards define the physical layer and data link layer of the distribution system. Devices interact with the physical layer via a synchronous parallel interface (SPI), synchronous serial interface (SSI), or asynchronous serial interface (ASI). All data is transmitted in MPEG transport streams with some additional constraints (DVB-MPEG). A standard for temporally-compressed distribution to mobile devices (DVB-H) was published in November 2004.

The conditional access system (DVB-CA) defines a Common Scrambling Algorithm (DVB-CSA) and a physical Common Interface (DVB-CI) for accessing scrambled content. DVB-CA providers develop their wholly proprietary conditional access systems with reference to these specifications. Multiple simultaneous CA systems can be assigned to a scrambled DVB program stream providing operational and commercial flexibility for the service provider.

ASI
ASI is a streaming data format which often carries an MPEG Transport Stream (MPEG-TS).
The DVB-ASI interface has become popular for use with infrastructure equipment. DVB-ASI is a fixed-frequency serial interface with a clock rate of 270 Mbps that transmits MPEG-2 data. The physical layer is based upon a subset of Fibre Channel levels (FC-0 and FC-1), and makes use of that standard's 8B/10B channel coding.
An ASI signal can carry one or multiple SD, HD or audio programs that are already compressed, not like an uncompressed SDI.
An ASI signal can be at varying transmission speeds and is completely dependent on the user's setup requirements. Generally, the ASI signal is the final product of video compression, either MPEG2 or MPEG4.
DVB-ASI interfaces must support 188-byte MPEG packets and optionally may support 204-byte packets with either 16 Reed-Solomon (RS) error correction bytes or 16 dummy bytes.
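As a sketch of the packet framing described above (function and variable names are my own), a receiver can align on the 0x47 sync byte every 188 (or 204) bytes and pull the three flag bits and the 13-bit PID out of the 4-byte MPEG-2 TS header:

```python
SYNC_BYTE = 0x47
PACKET_LEN = 188  # 204 when 16 RS or dummy bytes are appended

def parse_ts_header(packet: bytes):
    """Parse the 4-byte MPEG-2 TS header (a sketch, not a full demux)."""
    if len(packet) < 4 or packet[0] != SYNC_BYTE:
        raise ValueError("not aligned on a TS packet")
    transport_error = bool(packet[1] & 0x80)
    payload_start   = bool(packet[1] & 0x40)  # payload_unit_start_indicator
    priority        = bool(packet[1] & 0x20)
    pid = ((packet[1] & 0x1F) << 8) | packet[2]   # 13-bit Packet Identifier
    return transport_error, payload_start, priority, pid

# A null packet (PID 0x1FFF) that receivers silently ignore:
null_pkt = bytes([0x47, 0x1F, 0xFF, 0x10]) + b"\xFF" * 184
print(parse_ts_header(null_pkt))  # (False, False, False, 8191)
```

A real demultiplexer would go on to interpret the scrambling and adaptation-field control bits and dispatch the 184-byte payload by PID; this only shows the header layout.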

SDI
Serial digital interface (SDI) is a serial link standardized by ITU-R BT.656 and the Society of Motion Picture and Television Engineers (SMPTE). SDI transmits uncompressed digital video over 75-ohm coaxial cable within studios, and is seen on most professional video infrastructure equipment.

Data is encoded in NRZI format, and a linear feedback shift register (LFSR) is used to scramble the data to reduce the likelihood that long strings of zeroes or ones will be present on the interface. The interface is self-synchronizing and self-clocking. Framing is done by detection of a special synchronization pattern, which appears on the (unscrambled) serial digital signal to be a sequence of ten ones followed by twenty zeroes (twenty ones followed by forty zeroes in HD); this bit pattern is not legal anywhere else within the data payload.

The first revision of the standard, SMPTE 259M, was defined to carry digital representation of analog video such as NTSC and PAL over a serial interface and is more popularly known as standard-definition (SD) SDI. The data rate required to transmit SD SDI is 270 Mbps.

With the advent of high-definition (HD) video standards such as 1080i and 720p, the interface was scaled to handle higher data rates of 1.485 Gbps. The 1.485-Gbps serial interface is commonly called the HD SDI interface and is defined by SMPTE 292M, using the same 75-ohm coaxial cable.

Studios and other video production facilities have invested heavily in a coaxial-cable hardware infrastructure and have a vested interest in extending its life. Fortunately, SMPTE ratified a newer standard, SMPTE 424M, that doubles the SDI data rate to 2.97 Gbit/s over the same 75-ohm coaxial cable. This standard, also called 3-Gbit/s (3G) SDI, enables the higher picture resolutions required for 1080p and digital cinema, and supports 4:4:4 at 2K resolution over a single BNC connection.
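A quick sanity check on the rates quoted above: each SDI generation serializes 10-bit words, so the serial bit rate is just ten times the word rate, and 3G-SDI doubles the HD rate:

```python
def sdi_bit_rate(word_rate_hz, bits_per_word=10):
    """Serial bit rate from the parallel word rate (SDI uses 10-bit words)."""
    return word_rate_hz * bits_per_word

sd = sdi_bit_rate(27e6)      # SMPTE 259M: 27 Mword/s -> 270 Mbit/s
hd = sdi_bit_rate(148.5e6)   # SMPTE 292M: 148.5 Mword/s -> 1.485 Gbit/s
g3 = 2 * hd                  # SMPTE 424M doubles the HD rate -> 2.97 Gbit/s
print(sd / 1e6, hd / 1e9, g3 / 1e9)  # 270.0 1.485 2.97
```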

Monday, May 3, 2010

TV Principles, Part 2

excerpted from http://courseware.ecnudec.com/zsb/zjx/zjx09/zjx093/zjx09307/zjx093070.HTM et al.
Formation of the RF television signal and channel allocation
0. Modulation basics
In analog modulation, the spectrum of the modulated wave contains, besides the carrier component, a frequency band on each side of the carrier frequency; the frequency components produced by modulation fall within these two bands, which are collectively called sidebands. The band above the carrier frequency is the upper sideband; the band below it is the lower sideband. In single-sideband communication, filtering, phase-shift, or combined phase-shift/filtering methods can extract one sideband of the AM wave; this modulation method is called single-sideband modulation (SSB), commonly used in wireline carrier telephony and multichannel short-wave radio. In coherent communication, a balanced modulator can realize double-sideband suppressed-carrier modulation (DSB-SC). In digital communication, to raise spectral efficiency, vestigial-sideband modulation (VSB) is used: one sideband is transmitted (somewhat attenuated in the portion adjacent to the carrier) together with a vestige of the other sideband; at demodulation the two compensate each other to yield the complete baseband.
1. Formation of the RF composite television signal in terrestrial broadcasting
Frequency bands. The video picture signal has a bandwidth of 6 MHz, so the bands used by terrestrial television must be chosen in the VHF/UHF range — metre and decimetre waves. The bands designated for broadcast television in China:
In terrestrial broadcasting, the picture signal is modulated by vestigial-sideband (VSB) amplitude modulation and the sound signal by frequency modulation. Because picture and sound use different modulation methods they do not interfere with each other, and the quality of the received sound is also rather high.
VSB amplitude modulation of the picture signal means transmitting one complete upper sideband plus a small part of the lower sideband, suppressing most of the rest of the lower sideband (the 0.75–1.25 MHz portion is retained). The Chinese standard specifies that the sound carrier fS is 6.5 MHz above the picture carrier fP, and that the minimum attenuation at −1.25 MHz from fP is 20 dB.
Advantages of the vestigial-sideband scheme: the modulated signal occupies a narrower band; the filter is easier to realize than a single-sideband (SSB) filter; demodulation is easy (a peak envelope detector suffices).
The sound signal in television broadcasting spans 50 Hz to 15 kHz. To improve the reception quality of the sound, the sound signal fed to the sound transmitter is frequency-modulated (FM) into a wideband signal. China specifies a maximum frequency deviation of 50 kHz for the modulated sound signal (75 kHz for FM radio), so the bandwidth B of the modulated sound signal is
B = 2(Δfm + fm) = 2(50 + 15) kHz = 130 kHz.
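The bandwidth figure above is Carson's rule; a one-line check (function name is my own):

```python
def carson_bandwidth(peak_deviation_hz, max_modulating_hz):
    """Carson's rule: B = 2 * (peak deviation + highest modulating frequency)."""
    return 2 * (peak_deviation_hz + max_modulating_hz)

# TV sound: 50 kHz deviation, 15 kHz audio -> 130 kHz
print(carson_bandwidth(50e3, 15e3))  # 130000.0
```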

Spectrum of the RF composite television signal and channel allocation
After the video picture signal and the sound signal respectively modulate the picture carrier (VSB AM) and the sound carrier (FM), the RF composite television signal is formed; its spectrum is shown in Fig. 3-20, where fp is the picture carrier and fs the sound carrier. The total bandwidth (channel bandwidth) is 8 MHz.
At 8 MHz intervals, China's television channels in the VHF and UHF bands total 68, listed in Table 3-1. Within these bands, the ranges 92–167 MHz and 566–606 MHz are reserved for FM radio and radiocommunication (FM radio uses 88–108 MHz, and television channels may be placed within the 88–92 MHz portion). No television channels are assigned there in over-the-air systems, but cable television systems often place supplementary channels in these gaps to increase the channel count.

2. Formation of the RF television signal in satellite broadcasting
Satellite television uses the transponders of communication/broadcast satellites to relay television signals for the many viewers within the coverage area to receive and watch; it covers the whole process of acquiring, storing, transmitting, receiving, and reproducing the television signal. Satellite television features wide coverage, high transmission quality, wide transmission bandwidth and large capacity, low cost, high efficiency, and no distance limitation.
Satellite broadcasting systems all use microwave bands:
· The microwave bands are very wide, with abundant spectrum resources: they can accommodate more channels, and each channel may occupy a wider bandwidth;
· Their high frequency and short wavelength greatly shrink the antennas on the satellite and on the ground while raising gain and directivity, reducing the satellite's size and weight, lowering the required transmit power, and preventing interference to neighbouring regions;
· The microwave bands are little affected by atmospheric disturbance noise;
· Microwaves can penetrate the ionosphere;
· The lower frequencies are already occupied by other radio services, while the microwave bands are comparatively "idle".

According to the relevant ITU regulations there are six satellite-broadcast downlink bands; the most heavily used at present are C band (downlink 3.7–4.2 GHz) and Ku band (downlink 11.7–12.2 GHz). Because band resources are limited, satellite downlinks use different polarizations to achieve frequency reuse.


Television transmitting antennas
Television transmitting antennas divide, by frequency band, into two main classes: VHF antennas and UHF antennas.
In the VHF band, the batwing antenna is widely adopted and satisfies the requirements on television transmitting antennas well. The batwing antenna is a turnstile (rotating-field) type.
In practice, stacked multi-bay batwing antennas are often used. A multi-bay batwing antenna's horizontal pattern is essentially the same as a single bay's — approximately a circle. The vertical pattern depends on the number of bays: the more bays, the sharper the vertical directivity, the higher the gain, and the stronger the far field.
In the UHF band, the widely adopted transmitting antennas are the four-dipole panel antenna with reflector and its improved double-loop variant; they offer high gain and wide bandwidth, but at a higher cost.

Television signals lie in the VHF/UHF range — high frequency, short wavelength — and propagate mainly as space waves along straight lines to directly visible locations, i.e. line-of-sight propagation.
Television signals reflect off the ground and off obstacles (large buildings and so on); the direct and reflected signals interfere at the receiving antenna, creating multipath propagation, which shows up on screen as ghosting (trailing ghosts to the right).

A television translator (short for frequency-offset television repeater) receives a channel of the main (backbone) station's programming, frequency-converts and amplifies it, and retransmits it on another channel, thereby extending the main station's coverage or service area.

Reception of satellite broadcast television signals
The receiving antenna is generally a parabolic dish. A parabolic antenna has high gain and strong directivity, with the received signal strongest at the focus; it is usually made of one-piece formed aluminium alloy. For C band, the dish diameter should be no less than 1.3 m.
The outdoor unit is essentially a low-noise block (LNB). To minimize noise, the LNB should be mounted as close to the antenna feed as possible and connected to it directly; to reduce transmission loss, a down-conversion scheme is generally used: the microwave signal is converted down to an intermediate frequency and then carried over cable. The outdoor unit therefore generally consists of a low-noise RF amplifier, a downconverter (with local oscillator), and an IF amplifier (or IF preamplifier).
The indoor unit is the control centre and signal-processing equipment of the satellite receiver. Its main job is to select the desired channel from the signal delivered by the outdoor unit and demodulate it into picture and sound signals for a monitor, or to remodulate the picture and sound onto some terrestrial broadcast channel for an ordinary television receiver. The indoor unit also feeds DC supply and control voltages to the outdoor unit.
Depending on whether the indoor unit contains a frequency converter, satellite receivers are classed as single-conversion or double-conversion. Single conversion can receive only one channel. Double conversion keeps the outdoor unit's local oscillator fixed and selects channels by changing the second local oscillator, so one outdoor unit can feed several indoor units; it also makes automatic frequency control and image-frequency rejection easier. Most receivers today therefore use double conversion.

Thursday, April 29, 2010

TV Principles




Colour temperature and standard illuminants:
(1). Ordinary light sources — the sun, fluorescent lamps, incandescent bulbs — all emit what is loosely called white light, but because the emitting materials differ, their spectral compositions differ greatly, and the same object illuminated by each looks noticeably different: an incandescent bulb is orange-red (warm), a mercury lamp blue-green (cool). To compare and distinguish the characteristics of light sources, the International Commission on Illumination (CIE) defined several standard white illuminants — A, B, C, D, E — characterized by the basic parameter "colour temperature".
(2). The concept of colour temperature:
A. Colour temperature is defined by the temperature to which a black body is heated.

A black body is an object that neither reflects nor transmits light but absorbs all incident light; the spectrum of its electromagnetic radiation when heated is determined solely by its temperature.

As temperature rises, the energy radiated by a black body increases and its power spectrum shifts toward shorter wavelengths, so with rising temperature not only does brightness grow but the colour of the emitted light changes too. The black-body temperature can therefore be used to characterize colour temperature, distinguishing the spectral distributions and colours of different sources.

B. Definition of colour temperature:
When a black body at a particular absolute temperature radiates a spectrum with the same characteristics as that of a given light source, that temperature is defined as the source's colour temperature, expressed in kelvin (K).

For example, the white light of a tungsten bulb held at 2800 K has essentially the same power spectrum as the radiation of a black body at 2854 K, so the colour temperature of that white light is said to be 2854 K.

C. Colour temperature is not the actual temperature of the source; it is a parameter characterizing the source's spectral properties.
(3). Standard white illuminants:
The spectral distributions of the standard white illuminants are shown in the figure below.


Illuminant A: equivalent to the light of a 2800 K tungsten lamp; colour temperature 2854 K. Its spectral energy is concentrated at longer wavelengths, so illuminant A always looks somewhat orange-red.
Illuminant B: equivalent to direct noon sunlight; colour temperature 4800 K. In the laboratory it can be obtained from illuminant A through a special filter.
Illuminant C: equivalent to daytime natural light; colour temperature 6800 K. Its spectrum is stronger around 400–500 nm, so it is bluish. It was chosen as the standard white for the NTSC colour television system.
Illuminant D: equivalent to average daylight; since its colour temperature is 6500 K it is also called D65. It was chosen as the standard white for the PAL colour television system.
Illuminant E: an ideal equal-energy white source; colour temperature 5500 K.
2.2 The luminance signal and colour-difference signals
To transmit colour pictures, and for compatibility, a colour television system transmits a luminance signal, denoted Y, that reflects only the brightness of the picture and has the same characteristics as the black-and-white television signal. Chrominance information, often denoted F, must also be transmitted. By the trichromatic principle, information reflecting the three primaries R, G, B must be conveyed. The luminance equation Y = 0.30R + 0.59G + 0.11B tells us that of the four variables Y, R, G, B only three are independent, so it suffices to transmit, along with Y, any two of the three primaries. (Note: here the luminance signal Y and the primary signals R, G, B denote the electrical signals after photoelectric conversion.)
Because every primary signal contains luminance information, transmitting the primaries directly would duplicate the luminance already carried by the transmitted Y (the sum of the primaries' luminances) in the two chosen primaries, causing severe mutual interference between the primaries and the luminance. Chrominance is therefore usually conveyed by signals that carry no luminance information — for example the colour-difference signals (R−Y), (G−Y) and (B−Y) obtained by subtracting the luminance signal from the primary signals — two of which are chosen to represent the chrominance. A colour television system thus transmits one luminance signal and two colour-difference signals to convey a colour picture.

2.2.1 The relation of luminance and colour difference to R, G, B
From the luminance equation: Y = 0.30R + 0.59G + 0.11B (2-1)
the colour-difference signals follow:
R−Y = R − (0.30R + 0.59G + 0.11B) = 0.70R − 0.59G − 0.11B (2-2a)
G−Y = G − (0.30R + 0.59G + 0.11B) = −0.30R + 0.41G − 0.11B (2-2b)
B−Y = B − (0.30R + 0.59G + 0.11B) = −0.30R − 0.59G + 0.89B (2-2c)
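Equations (2-1) through (2-2c) are easy to check numerically; a sketch using the coefficients above (the function name is my own):

```python
def luma_and_diffs(r, g, b):
    """Y and the three colour-difference signals from gamma-corrected R, G, B."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    return y, r - y, g - y, b - y

# Pure red: Y = 0.30, R-Y = 0.70, G-Y = -0.30, B-Y = -0.30, matching (2-2a)
print(luma_and_diffs(1.0, 0.0, 0.0))
```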

2.2.2 Waveforms and properties of standard colour-bar luminance and colour-difference signals
The standard colour-bar signal is a test signal produced by a colour-bar generator. It is an electrically generated simulation of the photoelectric output of a colour camera, commonly used to test and align the transmission characteristics of a colour television system.
The standard colour-bar signal consists of eight vertical bars — the three primaries, the three complementary colours, white, and black — arranged in order of decreasing luminance. The bar waveform is built, within one period, from three ideal square waves whose widths double successively, forming the three primary signals. Several variants exist, such as the "100% amplitude, 100% saturation" bars, in which the white bar's level is 1, the black bar's level is 0, and each primary signal's level is either 1 or 0.

The chrominance amplitude of such bars is large, however, and superimposed on the luminance signal it would make the dynamic range excessive and cause distortion. China therefore specifies the 75% amplitude, 100% saturation signal as the standard test signal.



Standard colour bars can also be named by another convention using four numbers,
for example the 100-0-100-0 bars or the 100-0-75-0 bars. All four numbers refer to gamma-corrected signals. Each number gives the percentage amplitude of the corresponding bars' primary signals, referenced to the amplitude of any one primary composing the white bar.
The first and second numbers give the maximum and minimum of the R, G, B composing the achromatic bars (white and black);
the third and fourth numbers give the maximum and minimum of the R, G, B composing the coloured bars.
For example, taking the amplitude of a white-bar primary as 1, the 100-0-75-0 bars have: a white bar signal of 1; a black bar signal of 0; and coloured bar signals with maximum value 0.75 and minimum value 0.
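As an illustration of the 100-0-75-0 convention (the bar order is the standard white-yellow-cyan-green-magenta-red-blue-black sequence; names are my own), the bar levels can be generated and checked for decreasing luminance:

```python
def colour_bars(white=1.00, coloured=0.75):
    """R, G, B levels for the 8 bars of a 100-0-75-0 colour-bar signal."""
    on = coloured  # each primary in a coloured bar is either 0.75 or 0
    return {
        "white":   (white, white, white),
        "yellow":  (on, on, 0.0),
        "cyan":    (0.0, on, on),
        "green":   (0.0, on, 0.0),
        "magenta": (on, 0.0, on),
        "red":     (on, 0.0, 0.0),
        "blue":    (0.0, 0.0, on),
        "black":   (0.0, 0.0, 0.0),
    }

# The bars really are ordered by decreasing luminance Y = 0.30R + 0.59G + 0.11B
lumas = [0.30 * r + 0.59 * g + 0.11 * b for r, g, b in colour_bars().values()]
assert lumas == sorted(lumas, reverse=True)
```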
For transmit/receive synchronization, the television transmitter inserts a line sync pulse each time a line has been scanned and a field sync pulse each time a field has been scanned. They are transmitted during the line and field retrace intervals respectively, and each is narrower than the corresponding retrace time. The figure below shows the line and field sync signals; together they are usually called the composite sync signal.
The Chinese television standard specifies a line sync pulse width of 4.7 µs, with its leading edge lagging the leading edge of the line blanking signal by about 1.3 µs, and a field sync pulse width of 160 µs (2.5 line periods). The receiver must first separate these line and field sync pulses and use them to control the period and phase of the line and field sawtooth scanning currents. In other words, line and field retrace begin only when the corresponding sync pulse arrives; this guarantees that the scanning currents at transmitter and receiver agree in both frequency and phase, i.e. that they stay synchronized.
Serration and equalizing pulses
In a television system, line sync information is extracted with a phase detector or a differentiating circuit acting on the leading edge of the line sync pulse.
Several slots are cut into the field sync pulse — the serration pulses — placed so that each slot's trailing (rising) edge falls exactly where the leading edge of a line sync pulse would have occurred. With the serrations added, line sync pulses can still be detected during the field sync pulse. The serration width equals that of the line sync pulse, 4.7 µs.
Because television generally uses interlaced scanning, the start and end positions of two adjacent fields differ. During line/field sync separation, each arriving line sync pulse charges the integrating capacitor once, and the capacitor discharges after the pulse has passed.
Before, after, and within the field sync pulse (during field blanking), an extra pulse is added every half line, so that when the field sync pulses of two adjacent fields reach the integrator, the integrating capacitor is charged to essentially the same voltage. To keep the average level from rising, these added pulses are halved in width (to 2.35 µs). The serrations in the field sync pulse also occur every half line, still 4.7 µs wide, with five serrations during field sync. The five 2.35 µs pulses before and after each field sync pulse are called the pre- and post-equalizing pulses.
Quadrature amplitude modulation
Two modulating signals amplitude-modulate two carriers of equal frequency whose phases differ by 90°; the two AM signals are then added vectorially (with no increase in bandwidth). This modulation scheme is called quadrature amplitude modulation. If the two modulating signals balanced-modulate the two quadrature carriers, the combined signal is a quadrature balanced-AM signal.
Colour television systems use quadrature balanced modulation to interleave the chrominance and luminance spectra: a single subcarrier suffices to transmit both colour-difference signals, and synchronous detection at the receiver readily separates the red-difference and blue-difference components.
To recover the two colour-difference signals from the composite colour signal, the chrominance signal must first be separated from the composite signal and fed to synchronous detectors, which exploit the phase difference between the two chrominance components FV and FU to demodulate the colour-difference signals.
A synchronous detector can be viewed as a switch controlled by the subcarrier. The switch closes when the subcarrier is at its positive maximum and stays open the rest of the time. Feed the chrominance signal F = U sin ωSCt + V cos ωSCt into the two synchronous-detector switches: when the component FU = U sin ωSCt reaches its maximum, the U switch closes, and at that instant the FV component is exactly zero, so the U component is extracted; V is demodulated in the same way. Because the subcarrier controlling the detector must be in phase with the chrominance being detected, this is called synchronous detection.
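The switching detector described above is equivalent to multiplying by the reference carrier and averaging over whole cycles; a numeric sketch (U and V values here are arbitrary, chosen for illustration):

```python
import math

N_CYCLES, OVERSAMPLE = 100, 64   # 100 subcarrier cycles, 64 samples per cycle
U, V = 0.3, -0.5                 # the two colour-difference amplitudes

n = N_CYCLES * OVERSAMPLE
phase = [2 * math.pi * k / OVERSAMPLE for k in range(n)]
# Quadrature-modulated chrominance: F = U sin(wt) + V cos(wt)
chroma = [U * math.sin(p) + V * math.cos(p) for p in phase]

# Synchronous detection: multiply by the in-phase reference and average;
# the quadrature term averages to zero, leaving U (and likewise V).
u_rec = 2 * sum(c * math.sin(p) for c, p in zip(chroma, phase)) / n
v_rec = 2 * sum(c * math.cos(p) for c, p in zip(chroma, phase)) / n
print(round(u_rec, 6), round(v_rec, 6))  # 0.3 -0.5
```

The factor of 2 compensates for the 1/2 average of sin² over a cycle; it plays the same role as the gain that follows the switch in the hardware detector.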
2.3.3 The colour burst
Synchronous detection requires a regenerated subcarrier of the same frequency and phase as the subcarrier used when the colour-difference signals were modulated. Since the subcarrier in the chrominance signal has been suppressed by the balanced modulators, a colour television receiver must contain a subcarrier regeneration circuit. To ensure the regenerated subcarrier matches the transmitter's in frequency and phase, the transmitter sends, along with the composite colour signal, a "colour burst" that conveys the frequency and phase of its subcarrier, so that the subcarrier regenerated in the receiver locks to the transmitter's. The colour burst is a short train of 8–12 subcarrier cycles (a sine-filled pulse) recurring with the line period; it sits on the back porch of line blanking, with its leading edge lagging the line sync leading edge by 5.6 µs, as shown in Fig. 2-12.

Monday, April 5, 2010

video format

excerpted from http://en.wikipedia.org/wiki/ITU-R_BT.656
ITU-R Recommendation BT.656, sometimes also called ITU656, describes a simple digital video protocol for streaming uncompressed PAL or NTSC Standard Definition TV (525 or 625 lines) signals. The protocol builds upon the 4:2:2 digital video encoding parameters defined in ITU-R Recommendation BT.601, which provides interlaced video data, streaming each field separately, and uses the YCbCr color space and a 13.5 MHz sampling frequency for pixels.

The standard can be implemented to transmit either 8-bit values (the standard in consumer electronics) or 10-bit values (sometimes used in studio environments). Both a parallel and a serial transmission format are defined. For the parallel format, a 25-pin Sub-D connector pinout and ECL logic levels are defined. The serial format can be transmitted over 75-ohm coaxial cable with BNC connectors, but there is also a fibre-optical version defined.

The parallel version of the ITU-R BT.656 protocol is also used in many TV sets between chips using CMOS logic levels. Typical applications include the interface between a PAL/NTSC decoder chip and a DAC integrated circuit for driving a CRT in a TV set.

Data format
A BT.656 data stream is a sequence of 8-bit or 10-bit words, transmitted at a rate of 27 Mbyte/s. Horizontal scan lines of video pixel data are delimited in the stream by 4-byte long SAV (Start of Active Video) and EAV (End of Active Video) code sequences. SAV codes also contain status bits indicating line position in a video field or frame. Line position in a full frame can be determined by tracking SAV status bits, allowing receivers to 'synchronize' with an incoming stream.

Individual pixels in a line are coded in YCbCr format. After an SAV code (4 bytes) is sent, the first 8 bits of Cb (chroma U) data are sent then 8 bits of Y (luma), followed by 8 bits of Cr (chroma V) for the next pixel and then 8 bits of Y. To reconstruct full resolution Y,Cb, Cr pixel values, chroma upsampling must be used.
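The Cb, Y, Cr, Y byte order described above is the familiar "UYVY" 4:2:2 packing; a minimal decoder sketch (names are my own) that expands each 4-byte group into two pixels sharing one chroma pair:

```python
def decode_uyvy(data: bytes):
    """Yield (Y, Cb, Cr) pixels from a BT.656 active-line payload (UYVY order).

    Each 4-byte group Cb, Y0, Cr, Y1 carries two pixels that share one chroma
    pair (4:2:2); full-resolution chroma would require upsampling, as noted.
    """
    for i in range(0, len(data) - 3, 4):
        cb, y0, cr, y1 = data[i:i + 4]
        yield (y0, cb, cr)
        yield (y1, cb, cr)

# One group: neutral chroma (128), Y ramping from black (16) to white (235)
pixels = list(decode_uyvy(bytes([128, 16, 128, 235])))
print(pixels)  # [(16, 128, 128), (235, 128, 128)]
```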

SMPTE 292M (http://en.wikipedia.org/wiki/SMPTE_292M)
SMPTE 292M is a standard published by SMPTE which expands upon SMPTE 259M and SMPTE 344M allowing for bit-rates of 1.485 Gbit/s, and 1.485/1.001 Gbit/s. These bit-rates are sufficient for and often used to transfer uncompressed High Definition video.[1]

This standard is usually referred to as HD-SDI; it is part of a family of standards that define a Serial Digital Interface based on a coaxial cable, intended to be used for transport of uncompressed digital video and audio in a television studio environment.

Wednesday, November 19, 2008

led vs ccfl

In the video field, NTSC is generally used as the yardstick for a display device's colour reproduction. The metric indicates, across the whole colour space, how saturated a colour the device can display — how vivid a blue, green, or red it can show. Conventional LCD televisions and monitors cover only 65%–75% of the NTSC gamut. Looking closely at their gamut curves, the green, yellow, and red regions fall well short of the standard. Conventional LCD televisions therefore have a narrow colour range and struggle to render natural scenes such as green grass or the sea faithfully.


An LCD does not emit light itself: it relies on a backlight shining through the display panel to present the image, so backlight technology directly affects an LCD television's picture quality. Conventional LCDs use cold cathode fluorescent lamps (CCFL) as the light source, and it is precisely the CCFL that makes LCD colours insufficiently rich and poorly reproduced. Sony's QUALIA 005 series of televisions, by contrast, extends the LCD gamut to 105% of the NTSC standard, essentially able to reproduce every natural scene we observe; the key is replacing the conventional CCFL backlight with an LED backlight.


The leading advantages of LED backlighting

As an LCD backlight, LED has — besides its wider gamut — many unique advantages over conventional backlight technology, summarized in ten points:

1) A wider gamut. LED colour rendition is stronger than CCFL's, nicely compensating for LCD technology's limited displayable colours and giving good colour reproduction;

2) A lifetime of up to 100,000 hours. Even at 10 hours of continuous use a day, that is 27 years — greatly extending an LCD television's service life and an overwhelming advantage over plasma;

3) A wide brightness adjustment range. LED power control is easy, with no minimum-brightness threshold as with CCFL, so whether outdoors in bright light or in a completely dark room, the user can easily set the display to the most comfortable brightness;

4) Clean motion rendering. Conventional CCFL tubes flicker at a fairly low rate, so dynamic scenes may judder. An LED backlight's flash rate is flexible and far higher than a CCFL's, so moving pictures are rendered cleanly;

5) Real-time colour management. With red, green, and blue emitting independently, the display's current colour characteristics are easy to control precisely;

6) Adjustable backlight white balance while preserving overall contrast. When the video source switches between a computer and a DVD player, white balance can easily move between 9600 K and 6500 K without sacrificing brightness or contrast;

7) A continuous area light source for large screens. The LED is a planar emitter whose basic unit is a square package 3–5 mm on a side; such units combine very easily into an area source of any given size with good brightness uniformity. As an LCD backlight it needs only very simple auxiliary optics, and screen uniformity is excellent;

8) Safety. LEDs use low-voltage 5–24 V supplies, which are very safe, and the power-module design is quite simple;

9) Environmental friendliness. An LED source produces no radiation and contains no mercury or other toxic substances — a genuinely green light source;

10) Shock resistance. The planar structure gives the LED a solid internal construction and excellent shock resistance.


Table 1. LED vs CCFL comparison
                      CCFL        LED
Gamut (vs NTSC)       72%         105%
Colour temperature    fixed       variable
Strobe drive          required    not required
Rise/fall time        500 ms      20 ns
Lifetime              50,000 h    100,000 h
Power consumption     110 W       415 W
Thickness             32.5 mm     45 mm
Lamp/LED count        16          455

(Data source: Samsung Electronics)

Now that LCDs have become the mainstream display, LED backlights, with their unique and overwhelming advantages, are gradually revealing great application prospects.



Current problems with LED backlights

Like many new technologies, LED backlighting has many attractive advantages, but for LEDs to dominate large-size LCD backlighting, several technical difficulties still need solving. The comparison in Table 1 already shows LED at a disadvantage in power consumption; beyond that there are problems of high cost and poor uniformity.

1) LED luminous efficacy is low, so an LED backlight consumes more power than a CCFL backlight of the same size. Current CCFL backlights mostly output 5000–7000 lm, with actual screen output above 300 lm — a figure most LED backlights still cannot match. However, a great many companies worldwide are working on this, LED efficacy is improving remarkably fast, and high-brightness LED backlights reaching 10,000 lm have already appeared, so maturity seems only a step away.

2) The cost is too high: for the same size, an LED backlight costs four times as much as a CCFL one. In today's fiercely price-competitive market this makes manufacturers hesitate; only Sony has pursued picture quality regardless of cost. Of course, as processes mature and production scales up, LED backlight costs will gradually fall.

3) Using LEDs as a backlight raises a white-point consistency problem, a disadvantage compared with CCFL.

4) The optical dot-pattern design is harder with LEDs than with the linear CCFL source, since the radial fall-off of LED intensity must be taken into account.

5) RGB LED backlights shift colour over time: the wavelength drifts with temperature, producing different colours.

Tuesday, October 7, 2008

Video interfaces and more

from zh.wikipedia.org
color space
YUV and YCbCr are colour-encoding methods.
YUV is a family of true-colour colour spaces: "Y" denotes luminance (luma), while "U" and "V" denote chrominance (chroma).
The scope of the terms Y'UV, YUV, YCbCr, YPbPr, etc., is sometimes ambiguous and overlapping. Historically, the terms YUV and Y'UV were used for a specific analog encoding of colour information in television systems, while YCbCr was used for digital encoding of colour information suited for video and still-image compression and transmission such as MPEG and JPEG. Today, the term YUV is commonly used in the computer industry to describe file formats that are encoded using YCbCr.

Television standards
PAL
PAL is a colour-encoding method used in broadcast television; the full name is Phase Alternating Line. Apart from North America and parts of East Asia, which use NTSC, and the Middle East, France, and Eastern Europe, which use SECAM, most of the world uses PAL. PAL was proposed in 1967 by Walter Bruch, who was then working for Telefunken. "PAL" is sometimes also used to refer to the 625-line, 25 frame/s, interlaced, PAL-colour-encoded television system.

PAL was invented to add a colour signal while remaining compatible with the existing black-and-white broadcast format. Its principle is close to NTSC's. "Phase alternating line" means that the colour signal of each scan line is phase-inverted relative to the previous line, which automatically corrects phase errors that may arise in transmission. Early PAL receivers had no special circuitry to correct phase errors, and severe errors could still be plainly visible. More recent sets average each line's colour signal with the next line's before display, which makes PAL's vertical colour resolution lower than NTSC's; but since the eye is less sensitive to colour than to brightness, this is not an obvious problem.

PAL itself refers to the colour system, and is frequently paired with a 625-line, 25 frame/s, interlaced television broadcast format.

NTSC
NTSC, also abbreviated to N, is the colour television broadcast standard adopted in December 1952 by the US National Television System Committee (NTSC); its two main variants are NTSC-J and NTSC-US (also known as NTSC-U/C).

It is a simultaneous system with a frame rate of 29.97 fps, 525 scan lines, interlaced scanning, a 4:3 aspect ratio, and a resolution of 720×480.

Resolutions
480p is a video display format. The letter p stands for progressive scan, and 480 is the vertical resolution: 480 horizontal scan lines, each with 640 pixels, at a 4:3 aspect ratio — the usual standard-definition television (SDTV) format. The frame rate is typically 30 Hz or 60 Hz.


1080p usually denotes a picture resolution of 1920×1080, i.e. what is generally called high-definition television (HDTV).

720p usually denotes a picture resolution of 1280×720, also commonly called HD.


Display connectors
The VGA connector (other names include the RGB connector, D-sub 15, or mini-D15) is a DE-15 with 15 pins in three rows. VGA connectors are usually found on computer graphics cards, monitors, and other equipment, and carry analog signals.

The AV (composite video) connector is the common connector that home audio/video equipment uses to carry analog video (NTSC, PAL, SECAM). It is usually a yellow RCA jack, paired with red and white RCA jacks carrying the audio. European television sets usually use a SCART connector instead of RCA; since SCART is designed to carry RGB signals of better quality than YUV, it is also used to connect monitors, game consoles, and DVD players. Professional applications often use BNC connectors for better signal quality.

Composite video carries the three components of the analog television signal — Y, U, and V — together with sync pulses. Y is the luminance (brightness) of the image and includes the sync pulses; with the Y signal alone a black-and-white picture can be displayed (in fact, this is how colour television stayed compatible with early black-and-white sets). The U and V signals carry the colour information: U and V are first combined into two quadrature phases of one signal (the combined signal is called chrominance), which is then summed with Y. Because Y is a baseband signal while U and V are mixed with a subcarrier, this summation amounts to frequency-division multiplexing.


S-Video, or the "separate video" connector — the S stands for "Separate" — is also called Y/C (and, erroneously, S-VHS or the "super connector"). It is an analog video signal that carries the video as two separate signals, luminance and chrominance, unlike composite video, which packs all the signals into one.

S-Video works at 480i or 576i resolution.

In S-Video, the luminance signal (Y; greyscale) and the modulated chrominance signal (C; colour) are carried on separate wires or wire groups.

In composite video, the luminance signal is low-pass filtered to prevent interference, because the high-frequency luminance information overlaps part of the chrominance signal. S-Video separates the two signals, so the luminance no longer needs to be low-pass filtered; it gets more bandwidth, and the overlap problem is solved, eliminating the dot-crawl interference.
Still, since the picture is split only into luminance and chrominance, S-Video is sometimes regarded as a kind of composite signal; in quality terms S-Video is the poorest of the component family, far below more elaborate component signals such as RGB, though sharper than the analog composite signal (CVBS).

S-Video signals today generally use a 4-pin mini-DIN connector, with 75-ohm termination required.

The component video connector carries the luminance, chrominance, and sync pulses of analog video separately.
Component video can be carried in several ways, for example straight RGB transmission of the three primaries, or luminance (Y) plus colour differences (Cb/Cr or Pb/Pr) converted from RGB. RGB treats all the colour information equally and gives the highest quality, but consumes too much transmission bandwidth and storage; to save bandwidth, the colour-difference form is the mainstream way to carry and record component video today.

The colour-difference design exploits the fact that the eye is more sensitive to luminance than to chrominance, so the colour information in the video is reduced. The conversion formulas are:
Luma: Y = 0.299R + 0.587G + 0.114B
Colour difference: Cb = 0.564(B − Y) = −0.169R − 0.331G + 0.500B
                   Cr = 0.713(R − Y) = 0.500R − 0.419G − 0.081B


The "colour difference" is thus the difference between a colour value and the luminance. After conversion, the amount of colour information is cut by roughly half, but thanks to the eye's characteristics the difference between the processed image and the original is hard to perceive. Compared with RGB, the final colour-difference data saves one third of the bandwidth.
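The conversion above is straightforward to implement; a sketch using the analog 0–1 form of the signals, before any offset or scaling for digital ranges (the function name is my own):

```python
def rgb_to_ycbcr(r, g, b):
    """Rec. 601 luma plus scaled colour differences, per the formulas above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr

# Greys carry no colour information: Cb and Cr vanish, leaving only luma
y, cb, cr = rgb_to_ycbcr(0.5, 0.5, 0.5)
```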


DVI, the Digital Visual Interface, is a video interface standard designed to improve the picture quality of personal-computer displays through digital transmission. It is now widely used on LCDs, digital projectors, and other display equipment. The standard was drawn up by the Digital Display Working Group (DDWG), a forum of several leading display-industry companies. A DVI interface can carry uncompressed digital video data to the display device, and the specification is partially compatible with the HDMI standard.

HDMI (High-Definition Multimedia Interface) is an all-digital video/audio interface that can carry uncompressed audio and video signals. HDMI gives all compatible devices — set-top boxes, DVD players, personal computers, game consoles, AV receivers, digital audio equipment, and television sets — a common data connection. Because audio and video travel over the same cable, system installation is greatly simplified.
HDMI supports all manner of television and computer video formats, including SDTV and HDTV pictures, plus multichannel digital audio. In transmission, the video data are encoded into data packets by the HDMI transceiver chips using Transition Minimized Differential Signaling (TMDS). When the specification was first drawn up, the maximum pixel rate was 165 Mpx/s, enough for 1080p at 60 frames per second or UXGA resolution (1600×1200); HDMI 1.3 later raised this to 340 Mpx/s to meet possible future needs.

HDMI also supports uncompressed 8-channel digital audio (192 kHz sampling rate, 24 bits/sample), any compressed audio stream such as Dolby Digital or DTS, and the 8-channel 1-bit DSD signal used by SACD. The HDMI 1.3 specification added support for very-high-rate lossless audio streams such as Dolby TrueHD and DTS-HD.


DisplayPort is a digital video interface standard promoted by the Video Electronics Standards Association (VESA), adopted in May 2006, with version 1.1 adopted on 2 April 2007. The interface is certification-free and royalty-free: a new digital audio/video interface under development, intended to replace the older computer-monitor and home-theatre interfaces.
Specifications
10.8 Gbit/s of bandwidth; a single cable supports a 2560×1600 high-resolution display.
8B/10B data encoding
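With 8B/10B coding, only 8 of every 10 transmitted bits are payload; a back-of-the-envelope check that 10.8 Gbit/s suffices for 2560×1600 (assuming 24 bit/px at 60 Hz and ignoring blanking overhead — my simplification, not a figure from the specification):

```python
link_rate = 10.8e9                 # bit/s on the wire
payload   = link_rate * 8 / 10     # 8B/10B: 8 data bits per 10 line bits

needed = 2560 * 1600 * 24 * 60     # active pixels * bit depth * refresh rate
print(payload / 1e9, needed / 1e9) # 8.64 vs ~5.9 Gbit/s: fits comfortably
```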

streaming

ITU601
ITU-R Recommendation BT.601, more commonly known by the abbreviations Rec. 601 or BT.601, or by its former name, CCIR 601, is a standard published by the CCIR (now ITU-R) for encoding interlaced analogue video signals in digital form. It includes methods of encoding 525-line 60 Hz and 625-line 50 Hz signals, both with 720 luminance samples and 360 chrominance samples per line. The colour encoding system is known as YUV 4:2:2, that being the ratio of Y:Cb:Cr samples (luminance data : blue chroma data : red chroma data). For a pair of pixels, the data are stored in the order Y1:Y2:Cb:Cr, with the chrominance samples co-sited with the first luminance sample.
The CCIR 601 signal can be regarded as if it is a digitally encoded analog component video signal, and thus includes data for the horizontal and vertical sync and blanking intervals. Regardless of the frame rate, the luminance sampling frequency is 13.5 MHz. The luminance sample is at least 8 bits, and the chrominance samples are at least 4 bits each.
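One convenience of the common 13.5 MHz luma clock is that it divides the line period of both scanning standards into an integer number of samples; a quick check, using each standard's line rate:

```python
LUMA_RATE = 13.5e6  # Hz, common to both the 525/60 and 625/50 variants

for name, line_rate in [("525/60", 525 * 30000 / 1001),  # NTSC-rate line freq
                        ("625/50", 625 * 25)]:           # PAL/SECAM line freq
    total = LUMA_RATE / line_rate   # samples per total line, incl. blanking
    print(name, round(total))       # 858 and 864; 720 of each are active
```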