Gerd Knorr wrote:
>
> * new field handling, the current v4l2 spec is a bit confusing about
>   this.  There is now:
>
> enum v4l2_field {
>         V4L2_FIELD_ANY        = 0, /* driver can choose from none,
>                                       top, bottom, interlaced
>                                       depending on whatever it thinks
>                                       is approximate ... */
>         V4L2_FIELD_NONE       = 1, /* this device has no fields ... */
>         V4L2_FIELD_TOP        = 2, /* top field only */
>         V4L2_FIELD_BOTTOM     = 3, /* bottom field only */
>         V4L2_FIELD_INTERLACED = 4, /* both fields interlaced */
>         V4L2_FIELD_SEQUENTIAL = 5, /* both fields sequential into one
>                                       buffer */
>         V4L2_FIELD_ALTERNATE  = 6, /* both fields alternating into
>                                       separate buffers */
> };
>
> This is not complete. V4L2_FIELD_INTERLACED means what? top-field-first?
> bottom-field-first? I'd rather see one extra flag being added,
> V4L2_FIELD_INTERLACED_TOP_FIELD_FIRST and _BOTTOM_FIELD_FIRST, or
> whatever... For quite some devices (such as the ones supported by the
> zoran driver), the field order is programmable, and applications can use
> this.

Guess what, we discussed that. Here's an addendum.

There are two aspects, spatial and temporal order. Spatial order matters
when combining the fields into a frame: one field goes "on top" of the
other, hence the talk of top and bottom fields. Temporal order determines
which field has been transmitted or captured first; this is important for
motion analysis in deinterlacers and codecs.

We can have flags to indicate the field order, or, as Gerd suggested, the
spec can require a particular order. So interlaced and sequential mode must
store top first and newest first for NTSC-M, top first and oldest first
otherwise. Applications in need of this information will have to check the
current video standard. Alternating mode must store fields in the order
they were transmitted or captured. To determine the spatial order
v4l2_buffer already flags the field parity. Cropping will be restricted to
frame lines mod 2, because moving down one line swaps the fields, i.e. the
temporal order.
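Just to illustrate the rule above (this is not meant as spec text), an
application could determine the temporal order roughly like this, assuming
the VIDIOC_G_STD ioctl and the v4l2_std_id flags stay as in the current
draft header; the helper name is made up:

/*
 * Illustrative sketch only: infer the temporal field order for
 * interlaced/sequential captures from the current video standard,
 * following the rule above (newest field first for NTSC-M, oldest
 * field first otherwise).
 */
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* Returns 1 if the temporally newer field is stored first,
   0 if the older field is stored first, -1 on ioctl failure. */
static int
newest_field_first(int fd)
{
        v4l2_std_id std;

        if (ioctl(fd, VIDIOC_G_STD, &std) == -1)
                return -1;

        return (std & V4L2_STD_NTSC_M) ? 1 : 0;
}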
> what about the video output stuff btw.  Does anybody use that?

Of course. There are plenty of cards with video out or VGA-to-TV
conversion, and I know of at least one driver using this part of the API.
I think dropping output now, only to restore it later, perhaps not until
2.7, will be much regretted.

> * drop zoom stuff
>   use crop_* instead (which needs some review + exact specification)
> * The crop/scaling thing needs some work, Michael is busy with that.

Here's a brief explanation of how zoom works and why I think it should be
replaced.

For cropping and scaling a source and a target rectangle are assumed.
v4l2_zoomcap defines the scaling limits, the minimum and maximum width and
height of the source rectangle. The spec states that "maxwidth and
maxheight represent the total size of the raw device image", i.e. what you
normally get without zooming. The location of this rectangle over the
active video is undefined, and since the zoom ioctls are entirely optional
this doesn't answer the question which part of the picture capturing will
yield, or what the pixel aspect is.

v4l2_zoom defines the width and height of a "capture subrectangle". This is
the size of the source rectangle cropped out of the picture, then scaled up
to maxwidth and maxheight, then down to the v4l2_pix_format width and
height, figuratively speaking. The x and y fields define the "centre of
capture subrectangle in device co-ordinates", and "maxwidth / 2 is the
centre of the image". "Centre of the image" is not defined, but one can
assume the origin of x and y is the centre of the active video. It follows
this is also the centre of the v4l2_pix_format source and of the
v4l2_zoomcap minimum-maximum rectangle. The units of x, y, width and height
are intentionally not required to be frame lines or pixels or samples, but
only 1/maxwidth and 1/maxheight, since video hardware differs in
resolution.

Centre co-ordinates are ambiguous. When width and height are even, x and y
fall between four pixels and the image can be positioned exactly relative
to the centre of the active video. When the dimensions are odd, which the
API does not forbid, off-by-one errors can result. Centring on the raw
device image also prevents orthogonal cropping: for example there are more
scan lines above the active video centre than below, because of the
vertical blanking. Still we don't know anything about the sampling
frequency or vertical resolution, and thus the pixel aspect. This is vital
information to properly scale the image for display.

Given these flaws, the fact that apparently nobody uses v4l2 zoom, and that
two different ways of cropping and scaling unnecessarily complicate the
API, removing zoom is justified. It cannot even remain optional, because
every choice left open by the API forces applications and/or drivers to
implement both ways to remain compatible with their respective
counterparts. Let's keep it simple.

The cropping / scaling API assumes a source and a target rectangle. For
video capture drivers the source is the sampled picture and the target is
the captured or overlaid image. Output drivers reverse source and target;
the API is used accordingly.

Basically we assume the driver can capture within an arbitrary window. Its
bounds are defined in v4l2_cropcap, giving the co-ordinates of the top left
corner and its width and height. This is less ambiguous than the
co-ordinates of two opposite corners. The origin and units of the
co-ordinate system are arbitrary, possibly 13.5 MHz samples and frame
lines.

We assume scaling and cropping always happen, from an arbitrary source
rectangle within the capture window up or down to the target rectangle. The
source rectangle is defined by v4l2_crop, giving the co-ordinates of the
top left corner and its width and height, using the same co-ordinate system
as v4l2_cropcap. The target rectangle is given either by the
v4l2_pix_format width and height or by the v4l2_window x, y, width and
height. Scaling always happens in the sense that drivers not supporting it
still scale 1:1 in both directions. When scaling is supported but not
cropping, the source rectangle is fixed at the capture window size. When
cropping is supported but not scaling, the source rectangle width and
height must equal the target size.

struct v4l2_cropcap shall contain the default source rectangle size, given
in the same way as v4l2_crop gives it. This default source is supposed to
be centred over the active picture area. The spec will suggest particular
values unless the device, for example a video camera, requires a deviation.
The purpose is to align images captured with different devices. A new
addition defines the pixel aspect. The contents of v4l2_cropcap may change
with the video standard and perhaps other properties yet to be defined.

struct v4l2_crop and the VIDIOC_G_CROP and VIDIOC_S_CROP ioctls will be
mandatory only if the driver supports cropping. When cropping is not
supported both ioctls shall return -EINVAL. The application can query the
current source rectangle or request different dimensions. v4l2_cropcap and
VIDIOC_CROPCAP must be supported by scaling drivers, to calculate the scale
factor, and should be supported by all drivers, to query the pixel aspect.
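For illustration only (not spec text), an application that neither crops
nor scales could still call VIDIOC_CROPCAP to learn the pixel aspect and
correct the image for display. The sketch uses the v4l2_cropcap layout
proposed further down in this mail, not a released header, and the helper
name is made up:

/*
 * Illustrative sketch only.  Returns the width a captured image should
 * be given on a square pixel display.  pixel_aspect is y/x, so the
 * width is scaled by x/y, i.e. denominator / numerator.
 */
#include <string.h>
#include <sys/ioctl.h>

static unsigned int
square_pixel_width(int fd, unsigned int captured_width)
{
        struct v4l2_cropcap cropcap;

        memset(&cropcap, 0, sizeof(cropcap));

        if (ioctl(fd, VIDIOC_CROPCAP, &cropcap) == -1)
                return captured_width;  /* assume square pixels */

        return captured_width * cropcap.pixel_aspect.denominator
                / cropcap.pixel_aspect.numerator;
}

With the 13.5 MHz example below, a 704 pixel wide image and a pixel aspect
of 11/10 comes out as 704 * 10 / 11 = 640 square pixels wide.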
When cropping, the hardware may not permit arbitrary locations, sizes and
aspect ratios; VIDIOC_S_CROP must return the closest values possible. When
scaling, the hardware may not permit arbitrary scaling factors, perhaps
depending on the cropping parameters. To accommodate this, the spec will
require that the *opposite* rectangle is modified, i.e. the driver proposes
values closest to the previously requested dimensions, or to the hardcoded
default if nothing else.

Suppose the application wants to capture a particular area of the picture.
It may not get square pixels because the hardware does not support the
required sampling frequency or scaling, but the application may not care if
the image is squeezed as long as it gets the requested area. So the target
size is adjusted. On the other hand an application may want a particular
image size, say for MPEG encoding. When the driver cannot scale the image
to exactly the requested target size, the application does not care if the
image must be cropped or padded. So it asks for a target size and the
source rectangle is adjusted.

To determine both source and target size the application can request a
source size, then a target size, then check if the source is still ok. This
can be repeated until acceptable cropping and scaling parameters have been
negotiated; a sketch of such a loop follows after the structure definitions
below. Using a single structure containing both source and target
dimensions is not practical because the driver wouldn't know which value to
adjust to satisfy the others. Moreover the scheme fits nicely when cropping
and/or scaling are not supported. Without cropping the source rectangle is
fixed and only the target size will be modified. This is already required
by the spec for v4l2_pix_format, to return the closest values possible. One
could say the driver modified the source, found this won't work, and thus
changed it back to the default, accordingly modifying the target. When
scaling is not supported but cropping is, the reverse applies.

About the pixel aspect. The _picture_ aspect (like 4:3) is to be derived
from the video standard. Assuming the driver samples square pixels and the
default source rectangle size is 640 x 480, we can calculate the pixel
aspect (y/x) as:

  640 * 3/4 / 480 = 1/1

Other drivers sampling at, and only at, 13.5 MHz (ITU-R Rec. 601) may by
default capture a non-square pixel image of 720 x 480. It is important to
note that 720 non-square pixels cover a wider portion of the scan line than
640 square pixels do, so the aspect is not simply 720/640. I pondered
defining a "clean aperture" size, the size the image would have when
covering the same area as a square pixel image. In this case 704 x 480,
giving:

  704 * 3/4 / 480 = 11/10

Obviously we cannot take the capture window or default source window
dimensions as the clean aperture. The problem with this approach is that
PAL/SECAM pixel aspects cannot be expressed as clean aperture sizes using
integers. So before we resort to numerator/denominator pairs we can as well
store the pixel aspect directly.

struct v4l2_cropcap {
        __u32           bounds_left;
        __u32           bounds_top;
        __u32           bounds_width;
        __u32           bounds_height;
        __u32           default_left;
        __u32           default_top;
        __u32           default_width;
        __u32           default_height;
        struct {
                __u32   numerator;
                __u32   denominator;
        }               pixel_aspect;
};

Again, bounds_* (until someone suggests a better name) is the possible crop
area. default_* is the default crop rectangle (the source for capture
devices, the target for output devices), centred over the active picture.
pixel_aspect is the pixel aspect of the captured image when cropping to the
default rectangle and scaling 1:1. There was a capabilities field (can
crop, can scale up, can scale down, etc.) but this information is rather
useless; applications should just ask the driver how close it can get.

struct v4l2_crop {
        __u32           left;
        __u32           top;
        __u32           width;
        __u32           height;
};
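To make the negotiation scheme above concrete, here is a rough sketch of
such a loop, for illustration only. It uses the v4l2_crop layout proposed
above and assumes VIDIOC_S_FMT and struct v4l2_format stay as in the
current draft; the function name and the iteration limit are made up:

/*
 * Illustrative sketch only: negotiate cropping and scaling parameters.
 * The application requests a source rectangle, then a target size, then
 * checks whether the driver had to adjust the source again, repeating
 * until both sides agree.
 */
#include <sys/ioctl.h>

static int
negotiate_crop_and_format(int fd, struct v4l2_crop *crop,
                          struct v4l2_format *fmt)
{
        unsigned int i;

        for (i = 0; i < 5; i++) {       /* give up eventually */
                struct v4l2_crop granted;

                /* Request the source rectangle; the driver adjusts it
                   to the closest values it supports.  May fail with
                   EINVAL when cropping is not supported at all. */
                (void) ioctl(fd, VIDIOC_S_CROP, crop);

                granted = *crop;

                /* Request the target size; to satisfy scaling limits
                   the driver modifies the *opposite* rectangle, i.e.
                   the source, rather than the target. */
                if (ioctl(fd, VIDIOC_S_FMT, fmt) == -1)
                        return -1;

                /* Check whether the source rectangle is still ok. */
                if (ioctl(fd, VIDIOC_G_CROP, crop) == -1)
                        break;          /* no cropping, source fixed */

                if (crop->width == granted.width &&
                    crop->height == granted.height)
                        break;          /* source and target agree */
        }

        return 0;
}

A driver that supports neither cropping nor scaling drops out immediately:
VIDIOC_S_CROP and VIDIOC_G_CROP fail with EINVAL and VIDIOC_S_FMT returns
the closest fixed size, as already required for v4l2_pix_format.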
BTW we still lack a brilliant idea how to reliably associate v4l2 and audio
devices, mixer and pcm. I'm talking about video devices with audio sampling
like the bt878, not audio cables connected to the soundcard.

Michael