> You need to be cache optimised - you want to code each tile through
> multiple passes at a time to get best performance. On decode that's even
> more critical

Which exact part should be cache optimized? What about optimizations for
vector processing instructions?

I spoke of converting a YUV stream of images to MPEG-2. That is where you
will spend 99% of your processor time, so optimize that. Rather than
cache-optimizing the code the entire way through, it would make a lot more
sense to optimize only the parts that are the most expensive. If you
really want it to rip, use the SSE2, MMX, or 3DNow! instruction sets and
hand-code the expensive sections. But that sort of coding takes a while,
and all I want right now is an interchangeable codec system in place so I
can play around with it.

If you really want performance, then you need to be writing directly to
(or reading from) video memory, and using the video card to do hardware
blitting and scaling. But performance isn't the top priority; ease of use
and correctness come first. Alan, it seems you are jumping to optimization
before a system is even in place.

Thanks for the heads-up about xine, though. I will look through it
tonight. About half of my senior project would have been done already if
we had DirectShow in Linux (actually, that was about 99% of it).

Something else came to mind. Basically, v4l needs a user-space scaling and
image format conversion system, and this is where that could live. Most of
what I described before was (in the abstract sense) a filter system: data
goes in one end of the filter and comes out the other end compressed or
decompressed. Scaling and format conversion could just be added stages
that look (to an application) like any other codec -- scaling can be
thought of as really simple compression/decompression.

Chris