Re: Linux Codec API — Video for Linux

> Which exact part should be cache optimized?  What about optimizations for 

You need to be aware of the cache at all times. The L1 cache is about 70
times faster than main memory, L2 about 9 times faster. 

> the most difficult.  If you really want to rip, use the SSE2 and MMX or 
> 3dNow! instruction sets and hand-code the expensive sections.  But that sort 
> of coding takes a while, and all I want is an interchangeable codec system in 
> place now so I can play around with it.   

Think of the bigger picture.

Let's take

	MPEG2->YUV->Filter->CU30

You don't want to

	convert the entire image into YUV format start->end, with the end
		kicking the start out of the cache
	filter from a cold cache
	do cu30 from a cold cache

You want to do

	foreach(tile)
		mpeg2 decode tile, filter tile, cu30 tile all in L1/L2
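
As a rough sketch of the difference between the two orderings, here is
some C. The stage functions are hypothetical stand-ins (simple per-byte
operations, not a real MPEG2 decoder or cu30 encoder), and the frame and
tile sizes are arbitrary assumptions. A tile here is a band of rows; real
codecs would tile on macroblock boundaries, and a filter that looks at
neighbouring pixels would need overlapping tiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes, for illustration only. */
enum { WIDTH = 64, HEIGHT = 48, TILE_ROWS = 8 };

/* Hypothetical stand-in for "mpeg2 decode": any per-byte transform. */
static void decode_stage(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)(in[i] ^ 0x5a);
}

/* Hypothetical stand-in for the filter, applied in place. */
static void filter_stage(uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (uint8_t)(buf[i] / 2 + 64);
}

/* Hypothetical stand-in for "cu30 encode". */
static void encode_stage(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)(255 - in[i]);
}

/* Stage at a time: each stage walks the whole frame, so on a large
 * frame the start has left the cache by the time the next stage runs. */
static void process_whole_frame(const uint8_t *src, uint8_t *dst)
{
    static uint8_t yuv[WIDTH * HEIGHT];

    decode_stage(src, yuv, sizeof yuv);
    filter_stage(yuv, sizeof yuv);
    encode_stage(yuv, dst, sizeof yuv);
}

/* Tile at a time: all three stages run on one small buffer that is
 * still hot in the cache when the next stage touches it. */
static void process_tiled(const uint8_t *src, uint8_t *dst)
{
    uint8_t tile[WIDTH * TILE_ROWS];

    assert(HEIGHT % TILE_ROWS == 0);
    for (int row = 0; row < HEIGHT; row += TILE_ROWS) {
        size_t off = (size_t)row * WIDTH;

        decode_stage(src + off, tile, sizeof tile);
        filter_stage(tile, sizeof tile);
        encode_stage(tile, dst + off, sizeof tile);
    }
}
```

Both functions produce the same output; only the order of memory
accesses differs. The tiled version also never needs the full-frame
intermediate buffer.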
	

> If you really want performance then you need to be writing directly to or 
> reading from video memory, and using the video card to do hardware blitting 

Out of date. On a modern AGP card you want the card to do the fetches,
clipping and overlaying. You certainly don't want to write to video RAM.

> Alan, it seems you are jumping to optimization before a system is even in 
> place.  Thanks for the heads up about xine, though.  I will look through it 
> tonight.

The danger is in designing a system that can't be made to go fast. Tiling is
a very well-established graphics technique.

> About half of my senior project would have been done already if we had direct 
> show in linux (Actually, that was about 99% of it).

Nice.

> system.  Data goes in one end of the filter and comes out another end of the 
> filter compressed or decompressed.  The scaling and conversions could just be 
> added stuff that looked (to an application) just like another codec (which 
> scaling could be though of as really simple compression/decompression).  Chris

Agreed.
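
To make the "everything looks like a codec" idea concrete, here is a
minimal sketch in C. None of this is an existing API: struct stage,
run_chain and the two toy stages (a run-length "decompressor" and a
drop-every-other-sample "scaler") are names invented for illustration,
and the 256-byte scratch buffers are an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical interface: every stage, codec or not, looks the same.
 * process() reads in_len bytes, writes at most out_cap bytes and
 * returns the number of bytes written. */
struct stage {
    const char *name;
    size_t (*process)(const uint8_t *in, size_t in_len,
                      uint8_t *out, size_t out_cap);
};

/* Toy "decompressor": expands (count, value) run-length pairs. */
static size_t rle_decode(const uint8_t *in, size_t in_len,
                         uint8_t *out, size_t out_cap)
{
    size_t n = 0;

    for (size_t i = 0; i + 1 < in_len; i += 2)
        for (uint8_t k = 0; k < in[i] && n < out_cap; k++)
            out[n++] = in[i + 1];
    return n;
}

/* Toy "scaler": keeps every second sample, behind the same signature. */
static size_t halve(const uint8_t *in, size_t in_len,
                    uint8_t *out, size_t out_cap)
{
    size_t n = 0;

    for (size_t i = 0; i < in_len && n < out_cap; i += 2)
        out[n++] = in[i];
    return n;
}

/* Runs the stages in order, ping-ponging between two scratch buffers.
 * The caller cannot tell which stages are codecs and which are not. */
static size_t run_chain(const struct stage *chain, size_t count,
                        const uint8_t *in, size_t in_len,
                        uint8_t *out, size_t out_cap)
{
    uint8_t bufs[2][256];
    const uint8_t *cur = in;
    size_t len = in_len;

    for (size_t s = 0; s < count; s++) {
        uint8_t *dst = bufs[s & 1];

        len = chain[s].process(cur, len, dst, sizeof bufs[0]);
        cur = dst;
    }
    assert(len <= out_cap);
    memcpy(out, cur, len);
    return len;
}
```

A chain of { rle_decode, halve } fed the bytes 4, 9, 2, 7 expands to
9 9 9 9 7 7 and then drops to 9 9 7. An interface like this would want
to pass tiles rather than whole frames between stages, for the cache
reasons above.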