Justin Schoeman wrote:
>
> Benedict Bridgwater wrote:
> >
> > Justin Schoeman wrote: (regarding RGB/YUV conversion)
> > >
> > > I actually benchmarked these two versions a while ago, and it turns out
> > > that on a reasonably new PC (K6/PII or up), fixed point arithmetic is
> > > quite a bit faster than tables. It turns out that a fixed point
> > > multiply is a lot quicker than a memory access (especially when taking
> > > the heavy cache usage of image processing into account). On older PCs
> > > (Pentium/K5/etc) the table version is quicker.
> >
> > Justin, do you remember if this difference would show in a simple
> > conversion loop benchmark, or did it have to be in context of an actual
> > image conversion with corresponding cache usage?
> >
> > Ben
>
> I didn't test it in simple loops, but the difference will definitely be
> more marked with real images (due to cache contention for the look-up
> tables). As a rough estimate, I would expect the two routines to come
> out approximately equal on simple conversion loops (an L-1 cache hit is
> about as expensive as an integer multiply on newer CPUs). One area
> where tables can help is the final downscale and clamp operation.
> Conditional jumps based on (relatively) random data can be very hard on
> deeply pipelined CPUs, while a table lookup will usually hit L-1
> cache...

Thanks for the info - I'll redo my conversion code when I get a chance
(I don't want to use MMX since I want to keep it portable).

I was too lazy to use a lookup for the clamping since I couldn't be
bothered to figure out how much of a "guard band" was necessary outside
of the legal (clamped) range. Is there a simple way to figure this out
other than experimentally?

Ben
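
On the guard band question, one non-experimental approach is to bound the
fixed-point sum from the coefficients themselves: with 8-bit inputs, the
extreme values fall out of plain worst-case arithmetic. The sketch below is
only an illustration, not code from this thread. It assumes the common
BT.601 studio-range coefficients scaled by 256, and every name in it
(clamp_tab, descale_clamp, yuv_to_rgb, and so on) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    FRAC = 8,                      /* fractional bits in the coefficients   */
    HALF = 1 << (FRAC - 1),        /* rounding term added before the shift  */
    /* Assumed coefficients: BT.601 studio range, scaled by 2^FRAC. */
    CY = 298, CRV = 409, CGU = 100, CGV = 208, CBU = 516,
    CMAX = 516,                    /* max(CRV, CGU + CGV, CBU)              */
    /* Worst-case sums for any 8-bit Y, U, V (chroma offset is at most 128
     * in magnitude, so this slightly overestimates the positive side).    */
    MIN_SUM = CY * (0 - 16) - CMAX * 128,
    MAX_SUM = CY * (255 - 16) + CMAX * 128,
    /* Entries needed below 0, rounded up to a whole table step. */
    GUARD_LO = (-MIN_SUM + (1 << FRAC) - 1) >> FRAC,
    TABLE_SIZE = GUARD_LO + ((MAX_SUM + HALF) >> FRAC) + 1
};

static uint8_t clamp_tab[TABLE_SIZE];

/* Fill the table: entry i holds clamp(i - GUARD_LO, 0, 255). */
static void init_clamp_table(void)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        int v = i - GUARD_LO;
        clamp_tab[i] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
}

/* Round, downscale and clamp with one lookup and no branches.
 * The bias is added before the shift so the shifted value is never
 * negative (right-shifting a negative int is implementation-defined). */
static inline uint8_t descale_clamp(int sum)
{
    int idx = (sum + HALF + (GUARD_LO << FRAC)) >> FRAC;
    assert(idx >= 0 && idx < TABLE_SIZE);
    return clamp_tab[idx];
}

/* Convert one pixel; rgb[] receives R, G, B in that order. */
static void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
    int c = CY * (y - 16);
    int d = u - 128;
    int e = v - 128;
    rgb[0] = descale_clamp(c + CRV * e);
    rgb[1] = descale_clamp(c - CGU * d - CGV * e);
    rgb[2] = descale_clamp(c + CBU * d);
}
```

With these assumed coefficients the bounds work out to 277 entries below
zero and 814 entries in total, so the whole table stays well under 1 KB and
should sit comfortably in L-1 cache. Call init_clamp_table() once before
the first yuv_to_rgb(). Different coefficients or a different number of
fractional bits only change the enum constants; the bounds follow
automatically, provided CMAX is updated to match.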