Modern Application Specific Instruction Set Processors (ASIPs) have customizable caches, where the size, associativity and line size can all be customized to suit a particular application. To find the best cache size suited for a particular embedded system, the applications) is/are executed, traces obtained, and caches simulated. Typically, program trace files can range from a few megabytes to several gigabytes. Simulation of cache performance using large program trace files is a time consuming process. In this paper, a novel instruction cache simulation methodology that can operate directly on a compressed program trace file without the need for decompression is presented. This feature allowed our simulation methodology to have an average speed up of 9.67 times compared to the existing state of the art tool (Dinero IV cache simulator), for a range of applications from the Mediabench suite. © 2007 EDAA.