Abstract
A method to both reduce energy and improve performance in a processor-based embedded system is described in this paper. Comprising of a scratchpad memory instead of an instruction cache, the target system dynamically (at runtime) copies into the scratchpad code segments that are determined to be beneficial (in terms of energy efficiency and/or speed) to execute from the scratchpad. We develop a heuristic algorithm to select such code segments based on a metric, called concomitance. Concomitance is derived from the temporal relationships of instructions. A hardware controller is designed and implemented for managing the scratchpad memory. Strategically placed custom instructions in the program inform the hardware controller when to copy instructions from the main memory to the scratchpad. A novel heuristic algorithm is implemented for determining locations within the program where to insert these custom instructions. For a set of realistic benchmarks, experimental results indicate the method uses 41.9% lower energy (on average) and improves performance by 40.0% (on average) when compared to a traditional cache system which is identical in size.