Abstract
This thesis presents methodologies for improving system performance and reducing energy consumption
by optimizing the memory hierarchy. The processor-memory performance
gap is a well-known problem that is predicted to keep widening.
The author describes a method to estimate
the best L1 cache configuration for a given application. In addition, three methods are presented
to improve performance and reduce energy consumption in embedded systems by optimizing
the instruction memory.
Performance estimation is essential both for assessing the performance of a system
and for evaluating the effectiveness of any applied optimizations. A cache memory performance
estimation methodology is presented in this thesis. The methodology is designed to
quickly and accurately estimate the performance of multiple cache memory configurations.
Experimental results show that the methodology is on average 45 times faster
than a widely used tool (Dinero IV).
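
To make concrete what is being estimated, the following is a minimal sketch of a naive baseline: a trace-driven simulation that is rerun once per cache configuration. It is not the methodology of this thesis, which is designed to avoid this per-configuration cost. The function name `miss_rate`, the address trace, and the three configurations are hypothetical, and LRU replacement is assumed.

```python
from collections import OrderedDict

def miss_rate(trace, cache_size, line_size, assoc):
    """Simulate an LRU set-associative cache; return misses / accesses."""
    if not trace:
        return 0.0
    num_sets = cache_size // (line_size * assoc)
    # One ordered dict per set; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // line_size
        lines = sets[block % num_sets]
        if block in lines:
            lines.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= assoc:
                lines.popitem(last=False)  # evict the least recently used line
            lines[block] = True
    return misses / len(trace)

# Hypothetical trace: a loop alternating between two addresses 1 KiB apart.
trace = [0x0000, 0x0400] * 100

# Hypothetical configurations: (cache size in bytes, line size in bytes, associativity).
configs = [(1024, 16, 1), (1024, 16, 2), (2048, 16, 1)]

results = {cfg: miss_rate(trace, *cfg) for cfg in configs}
best = min(results, key=results.get)  # first configuration with the lowest miss rate
```

In this toy trace the 1 KiB direct-mapped cache thrashes because both addresses map to the same line, while doubling either the associativity or the size removes the conflict. Each configuration requires a full pass over the trace, which is the cost a faster estimation method aims to reduce.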
The first optimization method, called code placement, is a software-only method
implemented to improve the performance of the instruction cache memory. The method involves
careful placement of code within memory to ensure a high cache hit rate when code
is brought into the cache memory. Experimental results show that applying
the code placement method reduces the cache miss rate by up to 71% and energy consumption
by up to 63% when compared to applications without code
placement.
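
The effect that code placement exploits can be illustrated with a minimal sketch, assuming a small direct-mapped instruction cache and two functions that call each other alternately. The function names, sizes, addresses, and cache parameters are all hypothetical, and the sketch only compares two hand-picked layouts; it does not reproduce the placement algorithm of this thesis.

```python
def fetch_trace(layout, call_sequence, instr_size=4):
    """Expand a call sequence into instruction fetch addresses.

    layout maps a function name to (base_address, size_in_bytes).
    """
    trace = []
    for name in call_sequence:
        base, size = layout[name]
        trace.extend(range(base, base + size, instr_size))
    return trace

def direct_mapped_misses(trace, cache_size=256, line_size=16):
    """Count misses in a direct-mapped cache for the given fetch trace."""
    num_lines = cache_size // line_size
    resident = [None] * num_lines  # memory block currently held by each line
    misses = 0
    for addr in trace:
        block = addr // line_size
        index = block % num_lines
        if resident[index] != block:
            resident[index] = block
            misses += 1
    return misses

# Hypothetical workload: f and g are executed alternately, 50 times each.
calls = ["f", "g"] * 50

# Layout A: f and g are 256 bytes apart, so they map to the same cache lines.
naive = {"f": (0x000, 128), "g": (0x100, 128)}
# Layout B: g is moved by 128 bytes, so it maps to the other half of the cache.
placed = {"f": (0x000, 128), "g": (0x180, 128)}

naive_misses = direct_mapped_misses(fetch_trace(naive, calls))
placed_misses = direct_mapped_misses(fetch_trace(placed, calls))
```

With layout A every call evicts the other function, so each of the 100 calls reloads all 8 of its lines. With layout B both functions stay resident after their first execution and only the initial 16 line fills miss.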
The second method involves a novel architecture for utilizing scratchpad memory. The
scratchpad memory is designed as a replacement for the instruction cache memory. A hardware
modification was designed to allow data to be written into the scratchpad memory
during program execution, allowing dynamic control of the scratchpad memory content.
Scratchpad memory has a faster memory access time and a lower energy consumption per
access than cache memory, so using it aims to improve performance
and lower energy consumption compared to a system with cache memory.
Experimental results show an average energy reduction of 26.59% and an average
performance improvement of 25.63% when compared to a system with cache memory.
The third method is an application profiling method that uses statistical information to identify
an application's hot-spots. Application profiling is important for identifying sections in the
application where performance degradation might occur and/or where maximum performance
gain can be obtained through optimization. The method was applied and tested on
the scratchpad-based system described in this thesis. Experimental results show that the
analysis method is effective in reducing energy and improving performance when
compared to the previous method for utilizing the scratchpad-memory-based system, with an average
performance improvement of 23.6% and an average energy reduction of 27.1%.
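
As a rough illustration of hot-spot identification, the sketch below counts how often each code block appears in an execution profile and greedily selects the most frequently executed blocks that fit into the scratchpad. This is an assumed, simplified stand-in for the statistical analysis of this thesis, not a description of it. The function `pick_hot_spots`, the block names, the block sizes, and the 256-byte capacity are hypothetical.

```python
from collections import Counter

def pick_hot_spots(executed_blocks, block_sizes, spm_capacity):
    """Select the most frequently executed blocks that fit in the scratchpad.

    executed_blocks: sequence of block names in execution order (the profile).
    block_sizes:     block name -> size in bytes.
    spm_capacity:    scratchpad capacity in bytes.
    Returns (chosen block names, bytes used).
    """
    counts = Counter(executed_blocks)
    # Hottest first; ties are broken by name so the result is deterministic.
    ranked = sorted(counts, key=lambda b: (-counts[b], b))
    chosen, used = [], 0
    for block in ranked:
        if used + block_sizes[block] <= spm_capacity:
            chosen.append(block)
            used += block_sizes[block]
    return chosen, used

# Hypothetical profile: two blocks dominate execution.
profile = ["init"] + ["loop_body", "filter"] * 200 + ["loop_body"] * 50 + ["exit"]
sizes = {"init": 96, "loop_body": 64, "filter": 160, "exit": 48}

chosen, used = pick_hot_spots(profile, sizes, spm_capacity=256)
```

Here `loop_body` and `filter` account for almost all executions and together occupy 224 bytes, so they are selected, while the rarely executed `init` and `exit` blocks no longer fit and stay in main memory.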