Engineering
56 results
Publication Search Results
-
(2006) Koh, Shannon; Diessel, Oliver
Conference Paper
On-going improvements in the scaling of FPGA device sizes and time-to-market pressures encourage the use of module-oriented design flows [3], while economic factors favour the reuse of smaller devices for high performance computational tasks. One of the core problems in proposing dynamic modular reconfiguration approaches is supporting the differing communications needs of the sequence of modules configured over time [2]. Proposals to date have not focussed on communications issues. Moreover, they have advocated the use of specific protocols [4], or they cannot be readily implemented [1], or they suffer from high overheads [5], or rely upon deprecated features such as tri-state lines [7]. In contrast, we propose a methodology for the rapid deployment of a communications infrastructure that provides the wires required by dynamic modules and allows users to implement the protocols they want. Our aim is to support new tiled dynamically reconfigurable architectures such as Virtex-4, as well as mature device families.
-
(2006) Malik, Usama; Diessel, Oliver
Conference Paper
In line with Shannon's ideas, we define the entropy of FPGA reconfiguration to be the amount of information needed to configure a given circuit onto a given device. We propose using entropy as a gauge of the maximum configuration compression that can be achieved and determine the entropy of a set of 24 benchmark circuits for the Virtex device family. We demonstrate that simple off-the-shelf compression techniques such as Golomb encoding and hierarchical vector compression achieve compression results that are within 1-10% of the theoretical bound. We present an enhanced configuration memory system based on the hierarchical vector compression technique that accelerates reconfiguration in proportion to the amount of compression achieved. The proposed system demands little additional chip area and can be clocked at the same rate as the Virtex configuration clock.
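To make the idea of an entropy bound concrete, the following is a minimal Python sketch, not the authors' method or tooling. It assumes a deliberately simplified model in which a configuration is a sparse bit vector whose bits are independent with a fixed probability of being 1. Under that assumption the Shannon bound is n times the binary entropy, and Golomb coding of the zero-run lengths can be compared against it. The function names, the independence model, and the parameter rule for `m` are illustrative assumptions and do not come from the paper.

```python
import math
import random


def bernoulli_entropy(p):
    """Shannon entropy, in bits per symbol, of a bit source with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def golomb_encode(n, m):
    """Golomb code of a non-negative integer n with parameter m >= 1,
    returned as a string of '0'/'1' characters."""
    q, r = divmod(n, m)
    bits = "1" * q + "0"            # quotient in unary, terminated by a 0
    if m == 1:
        return bits                 # no remainder bits are needed
    b = (m - 1).bit_length()        # b = ceil(log2(m))
    cutoff = (1 << b) - m           # remainders below cutoff use b - 1 bits
    if r < cutoff:
        bits += format(r, "b").zfill(b - 1)
    else:
        bits += format(r + cutoff, "b").zfill(b)
    return bits


def zero_runs(bits):
    """Lengths of the zero runs preceding each 1 (trailing zeros are dropped)."""
    runs, run = [], 0
    for bit in bits:
        if bit:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs


if __name__ == "__main__":
    # Hypothetical stand-in for configuration data: a sparse random bit vector.
    random.seed(0)
    p, n = 0.05, 100_000
    data = [1 if random.random() < p else 0 for _ in range(n)]

    # Standard parameter rule for geometrically distributed run lengths.
    m = math.ceil(-math.log(2 - p) / math.log(1 - p))

    coded_bits = sum(len(golomb_encode(run, m)) for run in zero_runs(data))
    bound_bits = n * bernoulli_entropy(p)
    print(f"m = {m}, Golomb: {coded_bits} bits, entropy bound: {bound_bits:.0f} bits")
```

Real configuration bitstreams are not independent bits, so the gap printed by this sketch says nothing about the 1-10% figure reported in the abstract; it only shows how a coded length can be measured against an entropy estimate.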
-
(2006) Koh, Lih; Diessel, Oliver
Conference Paper
Bypass delays are expected to grow beyond 1 ns as technology scales. These delays necessitate pipelining of bypass paths at processor frequencies above 1 GHz and thus affect the performance of sequential code sequences. We propose dealing with these delays through a dynamic functional unit chaining approach. We study the performance benefits of a superscalar, out-of-order processor augmented with a two-by-two array of ALUs interconnected by a fast, partial bypass network. An online profiler guides the automatic configuration of the network to accelerate specific patterns of dependent instructions. A detailed study of benchmark simulations demonstrates that these first steps towards mapping binaries to a small coarse-grained array at runtime can improve instruction throughput by over 18% and 25% when the microarchitecture includes bypass delays of one cycle and two cycles, respectively.
-
(2006) Koh, Shannon; Diessel, Oliver
Conference Paper
-
(2006) Altermatt, Pietro; Schenk, Andreas; Heiser, Gernot
Journal Article
A parametrization of the density of states (DOS) near the band edge of phosphorus-doped crystalline silicon is derived from photoluminescence and conductance measurements, using a recently developed theory of band gap narrowing. It is shown that the dopant band only "touches" the conduction band at the Mott (metal-insulator) transition and that it merges with the conduction band at considerably higher dopant densities. This resolves well-known contradictions between conclusions drawn from various measurement techniques. With the proposed DOS, incomplete ionization of phosphorus dopants is calculated and compared with measurements in the temperature range from 300 to 30 K. We conclude that (a) up to 25% of dopants are nonionized at room temperature near the Mott transition and (b) there exists no significant amount of incomplete ionization at dopant densities far above the Mott transition. In a forthcoming part II of this paper, equations of incomplete ionization will be derived that are suitable for implementation in device simulators. © 2006 American Institute of Physics.
-
(2006) Altermatt, Pietro; Schenk, Andreas; Schmithuesen, B; Heiser, Gernot
Journal Article
Building on Part I of this paper [Altermatt et al., J. Appl. Phys. 100, 113714 (2006)], the parametrization of the density of states and of incomplete ionization (ii) is extended to arsenic- and boron-doped crystalline silicon. The amount of ii is significantly larger in Si:As than in Si:P. Boron and phosphorus cause a similar amount of ii, although the boron energy level behaves distinctly differently from the phosphorus level as a function of dopant density. This is so because the boron ground state is fourfold degenerate, while the phosphorus ground state is twofold degenerate. Finally, equations of ii are derived that are suitable for implementation in device simulators. Simulations demonstrate that ii increases the current gain of bipolar transistors by up to 25% and that it decreases the open-circuit voltage of thin-film solar cells by up to 10 mV. The simulation model therefore improves the predictive capabilities of device modeling of p-n-junction devices.
-
(2006) Zhu, Liming; Gorton, Ian; Liu, Yan; Bui, Bao
Conference Paper
Web services solutions are being increasingly adopted in enterprise systems. However, ensuring the quality of service of Web services applications remains a costly and complicated performance engineering task. Some of the new challenges include limited controls over consumers of a service, unforeseeable operational scenarios and vastly different XML payloads. These challenges make existing manual performance analysis and benchmarking methods difficult to use effectively. This paper describes an approach for generating customized benchmark suites for Web services applications from a software architecture description following a Model Driven Architecture (MDA) approach. We have provided a performance-tailored version of the UML 2.0 Testing Profile so architects can model a flexible and reusable load testing architecture, including test data, in a standards compatible way. We extended our MDABench [27] tool to provide a Web service performance testing “cartridge” associated with the tailored testing profile. A load testing suite and automatic performance measurement infrastructure are generated using the new cartridge. Best practices in Web service testing are embodied in the cartridge and inherited by the generated code. This greatly reduces the effort needed for Web service performance benchmarking while being fully MDA compatible. We illustrate the approach using a case study on the Apache Axis platform.
-
(2006) Lee, Cathryn; Gaeta, Bruno; Malming, H; Bain, Michael; Sewell, William; Collins, Andrew
Journal Article
We have used a bioinformatics approach to evaluate the completeness and functionality of the reported human immunoglobulin heavy-chain IGHD gene repertoire. Using the hidden Markov-model-based iHMMune-align program, 1,080 relatively unmutated heavy-chain sequences were aligned against the reported repertoire. These alignments were compared with alignments to 1,639 more highly mutated sequences. Comparisons of the frequencies of gene utilization in the two databases, and analysis of features of aligned IGHD gene segments, including their length, the frequency with which they appear to mutate, and the frequency with which specific mutations were seen, were used to determine the reliability of alignments to the less commonly seen IGHD genes. Analysis demonstrates that IGHD4-23 and IGHD5-24, which have been reported to be open reading frames of uncertain functionality, are represented in the expressed gene repertoire; however, the functionality of IGHD6-25 must be questioned. Sequence similarities make the unequivocal identification of members of the IGHD1 gene family problematic, although all genes except IGHD1-14*01 appear to be functional. On the other hand, reported allelic variants of IGHD2-2 and of the IGHD3 gene family appear to be nonfunctional, very rare, or nonexistent. Analysis also suggests that the reported repertoire is relatively complete, although one new putative polymorphism (IGHD3-10*p03) was identified. This study therefore confirms a surprising lack of diversity in the available IGHD gene repertoire, and restriction of the germline sequence databases to the functional set described here will substantially improve the accuracy of IGHD gene alignments and therefore the accuracy of analysis of the V-D-J junction.
-
(2006) Bain, Michael; Ahsan, Nasir; Potter, John; Gaeta, Bruno; Temple, Mark; Dawes, Ian
Conference Paper
-
(2006) Janapsatya, Andhi; Ignjatovic, Aleksandar; Parameswaran, Sri
Journal Article
A method to both reduce energy and improve performance in a processor-based embedded system is described in this paper. Comprising a scratchpad memory instead of an instruction cache, the target system dynamically (at runtime) copies into the scratchpad code segments that are determined to be beneficial (in terms of energy efficiency and/or speed) to execute from the scratchpad. We develop a heuristic algorithm to select such code segments based on a metric called concomitance. Concomitance is derived from the temporal relationships of instructions. A hardware controller is designed and implemented for managing the scratchpad memory. Strategically placed custom instructions in the program inform the hardware controller when to copy instructions from the main memory to the scratchpad. A novel heuristic algorithm is implemented for determining the locations within the program at which to insert these custom instructions. For a set of realistic benchmarks, experimental results indicate the method uses 41.9% lower energy (on average) and improves performance by 40.0% (on average) when compared to a traditional cache system of identical size.