Most computer architecture research involves investigating trade-offs between various alternatives. This cannot be done adequately without a firm grasp of the costs of each alternative. For example, it is impossible to compare two different cache organizations without considering the difference in access or cycle times. Similarly, the chip area and power requirements of each alternative must be taken into account. Only when all the costs are considered can an informed decision be made.
Unfortunately, it is often difficult to determine costs. One solution is to employ analytical models that predict costs based on various architectural parameters. In the cache domain, both chip area models [1] and access time models [2] have been published.
In [2], Wada et al. present an equation for the access time of an on-chip cache as a function of various cache parameters (cache size, associativity, block size) as well as organizational and process parameters. Unfortunately, Wada's access time model has a number of significant shortcomings. For example, the cache tag and comparator in set-associative memories are not modeled, and in practice, these often constitute the critical path. Each stage in their model (e.g., bitline, wordline) assumes that the inputs to the stage are step waveforms; actual waveforms in memories are far from steps, and this can greatly impact the delay of a stage. In the Wada model, all memory subarrays are stacked linearly in a single file; this can result in aspect ratios greater than 10:1 and overly pessimistic access times. Wada's decoder model is a gate-level model that contains no wiring parasitics. In addition, transistor sizes in Wada's model are fixed independent of the load. For example, the wordline driver is always the same size regardless of the number of cells that it drives. Finally, Wada's model predicts only the cache access time, whereas both the access and cycle times are important for design comparisons.
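The aspect-ratio problem noted above can be illustrated with a small sketch. The function names and the subarray dimensions (unit width and height) are assumptions made purely for illustration; they are not taken from Wada's model or from any actual cache layout. The sketch compares stacking all subarrays in a single file with the most nearly square rectangular grid that holds the same number of subarrays.

```python
def aspect_ratio(width, height):
    """Ratio of the longer side to the shorter side (always >= 1)."""
    return max(width, height) / min(width, height)


def linear_stack(n, w, h):
    """Aspect ratio when all n subarrays (each w x h) are stacked in a single file."""
    return aspect_ratio(w, n * h)


def best_grid(n, w, h):
    """Try every rows x cols factorization of n and return the
    (aspect_ratio, rows, cols) tuple with the smallest aspect ratio."""
    best = None
    for rows in range(1, n + 1):
        if n % rows != 0:
            continue  # only consider grids that hold exactly n subarrays
        cols = n // rows
        ar = aspect_ratio(cols * w, rows * h)
        if best is None or ar < best[0]:
            best = (ar, rows, cols)
    return best


if __name__ == "__main__":
    # Hypothetical example: 16 subarrays of unit width and height.
    print(linear_stack(16, 1.0, 1.0))  # single-file stacking
    print(best_grid(16, 1.0, 1.0))     # most nearly square arrangement
```

With these assumed dimensions, the single-file arrangement is sixteen times longer than it is wide, while a four-by-four grid is square. Because wire length along the long dimension contributes to delay, a highly elongated layout of this kind tends to yield pessimistic access-time estimates.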
This paper describes a significant improvement and extension of Wada's access time model. The enhanced model is called CACTI. Some of the new features are:
The enhancements to Wada's model can be classified into two categories. First, the assumed cache structure has been modified to more closely represent real caches. Examples of these enhancements are the column-multiplexed bitlines and the inclusion of the tag array. The second class of enhancements involves the modeling techniques used to estimate the delay of the assumed cache structure (e.g., taking into account non-step input rise times). This paper describes both classes of enhancements. After discussing the overall cache structure and model input parameters, the structural enhancements are described in Section 4. The modeling techniques used are then described in Section 5. A complete derivation of all the equations in the model is far beyond the scope of a journal paper but is available in a technical report [3].
Any model needs to be validated before the results generated using the model can be trusted. In [2], an Hspice model of the cache was used to validate the authors' analytical model. The same approach was used here; Section 6 compares the model predictions to Hspice measurements. Of course, this only shows that the analytical model matches the Hspice model; it does not address the issue of how well the assumed cache structure (and hence the Hspice model) reflects a real cache design. When designing a real cache, many different circuit implementations are possible. In architecture studies, however, the relative differences in access/cycle times between different cache sizes or configurations are usually more important than absolute access times. Thus, even though our model predicts absolute access times for a specific cache implementation, it can be used in a wide variety of situations in which an estimate of the cost of varying an architectural parameter is required. Section 7 gives some examples of how the model can be used in architectural studies.
The model described in this paper has been implemented, and the software is available via ftp. Appendix A explains how to obtain and use the CACTI software.