CSE141 LAB#2
-
When the cache size is 4 bytes (one word), most of the reads should be
misses. For write accesses, the program never writes to the memory location
it read last, so with a single-word cache every write access should
be a write miss. The results are as expected.
| Accesses | Reads | Writes | Read Miss | Write Miss | Write Back | Dirty Left |
| 121118 | 84744 | 36374 | 75871 | 36374 | 36374 | 0 |
-
A cache of 16384 bytes can hold all of the data referenced by the program.
Therefore, there should be only one read miss, on the first read access.
Since all of the data fits in one cache line, we expect to see no write
misses and no write backs. The results are as expected.
| Accesses | Reads | Writes | Read Miss | Write Miss | Write Back | Dirty Left |
| 121118 | 84744 | 36374 | 1 | 0 | 0 | 1 |
-
In this case the cache can still hold all of the data; however, there are 16000/4
= 2000 cache lines to be filled by the program. This means we should see
2000 read misses and no write misses, since every datum is read before it is
written back.
Whoops! I found a mistake in the heap sort trace: some of the
references shouldn't be there. Therefore, the results come out a little
larger.
| Accesses | Reads | Writes | Read Miss | Write Miss | Write Back | Dirty Left |
| 121118 | 84744 | 36374 | 3353 | 0 | 0 | 1673 |
-
This part of the question asks for the results of your cache simulator
for heap sort with different cache parameters, in order to evaluate the
correctness of your cache simulator. If your results are not consistent with
the numbers below, your cache simulator most likely has a bug. Cache size is
2048 bytes.
| Block Size | Associativity | Read Miss | Write Miss | Write Back | Dirty Left |
| 16 | 1 | 7173 | 24 | 4710 | 118 |
| 64 | 2 | 5648 | 0 | 3756 | 19 |
| 64 | 32 | 5457 | 0 | 3601 | 20 |
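For readers checking their own simulator against these numbers, the following is a minimal sketch of the kind of cache model being tested. It assumes a write-back, write-allocate cache with LRU replacement; the lab's actual policies and trace format are not stated here, and the class name `Cache`, the `"r"`/`"w"` operation codes, and the tiny trace are hypothetical.

```python
from collections import OrderedDict


class Cache:
    """Write-back, write-allocate cache with LRU replacement (assumed policies)."""

    def __init__(self, cache_size, block_size, assoc):
        self.block_size = block_size
        self.assoc = assoc
        self.num_sets = cache_size // (block_size * assoc)
        # One OrderedDict per set: tag -> dirty flag, least recently used first.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.read_miss = 0
        self.write_miss = 0
        self.write_back = 0

    def access(self, op, addr):
        """Simulate one access; op is "r" or "w", addr is a byte address."""
        block = addr // self.block_size
        index, tag = block % self.num_sets, block // self.num_sets
        lines = self.sets[index]
        is_write = (op == "w")
        if tag in lines:
            # Hit: refresh the LRU position and keep or set the dirty flag.
            lines.move_to_end(tag)
            lines[tag] = lines[tag] or is_write
            return
        if is_write:
            self.write_miss += 1
        else:
            self.read_miss += 1
        if len(lines) >= self.assoc:
            # Set is full: evict the least recently used line.
            _, dirty = lines.popitem(last=False)
            if dirty:
                self.write_back += 1
        # Allocate the block on both read and write misses.
        lines[tag] = is_write

    def dirty_left(self):
        """Number of dirty lines still in the cache at the end of the trace."""
        return sum(dirty for lines in self.sets for dirty in lines.values())


# Hypothetical four-access trace on a one-word (4-byte) cache.
cache = Cache(cache_size=4, block_size=4, assoc=1)
for op, addr in [("r", 0), ("w", 4), ("r", 0), ("w", 4)]:
    cache.access(op, addr)
print(cache.read_miss, cache.write_miss, cache.write_back, cache.dirty_left())
# prints: 2 2 1 1
```

With a single 16384-byte line, the same model reports one read miss, no write misses, no write backs, and one dirty line left for any trace that starts with a read and stays within the first 16384 bytes.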
-
Cache Size = 2048, Associativity = 2
Merge sort: Number of accesses = 107192, Reads = 63288, Writes = 43904
| Block Size | read miss | write miss | dirty write | Total miss | Dirty Left | Miss Penalty | Stalls |
| 8 | 6946 | 6826 | 7837 | 21609 | 128 | 13 | 280917 |
| 16 | 3490 | 3417 | 3926 | 10833 | 64 | 14 | 151662 |
| 32 | 1801 | 1741 | 1996 | 5538 | 32 | 16 | 88608 |
| 64 | 1133 | 1003 | 1131 | 3267 | 16 | 20 | 65340 |
| 128 | 1394 | 924 | 991 | 3309 | 8 | 28 | 92652 |
| 256 | 2645 | 1441 | 1477 | 5563 | 4 | 44 | 243188 |
| 512 | 5386 | 2754 | 2776 | 10916 | 2 | 76 | 829616 |
| 1024 | 14309 | 7361 | 7407 | 29077 | 1 | 140 | 4070780 |
Best cache block size is 64 bytes.
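The choice of best block size can be sketched as picking the row with the fewest stall cycles, taking stalls as total misses times miss penalty. The figures below are copied from the Total miss and Miss Penalty columns of the merge sort table; the names `MERGE_SORT`, `stall_cycles`, and `best_block_size` are illustrative and not part of the lab code.

```python
# (block size in bytes, total misses, miss penalty in cycles),
# copied from the merge sort table.
MERGE_SORT = [
    (8, 21609, 13), (16, 10833, 14), (32, 5538, 16), (64, 3267, 20),
    (128, 3309, 28), (256, 5563, 44), (512, 10916, 76), (1024, 29077, 140),
]


def stall_cycles(total_miss, miss_penalty):
    """Stall cycles caused by the memory system: misses times penalty."""
    return total_miss * miss_penalty


def best_block_size(rows):
    """Return the block size whose row has the fewest stall cycles."""
    return min(rows, key=lambda row: stall_cycles(row[1], row[2]))[0]


print(best_block_size(MERGE_SORT))  # prints: 64
```

Larger blocks reduce the miss count only up to a point, while the penalty per miss keeps growing, which is why the minimum falls in the middle of the range.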
-
Cache Size = 2048, Associativity = 2
Heap sort: Number of accesses = 121118, Reads = 84744, Writes = 36374
| Block Size | read miss | write miss | dirty write | Total miss | Dirty Left | Miss Penalty | Stalls |
| 8 | 7647 | 0 | 4916 | 12563 | 185 | 13 | 163319 |
| 16 | 6879 | 0 | 4492 | 11371 | 89 | 14 | 159194 |
| 32 | 6148 | 0 | 4028 | 10176 | 42 | 16 | 162816 |
| 64 | 5648 | 0 | 3756 | 9404 | 19 | 20 | 188080 |
| 128 | 5372 | 0 | 3648 | 9020 | 10 | 28 | 252560 |
| 256 | 5502 | 0 | 3816 | 9318 | 5 | 44 | 409992 |
| 512 | 6740 | 0 | 4895 | 11635 | 3 | 76 | 884260 |
| 1024 | 8258 | 0 | 5901 | 14159 | 2 | 140 | 1982260 |
Best cache block size is 16 bytes.
- Total number of memory references = 107192 / (40%) = 267980
-
Cache Size: 2048, Block Size: 64
| Assoc | read miss | write miss | dirty write | Total miss | Dirty Left | Miss Penalty | Time (ms) |
| 1 | 4807 | 2185 | 2507 | 9499 | 3 | 10 | 4.58 |
| 2 | 1133 | 1003 | 1131 | 3267 | 16 | 11 | 3.67 |
| 4 | 919 | 906 | 1029 | 2854 | 16 | 12 | 3.90 |
| 8 | 926 | 931 | 1057 | 2914 | 16 | 13 | 4.24 |
| 16 | 952 | 950 | 1092 | 2994 | 16 | 14 | 4.59 |
| 32 | 955 | 961 | 1101 | 3017 | 16 | 15 | 4.92 |
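The Time column can be reproduced with a short calculation, under assumptions that are not stated in the text: a base CPI of 1, a miss penalty of 20 cycles (the value listed for 64-byte blocks in the block-size tables), and the column headed Miss Penalty read as the cycle time in nanoseconds. The instruction count 267980 is the figure derived above from 107192 / 40%. This is a sketch of one model that matches the listed times, not necessarily the formula the lab intended.

```python
INSTRUCTIONS = 267980  # 107192 / 40%, as computed in the text
MISS_PENALTY = 20      # cycles; assumed, taken from the 64-byte block rows

# (associativity, total misses, assumed cycle time in ns), from the table.
ROWS = [
    (1, 9499, 10), (2, 3267, 11), (4, 2854, 12),
    (8, 2914, 13), (16, 2994, 14), (32, 3017, 15),
]


def exec_time_ms(total_miss, cycle_ns):
    """Execution time in ms, assuming CPI = 1 plus stall cycles for misses."""
    cycles = INSTRUCTIONS + total_miss * MISS_PENALTY
    return cycles * cycle_ns / 1e6


for assoc, total_miss, cycle_ns in ROWS:
    print(assoc, round(exec_time_ms(total_miss, cycle_ns), 2))
```

Under this model the 2-way cache is fastest: going from direct-mapped to 2-way removes most of the misses, while further associativity adds cycle time without a comparable drop in misses.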
-
Total Accesses: 20502, Reads: 10240, Writes: 10262
| cache size | read miss | write miss | dirty write | Total miss | Dirty Left | Miss penalty | Stalls |
| 256 | 7104 | 4171 | 7175 | 18450 | 4 | 20 | 369000 |
| 512 | 7104 | 3147 | 7171 | 17422 | 8 | 20 | 348440 |
| 1024 | 7101 | 2121 | 7158 | 16380 | 16 | 20 | 327600 |
| 2048 | 7099 | 1095 | 7138 | 15332 | 32 | 20 | 306640 |
| 4096 | 0 | 64 | 0 | 64 | 64 | 20 | 1280 |
| 8192 | 0 | 64 | 0 | 64 | 64 | 20 | 1280 |
-
- Merge sort has fewer memory references.
- Merge sort has better cache performance for reasonable
cache line sizes.
- Yes. The difference in cache performance might affect which
algorithm is faster.
- The experiments do not show a "best" block size. However, 32 or
64 are probable candidates for being a "best" block size.
- In part 8, we see that the number of misses increases slightly as
associativity grows beyond 4 ways. This contradicts the graphs in the
book.
- The experiment in part 9 shows that the ab benchmark program
needs its working set of data to fit in the cache. Up to a
certain point the performance does not change significantly;
beyond that point it drops
drastically. Therefore, the MFLOPS rate should decrease as the
size of the problem increases.