I chose to use Array's as the structure for quick lookup. Either arrays or Dictionary's would have done well here. Both have O(1) lookup. I chose Arrays because the number of elements is fixed and the key value is an Integer. So an Array would be faster in this case because it doens' thave the overhead of a hashing algorithmn. Jared Parsons
I understand the 2 answers...pretty similar to what i had in mind. But I had a question for (a). Since read is a SensorStream how is it that it understands 'displayAt:' since that is what a string understands. I mean for the second example read inform:'ALARM' is logical because everything is an object however not everything in squeak is a String. Just wanted to know excatly why 'read displayAt:point' would work for question 1. Thanks
in Jared Parsons answer to part a, should this line (read = 0)
actually be (read = '0')? because read is a string not a number? Derek Perry
Derek, I think Jared is right here to use (read = 0) because the question states "SensorStream next will always return an integer between 0 and 65536". Also, wouldn't using a bag data structure for part (b) be more appropriate because bags just keep track of the number of the same element? Although I guess using any hashing data structure should be quick enough. Modifying Jared's code for part b to use bags:
In response to Jesse Sheih's comment I don't think the bag is ultamitely the best choice for this particular project. There is a fixed number of elements so using a Hashtable(which a bag does) ultamitely becomes a simple waste of space. And since an Arrays at: put: is a primitive it's also faster. Jared Parsons
But creating an array space for 255 spaces all with 0 values isn't wasting space?... I don't think arrays are _that_ much faster than bags. If you look at the optimization slides, for finding specific elements, dictionaries are super fast. In fact when using the add: method on a bag/dictionary, I believe it just calls at:put: which I believe is a primitive also in the case of a bag/dictionary. And since part b is just keeping track of the number of times a sensor happens, a bag is logically the perfect choice :) Alfred Park
I don't believe Bags are the best choice for performanace and space reasons. I'm giong to make my discussion relevant to Dictinaries since Bags main storage Object is a Dictionary.
First Dictionaries start off with an array of 4 which is much less than 255. So I will agree that Arrays waste more space in the begining. However, once you add just a few small values the Dictionary has to rehash. And Dictionary's Arrays double in size after every rehash. So adding an equal number of elements would alomst certainly cause the Dictionary to have more wasted space because a Dictionary Array size >= (num elements) x 2. Sorry for the "x", the coweb won't let me put the multiply operator in there. So in the end Dictionary's waste more space than Arrays(assuming that all values are added of course which is not always the case).
Also finding elements may be fast. But it's slower than with an Array. Dictinary at: has to go through several checks before it can return an element. Pull out the Dictionary code and trace getting a value back from at:. It's a very deep call.
Also, add: in Bag calls at: put: on the Dictionary. at:put: is not a primitive in Dictionary. The Dictionary will eventually call at:put: on it's internal array but it has to do a lot of checking, possibly grow(rehash) before it can add to the Array. It's much more complicated than you would imagine. The Arrays at: put: however is just a primitive.
And just to please the Squeakers out there.
a := Array new: 255.
d := Dictionary new.
b := Bag new.
MessageTally time: [ 1 to: 100000 do: [ :index | a at: ((index \\ 255)+1) put: (Object new) ] ]. "575"
MessageTally time: [ 1 to: 100000 do: [ :index | d at: ((index \\ 255) +1) put: (Object new) ] ]. "1567"
MessageTally time: [ 1 to: 100000 do: [ :index | b add: ((index \\255) +1) ]]. "2359"
The really big difference comes when you look for values. Use the above workspace code with the following.
MessageTally time: [ 1 to: 100000 do: [ :index | | z | z := (a at: ((index \\ 255)+1))]]. "64"
MessageTally time: [ 1 to: 100000 do: [ :index | | z | z := (d at: ((index \\ 255)+1) ifAbsent: nil) ] ]. "990"
MessageTally time: [ 1 to: 100000 do: [ :index | | z | z := (b occurencesOf: ((index \\ 255) +1)) ]]. "2303"
This shows that Arrays are clearly the performance choice for this test. The size difference between the Arrays is probably not extremely relatvent here(yes I know that was one of my earlier points but it didn't turn out ot be as important ). I realize this isn't the best time: test you could perform but I think it illustrates my point that Bags are not necessarily hte perfect choice especially for such a small dataset. Jared Parsons
Nice thorough analysis Jared (wow, I got shot down :) ... But other things to consider: what if the number of sensors changes suddenly? With bags you don't have to reconfigure the bag size, it's automatically taken care of (although reconfiguring the array isn't hard either). And in respect to this particular problem, I don't think the sensors would be constantly being triggered, so performance might not be the main driving issue. Alfred Park
Jared did do a very nice analysis and is probably right, however he is wrong about one thing a bag would take at most the same amount as the array. If the dictionary/bag starts at size 4 a double everytime its needed to be bigger, assuming we use all indexes, the bag will not grow bigger than 256, as it is a nice power of 2. Thus there could be a reason to use a bag, if the memory in the system was the restraining factor. Robert Schierholz