Introduction to Maps

Beware! This document needs cleaning up. Formatting may be messy or broken. Text may be inaccurate or in need of editing.

You can look at this preview, but be aware that the content will change before the instructor assigns this to the class.

This document is the result of an automated conversion. You can view the original version but again beware: that version too is not necessarily what will eventually be assigned.

Maps pair keys with values.

Keys are unique: if a key is present, it has exactly one value.

Values are not necessarily unique: multiple keys could have the same value.

Learning objectives:

  • Learn the Map API: declaration, insertion, accessing elements, and removal
  • Understand the structure of keys, values, and entries in a map, and learn to iterate through each in Java
  • Practice the general skill of learning and working with a library’s API

In this activity, you will learn how to perform common operations with the Map API. Map is an interface (like List), and the most common implementation of Map is HashMap. Thus, if you are creating a map from Student ids (Integers) to Student objects, you would declare and initialize the map using the following:

Map<Integer, Student> studentsByID = new HashMap<>();

(Note that in another recent activity, you learned about the map operation on streams. Even though it (unfortunately) shares the same name, the Map interface we are studying today is only distantly related to that steam operation. They both use the word “map” based on the mathematical idea of mapping: “this corresponds to that.” Despite the shared name, they are completely different things in Java.)

Note that there are two type parameters for Maps: the key type (in this case Integer), and value type (in this case Student). This map thus might hold data that looks like this:

    10111235  Student(name=Sally, year=sophomore)

    10125263  Student(name=Fred, year=first-year)

The golden rule of maps is the following invariant rule:

For each key in the map, there is exactly one value.

Many keys may have the same value, but there is only one value for each key:

Map<String, Integer>

✅Possible:

“one” → 1

“two” → 2

“double” → 2 Two keys have the value 2

“three” → 3

“triple” → 3 Two keys have the value 3

❌Not possible:

“several” → 2

“several” → 3 Nope! The key “several” can only have one value

(In this example, adding the second entry would replace the first)

If you took Comp 123, all of this may be familiar: Java’s “map” is the same as Python’s “dictionary.” This same abstract data type exists in many other languages, which variously call it “dictionary,” “map,” “associative array,” and “hash.” The syntax varies quite a bit from language to language, but the abstract structure is the same. Keep in mind, therefore, that you are learning both the Java-specific API and the general abstraction and its patterns of use.

If you need a quick reference on basic Map methods in Java, check out: https://www.codebyamir.com/blog/how-to-use-a-map-in-java

Look at the Restaurant class, and understand what it represents and how to create one.

Your first task is to complete the six steps listed in Restaurant explorer. There are hints for many of these in the Map JavaDoc.

  1. Create a new map from restaurant names to Restaurant objects.
  2. Add five new restaurants to the map. They can be anything you want — perhaps your five favorite restaurants!
  3. Print the names of each restaurant in the map (by looping over the map, not by copying strings in the code).
  4. Remove one restaurant from the map. Check to make sure it is gone.
  5. Print out each restaurant object. (Restaurant has a reasonable toString() method, so you can simply pass a restaurant to println() and it will come out nicely formatted.)
  6. Print out the restaurant name and object pairs.
  7. Replace the value for one of the names that is already present in the map with a different Restaurant object. (Do you need to remove the existing value first? )

Look at WebWordCounter. Your goal is to complete a main method that counts the frequency of each unique word in some text file on the web.

  1. Identify the URL for a text file at http://textfiles.com/directory.html that looks interesting to you. To be clear, once you use your browser drill down to a single .txt file in this website the URL will be the address in your browser. Create a String constant for this URL in WebWordCounter.
  2. Complete the 4 steps outlined in the main method. Hint: handling updates to the word frequency is subtly tricky. Remember that the first time you see a particular word it will not be in the map, so a call to get() will fail.

Look at the code in Flags, but do not run it yet!

  1. Study the code. Ignore the commented-out code for now; just look at the code that will run.

Look at the declaration of flagsByColor. What is the type of the keys in this map? And what type are the values?

Look at the main method. What will this code print? Make your guess first. Then run it and check your guess. 2. Using what you learned from what happened in section 1, uncomment section 2 and repeat the same exercise: study the code, guess the output, then run it and check your guess. Was anything surprising? Form a theory about what’s happening. 3. Repeat the same process with section 3… 4. …and section 4.

What is going on?! Scroll down when you’re ready to find out.

·

·

·

[spoiler warning]

·

·

·

Why look, it’s our old friend…

·

·

·

[Echoing evil laughter]

Consider just this much of the code from Flags:

private static final Map<List<String>,String> flagsByColor =
    new HashMap<>();

...

List<String> key = new ArrayList<>();
key.add("green");
key.add("white");
flagsByColor.put(key, "Nigeria");

What do the objects look like after this code runs? You might reasonably have a picture like this in your head:

However, that picture misses something very important: the ArrayList and the Strings are all objects — and one of those objects is shared. The reality is closer to this:

Do you spot the shared object? Note the two arrows pointing to the ArrayList: the object that our main method’s key variable points to is the same object that the map is using as a key. That means that when we run this code:

key.add("orange");

…then this happens:

Because the HashMap is sharing the same mutable ArrayList object to which we are adding new elements, now the value Nigeria is stored in the map with an incorrect key! That is why this document places the words shared mutable state atop flames and chaos above.

It gets worse

Based on the picture above, we might well expect that the following line of code would replace the value Nigeria in the map:

flagsByColor.put(key, "Ireland");

…because we are putting a new value in the map with the same key as an existing one. It’s not even just that there are two keys whose values are equal; it is the very same object that we passed to the put method the first time!

And indeed, at first it appears that the new value replaced the old one: when we ask the map to get the value for the key List.of("green", "white"), we get null. That sure sounds like Ireland replaced Nigeria. But then, when we ask for the size of the map, it tells us there are multiple things in it. What?! If we were replacing the old values with the new ones at each step, we would expect the size to be 1!

Why?? The picture above is still not completely accurate. In software as in life, there is always yet more hidden complexity. An even more accurate picture of the map just after we add Nigeria would be this:

HashMap is built for speed. When you ask it to find the value for a given key, it doesn’t want to have to look through every single key in the whole map. To make key lookup faster, HashMap places its entries in buckets that are organized by key. In much the same way that you might organize papers in file folders with different labels, or much like how a library assigns numbers to its books and then labels each shelf with a specific range of book numbers, HashMap assigns a number (called a “hash code”) to every possible key and then organizes all its entries in bins according to that number.

(How does this mysterious process work? You can learn all about it in Comp 128. In fact, in that course, you will build it from scratch yourself!)

This “organized in bins” system is where things go really wrong with the code in Flags. When the main method adds orange to the key list, it changes the existing map entry’s key too — but HashMap has no idea that the key changed, so the entry stays in the same bin, which is now the wrong bin:

Imagine that you go into a library and write a new number on the spine of one of the books on the shelf. Then you ask the librarian to go find that book using the new number that you just wrote on the book. They probably won’t be able to find it: they will look where the book with that number should be, not where it is! That’s what we’re doing to poor HashMap.

When we ask to put Ireland in the map, HashMap dutifully checks whether there is already an entry with the same key. And there is such an entry! But it’s in the wrong place, so HashMap doesn’t find it. The old entry stays, the new entry gets added, and we end up with something like this:

We now have a map with two entries for the same key, one of which is unfindable by the map. This is why the flames-and-chaos image up above is really big: bugs involving shared mutable state can cause cascading effects that are quite surprising and insidious.

Run the full Flags code again. You can now make sense of every line of its cursed output.

So what do we do about this?

Wait a minute: wasn’t it supposed to be an invariant of the map interface that there is only one entry for a given key? There are two entries with the same key in the picture above! Doesn’t that break the invariant?!

The answer is yes. Yes it does.

Does that mean HashMap is broken?

No.

In addition to the invariant about unique keys, the Map interface’s put method also has a precondition: don’t mutate the keys. That’s why it’s called an API contract: it requires both the abstraction’s implementation and the abstraction’s clients to do their part. The map interface makes a deal with us as map clients: “If you don’t mutate the keys you provide, I will make sure the keys are unique. Mutate the keys and you’ve broken the contract; all bets are off.”

The best way to avoid the problem we’ve created in Flags is to always use immutable keys in maps.

Have you wondered why Java’s String class is immutable? Well, here’s one reason: it is always safe to use strings as keys in a map in Java. Nobody can mutate a string key and mess up a map. They can make new strings by calling substring and toLowerCase and the like, but they can’t modify the existing strings, so they can’t wreck any maps that are using those strings as keys. (That’s in Java. In, say, C, you get no such guarantees.)

Early in the class, students often ask, “Why should a class ever be immutable? Isn’t it better to allow every object to change, just in case we need to change it?” Here is your answer to that question: mutable objects mean shared mutable state. And shared mutable state means flames. And chaos. And Elmo.