Scrigroup - Documente si articole

     

HomeDocumenteUploadResurseAlte limbi doc
AccessAdobe photoshopAlgoritmiAutocadBaze de dateCC sharp
CalculatoareCorel drawDot netExcelFox proFrontpageHardware
HtmlInternetJavaLinuxMatlabMs dosPascal
PhpPower pointRetele calculatoareSqlTutorialsWebdesignWindows
WordXml

AspAutocadCDot netExcelFox proHtmlJava
LinuxMathcadPhotoshopPhpSqlVisual studioWindowsXml

Choosing an implementation: Choosing between Lists, Sets

java



+ Font mai mare | - Font mai mic



Choosing an implementation

From the diagram on page you can see that there are really only three collection components: Map, List, and Set, and only two or three implementations of each interface. If you need to use the functionality offered by a particular interface, how do you decide which particular implementation to use?



To understand the answer, you must be aware that each different implementation has its own features, strengths, and weaknesses. For example, you can see in the diagram that the "feature" of Hashtable, Vector, and Stack is that they are legacy classes, so that existing code doesn't break. On the other hand, it's best if you don't use those for new (Java 1.2) code.

The distinction between the other collections often comes down to what they are "backed by;" that is, the data structures that physically implement your desired interface. This means that, for example, ArrayList, LinkedList, and Vector (which is roughly equivalent to ArrayList) all implement the List interface so your program will produce the same results regardless of the one you use. However, ArrayList (and Vector) is backed by an array, while the LinkedList is implemented in the usual way for a doubly-linked list, as individual objects each containing data along with handles to the previous and next elements in the list. Because of this, if you want to do many insertions and removals in the middle of a list a LinkedList is the appropriate choice. (LinkedList also has additional functionality that is established in AbstractSequentialList.) If not, an ArrayList is probably faster.

As another example, a Set can be implemented as either an ArraySet or a HashSet. An ArraySet is backed by an ArrayList and is designed to support only small numbers of elements, especially in situations in which you're creating and destroying a lot of Set objects. However, if you're going to have larger quantities in your Set, the performance of ArraySet will get very bad, very quickly. When you're writing a program that needs a Set, you should choose HashSet by default, and change to ArraySet only in special cases where performance improvements are indicated and necessary.

Choosing between Lists

The most convincing way to see the differences between the implementations of List is with a performance test. The following code establishes an inner base class to use as a test framework, then creates an anonymous inner class for each different test. Each of these inner classes is called by the test( ) method. This approach allows you to easily add and remove new kinds of tests.

//: ListPerformance.java

// Demonstrates performance differences in Lists

package c08.newcollections;

import java.util.*;

public class ListPerformance

abstract void test(List a);

}

private static Tester[] tests =

}

},

new Tester('iteration', 300)

}

},

new Tester('insert', 1000)

},

new Tester('remove', 5000)

}

},

};

public static void test(List a)

}

public static void main(String[] args)

} ///:~

The inner class Tester is abstract, to provide a base class for the specific tests. It contains a String to be printed when the test starts, a size parameter to be used by the test for quantity of elements or repetitions of tests, a constructor to initialize the fields, and an abstract method test( ) that does the work. All the different types of tests are collected in one place, the array tests, which is initialized with different anonymous inner classes that inherit from Tester. To add or remove tests, simply add or remove an inner class definition from the array, and everything else happens automatically.

The List that's handed to test( ) is first filled with elements, then each test in the tests array is timed. The results will vary from machine to machine; they are intended to give only an order of magnitude comparison between the performance of the different collections. Here is a summary of one run:

Type

Get

Iteration

Insert

Remove

ArrayList

110

270

1920

4780

LinkedList

1870

7580

170

110

You can see that random accesses (get( )) and iterations are cheap for ArrayLists and expensive for LinkedLists. On the other hand, insertions and removals from the middle of a list are significantly cheaper for a LinkedList than for an ArrayList. The best approach is probably to choose an ArrayList as your default and to change to a LinkedList if you discover performance problems because of many insertions and removals from the middle of the list.

Choosing between Sets

You can choose between an ArraySet and a HashSet, depending on the size of the Set (if you need to produce an ordered sequence from a Set, use TreeSet ). The following test program gives an indication of this tradeoff:

//: SetPerformance.java

// Demonstrates performance differences in Sets

package c08.newcollections;

import java.util.*;

public class SetPerformance

abstract void test(Set s, int size);

}

private static Tester[] tests =

}

},

new Tester('contains')

},

new Tester('iteration')

}

},

};

public static void test(Set s, int size)

}

public static void main(String[] args)

} ///:~

The last test of ArraySet is only 500 elements instead of 1000 because it is so slow.

Type

Test size

Add

Contains

Iteration

10

5.0

6.0

11.0

ArraySet

100

24.2

23.1

4.9

500

100.18

97.12

4.5

10

5.0

6.0

16.0

HashSet

100

5.5

5.0

6.0

1000

6.1

6.09

5.77

HashSet is clearly superior to ArraySet for add( ) and contains( ), and the performance is effectively independent of size. You'll virtually never want to use an ArraySet for regular programming.

Choosing between Maps

When choosing between implementations of Map, the size of the Map is what most strongly affects performance, and the following test program gives an indication of this tradeoff:

//: MapPerformance.java

// Demonstrates performance differences in Maps

package c08.newcollections;

import java.util.*;

public class MapPerformance

return m;

}

private abstract static class Tester

abstract void test(Map m, int size);

}

private static Tester[] tests =

}

},

new Tester('get')

},

new Tester('iteration')

}

},

};

public static void test(Map m, int size)

}

public static void main(String[] args)

} ///:~

Because the size of the map is the issue, you'll see that the timing tests divide the time by the size to normalize each measurement. Here is one set of results. (Yours will probably be different.)

Type

Test size

Put

Get

Iteration

10

22.0

44.0

17.0

ArrayMap

100

68.7

118.6

8.8

500

155.22

259.36

4.84

10

17.0

16.0

11.0

TreeMap

100

18.1

70.3

8.3

500

11.22

148.4

4.62

10

11.0

11.0

33.0

HashMap

100

9.9

10.4

12.1

1000

13.18

10.65

5.77

Even for size 10, the ArrayMap performance is worse than HashMap - except for iteration, which is not usually what you're concerned about when using a Map. (get( ) is generally the place where you'll spend most of your time.) The TreeMap has respectable put( ) and iteration times, but the get( ) is not so good. Why would you use a TreeMap if it has good put( ) and iteration times? So you could use it not as a Map, but as a way to create an ordered list. The behavior of a tree is such that it's always in order and doesn't have to be specially sorted. (The way it is ordered will be discussed later.) Once you fill a TreeMap, you can call keySet( ) to get a Set view of the keys, then toArray( ) to produce an array of those keys. You can then use the static method Array.binarySearch( ) (discussed later) to rapidly find objects in your sorted array. Of course, you would probably only do this if, for some reason, the behavior of a HashMap was unacceptable, since HashMap is designed to rapidly find things. In the end, when you're using a Map your first choice should be HashMap, and only rarely will you need to investigate the alternatives.

There is another performance issue that the above table does not address, and that is speed of creation. The following program tests creation speed for different types of Map:

//: MapCreation.java

// Demonstrates time differences in Map creation

package c08.newcollections;

import java.util.*;

public class MapCreation

} ///:~

At the time this program was written, the creation speed of TreeMap was dramatically faster than the other two types. (Although you should try it, since there was talk of performance improvements to ArrayMap.) This, along with the acceptable and consistent put( ) performance of TreeMap, suggests a possible strategy if you're creating many Maps, and only later in your program doing many lookups: Create and fill TreeMaps, and when you start looking things up, convert the important TreeMaps into HashMaps using the HashMap(Map) constructor. Again, you should only worry about this sort of thing after it's been proven that you have a performance bottleneck. ("First make it work, then make it fast - if you must.")



TreeSet was not available at the time of this writing, but you can easily add a test for it into this example.



Politica de confidentialitate | Termeni si conditii de utilizare



DISTRIBUIE DOCUMENTUL

Comentarii


Vizualizari: 682
Importanta: rank

Comenteaza documentul:

Te rugam sa te autentifici sau sa iti faci cont pentru a putea comenta

Creaza cont nou

Termeni si conditii de utilizare | Contact
© SCRIGROUP 2024 . All rights reserved