Issue
I am trying to do a sorting of some number. I am getting a "java.lang.IllegalArgumentException: Comparison method violates its general contract!" Exception when I execute the following code.
import org.apache.commons.lang3.StringUtils;
public class ComparatorTest {
public static void main(String[] args) {
List<String> ll = List.of("1.A", "1.A.1", "10.A", "10.A.1", "10.A.2", "10.A.3", "12.A", "12.A.1", "12.A.2",
"12.A.4", "12.A.6", "1A.2", "2.A.1", "2.A.1.b", "2.A.1.b.1", "2.A.1.b.2", "2.A.1.b.3", "20.A.1",
"20.A.1.a", "20.A.1.b", "20.A.1.b.1", "20.A.1.b.2", "3.A.1", "3.A.1.a", "3.A.1.a.1", "3.A.1.a.2",
"3.A.1.a.3", "3.A.1.a.4", "3.A.1.b", "3.A.10", "6.A.1", "9.A.1");
ArrayList<String> l2 = new ArrayList<>(ll);
Collections.sort(l2, (obj1, obj2) -> {
try {
String[] prodClass1 = obj1.split("\\.");
String[] prodClass2 = obj2.split("\\.");
for (int i = 0; (i < prodClass1.length) && (i < prodClass2.length); i++) {
if (!prodClass1[i].equals(prodClass2[i])) {
if (StringUtils.isNumeric(prodClass1[i]) && StringUtils.isNumeric(prodClass2[i])) {
return Integer.valueOf(prodClass1[i]).compareTo(Integer.valueOf(prodClass2[i]));
} else {
return prodClass1[i].compareToIgnoreCase(prodClass2[i]);
}
}
}
return obj1.compareToIgnoreCase(obj2);
} catch (Exception e) {
e.printStackTrace();
return obj1.compareToIgnoreCase(obj2);
}
});
System.out.println(l2);
}
}
Interstingly if I remove any 1 element form the list ll, the code works just fine.
I think this has to do something related to equals() and compare() method. But can someone pinpoint the whats wrong here.
If I add one extra condition in the custom comparator it works. Why it is like this? Can anyone pinpoint the issue.
Modified code
Collections.sort(l2, (obj1, obj2) -> {
try {
String[] prodClass1 = obj1.split("\\.");
String[] prodClass2 = obj2.split("\\.");
for (int i = 0; (i < prodClass1.length) && (i < prodClass2.length); i++) {
if (!prodClass1[i].equals(prodClass2[i])) {
if (StringUtils.isNumeric(prodClass1[i]) && StringUtils.isNumeric(prodClass2[i])) {
return Integer.valueOf(prodClass1[i]).compareTo(Integer.valueOf(prodClass2[i]));
} else if (StringUtils.isNumeric(prodClass1[i]) || StringUtils.isNumeric(prodClass2[i])) {
return StringUtils.isNumeric(prodClass1[i]) ? -1 : 1;
} else {
return prodClass1[i].compareToIgnoreCase(prodClass2[i]);
}
}
}
return obj1.compareToIgnoreCase(obj2);
} catch (Exception e) {
e.printStackTrace();
return obj1.compareToIgnoreCase(obj2);
}
});
The below is the output when I execute the code after adding some logs:
prodClass1: [3, A, 1, a, 1] Comparing with prodClass2: [2, A, 1, b, 3]
prodClass1: [2, A, 1, b, 3] Comparing with prodClass2: [3, A, 1, a]
prodClass1: [2, A, 1, b, 3] Comparing with prodClass2: [3, A, 1]
Exception in thread "main" java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.base/java.util.TimSort.mergeHi(TimSort.java:903)
at java.base/java.util.TimSort.mergeAt(TimSort.java:520)
at java.base/java.util.TimSort.mergeForceCollapse(TimSort.java:461)
at java.base/java.util.TimSort.sort(TimSort.java:254)
at java.base/java.util.Arrays.sort(Arrays.java:1515)
at java.base/java.util.ArrayList.sort(ArrayList.java:1750)
at java.base/java.util.Collections.sort(Collections.java:179)
at core.ComparatorTest.main(ComparatorTest.java:30)
Solution
If you violate the contract, then sort
can do one of two things, and the spec does not guarantee either behaviour:
- The output of sort is utterly gobbledygook.
- You get that exception.
Hence, if you do not get that exception, that does not imply your code is fine. Absence of exception is not proof of absence of bugs. Hence, "If I remove an element this exception goes away" is meaningless. The key problem here is the 1A entry.
You can zoom in on just "10.A", "1A.2", "2.A.1"
.
- As per your code, 10.A is less than 1A.2, because string ordering is used on the first component, and
"10".compareTo("1A")
is negative -"10"
is considered as coming before"1A"
. - As per your code, 1A.2 is less than 2.A.1, because string ordering is used on the first component, and
"1A".compareTo("2")
is negative -"1A"
is considered as coming before"2"
.
So far, so obvious, right? However..
- As per your code, 10.A is not less than 2.A.1, because the integer ordering is used on the first component (as they are both numeric), and 10 is after 2.
Hence, we have the following scenario:
- A comes before B, and B comes before C...
- but A comes after C.
Which, obviously, is impossible.
The general contract of comparison methods involves:
a.equals(b)
? thena.compareTo(b) == 0
.a.compareTo(a) == 0
.a.compareTo(b)
needs to be the opposite ofb.compareTo(a)
.a.compareTo(b)
andb.compareTo(c)
are the same direction? Thena.compareTo(c)
must also be that direction (a is before b and b is before c? Then a must also be before c - same for 'after').
You've violated that last one. The second snippet fixes it by treating 1A
as being after 10
.
There are more ways in which your comparison code violates the rules, and, as you found out, violation of the rules doesn't guarantee exceptions. Your general approach of 'eh, fall back to string sorting' is a bad one. For example, you should not just fall back to that if one is 'longer' than the other (instead, the longer way needs to always win/lose from the shorter one).
Answered By - rzwitserloot
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.