Preface

This post shares a Java code example for removing duplicate objects from a list, using a combination of several fields as the unique key.

1. Utility class

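import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class ListUtils {

    // Keeps only the elements whose combination of extracted key values has not been seen before.
    @SafeVarargs
    public static <T> List<T> distinctList(List<T> list, Function<? super T, ?>... keyExtractors) {
        return list.stream()
                .filter(distinctByKeys(keyExtractors))
                .collect(Collectors.toList());
    }

    // Builds a stateful predicate that remembers every composite key it has already accepted.
    @SafeVarargs
    private static <T> Predicate<T> distinctByKeys(Function<? super T, ?>... keyExtractors) {
        final Map<List<?>, Boolean> seen = new ConcurrentHashMap<>();
        return new Predicate<T>() {
            @Override
            public boolean test(T t) {
                // The composite key is the list of values returned by each extractor, in order.
                final List<?> keys = Arrays.stream(keyExtractors)
                        .map(ke -> ke.apply(t))
                        .collect(Collectors.toList());
                // putIfAbsent returns null only when the key was not yet in the map.
                return seen.putIfAbsent(keys, Boolean.TRUE) == null;
            }
        };
    }
}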
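The composite key works because List.equals and List.hashCode compare element by element, so two key lists built from equal field values hit the same map entry. Below is a minimal sketch of just that idea, independent of ListUtils; the class name CompositeKeyDemo, the helper firstTime, and the sample values are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CompositeKeyDemo {

    // Hypothetical helper: returns true only the first time a combination of values is seen.
    public static boolean firstTime(Map<List<?>, Boolean> seen, Object... fields) {
        // Arrays.asList tolerates null elements, so a null field simply becomes part of the key.
        List<?> key = Arrays.asList(fields);
        return seen.putIfAbsent(key, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        Map<List<?>, Boolean> seen = new ConcurrentHashMap<>();
        System.out.println(firstTime(seen, "db1", "orders")); // true: new combination
        System.out.println(firstTime(seen, "db1", "orders")); // false: same values again
        System.out.println(firstTime(seen, "db1", "users"));  // true: second field differs
        System.out.println(firstTime(seen, "db1", null));     // true: null inside the key is fine
    }
}
```

Note that ConcurrentHashMap rejects a null key, but the key here is always a non-null list; only its elements may be null.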

1.1 Test example

private List<DataLineage> list;

// Builds 20 rows; Lists and RandomUtil are presumably Guava's Lists and Hutool's RandomUtil.
@Before
public void init() {
    List<DataLineage> ll = Lists.newArrayList();
    for (int i = 0; i < 20; i++) {
        DataLineage p1 = new DataLineage();
        p1.setId(Long.valueOf(i));
        // Constant suffix: every row gets the same database name.
        p1.setSourceDataBaseName("BaseName" + 1);
        p1.setSourceTableName("TableName" + RandomUtil.randomInt(5));
        p1.setSqlQuery("SqlQuery" + RandomUtil.randomInt(5));
        ll.add(p1);
    }
    list = ll;
}

@Test
public void t1() {
    // Deduplicate on the pair (sourceDataBaseName, sourceTableName).
    List<DataLineage> distinctDataLineageList = ListUtils.distinctList(list,
            DataLineage::getSourceDataBaseName,
            DataLineage::getSourceTableName);
    distinctDataLineageList.forEach(System.out::println);
}

Printed output

DataLineage(id=0, sourceDataBaseName=BaseName1, sourceTableName=TableName1, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery2)
DataLineage(id=4, sourceDataBaseName=BaseName1, sourceTableName=TableName4, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery2)
DataLineage(id=6, sourceDataBaseName=BaseName1, sourceTableName=TableName3, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery4)
DataLineage(id=12, sourceDataBaseName=BaseName1, sourceTableName=TableName2, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery4)
DataLineage(id=19, sourceDataBaseName=BaseName1, sourceTableName=TableName0, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery0)

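This confirms that SourceDataBaseName and SourceTableName together act as the unique key: the output is the list after filtering on that key, and with the sequential stream used here the first element encountered for each key is the one that is kept.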
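Because the test above relies on random values, its output changes from run to run. The following is a sketch of a deterministic check under stated assumptions: Row is a hypothetical stand-in for DataLineage with only three fields, and distinctList is a compact copy of the utility method so that the example compiles on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DistinctDemo {

    // Hypothetical stand-in for DataLineage, reduced to the fields this demo needs.
    public static final class Row {
        private final long id;
        private final String db;
        private final String table;

        public Row(long id, String db, String table) {
            this.id = id;
            this.db = db;
            this.table = table;
        }

        public long getId() { return id; }
        public String getDb() { return db; }
        public String getTable() { return table; }
    }

    // Compact copy of the utility: keep the first element for each composite key.
    @SafeVarargs
    public static <T> List<T> distinctList(List<T> list, Function<? super T, ?>... keyExtractors) {
        Map<List<?>, Boolean> seen = new ConcurrentHashMap<>();
        return list.stream()
                .filter(t -> {
                    List<?> keys = Arrays.stream(keyExtractors)
                            .<Object>map(ke -> ke.apply(t))
                            .collect(Collectors.toList());
                    return seen.putIfAbsent(keys, Boolean.TRUE) == null;
                })
                .collect(Collectors.toList());
    }

    // Deduplicates a fixed list on (db, table) and returns the surviving ids.
    public static List<Long> distinctIds() {
        List<Row> rows = Arrays.asList(
                new Row(0, "db1", "t1"),
                new Row(1, "db1", "t2"),
                new Row(2, "db1", "t1"),   // same key as id 0
                new Row(3, "db2", "t1"),   // same table, different database
                new Row(4, "db1", "t2"));  // same key as id 1
        return distinctList(rows, Row::getDb, Row::getTable).stream()
                .map(Row::getId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctIds()); // prints [0, 1, 3]
    }
}
```

Rows 2 and 4 are dropped because their (db, table) pair was already seen, while row 3 survives because only one of the two fields matches an earlier row.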