Removing Duplicate Elements from an Array in Java: A Discussion of Methods

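Problem: suppose I have an array (initially empty) and I want to make sure that no duplicate elements can be added to it.

Faced with a problem like this, I might quickly write the code below, using an ArrayList as the array.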

The code is as follows:

private static void testListSet() {
    List<String> arrays = new ArrayList<String>() {
        @Override
        public boolean add(String e) {
            // Linear scan: reject the element if an equal one is already present.
            for (String str : this) {
                if (str.equals(e)) {
                    System.out.println("add failed !!! duplicate element");
                    return false;
                }
            }
            // No equal element was found, so the element can be added.
            System.out.println("add succeeded !!!");
            return super.add(e);
        }
    };

    arrays.add("a"); arrays.add("b"); arrays.add("c"); arrays.add("b");
    for (String e : arrays)
        System.out.print(e);
}

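Here I care about one thing only: checking, at the moment an element is added (assuming elements are only ever added through the add method), whether an equal element already exists. If the array does not contain the element, it is added; otherwise it is rejected. This is simple to write, but it becomes clumsy for a large array: adding one element to an array of 100,000 elements can take up to 100,000 calls to equals. Treat this version as the baseline.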
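
One way to avoid the linear scan is to keep a hash-based index next to the list. The following is only a minimal sketch, not code from this article: the class name UniqueList and its methods are made up for illustration, and it assumes the element type implements equals and hashCode consistently (String does).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not from the original article): a list that rejects
// duplicates by consulting a HashSet instead of scanning every element.
class UniqueList<E> {
    private final List<E> items = new ArrayList<>();
    private final Set<E> seen = new HashSet<>();

    // Returns true if the element was added, false if an equal one exists.
    public boolean add(E e) {
        if (!seen.add(e)) { // Set.add returns false when e is already present
            return false;
        }
        items.add(e);
        return true;
    }

    // Returns a copy of the elements in insertion order.
    public List<E> asList() {
        return new ArrayList<>(items);
    }

    public static void main(String[] args) {
        UniqueList<String> list = new UniqueList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        boolean added = list.add("b"); // duplicate, rejected
        System.out.println(added + " " + list.asList()); // false [a, b, c]
    }
}
```

Each add then costs one hash lookup on average instead of a pass over the whole list, at the price of holding a second reference to every element.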

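Problem: suppose the array already contains some elements. How do we remove the duplicates from it?

As we know, the collections in Java fall broadly into two families: List and Set. A List keeps its elements in order and allows duplicates, while a Set does not allow duplicates and, in the case of HashSet, makes no promise about order. So we can use this property of Set to remove the duplicates and reach our goal. After all, an algorithm that already exists in the library should beat one written on the spot.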
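
As a quick illustration of the List/Set distinction, here is a small sketch. The class name ListVersusSet and the method distinctInOrder are invented for this example. It uses LinkedHashSet rather than HashSet, because LinkedHashSet keeps first-insertion order and so gives a predictable result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class ListVersusSet {
    // Copies the list into a LinkedHashSet, which drops repeated elements
    // while remembering the order in which each element first appeared.
    static List<String> distinctInOrder(List<String> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("c", "a", "c", "b", "a");
        System.out.println(letters);                  // [c, a, c, b, a]
        System.out.println(distinctInOrder(letters)); // [c, a, b]
    }
}
```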

The code is as follows:

public static void removeDuplicate(List<People> list) {
    // Copying into a HashSet drops every element equal to one already in the set.
    HashSet<People> set = new HashSet<People>(list);
    list.clear();
    list.addAll(set);
}

private static People[] ObjData = new People[] {
    new People(0, "a"), new People(1, "b"), new People(0, "a"), new People(2, "a"), new People(3, "c"),
};

The code is as follows:

public class People {
    private int id;
    private String name;

    public People(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public String toString() {
        return ("id = " + id + " , name " + name);
    }
}

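The code above uses a custom People class. When I add equal objects (that is, objects holding the same data) and then call removeDuplicate, the problem is not actually solved: the equal objects are still there. So how does HashSet decide whether two objects are the same? Opening the HashSet source shows that every insertion has to go through the add method: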

The code is as follows:

@Override
public boolean add(E object) {
    return backingMap.put(object, this) == null;
}

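The backingMap here is the data that HashSet maintains. It uses a neat trick: each added object becomes a KEY in a HashMap, with the HashSet object itself as the VALUE. HashSet thus relies on the uniqueness of HashMap keys, so its data naturally contains no duplicates. Whether two items really count as duplicates, though, depends on how HashMap decides that two keys are the same.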
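
The idea can be sketched in a few lines. This is an illustration only, not the library source: the class TinyHashSet and the PRESENT placeholder are invented names, and instead of storing the set itself as the value, this sketch uses one shared dummy object.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a set that stores its elements as the keys of a HashMap.
class TinyHashSet<E> {
    // Dummy value shared by every key; any non-null object would do.
    private static final Object PRESENT = new Object();
    private final Map<E, Object> backingMap = new HashMap<>();

    // put returns the previous value for the key, or null if the key was
    // absent, so a null result means the element is new.
    public boolean add(E e) {
        return backingMap.put(e, PRESENT) == null;
    }

    public boolean contains(E e) {
        return backingMap.containsKey(e);
    }

    public int size() {
        return backingMap.size();
    }

    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>();
        System.out.println(set.add("a")); // true: "a" was not a key yet
        System.out.println(set.add("a")); // false: the key is already present
        System.out.println(set.size());   // 1
    }
}
```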

The code is as follows:

@Override public V put(K key, V value) {
    if (key == null) {
        return putValueForNullKey(value);
    }

    int hash = secondaryHash(key.hashCode());
    HashMapEntry<K, V>[] tab = table;
    int index = hash & (tab.length - 1);
    // Walk the bucket: a key is already present only if both its hash and equals match.
    for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
        if (e.hash == hash && key.equals(e.key)) {
            preModify(e);
            V oldValue = e.value;
            e.value = value;
            return oldValue;
        }
    }

    // No entry for (non-null) key is present; create one
    modCount++;
    if (size++ > threshold) {
        tab = doubleCapacity();
        index = hash & (tab.length - 1);
    }
    addNewEntry(key, value, hash, index);
    return null;
}

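In short, the idea is this: walk the entries in the bucket that the hash selects, and for each entry whose hash code matches (in fact the hash code is processed once more before being used), call the key's equals method. If both conditions hold, the two keys are treated as the same element. So if the array elements are of a custom type and we want to take advantage of the Set mechanism, we have to implement the equals and hashCode methods ourselves (I won't go into the hashing algorithm here, since I only understand a little of it):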

The code is as follows:

public class People {
    // Number of equals calls, read by the benchmark below. The original listing
    // omits this field, so its declaration here is inferred from that usage.
    public static int count = 0;

    private int id;
    private String name;

    public People(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public String toString() {
        return ("id = " + id + " , name " + name);
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        count++;
        if (!(obj instanceof People))
            return false;
        People o = (People) obj;
        // Two People are equal when both id and name match.
        return id == o.getId() && name.equals(o.getName());
    }

    @Override
    public int hashCode() {
        // Equal objects share an id, so they also share a hash code.
        return id;
    }
}

Now a call to removeDuplicate(list) no longer leaves two identical People objects in the list.
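
The difference that equals and hashCode make can be checked with a small self-contained sketch. The classes PlainPerson and KeyedPerson are hypothetical stand-ins for the two versions of People above, and Objects.hash is used here as one common way to combine fields, not as this article's own hashCode (which simply returns id).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

class DedupDemo {
    // Stand-in for the first People version: it inherits the identity-based
    // equals and hashCode from Object.
    static class PlainPerson {
        final int id;
        final String name;
        PlainPerson(int id, String name) { this.id = id; this.name = name; }
    }

    // Stand-in for the second People version: value-based equals and hashCode.
    static class KeyedPerson {
        final int id;
        final String name;
        KeyedPerson(int id, String name) { this.id = id; this.name = name; }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof KeyedPerson)) return false;
            KeyedPerson o = (KeyedPerson) obj;
            return id == o.id && Objects.equals(name, o.name);
        }

        // Equal objects must produce equal hash codes.
        @Override
        public int hashCode() { return Objects.hash(id, name); }
    }

    // Number of elements a HashSet keeps when equals/hashCode are not overridden.
    static int distinctPlain() {
        List<PlainPerson> list = Arrays.asList(
            new PlainPerson(0, "a"), new PlainPerson(1, "b"), new PlainPerson(0, "a"));
        return new HashSet<>(list).size();
    }

    // Number of elements a HashSet keeps when equals/hashCode compare the data.
    static int distinctKeyed() {
        List<KeyedPerson> list = Arrays.asList(
            new KeyedPerson(0, "a"), new KeyedPerson(1, "b"), new KeyedPerson(0, "a"));
        return new HashSet<>(list).size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPlain()); // 3: every object is its own key
        System.out.println(distinctKeyed()); // 2: the two (0, "a") objects collapse
    }
}
```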

All right, let's test how the two approaches perform:

The code is as follows:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class RemoveDeplicate {

    public static void main(String[] args) {
        //testListSet();
        //removeDuplicateWithOrder(Arrays.asList(data));
        //ArrayList<People> list = new ArrayList<People>(Arrays.asList(ObjData));
        //removeDuplicate(list);

        People[] data = createObjectArray(10000);
        ArrayList<People> list = new ArrayList<People>(Arrays.asList(data));

        // Time the HashSet-based approach.
        long startTime1 = System.currentTimeMillis();
        System.out.println("set start time --> " + startTime1);
        removeDuplicate(list);
        long endTime1 = System.currentTimeMillis();
        System.out.println("set end time --> " + endTime1);
        System.out.println("set total time --> " + (endTime1 - startTime1));
        System.out.println("count : " + People.count);

        // Reset the equals counter, then time the nested-loop approach.
        People.count = 0;
        long startTime = System.currentTimeMillis();
        System.out.println("Efficient start time --> " + startTime);
        EfficientRemoveDup(data);
        long endTime = System.currentTimeMillis();
        System.out.println("Efficient end time --> " + endTime);
        System.out.println("Efficient total time --> " + (endTime - startTime));
        System.out.println("count : " + People.count);
    }

    public static void removeDuplicate(List<People> list) {
        HashSet<People> set = new HashSet<People>(list);
        list.clear();
        list.addAll(set);
    }

    public static void removeDuplicateWithOrder(List<String> arlList) {
        Set<String> set = new HashSet<String>();
        List<String> newList = new ArrayList<String>();
        for (Iterator<String> iter = arlList.iterator(); iter.hasNext();) {
            String element = iter.next();
            // set.add returns true only the first time an element is seen.
            if (set.add(element))
                newList.add(element);
        }
        arlList.clear();
        arlList.addAll(newList);
    }

    @SuppressWarnings("serial")
    private static void testListSet() {
        List<String> arrays = new ArrayList<String>() {
            @Override
            public boolean add(String e) {
                // Linear scan: reject the element if an equal one is already present.
                for (String str : this) {
                    if (str.equals(e)) {
                        System.out.println("add failed !!! duplicate element");
                        return false;
                    }
                }
                // No equal element was found, so the element can be added.
                System.out.println("add succeeded !!!");
                return super.add(e);
            }
        };

        arrays.add("a"); arrays.add("b"); arrays.add("c"); arrays.add("b");
        for (String e : arrays)
            System.out.print(e);
    }

    private static void EfficientRemoveDup(People[] peoples) {
        //Object[] originalArray; // again, pretend this contains our original data
        int count = 0;
        // new temporary array to hold non-duplicate data
        People[] newArray = new People[peoples.length];
        // current index in the new array (also the number of non-dup elements)
        int currentIndex = 0;
        // loop through the original array...
        for (int i = 0; i < peoples.length; ++i) {
            // contains => true iff newArray contains originalArray[i]
            boolean contains = false;
            // search through newArray to see if it contains an element equal
            // to the element in originalArray[i]
            // (j <= currentIndex also visits the still-empty slot, which is null;
            // equals rejects it, so j < currentIndex would be enough)
            for (int j = 0; j <= currentIndex; ++j) {
                count++;
                // if the same element is found, don't add it to the new array
                if (peoples[i].equals(newArray[j])) {
                    contains = true;
                    break;
                }
            }
            // if we didn't find a duplicate, add the new element to the new array
            if (!contains) {
                // note: you may want to use a copy constructor, or a .clone()
                // here if the situation warrants more than a shallow copy
                newArray[currentIndex] = peoples[i];
                ++currentIndex;
            }
        }
        System.out.println("efficient medthod inner count : " + count);
    }

    private static People[] createObjectArray(int length) {
        int num = length;
        People[] data = new People[num];
        Random random = new Random();
        for (int i = 0; i < num; i++) {
            // Random ids in [0, 10000), so duplicates are likely.
            int id = random.nextInt(10000);
            System.out.print(id + " ");
            data[i] = new People(id, "i am a man");
        }
        return data;
    }
}

Test results:

set end time --> 1326443326724
set total time --> 26
count : 3653
Efficient start time --> 1326443326729
efficient medthod inner count : 28463252
Efficient end time --> 1326443327107
Efficient total time --> 378
count : 28463252
