好得很程序员自学网

<tfoot draggable='sEl'></tfoot>

Java ArrayList.add 的实现方法

arraylist 是平时相当常用的list实现, 其中 boolean add(e e) 的实现比较直接:

?

1

2

3

4

5

6

7

8

9

10

11

/**

  * appends the specified element to the end of this list.

  *

  * @param e element to be appended to this list

  * @return <tt>true</tt> (as specified by {@link collection#add})

  */

public boolean add(e e) {

   ensurecapacityinternal(size + 1 ); // increments modcount!!

   elementdata[size++] = e;

   return true ;

}

有时候也使用 void add(int index, e element) 把元素插入到指定的 index 上. 在jdk中的实现是:

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

/**

  * inserts the specified element at the specified position in this

  * list. shifts the element currently at that position (if any) and

  * any subsequent elements to the right (adds one to their indices).

  *

  * @param index index at which the specified element is to be inserted

  * @param element element to be inserted

  * @throws indexoutofboundsexception {@inheritdoc}

  */

public void add( int index, e element) {

   rangecheckforadd(index);

 

   ensurecapacityinternal(size + 1 ); // increments modcount!!

   system.arraycopy(elementdata, index, elementdata, index + 1 ,

            size - index);

   elementdata[index] = element;

   size++;

}

略有差别, 需要保证当前 elementdata 数组容量够用, 然后把从 index 处一直到尾部的数组元素都向后挪一位. 最后把要插入的元素赋给数组的 index 处.

一直以来, 我都认为 system.arraycopy 这个native方法, 它的c++实现是调用底层的 memcpy , 直接方便, 效率也没问题.

但今天看了openjdk的源码发现并非如此.

以openjdk8u60 为例, 在 objarrayklass.cpp 中:

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

void objarrayklass::copy_array(arrayoop s, int src_pos, arrayoop d,

                 int dst_pos, int length, traps) {

  assert (s->is_objarray(), "must be obj array" );

 

  if (!d->is_objarray()) {

   throw (vmsymbols::java_lang_arraystoreexception());

  }

 

  // check is all offsets and lengths are non negative

  if (src_pos < 0 || dst_pos < 0 || length < 0 ) {

   throw (vmsymbols::java_lang_arrayindexoutofboundsexception());

  }

  // check if the ranges are valid

  if ( (((unsigned int ) length + (unsigned int ) src_pos) > (unsigned int ) s->length())

    || (((unsigned int ) length + (unsigned int ) dst_pos) > (unsigned int ) d->length()) ) {

   throw (vmsymbols::java_lang_arrayindexoutofboundsexception());

  }

 

  // special case. boundary cases must be checked first

  // this allows the following call: copy_array(s, s.length(), d.length(), 0).

  // this is correct, since the position is supposed to be an 'in between point', i.e., s.length(),

  // points to the right of the last element.

  if (length== 0 ) {

   return ;

  }

  if (usecompressedoops) {

   narrowoop* const src = objarrayoop(s)->obj_at_addr<narrowoop>(src_pos);

   narrowoop* const dst = objarrayoop(d)->obj_at_addr<narrowoop>(dst_pos);

   do_copy<narrowoop>(s, src, d, dst, length, check);

  } else {

   oop* const src = objarrayoop(s)->obj_at_addr<oop>(src_pos);

   oop* const dst = objarrayoop(d)->obj_at_addr<oop>(dst_pos);

   do_copy<oop> (s, src, d, dst, length, check);

  }

}

可以看到copy_array在做了各种检查之后, 最终copy的部分在do_copy方法中, 而这个方法实现如下:

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

// either oop or narrowoop depending on usecompressedoops.

template < class t> void objarrayklass::do_copy(arrayoop s, t* src,

                 arrayoop d, t* dst, int length, traps) {

 

  barrierset* bs = universe::heap()->barrier_set();

  // for performance reasons, we assume we are that the write barrier we

  // are using has optimized modes for arrays of references. at least one

  // of the asserts below will fail if this is not the case.

  assert (bs->has_write_ref_array_opt(), "barrier set must have ref array opt" );

  assert (bs->has_write_ref_array_pre_opt(), "for pre-barrier as well." );

 

  if (s == d) {

   // since source and destination are equal we do not need conversion checks.

   assert (length > 0 , "sanity check" );

   bs->write_ref_array_pre(dst, length);

   copy::conjoint_oops_atomic(src, dst, length);

  } else {

   // we have to make sure all elements conform to the destination array

   klass* bound = objarrayklass::cast(d->klass())->element_klass();

   klass* stype = objarrayklass::cast(s->klass())->element_klass();

   if (stype == bound || stype->is_subtype_of(bound)) {

    // elements are guaranteed to be subtypes, so no check necessary

    bs->write_ref_array_pre(dst, length);

    copy::conjoint_oops_atomic(src, dst, length);

   } else {

    // slow case: need individual subtype checks

    // note: don't use obj_at_put below because it includes a redundant store check

    t* from = src;

    t* end = from + length;

    for (t* p = dst; from < end; from++, p++) {

     // xxx this is going to be slow.

     t element = *from;

     // even slower now

     bool element_is_null = oopdesc::is_null(element);

     oop new_val = element_is_null ? oop( null )

                    : oopdesc::decode_heap_oop_not_null(element);

     if (element_is_null ||

       (new_val->klass())->is_subtype_of(bound)) {

      bs->write_ref_field_pre(p, new_val);

      *p = element;

     } else {

      // we must do a barrier to cover the partial copy.

      const size_t pd = pointer_delta(p, dst, (size_t)heapoopsize);

      // pointer delta is scaled to number of elements (length field in

      // objarrayoop) which we assume is 32 bit.

      assert (pd == (size_t)( int )pd, "length field overflow" );

      bs->write_ref_array((heapword*)dst, pd);

      throw (vmsymbols::java_lang_arraystoreexception());

      return ;

     }

    }

   }

  }

  bs->write_ref_array((heapword*)dst, length);

}

可以看到, 在设定了heap barrier之后, 元素是在for循环中被一个个挪动的. 做的工作比我想象的要多.

如果有m个元素, 按照给定位置, 使用arraylist.add(int,e)逐个插入到一个长度为n的arraylist中, 复杂度应当是o(m*n), 或者o(m*(m+n)), 所以, 如果m和n都不小的话, 效率确实是不高的.

效率高一些的方法是, 建立m+n长度的数组或arraylist, 在给定位置赋值该m个要插入的元素, 其他位置依次赋值原n长度list的元素. 这样时间复杂度应当是o(m+n).

还有, 在前面的实现中, 我们可以看到有对ensurecapacityinternal(int) 的调用. 这个保证数组容量的实现主要在:

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

/**

  * increases the capacity to ensure that it can hold at least the

  * number of elements specified by the minimum capacity argument.

  *

  * @param mincapacity the desired minimum capacity

  */

private void grow( int mincapacity) {

   // overflow-conscious code

   int oldcapacity = elementdata.length;

   int newcapacity = oldcapacity + (oldcapacity >> 1 );

   if (newcapacity - mincapacity < 0 )

     newcapacity = mincapacity;

   if (newcapacity - max_array_size > 0 )

     newcapacity = hugecapacity(mincapacity);

   // mincapacity is usually close to size, so this is a win:

   elementdata = arrays.copyof(elementdata, newcapacity);

}

大家知道由于效率原因, arraylist容量增长不是正好按照要求的容量mincapacity来设计的, 新容量计算的主要逻辑是: 如果要求容量比当前容量的1.5倍大, 就按照要求容量重新分配空间; 否则按当前容量1.5倍增加. 当然不能超出integer.max_value了. oldcapacity + (oldcapacity >> 1) 实际就是当前容量1.5倍, 等同于(int) (oldcapacity * 1.5), 但因这段不涉及浮点运算只是移位, 显然效率高不少.

所以如果arraylist一个一个add元素的话, 容量是在不够的时候1.5倍增长的. 关于1.5这个数字, 或许是觉得2倍增长太快了吧. 也或许有实验数据的验证支撑.

关于这段代码中出现的arrays.copyof这个方法, 实现的是重新分配一段数组, 把elementdata赋值给新分配的空间, 如果新分配的空间大, 则后面赋值null, 如果分配空间比当前数组小则截断. 底层还是调用的system.arraycopy.

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持。

原文链接:https://segmentfault测试数据/a/1190000016910760

查看更多关于Java ArrayList.add 的实现方法的详细内容...

  阅读:12次