LightWeightGSet

原创
2019/08/24 11:56
阅读数 329

为了降低保存block的内存开销,同时保证访问效率,namenode使用 LightWeightGSet这一数据结构。

LightWeightGSet同HashMap一样使用数组 + 链表的结构,但是有几点不同:

  1. 数组长度初始化时就确定了,以后不会再变化,所有没有rehash。
  2. 本质上不是key/value存储结构。value是key的子类,所以value自身就包含了key的信息,通过key找到value的位置,但是只存储value,不会再开辟单独的存储空间存储key,因此就比HashMap更节省空间,但是访问效率跟HashMap一下。 详细设计、对比可以参看:https://issues.apache.org/jira/secure/attachment/12445502/GSet20100525.pdf

LightWeightGSet成员变量及构造方法:

//链表。数组元素的类型
  public static interface LinkedElement {
    /** Set the next element. */
    public void setNext(LinkedElement next);

    /** Get the next element. */
    public LinkedElement getNext();
  }

  static final int MAX_ARRAY_LENGTH = 1 << 30; //prevent int overflow problem
  static final int MIN_ARRAY_LENGTH = 1;

  //数组
  private final LinkedElement[] entries;
  /** A mask for computing the array index from the hash value of an element. 取值为数组长度-1 */
  private final int hash_mask;
  /** The size of the set (not the entry array). */
  private int size = 0;
  /** Modification version for fail-fast.
   * @see ConcurrentModificationException
   */
  private int modification = 0;


//创建LightWeightGSet时,根据recommended_length确定数组的长度,一旦确定了就不会再改变
  public LightWeightGSet(final int recommended_length) {
    final int actual = actualArrayLength(recommended_length);
    if (LOG.isDebugEnabled()) {
      LOG.debug("recommended=" + recommended_length + ", actual=" + actual);
    }
    entries = new LinkedElement[actual];
    hash_mask = entries.length - 1;
  }

  //compute actual length
  private static int actualArrayLength(int recommended) {
    if (recommended > MAX_ARRAY_LENGTH) {
      return MAX_ARRAY_LENGTH;
    } else if (recommended < MIN_ARRAY_LENGTH) {
      return MIN_ARRAY_LENGTH;
    } else {
      final int a = Integer.highestOneBit(recommended);
      return a == recommended? a: a << 1;
    }
  }
  //省略
  }

 

基本操作:

public E put(final E element) {
    //validate element
    if (element == null) {
      throw new NullPointerException("Null element is not supported.");
    }
    if (!(element instanceof LinkedElement)) {
      throw new HadoopIllegalArgumentException(
          "!(element instanceof LinkedElement), element.getClass()="
          + element.getClass());
    }
    final LinkedElement e = (LinkedElement)element;

    //通过hash取于 得到element在数组上的位置
    final int index = getIndex(element);
  
    //如果已近存在就删除
    final E existing = remove(index, element);

    modification++;
    size++;
    //将element加到index位置链表的头部
    e.setNext(entries[index]);
    entries[index] = e;

    return existing;
  }
  
//hash取与的方式获取key在数组中的小标位置
private int getIndex(final K key) {
    return key.hashCode() & hash_mask;
}

//删除数组中index位置的链表上的K为key的元素
private E remove(final int index, final K key) {
    if (entries[index] == null) {
      return null;
    } else if (entries[index].equals(key)) {
      //remove the head of the linked list
      modification++;
      size--;
      final LinkedElement e = entries[index];
      entries[index] = e.getNext();
      e.setNext(null);
      return convert(e);
    } else {
      //head != null and key is not equal to head
      //search the element
      LinkedElement prev = entries[index];
      for(LinkedElement curr = prev.getNext(); curr != null; ) {
        if (curr.equals(key)) {
          //found the element, remove it
          modification++;
          size--;
          prev.setNext(curr.getNext());
          curr.setNext(null);
          return convert(curr);
        } else {
          prev = curr;
          curr = curr.getNext();
        }
      }
      //element not found
      return null;
    }
  }

//删除元素key  
public E remove(final K key) {
    //validate key
    if (key == null) {
      throw new NullPointerException("key == null");
    }
    return remove(getIndex(key), key);
  }

 

block存储:BlocksMap使用LightWeightGSet来存储blcok,BlockManager在创建BlocksMap时指定使用namenode总内存的2%来作为LightWeightGSet的容量

//capacity为namenode总内存空间的2%,计算过程间BlockManager的构造方法
BlocksMap(int capacity) {
    this.capacity = capacity;
    this.blocks = new LightWeightGSet<Block, BlockInfoContiguous>(capacity) {
      @Override
      public Iterator<BlockInfoContiguous> iterator() {
        SetIterator iterator = new SetIterator();
        /*
         * Not tracking any modifications to set. As this set will be used
         * always under FSNameSystem lock, modifications will not cause any
         * ConcurrentModificationExceptions. But there is a chance of missing
         * newly added elements during iteration.
         */
        iterator.setTrackModification(false);
        return iterator;
      }
    };
  }

 

展开阅读全文
打赏
0
0 收藏
分享
加载中
该评论暂时无法显示,详情咨询 QQ 群:912889742
更多评论
打赏
1 评论
0 收藏
0
分享
返回顶部
顶部