MXQuery 0.6.0 API Documentation - Copyright 2006-2009 ETH Zurich

ch.ethz.mxquery.sms.btree
Class BTree

java.lang.Object
  extended by ch.ethz.mxquery.sms.btree.BTree

public class BTree
extends java.lang.Object

A simple btree implementation. (key equal or less than pivot -> go right) Has to be taylored to different key and value types manually as Generics would use Complex type (inefficent) instead of native types. We have pointers among leaves and also only store key/value mappings on the leaves. Therefore, this is a B+-tree implementation. The implementation allows us to store duplicate keys in the leaf level of the tree. However, a strict tree interval invariant is kept. To handle duplicate keys, this means that overflows due to duplication are handled by adding special overflow leaves that are not pointed to by any parent node. These leaves are created via splitting, however no pivot is promoted on such a split. The delete strategy implemented in this tree is to free at empty. Thus, there is no logic to merge at nodes at half occupation and nodes may become underutilized. This may be monitored by calculating the utilization at the leaf level.


Field Summary
static int INVALID_KEY_INDEX
          invalid value for atPos index returned from getFirstKeyAfter
 
Constructor Summary
BTree(int k, int k_star)
          Instantiates a new BTree.
 
Method Summary
 void add(DeweyIdentifier key, LinguisticToken value, LeafCarrier leafCarrier)
          Adds a mapping from key to value in the b-tree.
 BTreeNode bulkLoad(PhraseIterator input, int F)
          Bulkloads an empty BTree fill leaves from left to right until certain percentage.
 void close()
          Close this b-tree.
 LinguisticToken get(DeweyIdentifier key, BTreeNode currentRoot)
          Gets the value currently mapped to the given key.
 DeweyIdentifier getFirstKey()
          This method returns the first(smallest) currently in the tree.
 void getFirstKeyAfter(DeweyIdentifier key, Leaf[] inLeaf, int[] atPos)
          This method takes two arrays as out parameters and inserts in them pointers to the leaf and position of the smallest key currently in tree which is either equal to key (if key is present in btree) or smallest key larger than key (if key is not present in btree):
 Leaf getFirstLeaf()
          gets the first leaf of the btree
 DeweyIdentifier getKey(Leaf l, int pos)
          Gets the key value in the given leaf at the given position
 DeweyIdentifier getLastKey()
          This method returns the last(largest) key present in the tree
 void getLastKeyBefore(DeweyIdentifier key, Leaf[] inLeaf, int[] atPos)
          This method takes two arrays as out parameters and inserts in them pointers to the leaf and position of the largest key currently in tree which is either equal to key (if key is present in btree) or largest key smaller than key (if key is not present in btree):
 void getLastKeyBeforeLeafTraversal(DeweyIdentifier toKey, Leaf startLeaf, int startPos, Leaf[] inLeaf, int[] atPos)
          This method takes two arrays as out parameters and inserts in them pointers to the leaf and position of the largest key currently in tree which is either equal to key (if key is present in btree) or largest key smaller than key (if key is not present in btree).
 Leaf getLastLeaf()
           
 LinguisticToken getNext(DeweyIdentifier key, BTreeNode currentRoot)
           
 BTreeNode getRoot()
          Gets the root of the btree
 java.util.Vector getSiblings(DeweyIdentifier from, DeweyIdentifier parent, BTreeNode currentRoot, DeweyIdentifier[] ignoreId)
           
 void printStats()
          prints statistical infos about the tree
 void queryRange(DeweyIdentifier lowKey, DeweyIdentifier highKey, BtreePushOperator results)
          Returns all the values mapped in the given key range through the provided push operator.
 void remove(DeweyIdentifier key)
          Removes all mappings corresponding to the given key from the b-tree.
 void remove(DeweyIdentifier key, LinguisticToken value)
          Removes one instance of the given key-value mapping from the b-tree.
 int size()
           
 java.lang.String toString()
          Prints the root of the tree as a string.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

INVALID_KEY_INDEX

public static final int INVALID_KEY_INDEX
invalid value for atPos index returned from getFirstKeyAfter

See Also:
Constant Field Values
Constructor Detail

BTree

public BTree(int k,
             int k_star)
Instantiates a new BTree. Does not create persister. Has to be set manually afterwards, since every BTree must have a persister.

TODO remove this constructor.

Parameters:
k -
k_star -
Method Detail

bulkLoad

public BTreeNode bulkLoad(PhraseIterator input,
                          int F)
                   throws MXQueryException
Bulkloads an empty BTree fill leaves from left to right until certain percentage. If percentage reached begin next leaf and create parent InternalNode if needed.

Parameters:
input - as Pulloperator of Tuples, sorted by keys
F - denoting the leaf usage
Throws:
MXQueryException

add

public void add(DeweyIdentifier key,
                LinguisticToken value,
                LeafCarrier leafCarrier)
Adds a mapping from key to value in the b-tree. Duplicate mappings are allowed.

Parameters:
key -
value -
leafCarrier -

get

public LinguisticToken get(DeweyIdentifier key,
                           BTreeNode currentRoot)
Gets the value currently mapped to the given key.

Parameters:
key -
currentRoot -

getNext

public LinguisticToken getNext(DeweyIdentifier key,
                               BTreeNode currentRoot)

getSiblings

public java.util.Vector getSiblings(DeweyIdentifier from,
                                    DeweyIdentifier parent,
                                    BTreeNode currentRoot,
                                    DeweyIdentifier[] ignoreId)

getKey

public DeweyIdentifier getKey(Leaf l,
                              int pos)
Gets the key value in the given leaf at the given position

Parameters:
l - leaf in which key is looked up
pos - position inside leaf where key is at
Returns:
the identifier at l, pos

getFirstKeyAfter

public void getFirstKeyAfter(DeweyIdentifier key,
                             Leaf[] inLeaf,
                             int[] atPos)
This method takes two arrays as out parameters and inserts in them pointers to the leaf and position of the smallest key currently in tree which is either equal to key (if key is present in btree) or smallest key larger than key (if key is not present in btree):

Parameters:
key -
inLeaf - points to the leaf where to get the key
atPos - points to the position of the key inside the leaf

getLastKeyBefore

public void getLastKeyBefore(DeweyIdentifier key,
                             Leaf[] inLeaf,
                             int[] atPos)
This method takes two arrays as out parameters and inserts in them pointers to the leaf and position of the largest key currently in tree which is either equal to key (if key is present in btree) or largest key smaller than key (if key is not present in btree):

Parameters:
key - the search key
inLeaf - points to the leaf where to get the key
atPos - points to the position of the key inside the leaf

getLastKeyBeforeLeafTraversal

public void getLastKeyBeforeLeafTraversal(DeweyIdentifier toKey,
                                          Leaf startLeaf,
                                          int startPos,
                                          Leaf[] inLeaf,
                                          int[] atPos)
This method takes two arrays as out parameters and inserts in them pointers to the leaf and position of the largest key currently in tree which is either equal to key (if key is present in btree) or largest key smaller than key (if key is not present in btree). This method implements the same functionality as getLastKeyBefore but uses leaf traversal to find the position of the key.

Parameters:
toKey - the search key
startLeaf -
startPos -
inLeaf - points to the leaf where to get the key
atPos - points to the position of the key inside the leaf

remove

public void remove(DeweyIdentifier key)
Removes all mappings corresponding to the given key from the b-tree.

Parameters:
key -

remove

public void remove(DeweyIdentifier key,
                   LinguisticToken value)
Removes one instance of the given key-value mapping from the b-tree. Note that even if multiple instances of that mapping exist, only a single instance will be removed.

Parameters:
key -
value -

queryRange

public void queryRange(DeweyIdentifier lowKey,
                       DeweyIdentifier highKey,
                       BtreePushOperator results)
Returns all the values mapped in the given key range through the provided push operator. We include values that correspond to the lowKey and also include values that correspond to the highKey.

Parameters:
lowKey -
highKey -
results - PushOperator to be filled.

toString

public java.lang.String toString()
Prints the root of the tree as a string.

Overrides:
toString in class java.lang.Object

close

public void close()
Close this b-tree. That is make it persistent first.


printStats

public void printStats()
prints statistical infos about the tree


size

public int size()
Returns:
number of keys in an initialized this b-tree.

getRoot

public BTreeNode getRoot()
Gets the root of the btree

Returns:
the root of the btree

getFirstLeaf

public Leaf getFirstLeaf()
gets the first leaf of the btree

Returns:
the first leaf of the btree or null if no leaf exists

getLastLeaf

public Leaf getLastLeaf()

getLastKey

public DeweyIdentifier getLastKey()
This method returns the last(largest) key present in the tree

Returns:
the largest key in the tree

getFirstKey

public DeweyIdentifier getFirstKey()
This method returns the first(smallest) currently in the tree. If the tree is empty the method returns null.

Returns:
the smaller key

MXQuery 0.6.0 API Documentation - Copyright 2006-2009 ETH Zurich

MXQuery 0.6.0 API Documentation - Copyright 2006-2009 ETH Zurich