Skip to Content
Problem Solving

3-数据结构

2020-05-18Original-language archivelegacy assets may be incomplete

Dynamic sets

  • operations
    • Search(S, k)
    • Insert(S, x)
    • Delete(S, x)
    • Min/Max(S)
    • Successor(S, x)
    • Precessor(S, x)
  • data structure augmentation
    • choose an underlying data structure
    • determine addtional information to maintain
    • verify it can be maintained
    • develop new operations

优先队列

  • operation
    • Insert(S, x)
    • Max(S)
    • extract_max(S)
    • increase_key(S, x, k)
    • not good for Search

线性数据结构

  • Stack
  • Queue
  • Linked list

  • property
    • max-heap property: A[Parent(i)]\geqA[i]
  • procedure
    • max-heapify O(lgn)O(\lg n)
    • build-max-heap O(n)O(n)
    • max-heap-insert/extract/increase O(lgn)O(\lg n)
def max_heapify(A, i):
    # l, r are already max-heaps
    l = 2 * i
    r = 2 * i + 1
    if l <= A.size and A[l] > A[i]:
        largest = l
    else:
        largest = i
    if r <= A.size and A[r] > A[largest]:
        largest = r
    if largest != i:
        swap(A[i], A[largest])
        mex_heapify(A, largest)

哈希表

  • m slots store n elements
  • hash function: simple uniform hashing
    • h(k)=kmh(k)=\lfloor km\rfloor
    • h(k)=kmodmh(k)=k\bmod m
    • h(k)=m(kAmod1)h(k)=\lfloor m(kA\bmod 1)\rceil
    • universal hashing: P(h(k)=h(l))HmP(h(k)=h(l))\leq\frac{|\mathcal{H}|}{m}
      • hab(k)=((ak+b)modp)modm,H={hab:aZp,bZp}h_{ab}(k)=((ak+b)\bmod p)\bmod m,\mathcal{H}=\{h_{ab}:a\in\mathbb{Z}^*_p,b\in\mathbb{Z}_p\}
  • collision resolution
    • chaining O(1+nm)O(1+\frac{n}{m}) (worst case Θ(n)\Theta(n))
    • open addressing
      • linear probing
      • quadratic probing: h(k,i)=(h(k)+c1i+c2i2)modmh(k,i)=(h'(k)+c_1i+c_2i^2)\bmod m
      • double hashing: h(k,i)=(h1(k)+ih2(k))modmh(k,i)=(h_1(k)+ih_2(k))\bmod m
  • perfect hashing: two level hashing, O(1)O(1)

二叉搜索树

  • property: left < root < right
  • search, min/max, succ/pre: O(h)O(h)
void bst_insert(T, z){
  y = NULL
  x = T.root
  while (x){
    y = x;
    if z.key < x.key
      x = x.left
    else x = x.right
  }
  z.p = y;
  if (!y) T.root = z;
  else if (z.key < y.key) T.left = z;
  else y.right = z;
}
void bst_delete(T, z){
  if (!z.left) transplant(T, z, z.right);
  else if (!z.right) transplant(T, z, z.left);
  else {
    y = bst_min(z.right);
    if (y.parent != z){
      transplant(T, y, y.right);
      y.right = z.right;
      y.right.parent = y;
    }
    transplant(T, z, y);
    y.left = z.left;
    y.left.parent = y;
  }
}
  • randomly built BST has expected h=O(lgn)h=O(\lg n)

红黑树

  • h2log2(n+1)h\leq 2\log_2 (n+1)
  • red-black properties
    • node is red(R) or black(B)
    • root is black
    • leaf(NIL) is black
    • red node has black children
    • all simple paths from the node to descendant leaves contain the same number of black nodes
  • rotation
void left_rotate(T, x){
  y = x.right;
  x.right = y.left;
  if y.left != T.nil
    y.left.p = x
  y.p = x.p
  if x.p == T.nil
    T.root = y
  else if x == x.p.left
    x.p.left = y
  else x.p.right = y
  y.left = x
  x.p = y
}
  • Insertion: insert red + fixup(以下父亲为祖父左结点)
    • case1: uncle is red
    • case2: uncle is black and self is left
    • case3: uncle is black and self is right
  • Delete: delete + (black) fixup
    • case1: sibling is red
    • case2: sibling black and has black children
    • case3: sibling black and has left red, right black
    • case4: sibling black and has right red
  • order-static tree: maintain size
    • retrieve with a given rank O(lgn)O(\lg n)
    • determine rank O(lgn)O(\lg n)

B 树

  • hlogtn+12h\leq \log_t\frac{n+1}{2} with minimum degree t2t\geq 2 and nn keys
    • same examining times for keys
    • lgt\lg t less examining times for nodes
  • properties
    • k1x.k1k2x.kx.nkx.nk_1\leq x.k_1\leq k_2\leq \cdots\leq x.k_{x.n}\leq k_{x.n}
    • All leaves have the same depth
    • Every node other than root has at least t1t-1 keys(tt children). Every node may have at most 2t2t children. Root has at least 1 key.
  • split: O(t)O(t)
  • search/insert/delete: O(th)O(th)
  • B+ tree: 索引仅出现在 leaf,容纳更多索引项

Fibonacci 堆

  • Mergeable heaps (amortized)
    • Insert Θ(1)\Theta(1)
    • Decrease Θ(1)\Theta(1)
    • Union Θ(1)\Theta(1)
    • Other same to heap
  • Fibonacci heap: min-heap ordered (k-ary)
  • Φ(H)=t(H)+2m(H)\Phi(H)=t(H)+2m(H)
    • t(H)t(H): number of trees
    • m(H)m(H): number of marked nodes (lost a child since the last time it was made the child of another node)
  • consolidate: O(D(n))O(D(n))
    • find two roots of same degree, link the more one to another, until every root has a distinct degree value
    • auxiliary array A[0..D(H.n)]A[0..D(H.n)]
  • D(H)=logϕn=O(lgn)D(H)=\lfloor\log_\phi n\rfloor=O(\lg n)
    • node with root degree kk has size Fk+2\geq F_{k+2}

van Emde Boas Tree (vEB tree)

  • dynamic set with values from universe Zu\mathbb{Z}_u
  • direct addressing
    • insert, delete, member: O(1)O(1)
    • min/max, pre/succ: Θ(u)\Theta(u)
  • superimposing a tree of constrant height: O(u)O(\sqrt{u})
  • shrink with u\sqrt{u} gets O(lglgn)O(\lg\lg n) (shrink with 22 gets O(lgn)O(\lg n))
    • T(n)=T(n)+O(1)T(n)=T(\sqrt{n})+O(1) yields T(n)=O(lglgn)T(n)=O(\lg\lg n)
    • T(n)=2T(n)+O(1)T(n)=2T(\sqrt{n})+O(1) yields T(n)=O(lgnlglgn)T(n)=O(\lg n\lg\lg n)
  • vEB node
    • u: number of elements
    • min, max
    • summary: point to a vEB(u\sqrt{u}) node keeping bit vector
    • cluster: point to u\sqrt{u} vEB(u\sqrt{u})
  • vEB tree
    • create empty tree: O(u)O(u)
    • operations: O(lglgn)O(\lg \lg n)
    • space: O(u)O(u)

并查集

  • linked-list: Θ(n2)\Theta(n^2) for union
  • Disjoint-set forests: O(mα(n)),α(7)=2,α(2047)=3,α(16512)=4O(m\alpha(n)),\alpha(7)=2,\alpha(2047)=3,\alpha(16^{512})=4
    • union by rank
    • path compression

TBD

  • 可持久
  • AVL
  • Treaps
  • 线段树
  • Dynamic trees
  • Splay trees