Basics of Hash Table--Data Structure

Intro:

API: Python: dict; Java: HashMap

Applications: File Systems; Password Verification; Storage Optimization

IP Address: 

Main Loop

log - array of log lines (time, IP)
C - mapping from IPs to counters
i - first unprocessed log line
j - first line in current 1h window

i ← 0
j ← 0
C ← ∅
Each second:
    UpdateAccessList(log, i, j, C)

UpdateAccessList(log, i, j, C)

while log[i].time ≤ Now():
    C[log[i].IP] ← C[log[i].IP] + 1
    i ← i + 1
while log[j].time ≤ Now() − 3600:
    C[log[j].IP] ← C[log[j].IP] − 1
    j ← j + 1

AccessedLastHour(IP, C)

return C[IP] > 0
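
The main loop above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the class name AccessLog, the method names, and the representation of a log line as a (time, ip) tuple are assumptions made for illustration, and the bounds checks on i and j are added because a Python list has a finite length.

```python
from collections import defaultdict

WINDOW = 3600  # length of the window in seconds (one hour)


class AccessLog:
    """Counts accesses per IP over the last hour of a time-sorted log."""

    def __init__(self, log):
        self.log = log                  # list of (time, ip), sorted by time
        self.i = 0                      # first unprocessed log line
        self.j = 0                      # first line in the current 1h window
        self.counts = defaultdict(int)  # C: IP -> accesses inside the window

    def update(self, now):
        # Count every line that has arrived up to `now`.
        while self.i < len(self.log) and self.log[self.i][0] <= now:
            self.counts[self.log[self.i][1]] += 1
            self.i += 1
        # Uncount every line that has fallen out of the window.
        while self.j < self.i and self.log[self.j][0] <= now - WINDOW:
            self.counts[self.log[self.j][1]] -= 1
            self.j += 1

    def accessed_last_hour(self, ip):
        return self.counts.get(ip, 0) > 0
```

Calling update(now) once per second corresponds to the "Each second" step; here the current time is passed in explicitly so the behavior is easy to check by hand.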

Direct Addressing

Need a data structure for C
There are 2^32 different IP(v4) addresses
Convert IP to 32-bit integer
Create an integer array A of size 2^32
Use A[int(IP)] as C[IP]

int(IP)

return IP[1]·2^24 + IP[2]·2^16 + IP[3]·2^8 + IP[4]
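
As an illustration of the int(IP) conversion, here is a small Python sketch. The function name ip_to_int and the dotted-quad string input are assumptions for this example; the arithmetic is the same weighted sum of the four bytes.

```python
def ip_to_int(ip):
    """Convert dotted-quad IPv4 text such as '192.168.0.1' to a 32-bit integer."""
    parts = [int(p) for p in ip.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    b1, b2, b3, b4 = parts
    # Each byte is weighted by its position: 2^24, 2^16, 2^8, 2^0.
    return b1 * 2**24 + b2 * 2**16 + b3 * 2**8 + b4


print(ip_to_int("192.168.0.1"))  # 3232235521
```

Python's standard library offers the same conversion through int(ipaddress.IPv4Address("192.168.0.1")); the hand-written version is shown only to mirror the formula.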

UpdateAccessList(log, i, j, A)

while log[i].time ≤ Now():
    A[int(log[i].IP)] ← A[int(log[i].IP)] + 1
    i ← i + 1
while log[j].time ≤ Now() − 3600:
    A[int(log[j].IP)] ← A[int(log[j].IP)] − 1
    j ← j + 1

AccessedLastHour(IP)

return A[int(IP)] > 0

Asymptotics

UpdateAccessList is O(1) per log line
AccessedLastHour is O(1)
But need 2^32 memory even for few IPs
IPv6: 2^128 won’t fit in memory
In general: O(N) memory, N = |S|
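
To make the memory cost concrete, the short calculation below estimates the size of a direct-addressing table. The 4-byte counter size is an assumption chosen for illustration; the source does not state a counter width.

```python
def direct_table_bytes(address_bits, counter_bytes=4):
    """Bytes needed for one counter per possible address (counter size assumed)."""
    return 2**address_bits * counter_bytes


ipv4 = direct_table_bytes(32)
ipv6 = direct_table_bytes(128)
print(ipv4 // 2**30, "GiB for IPv4")  # 16 GiB for IPv4
print(len(str(ipv6)), "decimal digits in the IPv6 byte count")
```

Under this assumption the IPv4 table takes 16 GiB regardless of how few addresses are active, and the IPv6 table is far beyond any physical memory.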

List-based Mapping:

Direct addressing requires too much memory
Let’s store only active IPs
Store them in a list
Store only last occurrence of each IP
Keep the order of occurrence

UpdateAccessList(log, i, L)

while log[i].time ≤ Now():
    log_line ← L.FindByIP(log[i].IP)
    if log_line ≠ NULL:
        L.Erase(log_line)
    L.Append(log[i])
    i ← i + 1
while L.Top().time ≤ Now() − 3600:
    L.Pop()

AccessedLastHour(IP, L)

return L.FindByIP(IP) ≠ NULL
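
A possible Python rendering of the list-based approach is sketched below. The class and method names are assumptions, and collections.deque stands in for the list L so that appending at the back and popping from the front are constant time, while finding and erasing an entry remain linear.

```python
from collections import deque


class RecentAccessList:
    """Keeps the last occurrence of each active IP, oldest first."""

    def __init__(self):
        self.lines = deque()  # entries are (time, ip), at most one per IP

    def _find_by_ip(self, ip):
        # Linear scan: Theta(n) in the number of active IPs.
        for idx, (_, line_ip) in enumerate(self.lines):
            if line_ip == ip:
                return idx
        return None

    def record(self, time, ip):
        idx = self._find_by_ip(ip)
        if idx is not None:
            del self.lines[idx]       # erase the older occurrence
        self.lines.append((time, ip))

    def expire(self, now, window=3600):
        # Drop entries from the front while they are outside the window.
        while self.lines and self.lines[0][0] <= now - window:
            self.lines.popleft()

    def accessed_last_hour(self, ip):
        return self._find_by_ip(ip) is not None
```

Here record plays the role of the first loop body of UpdateAccessList and expire plays the role of the second loop.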

Asymptotics

n is number of active IPs
Memory usage is Θ(n)
L.Append, L.Top, L.Pop are Θ(1)
L.Find and L.Erase are Θ(n)
UpdateAccessList is Θ(n) per log line
AccessedLastHour is Θ(n)

Encoding IPs

Encode IPs with small numbers
I.e. numbers from 0 to 999
Different codes for currently active IPs 

Hash Function

Definition

For any set of objects S and any integer m > 0, a function h: S → {0, 1, ..., m − 1} is called a hash function.

Definition

m is called the cardinality of hash function h.

Desirable Properties

h should be fast to compute.
Different values for different objects.
Direct addressing with O(m) memory.
Want small cardinality m.
Impossible to have all different values if number of objects |S| is more than m.

Collisions

Definition

When h(o1) = h(o2) and o1 ≠ o2, this is a collision.
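
The sketch below illustrates these definitions with a toy hash function on integers. The choice h(x) = x mod m with m = 1000 is an assumption made only for the example, and find_collision is a hypothetical helper; the point is that any m + 1 distinct keys must contain a collision.

```python
M = 1000  # cardinality of the toy hash function


def h(x, m=M):
    """Toy hash function on integers: maps x into {0, 1, ..., m - 1}."""
    return x % m


def find_collision(keys, hash_fn):
    """Return the first pair of distinct keys with equal hash values, or None."""
    seen = {}  # hash value -> first key that produced it
    for key in keys:
        code = hash_fn(key)
        if code in seen and seen[code] != key:
            return seen[code], key
        seen[code] = key
    return None


print(find_collision(range(M), h))      # None: 1000 keys fit in 1000 values
print(find_collision(range(M + 1), h))  # (0, 1000): one more key forces a collision
```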

Map

Store mapping from objects to other objects:

   Filename → location of the file on disk
   Student ID → student name
   Contact name → contact phone number

Definition

Map from S to V is a data structure with methods HasKey(O), Get(O), Set(O, v), where O ∈ S, v ∈ V.

h : S → {0, 1, ..., m − 1}
O, O′ ∈ S
v, v′ ∈ V
A ← array of m lists (chains) of pairs (O, v)

HasKey(O)

L ← A[h(O)]
for (O′, v′) in L:
    if O′ == O:
        return true
return false

Get(O)

L ← A[h(O)]
for (O′, v′) in L:
    if O′ == O:
        return v′
return n/a

Set(O, v)

L ← A[h(O)]
for p in L:
    if p.O == O:
        p.v ← v
        return
L.Append(O, v)
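
The three map operations can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the class name ChainedMap is hypothetical, Python's built-in hash reduced modulo m stands in for h, and None stands in for "n/a".

```python
class ChainedMap:
    """Map stored as an array of m chains of (key, value) pairs."""

    def __init__(self, m=8, h=hash):
        self.m = m
        self.h = h                                # assumed hash function
        self.chains = [[] for _ in range(m)]      # A: array of m lists

    def _chain(self, key):
        # Reduce the hash into {0, ..., m - 1}; Python's % is non-negative here.
        return self.chains[self.h(key) % self.m]

    def has_key(self, key):
        return any(k == key for k, _ in self._chain(key))

    def get(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        return None  # stands in for "n/a"

    def set(self, key, value):
        chain = self._chain(key)
        for idx, (k, _) in enumerate(chain):
            if k == key:
                chain[idx] = (key, value)         # overwrite the existing pair
                return
        chain.append((key, value))
```

With a small m such as 2, several keys share a chain, which makes it easy to see that lookups still return the right value after collisions.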

Set

Definition

Set is a data structure with methods Add(O), Remove(O), Find(O).

Examples

IPs accessed during last hour
Students on campus
Keywords in a programming language

h : S → {0, 1, ..., m − 1}
O, O′ ∈ S
A ← array of m lists (chains) of objects O

Find(O)

L ← A[h(O)]
for O′ in L:
    if O′ == O:
        return true
return false

Add(O)

L ← A[h(O)]
for O′ in L:
    if O′ == O:
        return
L.Append(O)

Remove(O)

if not Find(O):
    return
L ← A[h(O)]
L.Erase(O)
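
A matching Python sketch of the set operations is given below. As before, the class name ChainedSet is hypothetical and Python's built-in hash reduced modulo m is assumed in place of h.

```python
class ChainedSet:
    """Set stored as an array of m chains of objects."""

    def __init__(self, m=8, h=hash):
        self.m = m
        self.h = h                                # assumed hash function
        self.chains = [[] for _ in range(m)]      # A: array of m lists

    def _chain(self, obj):
        return self.chains[self.h(obj) % self.m]

    def find(self, obj):
        return obj in self._chain(obj)            # linear scan of one chain

    def add(self, obj):
        chain = self._chain(obj)
        if obj not in chain:                      # keep each object once
            chain.append(obj)

    def remove(self, obj):
        chain = self._chain(obj)
        if obj in chain:
            chain.remove(obj)
```

Each operation touches only the chain selected by the hash value, so its cost depends on the length of that chain rather than on the total number of stored objects.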

Hash Table:

Definition

An implementation of a set or a map using hashing is called a hash table.

Programming Language:

Set:
