User: Password:
|
|
Subscribe / Log in / New account

Algorithms for loading history file

Algorithms for loading history file

Posted Jun 1, 2005 16:15 UTC (Wed) by Blaisorblade (guest, #25465)
In reply to: Algorithms for loading history file by liljencrantz
Parent article: Fish - The friendly interactive shell

Ok, probably the search algorithm is ok...
For the hash function, SHA1 has almost 0 probability of collision and gives a 160 bits hash; to avoid collisions between around 5000 entries, it's really overkill. Actually a good hash function for non-cryptographic purposes (assuming you go compare entries with the same hash) is of the followind kind:

int hash(char a[], int size, int prime) {
  char * p = a;
  int sum = 0;
  while (*p) {
    sum = sum * prime;
    sum = (sum + *p++) % size;
    //printf("%d ", sum);
  }
  return sum;
}

it returns an integer in the range [0..size); (and prime must be a prime number, which usually you choose at compile time). prime=131 is a good choice. It is easy to add this code to build a testing program with the below (send size and prime on the first line, then a test file, and then sort|uniq -c|count the average number of collisions on used hash codes with awk).

char str[1000];

int main(void) {
  int size, prime;
  scanf("%d %d", &size, & prime);
  while (scanf("%s", str) != EOF) {
    printf("%d\n", hash(str, size, prime));
  }
}


(Log in to post comments)


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds