TODO



- keep a history of installed packages/journal of package transactions,
  so we can roll back to yesterday, or see what got installed in the
  latest yum update.

- we build a cache of the currently installed set to service
  dependency inquiries fast:

	map from property to pkg (as hash) providing it
	map from property to pkgs requiring it
	map from pkg name to manifest
	map from string to string pool index

	no implicit provides? not even pkgname?

- properties are strings, stored in a string table

- on disk maps are binary files of (string table index, hash) pairs

- at run time, we mmap the map, and keep changes in memory in a splay
  tree or similar.  if searching the splay tree fails we punt to the
  mmap.  once the transaction is done, we merge the map and the splay
  tree and write it back out.
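
  a minimal sketch of the lookup and the final merge, under some loud
  assumptions: a small sorted array stands in for the splay tree, keys
  are unique within each array, and struct map_entry, map_find,
  map_lookup and map_merge are names made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One map entry: a (string table index, hash) pair.  Field names are
 * made up for this sketch. */
struct map_entry {
	uint32_t key;	/* string table index */
	uint32_t value;	/* package hash */
};

/* Binary search a key-sorted array; returns NULL when key is absent. */
static const struct map_entry *
map_find(const struct map_entry *map, size_t count, uint32_t key)
{
	size_t lo = 0, hi = count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (map[mid].key < key)
			lo = mid + 1;
		else if (map[mid].key > key)
			hi = mid;
		else
			return &map[mid];
	}

	return NULL;
}

/* Look in the in-memory changes first, then punt to the mmap'ed base. */
static const struct map_entry *
map_lookup(const struct map_entry *changes, size_t n_changes,
	   const struct map_entry *base, size_t n_base, uint32_t key)
{
	const struct map_entry *e = map_find(changes, n_changes, key);

	return e ? e : map_find(base, n_base, key);
}

/* Merge base and changes into out, which must have room for
 * n_base + n_changes entries.  On equal keys the change wins.
 * Returns the number of entries written. */
static size_t
map_merge(const struct map_entry *base, size_t n_base,
	  const struct map_entry *changes, size_t n_changes,
	  struct map_entry *out)
{
	size_t i = 0, j = 0, n = 0;

	while (i < n_base && j < n_changes) {
		if (base[i].key < changes[j].key) {
			out[n++] = base[i++];
		} else {
			if (base[i].key == changes[j].key)
				i++;	/* superseded by the change */
			out[n++] = changes[j++];
		}
	}
	while (i < n_base)
		out[n++] = base[i++];
	while (j < n_changes)
		out[n++] = changes[j++];

	return n;
}
```

  the merge is a single linear pass, so writing the map back out costs
  about as much as reading it once.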

- the on-disk string pool is sorted and we keep a list of indices into
  the string pool in sorted order so we can bsearch the list with a
  string to get its string pool index.  maybe a hash table is better,
  less I/O as we will expect to find the string within the block we
  look up with the hash function.
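
  the bsearch variant could look roughly like this.  the layout is an
  assumption (NUL-terminated strings back to back, plus a list of
  offsets ordered by strcmp), and struct string_pool and
  string_pool_find are hypothetical names.  the sketch only relies on
  the offset list being sorted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: pool holds NUL-terminated strings back to back,
 * sorted[] holds the offset of each string, ordered by strcmp(). */
struct string_pool {
	const char *pool;
	const uint32_t *sorted;
	size_t count;
};

/* Returns the pool offset of str, or UINT32_MAX if it isn't there. */
static uint32_t
string_pool_find(const struct string_pool *sp, const char *str)
{
	size_t lo = 0, hi = sp->count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int c = strcmp(sp->pool + sp->sorted[mid], str);

		if (c < 0)
			lo = mid + 1;
		else if (c > 0)
			hi = mid;
		else
			return sp->sorted[mid];
	}

	return UINT32_MAX;
}
```

  every probe touches both the offset list and the pool, which is the
  I/O cost the hash table alternative would try to avoid.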

- represent all files as a breadth first traversal of the tree of all
  files.  each entry has its name (string pool index), the number of
  immediate children, total number of children, and owning package.
  for files both these numbers are zero.  a file is identified by its
  index in this flattened tree.

  to get the file name from an index, we search through the list.  by
  summing up the number of children, we know when to skip a directory
  and when to descend into one.  as we go we accumulate the path
  elements.

  hmm, dropping number of immediate children and using a sentinel drops
  a word from every entry.
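
  a sketch of the index-to-name walk.  it assumes a layout where every
  entry is followed directly by all of its descendants, so the total
  child count is enough to skip a subtree; a strictly breadth first
  layout would need different arithmetic.  struct file_entry and
  file_path are made-up names, and the name is a plain pointer here
  instead of a string pool index to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical entry for the flattened tree. */
struct file_entry {
	const char *name;
	uint32_t total;		/* total number of children, 0 for files */
	uint32_t package;	/* owning package */
};

/* Assumes entries[0] is the root directory, every entry is followed
 * directly by all of its descendants, and index is within the tree.
 * Writes the path of entries[index] into buf and returns 0, or -1 if
 * buf is too small.  The root itself yields the empty string. */
static int
file_path(const struct file_entry *entries, uint32_t index,
	  char *buf, size_t size)
{
	uint32_t dir = 0, child;
	size_t len = 0;
	int n;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	while (dir != index) {
		/* Walk the children of dir, skipping whole subtrees
		 * until we reach the one that contains index. */
		child = dir + 1;
		while (index > child + entries[child].total)
			child += entries[child].total + 1;

		/* Descend and accumulate the path element. */
		n = snprintf(buf + len, size - len, "/%s",
			     entries[child].name);
		if (n < 0 || (size_t) n >= size - len)
			return -1;
		len += (size_t) n;
		dir = child;
	}

	return 0;
}
```

  the walk visits at most one sibling run per directory level, so the
  cost depends on depth and directory width rather than on the total
  number of files.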

- signed pkgs

- gzip repository of look-aside pkg xml files somehow?

- transactions, proper recovery, make sure we don't poop our package
  database (no more rm /var/lib/rpm/__cache*).

- diff from one package set to another answers: "what changed in
  rawhide since yesterday?"

- rewrite qsort and bsearch so they don't require a global context var
  and can output a map describing the permutation.
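
  one possible shape for the sort, as a sketch only.  it swaps in a
  stable merge sort for quicksort, passes the context through a data
  pointer, and computes just the permutation map while leaving the
  array itself untouched.  sort_with_map, sort_range and the example
  comparator are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int (*compare_with_data_fn)(const void *a, const void *b,
				    void *data);

/* Stable merge sort of perm[lo..hi) using tmp as scratch space. */
static void
sort_range(uint32_t *perm, uint32_t *tmp, size_t lo, size_t hi,
	   const char *base, size_t size,
	   compare_with_data_fn compare, void *data)
{
	size_t mid = lo + (hi - lo) / 2, i = lo, j = mid, k = lo;

	if (hi - lo < 2)
		return;
	sort_range(perm, tmp, lo, mid, base, size, compare, data);
	sort_range(perm, tmp, mid, hi, base, size, compare, data);

	while (i < mid && j < hi) {
		if (compare(base + perm[j] * size,
			    base + perm[i] * size, data) < 0)
			tmp[k++] = perm[j++];
		else
			tmp[k++] = perm[i++];
	}
	while (i < mid)
		tmp[k++] = perm[i++];
	while (j < hi)
		tmp[k++] = perm[j++];
	memcpy(perm + lo, tmp + lo, (hi - lo) * sizeof *perm);
}

/* Fill perm[] so that element perm[i] of base is the i'th element in
 * sorted order.  tmp is caller-provided scratch of count entries. */
static void
sort_with_map(const void *base, size_t count, size_t size,
	      compare_with_data_fn compare, void *data,
	      uint32_t *perm, uint32_t *tmp)
{
	size_t i;

	for (i = 0; i < count; i++)
		perm[i] = (uint32_t) i;
	sort_range(perm, tmp, 0, count, base, size, compare, data);
}

/* Example comparator: elements are offsets into the string pool
 * passed as data, so no global variable is needed. */
static int
compare_pool_strings(const void *a, const void *b, void *data)
{
	const char *pool = data;

	return strcmp(pool + *(const uint32_t *) a,
		      pool + *(const uint32_t *) b);
}
```

  applying perm[] to the array afterwards gives the sorted order, and
  the same map can be used to renumber anything that refers to the
  old positions.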

- use hash table for package and property lists so we only store
  unique lists (like for string pool).

- use existing, running system as repo; eg

	razor update razor://other-box.local evince

  to pull eg the latest evince and dependencies from another box.  We
  should be able to regenerate a rzr pkg from the system so we can
  reuse the signature from the originating repo.

- Ok, maybe the fastest package set merge method in the end is to use
  the razor_importer, but use a hash table for the properties.  This
  way we can assign them unique IDs immediately (like tokenizing
  strings).
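
  a rough sketch of the tokenizing step, not the razor_importer code.
  struct tokenizer, tokenizer_init and tokenize are hypothetical names;
  the table uses linear probing with a djb2 hash, never resizes, and
  stores the caller's string pointers without copying them.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOKEN_NONE UINT32_MAX

/* Hypothetical tokenizer: buckets map a string hash to a token ID,
 * strings[] maps the ID back to the string. */
struct tokenizer {
	uint32_t *buckets;	/* n_buckets slots, power of two */
	uint32_t n_buckets;
	const char **strings;	/* caller owns the strings */
	uint32_t count;
};

static uint32_t
hash_string(const char *s)
{
	uint32_t h = 5381;	/* djb2 */

	while (*s)
		h = h * 33 + (unsigned char) *s++;

	return h;
}

/* n_buckets must be a power of two.  Returns 0 on success. */
static int
tokenizer_init(struct tokenizer *t, uint32_t n_buckets)
{
	t->buckets = malloc(n_buckets * sizeof *t->buckets);
	t->strings = malloc(n_buckets * sizeof *t->strings);
	if (!t->buckets || !t->strings) {
		free(t->buckets);
		free(t->strings);
		return -1;
	}
	/* All bits set in every slot means TOKEN_NONE, i.e. empty. */
	memset(t->buckets, 0xff, n_buckets * sizeof *t->buckets);
	t->n_buckets = n_buckets;
	t->count = 0;

	return 0;
}

/* Return the ID for str, assigning the next free one on first sight.
 * Returns TOKEN_NONE when the table is full; one slot always stays
 * empty so the probe loop terminates. */
static uint32_t
tokenize(struct tokenizer *t, const char *str)
{
	uint32_t mask = t->n_buckets - 1, i = hash_string(str) & mask;

	while (t->buckets[i] != TOKEN_NONE) {
		if (strcmp(t->strings[t->buckets[i]], str) == 0)
			return t->buckets[i];
		i = (i + 1) & mask;
	}
	if (t->count == t->n_buckets - 1)
		return TOKEN_NONE;
	t->strings[t->count] = str;
	t->buckets[i] = t->count;

	return t->count++;
}
```

  IDs come out in first-seen order, so they are available immediately
  during the import and a sort is only needed if the final pool has
  to be ordered.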

- bash completion for 'razor install gtk2-<TAB>' or
  'razor install /usr/bin/gtk-perf<TAB>'

- test suite should be easy, just keep .repo files around and test
  different types of upgrades that way (obsoletes, conflicts, file
  conflicts, file/dir problems etc).  Or maybe just keep a simple file
  format and use a custom importer to create the .repo files.

- pipelined download and install; download is network bound, install
  is disk bound.  Start installing once we have a self-contained set of
  packages.  Install in reverse topo-sort order.  Interruptible
  installation; stops at nearest checkpoint.

- make package pointers be either an index into the package pool or a
  direct link to a package when there is only one package.  set a high
  bit to indicate which it is.  similar for properties.