TODO


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201

Towards replacing rpm + yum (0.1):

- drop the filelists from the main package set file, split out to a
  secondary file.  for package sets that depend on other package sets,
  we need to be able to generate properties with owning packages that
  are in another set.  this way, a package that requires a file, will
  look up the provides in the set and find the package that owns it
  and then try mark that for update.

- installer part:

   - pre install check; check that dirs can be created (no files where
     want to create dirs), move config files according to file
     flags. (.rpmnew etc)

   - store rpm headers for installed packages.

   - implement rpm uninstall and update.

   - triggers? just say no?

- rpm seems to consider glibc > 2.6.90 to mean greater than
  2.6.90-anything.  That is, a comparison that doesn't mention the
  release field, shouldn't regard the release field of pkgs it
  compares against.  glibc-common-2.6.90 has

	conflicts: glibc < 2.6.90, glibc > 2.6.90

  since rpm doesn't let you do glibc != 2.6.90, and

	requires: glibc = 2.6.90

  will always pull in glibc.  But even with a != relation, would
  glibc-2.6.90-16 be equal to 2.6.90?  glibc 2.7.90-8 dropped it in
  favor of requires = 2.7.90-8 (#225806).

- signed packages

- space calculation before transaction, but ideally, do a number of
  smaller transactions.

- pre-link changing binaries and libs on disk screwing up checksum?

- pipelined download and install; topo-sort packages in update set,
  pick one with all deps in the current set, add it to the current set
  and satisfy deps against update set => result: minimal update
  transaction.  Queue download and install/update transaction for the
  packages in the minimal set, start over.  This also makes the
  installation phase much more interruptible, basically just stop
  after a sub-transaction finishes.  As we keep the update set around
  as a target, we can restart if needed.  Probably don't need to, can
  just do a new update.  During a sub-transaction we should keep the
  target set (i.e. the current set to be) around as a lock file
  (system.repo.lock or so, see git) so that razor updates are
  prevented if the systems crashes during an update.

- implement depsolving between multiple package sets by creating an
  iterator that has a sorted list of all installed pkgs from all sets,
  all installed requires from all sets, all installed provides from
  all sets etc.  could be a list of tuples (pkgs index, set index).
  should simplify even the two-set depsolving a bit since we can
  pretend there's just one set.  this should also be useful for the
  'overlay set' idea where the system set is actually made up of a
  number of sets, but typically a read-only set from a read-only fs
  and a read-write set from a r/w fs.

- locking: we use advisory file locking on the system set
  (/var/lib/razor/system.repo) to indicate a transaction is in
  progress.  The locking algorithm is as follows:

    1. obtain advisory lock on system set.  if this is already taken,
       we know that a process is actively modifying the system set and
       we have to wait.  there's a fcntl that lets you block for the
       lock to go away.

    2. if a system-next.repo file already exists an earlier razor
       process was interrupted or crashed and we may want to clean
       that up.  the system-next.repo file will record what the
       previous instance was trying to do and we can just replay that
       to clean up.

    3. create the new package set whichever way and write it to
       system-next.repo, then start installing/removing rpms.

    4. When the update is complete, rename system-next.repo to
       system.repo and remove the advisory lock.

  we should probably introduce a new object that encapsulates this
  sequence, the filename conventions, rpm cache, e.g. struct
  razor_image, with operations such as

	#define RAZOR_IMAGE_READ	0x01
	#define RAZOR_IMAGE_WRITE	0x02

	struct razor_image *
	razor_image_open(const char *root, unsigned int flags);

	int
	razor_image_begin_transaction(struct razor_image *image,
				      struct razor_set *target);

	int
	razor_image_finish_transaction(struct razor_image *image);

  the transaction pipelineing described above sits on top of this,
  since each step there needs to complete a full transaction that
  writes out a new package set.

  for overlay package sets we could do something like

	struct razor_image *
	razor_image_open_with_base(const char *root, unsigned int flags,
				   struct razor_image *base);

  where base specifies the r/o package set it's layered on.  this
  allows for stacking several layers of images.


Package set file format items:

- drop the 4k section alignment

- just use strings for header identifiers, make the string pool
  section have a fixed string (maybe make "strings" always the first
  string so its index is 0), or maybe just require that it's the first
  section in the file.

- nail down byte-order of repo file.

- version the sections in the file, put the element size in the header
  so we can add stuff to elements in a backwards compatible way.
  maybe not necessary, we can just add sections that augment the
  sections we want to add to (similar to how rpm has add versioned
  deps).


Misc ideas:

- keep history of installed packages/journal of package transaction,
  so we can roll back to yesterday, or see what got installed in the
  latest yum update.

- transactions, proper recovery, make sure we don't poop our package
  database (no more rm /var/lib/rpm/__cache*).

- use hash table for package and property lists so we only store
  unique lists (like for string pool).

- use existing, running system as repo; eg

	razor update razor://other-box.local evince

  to pull eg the latest evince and dependencies from another box.  We
  should be able to regenerate a rzr pkg from the system so we can
  reuse the signature from the originating repo.

- Ok, maybe the fastest package set merge method in the end is to use
  the razor_importer, but use a hash table for the properties.  This
  way we can assign them unique IDs immediately (like tokenizing
  strings).

- test suite should be easy, just keep .repo files around and test
  different type of upgrades that way (obsoletes, conflicts, file
  conflicts, file/dir problems etc).  Or maybe just keep a simple file
  format ad use a custom importer to create the .repo files.

- try to clean up the

	do { ... } while (((e++)->name & RAZOR_ENTRY_LAST) == 0);

  idiom for iteration of directories.

- overlay package sets?  mount a read-only /usr over nfs or from the
  virt-host and have a local package set overlaid over the read-only
  one.  shouldn't need new features in the core package set data
  structure, but should be just conventions on top.  we have the base
  package set from the r/o system, the overlay set from the local
  system and we can have an effective package set which is the merge
  of everything from the overlay into the base set.  the effective set
  is easy to compute and we could do it on the fly or cache it.  or
  maybe the effective set is the on-disk representation and the
  overlay can be computed when needed, we just keep a link back to the
  base.

- incremental rawhide repo updates? instead of downloading 10MB zipped
  repo every time, download a diff repo?  Should be pretty small,
  especially if we don't have file checksums in metadata.  Filenames
  and properties are for the most part already present, typically just
  a version bump plus maybe tweaking a couple requires.  The upstream
  repo can store multiple incremental updates in one big file and
  provide an index file that maps updates for a given date (we should
  use repo-file checksums though) to a range in the file: Download the
  index file, search for a match for your latest rawhide.repo file,
  download range of updates that brings it up to date.

- use hash tables for dirs when importing files to avoid qsorting all
  files in rawhide.

Bugs:

- eliminate duplicate entries in package property lists.