isopackIntroductionisopack is a command line tool for filtering a list of files to pack in an ISO image to try and produce the largest image that will still fit on a given media. This is a very hard problem (see Knapsack problem) and isopack has a variety of algorithms you can use to attempt to find a very good packing without taking too much time (at least compared to the time writing a BD-R disk takes, for instance :-). No doubt other algorithms exists which might work better, the current collection was just what I came up with over a few days of fooling around with packing algorithms. Note: I'm putting this out there with no license restrictions. Do what you will with it. (Especially don't ask me if you can use it: The answer is "Yes." :-). DownloadYou can download the C++ source for the tool here: Usageisopack [options...] < infile > outfile isopack reads a modified form of the mkisofs
-graft-points -path-list filename list on
In addition to the filename lines, you can group files you always want to stay together and files that must be included in the packing. Lines that consist of only an open bracket (optionally followed by an exclamation point) or only a close bracket mark the beginning and end of groups. The open bracket with the exclamation marks a group of files that must be included in the output. For example: [! README=/file/collection/README ] defines a group with a single README file which will definitely be included in the resulting packing. On the other hand: [ Movie.avi=/file/collection/Movie.avi Movie.srt=/file/collection/Movie.srt MoviePoster.jpg=/file/collection/MoviePoster.jpg ] defines a group of files which do not have to be included in the output list, but if any one of them is included, all three will be included. Lines in the input do not have to be inside the The group syntax is not included in the output, so the output can be fed directly to mkisofs. Note that isopack requires individual files to be listed in this input, it will not (for instance) accept a line naming a directory and expand that into all the files in the directory. If you want directory contents included in the packing, you need to list all the individual files in those directories. Warning: Burning software often decides to format media by default, so the size you think might fit will not actually fit. Certainly when I tried k3b using growisofs to write a BD-R.SL disk, it ran out of space at the 98% point when I used a packing that should have fit in an unformatted disc. The dvd+rw-mediainfo tool is available on most linux distros, and it will show you all the different recommended sizes for different standard formattings of media. You may need to pick one of the smaller sector sizes instead of the unformatted size in order to use some burning software. Options-help Print usage information for the program. You may also pass the word help following the equal (=) in any of the options below to get detailed information about that option. -algorithm=name Name the algorithm to use to try and find a packing. This option may be specified multiple times, and multiple algorithms will be tried.
-time=seconds Specify the maximum time in seconds any one algorithm is allowed to spend trying to find a packing. You can suffix the time value with m to mean minutes, or h to mean hours. Note that if you specify multiple algorithms, each one will be given this much time. -close=sectors Specify how many sectors are close enough to be willing to use the new packing just found rather than take longer searching for a better one. This can be used to make the search finish faster than the value given on the -time option. -waste=sectors Specify how many sectors you are willing to waste on the media. After isopack has completed all normal processing, it will check this at exit and exit with an error status if the packing wastes more sectors than this (might be useful for scripting). -watchdog=seconds Accepts the same format time values as the -time= option, but instead of timing the entire run of the packing algirithm, acts as a watchdog timer and resets the timeout each time a better packing is found, so something like -watchdog=30 would stop looking for new packings after it spent 30 futile seconds trying to find a better one. Note: Aside from the -time, -close, and -watchdog arguments, you can also stop the running algorithm by sending a SIGINT signal to the isopack process. -verbose=level Specify the verbosity level as an
integer value. Zero means be quiet. One updates you about new
packings as they are found. Two or more prints random details about
what is going on (all this output is to -sectors=size Specify the size of ISO image to generate. The size parameter may either be a number of 2048 byte sectors, or the name of one of the predefined image sizes:
-prog=file Give the full path to the mkisofs executable to use for determining exact size of ISO images. -args=string Give the argument string to be passed to mkisofs. Arguments that are required by isopack will be added to the beginning and end of the string like so:
The default string (only without the line wrapping) contains:
-addarg=string Rather than setting the entire argument string, you can use this to simply add additional arguments to the end of the list (you can use this option multiple times to add multiple strings). Note: Not all versions of mkisofs are the same, you may
need to (for instance) add the -infile=filename Instead of reading the path list
from -outfile=filename Instead of writing the resulting
path list to -unpacked=filename Write the groups from the input
path list which were not included in the packing to this file
(including any of the ResultsI have a collection of movies and other big files I'm been using
as test data (and the Running just ./isopack with no arguments, it only takes a fraction of a second to run and comes up with this packing: |
Best packing found contains 25022370117 bytes of data in 57 files. Unused sectors on media: 926 of 12219392 (99.992% full) Packing found in 0 seconds by simple algorithm.
Trying ./isopack -algorithm=luck0 -time=4m gives: |
Best packing found contains 25024414966 bytes of data in 7 files. Unused sectors on media: 2 of 12219392 (100.000% full) Packing found in 9 seconds by luck algorithm.
Luck, being random, found a really good packing in the first 9 seconds, then never found a better one in the remaining 3 minutes and 51 seconds. It isn't always so quick (but this is the sort of reason I added the -close option). Finally, trying ./isopack -algorithm=brute -time=4m gives: |
Best packing found contains 25024415614 bytes of data in 8 files. Unused sectors on media: 1 of 12219392 (100.000% full) Packing found in 59 seconds by brute algorithm.
That's a better packing than luck found, but it took longer (not that the results would always come out that way). I also have some actual results: I was able to pack a BD-R.SL with only 2 unused sectors and write it with k3b using the cdrtools back end rather than growisofs. k3b still had some kind of silly problem trying to verify the data, but sha256sum produced matching checksums for files on the BD-R and the hard disk, so the write did seem to work perfectly and not truncate any files. I saved the log from that k3b run and I should be able to look at the commands it used and produce a mostly automated command line script to write my backups when I collect enough files to want to dump some to BD-R. Work FlowIn the tarball with the source, I also have some sample scripts showing how I plan to use isopack (I've done this a bit at a time so far, but now I've put together scripts to automate things, and hopefully they will work when I get enough files to need a new backup). pack-dir is a perl script that turns a directory tree into the sort of path list that can be passed to isopack as input. You can hand edit the file after it is created to define any groups you might want to add, etc. A sample run might look like: prompt> pack-dir /disk2/Movies > packlist.txt pack-and-write is a shell script that runs isopack,
mkisofs, and cdrecord to automate the packing and media
writing process. It hasn't really been tested at all, and would need
extensive personalization to be suitable for use on another machine
with a different burner, etc. Consider it an example that may or may
not actually work as it is. After making any hand edits to the above
|
prompt> pack-and-write packlist.txt "MOVIES-2011-12-13" 2>&1 | pack-progress
pack-check is a script intended to be run to compute
checksums for the files on the mounted media after it is written and
compare those checksums to the same files on the original hard disk
directory. After writing the above media, putting it back in the
computer, and mounting it at prompt> pack-check /media/MOVIES-2011-12-13 /disk2/Movies That will tell you if all the checksums match. If the backup was
successful, you can now remove the files in
References
RevisionsSaturday, Jun 16th 2012 Removed the dev= arg from cdrecord command in pack-and-write script since my hardware changed a bit and cdrecord seems to find the device without me naming it. Saturday, Apr 28th 2012 Fixed problems with the pack-check script and files with spaces in the name. Added the pack-progress program to use to overwrite all the progress messages from cdrecord onto a single line. Friday, Jan 13th, 2012 Finally accumulated enough data to try the pack-and-write script, found some typos in the script and a SEGFAULT in isopack. Realized the -waste option would be useful, so I added it. (The script did work after the fixes though). |