MergeSvnDumps.pl - a script for merging and filtering svn repository dump files


# merge and filter dumpfiles
MergeSvnDumps.pl [ -f filterFile ] dumpFile1 [ dumpFile2 ... ]


MergeSvnDumps.pl is a Perl script for filtering and merging Subversion repository dumps generated with svnadmin dump. It provides extended filtering through node-path transformations that use Perl's regex engine (svndumpfilter, which is distributed with Subversion, doesn't even support wildcards, though there is a patch in the tracking system for this). It also enables prioritized merging of version histories from multiple repository dump files while keeping revision commits in proper chronological order (The svnadmin load switch imports data to an existing repository, but does not interleave revisions. This breaks the

MergeSvnDumps.pl uses "SVN::Dump 0.03" to parse svn dump files.

Prioritized Merging

MergeSvnDumps.pl can merge any number of dump files at a time. Provided that the revisions within each dump file are in chronological order, the revisions in the output will also be in chronological order.

Node conflicts are resolved using a system of dump priority. Dump files specified later on the command line have a higher priority. Once a node is touched in one dump file, nodes in lower-priority dump files with the same path will be ignored. A warning is issued in such a case.
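
The priority rule can be illustrated with a minimal sketch. It is not the script's actual code and does not use the SVN::Dump API: the in-memory revision structure and the names merge_dumps, prio, seq and owner are hypothetical, and a plain sort stands in for a streaming merge. It also assumes that a lower-priority dump may still touch a path until a higher-priority dump touches it.

```perl
use strict;
use warnings;

# Hypothetical in-memory model, not the SVN::Dump API: each dump is an array
# of revisions in chronological order, each with a date and a list of paths.
sub merge_dumps {
    my @dumps = @_;    # command-line order: later dumps have higher priority
    my @queue;
    for my $prio (0 .. $#dumps) {
        my $seq = 0;
        push @queue,
            map { +{ %$_, prio => $prio, seq => $seq++ } } @{ $dumps[$prio] };
    }

    # ISO 8601 UTC timestamps sort chronologically as plain strings.
    @queue = sort {
               $a->{date} cmp $b->{date}
            || $a->{prio} <=> $b->{prio}
            || $a->{seq}  <=> $b->{seq}
    } @queue;

    my (%owner, @merged, @warnings);
    for my $rev (@queue) {
        my @kept;
        for my $path (@{ $rev->{nodes} }) {
            # Skip the node if a higher-priority dump already touched this path.
            if (exists $owner{$path} && $owner{$path} > $rev->{prio}) {
                push @warnings, "ignoring $path from dump $rev->{prio}";
                next;
            }
            $owner{$path} = $rev->{prio};
            push @kept, $path;
        }
        push @merged, { date => $rev->{date}, nodes => \@kept };
    }
    return (\@merged, \@warnings);
}

# Made-up sample data: two dumps, the second has the higher priority.
my @main = (
    { date => '2007-01-01T10:00:00Z', nodes => ['trunk/a.txt'] },
    { date => '2007-01-03T10:00:00Z', nodes => ['trunk/a.txt', 'trunk/b.txt'] },
);
my @side = ( { date => '2007-01-02T10:00:00Z', nodes => ['trunk/a.txt'] } );

my ($merged, $warnings) = merge_dumps(\@main, \@side);
print "$_->{date}: @{ $_->{nodes} }\n" for @$merged;
print "warning: $_\n" for @$warnings;
```

In this sample, the second revision of the first dump keeps trunk/b.txt but loses trunk/a.txt, because the higher-priority dump touched that path in between.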

Filter Transformations

Nodes can be filtered or transformed using Perl regular expressions, which are read from a separate file specified with the "-f" switch. The format of a filter specification is as follows:

*dumpFile*: "*searchRegex*","*replaceString*"

This is best demonstrated by example.

mainRepo:      ".*\.zip$",""
personal:      "^(career/cv\..*)?.*","$1"
documentation: ".*(?i)subversion","collab/$&"
documentation: "^(?!collab/)","doc/"

In this example, transformations are specified for three dump files. All *.zip files in the mainRepo dump file are removed from the output, since nodes whose path becomes empty are dropped. Only nodes matching the path glob career/cv.* will be included from the personal dump file. Two rules are given for nodes coming from the documentation dump file; they are applied in the order listed. Any node whose path contains the word subversion (case-insensitive match) will be moved into a subdirectory collab/. All other nodes will then be moved to the subdirectory doc/.
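
The effect of these rules on individual paths can be checked with a minimal sketch. This is not the script's actual code: apply_rule and transform_path are hypothetical helpers, the sample paths are made up, and the sketch assumes that each rule is a single (non-global) substitution and that a node is dropped as soon as its path becomes empty.

```perl
use strict;
use warnings;

# Apply one search/replace rule to a path, expanding "$&" and "$1", "$2", ...
# in the replacement template from the captures of the match.
sub apply_rule {
    my ($path, $search, $template) = @_;

    # The (?:...) wrapper also avoids Perl's special case for empty patterns.
    return $path unless $path =~ /(?:$search)/;

    my ($start, $len) = ($-[0], $+[0] - $-[0]);
    my @cap = map {
        defined $-[$_] ? substr($path, $-[$_], $+[$_] - $-[$_]) : ''
    } 0 .. $#+;

    (my $text = $template)
        =~ s{\$(&|\d+)}{ $1 eq '&' ? $cap[0] : ($cap[$1] // '') }ge;

    substr($path, $start, $len, $text);    # replace the matched span
    return $path;
}

# Apply rules in order; undef means the node is dropped.
sub transform_path {
    my ($path, @rules) = @_;
    for my $rule (@rules) {
        $path = apply_rule($path, @$rule);
        return undef if $path eq '';    # assumption: stop at an empty path
    }
    return $path;
}

# The rules from the example above, keyed by dump file.
my %rules = (
    mainRepo      => [ [ '.*\.zip$', '' ] ],
    personal      => [ [ '^(career/cv\..*)?.*', '$1' ] ],
    documentation => [ [ '.*(?i)subversion', 'collab/$&' ],
                       [ '^(?!collab/)',     'doc/' ] ],
);

# Made-up sample paths.
for my $case ( [ mainRepo      => 'src/lib.zip' ],
               [ personal      => 'career/cv.tex' ],
               [ personal      => 'photos/a.jpg' ],
               [ documentation => 'notes/Subversion-tips.txt' ],
               [ documentation => 'manual/intro.txt' ] ) {
    my ($dump, $path) = @$case;
    my $new = transform_path($path, @{ $rules{$dump} });
    printf "%-14s %-26s => %s\n", $dump, $path, defined $new ? $new : '(dropped)';
}
```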

Note: Any amount of space after the colon is acceptable, but the search and replace strings must be separated by exactly one comma, with no space on either side of it.
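
A filter line in this format could be parsed roughly as follows. This is a sketch under assumptions, not the script's actual parser: parse_filter_line is a hypothetical name, and the pattern simply rejects any line that does not match the format described in the note.

```perl
use strict;
use warnings;

# Parse one filter line of the form  name: "search","replace"
# Returns a hash reference, or undef if the line does not fit the format.
sub parse_filter_line {
    my ($line) = @_;
    return undef unless $line =~ /^\s*([^:]+):\s*"(.*)","(.*)"\s*$/;
    return { dump => $1, search => $2, replace => $3 };
}

for my $line ( 'mainRepo:      ".*\.zip$",""',
               'personal:      "^(career/cv\..*)?.*","$1"',
               'personal: "^a", "b"' ) {    # space after the comma: rejected
    my $rule = parse_filter_line($line);
    if ($rule) {
        print "$rule->{dump}: search=[$rule->{search}] replace=[$rule->{replace}]\n";
    }
    else {
        print "not a valid filter line: $line\n";
    }
}
```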

Directory Autovivification

Notice that in the example above, the nodes in the documentation dump file are moved to new locations. It's possible that the new paths doc/ and collab/ don't exist in any of the repository dump files. In such a case, MergeSvnDumps steps in and provides additional records to add the missing directories.
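
The bookkeeping behind this can be sketched as follows. The helper missing_parents and the hash of known directories are hypothetical, and the sketch only computes which directories are missing; the actual script emits dump records for them.

```perl
use strict;
use warnings;

# Return the parent directories of $path that are not yet in %$known,
# shallowest first, and mark them as known so they are reported only once.
sub missing_parents {
    my ($path, $known) = @_;
    my @parts = split m{/}, $path;
    pop @parts;                          # drop the node's own name
    my (@missing, $dir);
    for my $part (@parts) {
        $dir = defined $dir ? "$dir/$part" : $part;
        next if $known->{$dir}++;        # already exists or already added
        push @missing, $dir;
    }
    return @missing;
}

my %known = ( trunk => 1 );              # directories already in the output
for my $path ( 'doc/manual/intro.txt',
               'doc/manual/faq.txt',
               'collab/notes/Subversion-tips.txt' ) {
    my @dirs = missing_parents($path, \%known);
    print "$path: ", ( @dirs ? "add @dirs" : "nothing to add" ), "\n";
}
```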

See Also


Copyright and License

Copyright 2007 Aryeh Leib Taurog

This is free software. It is available under the same license as Perl.

This software is provided AS IS with no warranty as to its usability or fitness for any purpose. I disclaim all responsibility for any damage whatsoever resulting from the download or use of this software.