Migration job is executed and the output data group is significantly larger than the original data group


Version: R69.1

Article ID: PE000139

Description

When executing a migration job for a data group, the migrated data group is significantly larger than the original data group.

Summary

When a migration job is executed, a new datagrp.mdb is created, and the migrated data group is built with new document IDs and new image file names.

Both PaperVision Capture and PaperFlow can create ‘duplicate documents’ (also known as detail sets), which allow one set of images/pages to be indexed with several different values. This uses less hard drive space for storing the images, because the additional document IDs all reference the same images.

PaperVision Enterprise/ImageSilo does not track duplicate documents any differently from standard documents, so when a migration job is executed, each duplicate document (detail set) in the data group is migrated with its own new copy of the images.

For example, if the original data group contains one duplicate document whose images use 5 MB of space, the migrated data group could be 5 MB larger than the original, because the duplicate document becomes its own set of images that takes up an additional 5 MB.

In cases where the source data group contains a large number of duplicate documents, the migrated data group may be significantly larger. For example, suppose a single 90-page document in a data group is duplicated 999 times, for a total of 1,000 documents but only 90 image files and a size of 4 MB. The migrated data group ends up with 1,000 documents and 90,000 image files, with a size of around 4 GB.
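
The arithmetic behind these examples can be sketched in a few lines of Python. This is only an illustrative estimate, not part of PaperVision Enterprise or ImageSilo: the ImageSet class and the estimate_migrated function are hypothetical names, and the sketch assumes that every document ID receives a full copy of its images during migration, as described above.

```python
from dataclasses import dataclass


@dataclass
class ImageSet:
    """Hypothetical model of one set of page images shared by document IDs."""
    pages: int           # number of image files in the set
    size_mb: float       # disk space used by the set, in MB
    document_count: int  # document IDs referencing the set (1 = no duplicates)


def estimate_migrated(image_sets):
    """Estimate (documents, image_files, size_mb) after migration.

    Assumes each document ID gets its own copy of the images, so every
    per-set figure is multiplied by the number of referencing documents.
    """
    documents = sum(s.document_count for s in image_sets)
    image_files = sum(s.pages * s.document_count for s in image_sets)
    size_mb = sum(s.size_mb * s.document_count for s in image_sets)
    return documents, image_files, size_mb


# The 90-page document duplicated 999 times (1,000 document IDs in total).
source = [ImageSet(pages=90, size_mb=4.0, document_count=1000)]
print(estimate_migrated(source))  # (1000, 90000, 4000.0)
```

The result of 4,000 MB matches the roughly 4 GB observed for the migrated data group, compared with 4 MB for the original.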