PVCS Conversion (1)

I'm trying to get a source control repository out of PVCS, with full revision history and metadata (author, date, comments, version labels). Because we're still doing Visual Basic 6.0 maintenance, we can't use Subversion, EVS, or newer products from Serena (the publisher of PVCS). VB6 uses a binary resource file per UI element (form, user control, report), so we need per-file locking and versioning. Branching would allow simultaneous changes to the same binary file, and those can't be merged. And if we had repository-wide versions, we couldn't leave "unfinished" code out of the QA/production builds without branching.

First a few definitions:

Project Database
Most systems would call this a "repository". It's just a container for a group of related projects. This is what you connect to.
Project
The root or a subfolder of a project database. Subfolders within a project may be a subproject (e.g. a DLL associated with an EXE), or just a subdirectory (e.g. a web site's /images directory).
Archive
A file under source control, e.g. frmMain.vb.
Revision
One checked-in modification to an archive.

PVCS has a physical directory structure that mostly matches the project structure. However, a project can include an archive that's defined in a different project. Shared forms, classes, etc. allow code reuse, with changes reflected once the project is compiled. We don't have a reliable list of shared archives, so it's possible to make a change in one project that runs fine, only to find you've broken the compile of another project that happens to share the file you edited. We'd like to make it clearer which files are shared as part of the conversion process.
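
Once you have a list of (project path, archive path) pairs, finding the shared archives is a simple grouping problem. Here is a minimal sketch in Python, not the code I actually ran. The function name, the sample paths, and the case-insensitive comparison are all assumptions for illustration.

```python
from collections import defaultdict


def find_shared_archives(pairs):
    """Return {archive_path: [project_path, ...]} for every archive that is
    referenced by more than one project."""
    projects_by_archive = defaultdict(set)
    for project_path, archive_path in pairs:
        # Assumption: archives live on a Windows share, so two paths that
        # differ only in case point at the same file.
        projects_by_archive[archive_path.lower()].add(project_path)
    return {
        archive: sorted(projects)
        for archive, projects in projects_by_archive.items()
        if len(projects) > 1
    }


# Hypothetical sample data: two projects that both include modUtil.bas.
pairs = [
    ("/Billing", r"\\pvcs\archives\Billing\frmMain.frm-arc"),
    ("/Billing", r"\\pvcs\archives\Common\modUtil.bas-arc"),
    ("/Payroll", r"\\pvcs\archives\Payroll\frmMain.frm-arc"),
    ("/Payroll", r"\\pvcs\archives\Common\modUtil.bas-arc"),
]

shared = find_shared_archives(pairs)
for archive, projects in shared.items():
    print(archive, "is shared by", ", ".join(projects))
```

Anything that shows up in this report is a file where an edit in one project can break the compile of another.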

I didn't know about the difference between archive paths and project paths at first, so I solved the archive extraction first. With network access to the PVCS server, the PVCS DTK lets you walk an archive directory structure recursively, find all archives, and populate structures with all the archive and revision information. Note that this DTK is provided in C, so I got to have fun with double-null-terminated strings. That's on top of the culture shock, for a developer used to memory-managed languages, of having to allocate a buffer and pass both the pointer and the length into a function just to get a string back out.
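
For anyone who hasn't met them, a double-null-terminated string is a list of strings packed into one buffer: each string ends with a null, and an empty string (a second null in a row) ends the list. The sketch below shows the unpacking logic in Python rather than C. It does not call the DTK, and the sample buffer contents and the ASCII encoding are assumptions.

```python
from typing import List


def split_double_null(buffer: bytes, encoding: str = "ascii") -> List[str]:
    """Split a double-null-terminated buffer into its component strings.

    Anything after the terminating empty string is ignored, which matters
    when the caller allocated a buffer larger than the data written to it.
    """
    strings = []
    start = 0
    while start < len(buffer):
        end = buffer.find(b"\x00", start)
        if end == -1:
            raise ValueError("buffer ends without a null terminator")
        if end == start:
            break  # an empty string marks the end of the list
        strings.append(buffer[start:end].decode(encoding))
        start = end + 1
    return strings


# Hypothetical buffer: three revision numbers, then leftover bytes that
# were never overwritten in the caller's fixed-size allocation.
raw = b"1.0\x001.1\x001.2\x00\x00leftover"
revisions = split_double_null(raw)
print(revisions)
```

The same loop in C is where the fun comes in, because a plain strlen or strcpy stops at the first null and silently drops everything after the first entry.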

Anyway, that problem mostly solved, I found that the archive list didn't contain files I knew were in the projects. I'd gotten used to all programming against PVCS being difficult, so I just looked at the raw pvcsproj.ser files, which turned out to be serialized Java objects. I could figure out the data I needed by eyeballing it, but my first few attempts at parsing it programmatically weren't very successful. So I asked for a little help with a RegEx. That's when Ben said "there has to be an easier way. Doesn't PVCS do this for you?"

D'oh.

The PVCS Command Line Interface (PCLI) does have a way to list all projects, and to list all archives within a project. It might have something that would do that all in one step, but I really prefer data structures to parsing randomly-formatted multi-value strings in stdout. So I redirected the output of the "get all projects" command to a text file, one project per line. Then I wrote a batch program using "FOR /F" and Delayed Expansion. It created a directory structure to mirror the project paths, and dumped a list of the archive paths & file names in each project to an "archive.txt" in each directory.
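
The batch file itself isn't reproduced here, but the logic is easy to sketch in Python. Treat this as an illustration under a few assumptions: project paths are slash-separated, and the per-project archive listing comes from a caller-supplied function (list_archives is a hypothetical stand-in for the PCLI call, which this sketch does not attempt to reproduce). The sample paths are made up.

```python
import os
import tempfile


def mirror_projects(project_paths, output_root, list_archives):
    """Create one directory per project path under output_root and write
    that project's archive list to an archive.txt inside it."""
    for project_path in project_paths:
        project_path = project_path.strip()
        if not project_path:
            continue  # skip blank lines in the project list
        parts = [p for p in project_path.split("/") if p]
        target_dir = os.path.join(output_root, *parts)
        os.makedirs(target_dir, exist_ok=True)
        listing = os.path.join(target_dir, "archive.txt")
        with open(listing, "w", encoding="utf-8") as out:
            for archive in list_archives(project_path):
                out.write(archive + "\n")


def fake_list_archives(project_path):
    """Hypothetical stand-in for asking PCLI which archives a project holds."""
    sample = {
        "/Billing": [r"\\pvcs\archives\Billing\frmMain.frm-arc"],
        "/Billing/Reports": [r"\\pvcs\archives\Billing\Reports\rptInvoice.rpt-arc"],
    }
    return sample.get(project_path, [])


with tempfile.TemporaryDirectory() as root:
    mirror_projects(["/Billing", "/Billing/Reports", ""], root, fake_list_archives)
    reports_file = os.path.join(root, "Billing", "Reports", "archive.txt")
    with open(reports_file, encoding="utf-8") as f:
        reports_listing = f.read().splitlines()
    print(reports_listing)
```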

I'm much quicker in memory-managed languages, and I wanted to get out of my VB6 rut, so I wrote a quick VB2008 app to recurse through those directories, and create a single tab-delimited file with one line per archive, containing the project path relative to the Project DB (repository) and the archive path & file name. This can be imported into a DB, or parsed directly in code.
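
The VB2008 app isn't included here, so here is a rough Python equivalent of what it does: walk the mirrored directories, read each "archive.txt", and write one tab-delimited line per archive. The function name, the output file name, and the sample archive names are assumptions.

```python
import csv
import os
import tempfile


def collect_archives(root, output_file):
    """Walk root, read every archive.txt, and write one tab-delimited
    (project path, archive path) row per archive. Returns the row count."""
    rows = 0
    with open(output_file, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t")
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # visit subdirectories in a predictable order
            if "archive.txt" not in filenames:
                continue
            rel = os.path.relpath(dirpath, root)
            # Project path relative to the Project DB, slash-separated.
            project_path = "/" if rel == "." else "/" + rel.replace(os.sep, "/")
            with open(os.path.join(dirpath, "archive.txt"), encoding="utf-8") as f:
                for line in f:
                    archive = line.strip()
                    if archive:
                        writer.writerow([project_path, archive])
                        rows += 1
    return rows


# Build a small hypothetical tree, then flatten it into one file.
with tempfile.TemporaryDirectory() as work:
    root = os.path.join(work, "projects")
    os.makedirs(os.path.join(root, "Billing", "Reports"))
    with open(os.path.join(root, "Billing", "archive.txt"), "w", encoding="utf-8") as f:
        f.write("frmMain.frm-arc\nmodUtil.bas-arc\n")
    with open(os.path.join(root, "Billing", "Reports", "archive.txt"), "w", encoding="utf-8") as f:
        f.write("rptInvoice.rpt-arc\n")

    output_file = os.path.join(work, "archives.tsv")
    row_count = collect_archives(root, output_file)
    with open(output_file, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f, delimiter="\t"))
    print(row_count, rows)
```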

I'm currently trying to port the revision information application from Visual C++ 6.0 to Visual C++ 2008, so I can put it into a .NET-compatible DLL and use VB.NET or C# to store everything in a database.

My preference for this project is to get everything into a SQL Server 2008 database, using a FILESTREAM column for the actual revision data. Yes, we're repeating unchanged data, but we have plenty of space for this system, and there are quite a few benefits:

  1. Projects, files, archives, revisions, comments, and versions are relational tables, not some filesystem convention. There's very little room for misinterpretation of directory structures, file names, etc. between developers.
  2. We can easily try several different target environments. We'll probably issue repeated command-line "add" and "commit" calls to CVSNT, or whatever replacement we come up with. This way it's fairly simple to populate parameters from database fields.
  3. If we give up on importing the full history to a new SCCM system, we can still use this database to query anything we need from our archive, without having to maintain an otherwise-unused PVCS / Serena File System server, and appropriate licensing thereof.
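
To make the idea concrete, here is a sketch of what such a schema might look like, and how rows could be turned into command-line arguments. It uses Python's built-in SQLite as a stand-in for SQL Server, with a plain BLOB column where the real design would use FILESTREAM. All table names, column names, and sample rows are assumptions, and the commands are only built as argument lists, never executed.

```python
import sqlite3

SCHEMA = """
CREATE TABLE projects (id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE);
CREATE TABLE archives (id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE);
-- Many-to-many: one archive can be shared by several projects.
CREATE TABLE project_archives (
    project_id INTEGER NOT NULL REFERENCES projects(id),
    archive_id INTEGER NOT NULL REFERENCES archives(id),
    PRIMARY KEY (project_id, archive_id)
);
CREATE TABLE revisions (
    id INTEGER PRIMARY KEY,
    archive_id INTEGER NOT NULL REFERENCES archives(id),
    number TEXT NOT NULL,
    author TEXT NOT NULL,
    checked_in TEXT NOT NULL,   -- ISO 8601 timestamp
    comment TEXT,
    content BLOB NOT NULL,      -- full file contents for this revision
    UNIQUE (archive_id, number)
);
CREATE TABLE version_labels (
    revision_id INTEGER NOT NULL REFERENCES revisions(id),
    label TEXT NOT NULL,
    PRIMARY KEY (revision_id, label)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical sample rows, inserted out of order on purpose.
conn.execute("INSERT INTO projects (id, path) VALUES (1, '/Billing')")
conn.execute("INSERT INTO archives (id, path) VALUES (1, 'Billing/frmMain.frm')")
conn.execute("INSERT INTO project_archives VALUES (1, 1)")
conn.executemany(
    "INSERT INTO revisions (archive_id, number, author, checked_in, comment, content) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "1.1", "dev1", "2008-03-02T09:30:00", "Fix tab order", b"rev 1.1 bytes"),
        (1, "1.0", "dev1", "2008-03-01T14:00:00", "Initial revision", b"rev 1.0 bytes"),
    ],
)


def commit_commands(conn):
    """Build one commit argument list per revision, oldest first."""
    rows = conn.execute(
        "SELECT a.path, r.comment FROM revisions r "
        "JOIN archives a ON a.id = r.archive_id "
        "ORDER BY r.checked_in, r.id"
    )
    return [["cvs", "commit", "-m", comment or "", path] for path, comment in rows]


commands = commit_commands(conn)
for command in commands:
    print(command)
```

Ordering by check-in time rather than revision number avoids string-sort surprises such as "1.10" sorting before "1.9". A real import would also have to write each revision's content to the working copy before committing, which this sketch leaves out.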

Comments

Hi Chris - I am going through the same thing right now, trying to migrate 900+ apps from PVCS to Microsoft TFS. I am writing in C# against the unmanaged libraries that the DTK provides. It is painful. Do you have any code you could share? I may have to resort to the PCLI. Your post is about the only one I found that is relevant to what I am doing. You have pretty much already been through the pain.

Anyway, let me know if you have any code you could share.

Dan

Has anyone been able to develop a tool that can be used to migrate archives from PVCS VM to TFS?
We are planning to move to TFS and the only option I have found is to migrate from VM to SVN and then from SVN to TFS. Not a real good solution.

Thanks, Christopher Montgomery

Same question: Has anyone been able to migrate from PVCS VM to TFS? Or is there any blog, post, or document that describes the process and the pros and cons of this migration?

Thanks
Muddessar Iqbal