Rtf Merge

Merge RTF files into a single text file.

What does it do?

A printing press organization uses an OCR tool to convert individual page images to RTF files. Staff then copy the content of each file into another tool by hand, which is cumbersome and time-consuming when there are thousands of pages. I wrote this script to extract the text from every RTF file in a folder and write it to a single text file, so nobody has to open each file and copy-paste its contents into other software.

How to use it?

  1. Launch PowerShell ISE.
  2. If PowerShell's execution policy blocks the script, run the following command to lift the restriction: Set-ExecutionPolicy -ExecutionPolicy Unrestricted -Force. Adding -Scope Process limits the change to the current session.
  3. Place all RTF files in a single folder.
  4. Update the source folder and output file paths at the top of the script, then run it.
  5. Once the script finishes, the merged text file is available at the output path.

PowerShell Script

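# Load Windows Forms so the RichTextBox type is available outside the ISE
Add-Type -AssemblyName System.Windows.Forms

# Source folder containing the .rtf files
$rtfFolder = "C:\rtf"

# Destination output file path
$txtFile = "C:\rtf\output.txt"

# Start from a clean output file
if (Test-Path $txtFile) {
    Remove-Item $txtFile -Verbose
}

# One RichTextBox is reused to convert each file's RTF markup to plain text
$rtBox = New-Object System.Windows.Forms.RichTextBox
$newLine = [Environment]::NewLine

Get-ChildItem -Path $rtfFolder -Filter *.rtf | ForEach-Object {

    # Load the RTF markup and read back the plain text
    $rtBox.Rtf = [System.IO.File]::ReadAllText($_.FullName)
    $plainText = $rtBox.Text

    # Blank line, then a separator line naming the source file
    $header = $newLine + "----------" + $_.FullName + "----------" + $newLine

    # Append the header and the plain text in one call so the whole
    # output file uses a single encoding (UTF-8)
    [System.IO.File]::AppendAllText($txtFile, $header + $plainText + $newLine)
}

# Release the control once all files have been processed
$rtBox.Dispose()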
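
Checking the result

After a run over thousands of pages, it can be useful to confirm that every RTF file made it into the merged output. The sketch below is not part of the original script: Get-MergedSectionCount is a hypothetical helper that counts the separator lines the script writes (ten dashes, the file path, ten dashes). It assumes no page text happens to contain a line of that same shape.

```powershell
# Hypothetical helper (an assumption, not part of the original script).
# Counts lines shaped like "----------<path>----------" in the merged file.
function Get-MergedSectionCount {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Ten dashes, at least one character, ten dashes, and nothing else on the line
    $pattern = '^-{10}.+-{10}$'

    # @(...) forces an array so .Count is reliable for zero or one match
    @(Select-String -LiteralPath $Path -Pattern $pattern).Count
}
```

With the paths used in the script, the value returned by Get-MergedSectionCount -Path "C:\rtf\output.txt" should equal (Get-ChildItem -Path "C:\rtf" -Filter *.rtf).Count. A lower number suggests that some files were skipped or failed to load.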