Using Screenshots to Examine Many Files Quickly
I had a couple of projects that called for a way to examine a large number of files. It seemed that screenshots could help in those projects. This post describes the techniques I used.
EML Analysis
In one project, I was working with various email files that I had exported from Thunderbird. These files had an EML extension. Typically, if I viewed an EML file in Notepad, I would see various codes and other information that wouldn't be visible if I viewed it in an email program like Thunderbird.
I was interested in seeing the header codes in these EML files. Those codes appeared at the tops of the files. I felt that I could probably see what I needed to see in the first screenful of a Notepad session, opened maximized.
In other words, the concept was that I would open the EML file in Notepad; I would take a screenshot; I would save the screenshot; and then I would close the file and repeat the process with the next EML file on my list. Then I would combine all those screenshots into one file, and flip through it or perhaps use other tools to analyze it further. I wouldn't have to sit there, maintaining constant attention while the process continued in real time; I could just review the outcome afterwards. (For some purposes, an alternative would have been to combine or select from the text, without a graphical view.)
The first step was to build the list of EML files that I wanted to examine. I moved them all into a single folder and used DIR and Excel to give me the list and to convert it into a series of batch commands. There was one such command for each EML file. Before running those commands, I had to open Notepad once, turn on its Format > Word Wrap option, and then close it. The format of the command was as follows:
start /max notepad "D:\Folder Name\Email Name.eml"
That command was sufficient to open the EML file. Next, I needed to pause the system for a moment, so that the file would have time to come onscreen. Among numerous suggestions, I favored a command involving PING ("ping 1.1.1.1 -n 1 -w 1500 > nul") because of its fine-tunable setting (in the example just given, 1500 milliseconds). Unfortunately, it appeared that the command's output component (" > nul") would have prevented me from adding more commands on the same line. So I had to go with "TIMEOUT /T 1" for a one-second delay.
Next, I needed a command to take a snapshot. It looked like there were multiple options here. I had already installed NirCmd and had found it useful for other things, so I used this command:
start NirCmd savescreenshot "D:\Folder Name\Screenshots\Email Name.png"
NirCmd came with an option to copy its executable (nircmd.exe) to C:\Windows, so that this command could run without any need to specify the location of NirCmd, to put a copy of it in the current working folder, or to modify the computer's Path. NirCmd wasn't saving to subfolders properly, so in the end I had to modify that part of the command.
Finally, I needed a command to close Notepad. The advice that worked for me was:
taskkill /f /im notepad.exe
Note that this would close all currently open Notepad sessions. These three steps (i.e., open the EML in Notepad, take a picture with NirCmd, close Notepad) would give me a screenshot of the first screenful's worth of the file's contents. Collectively, those screenshots would give me a visual impression of the various kinds of codes appearing at the start of my EML files.
I used && to combine multiple commands on the same line, as a single (long) batch command. If that had failed, I could have added index columns next to the spreadsheet columns in which I built the commands, with alternating odd and even numbers in those columns: 1 for the first Notepad command, 2 for the first NirCmd command, 3 for the second Notepad command, and so forth. These index numbers would allow the various commands to be sorted into proper sequence in a single column, for copying and pasting into a batch file.
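As a rough illustration of that index-column fallback, here is a minimal Python sketch of the same idea. The function name `interleave` and the sample command strings are placeholders of mine, not part of the spreadsheet I actually used. Each column of commands gets its own run of index numbers (1, 3, 5, ... for the first column; 2, 4, 6, ... for the second), and sorting on those numbers puts everything into run order.

```python
def interleave(*command_columns):
    """Tag each command with an index number, then sort into run order.

    Each argument is one spreadsheet-style column of commands, e.g. all
    of the Notepad commands, then all of the NirCmd commands.
    """
    n_cols = len(command_columns)
    indexed = []
    for col, commands in enumerate(command_columns):
        for row, command in enumerate(commands):
            # With two columns: column 0 gets 1, 3, 5, ...; column 1 gets 2, 4, 6, ...
            indexed.append((row * n_cols + col + 1, command))
    indexed.sort(key=lambda pair: pair[0])
    return [command for _, command in indexed]


# Hypothetical sample columns, standing in for the real command strings.
notepad_cmds = ["open A", "open B"]
nircmd_cmds = ["shot A", "shot B"]
ordered = interleave(notepad_cmds, nircmd_cmds)
```

The resulting list could then be written out one command per line, which is what the sorted spreadsheet column would have given me.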
In short, for each EML file, I combined four commands with &&, into a single long command like this:
start /max notepad "D:\Folder Name\Email Name.eml" && timeout /t 1 && start NirCmd savescreenshot "Email Name.png" && taskkill /f /im notepad.exe
This gave me some PNGs. Now there was the question of what to do with them. One option was to simply stitch them together in a slideshow (using, e.g., IrfanView) or a single PDF (using, e.g., Acrobat). I did a brief investigation of OCR software for that purpose. Ultimately, I just used IrfanView, without even creating a slideshow, to arrow down through those PNGs, one at a time, at whatever pace I chose. That way, I could check whether each one came through OK.
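For anyone who would rather not build the per-file command lines in a spreadsheet, the following Python sketch shows one way the same four-part lines might be generated. It is not what I actually did (I used DIR and Excel). The names `build_command`, `build_batch`, and `delay_seconds` are my own inventions, and the sketch assumes file names without characters that are special to batch files, such as the percent sign.

```python
from pathlib import PureWindowsPath


def build_command(eml_path, delay_seconds=1):
    """Return one combined batch line for a single EML file."""
    # The screenshot is named after the EML file and saved without a folder path.
    png_name = PureWindowsPath(eml_path).stem + ".png"
    steps = [
        f'start /max notepad "{eml_path}"',           # open the file, maximized
        f"timeout /t {delay_seconds}",                # wait for it to come onscreen
        f'start NirCmd savescreenshot "{png_name}"',  # take the screenshot
        "taskkill /f /im notepad.exe",                # close all Notepad sessions
    ]
    return " && ".join(steps)


def build_batch(eml_paths):
    """Return the text of a batch file, one combined line per EML file."""
    return "\n".join(build_command(path) for path in eml_paths)


example_line = build_command(r"D:\Folder Name\Email Name.eml")
```

The text returned by `build_batch` could be saved with a .bat extension and run from the folder where the screenshots should land.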
PDF Analysis
In another project, I had a bunch of PDFs that I had created in a conversion process. I wanted to check if the PDFs came through OK. It would have been very slow to open them, one at a time, and page through them. Combining them all into a single large PDF, which I could also page through, would have produced a huge file. Also, if I was working with large PDFs or many PDFs (or both), I might have to look at huge numbers of pages. Boredom or haste could lead me to flip past an important one-page document, while checking hundreds or thousands of less important pages.
Based on various factors (including the number of PDFs, their importance, and the time available), I decided to examine just the first page of each PDF. I might not be able to tell if the whole document printed properly, but at least I could eliminate those instances where printing failed completely.
For this purpose, the process described in the previous section offered one possibility. I could probably work up a set of commands to open a PDF, take a screenshot, and then close it, and then flip through the resulting screenshots.
I did not actually pursue that approach in this case, however. Instead, I wanted to see if I could convert the PDF documents to JPG and then flip through just the first page from each such document. If I had a hundred documents to check, I would have a hundred pages to look at, not a thousand. A separate post discusses that investigation. The tool I chose was Boxoft PDF to JPG Converter. Another way to proceed might have been to split the PDFs first, using something like PDFsam, and then combine the PDFs of each resulting first page into a larger PDF that I could flip through.