
Mining Data from Mainframe Reports



It's easy to view the Internet as the source of all data, but mainframe computers still hold enormous amounts of valuable data. For example, we've seen journalists take mainframe data on delinquent taxpayers and analyze it with askSam. Another researcher examined Veterans Administration mainframe report data for indications of Gulf War Syndrome in veterans of the first Gulf War. Researchers, analysts, and journalists often find themselves dealing with data that has been dumped from a mainframe. In this article, we'll give you some tips on how to take mainframe data and analyze it on your PC.


You face two main challenges when moving mainframe report files to your PC. First, the files are usually stored in EBCDIC, a character encoding that differs from the ASCII-based encodings common on Windows. Second, mainframe reports are not normally structured in a way that lets you easily bring the data into a spreadsheet or database.



Converting and Structuring Your Mainframe Report

TextPipe, a text manipulation and conversion utility, lets you convert and structure mainframe report files. TextPipe includes a filter that converts from EBCDIC to ASCII. Simply choosing this filter and running it on your mainframe report will convert the report to ASCII format (a format that can be used on PCs and imported into askSam).
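To make the conversion step concrete, here is a minimal Python sketch of what an EBCDIC-to-ASCII conversion does. It is not how TextPipe works internally. It assumes the file uses code page 037 (US/Canada EBCDIC); your mainframe may use a different code page. The function name and the `record_length` parameter are our own, added because fixed-length mainframe records often arrive without line breaks.

```python
from typing import Optional


def ebcdic_to_text(raw: bytes, codepage: str = "cp037",
                   record_length: Optional[int] = None) -> str:
    """Decode EBCDIC bytes into text.

    codepage: assumed to be cp037; change it if your data uses another code page.
    record_length: if given, the data is treated as fixed-length records with no
    line terminators, and one output line is produced per record.
    """
    if record_length is None:
        return raw.decode(codepage)
    records = [raw[i:i + record_length] for i in range(0, len(raw), record_length)]
    # Trailing padding spaces are removed from each record.
    return "\n".join(record.decode(codepage).rstrip() for record in records)


# Build a small EBCDIC sample in memory instead of reading a real file.
sample = "TAXPAYERID: 12345".encode("cp037")
print(ebcdic_to_text(sample))
```

This only handles character data. Fields that hold binary or packed numbers must be treated separately, as described in the next paragraph.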


Depending on the format of your mainframe data, other conversions may be necessary. Mainframe reports often consist of 132-character lines with an extra carriage control character at the start of each line. This character controls printer functions such as double printing and bold, and is typically a character such as '*', '0', or '1'. If this character is in your mainframe report, you can remove it using TextPipe. Similarly, some reports contain packed decimal, packed numeric, or zoned decimal fields. These fields hold numbers in a binary form rather than as characters, so they need to be expanded before converting from EBCDIC to ASCII. Again, TextPipe provides filters to do this.
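The two clean-up steps just described can be sketched in Python as follows. This is an illustration under stated assumptions, not TextPipe's implementation: both function names are ours, the control character is assumed to occupy the first position of every line, and packed fields are assumed to follow the usual convention of two digits per byte with the sign in the final half-byte (0xD or 0xB for negative, anything else from 0xA to 0xF for positive or unsigned).

```python
from decimal import Decimal


def strip_carriage_control(line: str) -> str:
    """Drop the first character of a report line.

    Assumes every line begins with a printer control character.
    """
    return line[1:]


def unpack_packed_decimal(field: bytes, scale: int = 0) -> Decimal:
    """Expand a packed decimal field taken from the raw (untranslated) bytes.

    scale is the number of implied decimal places, which comes from the
    record layout and is not stored in the field itself.
    """
    if not field:
        raise ValueError("empty packed decimal field")
    nibbles = []
    for byte in field:
        nibbles.append(byte >> 4)      # high half-byte
        nibbles.append(byte & 0x0F)    # low half-byte
    sign = nibbles.pop()               # the last half-byte is the sign
    if sign < 0x0A or any(n > 9 for n in nibbles):
        raise ValueError(f"not a valid packed decimal field: {field.hex()}")
    value = int("".join(str(n) for n in nibbles))
    if sign in (0x0B, 0x0D):
        value = -value
    return Decimal(value).scaleb(-scale)


print(strip_carriage_control("1DELINQUENT ACCOUNTS"))
print(unpack_packed_decimal(bytes([0x12, 0x34, 0x5C]), scale=2))
```

Note that the packed field is read from the original bytes. Running those bytes through a character conversion first would corrupt them, which is why the expansion has to happen before the EBCDIC-to-ASCII step.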


TextPipe also includes a variety of tools to help structure the information in your mainframe report. For example, you may want to remove headers and footers. Or there may be parts of the report where you want to insert a word or identifier that askSam can use as a field.
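As a rough illustration of header and footer removal, the Python sketch below drops any line that matches a list of patterns. The patterns and the report title are hypothetical; you would replace them with whatever text repeats on each page of your own report.

```python
import re

# Hypothetical patterns for lines that repeat on every page of a report.
HEADER_FOOTER_PATTERNS = [
    re.compile(r"^\s*DELINQUENT TAX REPORT\b"),   # assumed report title line
    re.compile(r"^\s*PAGE\s+\d+\s*$"),            # assumed page-number line
    re.compile(r"^\s*\*+ END OF REPORT \*+\s*$"), # assumed closing banner
]


def strip_headers_and_footers(lines, patterns=HEADER_FOOTER_PATTERNS):
    """Return only the lines that match none of the header/footer patterns."""
    return [line for line in lines
            if not any(pattern.search(line) for pattern in patterns)]


report = [
    "DELINQUENT TAX REPORT          COUNTY OFFICE",
    "TaxpayerID: 12345",
    "Amount: 1,250.00",
    "PAGE 1",
]
for line in strip_headers_and_footers(report):
    print(line)
```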


The amount of structure you'll need to insert into your data depends on two things: the structure already contained in the mainframe report file and the analysis you plan to do in askSam. askSam requires no structure to search information, but if you intend to sort, group, total, and output fields in reports, there will need to be a way to identify the information you wish to manipulate.


askSam can automatically recognize fields contained in your information. For example, if a report contains words such as "TaxpayerID:" or "Date:", askSam will be able to use these words as fields and sort, group, total, and output the information in these fields. If the information in your report does not contain any such structure for askSam to use (and if you require this structure for your analysis), TextPipe offers an array of tools to manipulate your information.  
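If a report has columns but no field names, one way to add structure is to prefix each column value with an identifier. The Python sketch below shows the idea. The column positions and field names in `LAYOUT` are hypothetical and would come from your own report's layout; this is not a TextPipe feature listing.

```python
# Hypothetical layout: (field name, start column, end column), zero-based,
# with the end column exclusive.
LAYOUT = [
    ("TaxpayerID", 0, 10),
    ("Date", 10, 22),
    ("Amount", 22, 34),
]


def label_fixed_columns(line: str, layout=LAYOUT) -> str:
    """Turn one fixed-position report line into 'Name: value' lines."""
    return "\n".join(
        f"{name}: {line[start:end].strip()}" for name, start, end in layout
    )


row = "12345".ljust(10) + "2004-03-01".ljust(12) + "1,250.00".rjust(12)
print(label_fixed_columns(row))
```

Writing a blank line after each labeled block keeps the records separate, which is useful later when the file is divided into records on import.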


You can find more information about converting mainframe data with TextPipe at http://www.crystalsoftware.com.au/docs

We can also provide assistance with the conversion process.



Importing Your Mainframe Data Into askSam

Depending on the format of your data, there are different ways to import it into askSam. Reports designed as printouts tend to take one of two forms: rows and columns of information (which can normally be imported as Fixed Position data), or free-form text in which information is identified by words in the report.


To import delimited data, choose the askSam import command and select the "Text Delimited" import type. The Import Wizard will guide you through the process. When you import delimited data, the data will be structured, and you'll be able to sort, group, total, and output fields in askSam.


When you import non-delimited data into askSam, the process is slightly different. Rather than using the "Text Delimited" import type, you will use the "Text" import type. Normally, you will want data from your report divided into different records in your askSam database. For example, suppose you import a mainframe report made up of sections that each begin with "TaxpayerID:" and are separated by blank lines.




You would want each section starting with "TaxpayerID:" to be a separate record in your askSam database. When importing into askSam, set the "Document Delimiter" option to "Blank Line" and askSam will divide the file into records each time a blank line is encountered (there are other options to divide a file into records, but for this example, blank line would work).
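The effect of dividing a file on blank lines can be sketched in a few lines of Python. The sample report text here is invented for illustration (only the "TaxpayerID:" label comes from this article), and the code shows the general idea rather than askSam's import logic.

```python
import re


def split_records(text: str):
    """Split report text into records wherever one or more blank lines occur."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [block.strip() for block in blocks if block.strip()]


# Hypothetical report text with two sections separated by a blank line.
sample_report = (
    "TaxpayerID: 12345\n"
    "Amount: 1,250.00\n"
    "\n"
    "TaxpayerID: 67890\n"
    "Amount: 300.50\n"
)

for number, record in enumerate(split_records(sample_report), start=1):
    print(f"--- record {number} ---")
    print(record)
```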


Once the information is imported into askSam, you can now search and analyze it.



Using Embedded Structure as Fields

askSam can automatically recognize embedded structure as fields. For example, if a report contains words such as "TaxpayerID:" or "Amount:", askSam will be able to use these words as fields and sort, group, total, and output the information in these fields.


Under the TOOLS menu, the AUTO FIELD RECOGNITION command allows you to define fields after you import a text file into askSam. Specify what character defines a field in your information (for example, a colon), and askSam will display a list of the words in your database that are followed by this character, along with the number of records that contain each word. Select the words you wish to use as fields.
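To get a feel for what this kind of field recognition produces, the Python sketch below lists every word followed by a chosen character and counts how many records contain it. It is our own approximation of the idea, not askSam's algorithm, and the function name is hypothetical.

```python
import re
from collections import Counter


def candidate_fields(records, delimiter=":"):
    """Count, for each word followed by the delimiter, the records containing it."""
    pattern = re.compile(r"(\w+)" + re.escape(delimiter))
    counts = Counter()
    for record in records:
        # A set ensures each word is counted once per record.
        counts.update(set(pattern.findall(record)))
    return counts


records = [
    "TaxpayerID: 12345\nAmount: 1,250.00",
    "TaxpayerID: 67890\nAmount: 300.50\nDate: 2004-03-01",
]
for word, record_count in candidate_fields(records).most_common():
    print(word, record_count)
```

A simple match like this also picks up non-field text such as the hour in a time like 12:30, which is why choosing the words to use as fields remains a manual step.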



Searching, Reporting, and Analyzing the Information

Once you've brought your information into askSam, you'll be able to search, report, and analyze it. The ACTIONS menu contains different search options. You can use menus to set up search queries that look in different fields. askSam's Report Writer lets you sort, group, total, and create summary reports from the information that you've imported.
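For readers who want to see what "group and total" means in concrete terms, here is a small Python sketch that totals an amount field for each value of a grouping field. It is only an analogy for the kind of summary a report produces, not a description of askSam's Report Writer. The "County" field and the sample values are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def parse_record(record: str) -> dict:
    """Read 'Name: value' lines from one record into a dictionary."""
    fields = {}
    for line in record.splitlines():
        name, separator, value = line.partition(":")
        if separator:
            fields[name.strip()] = value.strip()
    return fields


def total_by(records, group_field: str, amount_field: str) -> dict:
    """Total amount_field for each value of group_field, sorted by group."""
    totals = defaultdict(Decimal)
    for fields in map(parse_record, records):
        # Records that lack either field are skipped rather than rejected.
        if group_field in fields and amount_field in fields:
            totals[fields[group_field]] += Decimal(fields[amount_field].replace(",", ""))
    return dict(sorted(totals.items()))


records = [
    "County: Taylor\nAmount: 1,250.00",
    "County: Dixie\nAmount: 300.50",
    "County: Taylor\nAmount: 49.50",
]
for county, total in total_by(records, "County", "Amount").items():
    print(county, total)
```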



Mixing Data from Different Mainframe Reports/Databases

The field structure in an askSam database does not need to be identical in all askSam records. This allows you to easily take information from multiple mainframe reports or databases and bring them into a single askSam database for analysis. askSam's report writer offers functions that allow you to create reports from these different data types. We've seen researchers use these functions to combine data from different government databases.



Both askSam and TextPipe offer flexible and powerful tools for manipulating and organizing mainframe information. This article does not provide exact details on how to use all the necessary features, but instead tries to explain what's possible. If you need more details, please contact the askSam technical support department.


Related Links:


White Paper on Using TextPipe with Mainframe Reports

http://www.crystalsoftware.com.au/docs/mainframe.pdf



Seaside Software Inc. DBA askSam Systems, 121 S Jefferson Street, Perry FL 32347
Telephone: 800-800-1997 / 850-584-6590   •   Email: info@askSam.com   •   Support: http://www.askSam.com/central.asp
© Copyright 1985-2012   •   Privacy Statement