**Abstract**

The advent of SAS No. 82 and SAS No. 99 has strongly encouraged auditors to plan their audits with fraud in mind. Accounting literature has suggested that digital analysis could be used as an analytical review procedure to assist in the planning stage of an audit as per SAS No. 56. However, previous literature has used digital analysis at the disaggregated account level and has failed to identify sources for fraudulent financial data. This study advances the concept of using aggregated financial data from a fraudulent company to determine the effectiveness of using digital analysis.

**Introduction**

For years, auditors have searched for a "magic bullet" to identify troubled companies at the beginning of the audit. The implications of SAS No. 82 and SAS No. 99 have intensified this search for an ideal analytical review procedure. Nigrini (1999a, 1999b) and Nigrini and Mittermaier (1997) suggest that digital analysis might be this "magic bullet" for auditors. Their studies determine that digital analysis is useful in analyzing disaggregated account data while searching for fraudulent activities.

Other researchers have also suggested digital analysis as an analytical review procedure. Busta and Weinberg (1998) use simulated data to test whether digital analysis can detect manipulated data. They indicate a need for replication using real financial data that has been manipulated and cite an inability to find real fraudulent data.

This study examines the appropriateness of digital analysis as an analytical review procedure to determine the possibility of financial statement fraud using aggregated data. This aggregated data is obtained from a real company whose CEO has been convicted of financial statement fraud. Our findings lead us to conclude that digital analysis would not have indicated whether manipulation of the data had occurred. We further conclude that digital analysis would not have been useful as an analytical review procedure in the planning stage of the audit for this company.

This study contributes new knowledge to the use of digital analysis. Its findings indicate that digital analysis may not be appropriate for examining aggregated financial data in all cases. Where fictitious invoices are used to overstate revenues, the amounts on those invoices resemble the amounts on real invoices. The number set therefore shows no sign of manipulation, and the fictitious invoice amounts fall within the normal Benford distribution.

Another contribution this research has made to the literature is identifying a source for fraudulent data for any publicly traded company. Previous research has been unable to find any such manipulated data. This paper indicates that fraudulent financial records of SEC targeted firms can be located in Moody's Industrial Manuals.

The remainder of this paper contains five sections. Section II provides background information and develops the hypothesis. Section III explains the methodology used to examine the data. Section IV illustrates the findings of the study. Section V explains the conclusions of the findings. Finally, Section VI summarizes the paper and develops future research implications for the area.

**Background & Hypothesis Development**

The deceptive activities of several businesses in the last ten years have aroused the indignation of the American consumer, as well as that of the Securities and Exchange Commission (SEC). This indignation has led to prosecutions, as well as convictions, for several executives leading these companies. These executives have engaged in practicing both "earnings management" and fraudulent reporting.

Former SEC Chairman Arthur Levitt regarded these two activities as being similar in effect. In a speech given at the NYU Center for Law and Business in September of 1998, he expressed concern over the number of companies artificially meeting Wall Street earnings expectations by employing "hocus-pocus" accounting practices. These practices have led many companies to artificially inflate revenues and reduce expenses by using inappropriate accounting principles. This practice has come to be known as "earnings management." Healy and Wahlen (1999) define earnings management as occurring:

when managers use judgment in financial reporting and in structuring transactions to alter financial reports to either mislead some stakeholders about the underlying economic performance of the company or to influence contractual outcomes that depend on reported accounting numbers (p.368).

Earnings management can lead to fraudulent reporting depending upon the degree of severity and the amount of departure from generally accepted accounting principles.

The importance of preventing fraud cannot be overstated, as fraud can undermine the resources, morale, and public image of an organization. Early detection of fraud is important. Albrecht and Albrecht (2002) report that U.S. companies incur $600 billion in annual losses due to fraud. Authoritative literature suggests that any appearance of fraud demands extensive testing to determine not only the amount of the fraud, but also the possible impact the amount has on the financial statements. SAS No. 56 states that auditors must use analytical review procedures in planning the nature, timing, and extent of other auditing procedures (AICPA, 1988). SAS No. 99 requires auditors to document, during the planning stage, their performance of risk assessment of material misstatement due to fraud (AICPA, 2002). SAS No. 99 also increases the auditor's responsibilities to explore the possibility of fraud.

An effective analytical review procedure that would indicate the presence of fraud is important in satisfying SAS No. 99 requirements. Nigrini (1999a, p.79) effusively proclaims that digital analysis could be effective in helping auditors "find errors and irregularities in large data sets ... to direct their attention to anomalies." Codere (1999, p.59) states that the "analysis of the frequency distribution of the first or second digits can detect abnormal patterns in the data and MAY (emphasis added) identify possible fraud."

Ettredge and Srivastava (1999) describe digital analysis as a statistical methodology that determines whether a set of financial numbers corresponds with another number set described as the Benford distribution. Manipulation of data would be suspected if the frequency of the digits under investigation were significantly different from the Benford distribution. This methodology is named for a GE employee who discovered that certain numbers follow a pattern such that the probability of occurrence of each number can be estimated. Nigrini (1999b, p.79) states, "Not all data sets follow Benford's law." Nigrini (1999a, p.21) further elaborates that

Benford's Law usually applies to tabulated data where numbers describe the sizes of similar phenomena. The list of numbers should not contain a built-in maximum or minimum value. Also, the data must not consist of assigned numbers that, in essence, are symbols, such as social security and bank account numbers and zip codes.
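For reference, the Benford distribution for first digits can be tabulated directly. The sketch below assumes the standard statement of Benford's law, P(d) = log10(1 + 1/d) for d = 1 to 9, which this paper does not print; the function name is illustrative only.

```python
import math


def benford_first_digit_probs():
    """Expected first-digit probabilities under Benford's law.

    Assumes the standard formula P(d) = log10(1 + 1/d) for d = 1..9.
    """
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


probs = benford_first_digit_probs()
for digit, p in probs.items():
    # Print each leading digit with its expected relative frequency.
    print(f"{digit}: {p:.4f}")
```

The nine probabilities sum to one and decline steadily from the digit one to the digit nine, which is why an excess or shortage of a particular leading digit can be read as a departure from the expected pattern.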

Prior research has indicated that digital analysis can be used as an analytical review procedure. Nigrini and others (1997, 1999a, 1999b) utilize digital analysis to explore data anomalies at the disaggregated account level. Caster, Mittag, and Scheraga (2003) use digital analysis to determine whether an entire database has been manipulated. Lanza (2000) illustrates how practitioners can use the DATAS statistical analysis tool to easily perform digital analysis.

Busta and Weinberg (1998) use digital analysis in a study to distinguish between "normal" and "manipulated" financial data. Simulated data is used in the study due to the authors' inability to find real fraudulent data. They cite a need for replication of the study with real data when it can be found. However, no one has attempted to analyze aggregated data, such as that found in real financial statements of fraudulent companies, to determine whether digital analysis will detect the likelihood of financial statement fraud.

**Methodology**

This study is designed to test whether digital analysis can serve as an analytical review procedure for auditors in determining the existence of fraud in financial statements. Previous research used simulated data to determine the existence of manipulation of data. The current study extends prior research by using real fraudulent data. This data is obtained by identifying a company whose chief executive officer (CEO) has been successfully prosecuted by the SEC for committing fraudulent activity.

A list of companies that had been targeted by the SEC for fraudulent activity is used to identify a prospective target (Loomis, 1999). The identity of this company will remain anonymous at the request of the editor, but will be called XYZ, Inc. in this study. The CEO of XYZ, Inc. concocted false invoices and revenues to meet earnings goals and is currently serving five years in a federal penitentiary.

Next, an examination of Moody's OTC Industrial Manual yields the financial statements of XYZ, Inc. for the years of 1994-5 (see TABLE 1). The SEC enforcement action stated that XYZ, Inc. had been forced to restate their financial statements for these particular years. The EDGAR database maintained by the SEC yields the restated data for these years (see TABLE 1).

As Nigrini (1999a) reports, some data sets are not appropriate for testing using digital analysis. Wallace (2002) states that three assumptions about the data sets must be met for digital analysis to be appropriate. If the data exhibits a positive skewness (mean divided by the median), an increase in the means over time, and no ceiling or floor, the assumptions are met for analyzing the first digit using digital analysis.
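Two of these three assumptions lend themselves to a quick numerical check. The following is a minimal sketch, not the procedure of Wallace (2002): it assumes that "positive skewness" means a mean-to-median ratio above one, and it leaves the ceiling-or-floor question to inspection of the data. The function names and the sample figures are hypothetical.

```python
from statistics import mean, median


def skew_ratio(values):
    """Mean divided by median; a ratio above 1 is read here as positive skew."""
    return mean(values) / median(values)


def means_increase(yearly_values):
    """True if the yearly means rise from each period to the next.

    yearly_values: list of lists, ordered from earliest to latest year.
    """
    means = [mean(v) for v in yearly_values]
    return all(a < b for a, b in zip(means, means[1:]))


# Hypothetical account balances for two consecutive years (not the XYZ, Inc. data).
year_1 = [120, 340, 560, 1800, 9500]
year_2 = [150, 410, 700, 2300, 12000]

print(skew_ratio(year_1) > 1, skew_ratio(year_2) > 1)
print(means_increase([year_1, year_2]))
```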

An analysis of the first digit is then performed on the data to determine whether digital analysis might have indicated fraudulent activity prior to SEC intervention. Nigrini and Mittermaier (1997, p. 57) state, "The first digit test is an initial test of reasonableness." If the first digit test indicates an anomaly in the test data, an auditor can then assume the possibility of fraud and design the audit program to search more closely for irregularities.

Wallace (2002) relates the procedure for performing the first digit test. She describes how to use a spreadsheet to order the data such that the first digits are left justified and then counted. The data are then compared with the Benford distribution, using either a Z test or a Chi-square test, to determine statistical significance. The Chi-square method is a more conservative test than the Z test. Although it is less discriminatory, it produces fewer false positives. Since previous research has used both methods, this paper illustrates both test results in the interest of comparability.

The Z test involves standardizing the distribution and comparing the calculated Z score with a Z statistic at a predetermined α level. Essentially, the test compares the company data distribution with the Benford distribution. If the Z score is larger than the positive Z statistic or smaller than the negative Z statistic, then the company data does not follow the Benford distribution and would therefore show evidence of manipulation. A hypothesis test is performed to determine whether the XYZ, Inc. financial statement data is distributed consistent with the Benford distribution. The decision rules for the test are as follows: the null hypothesis (H0), that the data follow the Benford distribution, is rejected if a Z score falls outside the range set by the Z statistics (+/-1.96 at the .05 α level); otherwise, H0 cannot be rejected.

The Chi-square test does not assume normally distributed data; it determines whether two data sets follow the same distribution. The Chi-square statistic is computed by taking, for each digit, the squared difference between the expected value (the Benford distribution for the digit) and the actual value (the calculated probability), dividing it by the expected value for the digit, and summing the results over all digits.
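The Chi-square computation can be sketched in a few lines. This is an illustration under stated assumptions rather than the exact calculation behind TABLE 3: it works with digit counts instead of probabilities (which scales the probability-based sum by the number of observations), it takes expected values from the standard Benford formula log10(1 + 1/d), and it uses the critical value of 12.5916 quoted in the findings. The function name and the sample counts are hypothetical.

```python
import math

CRITICAL_VALUE = 12.5916  # critical value quoted in this paper's findings


def benford_chi_square(observed_counts):
    """Chi-square statistic of first-digit counts against Benford expectations.

    observed_counts: dict mapping each digit 1..9 to its observed count.
    """
    n = sum(observed_counts.get(d, 0) for d in range(1, 10))
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # expected count for digit d
        stat += (observed_counts.get(d, 0) - expected) ** 2 / expected
    return stat


# Hypothetical first-digit counts for 100 values (not the XYZ, Inc. data).
sample_counts = {1: 30, 2: 18, 3: 12, 4: 10, 5: 8, 6: 7, 7: 6, 8: 5, 9: 4}
stat = benford_chi_square(sample_counts)
print(f"chi-square = {stat:.4f}; reject H0: {stat > CRITICAL_VALUE}")
```

Counts that track the Benford proportions closely, as in this sample, give a statistic far below the critical value, so the null hypothesis of conformity is not rejected.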

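A hypothesis test is performed to determine whether the XYZ, Inc. financial statement data is distributed consistent with the Benford distribution. The decision rules for the test are as follows: the null hypothesis (H0), that the data follow the Benford distribution, is rejected if the Chi-square statistic exceeds the critical value of 12.5916; otherwise, H0 cannot be rejected.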

The original and restated data for XYZ, Inc. are examined following Wallace (2002). Both data sets exhibit positive skewness as measured by dividing the mean by the median. In addition, the mean for each year is examined and determined to be increasing from prior periods to current periods for both original and corrected data. An examination is made of the data to determine the modes and to establish that the data set is not truncated; i.e., no ceiling or floor. Both data sets appear to meet all the assumptions as set forth by prior research and thus digital analysis is appropriate to examine this data.

The data yield 65 observable data points for the original two years and 67 observable data points for the restated two years. These points are examined using a first digit test. In following the Wallace (2002) description of the methodology, the financial statement data from XYZ, Inc. for the years 1994-5 are placed in a spreadsheet. EXCEL commands "left" and "count" cause the data to be left justified and the digits in the first position counted. The calculated probability is determined by dividing the frequency of occurrence of each digit by the number of observation points in the original data set and in the restated data set.
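The spreadsheet steps described above (extract the leftmost digit, count each digit, divide by the number of observations) have a direct equivalent in code. The sketch below is an analogue of that procedure, not a reproduction of the spreadsheet; the function names and the sample amounts are hypothetical.

```python
from collections import Counter


def first_digit(value):
    """Return the leading non-zero digit of a number, ignoring sign and decimals."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None  # zero has no leading digit


def first_digit_proportions(values):
    """Return ({digit: proportion}, n) for the leading digits of values."""
    digits = [d for d in map(first_digit, values) if d is not None]
    if not digits:
        raise ValueError("no non-zero values to analyze")
    counts = Counter(digits)
    n = len(digits)
    return {d: counts.get(d, 0) / n for d in range(1, 10)}, n


# Hypothetical financial statement amounts (not the XYZ, Inc. data).
amounts = [1250, 1875, 23400, 310, 0, -4120, 96.5]
proportions, n = first_digit_proportions(amounts)
print(n, proportions)
```

Zero balances are dropped because they have no leading digit, and negative amounts are treated by their absolute value; both choices are assumptions of this sketch.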

**Findings of the Study**

The XYZ, Inc. data (both original and restated) are analyzed using the Z test. The Z scores (See TABLE 2) are compared to the Z statistics at a .05 α level (+/-1.96). None of the Z scores in either the original data or the restated data are outside the range established by the Z statistics. Hence, the null hypothesis cannot be rejected. Therefore, the finding is that the XYZ, Inc. data is distributed in the same manner as the Benford distribution.
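The comparison against +/-1.96 can be illustrated for a single digit. The paper does not print its Z formula, so the sketch below assumes a plain one-proportion z-score with no continuity correction, and its result need not match TABLE 2. The function name is illustrative; the example uses the 25 leading ones among 65 original data points reported in this paper.

```python
import math

Z_CRITICAL = 1.96  # two-tailed critical value at the .05 level


def benford_z(observed_count, n, digit):
    """One-proportion z-score of a first digit's frequency against Benford.

    Assumed formula: (p_observed - p_expected) / sqrt(p_e * (1 - p_e) / n).
    """
    p_expected = math.log10(1 + 1 / digit)
    p_observed = observed_count / n
    std_error = math.sqrt(p_expected * (1 - p_expected) / n)
    return (p_observed - p_expected) / std_error


# Digit one: 25 occurrences in the 65 original data points.
z = benford_z(25, 65, 1)
print(f"z = {z:.2f}; outside +/-{Z_CRITICAL}: {abs(z) > Z_CRITICAL}")
```

Under this assumed formula, even the digit with the most visible excess stays inside the +/-1.96 range.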

When the original XYZ, Inc. data for 1994-5 are processed as Wallace (2002) suggests, the Chi-square statistic is determined to be 3.5182 (See TABLE 3). Using the decision rules cited above, 3.5182 is less than 12.5916, so H0 cannot be rejected. The implication of the finding is that the original XYZ, Inc. data are distributed in a manner consistent with the Benford distribution.

While the variance of the data does exhibit an abnormality, the abnormality is not large enough to place the distribution outside of the Benford distribution. Most of the variance is due to the excess number of occurrences of the digit one in the data (see TABLE 4). The predicted number of occurrences is 19.5, while the actual number of occurrences is 25 (See TABLE 3). In fact, an examination of the data shows that 17 of the 49 accounts that were restated by the SEC begin with the digit one.
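The predicted count for the digit one can be checked by hand, assuming the standard Benford probability for a leading one, log10(2) ≈ 0.30103 (a formula this paper does not print), applied to the 65 original data points:

$$
E_1 = n \cdot \log_{10}\!\left(1 + \tfrac{1}{1}\right) = 65 \times 0.30103 \approx 19.57,
\qquad
O_1 - E_1 \approx 25 - 19.57 = 5.43
$$

This agrees with the predicted figure of roughly 19.5 and shows an excess of about five and a half occurrences of the digit one.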

When the restated XYZ, Inc. data for 1994-5 are processed in the same manner as the original data, the Chi-square statistic is determined to be 3.8655 (See TABLE 3). Using the decision rules cited above, 3.8655 is less than 12.5916, so H0 cannot be rejected. The implication of this finding is that the restated XYZ, Inc. data are also distributed in a manner consistent with the Benford distribution.

**Conclusions**

As the Z test failed to reject the null hypothesis for both the original and the restated data, it is concluded that both data sets are distributed in the same manner as the Benford distribution. It follows that the data does not appear to have been disturbed or changed by outside sources. Therefore, digital analysis does not seem to indicate the existence of fraud in the financial statements of XYZ, Inc.

Further, the Chi-square test does not indicate the existence of errors or irregularities in the data for XYZ, Inc. for the years 1994-5. While the data does exhibit some anomalies, they do not contribute enough to the variance to produce a Chi-square value large enough to indicate misstated financial statements. Based upon the Chi-square results, it is concluded that digital analysis is not useful as an analytical review procedure in determining financial statement fraud.

One possible explanation as to why no manipulation was detected is that XYZ, Inc. used fictitious invoices to overstate revenues. Since these revenues are generated by multiplying normal sales price by the number of units "sold," these revenue dollars will be the same numbers as created by real invoices. Hence, both the fictitious invoice dollars and the real invoice dollars are in the same distribution of numbers. Therefore, the fictitious invoices do not cause numbers that would be outside the Benford distribution.

An interesting observation is the excess occurrences of the digit one in the original 1994-5 data for XYZ, Inc. In fact, the digit one appears 25 times against the 19.5 predicted by the Benford distribution; the excess of 5.5 occurrences is 22% of the observed count, or about 28% above the prediction. Additionally, 17 of the 49 accounts (34.7%) required to be restated by the SEC begin with the digit one. Further research into this phenomenon should be conducted to determine whether an auditor might have used this finding in examining financial statements for the existence of fraud.

**Summary**

Digital analysis has been extolled by a number of researchers as an effective tool for auditors to use in their search for the "magic bullet" to find troubled companies at the beginning of the audit. Previous research has indicated its successful use at the disaggregated account level. This research examines fraudulent data at the aggregated level to determine whether digital analysis would be useful as an analytical review procedure for auditors. Using both the Z test and the Chi-square test, the difference between the data for XYZ, Inc. and the Benford distribution is statistically insignificant in the search for financial misstatement. The conclusion is that digital analysis would not have helped in the audit planning for XYZ, Inc.

Further examination of the 1994-5 original data for XYZ, Inc. reveals an anomaly in the number of times the digit one appears in the first position in the data set. The anomaly emerges when the number of occurrences in the data set is compared with the expected value suggested by the Benford distribution. A correlation between the number of accounts that were required to be restated and the number of accounts beginning with "one" appears to exist. Further research is needed to determine whether this anomaly can be useful in the planning stage of an audit.

One contribution this research has made to the literature is identifying a source for fraudulent data for any publicly traded company. Some researchers have indicated they would have preferred to use digital analysis on real fraudulent data, but were unable to locate any. This paper indicates that fraudulent financial records of SEC targeted firms can be located in Moody's Industrial Manuals.

Another contribution this study has made is to suggest that not all aggregated data can be successfully examined with digital analysis. In the case of XYZ, Inc., repeated fictitious invoices do not cause an indication of manipulated data that would be outside the Benford distribution. Further research in this area may find other areas where aggregated data may be examined by digital analysis.

Previous research examined disaggregated data with digital analysis. While digital analysis remains a viable tool in exploring data sets for manipulation of numbers, caution must be exercised when using it on aggregated data. If the data manipulation is the result of the duplication of normal transaction numbers, the data set will appear to follow the normal Benford distribution without any sign of disturbance.

**REFERENCES**