Roger Carlson's Papers
The Mega Test
by Ronald K. Hoeflin
Reviewed by Roger D. Carlson, Ph.D.
Springfield, Oregon
Test Definition. The Mega Test was originally designed to measure the intelligence of individuals who are among the top ten per cent of the population in terms of intelligence. Its name derives from a principal use of the test: to identify individuals who have a measured IQ of 176 or greater (1 in 1,000,000). The latest norming of the test revealed that it has an effective range of measuring IQs from 100 to 193.
History and Background. Ronald K. Hoeflin is an amateur self-taught psychometrician with a 15-year background in library science. His academic background includes two bachelor's degrees, two master's degrees, and a Ph.D. in philosophy from the New School for Social Research in New York City. He has founded and been active in many high IQ societies, and now edits a journal entitled Ingenious.
The original 100 item form of the Mega Test was a distillation of seven preliminary tests of 24 to 48 problems each, totaling 250 items, which were created by the author from 1979 to 1983. The 100 item test was split into two 48 item paper and pencil tests: The Mega Test (Hoeflin, 1983; Morris, 1985) and Mega Test Two (Hoeflin, 1984; Mach vos Savant, 1985). Prior to its 1983 publication, the Mega Test was privately distributed by the author. Before publication in 1985, the author had collected data on 97 people and developed norms by assembling IQs on other intelligence tests associated with scores on The Mega Test. After its 1985 publication, the same was done using a sample of 434 people who sent their answer sheets to the publisher from April 1985 until about the end of 1986. As of January, 1986, the 3200 who sent in their answer sheets ranged in age from nine to 85 years. Ten per cent of the sample was female. According to Hoeflin (personal communication, October 8, 1989), around 3900 individuals have taken The Mega Test to date.
The sixth and last norming of The Mega Test was done on the basis of 222 test takers who reported Scholastic Aptitude Test (SAT) scores to the author. The author used that data along with 1984 - 1988 SAT data obtained from Educational Testing Service in order to make hypothetical extrapolations concerning the likelihood of individuals in the general population achieving high scores on The Mega Test.
Description. The Mega Test is available for test-takers in the periodicals in which it has been published. The 48 items consist of 24 analogy items, 12 spatial problems, and 12 numerical problems (including six number series items). (The same distribution of item types makes up Mega Test Two.) There is no age range specified as appropriate for persons taking the test.
The test is self-administered and has a time limit of one month. In order that no one will gain an unfair advantage by cheating, test takers are permitted and encouraged to use reference aids. Pocket calculators are acceptable but computers are not. Assistance from others is prohibited. There is no penalty for guessing.
The answers required must be generated by the respondent--there are no multiple choice alternatives, nor is there a standardized answer sheet. The answers to The Mega Test must be submitted to the author for scoring. (The fee is $5.) The profile provided by the author will give a scoresheet listing of the number of questions answered correctly broken down into raw scores in each of the subtests, as well as corresponding IQ and percentile equivalent. (Answers to Mega Test Two as well as IQ and percentile equivalents are included in Mach vos Savant (1985)). For an additional $5 fee, a ten page statistical report, "The Meaning of Mega Test Scores" will be sent. The report includes a discussion of the relationship between Mega Test scores and scores on other recognized high-IQ tests.
Practical Uses. The Mega Test's usefulness is confined to situations where one wishes to make discriminations between people of very high intelligence. Along with Mega Test Two as a confirming test, it was designed to select individuals for membership in The Mega Society, having as its criterion for membership, an IQ of greater than 176.
The test has been used primarily by those who are members of various high IQ societies (e.g., Mensa, Intertel), and those who wish to gain admission to certain of those societies (e.g., International Society for Philosophical Enquiry, The Mega Society). The test has been used by those who have intrinsic interest in such tests and the problems which comprise them. The Mega Test is also used by those who have intrinsic interest in exploring the vicissitudes of very high intelligence and its measurement.
The test is only available in an English paper and pencil format.
Technical Aspects. Morris (1985, 1986) has provided popular summaries of some of the psychometric aspects of the test. There is no manual per se containing information about the technical psychometric qualities of the test; however, for the purposes of this review, Hoeflin has provided a photocopied report of raw data and various statistical analyses that he has compiled (Hoeflin, undated).
No reliabilities of the usual kind (e.g., test-retest, split half, etc.) are reported by the author. The test has face validity in that the items are of the usual type and format of intelligence test items. Several studies of construct validity have been done. Since The Mega Test was designed to measure levels of intelligence higher than any other test (with the possible exception of the Langdon Adult Intelligence Test (Langdon, 1978)), correlations with existing tests are discounted because of ceiling effects. Pearson product moment correlations are .673 with the Langdon Adult Intelligence Test (n = 76), .574 with the Graduate Record Examination (n = 106), .565 with the Army General Classification Test (n = 28), .562 with the Cattell (n = 80), .495 with the Scholastic Aptitude Test, .393 with the Miller Analogies Test (n = 28), .374 with the Stanford-Binet (n = 46), .307 with the California Test of Mental Maturity (n = 75), and .137 with the Wechsler Adult Intelligence Scale (n = 34). Using correlations corrected for range restriction, the values are reported as follows: .697 for the Langdon Adult Intelligence Test, .368 for the California Test of Mental Maturity, .267 for the Stanford-Binet, .781 for the Graduate Record Examination, and .801 for the Scholastic Aptitude Test.
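The report does not say which correction for range restriction was applied. As an illustration only, the sketch below uses one common textbook formula (Thorndike's Case II), which rescales an observed correlation by the ratio of the unrestricted to the restricted standard deviation. The function name and the ratio passed to it are hypothetical and are not taken from Hoeflin's data.

```python
import math


def correct_for_range_restriction(r, sd_ratio):
    """Thorndike Case II correction (an assumed method, not Hoeflin's stated one).

    r        -- correlation observed in the restricted sample
    sd_ratio -- unrestricted SD divided by restricted SD (hypothetical input)
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    return (r * sd_ratio) / math.sqrt(1.0 - r * r + r * r * sd_ratio * sd_ratio)


if __name__ == "__main__":
    # Illustrative only: the SD ratio of 2.0 is invented for this example.
    print(round(correct_for_range_restriction(0.495, 2.0), 3))
```

Under this formula a ratio of 1 leaves the correlation unchanged, a ratio above 1 raises it, and a ratio below 1 lowers it, so the size of any corrected value depends heavily on the assumed spread in the unrestricted population.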
Until the sixth norming, IQ ranges given by the author were "estimates" based on means and standard deviations of respondents' scores as associated with IQs derived on the basis of other intelligence tests previously administered to the subjects. The procedure used in the sixth (and most current) norming of The Mega Test is of particular interest. Hoeflin obtained 222 SAT scores for Mega Test takers and found the percentile equivalents of those scores from 1984 through 1988. Since he assumed that those taking the SAT would be in the upper 10% of the population and found that there were almost exactly three times as many 18-year-olds in the general population as there were SAT takers, he made a factor of three shift in those percentile equivalents. That was done in order to allow for the restricted range of SAT takers, thus viewing the SAT and Mega Test scores as a restricted range of scores within the context of a normal distribution of scores that one would find in the general population. Hoeflin then constructed a cumulative frequency distribution of the numbers of SAT scores that fell below each .25 sigma from 1.25 to 4.25, and took the average Mega Test score for the individuals falling within each .25 sigma interval. Using those values, he then equated the percentiles associated with each interval with raw scores on The Mega Test by using ratios of observed Mega Test takers' scores vs. numbers of individuals who would be statistically expected to score above a given level. Using curve fitting procedures, he then interpolated and extrapolated into ranges where no actual scores have been obtained. In order to extrapolate beyond the observed range, Hoeflin took equivalent sigma scores for the 90th, 99th, 99.9th, and 99.99th percentiles from standard statistical tables for the normal distribution.
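The arithmetic behind these steps can be sketched as follows. This is a minimal illustration, not Hoeflin's actual computation. It assumes that the "factor of three shift" means dividing the proportion of SAT takers above a score by three, and that IQs are scaled with a mean of 100 and a standard deviation of 16. Neither assumption is stated in the materials reviewed, and the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def shift_percentile(sat_percentile, factor=3.0):
    """Re-express a percentile among SAT takers as a general-population percentile.

    Assumed reading of the 'factor of three shift': the fraction of people
    above the score shrinks by `factor`.
    """
    top_fraction = (100.0 - sat_percentile) / 100.0
    return 100.0 * (1.0 - top_fraction / factor)


def sigma_for_percentile(percentile):
    """Sigma (z) score below which `percentile` per cent of a normal curve falls."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


def iq_for_rarity(one_in_n, mean=100.0, sd=16.0):
    """IQ exceeded by 1 person in `one_in_n`, assuming a normal curve and SD 16."""
    z = STANDARD_NORMAL.inv_cdf(1.0 - 1.0 / one_in_n)
    return mean + sd * z


if __name__ == "__main__":
    for p in (90.0, 99.0, 99.9, 99.99):
        print(p, round(sigma_for_percentile(p), 3))
    print(round(shift_percentile(97.0), 2))
    print(round(iq_for_rarity(1_000_000)))
```

Under these assumptions, the 97th percentile among SAT takers becomes the 99th percentile of the general population, and a rarity of 1 in 1,000,000 corresponds to an IQ of about 176, which agrees with the figure given in the test definition.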
Critique. The Mega Test is of interest psychometrically for two reasons. First, it is a pioneering effort at measuring the very highest levels of intelligence. Second, the author has not bound himself to orthodoxy in either the administration or the norming of the test. The question that must be asked is: has the first goal been achieved by the methods that have been utilized?
With respect to the administration of the test, the reasons that Hoeflin has for making it self-administered and with open reference materials appear to be pragmatic. He has no access to large samples of individuals who can be administered the test for norming purposes under controlled conditions. It is with respect to this point that one must ask if such administrative conditions violate the goals that Hoeflin has sought to achieve. What the test might measure may be resourcefulness, persistence, etc. as easily as some unified notion of intelligence. If the test were given under controlled conditions without the use of references, etc., perhaps the test might be a better and more exacting discriminator at the highest levels than it is under the laissez-faire conditions. Also, the unorthodox practice of demanding total production of the answer to each problem deserves examination. Psychometricians have favored the use of the multiple choice answer format because it is the most sensitive means of determining if the respondent knows the correct answer. Such a format demands less of the respondent, but is very sensitive in detecting knowledge of the correct answer. Such a format is also more robust to inherent deficiencies in the test procedure or question format. In essence, it gives the benefit of the doubt to the test taker, helps to remove ambiguity in meaning, and adds an element of directiveness to the test-taker's cognitive processes. The Mega Test does the opposite; it demands that the respondent do the entire problem himself. One does not know, in the case of failure (or success), what exactly went wrong (or right). For example, was the solution given to an analogy problem a result of resourcefulness or long-term memory? Was the solution to a math problem a result of knowledge of how to use a hand-held calculator or the habitual use of a particular cognitive strategy?
One has more control over what might be going on in the respondent's mind if one has control over as many elements as possible in the test situation. Likewise, such control maximizes the likelihood that the item will measure a particular mental process.
The norming of the test relies upon the reports of individuals' scores on other tests. (Requests for such data from Mega Test takers have been made in articles in which the test has been published.) The problem with determining construct validity on the basis of these numbers is in the inherent error of self-selected, self-reported norming data and the self-administration of The Mega Test. No qualifications are made for factors such as when the standardized tests were given, who reads the periodicals in which the test has been published, who is sending in the data, and the circumstances surrounding the taking of The Mega Test. The sample is not large enough to assume that such error would be randomly distributed throughout the sample and thus be insignificant.
Aside from the problems of memory, time of administration, form of test, etc., there is the problem of combining standard scores and percentiles derived from many different administrations in order to make generalizations as if the standard scores and percentiles were all derived from one sample at one administration. Percentiles and standard scores are relative to the sample from which they are derived. Hoeflin uses the figures as if person A's performance at the 97th percentile on administration X is equivalent to person B's performance at the 97th percentile on administration Y. Extrapolating as far as Hoeflin does, and splitting percentiles into thousandths, overinterprets the meaning of a percentile. Making such fine discriminations in hundredths and thousandths of a percentile is not warranted on a test that only has a score range of 0 to 48. If psychometrics could be as precise as Hoeflin assumes, all tests would be interpreted in terms of fractions of percentiles. Instead, psychometricians have the concept of the standard error of a test score, which emphasizes the tentativeness of mental measurements rather than their exactness. Percentiles are usually used primarily for ease of interpreting the meaning of a standard score for a given administration.
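The point about the standard error can be made concrete with the classical formula, SEM = SD x sqrt(1 - reliability). Because no reliability is reported for The Mega Test, the reliability of .90 and the standard deviation of 16 used in the example call below are purely hypothetical, as are the function names. The sketch shows only how wide a score band becomes even under a generous assumption.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(score, sd, reliability, z=1.96):
    """Approximate 95% band around an obtained score (z = 1.96)."""
    sem = standard_error_of_measurement(sd, reliability)
    return score - z * sem, score + z * sem


if __name__ == "__main__":
    # Hypothetical inputs: SD of 16 and reliability of .90 are assumptions,
    # not values reported for The Mega Test.
    low, high = score_band(176.0, sd=16.0, reliability=0.90)
    print(round(low, 1), round(high, 1))
```

With those assumed values the standard error is about 5 IQ points and the 95% band spans roughly 10 points on either side of the obtained score, which is far coarser than distinctions of hundredths or thousandths of a percentile.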
While the approach that Hoeflin takes is interesting, inventive, intellectually stimulating, and internally consistent, it violates many good psychometric principles by overinterpreting the weak data of a self-selected sample. Indeed, if some educational psychologists are said to be number crunchers, what Hoeflin has done in the norming of his test results can be said to be nothing short of number pulverization! Numbers do not stand up by themselves but only in a nexus of assumptions about the milieu from which the numbers come.
References
Hoeflin, R. K. (1983). The Mega Test. Vidya, No. 42, pp. 7-13.
Hoeflin, R. K. (1984). Mega Test Two. Vidya, No. 55, pp. 4-10.
Hoeflin, R. K. (undated). The Mega Test: Raw Statistical Data. (Unpublished compilation.)
Langdon, K. (1978). Langdon Adult Intelligence Test. Berkeley: Polymath Systems.
Morris, S. (1985). Games. Omni, April, 1985, pp. 128-132.
Morris, S. (1986). Games. Omni, January, 1986, p. 112.
vos Savant, M. M. (1985). Omni I.Q. Quiz Contest. New York: McGraw-Hill.