Draw-A-Person test

The Draw-A-Person test, first conceived by Dr. Florence Goodenough in 1926, is a skill test to measure a child's mental age through a figure drawing task. It estimates the progress of learning visual, cognitive, and motor skills by having the candidate draw a human figure, scoring the drawing for presence and quality of figure features, and comparing the score to children's typical rate of acquisition of figure features. Its cultural bias is smaller than that of more verbal intelligence tests but still present to an extent, especially with respect to clothing, and many editions' scoring criteria fail to account for disabilities or other natural variations of the human form, as a test case shows. It has a few clear opportunities for improvement.

Advantages
Compared to some other, more verbal intelligence tests, Draw-A-Person (DAP) avoids biases associated with speech, hearing, or language difficulties. Figure drawing is "culture-reduced" in the sense that it does not suffer from cultural biases to nearly the same extent, so long as instructions are given in the local language. It also helps the candidate feel at ease before starting another, more formal test.

The subject of a person was chosen for several reasons. It is something all children know, it is fairly standard, and it can be drawn in coarse or fine detail. If candidates were allowed to choose their own subject, the cleverer ones would choose something easier, and it would be hard to separate faults related to the difficulty of the subject from faults caused by lack of ability. :15-16

In children under ten years old, figure drawing age is somewhat correlated with other measures of mental age, such as the Stanford–Binet Intelligence Scales and the Wechsler Intelligence Scale for Children.[citation needed: try "Validity" sections in Harris, Hasan, or DHEW] Though this correlation is not strong enough for DAP to be used alone to test intelligence, the correlation is stable enough that a DAP score at age 4 predicts intelligence test performance at age 14. Identical twins were also found to have more similar scores than fraternal twins.

Developing countries have limited resources to educate their children. Wise use of these resources includes placing students in the most appropriate school track for the student's intellectual capability. For example, when there are not enough spaces in university, some students would benefit more from a trade school than from a prep school. But where literacy is low and not everyone speaks the same language, a traditional verbal intelligence test is more difficult to administer. DAP was found useful in Pakistan, as it correlated well with scholastic success.

With adults, it would primarily be used to evaluate candidates for a course or job in graphic design.

Instructions
Instruct the candidate to make a picture of a person that shows the whole body, and to try hard, work carefully, and use as much time as needed to make the best picture. Earlier editions give a cute motivation for quality based on the tribalism of organized schooling: "I want to see whether the boys and girls in ____ School can do as well as those in other schools." :85 :240

A one-drawing test gives no instruction on age, sex, location, time period, or medical condition. A longer test uses a man–woman–self sequence or a same sex–opposite sex sequence, with a couple of minutes of rest between drawings. :240 Even if a candidate asks for clarification, the instructor must decline to make any suggestion: "Do it whatever way you think is best." :241

The original protocol gives no time limit. Some methods give a soft limit of five minutes and stop the drawing after seven because candidates continue to make changes that do not improve the score. A time limit was found to decrease scores by 3 to 6 points across the board, requiring renorming, but the timed test remained a valid test.

Criteria
Each revision of the examiner's manual for Draw-A-Person includes a checklist of features, each worth one point. This is followed by a formula to translate a score into an approximate mental age.

The following descriptions have been reworded. They do not substitute for the official manual in a formal test situation. "Thickness" means that a feature is two-dimensional as opposed to a single line or dot. "Opaque" means that the area of a feature occludes the lines of features behind it. "Correct number" of bilateral features usually means two for a front drawing or one for a profile.

University of Washington
A checklist for Goodenough's test distributed by the University of Washington listed the features below. After the candidate completes the illustration, count one point for each feature that is complete and correct:


 * Gross detail (6): 1 for each of head, legs, arms, and trunk; trunk has thickness and is longer than its width; top of trunk broadens to suggest shoulders
 * Attachment to trunk (4): Both arms and legs attached to trunk; arms and legs attached at correct points; neck present; neck outline continuous with head or trunk
 * Head detail (7): 1 for each of eye, nose, nostril, mouth, and hair; nose and mouth have thickness, including two lips; hairline present: hair has shape to it other than just a scribble around the circumference of the head
 * Clothing (5): At least one identifiable article; a second article, such as a hat or trousers, both articles opaque; all clothing opaque, including sleeves and trousers; four articles; fully dressed with an identifiable role, such as business suit or a soldier's uniform, including sleeves, trousers, and shoes
 * Fingers (5): Some indication of fingers; correct number of fingers; fingers thick and longer than width, and differing in angle by no more than 180 degrees; thumb appears distinct and opposable; identifiable "hand" section from MCP to wrist separate from fingers
 * Joints in limbs (2): Elbow or shoulder identifiable; knee or hip identifiable
 * Proportion (5): Trunk area is 2 to 10 heads; arms roughly as long as trunk and do not reach knee; legs between 1 and 2 trunks long; feet with thickness and between 1/10 and 1/3 of leg; arms and legs have thickness
 * Motor coordination (6): Lines are firm and do not leave marked gaps or overlaps where they join (except for "sketchy" short strokes in more mature drawings); :104 all lines are firm and joined correctly (extremely strict, bordering on professional quality, credited for fewer than three of Goodenough's 95 sample drawings); head is more shaped than a circle or ellipse and not obviously irregular; trunk is more shaped than a circle or ellipse and not obviously irregular; arms and legs have thickness, not obviously irregular, and not narrowing near trunk; features symmetrical to the extent applicable
 * Fine head detail (7): Correct number of ears for angle; ears positioned correctly; eye hair (brow or lashes); eye has pupil; eye longer than height; pupils pointing same direction in front or forward in profile; chin and forehead present
 * Profile bonus points (4): Chin projects; heel visible; head, trunk, and feet without error; straight-on side view with all features opaque and none doubled (sorry, Picasso)

Goodenough explained why she excluded some features, such as teeth, shading, movement, pupils facing forward, and three-quarter view. Some were too difficult to score, some were not monotonic (that is, they increased and decreased with age), and some depended more on the circumstances of the test (such as pencil hardness) than intellectual maturity. :20-21

Top score for a front view is 47; top score for a side view is 51, which adds the 4 profile bonus points. The standard deviation at a given age is around 7 to 8 points. If your illustration scores 40 or more, your skills are Goodenough to illustrate a children's book. Otherwise, your figure drawing age is 3 years, plus 3 months for each point.
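The arithmetic behind these totals and the age conversion can be sketched in Python. This is only an illustration: the section maxima are copied from the parenthesized counts in the checklist above, and the function names are made up for this example rather than taken from any manual.

```python
# Section maxima copied from the University of Washington checklist above.
WASHINGTON_MAX = {
    "gross detail": 6,
    "attachment to trunk": 4,
    "head detail": 7,
    "clothing": 5,
    "fingers": 5,
    "joints in limbs": 2,
    "proportion": 5,
    "motor coordination": 6,
    "fine head detail": 7,
}
PROFILE_BONUS_MAX = 4  # profile bonus points apply to side views only


def max_score(profile: bool = False) -> int:
    """Highest possible score for a front view, or a side view if profile is True."""
    front = sum(WASHINGTON_MAX.values())
    return front + PROFILE_BONUS_MAX if profile else front


def figure_drawing_age_months(points: int) -> int:
    """Apply the rule of 3 years plus 3 months per point; result is in months.

    The text applies this rule to scores below 40.
    """
    if points < 0:
        raise ValueError("points must be non-negative")
    return 3 * 12 + 3 * points


def format_age(months: int) -> str:
    """Render a month count as years and months."""
    years, remainder = divmod(months, 12)
    return f"{years} years, {remainder} months"
```

For example, `format_age(figure_drawing_age_months(28))` gives a figure drawing age of 10 years for a 28-point drawing.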

Harris scale
In 1963, Dale B. Harris presented a revised version of DAP resulting from his collaboration with Dr. Goodenough. It describes the research that led to a 73-point scale for male and 71-point scale for female drawings.

Harris's woman scale corrects for a long skirt, giving all leg points but no foot points. :87 It begins to phase out profile bonus points in Goodenough's scale, which Harris found to be too generous. :87-88

Use in Ghana
The 1988 edition of the test by Naglieri includes a more systematic scoring system, grading specific aspects (presence, proportion, and detail) of 14 features (arms, attachment, clothing, ears, eyes, feet, fingers, hair, head, legs, mouth, neck, nose, and trunk).
 * Presence: a feature is visible, even down to one line.
 * Proportion: length to width ratio of features other than hair falls within realistic bounds.
 * Detail: other aspects of the feature. For example, detail in clothing gives additional points for additional articles of clothing and correct opacity.
 * Bonus: award an additional point if presence, proportion, and detail for a feature are perfect

There are 50 points, plus 14 bonus points for perfect sections.
 * Arms: At least one arm; thickness and correct number of arms; thickness and longer than width; all arms pointing downward or in action.
 * Attachment: Head attached to neck or trunk; correct number of arms attached to trunk and not head; correct number of arms and legs (more than just feet) attached to trunk; arms attached to top half of trunk and legs attached to bottom half.
 * Clothing: 1 for each of up to three identifiable articles identifiable by shape, shading, or fastener; clothing is opaque. (Eyeglasses and earrings are not clothing; they are ear and eye details.)
 * Ears: At least one ear; correct number of ears; taller than width in all ears; earring or earlobe in at least one ear.
 * Eyes: At least one eye; more than a line or dot; details such as pupil, eye hair, or glasses; wider than height.
 * Feet: At least one foot distinct from leg; thickness; correct number of feet, all with detail such as toes, heel, or shoelace; at least one foot wider than height.
 * Fingers: Hand distinct from arm; five fingers; correct number of hands all with five fingers; thumb has distinct shape or position; all fingers have thickness; those fingers that have thickness are longer than width.
 * Hair: Presence; hair on sides of head or facial hair; distinct style such as part, braids, or decs, more than just a squiggle around the top half.
 * Head: Presence; bounding box including hair and ears is taller than width.
 * Legs: At least one leg distinct from foot; indication of knee or crotch; thickness and longer than width in both legs.
 * Mouth: Presence; thickness (lips, teeth, or open); thickness and wider than height.
 * Neck: Neck distinct from trunk; thickness; tangent to head or trunk or separated at bottom by a collar.
 * Nose: Present; indication of nostrils or bridge; taller than width.
 * Trunk: Piece other than head, arms, and legs; indication of waist, belt, chest, or shoulder; taller than width.
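The way the 50 points and 14 bonus points combine can be sketched as follows. The per-section maxima are counted from the semicolon-separated items in the list above, which is an assumption wherever one item bundles several conditions, and `naglieri_total` is a hypothetical helper name, not part of any published manual.

```python
# Per-section maxima, counted from the list above (assumed, see lead-in).
NAGLIERI_MAX = {
    "arms": 4, "attachment": 4, "clothing": 4, "ears": 4, "eyes": 4,
    "feet": 4, "fingers": 6, "hair": 3, "head": 2, "legs": 3,
    "mouth": 3, "neck": 3, "nose": 3, "trunk": 3,
}


def naglieri_total(earned: dict) -> int:
    """Sum the points earned per section, adding 1 bonus point for each perfect section.

    Sections missing from `earned` count as 0 points.
    """
    total = 0
    for section, maximum in NAGLIERI_MAX.items():
        points = earned.get(section, 0)
        if not 0 <= points <= maximum:
            raise ValueError(f"{section}: expected 0..{maximum}, got {points}")
        total += points
        if points == maximum:
            total += 1  # bonus point for a perfect section
    return total
```

A drawing that earns every point in every section reaches 50 + 14 = 64, while one that earns 4/4 on arms and 2/3 on neck and nothing else totals 4 + 1 + 2 = 7.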

The Ghanaian manual shows an illustration of each feature in order to clarify what counts as a point.

Other scales
Some scales for scoring DAP attempt to correct for deficiencies in the original test.

An excerpt from the 2004 edition of the Draw-A-Person test by Reynolds and Hickman was briefly made available to us. Like the Naglieri edition, it rearranges the criteria into features for which 1 to 5 quality grades are given, and it includes a sample illustration for each quality grade. It introduces the additional instruction to draw a picture of yourself from the front, which may reduce the temptation for especially bright people to draw an "edge case" character. Some criteria have been made less discriminatory, such as not counting fingers, judging fasteners and other clothing detail instead of counting pieces, and showing distinct toes in lieu of shoes. Others have not changed, such as the bias against mittens, long hair, or a skirted garment that covers fingers, ears, or legs. One of the example figures for "0" on waist through ankles resembles a man wearing an ankle-length shirt or a long trenchcoat.

Other scales have varying amounts of clothing and ableist bias. The 5-Minute Pediatric Consult makes no mention of shoes, knees, crotch, or leg length, which means it doesn't penalize for bare feet or a skirt. But on its 28-point, 3-point-per-year scale, it deducts 3 points for missing legs, 1 for missing ears, and 3 points for mittens.
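To put those deductions in terms of age, the 3-point-per-year rate can be applied directly. This is a rough sketch that uses only the figures quoted above; the scale's starting age is not given here, so only the size of the penalty is computed.

```python
POINTS_PER_YEAR = 3  # rate stated for this 28-point scale

# Deductions named in the paragraph above.
DEDUCTIONS = {"missing legs": 3, "missing ears": 1, "mittens": 3}


def penalty_months(points: int) -> int:
    """Convert lost points into months of estimated age at 3 points per year."""
    return points * 12 // POINTS_PER_YEAR


total_points_lost = sum(DEDUCTIONS.values())
total_months_lost = penalty_months(total_points_lost)
```

A character with no legs, hidden ears, and mittens would lose 7 points, or 28 months of estimated age, on this scale.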

DAP has been adapted as a "personal neglect test" for stroke patients in their 60s. The scoring gave ten points, one each for presence of head, torso, left arm, left hand, left leg, left foot, right arm, right hand, right leg, and right foot. A character showing unilateral features (difference in scores between left and right side) showed hemineglect in the candidate, and bilateral features were noticeably correlated with performance in the candidate's activities of daily living (ADLs).
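The ten-point tally and the left-right comparison can be sketched as below. The part names follow the list in the paragraph above; the function name is hypothetical, and no cutoff for diagnosing hemineglect is given in the text, so the sketch only reports the raw difference.

```python
from typing import Iterable

MIDLINE_PARTS = ("head", "torso")
SIDE_PARTS = ("arm", "hand", "leg", "foot")


def neglect_score(present: Iterable[str]) -> dict:
    """Score one point per body part present and compare left and right sides."""
    drawn = set(present)
    midline = sum(part in drawn for part in MIDLINE_PARTS)
    left = sum(f"left {part}" in drawn for part in SIDE_PARTS)
    right = sum(f"right {part}" in drawn for part in SIDE_PARTS)
    return {
        "total": midline + left + right,  # out of 10
        "left": left,                     # out of 4
        "right": right,                   # out of 4
        "asymmetry": abs(left - right),   # 0 means both sides match
    }


# A drawing that omits the entire left side of the figure.
example = neglect_score(
    ["head", "torso", "right arm", "right hand", "right leg", "right foot"]
)
```

Here `example` has a total of 6 out of 10 and an asymmetry of 4, the largest possible difference between sides.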

At times, the test has been modified to use subjects other than the human figure. John Buck created the House-Tree-Person test in 1948. Rebecca Lawson adapted some of the methodology of DAP to test awareness of the parts of a bicycle in 2006, along with a multiple-choice follow-up to illustrate the difference between recall and recognition.

Projective test
Some variants of Draw-A-Person are intended as a projective test to measure emotional disturbance rather than figure drawing age. The Draw-A-Person: Screening Procedure for Emotional Disturbance (DAP:SPED) test, for instance, requires the candidate to make drawings of man, woman, and self, and grades them based on inclusion and omission of features that correlate with emotional disturbance, even if this disturbance is over- or underreported by the candidate's parent.

Much of the Machover interpretation is based on the size of various features. It treats a large head as representing "a large ego" or a paranoid or narcissistic personality, and it treats drawing the head last as showing "disturbances with interpersonal relationships." A disconnected neck could mean schizophrenia, and eyelashes or high-heel shoes drawn by a man are read as signs of homosexuality. Stereotypical Freudian theories abound. I smell what RationalWiki calls woo. So did Harris, who found no validity in personality testing through human figure drawing. He rejected the use of "an elaborate theory of symbolism" to interpret the stylization of features, instead preferring to let the child lead with a simple "Tell me about it" after the drawing. :148-152

Sample
I'll grade a drawing of a girl based on these criteria.

University of Washington
"Bidge" scores 28/47, for a figure drawing age of 10 years.


 * Gross detail: 5/6 (legs not present)
 * Attachment to trunk: 2/4 (legs not attached, legs not attached at correct points)
 * Head detail: 5/7 (no nostrils, no mouth thickness)
 * Clothing: 4/5 (no trousers)
 * Fingers: 2/5 (fingers not visible, finger count not visible, finger width not visible)
 * Joints in limbs: 1/2 (no knee)
 * Proportion: 2/5 (legs have no length, feet have no length, legs have no thickness)
 * Motor coordination: 3/6 (lines not perfect especially in hem of cape, head is too close to a circle, no leg thickness)
 * Fine head detail: 4/7 (ears hidden by hood, no eyebrows, eyes are round)

Breakdown of missed points:
 * Expected to miss: 1 (lines not perfect)
 * Actual mistake: 1 (lack of eyebrows)
 * Clothing: 4 (hood and mittens covering ears and fingers)
 * Stylization: 4 (head shape, eye shape, nose, mouth). Though stylization can be evidence of an inexperienced or lazy artist, the instructions don't mention realism at this point.
 * Anatomy: a full 9 points
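As a cross-check, the section scores and the breakdown of missed points can be reconciled with a few lines of Python. The numbers are copied from the two lists above; the variable names are only illustrative.

```python
# Section maxima from the University of Washington checklist (front view).
SECTION_MAX = {
    "gross detail": 6, "attachment to trunk": 4, "head detail": 7,
    "clothing": 5, "fingers": 5, "joints in limbs": 2,
    "proportion": 5, "motor coordination": 6, "fine head detail": 7,
}

# Bidge's section scores as graded above.
BIDGE = {
    "gross detail": 5, "attachment to trunk": 2, "head detail": 5,
    "clothing": 4, "fingers": 2, "joints in limbs": 1,
    "proportion": 2, "motor coordination": 3, "fine head detail": 4,
}

# Missed points grouped by cause, from the breakdown above.
MISSED_BY_CAUSE = {
    "expected to miss": 1, "actual mistake": 1,
    "clothing": 4, "stylization": 4, "anatomy": 9,
}

score = sum(BIDGE.values())
maximum = sum(SECTION_MAX.values())
missed = maximum - score

# The causes should account for every missed point.
causes_account_for_all = sum(MISSED_BY_CAUSE.values()) == missed

# Figure drawing age: 3 years plus 3 months per point, in months.
age_months = 3 * 12 + 3 * score
```

The causes account for all 19 missed points, and anatomy alone makes up nearly half of them.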

Use in Ghana
"Bidge" scores 32/64 based on the Naglieri criteria from the Ghanaian manual.
 * Arms: 4/4
 * Attachment: 2/4 (legs not attached to trunk, legs not attached to bottom half)
 * Clothing: 4/4
 * Ears: 0/4 (hood covers ears)
 * Eyes: 3/4 (round, not wide)
 * Feet: 0/4
 * Fingers: 4/6 (mittens cover fingers)
 * Hair: 2/3 (hood covers side hair)
 * Head: 1/2 (round head)
 * Legs: 0/3
 * Mouth: 1/3 (no thickness)
 * Neck: 3/3
 * Nose: 1/3 (round button)
 * Trunk: 3/3
 * Perfect sections: 4/14

Breakdown of missed points:
 * Clothing: 7 (hood and mittens covering ears and fingers), plus 3 perfect sections
 * Stylization: 6 (round eyes, round head, no lips, button nose), plus 4 perfect sections
 * Anatomy: 9 (no legs, feet, or attachment thereof), plus 3 perfect sections
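The grouping by cause can be reproduced from the per-section scores. In this sketch each imperfect section is tagged with the cause given in its parenthetical note above (for feet and legs, with the anatomy cause from the breakdown); the tagging and the `tally` helper are illustrative assumptions, not part of the Ghanaian manual.

```python
from collections import defaultdict

# (section, points earned, section maximum, cause of missed points or None)
BIDGE_NAGLIERI = [
    ("arms", 4, 4, None),
    ("attachment", 2, 4, "anatomy"),
    ("clothing", 4, 4, None),
    ("ears", 0, 4, "clothing"),
    ("eyes", 3, 4, "stylization"),
    ("feet", 0, 4, "anatomy"),
    ("fingers", 4, 6, "clothing"),
    ("hair", 2, 3, "clothing"),
    ("head", 1, 2, "stylization"),
    ("legs", 0, 3, "anatomy"),
    ("mouth", 1, 3, "stylization"),
    ("neck", 3, 3, None),
    ("nose", 1, 3, "stylization"),
    ("trunk", 3, 3, None),
]


def tally(rows):
    """Return (total including bonus, {cause: (points missed, perfect sections missed)})."""
    base = sum(earned for _, earned, _, _ in rows)
    perfect = sum(1 for _, earned, maximum, _ in rows if earned == maximum)
    missed = defaultdict(lambda: [0, 0])
    for _, earned, maximum, cause in rows:
        if earned < maximum:
            missed[cause][0] += maximum - earned  # points lost in this section
            missed[cause][1] += 1                 # one perfect-section bonus lost
    return base + perfect, {cause: tuple(v) for cause, v in missed.items()}


total, missed_by_cause = tally(BIDGE_NAGLIERI)
```

Under these tags, `total` is 32 and the three causes cover all 22 missed base points and all 10 missed bonus points.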

Limits
No intelligence test is perfect. There are good reasons, unrelated to the candidate's skill, for an illustration to lack some of the above features.

Figure drawing age is not intended to estimate mental age in older candidates for several reasons. One is that the correlation starts to become weaker after age ten, especially in teens without intellectual disability. Scores hit a noticeable ceiling after age 12, despite attempts to find new items that target adolescents. :99 A second is that older candidates are likely to have taken drawing lessons. A study at an elementary school in Pennsylvania showed that incorporating two hours of figure drawing into a kindergarten anatomy curriculum noticeably improved the detail of the students' drawings, even though it didn't significantly improve scores in the protocol used. Finally, an especially bright candidate may recognize the psychological test and attempt to confuse the investigator. One eleven-year-old boy drew a collection of weapons as well as a pet dropping bombs onto a second figure labeled "Father". Later he revealed that he was trying to break the test, :148-149 possibly to feign warning signs associated with antisocial personality disorder. Another may "show off" her skill and sense of inclusivity by drawing a character whose appearance differs from that of an "average" illustration for a good reason, and DAP is not intended to handle cases like this.

One cause of difference is sex. The United States Department of Health, Education, and Welfare (DHEW) conducted a study in the 1970s, based on the Harris test with its separate woman scale. Boys and girls drew male figures equally well, but girls ten years old or older tended to be six points better at drawing the female figure than boys of the same age. This can complicate analysis using test protocols that ask for one drawing of each sex. Some protocols use only one drawing, and for these, boys prefer to draw boys and girls prefer to draw girls. This tendency is so strong that projective versions of the test treat drawing the opposite sex first as a sign of transgender tendency.

Another is art style. Early editions failed to instruct the candidate that "good" and "best" mean "detailed realism", as opposed to intentional stylization. Certain eye and nose styles associated with illustration and animation may cost fine head detail points. Children were found more likely to apply cartoon stylization to a self-portrait than to a drawing of a generic man or woman. :151 Since the first edition of the test, illustration convention in children's entertainment shifted from the relative realism of Gray and Sharp's Dick and Jane to the extreme stylization of 21st century cartoons such as South Park and The Amazing World of Gumball, not to mention manga and anime. This art style shift can be seen even within a single long-running serial work, especially the comic strip "Goofus & Gallant" in Highlights for Children. And it may be part of what led newer editions of the test to add explicit instructions against "a cartoon or stick figure."

Some claim that DAP misses the intelligence of a candidate who knows their own limits. A hand in a pocket, for instance, is often graded as failure to draw a hand. Placing hands in pockets to avoid drawing them may show that a candidate is "clever enough to hide their inability in a clever manner", or in other words that the candidate is averting the Dunning–Kruger effect. It still costs points, though not significantly many among 13- to 15-year-olds. :86-87

A few choices of subject matter may not only reduce the estimation of mental age but also show up on emotional disturbance indicator scales:
 * A character who has lost a finger to an accident or violence loses not only the use of the finger (say, the one that Dr. Smeagol and Mr. Gollum bit off) but also a point for correct number of fingers.
 * If the protocol doesn't specify an adult, the candidate may end up drawing a very young character, which may cost a point for arms not reaching the bottom of the trunk. So may a character with a short-limbed, stocky build due to hypochondroplasia.
 * A character from a chibiverse may cost a point for head length.
 * A character who happens to have no legs may cost two to three years' worth of points for leg presence, leg attachment, leg proportion, trousers, and fingers (because of mitten hands). This gives many scales an uncompensated ableist bias. Typical instructions fail to specify that the person shall be "healthy"; in fact, candidates may be told to draw "any kind of person you want to draw." Goodenough recommended use of "common sense" when scoring drawings of a character with one leg and a suggestion of a crutch, :91 but this doesn't appear in later scoring manuals. In particular, Harris changes it to "crotch" for some odd reason. :148-149

Culture dependency
A human figure drawing test of this structure is not all that useful for making cross-cultural comparisons of children's intellectual maturity. Even though DAP is far more culture-reduced than other intelligence tests, it must still be standardized separately for each country.

Socioeconomic status can affect performance. Different countries have different average scores, in part because socioeconomically advanced populations tend to foster higher intelligence. Developing countries may need to modify the test to accommodate a lack of available labor to score the drawings. In Ghana, for instance, the man-woman-self test sequence was trimmed to man-woman. For this and other reasons, Harris also presented a 12-step "quality scale" for a more rapid but less precise and more subjective assessment. :302

The aspects of a drawing considered as signs of maturity may differ from one culture to another. In Pakistan, showing knees is taboo, and the feature had to be dropped from the scale for that country. Beards are seen as a symbol of manhood and may cover the neck. Highly intelligent girls in Pakistan drew detailed clothes and jewelry but lost points for missing things like a nose or the pupil of an eye. With cultural pressure on girls to marry early, some girls emphasized beauty over including body parts and thus lost points. Female modesty aspects considered desirable in Islam, such as a rear view or closed eyes, also cost points. For this reason, Harris rejected DAP's validity for comparisons across cultures, instead suggesting that "for the most valid results, the points of the scale should be restandardized for every group having a distinctly different pattern of dress, mode of living, and quality or level of academic education." :133

Predictably, the fact that the test was conceived in the Western world has led to wide dissemination of scales that are highly specialized toward twentieth century Euroamerican men's dress. The crotch, knee, and leg points assume that the character in the illustration will be wearing pants rather than a long skirt. On the Washington scale, which summarizes the Goodenough scale, a character wearing a skirted garment may cost 2 Clothing points that mention trousers and one Joints point for knee. (Goodenough wrote that she deliberately chose a man for what was then called Draw-A-Man (DAM) :iii because of "greater uniformity of [Western] men's clothing". :16) The Ghanaian manual's scoring examples for "No point" show the same trouser tyranny, including a man in an ankle-length shirt, a woman in a long dress, and other figures wearing skirted garments that fall past the knees. It becomes even worse for a character wearing a wide skirt, such as a hoop skirt, which a candidate may draw to emphasize femininity but may cost several leg and foot points as well. The Harris scale fares well in this respect, giving points on the woman scale for what was considered "feminine" when it was published in 1963. But it still reflects Western cisgender norms, assuming for instance that a man will not dress "feminine" or wear a long coat, and that neither sex will wear loose clothing that disguises the shape of the waist and hips, as is common in the Middle East.

The characters that children draw changed between 1977 and 2015. Among German candidates aged 6 to 7 tested at school, girls tested in 2015 chose to draw a female character more often than in 1977, and children drew female characters as more distinctly feminine in their tertiary characteristics (clothing, hair style, and the like). Gender status has become more equal as gender differentiation has increased. This led Bettina Lamm, leader of the 2015 study at University of Münster and Osnabrück University, to conclude that it has become more accepted to appear feminine.

Various cultural subject matter can reduce the score on a particular scale for even a perfect drawing of a given character:
 * A character wearing a one-piece coverall, a one-piece dress, or a long shirt may cost a point in scales that count articles of clothing, as these replace separate garments for the trunk and legs.
 * A character who goes barefoot, whether due to living in a warm climate or due to having feet with tough soles and hair-covered uppers, costs points on scales that don't consider a character "fully dressed" without shoes.
 * A hat that completely covers the hair, such as the hat of a pattern-bald older man or the headscarf of a Muslim woman, may cost points for hair and ears.
 * A character wearing mittens may cost points for lack of fingers. This and lack of visible ears may be more common in an Eskimo character who wears a parka. But in an Eskimo study, points lost to fingers were often regained on the opposable thumb, nose, eyebrows, distinct costume, and especially boot details. :131-133

Further study directions
It may be possible to create a scale that allows fairer comparisons across cultures. Here are some principles to keep in mind:


 * The crotch criterion in several scales is intended to penalize immature drawings with an excessive "thigh gap", where legs are parallel and attached to the trunk too far apart. (Think of SpongeBob SquarePants.) The Harris woman scale includes an alternate criterion based on lower leg angles that compensates for the effect of a skirt that is calf-length or shorter. :284 Generalize skirt compensations to the man scale as well, using language borrowed from the Harris woman scale. This might improve validity with drawings of a long coat, tunic, kilt, or sarong.
 * To remove the bias associated with counting articles, treat a single article that covers both the chest and thighs as two articles.
 * Give the test in multiple regions and find items that correlate more with culture than with intellectual ability. Then balance the cultural items such that penalties for items uncommon in one culture compensate for penalties for items uncommon in another culture. For example, balance each body part such that possible detail points when covered match those when left bare. If 3 points are possible for shoes, 3 points ought to be possible for bare feet.
 * Formalize Goodenough's "common sense" treatment of disability in characters. During the "Tell me about it" phase, the candidate may clarify that the character is physically impaired and uses other body parts to compensate. Score those parts as both the homologous and analogous part. (A homologous part has the same position in the body plan; an analogous part has the same function.) For example, Bidge's arms would be scored on both the (homologous) arm criteria and the (analogous) leg criteria, and treating her mittens as boots would make up for some lost finger points. This quantum superposition of arms and legs-attached-to-shoulders would restore 9 points on the Goodenough scale (1 gross detail, 2 attachment, 1 clothing, 1 joints, 3 proportion, 1 coordination) and 10 on the Naglieri scale (2 attachment, 3 feet, 3 legs, 2 perfect sections).
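The effect of the last proposal on Bidge's scores can be worked through as a sketch. The restored points are the ones itemized in the final bullet, the starting scores are the 28/47 and 32/64 from the Sample section, and applying the 3-years-plus-3-months-per-point rule to the adjusted score is an assumption made only for illustration.

```python
# Points restored by scoring arms as both arms and legs, as itemized above.
GOODENOUGH_RESTORED = {
    "gross detail": 1, "attachment": 2, "clothing": 1,
    "joints": 1, "proportion": 3, "coordination": 1,
}
NAGLIERI_RESTORED = {
    "attachment": 2, "feet": 3, "legs": 3, "perfect sections": 2,
}

GOODENOUGH_BEFORE = 28  # out of 47, from the Sample section
NAGLIERI_BEFORE = 32    # out of 64, from the Sample section


def age_months(points: int) -> int:
    """Figure drawing age in months: 3 years plus 3 months per point."""
    return 3 * 12 + 3 * points


goodenough_after = GOODENOUGH_BEFORE + sum(GOODENOUGH_RESTORED.values())
naglieri_after = NAGLIERI_BEFORE + sum(NAGLIERI_RESTORED.values())

# Change in figure drawing age on the Goodenough scale, in months.
age_gain_months = age_months(goodenough_after) - age_months(GOODENOUGH_BEFORE)
```

Under these assumptions the Goodenough score rises from 28 to 37 out of 47, which moves the figure drawing age from 10 years to 12 years 3 months, and the Naglieri score rises from 32 to 42 out of 64.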