Behavior Research Methods
The final publication is available at Springer via http://dx.doi.org/10.3758/s13428-014-0536-1
Spontaneous facial expression in unscripted social interactions can be
measured automatically
Jeffrey M. Girard, University of Pittsburgh
Jeffrey F. Cohn, University of Pittsburgh and Carnegie Mellon University
Laszlo A. Jeni, Carnegie Mellon University
Michael A. Sayette, University of Pittsburgh
Fernando De la Torre, Carnegie Mellon University
Methods to assess individual facial actions have potential to shed light on important behavioral phenomena ranging from emotion and social interaction to psychological disorders and health. However, manual coding of such actions is labor intensive and requires extensive training. To date, establishing reliable automated coding of unscripted facial actions has been a daunting challenge impeding development of psychological theories and applications requiring facial expression assessment. It is therefore essential that automated coding systems be developed with enough precision and robustness to ease the burden of manual coding in challenging data involving variation in participant gender, ethnicity, head pose, speech, and occlusion. We report a major advance in automated coding of spontaneous facial actions during an unscripted social interaction involving three strangers. For each participant (n = 80, 47% women, 15% Nonwhite), 25 facial action units (AUs) were manually coded from video using the Facial Action Coding System. Twelve AUs occurred more than 3% of the time and were processed using automated FACS coding. Automated coding showed very strong reliability for the proportion of time that each AU occurred (mean intraclass correlation = 0.89), and the more stringent criterion of frame-by-frame reliability was moderate to strong (mean Matthews correlation = 0.61). With few exceptions, differences in AU detection related to gender, ethnicity, pose, and average pixel intensity were small. Fewer than 6% of frames could be coded manually but not automatically. These findings suggest automated FACS coding has progressed sufficiently to be applied to observational research in emotion and related areas of study.
Keywords: facial expression, FACS, affective computing, automated coding
Author Note

Jeffrey M. Girard, Department of Psychology, University of Pittsburgh; Jeffrey F. Cohn, Department of Psychology, University of Pittsburgh, and The Robotics Institute, Carnegie Mellon University; Laszlo A. Jeni, The Robotics Institute, Carnegie Mellon University; Michael A. Sayette, Department of Psychology, University of Pittsburgh; Fernando De la Torre, The Robotics Institute, Carnegie Mellon University.

This work was supported in part by US National Institutes of Health grants R01 MH051435 and R01 AA015773.

Correspondence concerning this article should be addressed to Jeffrey Girard, 4325 Sennott Square, University of Pittsburgh, Pittsburgh, PA 15260. Email: [email protected]

During the past few decades, some of the most striking findings about affective disorders, schizophrenia, addiction, developmental psychopathology, and health have been based on sophisticated coding of facial expressions. For instance, it has been found that facial expression coding using the Facial Action Coding System (FACS), which is the most comprehensive system for coding facial behavior (Ekman, Friesen, & Hager, 2002), identifies which depressed patients are at greatest risk for reattempting suicide (Archinard, Haynal-Reymond, & Heller, 2000); constitutes an index of physical pain with desirable psychometric properties (Prkachin & Solomon, 2008); distinguishes different types of adolescent behavior problems (Keltner, Moffitt, & Stouthamer-Loeber, 1995); and distinguishes between European-American, Japanese, and Chinese infants (Camras et al., 1998).
These findings have offered glimpses into critical areas of human behavior that were not possible using existing methods of assessment, often generating considerable research excitement and media attention.

As striking as these original findings were, it is just as striking how little follow-up work has occurred using these methods. The two primary reasons for this curious state of affairs are the intensive training required to learn facial expression coding and the extremely time-consuming nature of the coding itself. Paul Ekman, one of the creators of FACS, notes that certification in FACS requires about 6 months of training and that FACS coding a single minute of video can take over an hour (Ekman, 1982).

FACS (Ekman & Friesen, 1978; Ekman et al., 2002) is an anatomically-based system for measuring nearly all visually-discernible facial movement. FACS describes facial activities in terms of unique facial action units (AUs), which correspond to the contraction of one or more facial muscles. Any facial expression may be represented as a single AU or a combination of multiple AUs. For example, the Duchenne smile (i.e., enjoyment smile) is indicated by simultaneous contraction of the zygomatic major (AU 12) and orbicularis oculi pars lateralis (AU 6). Although there are alternative systems for characterizing facial expression (e.g., Izard, 1979; Abrantes & Pereira, 1999), FACS is recognized as the most comprehensive and objective means for measuring facial movement currently available, and it has become the standard for facial measurement in behavioral research (Cohn & Ekman, 2005; Ekman & Rosenberg, 2005).

Given the often-prohibitive time commitment of FACS coding, there has been great interest in developing computer vision methods for automating facial expression coding. If successful, these methods would greatly improve the efficiency and reliability of AU detection and, importantly, make its use feasible in applied settings outside of research.

Although the advantages of automated facial expression coding are apparent, the challenges of developing such systems are considerable. While human observers easily accommodate variations in pose, scale, illumination, occlusion, and individual differences (e.g., gender and ethnicity), these and other sources of variation represent considerable challenges for a computer vision system. Further, there is the machine learning challenge of automatically detecting actions that require significant training and expertise even for human coders.

There has been significant effort to develop computer-vision-based approaches to automated facial expression analysis. Most of this work has focused on prototypic emotion expressions (e.g., joy and anger) in posed behavior. Zeng, Pantic, Roisman, and Huang (2009) have reviewed this literature through 2009. Within the past few years, studies have progressed to AU detection in actor portrayals of emotion (Valstar, Bihan, Mehu, Pantic, & Scherer, 2011) and the more challenging task of AU detection during spontaneous facial behavior. Examples of the latter include AU detection in physical pain (G. C. Littlewort, Bartlett, & Lee, 2009; P. Lucey, Cohn, Howlett, Member, & Sridharan, 2011), interviews (Bartlett et al., 2006; Girard, Cohn, Mahoor, Mavadati, Hammal, & Rosenwald, 2013; S. Lucey, Matthews, Ambadar, De la Torre, & Cohn, 2006), and computer-mediated tasks such as watching a video clip or filling out a form (Hoque, McDuff, & Picard, 2012; Grafsgaard, Wiggins, Boyer, Wiebe, & Lester, 2013; G. Littlewort et al., 2011; Mavadati, Mahoor, Bartlett, Trinh, & Cohn, 2013; McDuff, El Kaliouby, Kodra, & Picard, 2013).

While much progress has been made, the current state of the science is limited in several key respects. Stimuli to elicit spontaneous facial actions have been highly controlled (e.g., watching pre-selected video clips or replying to structured interviews) and camera orientation has been frontal with little or no variation in head pose. Non-frontal pose matters because the face looks different when viewed from different orientations and parts of the face may become self-occluded. Rapid head movement also may be difficult to automatically track through a video sequence. Head motion and orientation to the camera are important if AU detection is to be accomplished in social settings where facial expressions often co-occur with head motion. For example, the face and head pitch forward and laterally during social embarrassment (Keltner et al., 1995; Ambadar, Cohn, & Reed, 2009). Kraut and Johnston (1979) found that successful bowlers smile only as they turn away from the bowling lane and toward their friends.

Whether automated methods can detect spontaneous facial expressions in the presence of head pose variation is unknown, as too few studies have encountered or reported on it. Messinger, Mahoor, Chow, and Cohn (2009) encountered out-of-plane head motion in video of infants, but neglected to report whether it affected AU detection. Cohn and Sayette (2010) reported preliminary evidence that AU detection may be robust to pose variation up to 15 degrees from frontal. Similarly, we know little about the effects of gender and ethnicity on AU detection. Face shape and texture vary between men and women (Bruce & Young, 1998), and may be further altered through the use of cosmetics. Skin color is an additional factor that may affect AU detection. Accordingly, little is known about the operational parameters of automated AU detection. For these reasons, automated FACS coding must prove robust to these challenges.

The current study evaluates automated FACS coding using a database that is well-suited to testing just how far automated methods have progressed, and how close we are to using them to study naturally-occurring facial expressions.
This investigation focuses on spontaneous facial expression in a far larger database (over 400,000 video frames from 80 people) than ever attempted; it includes men and women, Whites and Nonwhites, and a wide range of facial AUs that vary in intensity and head orientation. Because this database contains variation in head pose and participant gender, as well as moderate variation in illumination and participant ethnicity, we can examine their effect on AU detection. To demonstrate automated AU detection in such a challenging database would mark a crucial step toward the goal of establishing fully-automated systems capable of use in varied research and applied settings.

Method

Participants

The current study used digital video from 80 participants (53% male, 85% White, average age 22.2 years) who were participating in a larger study on the impact of alcohol on group formation processes (for elaboration, see Sayette et al., 2012). They were randomly assigned to groups of 3 unacquainted participants. Whenever possible, all three participants in a group were analyzed. Some participants were not analyzable due to excessive occlusion from hair or head wear (n = 6) or gum chewing (n = 1). Participants were randomly assigned to drink isovolumic alcoholic beverages (n = 31), placebo beverages (n = 21), or nonalcoholic control beverages (n = 28); all participants in a group drank the same type of beverage. The majority of participants were from groups with a mixed gender composition of two males and one female (n = 32) or two females and one male (n = 26), although some were from all male (n = 12) or all female (n = 10) groups. All participants reported that they had not consumed alcohol or psychoactive drugs (except nicotine or caffeine) during the 24 hour period leading up to the observation.

Setting and Equipment

All participants were previously unacquainted. They first met only after entering the observation room where they were seated approximately equidistantly from each other around a circular (75 cm diameter) table. They were asked to consume a beverage consisting of cranberry juice or cranberry juice and vodka (a 0.82 g/kg dose of alcohol for males and a 0.74 g/kg dose of alcohol for females) before engaging in a variety of cognitive tasks. We focus on a portion of the 36-minute unstructured observation period in which participants became acquainted with each other (mean duration 2.69 minutes). Separate wall-mounted cameras faced each person. It was initially explained that the cameras were focused on their drinks and would be used to monitor their consumption rate from the adjoining room, although participants later were told of our interest in observing their behavior and a second consent form was signed if participants were willing. All participants consented to this use of their data.

The laboratory included a custom-designed video control system that permitted synchronized video output for each participant, as well as an overhead shot of the group. The individual view for each participant was used in this report. The video data collected by each camera had a standard frame rate of 29.97 frames per second and a resolution of 640×480 pixels. Audio was recorded from a single microphone. The automated FACS coding system was processed on a Dell T5600 workstation with 128GB of RAM and dual Xeon E5 processors. The system also runs on standard desktop computers.

Manual FACS Coding

The FACS manual (Ekman et al., 2002) defines 32 distinct facial action units. All but 7 were manually coded. Omitted were three "optional" AUs related to eye closure (AUs 43, 45, and 46), three AUs related to mouth opening or closure (AUs 8, 25, and 26), and one AU that occurs on the neck rather than the face (AU 21). The remaining 25 AUs were manually coded from onset (start) to offset (stop) by one of two certified and highly experienced FACS coders using Observer XT software (Noldus Information Technology, 2013). AU onsets were annotated when they reached slight or B level intensity according to FACS; the corresponding offsets were annotated when they fell below B level intensity. AUs of lower intensity (i.e., A level intensity) are ambiguous and difficult to detect for both manual and automated coders. The original FACS manual (Ekman & Friesen, 1978) did not code A level intensity (referred to there as "trace"). All AUs were annotated during speech.

Figure 1. Base rates of all the coded facial action units from a subset of the data (n = 56).

Because highly skewed class distributions severely attenuate measures of classifier performance (Jeni, Cohn, & De la Torre, 2013), AUs that occurred less than about 3% of the time were excluded from analysis. Thirteen AUs were omitted on this account. Five of them either never occurred or occurred less than 1% of the time. Manual coding of these five AUs was suspended after the first 56 subjects.
Visual inspection of Figure 1 reveals that there was a large gap between the AUs that occurred approximately 10% or more of the time and those that occurred approximately 3% or less of the time. The class distributions of the excluded AUs were at least 3 times more skewed than those of the included AUs. In all, 12 AUs met base-rate criteria and were included for automatic FACS coding.
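For readers who wish to apply the same screening rule to their own data, the sketch below computes per-AU base rates from a frame-by-AU table of binary manual codes and applies the 3% cutoff. The DataFrame layout and column names are illustrative assumptions, not the authors' actual data structure.

```python
import pandas as pd

def select_aus_by_base_rate(manual_codes: pd.DataFrame, min_rate: float = 0.03) -> list:
    """Return the AU columns whose base rate (proportion of frames coded present)
    meets the minimum threshold; the remaining AUs would be excluded from analysis."""
    base_rates = manual_codes.mean(axis=0)  # column means of 0/1 codes = base rates
    return base_rates[base_rates >= min_rate].index.tolist()

# Hypothetical usage with an exported frame-by-AU table (columns like "AU06", "AU12", ...):
# manual_codes = pd.read_csv("manual_codes.csv")
# included_aus = select_aus_by_base_rate(manual_codes, min_rate=0.03)
```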
To assess inter-observer reliability, video from 17 participants was annotated by both coders. Mean frame-level reliability was quantified with the Matthews Correlation Coefficient (MCC), which is robust to agreement due to chance, as described below. The average MCC was 0.80, ranging from 0.69 for AU 24 to 0.88 for AU 12; according to convention, these numbers can be considered strong to very strong reliability (Chung, 2007). This high degree of inter-observer reliability is likely due to extensive training and supervision of the coders.

Automatic FACS Coding

Figure 2 shows an overview of the AU detection pipeline. The face is detected automatically and facial landmarks are detected and tracked. The face images and landmarks are normalized to control for variation in size and orientation, and appearance features are extracted. The features are then input to classification algorithms, as described below. Please note that these procedures do not provide incremental results; all of them are required to perform classification and to calculate an inter-system reliability score.

Figure 2. Automated FACS Coding Pipeline.

Image Metrics Tracking. The first step in automatically detecting AUs was to locate the face and facial landmarks. Landmarks refer to points that define the shape of permanent facial features, such as the eyes and lips. This step was accomplished using the LiveDriver SDK (Image Metrics, 2013), which is a generic tracker that requires no individualized training to track facial landmarks of persons it has never seen before. It locates the two-dimensional coordinates of 64 facial landmarks in each image. These landmarks correspond to important facial points such as the eye and mouth corners, the tip of the nose, and the eyebrows. LiveDriver SDK also tracks head pose in three dimensions for each video frame: pitch (i.e., vertical motion such as nodding), yaw (i.e., horizontal motion such as shaking the head), and roll (i.e., lateral motion such as tipping the head sideways).

Similarity Normalization. Shape and texture information can only be used to identify facial expressions if the confounding influence of head motion is controlled (De la Torre & Cohn, 2011). Because participants exhibited a great deal of rigid head motion during the group formation task, the second step was to remove the influence of such motion on each image. Many techniques for alignment and registration are possible (Zeng et al., 2009); we chose the widely-used similarity transformation (Szeliski, 2011) to warp the facial images to the average pose and a size of 128×128 pixels, thereby creating a common space in which to compare them. In this way, variation in head size and orientation would not confound the measurement of facial actions.
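The paper does not specify the software used for this warp; as a rough open-source sketch of the same idea, the code below uses OpenCV to estimate a similarity transform (rotation, uniform scale, and translation) from a frame's tracked landmarks to a reference landmark configuration and then warps the face into a 128×128 common space. The `reference` shape and all names are assumptions for illustration, not the authors' implementation.

```python
import cv2
import numpy as np

def similarity_normalize(image: np.ndarray, landmarks: np.ndarray,
                         reference: np.ndarray, size: int = 128):
    """Warp a face image into a common space by aligning its tracked landmarks
    (an N x 2 array) to a reference landmark configuration scaled to `size` pixels."""
    # Estimate a 2x3 similarity transform (rotation + uniform scale + translation).
    matrix, _ = cv2.estimateAffinePartial2D(landmarks.astype(np.float32),
                                            reference.astype(np.float32))
    # Apply the same transform to the image and to the landmark coordinates.
    warped_image = cv2.warpAffine(image, matrix, (size, size))
    homogeneous = np.hstack([landmarks, np.ones((landmarks.shape[0], 1))])
    warped_landmarks = homogeneous @ matrix.T
    return warped_image, warped_landmarks
```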
Feature Extraction. Once the facial landmarks had been located and normalized, the third step was to measure the deformation of the face caused by expression. This was accomplished by extracting Scale-Invariant Feature Transform (SIFT) descriptors (Lowe, 1999) in localized regions surrounding each facial landmark. SIFT applies a geometric descriptor to an image region and measures features that correspond to changes in facial texture and orientation (e.g., facial wrinkles, folds, and bulges). It is robust to changes in illumination and shares properties with neurons responsible for object recognition in primate vision (Serre et al., 2005). SIFT feature extraction was implemented using the VLFeat open-source library (Vedaldi & Fulkerson, 2008). The diameter of the SIFT descriptor was set to 24 pixels, as illustrated above the left lip corner in Figure 2.
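The authors used VLFeat's SIFT implementation; the sketch below substitutes OpenCV's SIFT (an assumption, not the original toolchain) to show the core idea: descriptors are computed at fixed keypoints centered on the normalized landmark locations, with the keypoint size set so each descriptor covers roughly a 24-pixel region, and the per-landmark descriptors are concatenated into one feature vector per frame.

```python
import cv2
import numpy as np

def extract_landmark_sift(gray_face: np.ndarray, landmarks: np.ndarray,
                          diameter: float = 24.0) -> np.ndarray:
    """Compute one SIFT descriptor per facial landmark on a similarity-normalized,
    grayscale face image and concatenate them into a single frame-level vector."""
    sift = cv2.SIFT_create()
    # Fixed keypoints at the landmark coordinates; `size` controls the descriptor's support region.
    keypoints = [cv2.KeyPoint(float(x), float(y), diameter) for x, y in landmarks]
    _, descriptors = sift.compute(gray_face, keypoints)  # shape: (n_landmarks, 128)
    return descriptors.reshape(-1)
```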
SVM Classification. The final step in automatically detecting AUs was to train a classifier to detect each AU using SIFT features. By providing each classifier multiple examples of an AU's presence and absence, it was able to learn a mapping of SIFT features to that AU. The classifier then extrapolated from the examples to predict whether the AU was present in new images. This process is called supervised learning and was accomplished using support vector machine (SVM) classifiers (Vapnik, 1995).
SVM classifiers extrapolate from examples by fitting a hyperplane of maximum margin into the transformed, high dimensional feature space. SVM classification was implemented using the LIBLINEAR open-source library (Fan, Wang, & Lin, 2008).

The performance of a classifier is evaluated by testing the accuracy of its predictions. To ensure generalizability of the classifiers, they must be tested on examples from people they have not seen previously. This is accomplished by cross-validation, which involves multiple rounds of training and testing on separate data. Stratified k-fold cross-validation (Geisser, 1993) was used to partition participants into 10 folds with roughly equal AU base rates. On each round of cross-validation, a classifier was trained using data (i.e., features and labels) from eight of the ten folds. The classifier's cost parameter was optimized using one of the two remaining folds through a "grid-search" procedure (Hsu, Chang, & Lin, 2003). The predictions of the optimized classifier were then tested through extrapolation to the final fold. This process was repeated so that each fold was used once for testing and parameter optimization; classifier performance was averaged over these 10 iterations. In this way, training and testing of the classifiers was independent.
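A minimal sketch of this scheme is shown below using scikit-learn's LinearSVC, which wraps the same LIBLINEAR solver; the fold assignments, the choice of tuning fold, and the cost grid are illustrative assumptions rather than the authors' exact protocol.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import matthews_corrcoef

def cross_validate_au(features, labels, fold_ids, costs=(0.01, 0.1, 1.0, 10.0)):
    """For each round, hold out one fold for testing and one for tuning the SVM cost
    parameter, train on the remaining folds, and average frame-level MCC over rounds."""
    folds = np.unique(fold_ids)
    scores = []
    for i, test_fold in enumerate(folds):
        tune_fold = folds[(i + 1) % len(folds)]          # one of the remaining folds
        train = ~np.isin(fold_ids, [test_fold, tune_fold])
        tune = fold_ids == tune_fold
        test = fold_ids == test_fold
        # Grid-search the cost parameter on the tuning fold.
        best_cost, best_score = None, -np.inf
        for c in costs:
            clf = LinearSVC(C=c).fit(features[train], labels[train])
            score = matthews_corrcoef(labels[tune], clf.predict(features[tune]))
            if score > best_score:
                best_cost, best_score = c, score
        # Refit with the selected cost and evaluate on the unseen test fold.
        clf = LinearSVC(C=best_cost).fit(features[train], labels[train])
        scores.append(matthews_corrcoef(labels[test], clf.predict(features[test])))
    return float(np.mean(scores))
```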
Inter-System Reliability

The performance of the automated FACS coding system was measured in two ways. Following the example of Girard, Cohn, Mahoor, Mavadati, Hammal, and Rosenwald (2013), we measured both session-level and frame-level reliability. Session-level reliability asks whether the expert coder and the automated system are consistent in their estimates of the proportion of frames that include a given AU. Frame-level reliability represents the extent to which the expert coder and the automated system make the same judgments on a frame-by-frame basis. That is, for any given frame, do both detect the same AU? For many purposes, such as comparing the proportion of positive and negative expressions in relation to severity of depression, session-level reliability of measurement is what matters. Session-level reliability was assessed using intraclass correlation (ICC) (Shrout & Fleiss, 1979). Frame-level reliability was quantified using the Matthews Correlation Coefficient (MCC) (D. M. Powers, 2007).

The Intraclass Correlation Coefficient (ICC) is a measure of how much the units in a group resemble one another (Shrout & Fleiss, 1979). It is similar to the Pearson Correlation Coefficient, except that for ICC the data are centered and scaled using a pooled mean and standard deviation rather than each variable being centered and scaled using its own mean and standard deviation. This is appropriate when the same measure is being applied to two sources of data (e.g., two manual coders or a manual coder and an automated AU detector), and prevents an undesired handicap from being introduced by invariance to linear transformation. For example, an automated system that always detected a base rate twice as large as that of the human coder would have a perfect Pearson Correlation Coefficient, but a poor ICC. For this reason, the behavior of ICC is more rigorous than that of the Pearson Correlation Coefficient when applied to continuous values. We used the one-way, random effects model ICC described in Equation 1:

ICC = (BMS − WMS) / (BMS + (k − 1)WMS)    (1)

where BMS is the between-participant mean square, WMS is the within-participant mean square, and k is the number of measurement sources (here, two: the manual coder and the automated system).
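Equation 1 follows directly from the mean squares of a one-way random-effects ANOVA; the sketch below computes it for an n × k matrix of session-level proportions (here, k = 2 columns: the manual coder and the automated system). This is a generic implementation of ICC(1,1), not code from the original study.

```python
import numpy as np

def icc_one_way(ratings: np.ndarray) -> float:
    """One-way, random-effects ICC(1,1) per Shrout & Fleiss (1979) and Equation 1:
    (BMS - WMS) / (BMS + (k - 1) * WMS), for an n_targets x k_raters array."""
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)
    bms = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)          # between-target mean square
    wms = np.sum((ratings - row_means[:, None]) ** 2) / (n * (k - 1))  # within-target mean square
    return float((bms - wms) / (bms + (k - 1) * wms))
```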
The Matthews Correlation Coefficient (MCC), also known as the phi coefficient, can be used as a measure of the quality of a binary classifier (D. M. Powers, 2007). It is equivalent to a Pearson Correlation Coefficient computed for two binary measures and can be interpreted in the same way: an MCC of 1 indicates perfect correlation between methods, while an MCC of 0 indicates no correlation (or chance agreement). MCC is related to the chi-squared statistic for a 2×2 contingency table, and is the geometric mean of Informedness (DeltaP) and Markedness (DeltaP'). Using Equation 2, MCC can be calculated directly from a confusion matrix. Although there is no perfect way to represent a confusion matrix in a single number, MCC is preferable to alternatives (e.g., the F-measure or Kappa) because it makes fewer assumptions about the distributions of the data set and the underlying populations (D. M. W. Powers, 2012).

MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))    (2)
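Equation 2 translates directly into code; the function below computes MCC for one AU from frame-level binary codes, returning 0 by convention when any marginal count is zero.

```python
import numpy as np

def matthews_correlation(truth: np.ndarray, prediction: np.ndarray) -> float:
    """MCC from Equation 2, computed over binary (0/1) frame-level codes."""
    tp = int(np.sum((truth == 1) & (prediction == 1)))
    tn = int(np.sum((truth == 0) & (prediction == 0)))
    fp = int(np.sum((truth == 0) & (prediction == 1)))
    fn = int(np.sum((truth == 1) & (prediction == 0)))
    denominator = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```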
Because ICC and MCC are both correlation coefficients, they can be evaluated using the same heuristic, such as the one proposed by Chung (2007): coefficients between 0.0 and 0.2 represent very weak reliability, coefficients between 0.2 and 0.4 represent weak reliability, coefficients between 0.4 and 0.6 represent moderate reliability, coefficients between 0.6 and 0.8 represent strong reliability, and coefficients between 0.8 and 1.0 represent very strong reliability.

We considered a variety of factors that could potentially influence automatic AU detection. These were participant gender, ethnicity, mean pixel intensity of the face, seating location, and variation in head pose. Mean pixel intensity is a composite of several factors that include skin color, orientation to overhead lighting, and head pose. Orientation to overhead lighting could differ depending on participants' location at the table. Because faces look different when viewed from different angles, pose for each frame was considered.

The influence of ethnicity, sex, average pixel intensity, seating position, and pose on classification performance was evaluated using hierarchical linear modeling (HLM; Raudenbush & Bryk, 2002).
HLM is a powerful statistical tool for modeling data with a "nested" or interdependent structure. In the current study, repeated observations were nested within participants. By creating sub-models (i.e., partitioning the variance and covariance) for each level, HLM accounted for the fact that observations from the same participant are likely to be more similar than observations from different participants.

Classifier predictions for each video frame were assigned a value of 1 if they matched the manual coder's annotation and a value of 0 otherwise. These values were entered into a two-level HLM model as its outcome variable; a logit-link function was used to transform the binomial values into continuous log-odds. Four frame-level predictor variables were added to the first level of the HLM: z-scores of each frame's head pose (yaw, pitch, and roll) and mean pixel intensity. Two participant-level predictor variables were added to the second level of the HLM: dummy codes for participant gender (0 = male, 1 = female) and ethnicity (0 = White, 1 = Nonwhite). A sigmoid function was used to transform log-odds to probabilities for ease of interpretation.
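The authors fit this model with HLM software (Raudenbush & Bryk, 2002). As a rough open-source analogue, a mixed-effects logistic regression with a random intercept per participant captures the same nesting; the sketch below uses statsmodels' Bayesian mixed GLM for that purpose. The column names (`correct`, z-scored pose and intensity, `female`, `nonwhite`, `participant`) are assumptions for illustration, not the study's actual variable names.

```python
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

def fit_error_model(frames: pd.DataFrame):
    """Two-level logistic model: frame-level pose/intensity predictors plus
    participant-level gender and ethnicity dummies, with random intercepts
    by participant to absorb within-person dependence among frames."""
    model = BinomialBayesMixedGLM.from_formula(
        "correct ~ yaw_z + pitch_z + roll_z + intensity_z + female + nonwhite",
        {"participant": "0 + C(participant)"},  # random intercept for each participant
        frames,
    )
    return model.fit_vb()  # coefficients are reported on the log-odds scale
```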
Results

Descriptive Statistics

Using manual FACS coding, the mean base rate for AUs was 27.3% with a relatively wide range. AU 1 and AU 15 were least frequent, with each occurring in only 9.2% of frames; AU 12 and AU 14 occurred most often, in 34.3% and 63.9% of frames, respectively (Table 1). Occlusion, defined as partial obstruction of the view of the face, occurred in 18.8% of all video frames.

Table 1. Action Unit Base Rates from Manual FACS Coding (% of frames). Note: Shaded cells indicate significant differences between groups (p < .05).

Base rates for two AUs differed between men and women. Women displayed significantly more AU 10 than men, t(78) = 2.79, p < .01, and significantly more AU 15 than men, t(78) = 3.05, p < .01. No other significant differences between men and women emerged, and no significant differences in base rates between Whites and Nonwhites emerged.

Approximately 5.6% of total frames could be coded manually but not automatically; 9.7% of total frames could be coded neither automatically nor manually. Occlusion was responsible for manual coding failures. Tracking failure was responsible for automatic coding failures.

Head pose was variable, with most of that variation occurring within the interval of 0 to 20° from frontal view. (Absolute values are reported for head pose.) Mean pose was 7.6° for pitch, 6.9° for yaw, and 6.1° for roll. The 95th percentiles were 20.1° for pitch, 15.7° for yaw, and 15.7° for roll.

Although illumination was relatively consistent in the observation room, the average pixel intensity of faces did vary. Mean pixel intensity was 40.3% with a standard deviation of 9.0%. Three potential sources of variation were considered: ethnicity, seating location, and head pose. Mean pixel intensity was lower for Nonwhites than for Whites, t(78) = 4.87, p < .001. Effects of seating location were also significant, with participants sitting in one of the chairs showing significantly lower mean pixel intensity than participants sitting in the other chairs, F(79) = 5.71, p < .01. Head pose was uncorrelated with pixel intensity: for yaw, pitch, and roll, r = −0.09, −0.07, and −0.04, respectively.

Inter-System Reliability

Figure 3. Mean inter-system reliability (session-level ICC and frame-level MCC) for twelve AUs.

The mean session-level reliability (i.e., ICC) for AUs was very strong at 0.89, ranging from 0.80 for AU 17 to 0.95 for AU 12 and AU 7 (Fig. 3). The mean ICC was 0.91 for male participants and 0.79 for female participants. The mean ICC was 0.86 for participants self-identifying as White and 0.91 for participants self-identifying as Nonwhite.

The mean frame-level reliability (i.e., MCC) for AUs was strong at 0.60, ranging from 0.44 for AU 15 to 0.79 for AU 12 (Fig. 3). The mean MCC was 0.61 for male participants and 0.59 for female participants. The mean MCC was 0.59 for participants self-identifying as White and 0.63 for participants self-identifying as Nonwhite.

HLM found that a number of participant- and frame-level factors affected the likelihood that the automated system would make classification errors for specific AUs (Table 2). For several AUs, participant gender and self-reported ethnicity affected performance. Errors were 3.45% more likely in female than male participants for AU 6 (p < .05), 2.91% more likely in female than male participants for AU 15 (p < .01), and 5.15% more likely in White than Nonwhite participants for AU 17 (p < .05). For many AUs, frame-level head pose and mean pixel intensity affected performance. For every one standard deviation increase in the absolute value of head yaw, the probability of making an error increased by 0.79% for AU 2 (p < .05), by 0.15% for AU 11 (p < .05), by 1.24% for AU 12 (p < .01), by 1.39% for AU 23 (p < .05), and by 0.77% for AU 24 (p < .05).
For every one standard deviation increase in the absolute value of head pitch, the probability of making an error increased by 1.24% for AU 15 (p < .05). No significant effects were found for deviations in head roll. Finally, for every one standard deviation increase in mean pixel intensity, the probability of making an error increased by 2.21% for AU 14 (p < .05).

Table 2. Standardized Regression Coefficients Predicting the Likelihood of Correct Automated Annotation (participant variables and video frame variables). Note: Standardized regression coefficients are in log-odds form. * = p < .05 and ** = p < .01.

Discussion

The major finding of the present study was that spontaneous facial expression during a three-person, unscripted social interaction can be reliably coded using automated methods. This represents a significant breakthrough in the field of affective computing and offers exciting new opportunities for both basic and applied psychological research.

We evaluated the readiness of automated FACS coding for research use in two ways. One was to assess session-level reliability: whether manual and automated measurement yield consistent estimates of the proportion of time that different AUs occur. The other, more-demanding metric was frame-level reliability: whether manual and automated measurement agree on a frame-by-frame basis. When average rates of actions are of interest, session-level reliability is the critical measure (e.g., Sayette & Hufford, 1995; Girard, Cohn, Mahoor, Mavadati, Hammal, & Rosenwald, 2013). When it is important to know when particular actions occur in the stream of behavior, for instance to define particular combinations of AUs, frame-level reliability is what matters (e.g., Ekman & Heider, 1988; Reed, Sayette, & Cohn, 2007). For AUs that occurred as little as 3% of the time, we found evidence of very strong session-level reliability and moderate to strong frame-level reliability. AUs occurring less than 3% of the time were not analyzed.

Session-level reliability (i.e., ICC) averaged 0.89, which can be considered very strong. The individual coefficients were especially strong for AUs associated with positive affect (AU 6 and AU 12), which is of particular interest in studies of group formation (Fairbairn, Sayette, Levine, Cohn, & Creswell, 2013; Sayette et al., 2012) as well as emotion and social interaction more broadly (Ekman & Rosenberg, 2005). Session-level reliability for AUs related to brow actions and smile controls, which counteract the upward pull of the zygomatic major (Ambadar et al., 2009; Keltner, 1995), was only somewhat lower. Smile controls have been related to embarrassment, efforts to down-regulate positive affect, deception, and social distancing (Ekman & Heider, 1988; Girard, Cohn, Mahoor, Mavadati, & Rosenwald, 2013; Keltner & Buswell, 1997; Reed et al., 2007).

The more demanding frame-level reliability (i.e., MCC) averaged 0.60, which can be considered strong. Similar to the session-level reliability results, actions associated with positive affect had the highest frame-level reliability (0.76 for AU 6 and 0.79 for AU 12). MCC for smile controls was more variable. For AU 14 (i.e., dimpler), which is associated with contempt and anxiety (Fairbairn et al., 2013), and AU 10, which is associated with disgust (Ekman, 2003), reliability was strong (MCC = 0.60 and 0.72, respectively). MCC for some others was lower (e.g., 0.44 for AU 15). When frame-by-frame detection is required, reliability is strong for some AUs but only moderate for others. Further research is indicated to improve detection of the more difficult AUs (e.g., AU 11 and AU 15).

Our findings from a demanding group formation task with frequent changes in head pose, speech, and intensity are highly consistent with what has been found previously in more constrained settings. In psychiatric interviews, for instance, we found that automated coding was highly consistent with manual coding and revealed the same pattern of state-related changes in depression severity over time (Girard, Cohn, Mahoor, Mavadati, Hammal, & Rosenwald, 2013).
Results from error analysis revealed that several participant-level factors influenced the probability of misclassification. Errors were more common for female than male participants for AU 6 and AU 15, which may be due to gender differences in facial shape, texture, or cosmetics usage. AU 15 was also more than twice as frequent in female than male participants, which may have led to false negatives for females. With this caveat in mind, the overall findings strongly support use of automated FACS coding in samples with both genders. Regarding participant ethnicity, errors were more common in White than Nonwhite participants for AU 17. This finding may suggest that the facial texture changes caused by AU 17 are easier to detect on darker skin. Replication of this finding, however, would be important as the number of Nonwhite participants was small relative to the number of White participants (i.e., 12 Nonwhite vs. 68 White).

Several frame-level factors also influenced the probability of misclassification. In the group formation task, most head pose variation was within plus or minus 20° of frontal and illumination was relatively consistent. Five AUs showed sensitivity to horizontal change in head pose (i.e., yaw): the probability of errors increased for AU 2, AU 11, AU 12, AU 23, and AU 24 as participants turned left or right and away from frontal. Only one AU showed sensitivity to vertical change in head pose (i.e., pitch): the probability of errors increased for AU 15 as participants turned up or down and away from frontal. No AUs showed sensitivity to rotational change in head pose (i.e., roll). Finally, only one AU showed sensitivity to change in illumination: the probability of errors increased for AU 14 as mean pixel intensity increased. These findings suggest that horizontal motion is more of a concern than vertical or rotational motion. However, the overall reliability results suggest that automated FACS coding is suitable for use in databases with the amount of head motion that can be expected in the context of a spontaneous social interaction. For contexts in which larger pose variation is likely, pose-dependent training may be needed (Guney, Arar, Fischer, & Ekenel, 2013). Although the effects of mean pixel intensity were modest, further research is needed in databases with more variation in illumination.

Using only a few minutes of manual FACS coding each from 80 participants, we were able to train classifiers that repeatedly generalized (during iterative cross-validation) to unseen portions of the data set, including unseen participants. This suggests that the un-coded portions of the data set (over 30 minutes of video from 720 participants) could be automatically coded via extrapolation with no additional manual coding. Given that it can take over an hour to manually code a single minute of video, this represents a substantial savings of time and opens new frontiers in facial expression research.

A variety of approaches to AU detection using appearance features have been pursued in the literature. One is static modeling; another is temporal modeling. In static modeling, each video frame is evaluated independently. For this reason, it is invariant to head motion. Static modeling is the approach we used. Early work used neural networks for static modeling (Tian, Kanade, & Cohn, 2001). More recently, support vector machine classifiers such as we used have predominated (De la Torre & Cohn, 2011). Boosting, an iterative approach, has been used to a lesser extent for classification as well as for feature selection (G. Littlewort, Bartlett, Fasel, Susskind, & Movellan, 2006; Zhu, De la Torre, Cohn, & Zhang, 2011). Others have explored rule-based systems (Pantic & Rothkrantz, 2000) for static modeling. In all, static modeling has been the most prominent approach.

In temporal modeling, recent work has focused on incorporating motion features to improve performance.
A popular strategy is to use hidden Markov models (HMM) to temporally segment actions by establishing a correspondence between AU onset, peak, and offset and an underlying latent state. Valstar and Pantic (2007) used a combination of SVM and HMM to temporally segment and recognize AUs. In several papers, Qiang Ji and his colleagues (Li, Chen, Zhao, & Ji, 2013; Tong, Chen, & Ji, 2010; Tong, Liao, & Ji, 2007) used what are referred to as dynamic Bayesian networks (DBN) to detect facial action units. DBN exploits the known correlations between AUs. For instance, some AUs are mutually exclusive: AU 26 (mouth open) cannot co-occur with AU 24 (lips pressed). Others are mutually "excitatory": AU 6 and AU 12 frequently co-occur during social interaction with friends. These "dependencies" can be used to reduce uncertainty about whether an AU is present. While they risk false positives (e.g., detecting a Duchenne smile when only AU 12 is present), they are a promising approach that may become more common (Valstar & Pantic, 2007).

The current study is, to our knowledge, the first to perform a detailed and statistically-controlled error analysis of an automated FACS coding system. Future research would benefit from evaluating additional factors that might influence classification, such as speech and AU intensity. The specific influence of speech could not be evaluated because audio was recorded using a single microphone and it was not feasible to code speech and non-speech separately for each participant. The current study also focused on AU detection and ignored AU intensity.

Action units can vary in intensity across a wide range from subtle, or trace, to very intense. The intensity of facial expressions is linked to both the intensity of emotional experience and social context (Ekman, Friesen, & Ancoli, 1980; Hess, Banse, & Kappas, 1995; Fridlund, 1991), and is essential to the modeling of expression dynamics over time. In an earlier study using automated tracking of facial landmarks, we found marked differences between posed and spontaneous facial actions. In spontaneous facial actions, amplitude and velocity of smile onsets were strongly correlated, consistent with ballistic timing (Cohn & Schmidt, 2004). For posed smiles, the two were uncorrelated. In related work, Messinger et al. (2009) found strong covariation in the timing of mother and infant smile intensity. While the present data provide compelling evidence that automated coding systems now can code the occurrence of spontaneous facial actions, future research is necessary to test the ability to automatically code change in AU intensity.

Some investigators have sought to measure AU intensity using a probability or distance estimate from a binary classifier. Recall that for an SVM, each video frame can be located with respect to its distance from a hyper-plane that separates positive and null instances of an AU. When the value exceeds a threshold, a binary classifier declares the AU is present. When the value falls short of the threshold, the binary classifier rules otherwise. As a proxy for intensity, Bartlett and others have proposed using either the distance measure or a pseudo-probability based on that distance measure. This method worked well for posed facial actions but not for spontaneous ones (Bartlett et al., 2006; Girard, 2014; Yang, Qingshan, & Metaxas, 2009). To automatically measure intensity of spontaneous facial actions, we found that it is necessary to train classifiers on manually coded AU intensity (Girard, 2014). In two separate data sets, we found that classifiers trained in this way consistently out-performed those that relied on distance measures. Behavioral researchers are cautioned to be wary of approaches that use distance measures in such a way.

Because classifier models may be sensitive to differences in appearance, behavior, context, and recording environment (e.g., cameras and lighting), generalizability of AU detection systems from one data set to another cannot be assumed. A promising approach is to personalize classifiers by exploiting similarities between test and training subjects (Chu, De la Torre, & Cohn, 2013; Chen, Liu, Tu, & Aragones, 2013; Sebe, 2014). For instance, some subjects in the test set may have similar face shape, texture, or lighting to subsets of subjects in the training set. These similarities could be used to optimize classifier generalizability between data sets. Preliminary work of this type has been encouraging. Using an approach referred to as a selective transfer machine, Chu et al. (2013) achieved improved generalizability between different data sets of spontaneous facial behavior.

In summary, we found that automated AU detection can be achieved in an unscripted social context involving spontaneous expression, speech, variation in head pose, and individual differences. Overall, we found very strong session-level reliability and moderate to strong frame-level reliability. The system was able to detect AUs in participants it had never seen previously. We conclude that automated FACS coding is ready for use in research and applied settings, where it can alleviate the burden of manual coding and enable more ambitious coding endeavors than ever before possible. Such a system could replicate and extend the exciting findings of seminal facial expression analysis studies as well as open up entirely new avenues of research.

References

Abrantes, G. A., & Pereira, F. (1999). MPEG-4 facial animation technology: Survey, implementation, and results. IEEE Transactions on Circuits and Systems for Video Technology, 9(2).

Ambadar, Z., Cohn, J. F., & Reed, L. I. (2009). All smiles are not created equal: Morphology and timing of smiles perceived as amused, polite, and embarrassed/nervous. Journal of Nonverbal Behavior, 33(1), 17–34.

Archinard, M., Haynal-Reymond, V., & Heller, M. (2000). Doctor's and patients' facial expressions and suicide reattempt risk assessment. Journal of Psychiatric Research, 34(3), 261–262.
Bartlett, M. S., Littlewort, G., Frank, M. G., Lainscsek, C., Fasel, I. R., & Movellan, J. R. (2006). Automatic recognition of facial actions in spontaneous expressions. Journal of Multimedia, 1(6).

Bruce, V., & Young, A. (1998). In the eye of the beholder: The science of face perception. New York, NY: Oxford University Press.

Camras, L. A., Oster, H., Campos, J., Campos, R., Ujiie, T., Miyake, K., . . . Meng, Z. (1998). Production of emotional facial expressions in European American, Japanese, and Chinese infants. Developmental Psychology, 34(4), 616–628.

Chen, J., Liu, X., Tu, P., & Aragones, A. (2013). Learning person-specific models for facial expression and action unit recognition. Pattern Recognition Letters, 34(15), 1964–1970.

Chu, W.-S., De la Torre, F., & Cohn, J. F. (2013). Selective transfer machine for personalized facial action unit detection. IEEE International Conference on Computer Vision and Pattern Recognition.

Chung, M. (2007). Correlation coefficient. In N. J. Salkind (Ed.), Encyclopedia of measurement and statistics (pp. 189–201).

Cohn, J. F., & Ekman, P. (2005). Measuring facial action by manual coding, facial EMG, and automatic facial image analysis. In J. A. Harrigan, R. Rosenthal, & K. R. Scherer (Eds.), The new handbook of nonverbal behavior research (pp. 9–64). New York, NY: Oxford University Press.

Cohn, J. F., & Sayette, M. A. (2010). Spontaneous facial expression in a small group can be automatically measured: An initial demonstration. Behavior Research Methods, 42(4), 1079–1086.

Cohn, J. F., & Schmidt, K. L. (2004). The timing of facial motion in posed and spontaneous smiles. International Journal of Wavelets, Multiresolution and Information Processing, 2(2), 57–.

De la Torre, F., & Cohn, J. F. (2011). Facial expression analysis. In T. B. Moeslund, A. Hilton, A. U. Volker Krüger, & L. Sigal (Eds.), Visual analysis of humans (pp. 377–410). New York, NY: Springer.

Ekman, P. (1982). Methods for measuring facial action. In K. R. Scherer & P. Ekman (Eds.), Handbook of methods in nonverbal behavior research (pp. 45–90). Cambridge: Cambridge University Press.

Ekman, P. (2003). Darwin, deception, and facial expression. Annals of the New York Academy of Sciences, 1000(1), 205–221.

Ekman, P., & Friesen, W. V. (1978). Facial action coding system: A technique for the measurement of facial movement. Palo Alto, CA: Consulting Psychologists Press.

Ekman, P., Friesen, W. V., & Ancoli, S. (1980). Facial signs of emotional experience. Journal of Personality and Social Psychology, 39(6), 1125–1134.

Ekman, P., Friesen, W. V., & Hager, J. (2002). Facial action coding system: A technique for the measurement of facial movement. Salt Lake City, UT: Research Nexus.

Ekman, P., & Heider, K. G. (1988). The universality of a contempt expression: A replication. Motivation and Emotion, 12(3), 303–.

Ekman, P., & Rosenberg, E. L. (2005). What the face reveals: Basic and applied studies of spontaneous expression using the facial action coding system (FACS) (2nd ed.). New York, NY: Oxford University Press.

Fairbairn, C. E., Sayette, M. A., Levine, J. M., Cohn, J. F., & Creswell, K. G. (2013). The effects of alcohol on the emotional displays of whites in interracial groups. Emotion, 13(3).

Fan, R.-E., Wang, X.-R., & Lin, C.-J. (2008). LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9, 1871–1874.

Fridlund, A. J. (1991). Sociality of solitary smiling: Potentiation by an implicit audience. Journal of Personality and Social Psychology, 60(2), 12.

Geisser, S. (1993). Predictive inference. New York, NY: Chapman & Hall.

Girard, J. M. (2014). Automatic detection and intensity estimation of spontaneous smiles (Master's thesis).

Girard, J. M., Cohn, J. F., Mahoor, M. H., Mavadati, S. M., Hammal, Z., & Rosenwald, D. P. (2013). Nonverbal social withdrawal in depression: Evidence from manual and automatic analyses. Image and Vision Computing.

Girard, J. M., Cohn, J. F., Mahoor, M. H., Mavadati, S. M., & Rosenwald, D. P. (2013). Social risk and depression: Evidence from manual and automatic facial expression analysis. IEEE International Conference on Automatic Face & Gesture Recognition, 1–8.

Grafsgaard, J. F., Wiggins, J. B., Boyer, K. E., Wiebe, E. N., & Lester, J. C. (2013). Automatically recognizing facial expression: Predicting engagement and frustration. International Conference on Educational Data Mining.

Guney, F., Arar, N. M., Fischer, M., & Ekenel, H. K. (2013). Cross-pose facial expression recognition. IEEE International Conference and Workshops on Automatic Face & Gesture Recognition.

Hess, U., Banse, R., & Kappas, A. (1995). The intensity of facial expression is determined by underlying affective state and social situation. Journal of Personality and Social Psychology, 69(2).

Hoque, M. E., McDuff, D. J., & Picard, R. W. (2012). Exploring temporal patterns in classifying frustrated and delighted smiles. IEEE Transactions on Affective Computing, 3(3), 323–334.

Hsu, C.-W., Chang, C.-C., & Lin, C.-J. (2003). A practical guide to support vector classification (Tech. Rep.).

Image Metrics. (2013). LiveDriver SDK. Manchester, England.

Izard, C. E. (1979). The maximally discriminative facial movement coding system (Max). Newark, DE: University of Delaware, Instructional Resources Center.

Jeni, L. A., Cohn, J. F., & De la Torre, F. (2013). Facing imbalanced data: Recommendations for the use of performance metrics. In International Conference on Affective Computing and Intelligent Interaction.

Keltner, D. (1995). Signs of appeasement: Evidence for the distinct displays of embarrassment, amusement, and shame. Journal of Personality and Social Psychology, 68(3), 441.

Keltner, D., & Buswell, B. N. (1997). Embarrassment: Its distinct form and appeasement functions. Psychological Bulletin, 122(3), 250.

Keltner, D., Moffitt, T. E., & Stouthamer-Loeber, M. (1995). Facial expressions of emotion and psychopathology in adolescent boys. Journal of Abnormal Psychology, 104(4), 644–52.
Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37(9), 1539.

Li, Y., Chen, J., Zhao, Y., & Ji, Q. (2013, April). Data-free prior model for facial action unit recognition. IEEE Transactions on Affective Computing, 4(2), 127–141.

Littlewort, G., Bartlett, M. S., Fasel, I. R., Susskind, J., & Movellan, J. R. (2006). Dynamics of facial expression extracted automatically from video. Image and Vision Computing, 24(6), 615–625.

Littlewort, G., Whitehill, J., Tingfan, W., Fasel, I. R., Frank, M. G., Movellan, J. R., & Bartlett, M. S. (2011). The computer expression recognition toolbox (CERT). IEEE International Conference on Automatic Face & Gesture Recognition and Workshops.

Littlewort, G. C., Bartlett, M. S., & Lee, K. (2009). Automatic coding of facial expressions displayed during posed and genuine pain. Image and Vision Computing, 27(12), 1797–1803.

Lowe, D. G. (1999). Object recognition from local scale-invariant features. IEEE International Conference on Computer Vision.

Lucey, P., Cohn, J. F., Howlett, J., Member, S. L., & Sridharan, S. (2011). Recognizing emotion with head pose variation: Identifying pain segments in video. IEEE Transactions on Systems, Man, and Cybernetics.

Lucey, S., Matthews, I., Ambadar, Z., De la Torre, F., & Cohn, J. F. (2006). AAM derived face representations for robust facial action recognition. IEEE International Conference on Automatic Face & Gesture Recognition, 155–162.

Mavadati, S. M., Mahoor, M. H., Bartlett, K., Trinh, P., & Cohn, J. F. (2013). DISFA: A spontaneous facial action intensity database. IEEE Transactions on Affective Computing.

McDuff, D., El Kaliouby, R., Kodra, E., & Picard, R. (2013). Measuring voter's candidate preference based on affective responses to election debates. Humaine Association Conference on Affective Computing and Intelligent Interaction, 369–374.

Messinger, D. S., Mahoor, M. H., Chow, S.-M., & Cohn, J. F. (2009). Automated measurement of facial expression in infant-mother interaction: A pilot study. Infancy, 14(3), 285–305.

Noldus Information Technology. (2013). The Observer XT. Wageningen, The Netherlands.

Pantic, M., & Rothkrantz, L. J. M. (2000). Expert system for automatic analysis of facial expressions. Image and Vision Computing, 18(11), 881–905.

Powers, D. M. (2007). Evaluation: From precision, recall and F-factor to ROC, informedness, markedness & correlation (Tech. Rep.). Adelaide, Australia.

Powers, D. M. W. (2012). The problem with kappa. Conference of the European Chapter of the Association for Computational Linguistics.

Prkachin, K. M., & Solomon, P. E. (2008). The structure, reliability and validity of pain expression: Evidence from patients with shoulder pain. Pain, 139(2), 267–274.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.

Reed, L. I., Sayette, M. A., & Cohn, J. F. (2007). Impact of depression on response to comedy: A dynamic facial coding analysis. Journal of Abnormal Psychology, 116(4), 804–809.

Sayette, M. A., Creswell, K. G., Dimoff, J. D., Fairbairn, C. E., Cohn, J. F., Heckman, B. W., . . . Moreland, R. L. (2012). Alcohol and group formation: A multimodal investigation of the effects of alcohol on emotion and social bonding. Psychological Science, 23(8), 869–878.

Sayette, M. A., & Hufford, M. R. (1995). Urge and affect: A facial coding analysis of smokers. Experimental and Clinical Psychopharmacology, 3(4), 417–423.

Sebe, N. (2014). We are not all equal: Personalizing models for facial expression analysis with transductive parameter transfer. In Proceedings of the ACM International Conference on Multimedia. Orlando, FL.

Serre, T., Kouh, M., Cadieu, C., Knoblich, U., Kreiman, G., & Poggio, T. (2005). A theory of object recognition: Computations and circuits in the feedforward path of the ventral stream in primate visual cortex. Artificial Intelligence, 1–130.

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420.

Szeliski, R. (2011). Computer vision: Algorithms and applications. London: Springer London.

Tian, Y.-L., Kanade, T., & Cohn, J. F. (2001). Recognizing action units for facial expression analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2), 97–115.

Tong, Y., Chen, J., & Ji, Q. (2010). A unified probabilistic framework for spontaneous facial action modeling and understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(2), 258–273.

Tong, Y., Liao, W., & Ji, Q. (2007). Facial action unit recognition by exploiting their dynamic and semantic relationships. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(10), 1683–1699.

Valstar, M. F., Bihan, J., Mehu, M., Pantic, M., & Scherer, K. R. (2011). The first facial expression recognition and analysis challenge. IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, 921–926.

Valstar, M. F., & Pantic, M. (2007). Combined support vector machines and hidden Markov models for modeling facial action temporal dynamics. In IEEE International Workshop on Human-Computer Interaction (pp. 118–127).

Vapnik, V. (1995). The nature of statistical learning theory. New York, NY: Springer.

Vedaldi, A., & Fulkerson, B. (2008). VLFeat: An open and portable library of computer vision algorithms.

Yang, P., Qingshan, L., & Metaxas, D. N. (2009). RankBoost with l1 regularization for facial expression recognition and intensity estimation. IEEE International Conference on Computer Vision.

Zeng, Z., Pantic, M., Roisman, G. I., & Huang, T. S. (2009). A survey of affect recognition methods: Audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1), 39–58.

Zhu, Y., De la Torre, F., Cohn, J. F., & Zhang, Y.-J. (2011). Dynamic cascades with bidirectional bootstrapping for action unit detection in spontaneous facial behavior. IEEE Transactions on Affective Computing, 2(2), 79–91.