JOURNAL OF VIROLOGY, Apr. 2011, p. 3649–3663
Copyright 2011, American Society for Microbiology. All Rights Reserved.
Coevolution of the Hepatitis C Virus Polyprotein Sites in Patients on
Combined Pegylated Interferon and Ribavirin Therapy▿§
James Lara,* Guoliang Xia, Mike Purdy, and Yury Khudyakov*
Molecular Epidemiology & Bioinformatics Laboratory, Laboratory Branch, Division of Viral Hepatitis, Centers for Disease Control and
Prevention, 1600 Clifton Road, Atlanta, Georgia 30333
Received 19 October 2010/Accepted 7 January 2011
Genotype-specific sensitivity of the hepatitis C virus (HCV) to interferon-ribavirin (IFN-RBV) combination
therapy and reduced HCV response to IFN-RBV as infection progresses from acute to chronic infection suggest
that HCV genetic factors and intrahost HCV evolution play important roles in therapy outcomes. HCV
polyprotein sequences (n = 40) from 10 patients with unsustainable response (UR) (breakthrough and relapse)
and 10 patients with no response (NR) following therapy were identified through the Virahep-C study. Bayesian
networks (BNs) were constructed to relate interrelationships among HCV polymorphic sites to UR/NR outcomes.
All models showed an extensive interdependence of HCV sites and strong connections (P < 0.003) to
therapy response. Although all HCV proteins contributed to the networks, the topological properties of sites
differed among proteins. E2 and NS5A together contributed ~40% of all sites and ~62% of all links to the
polyprotein BN. The NS5A BN and E2 BN predicted UR/NR outcomes with 85% and 97.5% accuracy, respec-
tively, in 10-fold cross-validation experiments. The NS5A model constructed using physicochemical properties
of only five sites was shown to predict the UR/NR outcomes with 83.3% accuracy for 6 UR and 12 NR cases of
the HALT-C study. Thus, HCV adaptation to IFN-RBV is a complex trait encoded in the interrelationships
among many sites along the entire HCV polyprotein. E2 and NS5A generate broad epistatic connectivity across
the HCV polyprotein and essentially shape intrahost HCV evolution toward the IFN-RBV resistance. Both
proteins can be used to accurately predict the outcomes of IFN-RBV therapy.
* Corresponding author. Mailing address: Molecular Epidemiology & Bioinformatics Laboratory, Laboratory Branch, Division of Viral Hepatitis, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA 30333. Phone for J. Lara: (404) 639-1152. Fax: (404) 639-1563. E-mail: [email protected]. Phone for Y. Khudyakov: (404) 639-2610. Fax: (404) 639-1563. E-mail: [email protected].
§ Supplemental material for this article may be found at http://jvi
▿ Published ahead of print on 19 January 2011.
Hepatitis C virus (HCV) is the major etiologic agent of blood-borne non-A, non-B hepatitis (25). Chronic HCV infection is an established risk factor for the development of liver diseases, such as fibrosis, cirrhosis, and hepatocellular carcinoma (33, 124, 125). Approximately 70% to 80% of HCV-infected patients fail to clear the virus and progress to chronicity (89a). At present, there are no preventive vaccines against HCV. The current, accepted therapeutic approach to treating chronic hepatitis C infection involves a 24- or 48-week course of pegylated alpha interferon (IFN-α) combined with ribavirin (RBV) (i.e., IFN-RBV therapy) (48, 52). Because only 50% to 70% of chronically infected patients develop a sustained virologic response (SVR) to this treatment (48, 52, 55, 80) and because patient intolerance to such therapy is common (61, 68, 120), the development and application of other therapeutic approaches using antiviral compounds that act against HCV more efficaciously and yet generate lower rates of adverse effects are major clinical management and public health objectives. Therapeutic failure presents in two forms: (i) complete resistance to treatment (no response [NR]) and (ii) unsustainable response (UR), which is characterized by an increase in HCV load observed during therapy after an initial period of decline in viral load (breakthrough) or observed after cessation of therapy (relapse) (52).
Several factors are known to affect therapy outcome in HCV-infected patients, most notably the infecting HCV genotype. There are six major HCV genotypes, 1 to 6 (108, 109). Patients infected with genotype 2 are the most responsive, with SVR being achievable in 70% to 80% of cases (52, 80). In contrast, only 50% to 60% of genotype 1-infected patients achieve SVR (48, 55, 80, 90). Genotype 1 is the most prevalent genotype worldwide (78). The dependence of IFN-RBV response rates on HCV genotype (48, 52, 55, 80) implies that the composition of the HCV genome plays a role in influencing therapy outcome.
The mechanism of IFN action against HCV is not fully known. It was shown that treatment with IFN activates the host's innate antiviral immune responses by inducing IFN-stimulated genes (47, 59, 64, 84). Several HCV genomic regions have been found to be associated with resistance to IFN treatment (74). Since responses to IFN differ among HCV strains, associations between IFN therapy outcome and HCV genomic variability in regions such as hypervariable region 1 (HVR1) of E2 (87, 118) and the V3 domain of NS5A (34, 79) have been frequently investigated. A correlation was reported between NR and the high complexity of HVR1 variants before treatment (87, 118), but it was not confirmed in a subsequent study (75). A high level of V3 heterogeneity was associated with IFN sensitivity (34, 99, 119). Specific mutations in the core protein have also been suggested to determine the early response to IFN-RBV therapy (36).
Both E2 and NS5A proteins have been implicated in binding to the IFN-inducible, double-stranded, RNA-activated protein kinase R (PKR), which is involved in the IFN-induced antiviral
response (49). A 12-amino-acid (aa) region located between positions 659 and 670 in E2 known as the PKR-α subunit of eukaryotic initiation factor 2 (PKR-eIF2α) phosphorylation homology domain (PePHD) was shown to bind PKR in vitro (113). The PePHD sequence has similarity to the autophosphorylation sites of PKR and the phosphorylation site in eIF2α. This similarity is greater for HCV genotype 1 than genotype 2 or 3. However, the association between PePHD sequence and therapy outcomes has not been consistently shown (1, 98). A PKR-binding domain is located in the C-terminal region of NS5A (49). A variable 40-aa region of this domain, termed the interferon sensitivity determining region (ISDR), was reported to play a key role in the IFN therapy response (37, 38). Analysis of HCV 1b sequences showed an association between the number of ISDR mutations and the response to the IFN therapy (92). However, studies of HCV genotype 2b and 3a did not find such a relation between SVR and NS5A variability (8, 89). Additionally, no binding between PKR and the genotype 3a NS5A from the IFN-resistant HCV strains was observed in vitro (20).
RBV, a guanosine nucleotide analog, is inefficacious against HCV when used alone but when combined with IFN therapy dramatically improves viral clearance and decreases relapse rates (42). The mechanism by which RBV improves treatment responses is not well understood. Several mechanisms of its therapeutic action have been proposed, including inosine monophosphate dehydrogenase inhibition (133), viral inhibition (77), facilitation of Th1 immunoresponses (111), mutagenesis (27, 76), inhibition of 5′ cap formation on mRNAs (53), and upregulation of genes involved in IFN signaling (44, 132). However, none of these mechanisms has been convincingly shown to be responsible for its efficacy when combined with IFN (42). Nonetheless, RBV was recently shown to improve early responses to IFN (43), thus supporting its role in enhancing IFN signaling (44, 132) and emphasizing the leading role of IFN in combination therapy.
Host factors have been also found to affect both the natural course of HCV infection and the outcome of treatment (116). For example, common-source HCV infections frequently lead to differential outcomes among incident cases, with some patients resolving the infection and some developing chronic hepatitis C (122, 123), or patients chronically infected with the same genotype respond differently to IFN-RBV treatment despite carrying similar HCV viral loads (55, 103). In addition to genotype, demographic factors such as ethnicity and gender have been associated with therapy outcomes (48, 80, 103). Several studies reported the role of the host genetic polymorphism, e.g., in the IL28B locus, in defining the rate of spontaneous clearance (114) and IFN-RBV SVR (50, 110).
Many host selection pressures, including innate and adaptive immune responses, shape HCV evolution, and their effects should be reflected in HCV genetic composition and epistatic connectivity among genomic sites. Indeed, polymorphic sites within the HCV genome have been shown to be organized as a network of coordinated substitutions (17), with the topology of the network being different for HCV strains that are resistant or sensitive to treatment (7). Although indicating a strong association of many HCV sites with outcomes of therapy, these networks, however, do not provide quantitative measures for viral genomic parameters related to IFN treatment.
In this paper, we report modeling of quantitative associations between a global epistatic connectivity among the HCV polymorphic amino acid sites and UR/NR outcomes of the IFN-RBV therapy. While NR represents complete resistance to IFN-RBV, UR reflects incomplete suppression of HCV or the intrahost HCV evolution toward IFN-RBV resistance (93). Both UR and NR are associated with HCV persistence despite treatment (52). With HCV available for analysis at the start and end of therapy, these outcomes provide an important setting for analyzing genetic changes in the HCV genome associated with resistance.
MATERIALS AND METHODS
Sequence data. Analyses were conducted using the HCV 1a full-length polyprotein consensus sequences from 20 patients (10 UR and 10 NR cases) identified through the Virahep-C study (18, 26). Sequences in the Virahep-C study were sampled from patients before (n = 20) and at the end of treatment (n = 20) with pegylated IFN-α2a and RBV. Analyses included all sites from the entire HCV polyprotein except for the most C-terminal 56 aa from the NS5B protein. This sequence data set served as a training set for developing models for prediction of therapy outcomes. For some analyses, a total of 298 HCV 1a full-length consensus polyprotein sequences from GenBank were used. In addition, full-length NS5A protein consensus sequences from 18 treatment-naïve patients (6 UR and 12 NR) identified through the HALT-C trial (131) were used as a test data set to validate the NS5A predictive models constructed from the Virahep-C data. A full listing of the GenBank accession numbers of all sequences used in this study can be found in the supplemental material.
An alignment of the HCV viral sequences from all three data sets was generated using the Clustal W program (115) implemented in BioEdit v7.0.5.3 (58). HCV H77 (GenBank accession no. AF009606) was used as the reference sequence. In addition, alignments of consensus sequences for individual gene products were generated using the Virahep-C data. Each amino acid site was numbered according to its position in the HCV polyprotein. For modeling, each sequence was associated with the IFN-RBV therapy outcome, UR or NR. Together, the sequences and assigned therapy outcome attributes constituted the entire set of viral features representing each HCV variant. These viral features of the Virahep-C data were used for modeling dependencies among sites in relation to treatment response.
Conditional independence analysis. Pairwise conditional independencies (CI) among HCV viral features (amino acid sites and therapy outcome) were examined using full-length polyprotein consensus sequences from the Virahep-C study (18, 26). Testing for CI was performed in the form of undirected independence graphs (71), which present the CI among a collection of variables. Nodes in the graph represent the HCV polyprotein sites and therapy outcome, while links between nodes represent dependencies among the features.
The CI testing was used to validate dependency among the polyprotein sites in relation to the therapy outcome. Only polymorphic sites were considered for finding CI from the data. The identified dependency between two features was shown in the graph as a link. This type of statistical analysis assumes the null hypothesis of independence between any two given features. Relative strengths assigned to links in the graph were based on the marginal dependencies between observed associations. Marginal dependence for each link connecting variables A and B was quantified through P value. For each set, C, of conditioning variables, a P value for {A, B} was computed, which expresses the probability that A and B are conditionally independent given C. The marginal P value is the value corresponding to C = {A, B}. The marginal dependence between A and B is defined as 1 minus the marginal P value associated with {A, B}, where a marginal dependence of 0 means that A and B are completely independent and 1 means that they are completely dependent. The CI among the features was measured at several different levels of significance (thresholds between 0.05 and 5 × 10⁻⁶). Undirected independence graphs and statistical computations of CI were conducted as implemented in the commercially available software package Hugin Researcher (v6.8).
Bayesian network (BN). Relationships among amino acid sites of the HCV polyprotein and therapy outcome were examined using probabilistic graphical models in the form of a Bayesian network (BN) (63), where nodes in the graph represent variables (here, amino acid sites and therapy outcome) and links between the nodes represent relationship. Unlike the undirected independence graphs, BNs provide a more complex notion of the relationships. This includes the notion of the conditional probability and directionality of the relationship.
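The marginal dependence used to weight links in the undirected independence graphs (1 minus the marginal P value) can be illustrated with a short sketch. The article does not state which independence test Hugin Researcher applies, so the permutation test on mutual information below is a stand-in assumption, and the function names `mutual_information` and `marginal_dependence` are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equally long categorical columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log((c * n) / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def marginal_dependence(xs, ys, n_perm=999, seed=0):
    """Return 1 - P for a permutation test of independence between xs and ys.

    Assumption: a permutation test replaces the (unspecified) test used in
    the article. A value near 0 suggests independence, a value near 1
    suggests dependence.
    """
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count permutations at least as strongly associated as the data.
        if mutual_information(xs, shuffled) >= observed - 1e-12:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return 1.0 - p_value
```

With one column per polymorphic site and one column for the UR/NR outcome, calling `marginal_dependence(site_column, outcome_column)` would give a link weight on the same 0-to-1 scale described in the text, although the numerical values would differ from those produced by the software used in the study.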
HCV GENETIC PATTERNS OF ADAPTATION TO IFN THERAPY
Links connecting two variables (nodes in the graph) are represented as arcs, which may project toward the node (incoming links) or from the node (outgoing links), thus specifying the direction of influences among variables. Relationships between variables in a BN may be interpreted as causal (22). The conditional probability distributions are represented in the conditional probability tables (CPTs) of the variables (features) in the network. CPTs of the BNs in this study represent amino acid probability distributions at each site and probabilities associated with therapy outcome. Inference of the network structure and parameter estimations (i.e., CPTs) of all BNs constructed for this study was performed through Bayesian artificial intelligence learning algorithms.
BNs were inferred from the Virahep-C data using the HCV polyprotein sequences and associated therapy outcomes. The objectives of analysis with BN were to examine the complexity of the probabilistic interrelationship and measure the importance (or strength) of links among the HCV amino acid sites and therapy outcome (variables). Measurements of the importance of links were used to identify the most influential amino acid sites in the polyprotein BN. The importance of a variable can be estimated using the number and strength of links associated with the corresponding node in the BN. The amino acid sites that most strongly influence the probabilities of the treatment outcome were of a
The greedy thick thinning (GTT) method (31) was used to infer the BN structure for the task of examining complexity of interrelationships among the variables. The number of incoming links to any given node was constrained between 3 and 10. Parameter estimation of the CPTs was performed using the K2 priors (28) of each variable in the network. Complexity of the probabilistic interrelationship among amino acid sites and therapy outcomes was also examined by individual protein regions. BNs were constructed for each individual protein using the same methods as described above for structure learning and network parameterization (GTT and the K2 priors, respectively). BNs were constructed using the GeNIe software (http://genie.sis.pitt.edu/).
It is important to note that with the increase in the number of variables, the number of possible networks grows superexponentially and computation of the probabilities of all links becomes NP-hard (24). Therefore, a search heuristics method was adopted to compute the strengths of the links in order to derive measures of the importance of relationships among amino acid sites and therapy outcome. The maximum spanning tree (MST) algorithm was used to infer the BN structure from the data.
The strength of the probabilistic relationships (or force of the influences) among variables (amino acid sites and therapy outcome) was inferred by computing the Kullback-Leibler (KL) divergence (69) between the joint probability distribution with and without the link. The greater the KL divergence between these two distributions, the greater the strength of the link, hence, the importance of the relationship it represents. The global importance of an amino acid site was calculated as the sum of strength of incoming and outgoing links associated with the node representing this site in the network. The overall strength of links for individual protein regions and relevance to the therapy outcome was calculated by summing the strength of incoming (incoming strength) and outgoing links (outgoing strength) associated with each region.
The relative significance of the contribution that each amino acid site independently provided to the knowledge of therapy outcome was determined using a naïve BN (28) approach. The BN structure was inferred from the Virahep-C data using the MST algorithm. This approach identifies associations between the therapy outcome and amino acid sites, with sites considered to be independent from each other. Mutual information was used to measure contribution of each site to the knowledge of therapy outcome (29). All algorithms based on heuristic methods used here to infer the BN structures as well as computation of link strength and relevance of variables were carried out as implemented in the Professional Edition of BayesiaLab software (Bayesia SAS, Laval, France). The Pearson correlation coefficient was calculated using SAS (version 9.2; SAS Institute Inc., Cary, NC).
Bayesian network classifier (BNC). BNC was constructed for E2 and NS5A. Both BNCs can infer the probabilities of the UR/NR responses to IFN-RBV treatment directly from amino acid sequence. The E2 BNC and NS5A BNC were inferred from the Virahep-C data as follows: (i) the network was initialized as a naïve BN (28), where the therapy outcome was directly linked to all amino acid sites; (ii) conditional probabilities for amino acid sites were computed. The K2 learning algorithm (28) was used to infer BN structure. The maximum number of incoming links associated with each node (feature) in BN was constrained to 4. Parameter estimation of CPTs of each feature in the BN was empirically derived from the data.
The NS5A BNC based on the selected amino acid sites was constructed using the hybrid decision table-naïve Bayes method (DTNB) (56). The DTNB is a BN where CPTs are represented by a decision table. This method has been shown to perform well when applied with feature selection (56). The DTNB model splits the features into two groups: one group assigns class probabilities based on naïve Bayes, and the other assigns class probabilities based on a decision table. The resulting probability estimates are then combined to estimate the probability of the outcome class association.
Physicochemical properties of HCV variants. Each amino acid can be represented as a set of physicochemical properties. Using these properties, the HCV polyprotein consensus sequences from the Virahep-C data set were converted into the respective physicochemical vectors, which were subsequently used to identify their association with therapy outcome. Analyses were conducted for the HCV polyprotein and individual gene products. Conserved positions were not considered for the physicochemical representation of HCV variants. Position numbering of polymorphic sites was maintained according to the HCV polyprotein. Sequence alignments comprised of polymorphic sites were transformed into N × 5 dimensional numerical vectors, where N is the sequence length and 5 represents the number of physicochemical values assigned to each amino acid site in the sequence. The five physicochemical factors used in this study have been previously described (6). Each vector was then associated with the known therapy outcome (18, 26).
Physicochemical mapping of the data was conducted using a projection pursuit-based technique in the form of a two-dimensional linear projection (LP) (32). The method was used to search for a combination of the physicochemical vectors (projections) that most accurately separates HCV variants into two classes: UR and NR therapy outcomes. The LP mapping can be tested on new data without having to reconstruct the original mapping (32).
Feature selection was used to identify amino acid sites and their properties most relevant to the therapy outcome-based clustering of the HCV variants. A minimal subset of site-specific properties (features) from the NS5A protein was derived, using a heuristic method (73), to search for "interesting" projections that were most associated with the therapy outcome. Projections were evaluated during the global and local searching that was performed using the k-nearest neighbor method (k = 10) and tested by 10-fold cross validation (10-fold CV) for classification correctness. Correctness estimation was based on the average probability of a projection to be assigned to the correct therapy outcome class. During the global and local searches, 5 × 10⁶ and 3 × 10⁶ projections, respectively, were evaluated.
Feature selection (FS). FS was applied to alignments of the full-length consensus polyprotein sequences and individual gene products of the Virahep-C data to determine which amino acid sites were most associated with therapy outcome. FS reduces dimensionality of the data and improves the prediction performance of BNCs. The usefulness of each amino acid site for the prediction of the therapy outcome was evaluated using FS techniques for ranking or selecting an optimal subset of features. Feature ranking was conducted using divide-and-conquer approaches (decision trees) and information-based metrics. Correlation was used as the filtering metric to search for optimal subsets of features. Given that FS techniques have biases known to affect the variable selection optimization method (30, 54), several FS methods were applied.
Three FS techniques based on information theory were used: information gain (101), Gini gain (16), and gain ratio (101). These methods rank the relevance of the features (amino acid sites) based on a score that each feature receives in relation to the therapy outcomes, UR and NR. The top 25 ranked amino acid sites relevant to the UR/NR outcome were selected and used for comparison between the techniques. Features that by themselves are not useful for prediction (those with a low score) may, however, become useful when combined with other features and, hence, be relevant to the prediction (54). Therefore, the feature subset selection method based on correlation (CFS) (57) was applied to the Virahep-C data. Unlike the ranking methods, the CFS identifies a subset of features (amino acid sites) based on their degree of correlation to the class variable (therapy outcome) and low intercorrelation between features. This method was used to search for a minimal subset of complementary amino acid sites to improve the BNC accuracy.
Evaluation and validation of the therapy outcome predictors. The E2 and NS5A BNC were evaluated by 10-fold CV. Briefly, the HCV variants represented by all polymorphic sites or selected amino acid sites from E2 or NS5A were randomly divided into 10 parts of equal size. Each part was held out strictly as a testing data set to evaluate the prediction accuracy of the BNC trained with the remaining nine parts of the data. This process was executed until the BNC was evaluated with all 10 parts. The 10 accuracy estimates were then averaged to estimate the overall accuracy of the BNC.
Also, BNCs trained with data sets in which the E2 and NS5A protein sequences were randomly assigned a UR/NR outcome were evaluated for prediction accuracy. The results were then compared to the accuracy obtained from the BNCs trained with the correct outcome assignment in order to account for any random statistical correlations present in the Virahep-C data.
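The 10-fold CV procedure and the randomized-outcome control can be sketched in a few lines. This is only an illustration under stated assumptions: a plain categorical naïve Bayes classifier with add-one smoothing stands in for the BNCs of the study (which were learned with K2 or DTNB), and the function names `train_naive_bayes`, `predict`, and `cross_validate` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Fit a categorical naive Bayes model (a stand-in for the study's BNC)."""
    class_counts = Counter(labels)
    site_counts = defaultdict(Counter)  # (site index, class) -> residue counts
    alphabet = defaultdict(set)         # site index -> residues seen in training
    for row, label in zip(rows, labels):
        for i, residue in enumerate(row):
            site_counts[(i, label)][residue] += 1
            alphabet[i].add(residue)
    return class_counts, site_counts, alphabet


def predict(model, row):
    """Return the outcome class with the highest smoothed log posterior."""
    class_counts, site_counts, alphabet = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_c in class_counts.items():
        score = math.log(n_c / total)
        for i, residue in enumerate(row):
            k = len(alphabet[i] | {residue})  # categories for add-one smoothing
            score += math.log((site_counts[(i, label)][residue] + 1) / (n_c + k))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(rows, labels, k=10, seed=0):
    """Average held-out accuracy over k folds; assumes len(rows) >= k."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in order if i not in held_out]
        model = train_naive_bayes([rows[i] for i in train_idx],
                                  [labels[i] for i in train_idx])
        correct = sum(predict(model, rows[i]) == labels[i] for i in fold)
        accuracies.append(correct / len(fold))
    return sum(accuracies) / k
```

For the randomized-outcome control, the same `cross_validate` call could be repeated after shuffling a copy of `labels`. An accuracy close to chance on the shuffled labels, compared with a clearly higher accuracy on the true labels, would suggest that the classifier is not merely fitting random correlations.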
FIG. 1. Undirected independence graphs showing relative strengths of the dependencies (links in the graph) found among HCV polyprotein sites (nodes in the graph) and UR/NR outcomes following IFN-RBV therapy from 40 sequences obtained from 10 UR and 10 NR patients in the Virahep-C data. Feature pairs whose dependencies exceed the threshold are linked. HCV polyprotein sites are grouped by region; from left to right: core, E2, NS2, NS4A, and NS5A (upper row), and E1, P7, NS3, NS4B, NS5B (lower row). Therapy outcome is shown as a single node at the top of the graphs. (a) Initial explorative search for significant dependencies (P < 0.05) followed by gradual decrease in thresholds: 0.0032 (b), 3 × 10⁻⁴ (c), 6 × 10⁻⁵ (d), and 5 × 10⁻⁶ (e). (f) Reduced diagram of summarized dependencies between HCV polyprotein sites and treatment outcome (P ≤ 0.003).
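The way links drop out of the graph as the threshold in panels a to e is tightened can be sketched as a simple filter. The P values below are ones quoted in this article for individual site pairs. Collecting them in a dictionary, the pair labels, and the helper name `links_at_threshold` are illustrative assumptions, not the procedure implemented in Hugin Researcher.

```python
def links_at_threshold(p_values, threshold):
    """Keep feature pairs whose independence P value is below the threshold."""
    return sorted(pair for pair, p in p_values.items() if p < threshold)


# P values quoted in the article; the pair labels are illustrative only.
example_p_values = {
    ("E1:233", "E2:612"): 3e-8,
    ("E2:642", "NS4B:1756"): 2e-8,
    ("outcome", "P7:790"): 2e-4,
    ("outcome", "NS5A:2283"): 7e-5,
    ("outcome", "NS5B:2633"): 9e-4,
}

# Thresholds taken from the Fig. 1 legend, panels a to e.
for threshold in (0.05, 0.0032, 3e-4, 6e-5, 5e-6):
    kept = links_at_threshold(example_p_values, threshold)
    print(f"P < {threshold:g}: {len(kept)} links")
```

In this toy input all five pairs are linked at the two loosest thresholds, and only the two E2 pairs survive the two strictest ones.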
Two measures of accuracy were used for classification performance: overall
HCV polymorphic amino acid sites and their potential linkage
percent classification correctness and precision. The overall percent correctness
to the UR/NR outcome of IFN-RBV therapy was conducted
was measured as [(no. correctly classified instances/total no. of instances) ⫻ 100].
Precision was determined in the following manner (where TP is the number of
using 40 HCV full-genome sequences obtained before and at
true positives, TN is the number of true negatives, FP is the number of false positives,
the end of therapy from 10 UR and 10 NR patients from the
and FN is the number of false negatives): precision ⫽ 关TP/(TP ⫹ FP)] ⫻ 100%;
Virahep-C study (18). The HCV sequences from before and
TP ⫽ [TP/(TP ⫹ FN )] ⫻100%; FP ⫽ [FP/(FP ⫹ TN)] ⫻ 100%.
after therapy were used to account for HCV evolution during
The validation of the NS5A predictive models was conducted using the consensus
sequences of the NS5A protein from the HALT-C study (131), which were not part
treatment. A total of 551 polymorphic sites were found in the
of any of the analyses described herein. Estimation of the NS5A BNC and NS5A-LP
HCV polyprotein consensus sequences from these patients. CI
accuracy of prediction of treatment outcome for HCV NS5A variants from the
tests were performed to measure the degree of dependency
HALT-C study was based on the overall percent classification correctness.
among the polymorphic amino acid sites and the UR/NR out-come of IFN-RBV therapy. Results of the CI test were visually
displayed as the undirected independence graph (71) (Fig. 1),
Complex interdependence between polymorphic sites and
in which the conditional dependencies among amino acid sites
therapy outcome. CI analysis of interdependencies between
and UR/NR outcome (shown as nodes in the graph) are rep-
HCV GENETIC PATTERNS OF ADAPTATION TO IFN THERAPY
TABLE 1. Propertiesa of the HCV polyprotein BN
No of linksb
a Properties correspond to a polyprotein BN inferred for the Virahep-C HCV variants.
b The number of incoming links to any given node was constrained to a maximum of 10. The node count represents the total number of polymorphic sites from each
polyprotein region that contributed to the BN. Outdegree, total number of outgoing links from all sites in each protein region. Indegree, total number of incoming linksto sites in each region. Max. outdegree, maximum number of outgoing links from any one site from the respective protein region. To response, number of links withdirect relationship to the outcome in the BN.
c Total number of links in the network.
resented as undirected links or edges. The undirected graph
into this polyprotein BN are listed in Table S1 in the supple-
displayed numerous links representing a dense and complex
mental material.
network of dependencies (P ⬍ 4 ⫻ 10⫺4) between amino acid
As shown in Table 1, HCV proteins do not contribute
polymorphic sites across the entire HCV polyprotein and ther-
equally to the network topology. The E2 and NS5A regions of
apy outcome. A large number of links among sites within and
the HCV polyprotein are the two major contributors of sites
between individual proteins remained present up to a thresh-
into the HCV polyprotein BN (21% and 18% of amino acid
old value of 2 ⫻ 10⫺5. E2 protein sites formed the strongest
sites, correspondingly). The E1, E2, and NS5A regions are also
dependencies. For example, site 612 of E2 is strongly con-
major contributors of links into the network (25.3%, 42%, and
nected to site 233 in E1 (P ⫽ 3 ⫻ 10⫺8), and site 642 of E2 to
26.8% of all links, correspondingly). The majority of links are
site 1756 in NS4B (P ⫽ 2 ⫻ 10⫺8). Also, sites 482 and 612 are
between proteins, with only 17.5% of all links being within
strongly connected to site 642 (P ⫽ 2 ⫻ 10⫺9). It is important
individual protein regions. Among all E2 links, 18.9% are
to note that links connecting amino acid sites to therapy out-
among E2 sites, whereas all other proteins contain only 1.4%
come were among the strongest (P ≤ 7 × 10⁻⁵). As shown in Fig. 1, therapy outcome was strongly linked (P ≤ 0.003) to amino acid sites from the E1 (site 242), E2 (sites 397, 434, 524, and 655), P7 (site 790), NS3 (site 1090), NS5A (sites 2280, 2283, 2320, 2366, 2411, 2413, and 2414), and NS5B (sites 2530, 2633, 2730, and 2747) regions. The strongest dependencies were found with sites from P7 (site 790; P = 2 × 10⁻⁴), NS5A (site 2280 and 2283; P = 3 × 10⁻⁴ and P = 7 × 10⁻⁵, respectively), and NS5B (site 2633; P = 9 × 10⁻⁴). These data suggest strong coordination of substitutions at sites along the entire HCV polyprotein and association between polymorphic sites and therapy outcome.
Contribution of different proteins to therapy outcome. To infer a more insightful representation of the relationships among polymorphic sites and therapy outcome, a Bayesian network (BN) approach (63) was used. The complexity of relationships among HCV polymorphic sites and UR/NR outcome was evaluated by inferring BNs from the full-length HCV polyprotein consensus sequences. The properties of the network are listed in Table 1. In concordance with the undirected interdependence graph findings, interrelationships among all polymorphic sites were found to be highly complex. Figure 2 shows the structure of the polyprotein BN containing 551 polymorphic amino acid sites and their association to therapeutic outcome. Although all sites are interdependent, the number of links broadly varies from 1 to 30 among sites. Sites contributing
FIG. 2. BN (P = 3) of inferred relationships among the full-length HCV polyprotein sites and IFN-RBV therapy outcome. Polyprotein sites and outcome are represented as nodes in the graph. Relationships among features are represented as arcs. Features whose probabilistic dependencies exceed the conditional independency tests and GTT scoring are connected. The graph was constructed using a spring-embedded network layout algorithm. Features are color-coded by region (inset), and therapy outcome is shown in red.
to 10.5% of intraprotein links. Owing to the large number of polymorphic sites contributing to the network, the E2 and NS5A proteins are extensively connected to each other and to all other proteins. As shown in Fig. 3, ~20% of all E2 sites have direct links to NS5A, and ~35% of all NS5A sites have direct links to E2 in the polyprotein BN, indicating a significant coordination of substitutions between these two proteins.
Despite generating many connections (n = 554) and contributing many sites (n = 118) to the polyprotein BN (Table 1), E2 does not have direct links to therapy outcome. Only six sites form such direct connections, with two sites (at positions 864 and 934) being from NS2, a single site (at position 1841) from NS4B, two sites (at positions 2280 and 2283) from NS5A, and a single site (at position 2633) from NS5B.
The core protein contributes only 11 sites (1.8% of all sites) but 136 links (10.3% of all links) to the polyprotein BN, with each site being connected to ~13 other sites, which is ~3 to 8 times more than the individual sites from any other HCV protein (Table 1). The E1 sites contain 4.5 connections on average, while sites of all other proteins are linked on average to 1.5 to 2.7 other sites. The essential difference is in the directionality of links among proteins. Two proteins, core and E1, located at the N terminus of the HCV polyprotein, have 92% and 69.7% of their links directed outside, respectively, suggesting their important causal role in defining states of many polyprotein sites connected to these two proteins. All other proteins have almost equal measures of incoming (in-degree) and outgoing (out-degree) links.
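
The per-protein link statistics quoted above (connections per site, share of outgoing links) come down to simple bookkeeping on the arcs of a directed network. The following is a minimal sketch under stated assumptions: protein boundaries follow H77 reference numbering, the arc list is invented for illustration, and `REGIONS`, `protein_of`, and `degree_summary` are hypothetical names that do not come from the study.

```python
from collections import defaultdict

# Protein boundaries on the polyprotein, ASSUMING H77 reference numbering.
REGIONS = [
    ("core", 1, 191), ("E1", 192, 383), ("E2", 384, 746), ("p7", 747, 809),
    ("NS2", 810, 1026), ("NS3", 1027, 1657), ("NS4A", 1658, 1711),
    ("NS4B", 1712, 1972), ("NS5A", 1973, 2420), ("NS5B", 2421, 3011),
]


def protein_of(site):
    """Return the name of the protein that contains a polyprotein position."""
    for name, start, end in REGIONS:
        if start <= site <= end:
            return name
    raise ValueError(f"position {site} is outside the polyprotein")


def degree_summary(arcs):
    """Summarize directed arcs (parent_site, child_site) per protein."""
    sites = defaultdict(set)
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for parent, child in arcs:
        p, c = protein_of(parent), protein_of(child)
        sites[p].add(parent)
        sites[c].add(child)
        out_deg[p] += 1   # arc leaves a site of protein p
        in_deg[c] += 1    # arc enters a site of protein c
    summary = {}
    for name in sites:
        ends = out_deg[name] + in_deg[name]
        summary[name] = {
            "sites": len(sites[name]),
            "out": out_deg[name],
            "in": in_deg[name],
            "arc_ends_per_site": ends / len(sites[name]),
            "out_fraction": out_deg[name] / ends,
        }
    return summary


# Invented arcs for illustration only; they are NOT links reported in the study.
toy_arcs = [(147, 242), (147, 434), (161, 790), (161, 2414), (434, 2280), (2283, 434)]
summary = degree_summary(toy_arcs)
for name, stats in summary.items():
    print(name, stats)
```

In this sketch an arc between two sites of the same protein adds one outgoing and one incoming end to that protein, so intraprotein links would need separate handling if they are to be counted once.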
FIG. 3. Distribution of links among polymorphic sites of the HCV 1a NS5A or E2 proteins with other viral proteins in the HCV polyprotein BN. E2 and NS5A interrelationships are compared between the HCV polyprotein BNs inferred from GenBank data and Virahep-C data. Sites (%), percentage of sites from each region that were linked to sites from NS5A or E2.
Many essential properties of the polyprotein BN constructed using the 40 Virahep-C sequences, except for linkage to therapy, were observed with another BN constructed using HCV genotype 1a full-length genome sequences obtained from GenBank (n = 298). As shown in Fig. 3 and 4, the GenBank BN and Virahep-C BN have similar distributions of links, and interrelationships among individual proteins are highly correlated (r = 0.99, P < 0.0001), indicating that the overall coordination among substitutions in the HCV genotype 1a data set has been adequately represented by the Virahep-C sequences used in this study. However, variations in the number of polymorphic sites are observed between the GenBank BN (n =
HCV GENETIC PATTERNS OF ADAPTATION TO IFN THERAPY
1,296) and polyprotein BN (n = 551). Despite the greater number of polymorphic sites in the GenBank sequences, the Virahep-C sequences contain 25 unique polymorphic sites distributed among all but core proteins: at positions 230, 349, and 381 in E1; 385, 582, 631, and 742 in E2; 768 in P7; 826 and 926 in NS2; 1385, 1461, 1520, 1528, 1565, and 1592 in NS3; 1681 in NS4A; 1805, 1820, and 1846 in NS4B; 2003, 2049, and 2343 in NS5A; and 2500 and 2548 in NS5B. These findings, in conjunction with the observation of the 1.7-fold increase in the number of links between sites in E1 and E2 and the 2-fold increase between sites in E2 and NS5A in the Virahep-C BN compared to the GenBank BN (Fig. 3), suggest the treatment-specific variations in coordination of substitutions at the genomic sites in the UR/NR HCV strains.
FIG. 4. Relative strength and direction of links associated with individual HCV proteins in the Virahep-C BN (A) and GenBank BN (B). The total strength of all outgoing links (blue bars), incoming links (red bars), and the global strength (green bars) are shown for each
Protein sites relevant to therapy outcome. Observation of a significant interconnection and coordination among HCV proteins suggests that all proteins contribute to determining the UR/NR outcome. To analyze these contributions in more detail, BNs were constructed for the individual gene products. Extensive dependencies between sites and association to therapy outcome were found in all individual polyprotein regions, albeit to different degrees. The E2 and NS5A regions were found to form a more dense set of links than other regions of the polyprotein (Table 2).
Although many polymorphic sites were found to be interlinked in the model shown in Fig. 1, indicating a significant coordination of heterogeneity along the HCV polyprotein, there are a large number of sites with very few links, suggesting their marginal contribution to the polyprotein BN. To evaluate which proteins and amino acid sites were most associated with the outcome, we conducted feature selection experiments. By using a naïve Bayesian network with feature selection, the E2 and NS5A polyprotein regions were found to contribute the greatest number of sites relevant to outcome (27.5% and 26.3%, respectively) (Fig. 5). Similar results were observed with four filtering methods for feature selection (Fig. 6). Each of the feature selection techniques extracted a certain number of the most relevant sites. A greater proportion of sites were selected from E2 and NS5A as relevant to the outcome (Fig. 6). The NS5A region consistently contributes a large number of relevant sites with all four feature selection techniques. Depending on the technique, 14.3% to 32% and 24.0% to 44% of amino acid sites were, respectively, selected from E2 and NS5A as contributing to the outcome. All of the techniques used selected significantly overlapping sets of the relevant amino acid sites from all proteins, albeit with variations in ranking among the selected sites (see Table S2 in the supplemental material). A set of sites selected using one of the techniques is shown in Table 3.
Relationships between variables in a BN may be interpreted as being causal (22), which can be applied to detect relevance of a variable to define a target feature, in this case, therapy outcome. Analysis of the strength of influence measured as the Kullback-Leibler divergence (69) between the joint probability distribution with and without the arc shows that sites from the E2 and NS5A proteins have the strongest overall influences on
outcome (Fig. 7). Additionally, analysis of contribution of individual sites to the UR/NR outcome was conducted using a ratio of the mutual information calculated for each site and the outcome over the maximal mutual information (MI); the maximal MI (MI = 0.3951, P = 0.0001) was calculated for site 2283 in the NS5A protein. Using this ratio as a measure of the relative significance of each site for determining outcome, 25 sites were identified in six proteins with values for this ratio being >0.5 (Fig. 8). Among these sites were one site at position 242 in E1, eight sites at positions 391, 394, 397, 400, 401, 434, 528, and 655 in E2, two sites at positions 753 and 790 in P7, one site at position 941 in NS2, nine sites at positions 2153, 2198, 2280, 2288, 2320, 2339, 2375, 2376, and 2413 in NS5A, and four sites at positions 2633, 2730, 2747, and 2755 in NS5B. The important observation from this analysis is that E2 and NS5A together contain ~70% of these highly relevant sites. Another interesting observation is that hypervariable region 1 (HVR1) contributes five of eight relevant sites in E2, thus suggesting that HVR1 heterogeneity is associated with HCV evolution toward the IFN-RBV resistance.
TABLE 2. Properties^a of the BNs for individual protein regions
No. of links^b
^a Properties correspond to protein-BN inferred from the Virahep-C data using alignments of the individual HCV gene products.
^b The maximum number of incoming links was constrained to 10. The NS3-BN reached maximum complexity at a constraint of 11. Arc count, total number of links in the BN; Avg or Max. outcomes, numbers of amino acid states (heterogeneity) of protein sites.
TABLE 3. Correlation-based feature selection (CFS) of HCV sites relevant to UR/NR outcomes^a
Region: Polyprotein positions
Core: 29, 48, 75, 81, 106, 147, 161
E1: 192, 210, 230, 231, 236, 242, 243, 256, 280, 287, 293, 300, 308, 314, 345, 372, 379
E2: 394, 397, 434, 478, 480, 490, 498, 524, 528, 534, 591, 595, 625, 655, 668
P7: 762, 763, 767, 768, 770, 777, 789, 790
NS2: 814, 824, 841, 843, 873, 934, 938, 941, 957, 958, 962, 982, 1017, 1021
NS3: 1068, 1087, 1088, 1090, 1115, 1124, 1145, 1148, 1196, 1200, 1239, 1266, 1306, 1366, 1398, 1405, 1409, 1412, 1417, 1428, 1444, 1461, 1592
NS4A: 1681, 1686, 1687, 1693, 1700
NS4B: 1737, 1753, 1759, 1804, 1816, 1841, 1941, 1968
NS5A: 2024, 2043, 2280, 2283, 2320, 2366, 2376,
NS5B: 2501, 2530, 2582, 2629, 2633, 2730, 2747,
Polyprotein: 524, 790, 1090, 1409, 1592, 2024, 2280, 2283, 2366, 2376, 2414, 2530, 2633, 2950
^a List of subset of amino acid sites relevant to outcome prediction. Subsets of sites from each region were determined by filtering out less-predictive sites. CFS was applied to data sets: 10 data sets representing sites from each individual gene product and the therapy outcome and 1 data set representing the full-length HCV polyprotein sequences and associated therapy outcome. Site subsets listed for the E2 and NS5A proteins were used in E2-BNC and NS5A-BNC (Fig. 10).
FIG. 5. BN with selection of relevant sites linked to the UR/NR outcomes. Site selections were based on the BN choice of relevant features for outcome prediction. A total of 80 HCV polyprotein sites are shown. Nodes are color coded by region (inset).
FIG. 6. Contribution of the UR/NR-relevant sites from individual HCV proteins identified using four filtering methods.
FIG. 7. Total strength of association between sites of individual HCV proteins and the UR/NR outcome.
FIG. 8. Relative significance of association of the HCV polyprotein sites to the UR/NR outcome. Only sites with relative significance of >0.5 are shown. Color code: black, E1; red, E2; green, P7; yellow, NS2; blue, NS5A; and cyan, NS5B. Relative significance is a ratio between the mutual information brought by each feature and the greatest mutual information.
Association of protein physicochemical properties with IFN-RBV resistance. The observation of coordinated substitutions in all HCV proteins suggests extensive interrelationships among phenotypic traits encoded by these proteins and an important role of these interrelationships in defining HCV evolution toward IFN-RBV resistance. Although not clearly determined, these phenotypic traits can be further analyzed using amino acid physicochemical properties as a quantitative approximation to phenotype. The factors affecting sequence variation and diversity should be also reflected in the physicochemical properties of the HCV polyprotein. Herein, the physicochemical space dispersion of the HCV variants from the UR/NR Virahep-C cases (18) was examined using a linear projection technique (32). The analysis was conducted using polymorphic sites of the HCV polyprotein or individual gene products (see Table S1 in the supplemental material). The polymorphic sites from each protein were converted into vectors of amino acid physicochemical properties (6). For each protein, these vectors were used to generate a multidimensional physicochemical space and project this space into the optimized linear two-dimensional (2D) spaces. The probabilistic mapping of NR and UR outcomes in these 2D physicochemical spaces is shown in Fig. 9.
This analysis showed that the probability of outcomes mapped in the optimized 2D physicochemical spaces of the polyprotein, E2, and NS5A was distributed in the least convoluted way, providing almost equal representations of UR and NR (Fig. 9). These observations suggest that the physicochemical properties of all HCV proteins are related to outcome, albeit to various degrees.
FIG. 9. Physicochemical projection of HCV polyprotein and individual proteins. Shown are the optimized 2D linear projections. Variation in shade of colors reflects probability estimates for UR (red) and NR (blue) outcomes, with darker shades corresponding to greater probability values.
FIG. 10. 10-fold CV performance of the E2 BNC and NS5A BNC constructed using all polymorphic sites (black bar) and selected relevant sites (white bar). Results for BNCs with randomized labels are shown using patterned bars (black for all and white for selected sites).
Strong association of E2 and NS5A with IFN-RBV resistance. The results shown above strongly suggest that the IFN-RBV resistance is encoded in many regions of the HCV polyprotein, with E2 and NS5A being strongly linked to this resistance. To further investigate the strength of association of the IFN-RBV resistance with variation in the E2 and NS5A primary structure, BN classifiers (BNCs) were developed using polymorphic sites from these two proteins. The accuracy of performance of the models was evaluated using the 10-fold CV protocol. The results of the evaluation are shown in Fig. 10. The E2 and NS5A BNCs constructed using all polymorphic sites were found to be 82.5% and 90% accurate in the prediction of outcomes in the 10-fold CV, respectively. BNCs constructed using 15 sites selected from E2 and 9 sites selected from NS5A (Table 3) improved accuracies to 85% and 97.5%, respectively, while the randomized data sets produced BNCs showing accuracies of only 35% to 47.5% (Fig. 10). Thus, although the networks of sites from both proteins have a strong association with the IFN-RBV resistance, the NS5A BNCs significantly outperformed the E2 BNCs in the CV experiments.
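
The evaluation protocol described above, 10-fold CV together with a randomized-label control, can be sketched as follows. This is a minimal illustration and not the BN classifier software used in the study: it substitutes a plain categorical naïve Bayes classifier, and the toy residue table, the function names (`train_nb`, `predict_nb`, `cross_validate`), the smoothing constant, and the random seeds are all assumptions introduced here.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes model with Laplace smoothing."""
    class_counts = Counter(labels)
    counts = defaultdict(Counter)           # (site index, label) -> residue counts
    alphabets = [set() for _ in rows[0]]    # residues seen at each site
    for row, y in zip(rows, labels):
        for j, aa in enumerate(row):
            counts[(j, y)][aa] += 1
            alphabets[j].add(aa)
    return class_counts, counts, alphabets, alpha


def predict_nb(model, row):
    """Return the label with the highest log posterior for one sequence."""
    class_counts, counts, alphabets, alpha = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, n_y in sorted(class_counts.items()):
        score = math.log(n_y / total)
        for j, aa in enumerate(row):
            k = len(alphabets[j] | {aa})    # also covers residues unseen in training
            score += math.log((counts[(j, y)][aa] + alpha) / (n_y + alpha * k))
        if score > best_score:
            best, best_score = y, score
    return best


def cross_validate(rows, labels, folds=10, seed=0):
    """Return overall k-fold cross-validation accuracy."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    correct = 0
    for f in range(folds):
        test_idx = set(order[f::folds])     # every sample is tested exactly once
        train = [i for i in order if i not in test_idx]
        model = train_nb([rows[i] for i in train], [labels[i] for i in train])
        correct += sum(predict_nb(model, rows[i]) == labels[i] for i in test_idx)
    return correct / len(rows)


# Toy data (invented): 3 sites per sequence, 20 UR and 20 NR sequences.
rows = [("A", "K", "S")] * 20 + [("V", "R", "S")] * 20
labels = ["UR"] * 20 + ["NR"] * 20
acc_true = cross_validate(rows, labels)

# Control: same sequences, labels shuffled to break any site-outcome link.
shuffled = labels[:]
random.Random(1).shuffle(shuffled)
acc_shuffled = cross_validate(rows, shuffled)
print(f"true labels: {acc_true:.3f}, shuffled labels: {acc_shuffled:.3f}")
```

The toy sequences are perfectly separable, so the run with true labels is trivially accurate; the point of the sketch is the shuffled-label run, which plays the role of the randomized data sets in Fig. 10 by showing what the same procedure yields when the site-outcome link is destroyed.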
Prediction of UR/NR outcomes using NS5A. A high accuracy of the BNC models described above suggests a strong association of coordinated substitutions in NS5A with evolution toward the IFN-RBV resistance. However, since these models were generated using only 40 sequences from 20 patients, it is critical to demonstrate that the interrelationships identified for these patients are representative of those for other patients. For this purpose, two predictive models were constructed using the same Virahep-C data set and tested using the HCV NS5A sequences from baseline specimens obtained from patients in the HALT-C study (131). Because no additional data were available for E2 from patients with NR and UR outcomes investigated in a single study, only the NS5A models were
One model was constructed using physicochemical properties of five NS5A sites selected using a heuristic method (73). The secondary structure for sites at position 2153 (projection X167 in Fig. 11) and 2413 (X492), the electrostatic charge for site at position 2198 (X195), the polarity for site at position 2280 (X281), and the molecular volume or size for site at position 2320 (X328) were selected as the most relevant features for outcome in the Virahep-C data set. The LP model mapping UR and NR outcomes into the 2D space generated using linear projection from the 5D physicochemical space is shown in Fig. 11. Another model was constructed as a hybrid between the decision table and a naïve Bayes (DTBN)-based machine-learning technique (56) using 12 NS5A sites: nine shown in Table 3 and three additional sites, at positions 2153, 2198, and 2413, used in the linear projection approach.
After a 10-fold CV, both Virahep-C models were tested on the HALT-C data set with 6 NS5A sequences obtained from UR and 12 from NR patients. The hybrid DTBN model showed an overall accuracy of 72.2% and the linear projection model showed an overall accuracy of 83.3% of outcome prediction for the HALT-C patients (Table 4). This finding suggests that, although many sites along the entire HCV polyprotein are relevant to development of the IFN-RBV resistance, the small number of features from the NS5A protein alone may be sufficient for the prediction of therapy outcomes.
FIG. 11. Projection of five selected physicochemical features of five NS5A sites from the HALT-C sequence data set onto the physicochemical space-based model derived from the Virahep-C sequence data set. Lines originating from the center of the graph are projections of five physicochemical features. Circles in the graph map the UR/NR outcomes of therapy for Virahep-C (unfilled circles) and HALT-C (filled circles). For color coding, see legend to Fig. 9.
TABLE 4. Validation of the NS5A Virahep-C models using the HALT-C NS5A sequences^a
Validation (% accuracy)
^a Shown are the overall prediction accuracies of the BNC (DTNB method) and physicochemical-based LP models using selected NS5A sites (see text for details).
^b Average probability of correct classification in 10-fold CV.
^c Average percent classification correctness in 10-fold CV.
Two important features of HCV infection, persistence following primary infection and resistance to IFN-based therapy, have been related to the extensive HCV genetic variability (39, 41). Although HCV has developed a very efficient capacity to escape from adaptive (15, 35, 104, 128) and innate immune responses (12, 13, 85, 126), ~20% to 30% of all HCV infections are cleared by the host (23) and 50% to 70% of chronic infections can be successfully treated with IFN-RBV (48, 52, 55, 80). The variation in response to therapy among HCV strains remains poorly understood. However, differential sensitivity of HCV genotypes to IFN therapy (52, 80) suggests that viral genetic factors play an important role in determining therapy outcomes. Despite a low degree of response to treatment during chronic infection, 80% to 98% of patients with acute HCV infection can achieve complete virological response to IFN therapy (51, 62), suggesting that HCV acquires a significant degree of IFN resistance during chronic infection. Taken together, these observations indicate a strong connection between the intrahost HCV evolution and success of the IFN-RBV therapy.
In the current study, an integrative approach was implemented for the evolutionary analysis of the HCV genome. This approach was based on modeling interrelationships between polymorphic sites along the entire HCV polyprotein and relating the modeled coordination among amino acid substitutions to the UR/NR outcomes of therapy. Models constructed here showed an extensive interdependence of all polymorphic sites within the HCV polyprotein, suggesting a significant coevolution among individual HCV proteins. The data indicate that all HCV proteins contain sites coordinating their polymorphism with sites in all other proteins (Fig. 2). A similar observation has been recently made using a correlation network analysis of the HCV genotype 1a full-genome sequences from untreated patients (17) and patients on therapy (7). Among all connections identified using the polyprotein BN in this study, only 17.5% were among sites within individual proteins. It is interesting to note that E2 shows the most extensive coordination among its sites, with all other proteins having ~2 to 13 times fewer connections among intraprotein sites than E2. With 82.5% of all connections in the network being among proteins, HCV evolution is evidently defined by coadaptation among many phenotypic traits encoded by different HCV proteins.
Although all HCV proteins contribute to the network, the topological properties of sites differed among proteins. The core protein contributes fewer sites (n = 11) per its size than any other HCV protein. However, each core site forms ~2 to 4 times more links in the network than any site from other proteins (Table 1). This protein has 12.4 times more outgoing than incoming links in the polyprotein BN, while the ratio between outgoing and incoming links for all other proteins varies from 0.8 to 2.2 (Table 1). Another important feature of core connectivity in the polyprotein BN is that 98.6% of all core links are with other proteins. The presence of only two intraprotein links (polyprotein positions 90→110 and 47→29) makes the core protein the least intraconnected protein, indicating a minimal direct coordination among core polymorphic sites. Thus, the contribution of core to the network topology differs considerably from those of all other proteins, suggesting that this protein has a unique role in coordinating substitutions and defining heterogeneity at many sites of the HCV polyprotein.
This observation is in agreement with the multitude of functions performed by the core protein and emphasizes its important role in HCV infection. In addition to forming the nucleocapsid (105), this protein was shown to interfere with many cellular signaling pathways involved in apoptosis (134), transcription (60, 130), and transformation (21, 65, 102, 129). The core protein is also involved in lipid metabolism (10, 96). It inhibits the microsomal triglyceride transfer protein, binds to apolipoprotein AII, and induces accumulation of cytoplasmic lipid droplets (2). Core and NS5A are key factors for assembly of infectious particles. Both colocalize on the surface of lipid droplets, a proposed site for HCV particle assembly (4). With lipid droplets playing a crucial role in the assembly and release of infectious HCV particle (83), interactions involving domain 2 of core and domain 3 of NS5A (5, 14, 81, 82) are essential for virion production and, therefore, have a strong impact on infectivity and viral fitness. Mutation at position 147 in domain 2 of the core protein was found to affect adherence of core to lipid droplets and virus production (107). Our data show that this site has direct links in the polyprotein BN to sites in E1, E2, NS2, and domain 3 of NS5A. Another site, from domain 2 of core at position 161, linked to P7 in addition to these four proteins. All of these proteins play a role in the membrane-associated viral replication (86). These observations suggest coordination of heterogeneity across the HCV polyprotein related to viral production and the important role played by the core protein in this coordination.
Two proteins, E2 and NS5A, together contribute ~40% of all sites and ~62% of all links to the polyprotein BN and, therefore, essentially define the state of this entire network. In combination with E1, these three proteins contribute ~50% of all sites and ~77% of all links to the polyprotein BN. It is interesting that E2 and NS5A also mutually coordinate their heterogeneity (Fig. 3). Although coordination between sites from any two HCV proteins is a common feature of the polyprotein BN, this coordination is most extensive between sites of E2 and NS5A, owing to the large number of sites contributed by these two proteins to the network. Thus, the states of many sites in one of these two proteins reflect the states of many sites in the other protein, suggesting a high degree of coevolution between these two proteins. Additionally, it was observed that sites from E2 formed the strongest links with many other sites in the polyprotein as determined by CI testing (Fig. 1), among which were links between sites 482 and 642 in E2 (P = 2 × 10⁻⁹), 612 in E2 and 233 in E1 (P = 3 × 10⁻⁸), and 642 in E2 and 1756 in NS4B (P = 2 × 10⁻⁸). Taking into consideration that site 482 is from the CD81-binding region (45, 127) and site 612 from one of two E2 regions proposed to be involved in the viral fusion process (72, 91, 95), we speculate that the tight coordination between sites 482 and 642 as well as that between sites 612 and 233 is associated with viral entry.
Another important observation made in this study is that all HCV proteins have association with the UR/NR outcome of IFN-RBV therapy. Taking into consideration the aforementioned extensive linkage among polymorphic sites from different proteins, this observation, although not surprising, reveals that the HCV response to immunomodulatory therapy is a very complex trait involving numerous viral functions that require coordination. All networks constructed for individual proteins included the UR/NR outcome as a variable (Table 2). However, this observation cannot be unequivocally interpreted in terms of equal contribution of each protein to the IFN-RBV response. Nevertheless, it suggests that the genome-wide coordination among sites is important for this response, with some proteins possibly playing accessory roles and reflecting the IFN-RBV-related changes in other proteins that are mainly responsible for resistance. The analysis conducted here revealed that sites substantially associated with the outcome are scattered along the entire HCV polyprotein. Among the sites with relevant significance of >0.5 (Fig. 8) are sites in E1 (n = 1), E2 (n = 8), p7 (n = 2), NS2 (n = 1), NS5A (n = 9), and NS5B (n = 4). Two proteins, E2 and NS5A, shared 68% of these 25 sites, suggesting their strong connection to IFN-RBV resistance. E2, NS5A, and P7 have, respectively, 6.8%, 9.0%, and 11.7% of their polymorphic sites being highly relevant to the therapy outcome, while all other proteins have only 1.5% to 3.1% of these sites.
One surprising finding was that five among the eight sites most relevant to therapy outcome are located in HVR1 of the E2 protein (aa 384 to 410), emphasizing a strong connection of HVR1 heterogeneity to IFN-RBV resistance. Association of HVR1 sites with outcomes of therapy can be also found in the correlation networks (7). However, the significance of these observations is not apparent. Analysis of HVR1 connectivity in the polyprotein BN showed that polymorphic HVR1 sites have a total of 140 links to all HCV proteins, with each HVR1 site being connected to three to nine sites in the HCV polyprotein. Such an extensive interdependence of HVR1 sites with many sites across the entire HCV polyprotein (Fig. 3), in conjunction with the earlier similar observations using network analysis (17), suggests that the HVR1 substitutions are not random and that HVR1 evolution is substantially coordinated with all HCV proteins. Coordination of HVR1 heterogeneity is especially noticeable with E1, E2, and NS5A, which share, respectively, 15%, 26.4%, and 14% of all HVR1 links in the polyprotein BN, while any other HCV protein shares 3.6% to 9.3% of HVR1 links.
HVR1 contains antigenic epitopes (66, 67, 112, 121) with HCV neutralizing activity (40). Rapid HVR1 evolution is associated with immune escape (70). However, the conservation of the HVR1 physicochemical properties and conformation (94) argues that this region is significantly functionally constrained despite its extensive heterogeneity. The observation that compensatory mutations in the ectodomain of E2 (46) and the I347L mutation in E1 compensate for HCV fusion impairment (9) in HCV mutants whose HVR1 have been excised suggests potential functional relationships of this region with other parts of the HCV genome. HVR1 was shown to be involved in the SR-B1-facilitated entry of HCV pseudoparticles in cell culture (11). It was suggested that HVR1 plays an important role in HCV entry by modulating receptor recognition and affects lipoprotein composition and infectivity of viral particles (9). HVR1 heterogeneity was also associated with the development of resistance to therapy (74, 87, 117, 118). We hypothesize that complex functional relationships of HVR1 are reflected in coordinated evolution with other HCV proteins and that HVR1 mirrors the evolution of the entire HCV genome, including evolution toward the IFN-RBV resistance.
There are many sites from different HCV proteins strongly linked to the IFN-RBV resistance (Fig. 1 and 8). However, consideration of individual sites allows only for the identification of connections to the therapy outcome in the form of a trend and does not have a strong predictive power. Correlation of the IFN-RBV therapy outcomes has been reported with site polymorphisms in the core (36), E2 (87, 106), and NS5A (88, 106) proteins. Although these observations revealed numerous associations between the HCV genetic polymorphism and evolution toward IFN-RBV resistance, these associations were never explored in terms of their interrelationships and formulated into an integrative model capable of revealing accurate quantitative connections between HCV genetic changes and therapy outcomes.
The current report presents several probabilistic models connecting the UR/NR outcome to coordinated changes at polymorphic sites across the entire HCV polyprotein as well as from individual HCV proteins. Analysis of individual sites without consideration of their relationships seems inefficient in detecting a reliable connection to the outcomes. Only 3 among 25 sites having the highest value of mutual information with the outcome (Fig. 8) were found to be directly linked to the outcome in the polyprotein BN (Fig. 2). The same 3 sites, 2280, 2283, and 2633, are among the 14 most relevant sites extracted from the HCV polyprotein using correlation-based feature selection (Table 3) and among 18 sites that have the strongest connections to outcome in the undirected dependence graph (Fig. 1). All computational techniques used in this study ranked the contribution of various sites differently. For example, only 12 sites were shared by 18 sites shown in Fig. 1 and 25 sites shown in Fig. 8. Although sites 2280 and 2283 from NS5A and site 2633 from NS5B were frequently identified as most relevant to the IFN-RBV response, analysis of states at these sites is not sufficient for an accurate prediction of the therapy outcome (data not shown). Such a prediction requires the use of a combination of sites selected for their collective contribution to the outcome.
For that purpose, we conducted a series of experiments for selection of site sets most relevant to the therapy outcome from the entire HCV polyprotein and individual proteins (Table 3). Two proteins, E2 and NS5A, were explored in detail. As mentioned earlier, both proteins have many polymorphic sites and contributed many links to the polyprotein BN. These two proteins consistently made substantial contributions of the most relevant sites identified using different feature selection techniques (Fig. 5 and 6). Probabilistic mapping of UR and NR outcomes in 2D physicochemical space showed an equally representative distribution of the outcome probabilities for E2, NS5A, and the polyprotein (Fig. 9). All these findings strongly suggest that these two proteins have a strong connection to therapy response and can be used for the accurate prediction of therapeutic outcomes. However, as can be seen in Fig. 10, the 10-fold CV experiments showed that the NS5A BN outperforms the E2 BN constructed using complete sets of polymorphic sites (82.5% versus 90% accuracy) or feature-selected sites (85% versus 97.5% accuracy). These results, taken together with the observation that NS5A contains two of six sites directly connected to the therapy outcome in the polyprotein BN while E2 has no direct links to the outcome, suggest that NS5A has a very strong relevance to evolution toward the IFN-RBV resistance.
Two sites, at positions 2376 and 2414 in NS5A, have experimentally been associated with the development of resistance to RBV (97). It is important to note that these two sites were consistently selected as being relevant to the therapy outcome (Table 3 and Fig. 8), indicating that the NS5A BN as well as polyprotein BN constructed using all or feature-selected sites includes links that reflect contribution of RBV to therapy. Site 2414 located in domain 3 of NS5A is linked to site 161 in domain 2 of core in the polyprotein BN. As mentioned earlier, both domains are involved in protein-protein interactions between these two proteins, association with lipid droplets, and assembly and release of viral particles (81, 83). There seems to be a linkage between coevolution of the core and NS5A proteins and RBV resistance, and this resistance is associated with interaction between these two proteins. The final validation of the two predictive NS5A Virahep-C models using the HALT-C data strongly confirms a robust connection between coordination among the NS5A sites and IFN-RBV resistance. Additionally, it shows that a small number of features from NS5A alone may be sufficient for the prediction of therapy outcomes (Fig. 11). This finding suggests that analysis of a very few sites from a small HCV genomic region, such as NS5A, may be used for monitoring sensitivity to the IFN-RBV therapy.
A general interconnectivity among HCV proteins was comparable for the 40 Virahep-C sequences and the 298 HCV genotype 1a full-genome sequences obtained from GenBank (Fig. 3 and 4), indicating that the modeled coordination among substitutions is essentially similar for all HCV variants from treated and treatment-naïve patients. This observation additionally suggests that the development of resistance during immunomodulatory therapy is generally shaped by selection pressures similar to the HCV evolution in untreated patients. However, there are some important differences between the
responsible for the response to IFN, there is no single IFN resistance mutation. Once established, the wide-ranging epistatic connectivity among sites involved in the IFN response may not be rapidly reverted even with reduction of the selection pressure in the absence of treatment, thus locking the HCV genome into the state of resistance to IFN. Without being eliminated by IFN-RBV therapy, these variants can continue to circulate among human hosts. In contrast, IFN-RBV-sensitive strains are being removed from circulation. This consideration implies that the current widespread adoption of IFN-based therapy, although extremely beneficial for individual patients with SVR, may affect the composition of the circulating HCV population and enlarge the reservoir of IFN-resistant HCV, a potentially alarming public health issue that warrants a further investigation.
polyprotein BNs generated using sequences from treated andtreatment-naı¨ve patients. The GenBank sequences from un-
We are grateful to Chong-Gee Teo for critical review and discussion
treated patients contain more polymorphic sites (n ⫽ 1,296)
of findings in this paper as well as to two anonymous reviewers forimportant comments.
than the Virahep-C sequences (n ⫽ 551). Despite this fact, the
This work was supported by CDC intramural funding.
Virahep-C sequences contain 25 polymorphic sites that are
This information has not been formally disseminated by the Centers
conserved in the GenBank sequences. These sites are distrib-
for Disease Control and Prevention/Agency for Toxic Substances and
uted within E1 (n ⫽ 3), E2 (n ⫽ 4), P7 (n ⫽ 1), NS2 (n ⫽ 2),
Disease Registry. It does not represent and should not be construed to
NS3 (n ⫽ 6), NS4A (n ⫽ 1), NS4B (n ⫽ 3), NS5A (n ⫽ 3), and
represent any agency determination or policy.
NS5B (n ⫽ 2). Among them, sites at positions 230 in E1, 768
in P7, and 1461 and 1592 in NS3 are the most relevant to the
1. Abid, K., R. Quadri, and F. Negro. 2000. Hepatitis C virus, the E2 envelope
protein, and alpha-interferon resistance. Science 287:1555.
IFN-RBV response (Table 3). Furthermore, the two BNs had
2. Andre, P., G. Perlemuter, A. Budkowska, C. Brechot, and V. Lotteau. 2005.
topological differences in the number of interprotein links,
Hepatitis C virus particles and lipoprotein metabolism. Semin. Liver Dis.
most notably the 1.7- and 2-fold proportional increase in the
3. Reference deleted.
number of links between E1 and E2 and between E2 and
4. Appel, N., et al. 2008. Essential role of domain III of nonstructural protein
NS5A in the Virahep-C BN compared to those in the
5A for hepatitis C virus infectious particle assembly. PLoS Pathog.
GenBank BN (Fig. 3). These observations suggest that de-
5. Appel, N., et al. 2008. Essential role of domain III of nonstructural protein
spite the similarity of these two networks, there are distinct
5A for hepatitis C virus infectious particle assembly. PLoS Pathog.
differences in coordination among substitutions in HCV
6. Atchley, W. R., J. Zhao, A. D. Fernandes, and T. Druke. 2005. Solving the
from treated and treatment-naı¨ve patients.
protein sequence metric problem. Proc. Natl. Acad. Sci. U. S. A. 102:6395–
IFN is a major component of innate immunity (19, 100).
Several HCV proteins are involved in modulation of the host
7. Aurora, R., M. J. Donlin, N. A. Cannon, and J. E. Tavis. 2009. Genome-
wide hepatitis C virus amino acid covariance networks can predict response
IFN response (12, 13, 85, 126). RBV used as a component of
to antiviral therapy in humans. J. Clin. Invest. 119:225–236.
combined therapy seems to facilitate early response to IFN
8. Bagaglio, S., et al. 2003. Genetic heterogeneity of hepatitis C virus (HCV)
(43) rather than playing a strong independent role. Resistance
in clinical strains of HIV positive and HIV negative patients chronically
infected with HCV genotype 3a. J. Biol. Regul. Homeost. Agents 17:153–
to IFN is not clearly linked to any specific mutation within the
HCV genome. As shown in this study, HCV adaptation to IFN
9. Bankwitz, D., et al. 2010. Hepatitis C virus hypervariable region 1 modu-
lates receptor interactions, conceals the CD81 binding site, and protects
is a complex trait encoded in the interrelationships among
conserved neutralizing epitopes. J. Virol. 84:5751–5763.
many sites along the entire HCV polyprotein. The extensive
10. Barba, G., et al. 1997. Hepatitis C virus core protein shows a cytoplasmic
coevolution among HCV amino acid sites leads to a significant
localization and associates to cellular lipid storage droplets. Proc. Natl.
Acad. Sci. U. S. A. 94:1200–1205.
integration among the HCV IFN-response-related phenotypic
11. Bartosch, B., et al. 2003. Cell entry of hepatitis C virus requires a set of
traits. Each HCV protein contributes to the IFN resistance,
co-receptors that include the CD81 tetraspanin and the SR-B1 scavenger
albeit to a different degree. With E2 and NS5A contributing
receptor. J. Biol. Chem. 278:41624–41630.
12. Blindenbacher, A., et al. 2003. Expression of hepatitis c virus proteins
many polymorphic sites to the network and generating a broad
inhibits interferon alpha signaling in the liver of transgenic mice. Gastro-
epistatic connectivity to sites in other HCV proteins, intrahost
13. Bode, J. G., et al. 2003. IFN-alpha antagonistic activity of HCV core protein
HCV evolution toward the IFN resistance is essentially defined
involves induction of suppressor of cytokine signaling-3. FASEB J. 17:488–
and, therefore, can be accurately predicted using a carefully
selected combination of sites from these two proteins.
14. Boulant, S., et al. 2006. Structural determinants that target the hepatitis C
virus core protein to lipid droplets. J. Biol. Chem. 281:22236–22247.
Treatment with IFN does not exert an unusual selection
15. Brady, M. T., A. J. MacDonald, A. G. Rowan, and K. H. Mills. 2003.
pressure on HCV, unlike treatment using direct-acting antivi-
Hepatitis C virus non-structural protein 4 suppresses Th1 responses by
ral compounds, but rather generates an unusually strong se-
stimulating IL-10 production from monocytes. Eur. J. Immunol. 33:3448–
3457.
lection pressure of the innate immune system. Thus, HCV
16. Brieman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classi-
strains capable of resisting or evolving toward resistance to
fication and regression trees. Chapman & Hall/CRC, Boca Raton, FL.
17. Campo, D. S., Z. Dimitrova, R. J. Mitchell, J. Lara, and Y. Khudyakov.
immunomodulatory therapy are most efficient in overcoming
2008. Coordinated evolution of the hepatitis C virus. Proc. Natl. Acad. Sci.
the host immune system. With the entire HCV genome being
U. S. A. 105:9685–9690.
18. Cannon, N. A., M. J. Donlin, X. Fan, R. Aurora, and J. E. Tavis. 2008. Hepatitis C virus diversity and evolution in the full open-reading frame during antiviral therapy. PLoS One 3:e2123.
19. Carney, D. S., and M. Gale, Jr. 2006. HCV regulation of host defense, p. 375–398. In Seng-Lai Tan (ed.), Hepatitis C viruses. Horizon Bioscience, Norfolk, United Kingdom.
20. Castelain, S., et al. 2002. Variability of the nonstructural 5A protein of hepatitis C virus type 3a isolates and relation to interferon sensitivity. J. Infect. Dis. 185:573–583.
21. Chang, J., et al. 1998. Hepatitis C virus core from two different genotypes has an oncogenic potential but is not sufficient for transforming primary rat embryo fibroblasts in cooperation with the H-ras oncogene. J. Virol. 72:
22. Charniak, E. 1991. Bayesian networks without tears. AI Mag. 12:50–63.
23. Chen, S. L., and T. R. Morgan. 2006. The natural history of hepatitis C virus (HCV) infection. Int. J. Med. Sci. 3:47–52.
24. Chickering, D. M., D. Heckerman, and C. Meek. 2004. Large-sample learning of Bayesian networks is NP-hard. J. Mach. Learn. Res. 5:1287–1330.
25. Choo, Q. L., et al. 1990. Hepatitis C virus: the major causative agent of viral non-A, non-B hepatitis. Br. Med. Bull. 46:423–441.
26. Conjeevaram, H. S., et al. 2006. Peginterferon and ribavirin treatment in African American and Caucasian American patients with hepatitis C genotype 1. Gastroenterology 131:470–477.
27. Contreras, A. M., et al. 2002. Viral RNA mutations are region specific and increased by ribavirin in a full-length hepatitis C virus replication system. J. Virol. 76:8505–8517.
28. Cooper, G. F., and E. Herskovits. 1992. A Bayesian method for the induction of probabilistic networks from data. Mach. Learn. 9:309–347.
29. Cox, L. A. 2006. Detecting causal non-linear exposure-response relations in epidemiological data. Dose Response 4:119–132.
30. Daelemans, W., V. Hoste, F. De Meulder, and B. Naudts. 2003. Combined optimization of feature selection and algorithm parameters in machine learning of language. Machine Learning: Ecml 2003 2837:84–95.
31. Dash, D., and M. J. Druzdzel. 2003. Robust independence testing for constraint-based learning of causal structure, p. 167–174. In The 19th Annual Conference on Uncertainty in Artificial Intelligence (UAI-03). Morgan Kaufmann, San Francisco, CA.
32. Demsar, J., G. Leban, and B. Zupan. 2005. FreeViz—an intelligent visualization approach for class-labeled multidimensional data sets, p. 61–66. In Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP) Workshop, Aberdeen, United Kingdom.
33. Deuffic-Burban, S., T. Poynard, M. S. Sulkowski, and J. B. Wong. 2007. Estimating the future health burden of chronic hepatitis C and human immunodeficiency virus infections in the United States. J. Viral Hepat.
34. Duverlie, G., et al. 1998. Sequence analysis of the NS5A protein of European hepatitis C virus 1b isolates and relation to interferon sensitivity. J. Gen. Virol. 79:1373–1381.
35. Emi, K., et al. 1999. Magnitude of activity in chronic hepatitis C is influenced by apoptosis of T cells responsible for hepatitis C virus. J. Gastroenterol. Hepatol. 14:1018–1024.
36. Enomoto, N., and S. Maekawa. 2010. HCV genetic elements determining the early response to peginterferon and ribavirin therapy. Intervirology
37. Enomoto, N., et al. 1995. Comparison of full-length sequences of interferon-sensitive and resistant hepatitis C virus 1b. Sensitivity to interferon is conferred by amino acid substitutions in the NS5A region. J. Clin. Invest.
38. Enomoto, N., et al. 1996. Mutations in the nonstructural protein 5A gene and response to interferon in patients with chronic hepatitis C virus 1b infection. N. Engl. J. Med. 334:77–81.
39. Farci, P. 2001. Hepatitis C virus. The importance of viral heterogeneity. Clin. Liver Dis. 5:895–916.
40. Farci, P., et al. 1996. Prevention of hepatitis C virus infection in chimpanzees by hyperimmune serum against the hypervariable region 1 of the envelope 2 protein. Proc. Natl. Acad. Sci. U. S. A. 93:15394–15399.
41. Farci, P., et al. 2002. Early changes in hepatitis C viral quasispecies during interferon therapy predict the therapeutic outcome. Proc. Natl. Acad. Sci. U. S. A. 99:3081–3086.
42. Feld, J. J., and J. H. Hoofnagle. 2005. Mechanism of action of interferon and ribavirin in treatment of hepatitis C. Nature 436:967–972.
43. Feld, J. J., et al. 2010. Ribavirin improves early responses to peginterferon through improved interferon signaling. Gastroenterology 139:154–162.
44. Feld, J. J., et al. 2007. Hepatic gene expression during treatment with peginterferon and ribavirin: identifying molecular pathways for treatment response. Hepatology 46:1548–1563.
45. Flint, M., et al. 1999. Characterization of hepatitis C virus E2 glycoprotein interaction with a putative cellular receptor, CD81. J. Virol. 73:6235–6244.
46. Forns, X., et al. 2000. Hepatitis C virus lacking the hypervariable region 1 of the second envelope protein is infectious and causes acute resolving or persistent infection in chimpanzees. Proc. Natl. Acad. Sci. U. S. A. 97:
47. Frese, M., T. Pietschmann, D. Moradpour, O. Haller, and R. Bartenschlager. 2001. Interferon-alpha inhibits hepatitis C virus subgenomic RNA replication by an MxA-independent pathway. J. Gen. Virol. 82:723–733.
48. Fried, M. W., et al. 2002. Peginterferon alfa-2a plus ribavirin for chronic hepatitis C virus infection. N. Engl. J. Med. 347:975–982.
49. Gale, M. J., Jr., et al. 1997. Evidence that hepatitis C virus resistance to interferon is mediated through repression of the PKR protein kinase by the nonstructural 5A protein. Virology 230:217–227.
50. Ge, D., et al. 2009. Genetic variation in IL28B predicts hepatitis C treatment-induced viral clearance. Nature 461:399–401.
51. Gerlach, J. T., et al. 2003. Acute hepatitis C: high rate of both spontaneous and treatment-induced viral clearance. Gastroenterology 125:80–88.
52. Ghany, M. G., D. B. Strader, D. L. Thomas, and L. B. Seeff. 2009. Diagnosis, management, and treatment of hepatitis C: an update. Hepatology 49:
53. Goswami, B. B., R. Crea, J. H. Van Boom, and O. K. Sharma. 1982. 2′-5′-Linked oligo(adenylic acid) and its analogs. A new class of inhibitors of mRNA methylation. J. Biol. Chem. 257:6867–6870.
54. Guyon, I., and A. Elisseeff. 2003. An introduction to variable and feature selection. Mach. Learn. Res. 3:1157–1182.
55. Hadziyannis, S. J., et al. 2004. Peginterferon-alpha2a and ribavirin combination therapy in chronic hepatitis C: a randomized study of treatment duration and ribavirin dose. Ann. Intern. Med. 140:346–355.
56. Hall, M., and E. Frank. 2008. Combining naive Bayes and decision tables, p. 318–319. In D. Wilson and H. Chad (ed.). Proceedings of the 21st Florida Artificial Intelligence Research Society Conference. AAAI Press, Coconut
57. Hall, M. A. 1999. Correlation-based feature subset selection for machine learning. Ph.D. thesis, Department of Computer Science, University of Waikato, Waikato, New Zealand.
58. Hall, T. A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp.
59. Helbig, K. J., D. T. Lau, L. Semendric, H. A. Harley, and M. R. Beard. 2005. Analysis of ISG expression in chronic hepatitis C identifies viperin as a potential antiviral effector. Hepatology 42:702–710.
60. Hsieh, T. Y., et al. 1998. Hepatitis C virus core protein interacts with heterogeneous nuclear ribonucleoprotein K. J. Biol. Chem. 273:17651–
61. Huber, M., et al. 2005. Interferon alpha-2a plus ribavirin 1,000/1,200 mg versus interferon alpha-2a plus ribavirin 600 mg for chronic hepatitis C infection in patients on opiate maintenance treatment: an open-label randomized multicenter trial. Infection 33:25–29.
62. Jaeckel, E., et al. 2001. Treatment of acute hepatitis C with interferon alfa-2b. N. Engl. J. Med. 345:1452–1457.
63. Jensen, F. 1996. An introduction to Bayesian networks. UCL Press, London, United Kingdom.
64. Jiang, D., et al. 2008. Identification of three interferon-inducible cellular enzymes that inhibit the replication of hepatitis C virus. J. Virol. 82:1665–1678.
65. Jin, D. Y., et al. 2000. Hepatitis C virus core protein-induced loss of LZIP function correlates with cellular transformation. EMBO J. 19:729–740.
66. Kato, N., et al. 1994. Genetic drift in hypervariable region 1 of the viral genome in persistent hepatitis C virus infection. J. Virol. 68:4776–4784.
67. Kato, N., et al. 1993. Humoral immune response to hypervariable region 1 of the putative envelope glycoprotein (gp70) of hepatitis C virus. J. Virol.
68. Kraus, M. R., et al. 2001. Compliance with therapy in patients with chronic hepatitis C: associations with psychiatric symptoms, interpersonal problems, and mode of acquisition. Dig. Dis. Sci. 46:2060–2065.
69. Kullback, S., and R. A. Leibler. 1951. On information and sufficiency. Ann. Math. Stat. 22:79–86.
70. Kurosaki, M., N. Enomoto, F. Marumo, and C. Sato. 1993. Rapid sequence variation of the hypervariable region of hepatitis C virus during the course of chronic infection. Hepatology 18:1293–1299.
71. Lauritzen, S. L. 1996. Graphical models. Clarendon Press, Oxford, United
72. Lavillette, D., et al. 2007. Characterization of fusion determinants points to the involvement of three discrete regions of both E1 and E2 glycoproteins in the membrane fusion process of hepatitis C virus. J. Virol. 81:8752–8765.
73. Leban, G., I. Bratko, U. Petrovic, T. Curk, and B. Zupan. 2005. VizRank: finding informative data projections in functional genomics by machine
74. Le Guillou-Guillemette, H., et al. 2007. Genetic diversity of the hepatitis C virus: impact and issues in the antiviral therapy. World J. Gastroenterol.
75. Lopez-Labrador, F. X., et al. 1999. Relationship of the genomic complexity of hepatitis C virus with liver disease severity and response to interferon in patients with chronic HCV genotype 1b infection [correction of interferon].
76. Lutchman, G., et al. 2007. Mutation rate of the hepatitis C virus NS5B in patients undergoing treatment with ribavirin monotherapy. Gastroenterol-
77. Maag, D., C. Castro, Z. Hong, and C. E. Cameron. 2001. Hepatitis C virus RNA-dependent RNA polymerase (NS5B) as a mediator of the antiviral activity of ribavirin. J. Biol. Chem. 276:46094–46098.
78. Magiorkinis, G., et al. 2009. The global spread of hepatitis C virus 1a and 1b: a phylodynamic and phylogeographic analysis. PLoS Med. 6:e1000198.
79. Mangoni, E. D., D. M. Forton, G. Ruggiero, and P. Karayiannis. 2003. Hepatitis C virus E2 and NS5A region variability during sequential treatment with two interferon-alpha preparations. J. Med. Virol. 70:62–73.
80. Manns, M. P., et al. 2001. Peginterferon alfa-2b plus ribavirin compared with interferon alfa-2b plus ribavirin for initial treatment of chronic hepatitis C: a randomised trial. Lancet 358:958–965.
81. Masaki, T., et al. 2008. Interaction of hepatitis C virus nonstructural protein 5A with core protein is critical for the production of infectious virus particles. J. Virol. 82:7964–7976.
82. McLauchlan, J. 2000. Properties of the hepatitis C virus core protein: a structural protein that modulates cellular processes. J. Viral Hepat. 7:2–14.
83. McLauchlan, J. 2009. Hepatitis C virus: viral proteins on the move. Biochem. Soc. Trans. 37:986–990.
84. Melen, K., P. Keskinen, A. Lehtonen, and I. Julkunen. 2000. Interferon-induced gene expression and signaling in human hepatoma cell lines. J. Hepatol. 33:764–772.
85. Miller, K., et al. 2004. Effects of the hepatitis C virus core protein on innate cellular defense pathways. J. Interferon Cytokine Res. 24:391–402.
86. Moradpour, D., et al. 2003. Membrane association of hepatitis C virus nonstructural proteins and identification of the membrane alteration that harbors the viral replication complex. Antiviral Res. 60:103–109.
87. Moribe, T., et al. 1995. Hepatitis C viral complexity detected by single-strand conformation polymorphism and response to interferon therapy.
88. Munoz de Rueda, P., et al. 2008. Mutations in E2-PePHD, NS5A-PKRBD, NS5A-ISDR, and NS5A-V3 of hepatitis C virus genotype 1 and their relationships to pegylated interferon-ribavirin treatment responses. J. Virol.
89. Murakami, T., et al. 1999. Mutations in nonstructural protein 5A gene and response to interferon in hepatitis C virus genotype 2 infection. Hepatology
89a. National Institutes of Health. 2002. NIH consensus statement on management of hepatitis C: 2002. NIH Consens. State Sci. Statements 19(3):1–46.
90. Neumann, A. U., et al. 2000. Differences in viral dynamics between genotypes 1 and 2 of hepatitis C virus. J. Infect. Dis. 182:28–35.
91. Pacheco, B., et al. 2006. Membrane-perturbing properties of three peptides corresponding to the ectodomain of hepatitis C virus E2 envelope protein. Biochim. Biophys. Acta 1758:755–763.
92. Pascu, M., et al. 2004. Sustained virological response in hepatitis C virus type 1b infected patients is predicted by the number of mutations within the NS5A-ISDR: a meta-analysis focused on geographical differences. Gut
93. Pawlotsky, J. M., et al. 1999. Evolution of the hepatitis C virus second envelope protein hypervariable region in chronically infected patients receiving alpha interferon therapy. J. Virol. 73:6490–6499.
94. Penin, F., et al. 2001. Conservation of the conformation and positive charges of hepatitis C virus E2 envelope glycoprotein hypervariable region 1 points to a role in cell attachment. J. Virol. 75:5703–5710.
95. Perez-Berna, A. J., et al. 2008. Interaction of the most membranotropic region of the HCV E2 envelope glycoprotein with membranes. Biophysical characterization. Biophys. J. 94:4737–4750.
96. Perlemuter, G., et al. 2002. Hepatitis C virus core protein inhibits microsomal triglyceride transfer protein activity and very low density lipoprotein secretion: a model of viral-related steatosis. FASEB J. 16:185–194.
97. Pfeiffer, J. K., and K. Kirkegaard. 2005. Ribavirin resistance in hepatitis C virus replicon-containing cell lines conferred by changes in the cell line or mutations in the replicon RNA. J. Virol. 79:2346–2355.
98. Polyak, S. J., et al. 2000. The protein kinase-interacting domain in the hepatitis C virus envelope glycoprotein-2 gene is highly conserved in genotype 1-infected patients treated with interferon. J. Infect. Dis. 182:397–404.
99. Puig-Basagoiti, F., et al. 2005. Dynamics of hepatitis C virus NS5A quasispecies during interferon and ribavirin therapy in responder and non-responder patients with genotype 1b chronic hepatitis C. J. Gen. Virol.
100. Pulaski, B. A., M. J. Smyth, and S. Ostrand-Rosenberg. 2002. Interferon-gamma-dependent phagocytic cells are a critical component of innate immunity against metastatic mammary carcinoma. Cancer Res. 62:4406–4412.
101. Quinlan, R. J. 1986. Induction of decision trees. Mach. Learn. 1:81–106.
102. Ray, R. B., L. M. Lagging, K. Meyer, and R. Ray. 1996. Hepatitis C virus core protein cooperates with ras and transforms primary rat embryo fibroblasts to tumorigenic phenotype. J. Virol. 70:4438–4443.
103. Romero-Gomez, M., et al. 2005. Insulin resistance impairs sustained response rate to peginterferon plus ribavirin in chronic hepatitis C patients.
104. Saito, K., M. Ait-Goughoulte, et al. 2008. Hepatitis C virus inhibits cell surface expression of HLA-DR, prevents dendritic cell maturation, and induces interleukin-10 production. J. Virol. 82:3320–3328.
105. Santolini, E., G. Migliaccio, and N. Lamonica. 1994. Biosynthesis and biochemical properties of the hepatitis C virus core protein. J. Virol. 68:
106. Sarrazin, C., et al. 2000. Mutations within the E2 and NS5A protein in patients infected with hepatitis C virus type 3a and correlation with treatment response. Hepatology 31:1360–1370.
107. Shavinskaya, A., S. Boulant, F. Penin, J. McLauchlan, and R. Bartenschlager. 2007. The lipid droplet binding domain of hepatitis C virus core protein is a major determinant for efficient virus assembly. J. Biol. Chem.
108. Simmonds, P., et al. 2005. Consensus proposals for a unified system of nomenclature of hepatitis C virus genotypes. Hepatology 42:962–973.
109. Simmonds, P., et al. 1993. Classification of hepatitis C virus into six major genotypes and a series of subtypes by phylogenetic analysis of the NS-5 region. J. Gen. Virol. 74:2391–2399.
110. Suppiah, V., et al. 2009. IL28B is associated with response to chronic hepatitis C interferon-alpha and ribavirin therapy. Nat. Genet. 41:1100–1104.
111. Tam, R. C., et al. 1999. Ribavirin polarizes human T cell responses towards a type 1 cytokine profile. J. Hepatol. 30:376–382.
112. Taniguchi, S., et al. 1993. A structurally flexible and antigenically variable N-terminal domain of the hepatitis C virus E2/NS1 protein: implication for an escape from antibody. Virology 195:297–301.
113. Taylor, D. R., S. T. Shi, P. R. Romano, G. N. Barber, and M. M. Lai. 1999. Inhibition of the interferon-inducible protein kinase PKR by HCV E2 protein. Science 285:107–110.
114. Thomas, D. L., et al. 2009. Genetic variation in IL28B and spontaneous clearance of hepatitis C virus. Nature 461:798–801.
115. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
116. Tillmann, H. L., et al. 2010. A polymorphism near IL28B is associated with spontaneous clearance of acute hepatitis C virus and jaundice. Gastroen-
117. Torres-Puente, M., et al. 2008. Genetic variability in hepatitis C virus and its role in antiviral treatment response. J. Viral Hepat. 15:188–199.
118. Toyoda, H., et al. 1997. Quasispecies nature of hepatitis C virus and response to alpha interferon: significance as a predictor of direct response to interferon. J. Hepatol. 26:6–13.
119. Veillon, P., C. Payan, H. Le Guillou-Guillemette, C. Gaudy, and F. Lunel. 2007. Quasispecies evolution in NS5A region of hepatitis C virus genotype 1b during interferon or combined interferon-ribavirin therapy. World J.
120. von Wagner, et al. 2008. Placebo-controlled trial of 400 mg amantadine combined with peginterferon alfa-2a and ribavirin for 48 weeks in chronic hepatitis C virus-1 infection. Hepatology 48:1404–1411.
121. Weiner, A. J., et al. 1992. Evidence for immune selection of hepatitis C virus (HCV) putative envelope glycoprotein variants: potential role in chronic HCV infections. Proc. Natl. Acad. Sci. U. S. A. 89:3468–3472.
122. Wiese, M., F. Berr, M. Lafrenz, H. Porst, and U. Oesen. 2000. Low frequency of cirrhosis in a hepatitis C (genotype 1b) single-source outbreak in Germany: a 20-year multicenter study. Hepatology 32:91–96.
123. Wiese, M., et al. 2005. Outcome in a hepatitis C (genotype 1b) single source outbreak in Germany—a 25-year multicenter study. J. Hepatol. 43:590–598.
124. World Health Organization. 1997. Hepatitis C. Wkly. Epidemiol. Rec.
125. World Health Organization. 1997. Hepatitis C: global prevalence. Wkly. Epidemiol. Rec. 72:341–344.
126. Xu, J., S. Liu, Y. Xu, P. Tien, and G. Gao. 2009. Identification of the nonstructural protein 4B of hepatitis C virus as a factor that inhibits the antiviral activity of interferon-alpha. Virus Res. 141:55–62.
127. Yagnik, A. T., et al. 2000. A model for the hepatitis C virus envelope glycoprotein E2. Proteins 40:355–366.
128. Yao, Z. Q., et al. 2005. SOCS1 and SOCS3 are targeted by hepatitis C virus core/gC1qR ligation to inhibit T-cell function. J. Virol. 79:15417–15429.
129. Yoshida, T., et al. 2002. Activation of STAT3 by the hepatitis C virus core protein leads to cellular transformation. J. Exp. Med. 196:641–653.
130. You, L. R., et al. 1999. Hepatitis C virus core protein interacts with cellular putative RNA helicase. J. Virol. 73:2841–2853.
131. Yuan, H. J., M. Jain, K. K. Snow, J. M. Gale, and W. M. Lee. 2010. Evolution of hepatitis C virus NS5A region in breakthrough patients during pegylated interferon and ribavirin therapy. J. Viral Hepat. 17:208–216.
132. Zhang, Y., et al. 2003. Ribavirin treatment up-regulates antiviral gene expression via the interferon-stimulated response element in respiratory syncytial virus-infected epithelial cells. J. Virol. 77:5933–5947.
133. Zhou, S., R. Liu, B. M. Baroudy, B. A. Malcolm, and G. R. Reyes. 2003. The effect of ribavirin and IMPDH inhibitors on hepatitis C virus subgenomic replicon RNA. Virology 310:333–342.
134. Zhu, N. L., et al. 1998. Hepatitis C virus core protein binds to the cytoplasmic domain of tumor necrosis factor (TNF) receptor 1 and enhances TNF-induced apoptosis. J. Virol. 72:3691–3697.
Source: http://www.homepages.ed.ac.uk/aspiliop/2010_2011/lara_coevolution_hcv_2011.pdf