JOURNAL OF VIROLOGY, Apr. 2011, p. 3649–3663 Copyright 2011, American Society for Microbiology. All Rights Reserved.
Coevolution of the Hepatitis C Virus Polyprotein Sites in Patients on Combined Pegylated Interferon and Ribavirin Therapy▿§

James Lara,* Guoliang Xia, Mike Purdy, and Yury Khudyakov*

Molecular Epidemiology & Bioinformatics Laboratory, Laboratory Branch, Division of Viral Hepatitis, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, Georgia 30333

Received 19 October 2010/Accepted 7 January 2011

Genotype-specific sensitivity of the hepatitis C virus (HCV) to interferon-ribavirin (IFN-RBV) combination therapy and reduced HCV response to IFN-RBV as infection progresses from acute to chronic infection suggest that HCV genetic factors and intrahost HCV evolution play important roles in therapy outcomes. HCV polyprotein sequences (n = 40) from 10 patients with unsustainable response (UR) (breakthrough and relapse) and 10 patients with no response (NR) following therapy were identified through the Virahep-C study. Bayesian networks (BNs) were constructed to relate interrelationships among HCV polymorphic sites to UR/NR outcomes. All models showed an extensive interdependence of HCV sites and strong connections (P < 0.003) to therapy response. Although all HCV proteins contributed to the networks, the topological properties of sites differed among proteins. E2 and NS5A together contributed 40% of all sites and 62% of all links to the polyprotein BN. The NS5A BN and E2 BN predicted UR/NR outcomes with 85% and 97.5% accuracy, respectively, in 10-fold cross-validation experiments. The NS5A model constructed using physicochemical properties of only five sites was shown to predict the UR/NR outcomes with 83.3% accuracy for 6 UR and 12 NR cases of the HALT-C study. Thus, HCV adaptation to IFN-RBV is a complex trait encoded in the interrelationships among many sites along the entire HCV polyprotein. E2 and NS5A generate broad epistatic connectivity across the HCV polyprotein and essentially shape intrahost HCV evolution toward the IFN-RBV resistance. Both proteins can be used to accurately predict the outcomes of IFN-RBV therapy.

Hepatitis C virus (HCV) is the major etiologic agent of blood-borne non-A, non-B hepatitis (25). Chronic HCV infection is an established risk factor for the development of liver diseases, such as fibrosis, cirrhosis, and hepatocellular carcinoma (33, 124, 125). Approximately 70% to 80% of HCV-infected patients fail to clear the virus and progress to chronicity (89a). At present, there are no preventive vaccines against HCV. The current, accepted therapeutic approach to treating chronic hepatitis C infection involves a 24- or 48-week course of pegylated alpha interferon (IFN-α) combined with ribavirin (RBV) (i.e., IFN-RBV therapy) (48, 52). Because only 50% to 70% of chronically infected patients develop a sustained virologic response (SVR) to this treatment (48, 52, 55, 80) and because patient intolerance to such therapy is common (61, 68, 120), the development and application of other therapeutic approaches using antiviral compounds that act against HCV more efficaciously and yet generate lower rates of adverse effects are major clinical management and public health objectives. Therapeutic failure presents in two forms: (i) complete resistance to treatment (no response [NR]) and (ii) unsustainable response (UR), which is characterized by an increase in HCV load observed during therapy after an initial period of decline in viral load (breakthrough) or observed after cessation of therapy (relapse) (52).

Several factors are known to affect therapy outcome in HCV-infected patients, most notably the infecting HCV genotype. There are six major HCV genotypes, 1 to 6 (108, 109). Patients infected with genotype 2 are the most responsive, with SVR being achievable in 70% to 80% of cases (52, 80). In contrast, only 50% to 60% of genotype 1-infected patients achieve SVR (48, 55, 80, 90). Genotype 1 is the most prevalent genotype worldwide (78). The dependence of IFN-RBV response rates on HCV genotype (48, 52, 55, 80) implies that the composition of the HCV genome plays a role in influencing therapy outcome.

The mechanism of IFN action against HCV is not fully known. It was shown that treatment with IFN activates the host's innate antiviral immune responses by inducing IFN-stimulated genes (47, 59, 64, 84). Several HCV genomic regions have been found to be associated with resistance to IFN treatment (74). Since responses to IFN differ among HCV strains, associations between IFN therapy outcome and HCV genomic variability in regions such as hypervariable region 1 (HVR1) of E2 (87, 118) and the V3 domain of NS5A (34, 79) have been frequently investigated. A correlation was reported between NR and the high complexity of HVR1 variants before treatment (87, 118), but it was not confirmed in a subsequent study (75). A high level of V3 heterogeneity was associated with IFN sensitivity (34, 99, 119). Specific mutations in the core protein have also been suggested to determine the early response to IFN-RBV therapy (36).

* Corresponding author. Mailing address: Molecular Epidemiology & Bioinformatics Laboratory, Laboratory Branch, Division of Viral Hepatitis, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA 30333. Phone for J. Lara: (404) 639-1152. Fax: (404) 639-1563. E-mail: [email protected]. Phone for Y. Khudyakov: (404) 639-2610. Fax: (404) 639-1563. E-mail: [email protected].
§ Supplemental material for this article may be found at http://jvi
▿ Published ahead of print on 19 January 2011.

Both E2 and NS5A proteins have been implicated in binding to the IFN-inducible, double-stranded, RNA-activated protein
kinase R (PKR), which is involved in the IFN-induced antiviral response (49). A 12-amino-acid (aa) region located between positions 659 and 670 in E2 known as the PKR-α subunit of eukaryotic initiation factor 2 (PKR-eIF2α) phosphorylation homology domain (PePHD) was shown to bind PKR in vitro (113). The PePHD sequence has similarity to the autophosphorylation sites of PKR and the phosphorylation site in the eIF2α. This similarity is greater for HCV genotype 1 than genotype 2 or 3. However, the association between PePHD sequence and therapy outcomes has not been consistently shown (1, 98). A PKR-binding domain is located in the C-terminal region of NS5A (49). A variable 40-aa region of this domain, termed the interferon sensitivity determining region (ISDR), was reported to play a key role in the IFN therapy response (37, 38). Analysis of HCV 1b sequences showed an association between the number of ISDR mutations and the response to the IFN therapy (92). However, studies of HCV genotype 2b and 3a did not find such a relation between SVR and NS5A variability (8, 89). Additionally, no binding between PKR and the genotype 3a NS5A from the IFN-resistant HCV strains was observed in vitro (20).

RBV, a guanosine nucleotide analog, is inefficacious against HCV when used alone but when combined with IFN therapy dramatically improves viral clearance and decreases relapse rates (42). The mechanism by which RBV improves treatment responses is not well understood. Several mechanisms of its therapeutic action have been proposed, including inosine monophosphate dehydrogenase inhibition (133), viral inhibition (77), facilitation of Th1 immunoresponses (111), mutagenesis (27, 76), inhibition of 5′ cap formation on mRNAs (53), and upregulation of genes involved in IFN signaling (44, 132). However, none of these mechanisms has been convincingly shown to be responsible for its efficacy when combined with IFN (42). Nonetheless, RBV was recently shown to improve early responses to IFN (43), thus supporting its role in enhancing IFN signaling (44, 132) and emphasizing the leading role of IFN in combination therapy.

Host factors have also been found to affect both the natural course of HCV infection and the outcome of treatment (116). For example, common-source HCV infections frequently lead to differential outcomes among incident cases, with some patients resolving the infection and some developing chronic hepatitis C (122, 123), or patients chronically infected with the same genotype respond differently to IFN-RBV treatment despite carrying similar HCV viral loads (55, 103). In addition to genotype, demographic factors such as ethnicity and gender have been associated with therapy outcomes (48, 80, 103). Several studies reported the role of the host genetic polymorphism, e.g., in the IL28B locus, in defining the rate of spontaneous clearance (114) and IFN-RBV SVR (50, 110).

Many host selection pressures, including innate and adaptive immune responses, shape HCV evolution, and their effects should be reflected in HCV genetic composition and epistatic connectivity among genomic sites. Indeed, polymorphic sites within the HCV genome have been shown to be organized as a network of coordinated substitutions (17), with the topology of the network being different for HCV strains that are resistant or sensitive to treatment (7). Although indicating a strong association of many HCV sites with outcomes of therapy, these networks, however, do not provide quantitative measures for viral genomic parameters related to IFN treatment.

In this paper, we report modeling of quantitative associations between a global epistatic connectivity among the HCV polymorphic amino acid sites and UR/NR outcomes of the IFN-RBV therapy. While NR represents complete resistance to IFN-RBV, UR reflects incomplete suppression of HCV or intrahost HCV evolution toward IFN-RBV resistance (93). Both UR and NR are associated with HCV persistence despite treatment (52). With HCV available for analysis at the start and end of therapy, these outcomes provide an important setting for analyzing genetic changes in the HCV genome associated with resistance.

MATERIALS AND METHODS

Sequence data. Analyses were conducted using the HCV 1a full-length polyprotein consensus sequences from 20 patients (10 UR and 10 NR cases) identified through the Virahep-C study (18, 26). Sequences in the Virahep-C study were sampled from patients before (n = 20) and at the end of treatment (n = 20) with pegylated IFN-α2a and RBV. Analyses included all sites from the entire HCV polyprotein except for the most C-terminal 56 aa from the NS5B protein. This sequence data set served as a training set for developing models for prediction of therapy outcomes. For some analyses, a total of 298 HCV 1a full-length consensus polyprotein sequences from GenBank were used. In addition, full-length NS5A protein consensus sequences from 18 treatment-naïve patients (6 UR and 12 NR) identified through the HALT-C trial (131) were used as a test data set to validate the NS5A predictive models constructed from the Virahep-C data. A full listing of the GenBank accession numbers of all sequences used in this study can be found in the supplemental material.

An alignment of the HCV viral sequences from all three data sets was generated using the Clustal W program (115) implemented in BioEdit v7.0.5.3 (58). HCV H77 (GenBank accession no. AF009606) was used as the reference sequence. In addition, alignments of consensus sequences for individual gene products were generated using the Virahep-C data. Each amino acid site was numbered according to its position in the HCV polyprotein. For modeling, each sequence was associated with the IFN-RBV therapy outcome, UR or NR. Together, the sequences and assigned therapy outcome attributes constituted the entire set of viral features representing each HCV variant. These viral features of the Virahep-C data were used for modeling dependencies among sites in relation to treatment response.

Conditional independence analysis. Pairwise conditional independencies (CI) among HCV viral features (amino acid sites and therapy outcome) were examined using full-length polyprotein consensus sequences from the Virahep-C study (18, 26). Testing for CI was performed in the form of undirected independence graphs (71), which present the CI among a collection of variables. Nodes in the graph represent the HCV polyprotein sites and therapy outcome, while links between nodes represent dependencies among the features.

The CI testing was used to validate dependency among the polyprotein sites in relation to the therapy outcome. Only polymorphic sites were considered for finding CI from the data. The identified dependency between two features was shown in the graph as a link. This type of statistical analysis assumes the null hypothesis of independence between any two given features. Relative strengths assigned to links in the graph were based on the marginal dependencies between observed associations. Marginal dependence for each link connecting variables A and B was quantified through P value. For each set, C, of conditioning variables, a P value for {A, B} was computed, which expresses the probability that A and B are conditionally independent given C. The marginal P value is the value corresponding to C = {A, B}. The marginal dependence between A and B is defined as 1 minus the marginal P value associated with {A, B}, where a marginal dependence of 0 means that A and B are completely independent and 1 means that they are completely dependent. The CI among the features was measured at several different levels of significance (thresholds between 0.05 and 5 × 10⁻⁶). Undirected independence graphs and statistical computations of CI were conducted as implemented in the commercially available software package Hugin Researcher (v6.8).

Bayesian network (BN). Relationships among amino acid sites of the HCV polyprotein and therapy outcome were examined using probabilistic graphical models in the form of a Bayesian network (BN) (63), where nodes in the graph represent variables (here, amino acid sites and therapy outcome) and links between the nodes represent relationships. Unlike the undirected independence graphs, BNs provide a more complex notion of the relationships. This includes the notion of the conditional probability and directionality of the relationship.
HCV GENETIC PATTERNS OF ADAPTATION TO IFN THERAPY
Links connecting two variables (nodes in the graph) are represented as arcs, which may project toward the node (incoming links) or from the node (outgoing links), thus specifying the direction of influences among variables. Relationships between variables in a BN may be interpreted as causal (22). The conditional probability distributions are represented in the conditional probability tables (CPTs) of the variables (features) in the network. CPTs of the BNs in this study represent amino acid probability distributions at each site and probabilities associated with therapy outcome. Inference of the network structure and parameter estimations (i.e., CPTs) of all BNs constructed for this study was performed through Bayesian artificial intelligence learning algorithms.

BNs were inferred from the Virahep-C data using the HCV polyprotein sequences and associated therapy outcomes. The objectives of analysis with BN were to examine the complexity of the probabilistic interrelationship and measure the importance (or strength) of links among the HCV amino acid sites and therapy outcome (variables). Measurements of the importance of links were used to identify the most influential amino acid sites in the polyprotein BN. The importance of a variable can be estimated using the number and strength of links associated with the corresponding node in the BN. The amino acid sites that most strongly influence the probabilities of the treatment outcome were of particular interest.

The greedy thick thinning (GTT) method (31) was used to infer the BN structure for the task of examining complexity of interrelationships among the variables. The number of incoming links to any given node was constrained between 3 and 10. Parameter estimation of the CPTs was performed using the K2 priors (28) of each variable in the network. Complexity of the probabilistic interrelationship among amino acid sites and therapy outcomes was also examined by individual protein regions. BNs were constructed for each individual protein using the same methods as described above for structure learning and network parameterization (GTT and the K2 priors, respectively). BNs were constructed using the GeNIe software (http://genie.sis.pitt.edu/).

It is important to note that with the increase in the number of variables, the number of possible networks grows superexponentially and computation of the probabilities of all links becomes NP-hard (24). Therefore, a search heuristics method was adopted to compute the strengths of the links in order to derive measures of the importance of relationships among amino acid sites and therapy outcome. The maximum spanning tree (MST) algorithm was used to infer the BN structure from the data.

The strength of the probabilistic relationships (or force of the influences) among variables (amino acid sites and therapy outcome) was inferred by computing the Kullback-Leibler (KL) divergence (69) between the joint probability distribution with and without the link. The greater the KL divergence between these two distributions, the greater the strength of the link and, hence, the importance of the relationship it represents. The global importance of an amino acid site was calculated as the sum of strength of incoming and outgoing links associated with the node representing this site in the network. The overall strength of links for individual protein regions and relevance to the therapy outcome was calculated by summing the strength of incoming (incoming strength) and outgoing links (outgoing strength) associated with each region.

The relative significance of the contribution that each amino acid site independently provided to the knowledge of therapy outcome was determined using a naïve BN (28) approach. The BN structure was inferred from the Virahep-C data using the MST algorithm. This approach identifies associations between the therapy outcome and amino acid sites, with sites considered to be independent from each other. Mutual information was used to measure contribution of each site to the knowledge of therapy outcome (29). All algorithms based on heuristic methods used here to infer the BN structures as well as computation of link strength and relevance of variables were carried out as implemented in the Professional Edition of BayesiaLaB software (Bayesia SAS, Laval, France). The Pearson correlation coefficient was calculated using SAS (version 9.2; SAS Institute Inc., Cary, NC).

Bayesian network classifier (BNC). BNC was constructed for E2 and NS5A. Both BNCs can infer the probabilities of the UR/NR responses to IFN-RBV treatment directly from amino acid sequence. The E2 BNC and NS5A BNC were inferred from the Virahep-C data as follows: (i) the network was initialized as a naïve BN (28), where the therapy outcome was directly linked to all amino acid sites; (ii) conditional probabilities for amino acid sites were computed. The K2 learning algorithm (28) was used to infer BN structure. The maximum number of incoming links associated with each node (feature) in BN was constrained to 4. Parameter estimation of CPTs of each feature in the BN was empirically derived from the data.

The NS5A BNC based on the selected amino acid sites was constructed using the hybrid decision table-naïve Bayes method (DTNB) (56). The DTNB is a BN where CPTs are represented by a decision table. This method has been shown to perform well when applied with feature selection (56). The DTNB model splits the features into two groups: one group assigns class probabilities based on naïve Bayes, and the other assigns class probabilities based on a decision table. The resulting probability estimates are then combined to estimate the probability of the outcome class association.

Physicochemical properties of HCV variants. Each amino acid can be represented as a set of physicochemical properties. Using these properties, the HCV polyprotein consensus sequences from the Virahep-C data set were converted into the respective physicochemical vectors, which were subsequently used to identify their association with therapy outcome. Analyses were conducted for the HCV polyprotein and individual gene products. Conserved positions were not considered for the physicochemical representation of HCV variants. Position numbering of polymorphic sites was maintained according to the HCV polyprotein. Sequence alignments comprised of polymorphic sites were transformed into N × 5 dimensional numerical vectors, where N is the sequence length and 5 represents the number of physicochemical values assigned to each amino acid site in the sequence. The five physicochemical factors used in this study have been previously described (6). Each vector was then associated with the known therapy outcome (18, 26).

Physicochemical mapping of the data was conducted using a projection pursuit-based technique in the form of a two-dimensional linear projection (LP) (32). The method was used to search for a combination of the physicochemical vectors (projections) that most accurately separates HCV variants into two classes: UR and NR therapy outcomes. The LP mapping can be tested on new data without having to reconstruct the original mapping (32).

Feature selection was used to identify amino acid sites and their properties most relevant to the therapy outcome-based clustering of the HCV variants. A minimal subset of site-specific properties (features) from the NS5A protein was derived, using a heuristic method (73), to search for "interesting" projections that were most associated with the therapy outcome. Projections were evaluated during the global and local searching that was performed using the k-nearest neighbor method (k = 10) and tested by 10-fold cross validation (10-fold CV) for classification correctness. Correctness estimation was based on the average probability of a projection to be assigned to the correct therapy outcome class. During the global and local searches, 5 × 10⁶ and 3 × 10⁶ projections, respectively, were evaluated.

Feature selection (FS). FS was applied to alignments of the full-length consensus polyprotein sequences and individual gene products of the Virahep-C data to determine which amino acid sites were most associated with therapy outcome. FS reduces dimensionality of the data and improves the prediction performance of BNCs. The usefulness of each amino acid site for the prediction of the therapy outcome was evaluated using FS techniques for ranking or selecting an optimal subset of features. Feature ranking was conducted using divide-and-conquer approaches (decision trees) and information-based metrics. Correlation was used as the filtering metric to search for optimal subsets of features. Given that FS techniques have biases known to affect the variable selection optimization method (30, 54), several FS methods were applied.

Three FS techniques based on information theory were used: information gain (101), Gini gain (16), and gain ratio (101). These methods rank the relevance of the features (amino acid sites) based on a score that each feature receives in relation to the therapy outcomes, UR and NR. The top 25 ranked amino acid sites relevant to the UR/NR outcome were selected and used for comparison between the techniques. Features that by themselves are not useful for prediction (those with a low score) may, however, become useful when combined with other features and, hence, be relevant to the prediction (54). Therefore, the feature subset selection method based on correlation (CFS) (57) was applied to the Virahep-C data. Unlike the ranking methods, the CFS identifies a subset of features (amino acid sites) based on their degree of correlation to the class variable (therapy outcome) and low intercorrelation between features. This method was used to search for a minimal subset of complementary amino acid sites to improve the BNC accuracy.

Evaluation and validation of the therapy outcome predictors. The E2 and NS5A BNC were evaluated by 10-fold CV. Briefly, the HCV variants represented by all polymorphic sites or selected amino acid sites from E2 or NS5A were randomly divided into 10 parts of equal size. Each part was held out strictly as a testing data set to evaluate the prediction accuracy of the BNC trained with the remaining nine parts of the data. This process was executed until the BNC was evaluated with all 10 parts. The 10 accuracy estimates were then averaged to estimate the overall accuracy of the BNC.

Also, BNCs trained with data sets in which the E2 and NS5A protein sequences were randomly assigned UR/NR outcomes were evaluated for prediction accuracy. The results were then compared to the accuracy obtained from the BNCs trained with the correct outcome assignment in order to account for any random statistical correlations present in the Virahep-C data.
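
The evaluation procedure described above (10-fold CV followed by a control with randomly reassigned outcomes) can be sketched as follows. This is a minimal illustration, not the software used in the study: a categorical naive Bayes model with add-one smoothing stands in for the BNC, and the function names, the smoothing constant, the fold assignment, and the toy sequences are all assumptions introduced here.

```python
import math
import random
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count residues per (site, outcome); a simple stand-in for the BNC."""
    class_counts = Counter(labels)
    site_counts = defaultdict(Counter)  # (site index, outcome) -> residue counts
    for row, label in zip(rows, labels):
        for i, residue in enumerate(row):
            site_counts[(i, label)][residue] += 1
    return class_counts, site_counts

def predict(model, row, alphabet_size=20):
    """Return the outcome with the highest log posterior (add-one smoothing)."""
    class_counts, site_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)
        for i, residue in enumerate(row):
            count = site_counts[(i, label)][residue]
            score += math.log((count + 1) / (n + alphabet_size))
        scores[label] = score
    return max(scores, key=scores.get)

def ten_fold_cv(rows, labels, k=10, seed=0):
    """Average percent correctness over k held-out parts (needs >= k sequences)."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # k parts of (nearly) equal size
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        model = train_naive_bayes([rows[i] for i in train],
                                  [labels[i] for i in train])
        correct = sum(predict(model, rows[i]) == labels[i] for i in held_out)
        accuracies.append(100.0 * correct / len(held_out))
    return sum(accuracies) / k

def shuffled_label_control(rows, labels, seed=1):
    """Repeat the CV after randomly reassigning outcomes to sequences."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    return ten_fold_cv(rows, shuffled)

# Toy data (hypothetical): two sites, the first one tracks the outcome.
toy_rows = ["AK"] * 20 + ["TK"] * 20
toy_labels = ["UR"] * 20 + ["NR"] * 20
print(ten_fold_cv(toy_rows, toy_labels))
print(shuffled_label_control(toy_rows, toy_labels))
```

A large gap between the first and second printed values is what the control is meant to reveal; similar values would suggest that the apparent accuracy reflects chance correlations in a small data set.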

FIG. 1. Undirected independence graphs showing relative strengths of the dependencies (links in the graph) found among HCV polyprotein sites (nodes in the graph) and UR/NR outcomes following IFN-RBV therapy from 40 sequences obtained from 10 UR and 10 NR patients in the Virahep-C data. Feature pairs whose dependencies exceed the threshold are linked. HCV polyprotein sites are grouped by region; from left to right: core, E2, NS2, NS4A, and NS5A (upper row), and E1, P7, NS3, NS4B, NS5B (lower row). Therapy outcome is shown as a single node at the top of the graphs. (a) Initial explorative search for significant dependencies (P < 0.05) followed by gradual decrease in thresholds: 0.0032 (b), 3 × 10⁻⁴ (c), 6 × 10⁻⁵ (d), and 5 × 10⁻⁶ (e). (f) Reduced diagram of summarized dependencies between HCV polyprotein sites and treatment outcome (P ≤ 0.003).
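
The thresholding shown in the panels of Fig. 1 can be illustrated with a small sketch. It is not the Hugin procedure: a permutation test on mutual information is substituted here as an assumed stand-in for the CI tests, and the function names, the number of permutations, and the toy columns are hypothetical. Only the definition of marginal dependence as 1 minus the marginal P value is taken from the text.

```python
import math
import random
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in nats) between two equally long columns."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def marginal_p_value(a, b, n_perm=2000, seed=0):
    """Permutation P value under the null hypothesis that a and b are independent."""
    rng = random.Random(seed)
    observed = mutual_information(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(a, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def dependence_links(columns, threshold):
    """Link every pair whose P value is below the threshold.

    Returns {(feature_u, feature_v): marginal dependence}, where the marginal
    dependence is 1 minus the marginal P value.
    """
    names = sorted(columns)
    links = {}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            p = marginal_p_value(columns[u], columns[v])
            if p < threshold:
                links[(u, v)] = 1.0 - p
    return links

# Toy alignment columns (hypothetical): site_A tracks the outcome, site_B does not.
toy_columns = {
    "outcome": ["UR"] * 10 + ["NR"] * 10,
    "site_A": ["A"] * 10 + ["T"] * 10,
    "site_B": ["K", "R"] * 10,
}
print(dependence_links(toy_columns, threshold=0.05))
```

Lowering `threshold` removes the weaker links, which corresponds to moving from panel (a) toward panel (e). A permutation test cannot resolve P values below roughly 1/(n_perm + 1), so the smallest thresholds in the caption would need an analytic test or many more permutations.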
Two measures of accuracy were used for classification performance: overall HCV polymorphic amino acid sites and their potential linkage percent classification correctness and precision. The overall percent correctness to the UR/NR outcome of IFN-RBV therapy was conducted was measured as [(no. correctly classified instances/total no. of instances) ⫻ 100].
Precision was determined in the following manner (where TP is the number of using 40 HCV full-genome sequences obtained before and at true positives, TN is the number of true negatives, FP is the number of false positives, the end of therapy from 10 UR and 10 NR patients from the and FN is the number of false negatives): precision ⫽ 关TP/(TP FP)] ⫻ 100%; Virahep-C study (18). The HCV sequences from before and TP ⫽ [TP/(TP FN )] ⫻100%; FP ⫽ [FP/(FP TN)] ⫻ 100%.
after therapy were used to account for HCV evolution during The validation of the NS5A predictive models was conducted using the consensus sequences of the NS5A protein from the HALT-C study (131), which were not part treatment. A total of 551 polymorphic sites were found in the of any of the analyses described herein. Estimation of the NS5A BNC and NS5A-LP HCV polyprotein consensus sequences from these patients. CI accuracy of prediction of treatment outcome for HCV NS5A variants from the tests were performed to measure the degree of dependency HALT-C study was based on the overall percent classification correctness.
among the polymorphic amino acid sites and the UR/NR out-come of IFN-RBV therapy. Results of the CI test were visually displayed as the undirected independence graph (71) (Fig. 1), Complex interdependence between polymorphic sites and
in which the conditional dependencies among amino acid sites therapy outcome. CI analysis of interdependencies between
and UR/NR outcome (shown as nodes in the graph) are rep- HCV GENETIC PATTERNS OF ADAPTATION TO IFN THERAPY TABLE 1. Propertiesa of the HCV polyprotein BN No of linksb a Properties correspond to a polyprotein BN inferred for the Virahep-C HCV variants.
b The number of incoming links to any given node was constrained to a maximum of 10. The node count represents the total number of polymorphic sites from each polyprotein region that contributed to the BN. Outdegree, total number of outgoing links from all sites in each protein region. Indegree, total number of incoming linksto sites in each region. Max. outdegree, maximum number of outgoing links from any one site from the respective protein region. To response, number of links withdirect relationship to the outcome in the BN.
c Total number of links in the network.
resented as undirected links or edges. The undirected graph into this polyprotein BN are listed in Table S1 in the supple- displayed numerous links representing a dense and complex mental material.
network of dependencies (P ⬍ 4 ⫻ 10⫺4) between amino acid As shown in Table 1, HCV proteins do not contribute polymorphic sites across the entire HCV polyprotein and ther- equally to the network topology. The E2 and NS5A regions of apy outcome. A large number of links among sites within and the HCV polyprotein are the two major contributors of sites between individual proteins remained present up to a thresh- into the HCV polyprotein BN (21% and 18% of amino acid old value of 2 ⫻ 10⫺5. E2 protein sites formed the strongest sites, correspondingly). The E1, E2, and NS5A regions are also dependencies. For example, site 612 of E2 is strongly con- major contributors of links into the network (25.3%, 42%, and nected to site 233 in E1 (P ⫽ 3 ⫻ 10⫺8), and site 642 of E2 to 26.8% of all links, correspondingly). The majority of links are site 1756 in NS4B (P ⫽ 2 ⫻ 10⫺8). Also, sites 482 and 612 are between proteins, with only 17.5% of all links being within strongly connected to site 642 (P ⫽ 2 ⫻ 10⫺9). It is important individual protein regions. Among all E2 links, 18.9% are to note that links connecting amino acid sites to therapy out- among E2 sites, whereas all other proteins contain only 1.4% come were among the strongest (P ⱕ 7 ⫻ 10⫺5). As shown in to 10.5% of intraprotein links. Owing to the large number of Fig. 1, therapy outcome was strongly linked (P ⱕ 0.003) to polymorphic sites contributing to the network, the E2 and amino acid sites from the E1 (site 242), E2 (sites 397, 434, 524, NS5A proteins are extensively connected to each other and to and 655), P7 (site 790), NS3 (site 1090), NS5A (sites 2280, all other proteins. As shown in Fig. 3, ⬃20% of all E2 sites 2283, 2320, 2366, 2411, 2413, and 2414), and NS5B (sites 2530, have direct links to NS5A, and ⬃35% of all NS5A sites have 2633, 2730, and 2747) regions. 
The strongest dependencies direct links to E2 in the polyprotein BN, indicating a significant were found with sites from P7 (site 790; P ⫽ 2 ⫻ 10⫺4), NS5A coordination of substitutions between these two proteins.
(sites 2280 and 2283; P = 3 × 10⁻⁴ and P = 7 × 10⁻⁵, respectively), and NS5B (site 2633; P = 9 × 10⁻⁴). These data suggest strong coordination of substitutions at sites along the entire HCV polyprotein and association between polymorphic sites and therapy outcome.

Contribution of different proteins to therapy outcome. To infer a more insightful representation of the relationships among polymorphic sites and therapy outcome, a Bayesian network (BN) approach (63) was used. The complexity of relationships among HCV polymorphic sites and UR/NR outcome was evaluated by inferring BNs from the full-length HCV polyprotein consensus sequences. The properties of the network are listed in Table 1. In concordance with the undirected interdependence graph findings, interrelationships among all polymorphic sites were found to be highly complex. Figure 2 shows the structure of the polyprotein BN containing 551 polymorphic amino acid sites and their association to therapeutic outcome. Although all sites are interdependent, the number of links broadly varies from 1 to 30 among sites. Sites contributing

FIG. 2. BN (P = 3) of inferred relationships among the full-length HCV polyprotein sites and IFN-RBV therapy outcome. Polyprotein sites and outcome are represented as nodes in the graph. Relationships among features are represented as arcs. Features whose probabilistic dependencies exceed the conditional independency tests and GTT scoring are connected. The graph was constructed using a spring-embedded network layout algorithm. Features are color-coded by region (inset), and therapy outcome is shown in red.

Despite generating many connections (n = 554) and contributing many sites (n = 118) to the polyprotein BN (Table 1), E2 does not have direct links to therapy outcome. Only six sites form such direct connections, with two sites (at positions 864 and 934) being from NS2, a single site (at position 1841) from NS4B, two sites (at positions 2280 and 2283) from NS5A, and a single site (at position 2633) from NS5B.

The core protein contributes only 11 sites (1.8% of all sites) but 136 links (10.3% of all links) to the polyprotein BN, with each site being connected to ~13 other sites, which is ~3 to 8 times more than the individual sites from any other HCV protein (Table 1). The E1 sites contain 4.5 connections on average, while sites of all other proteins are linked on average to 1.5 to 2.7 other sites. The essential difference is in the directionality of links among proteins. Two proteins, core and E1, located at the N terminus of the HCV polyprotein, have 92% and 69.7% of their links directed outside, respectively, suggesting their important causal role in defining states of many polyprotein sites connected to these two proteins. All other proteins have almost equal measures of incoming (in-degree) and outgoing (out-degree) links.
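The per-protein link direction described above can be illustrated with a minimal sketch. It assumes the network is available as a list of directed arcs (parent site, child site) plus a map from site position to protein. The function name `summarize_link_direction`, the toy arcs, and the choice to compute the outgoing percentage over interprotein links only are illustrative assumptions; this is not the software or the exact definition used in the study.

```python
from collections import defaultdict


def summarize_link_direction(arcs, protein_of):
    """Count outgoing, incoming, and intraprotein arcs for each protein.

    arcs       -- iterable of (parent_site, child_site) pairs
    protein_of -- dict mapping a site position to its protein name
    """
    counts = defaultdict(lambda: {"out": 0, "in": 0, "intra": 0})
    for parent, child in arcs:
        src, dst = protein_of[parent], protein_of[child]
        if src == dst:
            counts[src]["intra"] += 1
        else:
            counts[src]["out"] += 1
            counts[dst]["in"] += 1

    summary = {}
    for protein, c in counts.items():
        between = c["out"] + c["in"]
        # Assumption: "directed outside" is taken as the share of
        # interprotein links that leave the protein.
        pct = 100.0 * c["out"] / between if between else 0.0
        summary[protein] = dict(c, pct_outgoing=pct)
    return summary


# Hypothetical arcs for illustration only; they are not taken from Fig. 2.
toy_protein_of = {147: "core", 161: "core", 233: "E1",
                  482: "E2", 642: "E2", 2414: "NS5A"}
toy_arcs = [(147, 233), (147, 482), (161, 2414), (642, 161), (482, 642)]

summary = summarize_link_direction(toy_arcs, toy_protein_of)
for protein, row in sorted(summary.items()):
    print(protein, row)
```

A protein whose `pct_outgoing` is close to 100 acts mostly as a parent in the network, which is the pattern reported here for core and E1.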
Many essential properties of the polyprotein BN constructed using the 40 Virahep-C sequences, except for linkage to therapy, were observed with another BN constructed using HCV genotype 1a full-length genome sequences obtained from GenBank (n = 298). As shown in Fig. 3 and 4, the GenBank BN and Virahep-C BN have similar distributions of links, and interrelationships among individual proteins are highly correlated (r = 0.99, P < 0.0001), indicating that the overall coordination among substitutions in the HCV genotype 1a data set has been adequately represented by the Virahep-C sequences used in this study. However, variations in the number of polymorphic sites are observed between the GenBank BN (n = 1,296) and polyprotein BN (n = 551). Despite the greater number of polymorphic sites in the GenBank sequences, the Virahep-C sequences contain 25 unique polymorphic sites distributed among all proteins except core: at positions 230, 349, and 381 in E1; 385, 582, 631, and 742 in E2; 768 in P7; 826 and 926 in NS2; 1385, 1461, 1520, 1528, 1565, and 1592 in NS3; 1681 in NS4A; 1805, 1820, and 1846 in NS4B; 2003, 2049, and 2343 in NS5A; and 2500 and 2548 in NS5B. These findings, in conjunction with the observation of the 1.7-fold increase in the number of links between sites in E1 and E2 and the 2-fold increase between sites in E2 and NS5A in the Virahep-C BN compared to the GenBank BN (Fig. 3), suggest treatment-specific variations in coordination of substitutions at the genomic sites in the UR/NR HCV strains.

FIG. 3. Distribution of links among polymorphic sites of the HCV 1a NS5A or E2 proteins with other viral proteins in the HCV polyprotein BN. E2 and NS5A interrelationships are compared between the HCV polyprotein BNs inferred from GenBank data and Virahep-C data. Sites (%), percentage of sites from each region that were linked to sites from NS5A or E2.

FIG. 4. Relative strength and direction of links associated with individual HCV proteins in the Virahep-C BN (A) and GenBank BN (B). The total strength of all outgoing links (blue bars), incoming links (red bars), and the global strength (green bars) are shown for each protein.

TABLE 2. Properties(a) of the BNs for individual protein regions
Column group: No. of links(b)
(a) Properties correspond to protein-BN inferred from the Virahep-C data using alignments of the individual HCV gene products.
(b) The maximum number of incoming links was constrained to 10. The NS3-BN reached maximum complexity at a constraint of 11. Arc count, total number of links in the BN; Avg or Max. outcomes, numbers of amino acid states (heterogeneity) of protein sites.

Protein sites relevant to therapy outcome. Observation of a significant interconnection and coordination among HCV proteins suggests that all proteins contribute to determining the UR/NR outcome. To analyze these contributions in more detail, BNs were constructed for the individual gene products. Extensive dependencies between sites and association to therapy outcome were found in all individual polyprotein regions, albeit to different degrees. The E2 and NS5A regions were found to form a more dense set of links than other regions of the polyprotein (Table 2).

Although many polymorphic sites were found to be interlinked in the model shown in Fig. 1, indicating a significant coordination of heterogeneity along the HCV polyprotein, there are a large number of sites with very few links, suggesting their marginal contribution to the polyprotein BN. To evaluate which proteins and amino acid sites were most associated with the outcome, we conducted feature selection experiments. By using a naïve Bayesian network with feature selection, the E2 and NS5A polyprotein regions were found to contribute the greatest number of sites relevant to outcome (27.5% and 26.3%, respectively) (Fig. 5). Similar results were observed with four filtering methods for feature selection (Fig. 6). Each of the feature selection techniques extracted a certain number of the most relevant sites. A greater proportion of sites were selected from E2 and NS5A as relevant to the outcome (Fig. 6). The NS5A region consistently contributes a large number of relevant sites with all four feature selection techniques. Depending on the technique, 14.3% to 32% and 24.0% to 44% of amino acid sites were, respectively, selected from E2 and NS5A as contributing to the outcome. All of the techniques used selected significantly overlapping sets of the relevant amino acid sites from all proteins, albeit with variations in ranking among the selected sites (see Table S2 in the supplemental material). A set of sites selected using one of the techniques is shown in Table 3.

Relationships between variables in a BN may be interpreted as being causal (22), which can be applied to detect relevance of a variable to define a target feature, in this case, therapy outcome. Analysis of the strength of influence measured as the Kullback-Leibler divergence (69) between the joint probability distribution with and without the arc shows that sites from the E2 and NS5A proteins have the strongest overall influences on
outcome (Fig. 7). Additionally, analysis of contribution of individual sites to the UR/NR outcome was conducted using a ratio of the mutual information calculated for each site and the outcome over the maximal mutual information (MI). The maximal MI (MI = 0.3951, P = 0.0001) was calculated for site 2283 in the NS5A protein. Using this ratio as a measure of the relative significance of each site for determining outcome, 25 sites were identified in six proteins with values for this ratio being >0.5 (Fig. 8). Among these sites were one site at position 242 in E1, eight sites at positions 391, 394, 397, 400, 401, 434, 528, and 655 in E2, two sites at positions 753 and 790 in P7, one site at position 941 in NS2, nine sites at positions 2153, 2198, 2280, 2283, 2320, 2339, 2375, 2376, and 2413 in NS5A, and four sites at positions 2633, 2730, 2747, and 2755 in NS5B. The important observation from this analysis is that E2 and NS5A together contain ~70% of these highly relevant sites. Another interesting observation is that hypervariable region 1 (HVR1) contributes five of eight relevant sites in E2, thus suggesting that HVR1 heterogeneity is associated with HCV evolution toward the IFN-RBV resistance.

TABLE 3. Correlation-based feature selection (CFS) of HCV sites relevant to UR/NR outcomes(a)
Polyprotein positions (one group per table row, in the order extracted):
29, 48, 75, 81, 106, 147, 161
192, 210, 230, 231, 236, 242, 243, 256, 280, 287, 293, 300, 308, 314, 345, 372, 379
394, 397, 434, 478, 480, 490, 498, 524, 528, 534, 591, 595, 625, 655, 668
762, 763, 767, 768, 770, 777, 789, 790
814, 824, 841, 843, 873, 934, 938, 941, 957, 958, 962, 982, 1017, 1021
1068, 1087, 1088, 1090, 1115, 1124, 1145, 1148, 1196, 1200, 1239, 1266, 1306, 1366, 1398, 1405, 1409, 1412, 1417, 1428, 1444, 1461, 1592
1681, 1686, 1687, 1693, 1700
1737, 1753, 1759, 1804, 1816, 1841, 1941, 1968
2024, 2043, 2280, 2283, 2320, 2366, 2376, 2501, 2530, 2582, 2629, 2633, 2730, 2747,
524, 790, 1090, 1409, 1592, 2024, 2280, 2283, 2366, 2376, 2414, 2530, 2633, 2950
(a) List of subset of amino acid sites relevant to outcome prediction. Subsets of sites from each region were determined by filtering out less-predictive sites. CFS was applied to data sets: 10 data sets representing sites from each individual gene product and the therapy outcome and 1 data set representing the full-length HCV polyprotein sequences and associated therapy outcome. Site subsets listed for the E2 and NS5A proteins were used in E2-BNC and NS5A-BNC (Fig. 10).

FIG. 5. BN with selection of relevant sites linked to the UR/NR outcomes. Site selections were based on the BN choice of relevant features for outcome prediction. A total of 80 HCV polyprotein sites are shown. Nodes are color coded by region (inset).

FIG. 6. Contribution of the UR/NR-relevant sites from individual HCV proteins identified using four filtering methods.

FIG. 7. Total strength of association between sites of individual HCV proteins and the UR/NR outcome.

FIG. 8. Relative significance of association of the HCV polyprotein sites to the UR/NR outcome. Only sites with relative significance of >0.5 are shown. Color code: black, E1; red, E2; green, P7; yellow, NS2; blue, NS5A; and cyan, NS5B. Relative significance is a ratio between the mutual information brought by each feature and the greatest mutual information.

Association of protein physicochemical properties with IFN-RBV resistance. The observation of coordinated substitutions in all HCV proteins suggests extensive interrelationships among phenotypic traits encoded by these proteins and an important role of these interrelationships in defining HCV evolution toward IFN-RBV resistance. Although not clearly determined, these phenotypic traits can be further analyzed using amino acid physicochemical properties as a quantitative approximation to phenotype. The factors affecting sequence variation and diversity should also be reflected in the physicochemical properties of the HCV polyprotein. Herein, the physicochemical space dispersion of the HCV variants from the UR/NR Virahep-C cases (18) was examined using a linear projection technique (32). The analysis was conducted using polymorphic sites of the HCV polyprotein or individual gene products (see Table S1 in the supplemental material). The polymorphic sites from each protein were converted into vectors of amino acid physicochemical properties (6). For each protein, these vectors were used to generate a multidimensional physicochemical space and project this space into the optimized linear two-dimensional (2D) spaces. The probabilistic mapping of NR and UR outcomes in these 2D physicochemical spaces is shown in Fig. 9.

This analysis showed that the probability of outcomes mapped in the optimized 2D physicochemical spaces of the polyprotein, E2, and NS5A was distributed in the least convoluted way, providing almost equal representations of UR and NR (Fig. 9). These observations suggest that the physicochemical properties of all HCV proteins are related to outcome, albeit to various degrees.

FIG. 9. Physicochemical projection of HCV polyprotein and individual proteins. Shown are the optimized 2D linear projections. Variation in shade of colors reflects probability estimates for UR (red) and NR (blue) outcomes, with darker shades corresponding to greater probability values.

FIG. 10. 10-fold CV performance of the E2 BNC and NS5A BNC constructed using all polymorphic sites (black bar) and selected relevant sites (white bar). Results for BNCs with randomized labels are shown using patterned bars (black for all and white for selected sites).

Strong association of E2 and NS5A with IFN-RBV resistance. The results shown above strongly suggest that the IFN-RBV resistance is encoded in many regions of the HCV polyprotein, with E2 and NS5A being strongly linked to this resistance. To further investigate the strength of association of the IFN-RBV resistance with variation in the E2 and NS5A primary structure, BN classifiers (BNCs) were developed using polymorphic sites from these two proteins. The accuracy of performance of the models was evaluated using the 10-fold CV protocol. The results of the evaluation are shown in Fig. 10. The E2 and NS5A BNCs constructed using all polymorphic sites were found to be 82.5% and 90% accurate in the prediction of outcomes in the 10-fold CV, respectively. BNCs constructed using 15 sites selected from E2 and 9 sites selected from NS5A (Table 3) improved accuracies to 85% and 97.5%, respectively, while the randomized data sets produced BNCs showing accuracies of only 35% to 47.5% (Fig. 10). Thus, although the networks of sites from both proteins have a strong association with the IFN-RBV resistance, the NS5A BNCs significantly outperformed the E2 BNCs in the CV experiments.
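The comparison between classifiers trained on true and on randomized outcome labels can be sketched as follows. This is a minimal illustration under stated assumptions: a hand-written categorical naive Bayes model stands in for the BN classifiers of the study, the 40 toy sequences are synthetic, and the names `train_nb`, `predict_nb`, `cv_accuracy`, and `toy_sequences` are hypothetical helpers, not part of any published tool.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Fit a categorical naive Bayes model (add-one smoothing is applied at prediction)."""
    class_counts = Counter(labels)
    counts = defaultdict(Counter)   # counts[(site_index, label)][residue]
    alphabet = defaultdict(set)     # residues seen at each site
    for row, label in zip(rows, labels):
        for i, residue in enumerate(row):
            counts[(i, label)][residue] += 1
            alphabet[i].add(residue)
    return class_counts, counts, alphabet


def predict_nb(model, row):
    """Return the label with the highest smoothed log posterior."""
    class_counts, counts, alphabet = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)
        for i, residue in enumerate(row):
            n_states = len(alphabet[i] | {residue})
            score += math.log((counts[(i, label)][residue] + 1) / (n_label + n_states))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cv_accuracy(rows, labels, k=10, seed=0):
    """Fraction of sequences classified correctly in k-fold cross-validation."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    correct = 0
    for f in range(k):
        fold = order[f::k]
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        model = train_nb([rows[i] for i in train], [labels[i] for i in train])
        correct += sum(predict_nb(model, rows[i]) == labels[i] for i in fold)
    return correct / len(rows)


def toy_sequences():
    """Synthetic data: 40 two-site sequences, one informative site and one noise site."""
    flipped = {0, 11, 20, 31}       # sequences whose informative residue disagrees with the label
    rows, labels = [], []
    for n in range(40):
        label = "UR" if n % 2 == 0 else "NR"
        signal = "A" if label == "UR" else "T"
        if n in flipped:
            signal = "T" if signal == "A" else "A"
        noise = "K" if (n // 2) % 2 == 0 else "R"
        rows.append((signal, noise))
        labels.append(label)
    return rows, labels


rows, labels = toy_sequences()
real_acc = cv_accuracy(rows, labels)

shuffled = labels[:]
random.Random(7).shuffle(shuffled)  # randomized-label control
perm_acc = cv_accuracy(rows, shuffled)

print(f"true labels: {real_acc:.3f}  randomized labels: {perm_acc:.3f}")
```

When the randomized-label accuracy stays near chance while the true-label accuracy is high, the cross-validated result is unlikely to be an artifact of the small sample, which is the role the randomized data sets play in Fig. 10.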
Prediction of UR/NR outcomes using NS5A. A high accuracy of the BNC models described above suggests a strong association of coordinated substitutions in NS5A with evolution toward the IFN-RBV resistance. However, since these models were generated using only 40 sequences from 20 patients, it is critical to demonstrate that the interrelationships identified for these patients are representative of those for other patients. For this purpose, two predictive models were constructed using the same Virahep-C data set and tested using the HCV NS5A sequences from baseline specimens obtained from patients in the HALT-C study (131). Because no additional data were available for E2 from patients with NR and UR outcomes investigated in a single study, only the NS5A models were tested.

One model was constructed using physicochemical properties of five NS5A sites selected using a heuristic method (73). The secondary structure for sites at positions 2153 (projection X167 in Fig. 11) and 2413 (X492), the electrostatic charge for the site at position 2198 (X195), the polarity for the site at position 2280 (X281), and the molecular volume or size for the site at position 2320 (X328) were selected as the most relevant features for outcome in the Virahep-C data set. The LP model mapping UR and NR outcomes into the 2D space generated using linear projection from the 5D physicochemical space is shown in Fig. 11. Another model was constructed as a hybrid between the decision table and naïve Bayes (DTNB)-based machine-learning technique (56) using 12 NS5A sites: nine shown in Table 3 and three additional sites, at positions 2153, 2198, and 2413, used in the linear projection approach.

After a 10-fold CV, both Virahep-C models were tested on the HALT-C data set with 6 NS5A sequences obtained from UR and 12 from NR patients. The hybrid DTNB model showed an overall accuracy of 72.2% and the linear projection model showed an overall accuracy of 83.3% of outcome prediction for the HALT-C patients (Table 4). This finding suggests that, although many sites along the entire HCV polyprotein are relevant to development of the IFN-RBV resistance, the small number of features from the NS5A protein alone may be sufficient for the prediction of therapy outcomes.

FIG. 11. Projection of five selected physicochemical features of five NS5A sites from the HALT-C sequence data set onto the physicochemical space-based model derived from the Virahep-C sequence data set. Lines originating from the center of the graph are projections of five physicochemical features. Circles in the graph map the UR/NR outcomes of therapy for Virahep-C (unfilled circles) and HALT-C (filled circles). For color coding, see legend to Fig. 9.

TABLE 4. Validation of the NS5A Virahep-C models using the HALT-C NS5A sequences(a)
Column group: Validation (% accuracy)
(a) Shown are the overall prediction accuracies of the BNC (DTNB method) and physicochemical-based LP models using selected NS5A sites (see text for details).
(b) Average probability of correct classification in 10-fold CV.
(c) Average percent classification correctness in 10-fold CV.

Two important features of HCV infection, persistence following primary infection and resistance to IFN-based therapy, have been related to the extensive HCV genetic variability (39, 41). Although HCV has developed a very efficient capacity to escape from adaptive (15, 35, 104, 128) and innate immune responses (12, 13, 85, 126), ~20% to 30% of all HCV infections are cleared by the host (23) and 50% to 70% of chronic infections can be successfully treated with IFN-RBV (48, 52, 55, 80). The variation in response to therapy among HCV strains remains poorly understood. However, differential sensitivity of HCV genotypes to IFN therapy (52, 80) suggests that viral genetic factors play an important role in determining therapy outcomes. Despite a low degree of response to treatment during chronic infection, 80% to 98% of patients with acute HCV infection can achieve complete virological response to IFN therapy (51, 62), suggesting that HCV acquires a significant degree of IFN resistance during chronic infection. Taken together, these observations indicate a strong connection between the intrahost HCV evolution and success of the IFN-RBV therapy.

In the current study, an integrative approach was implemented for the evolutionary analysis of the HCV genome. This approach was based on modeling interrelationships between polymorphic sites along the entire HCV polyprotein and relating the modeled coordination among amino acid substitutions to the UR/NR outcomes of therapy. Models constructed here showed an extensive interdependence of all polymorphic sites within the HCV polyprotein, suggesting a significant coevolution among individual HCV proteins. The data indicate that all HCV proteins contain sites coordinating their polymorphism with sites in all other proteins (Fig. 2). A similar observation has been recently made using a correlation network analysis of the HCV genotype 1a full-genome sequences from untreated patients (17) and patients on therapy (7).

Among all connections identified using the polyprotein BN in this study, only 17.5% were among sites within individual proteins. It is interesting to note that E2 shows the most extensive coordination among its sites, with all other proteins having ~2 to 13 times fewer connections among intraprotein sites than E2. With 82.5% of all connections in the network being among proteins, HCV evolution is evidently defined by coadaptation among many phenotypic traits encoded by different HCV proteins.

Although all HCV proteins contribute to the network, the topological properties of sites differed among proteins. The core protein contributes fewer sites (n = 11) per its size than any other HCV protein. However, each core site forms ~2 to 4 times more links in the network than any site from other proteins (Table 1). This protein has 12.4 times more outgoing than incoming links in the polyprotein BN, while the ratio between outgoing and incoming links for all other proteins varies from 0.8 to 2.2 (Table 1). Another important feature of core connectivity in the polyprotein BN is that 98.6% of all core links are with other proteins. The presence of only two intraprotein links (polyprotein positions 90→110 and 47→29) makes the core protein the least intraconnected protein, indicating a minimal direct coordination among core polymorphic sites. Thus, the contribution of core to the network topology differs considerably from those of all other proteins, suggesting that this protein has a unique role in coordinating substitutions and defining heterogeneity at many sites of the HCV polyprotein.

This observation is in agreement with the multitude of functions performed by the core protein and emphasizes its important role in HCV infection. In addition to forming the nucleocapsid (105), this protein was shown to interfere with many cellular signaling pathways involved in apoptosis (134), transcription (60, 130), and transformation (21, 65, 102, 129). The core protein is also involved in lipid metabolism (10, 96). It inhibits the microsomal triglyceride transfer protein, binds to apolipoprotein AII, and induces accumulation of cytoplasmic lipid droplets (2). Core and NS5A are key factors for assembly of infectious particles. Both colocalize on the surface of lipid droplets, a proposed site for HCV particle assembly (4). With lipid droplets playing a crucial role in the assembly and release of infectious HCV particles (83), interactions involving domain 2 of core and domain 3 of NS5A (5, 14, 81, 82) are essential for virion production and, therefore, have a strong impact on infectivity and viral fitness. Mutation at position 147 in domain 2 of the core protein was found to affect adherence of core to lipid droplets and virus production (107). Our data show that this site has direct links in the polyprotein BN to sites in E1, E2, NS2, and domain 3 of NS5A. Another site, from domain 2 of core at position 161, linked to P7 in addition to these four proteins. All of these proteins play a role in the membrane-associated viral replication (86). These observations suggest coordination of heterogeneity across the HCV polyprotein related to viral production and the important role played by the core protein in this coordination.

Two proteins, E2 and NS5A, together contribute ~40% of all sites and ~62% of all links to the polyprotein BN and, therefore, essentially define the state of this entire network. In combination with E1, these three proteins contribute ~50% of all sites and ~77% of all links to the polyprotein BN. It is interesting that E2 and NS5A also mutually coordinate their heterogeneity (Fig. 3). Although coordination between sites from any two HCV proteins is a common feature of the polyprotein BN, this coordination is most extensive between sites of E2 and NS5A, owing to the large number of sites contributed by these two proteins to the network. Thus, the states of many sites in one of these two proteins reflect the states of many sites in the other protein, suggesting a high degree of coevolution between these two proteins. Additionally, it was observed that sites from E2 formed the strongest links with many other sites in the polyprotein as determined by CI testing (Fig. 1), among which were links between sites 482 and 642 in E2 (P = 2 × 10⁻⁹), 612 in E2 and 233 in E1 (P = 3 × 10⁻⁸), and 642 in E2 and 1756 in NS4B (P = 2 × 10⁻⁸). Taking into consideration that site 482 is from the CD81-binding region (45, 127) and site 612 from one of two E2 regions proposed to be involved in the viral fusion process (72, 91, 95), we speculate that the tight coordination between sites 482 and 642 as well as that between sites 612 and 233 is associated with viral entry.

Another important observation made in this study is that all HCV proteins have association with the UR/NR outcome of IFN-RBV therapy. Taking into consideration the aforementioned extensive linkage among polymorphic sites from different proteins, this observation, although not surprising, reveals that the HCV response to immunomodulatory therapy is a very complex trait involving numerous viral functions that require coordination. All networks constructed for individual proteins included the UR/NR outcome as a variable (Table 2). However, this observation cannot be unequivocally interpreted in terms of equal contribution of each protein to the IFN-RBV response. Nevertheless, it suggests that the genome-wide coordination among sites is important for this response, with some proteins possibly playing accessory roles and reflecting the IFN-RBV-related changes in other proteins that are mainly responsible for resistance. The analysis conducted here revealed that sites substantially associated with the outcome are scattered along the entire HCV polyprotein. Among the sites with relative significance of >0.5 (Fig. 8) are sites in E1 (n = 1), E2 (n = 8), P7 (n = 2), NS2 (n = 1), NS5A (n = 9), and NS5B (n = 4). Two proteins, E2 and NS5A, shared 68% of these 25 sites, suggesting their strong connection to IFN-RBV resistance. E2, NS5A, and P7 have, respectively, 6.8%, 9.0%, and 11.7% of their polymorphic sites being highly relevant to the therapy outcome, while all other proteins have only 1.5% to 3.1% of these sites.

One surprising finding was that five among the eight sites most relevant to therapy outcome are located in HVR1 of the E2 protein (aa 384 to 410), emphasizing a strong connection of HVR1 heterogeneity to IFN-RBV resistance. Association of HVR1 sites with outcomes of therapy can also be found in the correlation networks (7). However, the significance of these observations is not apparent. Analysis of HVR1 connectivity in the polyprotein BN showed that polymorphic HVR1 sites have a total of 140 links to all HCV proteins, with each HVR1 site being connected to three to nine sites in the HCV polyprotein. Such an extensive interdependence of HVR1 sites with many sites across the entire HCV polyprotein (Fig. 3), in conjunction with the earlier similar observations using network analysis (17), suggests that the HVR1 substitutions are not random and that HVR1 evolution is substantially coordinated with all HCV proteins. Coordination of HVR1 heterogeneity is especially noticeable with E1, E2, and NS5A, which share, respectively, 15%, 26.4%, and 14% of all HVR1 links in the polyprotein BN, while any other HCV protein shares 3.6% to 9.3% of HVR1 links.

HVR1 contains antigenic epitopes (66, 67, 112, 121) with HCV neutralizing activity (40). Rapid HVR1 evolution is associated with immune escape (70). However, the conservation of the HVR1 physicochemical properties and conformation (94) argues that this region is significantly functionally constrained despite its extensive heterogeneity. The observation that compensatory mutations in the ectodomain of E2 (46) and the I347L mutation in E1 compensate for HCV fusion impairment (9) in HCV mutants whose HVR1 has been excised suggests potential functional relationships of this region with other parts of the HCV genome. HVR1 was shown to be involved in the SR-B1-facilitated entry of HCV pseudoparticles in cell culture (11). It was suggested that HVR1 plays an important role in HCV entry by modulating receptor recognition and affects lipoprotein composition and infectivity of viral particles (9). HVR1 heterogeneity was also associated with the development of resistance to therapy (74, 87, 117, 118). We hypothesize that complex functional relationships of HVR1 are reflected in coordinated evolution with other HCV proteins and that HVR1 mirrors the evolution of the entire HCV genome, including evolution toward the IFN-RBV resistance.

There are many sites from different HCV proteins strongly linked to the IFN-RBV resistance (Fig. 1 and 8). However, consideration of individual sites allows only for the identification of connections to the therapy outcome in the form of a trend and does not have a strong predictive power. Correlation of the IFN-RBV therapy outcomes has been reported with site polymorphisms in the core (36), E2 (87, 106), and NS5A (88, 106) proteins. Although these observations revealed numerous associations between the HCV genetic polymorphism and evolution toward IFN-RBV resistance, these associations were never explored in terms of their interrelationships and formulated into an integrative model capable of revealing accurate quantitative connections between HCV genetic changes and therapy outcomes.

The current report presents several probabilistic models connecting the UR/NR outcome to coordinated changes at polymorphic sites across the entire HCV polyprotein as well as from individual HCV proteins. Analysis of individual sites without consideration of their relationships seems inefficient in detecting a reliable connection to the outcomes. Only 3 among 25 sites having the highest value of mutual information with the outcome (Fig. 8) were found to be directly linked to the outcome in the polyprotein BN (Fig. 2). The same 3 sites, 2280, 2283, and 2633, are among the 14 most relevant sites extracted from the HCV polyprotein using correlation-based feature selection (Table 3) and among 18 sites that have the strongest connections to outcome in the undirected dependence graph (Fig. 1). All computational techniques used in this study ranked the contribution of various sites differently. For example, only 12 sites were shared by 18 sites shown in Fig. 1 and 25 sites shown in Fig. 8. Although sites 2280 and 2283 from NS5A and site 2633 from NS5B were frequently identified as most relevant to the IFN-RBV response, analysis of states at these sites is not sufficient for an accurate prediction of the therapy outcome (data not shown). Such a prediction requires the use of a combination of sites selected for their collective contribution to the outcome.

For that purpose, we conducted a series of experiments for selection of site sets most relevant to the therapy outcome from the entire HCV polyprotein and individual proteins (Table 3). Two proteins, E2 and NS5A, were explored in detail. As mentioned earlier, both proteins have many polymorphic sites and contributed many links to the polyprotein BN. These two proteins consistently made substantial contributions of the most relevant sites identified using different feature selection techniques (Fig. 5 and 6). Probabilistic mapping of UR and NR outcomes in 2D physicochemical space showed an equally representative distribution of the outcome probabilities for E2, NS5A, and the polyprotein (Fig. 9). All these findings strongly suggest that these two proteins have a strong connection to therapy response and can be used for the accurate prediction of therapeutic outcomes. However, as can be seen in Fig. 10, the 10-fold CV experiments showed that the NS5A BN outperforms the E2 BN constructed using complete sets of polymorphic sites (90% versus 82.5% accuracy) or feature-selected sites (97.5% versus 85% accuracy). These results, taken together with the observation that NS5A contains two of six sites directly connected to the therapy outcome in the polyprotein BN while E2 has no direct links to the outcome, suggest that NS5A has a very strong relevance to evolution toward the IFN-RBV resistance.

Two sites, at positions 2376 and 2414 in NS5A, have experimentally been associated with the development of resistance to RBV (97). It is important to note that these two sites were consistently selected as being relevant to the therapy outcome (Table 3 and Fig. 8), indicating that the NS5A BN as well as polyprotein BN constructed using all or feature-selected sites includes links that reflect contribution of RBV to therapy. Site 2414 located in domain 3 of NS5A is linked to site 161 in domain 2 of core in the polyprotein BN. As mentioned earlier, both domains are involved in protein-protein interactions between these two proteins, association with lipid droplets, and assembly and release of viral particles (81, 83). There seems to be a linkage between coevolution of the core and NS5A proteins and RBV resistance, and this resistance is associated with interaction between these two proteins. The final validation of the two predictive NS5A Virahep-C models using the HALT-C data strongly confirms a robust connection between coordination among the NS5A sites and IFN-RBV resistance. Additionally, it shows that a small number of features from NS5A alone may be sufficient for the prediction of therapy outcomes (Fig. 11). This finding suggests that analysis of a very few sites from a small HCV genomic region, such as NS5A, may be used for monitoring sensitivity to the IFN-RBV therapy.

A general interconnectivity among HCV proteins was comparable for the 40 Virahep-C sequences and the 298 HCV genotype 1a full-genome sequences obtained from GenBank (Fig. 3 and 4), indicating that the modeled coordination among substitutions is essentially similar for all HCV variants from treated and treatment-naïve patients. This observation additionally suggests that the development of resistance during immunomodulatory therapy is generally shaped by selection pressures similar to the HCV evolution in untreated patients.

However, there are some important differences between the polyprotein BNs generated using sequences from treated and treatment-naïve patients. The GenBank sequences from untreated patients contain more polymorphic sites (n = 1,296) than the Virahep-C sequences (n = 551). Despite this fact, the Virahep-C sequences contain 25 polymorphic sites that are conserved in the GenBank sequences. These sites are distributed within E1 (n = 3), E2 (n = 4), P7 (n = 1), NS2 (n = 2), NS3 (n = 6), NS4A (n = 1), NS4B (n = 3), NS5A (n = 3), and NS5B (n = 2). Among them, sites at positions 230 in E1, 768 in P7, and 1461 and 1592 in NS3 are the most relevant to the IFN-RBV response (Table 3). Furthermore, the two BNs had topological differences in the number of interprotein links, most notably the 1.7- and 2-fold proportional increase in the number of links between E1 and E2 and between E2 and NS5A in the Virahep-C BN compared to those in the GenBank BN (Fig. 3). These observations suggest that de-

responsible for the response to IFN, there is no single IFN resistance mutation. Once established, the wide-ranging epistatic connectivity among sites involved in the IFN response may not be rapidly reverted even with reduction of the selection pressure in the absence of treatment, thus locking the HCV genome into the state of resistance to IFN. Without being eliminated by IFN-RBV therapy, these variants can continue to circulate among human hosts. In contrast, IFN-RBV-sensitive strains are being removed from circulation. This consideration implies that the current widespread adoption of IFN-based therapy, although extremely beneficial for individual patients with SVR, may affect the composition of the circulating HCV population and enlarge the reservoir of IFN-resistant HCV, a potentially alarming public health issue that warrants further investigation.

We are grateful to Chong-Gee Teo for critical review and discussion of findings in this paper as well as to two anonymous reviewers for important comments.

This work was supported by CDC intramural funding.

This information has not been formally disseminated by the Centers for Disease Control and Prevention/Agency for Toxic Substances and Disease Registry. It does not represent and should not be construed to represent any agency determination or policy.

1. Abid, K., R. Quadri, and F. Negro. 2000. Hepatitis C virus, the E2 envelope protein, and alpha-interferon resistance. Science 287:1555.
2. Andre, P., G. Perlemuter, A. Budkowska, C. Brechot, and V. Lotteau. 2005. Hepatitis C virus particles and lipoprotein metabolism. Semin. Liver Dis.
3. Reference deleted.
4. Appel, N., et al. 2008. Essential role of domain III of nonstructural protein 5A for hepatitis C virus infectious particle assembly. PLoS Pathog.
5. Appel, N., et al. 2008. Essential role of domain III of nonstructural protein
spite the similarity of these two networks, there are distinct 5A for hepatitis C virus infectious particle assembly. PLoS Pathog.
differences in coordination among substitutions in HCV 6. Atchley, W. R., J. Zhao, A. D. Fernandes, and T. Druke. 2005. Solving the
from treated and treatment-naı¨ve patients.
protein sequence metric problem. Proc. Natl. Acad. Sci. U. S. A. 102:6395–
IFN is a major component of innate immunity (19, 100).
Several HCV proteins are involved in modulation of the host 7. Aurora, R., M. J. Donlin, N. A. Cannon, and J. E. Tavis. 2009. Genome-
wide hepatitis C virus amino acid covariance networks can predict response IFN response (12, 13, 85, 126). RBV used as a component of to antiviral therapy in humans. J. Clin. Invest. 119:225–236.
combined therapy seems to facilitate early response to IFN 8. Bagaglio, S., et al. 2003. Genetic heterogeneity of hepatitis C virus (HCV)
(43) rather than playing a strong independent role. Resistance in clinical strains of HIV positive and HIV negative patients chronically
infected with HCV genotype 3a. J. Biol. Regul. Homeost. Agents 17:153–
to IFN is not clearly linked to any specific mutation within the HCV genome. As shown in this study, HCV adaptation to IFN 9. Bankwitz, D., et al. 2010. Hepatitis C virus hypervariable region 1 modu-
lates receptor interactions, conceals the CD81 binding site, and protects is a complex trait encoded in the interrelationships among conserved neutralizing epitopes. J. Virol. 84:5751–5763.
many sites along the entire HCV polyprotein. The extensive 10. Barba, G., et al. 1997. Hepatitis C virus core protein shows a cytoplasmic
coevolution among HCV amino acid sites leads to a significant localization and associates to cellular lipid storage droplets. Proc. Natl.
Acad. Sci. U. S. A. 94:1200–1205.
integration among the HCV IFN-response-related phenotypic 11. Bartosch, B., et al. 2003. Cell entry of hepatitis C virus requires a set of
traits. Each HCV protein contributes to the IFN resistance, co-receptors that include the CD81 tetraspanin and the SR-B1 scavenger albeit to a different degree. With E2 and NS5A contributing receptor. J. Biol. Chem. 278:41624–41630.
12. Blindenbacher, A., et al. 2003. Expression of hepatitis c virus proteins
many polymorphic sites to the network and generating a broad inhibits interferon alpha signaling in the liver of transgenic mice. Gastro- epistatic connectivity to sites in other HCV proteins, intrahost 13. Bode, J. G., et al. 2003. IFN-alpha antagonistic activity of HCV core protein
HCV evolution toward the IFN resistance is essentially defined involves induction of suppressor of cytokine signaling-3. FASEB J. 17:488–
and, therefore, can be accurately predicted using a carefully selected combination of sites from these two proteins.
14. Boulant, S., et al. 2006. Structural determinants that target the hepatitis C
virus core protein to lipid droplets. J. Biol. Chem. 281:22236–22247.
Treatment with IFN does not exert an unusual selection 15. Brady, M. T., A. J. MacDonald, A. G. Rowan, and K. H. Mills. 2003.
pressure on HCV, unlike treatment using direct-acting antivi- Hepatitis C virus non-structural protein 4 suppresses Th1 responses by ral compounds, but rather generates an unusually strong se- stimulating IL-10 production from monocytes. Eur. J. Immunol. 33:3448–
lection pressure of the innate immune system. Thus, HCV 16. Brieman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classi-
strains capable of resisting or evolving toward resistance to fication and regression trees. Chapman & Hall/CRC, Boca Raton, FL.
17. Campo, D. S., Z. Dimitrova, R. J. Mitchell, J. Lara, and Y. Khudyakov.
immunomodulatory therapy are most efficient in overcoming 2008. Coordinated evolution of the hepatitis C virus. Proc. Natl. Acad. Sci.
the host immune system. With the entire HCV genome being U. S. A. 105:9685–9690.
18. Cannon, N. A., M. J. Donlin, X. Fan, R. Aurora, and J. E. Tavis. 2008. Hepatitis C virus diversity and evolution in the full open-reading frame during antiviral therapy. PLoS One 3:e2123.
19. Carney, D. S., and M. Gale, Jr. 2006. HCV regulation of host defense, p. 375–398. In Seng-Lai Tan (ed.), Hepatitis C viruses. Horizon Bioscience, Norfolk, United Kingdom.
20. Castelain, S., et al. 2002. Variability of the nonstructural 5A protein of hepatitis C virus type 3a isolates and relation to interferon sensitivity. J. Infect. Dis. 185:573–583.
21. Chang, J., et al. 1998. Hepatitis C virus core from two different genotypes has an oncogenic potential but is not sufficient for transforming primary rat embryo fibroblasts in cooperation with the H-ras oncogene. J. Virol. 72:
22. Charniak, E. 1991. Bayesian networks without tears. AI Mag. 12:50–63.
23. Chen, S. L., and T. R. Morgan. 2006. The natural history of hepatitis C virus (HCV) infection. Int. J. Med. Sci. 3:47–52.
24. Chickering, D. M., D. Heckerman, and C. Meek. 2004. Large-sample learning of Bayesian networks is NP-hard. J. Mach. Learn. Res. 5:1287–1330.
25. Choo, Q. L., et al. 1990. Hepatitis C virus: the major causative agent of viral non-A, non-B hepatitis. Br. Med. Bull. 46:423–441.
26. Conjeevaram, H. S., et al. 2006. Peginterferon and ribavirin treatment in African American and Caucasian American patients with hepatitis C genotype 1. Gastroenterology 131:470–477.
27. Contreras, A. M., et al. 2002. Viral RNA mutations are region specific and increased by ribavirin in a full-length hepatitis C virus replication system. J. Virol. 76:8505–8517.
28. Cooper, G. F., and E. Herskovits. 1992. A Bayesian method for the induction of probabilistic networks from data. Mach. Learn. 9:309–347.
29. Cox, L. A. 2006. Detecting causal non-linear exposure-response relations in epidemiological data. Dose Response 4:119–132.
30. Daelemans, W., V. Hoste, F. De Meulder, and B. Naudts. 2003. Combined optimization of feature selection and algorithm parameters in machine learning of language. Machine Learning: ECML 2003 2837:84–95.
31. Dash, D., and M. J. Druzdzel. 2003. Robust independence testing for constraint-based learning of causal structure, p. 167–174. In The 19th Annual Conference on Uncertainty in Artificial Intelligence (UAI-03). Morgan Kaufmann, San Francisco, CA.
32. Demsar, J., G. Leban, and B. Zupan. 2005. FreeViz - an intelligent visualization approach for class-labeled multidimensional data sets, p. 61–66. In Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP) Workshop, Aberdeen, United Kingdom.
33. Deuffic-Burban, S., T. Poynard, M. S. Sulkowski, and J. B. Wong. 2007. Estimating the future health burden of chronic hepatitis C and human immunodeficiency virus infections in the United States. J. Viral Hepat.
34. Duverlie, G., et al. 1998. Sequence analysis of the NS5A protein of European hepatitis C virus 1b isolates and relation to interferon sensitivity. J. Gen. Virol. 79:1373–1381.
35. Emi, K., et al. 1999. Magnitude of activity in chronic hepatitis C is influenced by apoptosis of T cells responsible for hepatitis C virus. J. Gastroenterol. Hepatol. 14:1018–1024.
36. Enomoto, N., and S. Maekawa. 2010. HCV genetic elements determining the early response to peginterferon and ribavirin therapy. Intervirology
37. Enomoto, N., et al. 1995. Comparison of full-length sequences of interferon-sensitive and resistant hepatitis C virus 1b. Sensitivity to interferon is conferred by amino acid substitutions in the NS5A region. J. Clin. Invest.
38. Enomoto, N., et al. 1996. Mutations in the nonstructural protein 5A gene and response to interferon in patients with chronic hepatitis C virus 1b infection. N. Engl. J. Med. 334:77–81.
39. Farci, P. 2001. Hepatitis C virus. The importance of viral heterogeneity. Clin. Liver Dis. 5:895–916.
40. Farci, P., et al. 1996. Prevention of hepatitis C virus infection in chimpanzees by hyperimmune serum against the hypervariable region 1 of the envelope 2 protein. Proc. Natl. Acad. Sci. U. S. A. 93:15394–15399.
41. Farci, P., et al. 2002. Early changes in hepatitis C viral quasispecies during interferon therapy predict the therapeutic outcome. Proc. Natl. Acad. Sci. U. S. A. 99:3081–3086.
42. Feld, J. J., and J. H. Hoofnagle. 2005. Mechanism of action of interferon and ribavirin in treatment of hepatitis C. Nature 436:967–972.
43. Feld, J. J., et al. 2010. Ribavirin improves early responses to peginterferon through improved interferon signaling. Gastroenterology 139:154–162.
44. Feld, J. J., et al. 2007. Hepatic gene expression during treatment with peginterferon and ribavirin: identifying molecular pathways for treatment response. Hepatology 46:1548–1563.
45. Flint, M., et al. 1999. Characterization of hepatitis C virus E2 glycoprotein interaction with a putative cellular receptor, CD81. J. Virol. 73:6235–6244.
46. Forns, X., et al. 2000. Hepatitis C virus lacking the hypervariable region 1 of the second envelope protein is infectious and causes acute resolving or persistent infection in chimpanzees. Proc. Natl. Acad. Sci. U. S. A. 97:
47. Frese, M., T. Pietschmann, D. Moradpour, O. Haller, and R. Bartenschlager. 2001. Interferon-alpha inhibits hepatitis C virus subgenomic RNA replication by an MxA-independent pathway. J. Gen. Virol. 82:723–733.
48. Fried, M. W., et al. 2002. Peginterferon alfa-2a plus ribavirin for chronic hepatitis C virus infection. N. Engl. J. Med. 347:975–982.
49. Gale, M. J., Jr., et al. 1997. Evidence that hepatitis C virus resistance to interferon is mediated through repression of the PKR protein kinase by the nonstructural 5A protein. Virology 230:217–227.
50. Ge, D., et al. 2009. Genetic variation in IL28B predicts hepatitis C treatment-induced viral clearance. Nature 461:399–401.
51. Gerlach, J. T., et al. 2003. Acute hepatitis C: high rate of both spontaneous and treatment-induced viral clearance. Gastroenterology 125:80–88.
52. Ghany, M. G., D. B. Strader, D. L. Thomas, and L. B. Seeff. 2009. Diagnosis, management, and treatment of hepatitis C: an update. Hepatology 49:
53. Goswami, B. B., R. Crea, J. H. Van Boom, and O. K. Sharma. 1982. 2′-5′-Linked oligo(adenylic acid) and its analogs. A new class of inhibitors of mRNA methylation. J. Biol. Chem. 257:6867–6870.
54. Guyon, I., and A. Elisseeff. 2003. An introduction to variable and feature selection. J. Mach. Learn. Res. 3:1157–1182.
55. Hadziyannis, S. J., et al. 2004. Peginterferon-alpha2a and ribavirin combination therapy in chronic hepatitis C: a randomized study of treatment duration and ribavirin dose. Ann. Intern. Med. 140:346–355.
56. Hall, M., and E. Frank. 2008. Combining naive Bayes and decision tables, p. 318–319. In D. Wilson and H. Chad (ed.), Proceedings of the 21st Florida Artificial Intelligence Research Society Conference. AAAI Press, Coconut
57. Hall, M. A. 1999. Correlation-based feature subset selection for machine learning. Ph.D. thesis, Department of Computer Science, University of Waikato, Waikato, New Zealand.
58. Hall, T. A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp.
59. Helbig, K. J., D. T. Lau, L. Semendric, H. A. Harley, and M. R. Beard. 2005. Analysis of ISG expression in chronic hepatitis C identifies viperin as a potential antiviral effector. Hepatology 42:702–710.
60. Hsieh, T. Y., et al. 1998. Hepatitis C virus core protein interacts with heterogeneous nuclear ribonucleoprotein K. J. Biol. Chem. 273:17651–
61. Huber, M., et al. 2005. Interferon alpha-2a plus ribavirin 1,000/1,200 mg versus interferon alpha-2a plus ribavirin 600 mg for chronic hepatitis C infection in patients on opiate maintenance treatment: an open-label randomized multicenter trial. Infection 33:25–29.
62. Jaeckel, E., et al. 2001. Treatment of acute hepatitis C with interferon alfa-2b. N. Engl. J. Med. 345:1452–1457.
63. Jensen, F. 1996. An introduction to Bayesian networks. UCL Press, London, United Kingdom.
64. Jiang, D., et al. 2008. Identification of three interferon-inducible cellular enzymes that inhibit the replication of hepatitis C virus. J. Virol. 82:1665–1678.
65. Jin, D. Y., et al. 2000. Hepatitis C virus core protein-induced loss of LZIP function correlates with cellular transformation. EMBO J. 19:729–740.
66. Kato, N., et al. 1994. Genetic drift in hypervariable region 1 of the viral genome in persistent hepatitis C virus infection. J. Virol. 68:4776–4784.
67. Kato, N., et al. 1993. Humoral immune response to hypervariable region 1 of the putative envelope glycoprotein (gp70) of hepatitis C virus. J. Virol.
68. Kraus, M. R., et al. 2001. Compliance with therapy in patients with chronic hepatitis C: associations with psychiatric symptoms, interpersonal problems, and mode of acquisition. Dig. Dis. Sci. 46:2060–2065.
69. Kullback, S., and R. A. Leibler. 1951. On information and sufficiency. Ann. Math. Stat. 22:79–86.
70. Kurosaki, M., N. Enomoto, F. Marumo, and C. Sato. 1993. Rapid sequence variation of the hypervariable region of hepatitis C virus during the course of chronic infection. Hepatology 18:1293–1299.
71. Lauritzen, S. L. 1996. Graphical models. Clarendon Press, Oxford, United Kingdom.
72. Lavillette, D., et al. 2007. Characterization of fusion determinants points to the involvement of three discrete regions of both E1 and E2 glycoproteins in the membrane fusion process of hepatitis C virus. J. Virol. 81:8752–8765.
73. Leban, G., I. Bratko, U. Petrovic, T. Curk, and B. Zupan. 2005. VizRank: finding informative data projections in functional genomics by machine
74. Le Guillou-Guillemette, H., et al. 2007. Genetic diversity of the hepatitis C virus: impact and issues in the antiviral therapy. World J. Gastroenterol.
75. Lopez-Labrador, F. X., et al. 1999. Relationship of the genomic complexity of hepatitis C virus with liver disease severity and response to interferon in patients with chronic HCV genotype 1b infection [correction of interferon].
76. Lutchman, G., et al. 2007. Mutation rate of the hepatitis C virus NS5B in patients undergoing treatment with ribavirin monotherapy. Gastroenterol-
77. Maag, D., C. Castro, Z. Hong, and C. E. Cameron. 2001. Hepatitis C virus RNA-dependent RNA polymerase (NS5B) as a mediator of the antiviral activity of ribavirin. J. Biol. Chem. 276:46094–46098.
78. Magiorkinis, G., et al. 2009. The global spread of hepatitis C virus 1a and 1b: a phylodynamic and phylogeographic analysis. PLoS Med. 6:e1000198.
79. Mangoni, E. D., D. M. Forton, G. Ruggiero, and P. Karayiannis. 2003. Hepatitis C virus E2 and NS5A region variability during sequential treatment with two interferon-alpha preparations. J. Med. Virol. 70:62–73.
80. Manns, M. P., et al. 2001. Peginterferon alfa-2b plus ribavirin compared with interferon alfa-2b plus ribavirin for initial treatment of chronic hepatitis C: a randomised trial. Lancet 358:958–965.
81. Masaki, T., et al. 2008. Interaction of hepatitis C virus nonstructural protein 5A with core protein is critical for the production of infectious virus particles. J. Virol. 82:7964–7976.
82. McLauchlan, J. 2000. Properties of the hepatitis C virus core protein: a structural protein that modulates cellular processes. J. Viral Hepat. 7:2–14.
83. McLauchlan, J. 2009. Hepatitis C virus: viral proteins on the move. Biochem. Soc. Trans. 37:986–990.
84. Melen, K., P. Keskinen, A. Lehtonen, and I. Julkunen. 2000. Interferon-induced gene expression and signaling in human hepatoma cell lines. J. Hepatol. 33:764–772.
85. Miller, K., et al. 2004. Effects of the hepatitis C virus core protein on innate cellular defense pathways. J. Interferon Cytokine Res. 24:391–402.
86. Moradpour, D., et al. 2003. Membrane association of hepatitis C virus nonstructural proteins and identification of the membrane alteration that harbors the viral replication complex. Antiviral Res. 60:103–109.
87. Moribe, T., et al. 1995. Hepatitis C viral complexity detected by single-strand conformation polymorphism and response to interferon therapy.
88. Munoz de Rueda, P., et al. 2008. Mutations in E2-PePHD, NS5A-PKRBD, NS5A-ISDR, and NS5A-V3 of hepatitis C virus genotype 1 and their relationships to pegylated interferon-ribavirin treatment responses. J. Virol.
89. Murakami, T., et al. 1999. Mutations in nonstructural protein 5A gene and response to interferon in hepatitis C virus genotype 2 infection. Hepatology
89a. National Institutes of Health. 2002. NIH consensus statement on management of hepatitis C: 2002. NIH Consens. State Sci. Statements 19(3):1–46.
90. Neumann, A. U., et al. 2000. Differences in viral dynamics between genotypes 1 and 2 of hepatitis C virus. J. Infect. Dis. 182:28–35.
91. Pacheco, B., et al. 2006. Membrane-perturbing properties of three peptides corresponding to the ectodomain of hepatitis C virus E2 envelope protein. Biochim. Biophys. Acta 1758:755–763.
92. Pascu, M., et al. 2004. Sustained virological response in hepatitis C virus type 1b infected patients is predicted by the number of mutations within the NS5A-ISDR: a meta-analysis focused on geographical differences. Gut
93. Pawlotsky, J. M., et al. 1999. Evolution of the hepatitis C virus second envelope protein hypervariable region in chronically infected patients receiving alpha interferon therapy. J. Virol. 73:6490–6499.
94. Penin, F., et al. 2001. Conservation of the conformation and positive charges of hepatitis C virus E2 envelope glycoprotein hypervariable region 1 points to a role in cell attachment. J. Virol. 75:5703–5710.
95. Perez-Berna, A. J., et al. 2008. Interaction of the most membranotropic region of the HCV E2 envelope glycoprotein with membranes. Biophysical characterization. Biophys. J. 94:4737–4750.
96. Perlemuter, G., et al. 2002. Hepatitis C virus core protein inhibits microsomal triglyceride transfer protein activity and very low density lipoprotein secretion: a model of viral-related steatosis. FASEB J. 16:185–194.
97. Pfeiffer, J. K., and K. Kirkegaard. 2005. Ribavirin resistance in hepatitis C virus replicon-containing cell lines conferred by changes in the cell line or mutations in the replicon RNA. J. Virol. 79:2346–2355.
98. Polyak, S. J., et al. 2000. The protein kinase-interacting domain in the hepatitis C virus envelope glycoprotein-2 gene is highly conserved in genotype 1-infected patients treated with interferon. J. Infect. Dis. 182:397–404.
99. Puig-Basagoiti, F., et al. 2005. Dynamics of hepatitis C virus NS5A quasispecies during interferon and ribavirin therapy in responder and non-responder patients with genotype 1b chronic hepatitis C. J. Gen. Virol.
100. Pulaski, B. A., M. J. Smyth, and S. Ostrand-Rosenberg. 2002. Interferon-gamma-dependent phagocytic cells are a critical component of innate immunity against metastatic mammary carcinoma. Cancer Res. 62:4406–4412.
101. Quinlan, R. J. 1986. Induction of decision trees. Mach. Learn. 1:81–106.
102. Ray, R. B., L. M. Lagging, K. Meyer, and R. Ray. 1996. Hepatitis C virus core protein cooperates with ras and transforms primary rat embryo fibroblasts to tumorigenic phenotype. J. Virol. 70:4438–4443.
103. Romero-Gomez, M., et al. 2005. Insulin resistance impairs sustained response rate to peginterferon plus ribavirin in chronic hepatitis C patients.
104. Saito, K., M. Ait-Goughoulte, et al. 2008. Hepatitis C virus inhibits cell surface expression of HLA-DR, prevents dendritic cell maturation, and induces interleukin-10 production. J. Virol. 82:3320–3328.
105. Santolini, E., G. Migliaccio, and N. Lamonica. 1994. Biosynthesis and biochemical properties of the hepatitis C virus core protein. J. Virol. 68:
106. Sarrazin, C., et al. 2000. Mutations within the E2 and NS5A protein in patients infected with hepatitis C virus type 3a and correlation with treatment response. Hepatology 31:1360–1370.
107. Shavinskaya, A., S. Boulant, F. Penin, J. McLauchlan, and R. Bartenschlager. 2007. The lipid droplet binding domain of hepatitis C virus core protein is a major determinant for efficient virus assembly. J. Biol. Chem.
108. Simmonds, P., et al. 2005. Consensus proposals for a unified system of nomenclature of hepatitis C virus genotypes. Hepatology 42:962–973.
109. Simmonds, P., et al. 1993. Classification of hepatitis C virus into six major genotypes and a series of subtypes by phylogenetic analysis of the NS-5 region. J. Gen. Virol. 74:2391–2399.
110. Suppiah, V., et al. 2009. IL28B is associated with response to chronic hepatitis C interferon-alpha and ribavirin therapy. Nat. Genet. 41:1100–1104.
111. Tam, R. C., et al. 1999. Ribavirin polarizes human T cell responses towards a type 1 cytokine profile. J. Hepatol. 30:376–382.
112. Taniguchi, S., et al. 1993. A structurally flexible and antigenically variable N-terminal domain of the hepatitis C virus E2/NS1 protein: implication for an escape from antibody. Virology 195:297–301.
113. Taylor, D. R., S. T. Shi, P. R. Romano, G. N. Barber, and M. M. Lai. 1999. Inhibition of the interferon-inducible protein kinase PKR by HCV E2 protein. Science 285:107–110.
114. Thomas, D. L., et al. 2009. Genetic variation in IL28B and spontaneous clearance of hepatitis C virus. Nature 461:798–801.
115. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
116. Tillmann, H. L., et al. 2010. A polymorphism near IL28B is associated with spontaneous clearance of acute hepatitis C virus and jaundice. Gastroen-
117. Torres-Puente, M., et al. 2008. Genetic variability in hepatitis C virus and its role in antiviral treatment response. J. Viral Hepat. 15:188–199.
118. Toyoda, H., et al. 1997. Quasispecies nature of hepatitis C virus and response to alpha interferon: significance as a predictor of direct response to interferon. J. Hepatol. 26:6–13.
119. Veillon, P., C. Payan, H. Le Guillou-Guillemette, C. Gaudy, and F. Lunel. 2007. Quasispecies evolution in NS5A region of hepatitis C virus genotype 1b during interferon or combined interferon-ribavirin therapy. World J.
120. von Wagner, et al. 2008. Placebo-controlled trial of 400 mg amantadine combined with peginterferon alfa-2a and ribavirin for 48 weeks in chronic hepatitis C virus-1 infection. Hepatology 48:1404–1411.
121. Weiner, A. J., et al. 1992. Evidence for immune selection of hepatitis C virus (HCV) putative envelope glycoprotein variants: potential role in chronic HCV infections. Proc. Natl. Acad. Sci. U. S. A. 89:3468–3472.
122. Wiese, M., F. Berr, M. Lafrenz, H. Porst, and U. Oesen. 2000. Low frequency of cirrhosis in a hepatitis C (genotype 1b) single-source outbreak in Germany: a 20-year multicenter study. Hepatology 32:91–96.
123. Wiese, M., et al. 2005. Outcome in a hepatitis C (genotype 1b) single source outbreak in Germany - a 25-year multicenter study. J. Hepatol. 43:590–598.
124. World Health Organization. 1997. Hepatitis C. Wkly. Epidemiol. Rec.
125. World Health Organization. 1997. Hepatitis C: global prevalence. Wkly. Epidemiol. Rec. 72:341–344.
126. Xu, J., S. Liu, Y. Xu, P. Tien, and G. Gao. 2009. Identification of the nonstructural protein 4B of hepatitis C virus as a factor that inhibits the antiviral activity of interferon-alpha. Virus Res. 141:55–62.
127. Yagnik, A. T., et al. 2000. A model for the hepatitis C virus envelope glycoprotein E2. Proteins 40:355–366.
128. Yao, Z. Q., et al. 2005. SOCS1 and SOCS3 are targeted by hepatitis C virus core/gC1qR ligation to inhibit T-cell function. J. Virol. 79:15417–15429.
129. Yoshida, T., et al. 2002. Activation of STAT3 by the hepatitis C virus core protein leads to cellular transformation. J. Exp. Med. 196:641–653.
130. You, L. R., et al. 1999. Hepatitis C virus core protein interacts with cellular putative RNA helicase. J. Virol. 73:2841–2853.
131. Yuan, H. J., M. Jain, K. K. Snow, J. M. Gale, and W. M. Lee. 2010. Evolution of hepatitis C virus NS5A region in breakthrough patients during pegylated interferon and ribavirin therapy. J. Viral Hepat. 17:208–216.
132. Zhang, Y., et al. 2003. Ribavirin treatment up-regulates antiviral gene expression via the interferon-stimulated response element in respiratory syncytial virus-infected epithelial cells. J. Virol. 77:5933–5947.
133. Zhou, S., R. Liu, B. M. Baroudy, B. A. Malcolm, and G. R. Reyes. 2003. The effect of ribavirin and IMPDH inhibitors on hepatitis C virus subgenomic replicon RNA. Virology 310:333–342.
134. Zhu, N. L., et al. 1998. Hepatitis C virus core protein binds to the cytoplasmic domain of tumor necrosis factor (TNF) receptor 1 and enhances TNF-induced apoptosis. J. Virol. 72:3691–3697.

Source: http://www.homepages.ed.ac.uk/aspiliop/2010_2011/lara_coevolution_hcv_2011.pdf
