Cathepsin B is one of the most versatile human cysteine cathepsins. It is important for intracellular protein degradation under normal conditions and is involved in a number of pathological processes. The occluding loop makes cathepsin B unique among cysteine cathepsins. This ~20-residue-long insertion embedded in the papain-like protease scaffold restricts access to the active site cleft and endows cathepsin B with its carboxydipeptidase activity. Nevertheless, the enzyme also exhibits endopeptidase activity and is inhibited by stefins and cystatins. To explain the structural properties of the occluding loop upon binding of stefins, we have determined the crystal structure of the complex between the wild type human stefin A and the wild type human cathepsin B at 2.6 Å resolution. The papain-like part of the cathepsin B structure remains unmodified, whereas the occluding loop residues are displaced. The part enclosed by the disulfide bridge containing histidines 110 and 111, the "lasso" part, is rotated by approximately 45 degrees away from its original position. A comparison of the structure of the unliganded cathepsin B with its complexes with chagasin and stefin A, as well as with the structure of the proenzyme, shows that the magnitude of the shift of the occluding loop is related to the size of the competing ligand, but with no impact on the binding constant. Hence, cathepsin B can dock inhibitors and certain substrates regardless of their size.
INTRODUCTION
Cathepsin B (EC 3.4.22.1), a lysosomal, papain-like cysteine protease, is one of the most extensively studied human cathepsins [1]. This enzyme is abundantly expressed in a variety of tissues, where it takes part in protein degradation and processing. It is involved in a number of physiological and pathological processes, such as intracellular protein degradation, immune response, prohormone processing, cancer, and arthritis [2-9]. Its proteolytic activity is regulated by stefins and cystatins, endogenous inhibitors of cysteine cathepsins [10]. Cathepsin B differs from other cathepsins by its dual role, exhibiting exo- as well as endopeptidase activity. The crystal structure of the human enzyme [11] revealed that an insertion of about 20 residues, termed the "occluding loop", occupies part of the active site cleft on the primed side and blocks access to the cleft beyond the S2' substrate binding site [11, 12]. The occluding loop is held together by the disulfide bond between C108 and C119. Its attachment to the body of the enzyme is stabilized by two salt bridges, between H110 and D22, and between R116 and D224. The crystal structure suggested that two histidines, H110 and H111, positioned within the active site cleft, are responsible for docking the C-terminal carboxylic group of peptidyl substrates. This observation was later confirmed by the crystal structure of the complex with a substrate-mimicking inhibitor, CA030, which interacts through its C-terminal carboxylic group with the two histidine residues [13]. The concept of using additional structural features to block part of the active site cleft, thereby restricting the binding of peptidyl substrates and facilitating binding of the substrate termini, is not unique to cathepsin B [14].
The amino dipeptidase cathepsin C [15, 16] possesses a large segment of the proregion [17], termed the exclusion domain, which remains associated with the mature enzyme and blocks the active site cleft beyond the S2 site. The aminopeptidase cathepsin H has a covalently attached stretch of eight residues originating from the propeptide, termed the mini chain, which blocks the unprimed binding site [18]. The mini loop in the carboxypeptidase cathepsin X blocks the primed side of the active site, restricting access to only one residue [19].
While the structures of the mature native form of cathepsin B clearly exposed the relevance of the occluding loop for the exopeptidase activity [11], they explained neither the mechanism of endopeptidase activity nor the inhibition of the enzyme by its endogenous protein inhibitors, cystatins and stefins [20]. A further step in understanding these mechanisms was provided by the crystal structures of human [21] and rat [22] procathepsin B. They revealed that, in the zymogen form, the propeptide rather than the occluding loop fills the active site cleft. It was shown that the single and double mutations D22A, H110A, R116A, and D224A disrupted the salt bridges between the occluding loop and the body of the enzyme, resulting in enhanced endopeptidase activity [23]. Furthermore, the deletion mutant lacking 12 central residues of the "lasso" region enclosed by the disulfide C109-C118 confirmed that their absence yields an enzyme with pure endopeptidase activity, completely lacking exopeptidase activity, and with a 40-fold increase in affinity for cystatins [12]. These results indicated that loop flexibility must be responsible for the endopeptidase activity of cathepsin B, and that endopeptidase activity should be associated with displacement of the occluding loop from the active site cleft. Recently, the crystal structure of the complex between chagasin, a cysteine protease inhibitor from Trypanosoma cruzi, and human cathepsin B, a multiple mutant with destabilized affinity of the occluding loop residues towards the active site cleft, has shown that on binding to cathepsin B chagasin displaces the occluding loop from the active site cleft [24]. Here we present the crystal structure of the complex between two human proteins: the wild type stefin A and the wild type cathepsin B. A structural comparison suggests that the extent of the movement of the occluding loop residues necessary for their displacement from the active site cleft is ligand size dependent.
RESULTS AND DISCUSSION
Crystals of the complex of stefin A and cathepsin B contain complete wild type protein sequences. The positions of the main chains of nearly all residues are clearly revealed by the electron density maps, with the exception of E95, a stretch of four occluding loop residues from V112 to S115 in the first molecule of cathepsin B, G75 and Q76 in molecule A of stefin A, and M1 and E78 in molecule B of stefin A. Additionally, eleven side chains lack adequate electron density. The RMS deviation between all pairs of superimposed CA atoms of the two cathepsin B molecules, excluding residues 105-125 of the occluding loop, is 0.34 Å, whereas the RMS deviation between all pairs of superimposed CA atoms of the two stefin A molecules is somewhat larger, 0.88 Å. This comparison shows that the differences between the two molecules of cathepsin B are confined to the occluding loop region, whereas the differences between the two stefin A molecules are spread throughout the entire structure, with slightly increased variability in the S72-D79 region that forms the second binding loop.
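The RMS comparison described above can be illustrated with a minimal sketch. It assumes the two chains have already been superimposed by some other means and that CA coordinates are available as plain dictionaries keyed by residue number. The function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from the deposited structure.

```python
import math

def ca_rmsd(chain_a, chain_b, excluded=()):
    """RMS deviation over CA atoms shared by two already-superimposed chains.

    chain_a, chain_b: dict mapping residue number -> (x, y, z) of the CA atom (Å).
    excluded: residue numbers left out of the comparison (e.g. a flexible loop).
    """
    skip = set(excluded)
    shared = sorted((set(chain_a) & set(chain_b)) - skip)
    if not shared:
        raise ValueError("no common CA atoms to compare")
    total = sum(math.dist(chain_a[r], chain_b[r]) ** 2 for r in shared)
    return math.sqrt(total / len(shared))

# Hypothetical toy coordinates: two rigid-core residues and one loop residue.
mol_1 = {28: (0.0, 0.0, 0.0), 29: (3.8, 0.0, 0.0), 110: (10.0, 5.0, 0.0)}
mol_2 = {28: (0.0, 0.3, 0.0), 29: (3.8, 0.0, 0.4), 110: (14.0, 8.0, 0.0)}

# Excluding residues 105-125 mirrors the treatment of the occluding loop in the text.
rmsd_core = ca_rmsd(mol_1, mol_2, excluded=range(105, 126))
rmsd_all = ca_rmsd(mol_1, mol_2)
print(f"core: {rmsd_core:.2f} Å, all residues: {rmsd_all:.2f} Å")
```

In this toy case the single displaced loop residue dominates the all-residue value, which is why the loop is excluded when the rigidity of the papain-like scaffold is assessed.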
Cathepsin B has a two-domain, papain-like fold [11]. The N-terminal domain includes the central helix that contains, at its N terminus, the active site C29. The C-terminal domain is based on a 4-stranded β-barrel fold, contributing H199, the other active site residue. The active site cleft is formed at the interface between the two domains, which are also named L- and R- (left and right), according to the standard view used to present papain-like folds.
The structure of stefin A exhibits the cystatin-like fold composed of a five-stranded β-sheet embracing an α-helix (Fig. 1). This arrangement creates a wedge-shaped structure with the N-terminal trunk and two hairpin loops at its narrow edge [25]. This narrow edge docks into the active site cleft of cathepsin B (Fig. 1). The binding mode is equivalent to those in the related complexes stefin B-papain [26] and stefin A-cathepsin H [27]. A comparison of the average distances between CA atoms of the active site cysteine and histidine residues in cathepsins B and H and CA atoms of stefins in the structures of both complexes showed that stefin A binds to cathepsin B as deeply as it does to cathepsin H, with the equivalent average distances 23.36 Å and 23.43 Å, respectively (Table 1). This shows that the final position of stefin A molecules in the complex is not affected by the additional features of exopeptidases, the occluding loop and the mini chain, which occupy parts of the active site cleft (Fig. 4). These additional features hinder binding along the whole interdomain interface, yet they both get pushed away upon binding of the ligand.
The N-terminal trunk and the first binding loop occlude the active site C29, blocking the enzymatic activity. The N-terminal trunk binds into the non-primed substrate binding sites, whereas the two loops bind into the primed sites. They occlude the catalytic C29 (surface colored in yellow) in the middle and thereby prevent the approach of substrate molecules. The same approach is utilized by the p41 fragment, a representative of thyropins [28], chagasin [29, 30], and mycocypins [31].
The N-terminal trunk comes down the S1 binding area of cathepsin B, occupies the S2 binding site with proline residue P3 and continues through the S2 binding site upwards (away from the cathepsin B surface). Two hydrogen bonds between the stefin A amide hydrogen (G4) and carbonyl (P3) with cathepsin B carbonyl atom (G198) and amide hydrogen (G74) attach the first loop to the active site cleft.
The first binding loop of stefin A (V47 to Q51) fills the S1' site with V48. Besides this hydrophobic interaction, the loop is fastened to the cathepsin B surface by the hydrogen bond between the stefin A A49 amide and cathepsin B G24 carbonyl. The binding of this loop is further stabilized by a hydrogen bond between the stefin A N52 side chain amide and the cathepsin B S25 carbonyl group.
The second binding loop (L73 to D79) comes down to the area beyond the S2' site and displaces the occluding loop residues of cathepsin B. It is firmly anchored by the β-sheet hydrogen bonding pattern formed between the three loops in stefin A and an additional hydrogen bond formed between the amide hydrogen of L73 and the side chain carbonyl of E109. A layer of solvent molecules mediates the contacts between the C-terminal part of the second binding loop and cathepsin B.
The occluding loop differs from the native structure (PDB code 1HUC) [11] in the region from S104 to D124 (Figs. 2, 3). The lasso structure enclosed by the C108-C119 disulfide is rotated by approximately 45° and pushed aside. This movement dramatically changes the position of the two occluding loop histidines, H110 and H111. Instead of a parallel positioning within the active site cleft, these two side chains now point in different, almost opposite directions. The side chain of H110 points away from the active site cleft to the back of the molecule, while the side chain of H111 points upwards and away from the surface. In the complex, two stefin A residues, A49 from the tip of the first binding loop and L73 from the second binding loop, fill the places that the two histidines occupy in the native structure. Besides the lasso, the inhibitor also pushes away the chain from C119 to D124. The position of the CA atom of E122 is changed by almost 7 Å from the position it occupies in the native cathepsin B structure. Such displacement is not unique to cathepsin B: the N-terminal trunk of stefin A can also displace the mini chain, which blocks part of the binding cleft in cathepsin H [27].
Two salt bridges, H110-D22 and R116-D224, which additionally stabilize the attachment of the loop to the body of the enzyme, are disrupted in the complex. R116 and D224, however, compensate for the loss of the salt bridge interaction by finding electrostatically favorable partners in E78 of stefin A and K184 of cathepsin B, respectively. The structure presented here shows that weakening the embedding of the occluding loop in the active site cleft is not mandatory for formation of crystals of the complex, even though it is associated with a drop in Ki from 0.93 to 0.35 nM, as shown by the chagasin-cathepsin B study. The stefin A-cathepsin B complex contains the wild type sequences and physiologically occurring interactions, as opposed to the crystal structure of the complex between chagasin, a parasite inhibitor from Trypanosoma cruzi, and cathepsin B [24] (PDB code 3CBJ). In that complex the first salt bridge interaction was disrupted by the H110A mutation and the enzyme's reactive site was turned off by the C29A mutation. (We assume here that the cathepsin B mutations have not affected the geometry of binding of chagasin.) The wild type sequences have also been preserved in the related structural studies of procathepsin B [21].
These three structures, as well as the structure of native cathepsin B (Figs. 2, 3), demonstrate that the occluding loop can adopt a variety of positions, with the moving part consisting of residues between E109 and D124. The extent of the occluding loop shift from the position in the native enzyme (PDB code 1HUC) is shown in a series of structures starting with the proenzyme form (PDB code 3PBH), followed by the complexes with stefin A and chagasin [24] (PDB code 3CBJ) (Figs. 2, 3). The CA atom position of N113 is marked in Fig. 3 to indicate the shifted positions, which are 7 Å, 16 Å, and 22.5 Å (14 Å) away from the position that this atom occupies in the native form. Our conclusion is that size matters: the larger and wider the features of the ligands that compete with the occluding loop for binding to the active site, the farther away the occluding loop residues are shifted. Hence, these structures demonstrate that the occluding loop residues can adopt a variety of conformations, whereas the rest of the structure of cathepsin B appears to be rigid. A comparison of the inhibition constants for binding of chagasin (Ki = 0.93 nM [24]) and stefins (1.7 and 2 nM [32, 33], 0.91 nM [34]) to cathepsin B indicates that the extent of the shift does not affect the inhibition constants, even though the interaction surface of chagasin with the occluding loop (160 Å²) is slightly larger than that of stefin A (100 Å²). This observation suggests that the energy cost of ligand binding associated with the occluding loop removal is not related to the magnitude of the occluding loop shift from the active site cleft. Cathepsin B can bind certain ligands along the whole interdomain interface. During docking, their size alone likely does not play a role. Cathepsin B will accept inhibitors or substrates, whatever comes across.
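To put the similarity of the inhibition constants quoted above on an energy scale, the following sketch converts each Ki into a binding free energy using the standard relation ΔG = RT ln(Ki). The temperature of 298.15 K and the 1 M standard state are assumptions made here for illustration only; the cited studies may have used other conditions. The function and dictionary names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(ki_molar, temperature_k=298.15):
    """ΔG = RT ln(Ki) in kcal/mol, with Ki given in mol/L (1 M standard state).

    The default temperature is an assumption, not a value from the paper.
    """
    return R_KCAL * temperature_k * math.log(ki_molar)

# Ki values in nM as quoted in the text for cathepsin B.
ki_nm = {
    "chagasin [24]": 0.93,
    "stefin [32, 33] (1.7 nM)": 1.7,
    "stefin [32, 33] (2 nM)": 2.0,
    "stefin [34]": 0.91,
}

dg = {name: binding_free_energy(ki * 1e-9) for name, ki in ki_nm.items()}
spread = max(dg.values()) - min(dg.values())

for name, value in dg.items():
    print(f"{name}: {value:.2f} kcal/mol")
print(f"spread: {spread:.2f} kcal/mol")
```

Under these assumptions the four values differ by less than 0.5 kcal/mol, which is consistent with the statement that the size of the loop shift is not reflected in the inhibition constants.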
MATERIALS AND METHODS
Cathepsin B and stefin A were expressed as previously reported [35, 36], mixed in a molar ratio of 1:1.1, and concentrated to 30 mg/mL in 10 mM sodium acetate, pH 5.5. Crystals were grown in 0.2 M sodium sulfate, 24% PEG 3000. The initial crystals grown by the sitting drop method were highly mosaic and thus unsuitable for structure determination. Therefore, the hanging drop method was used in combination with the controlled evaporation approach [37], which greatly improved crystal quality. The crystals, which grew in the form of thin plates, were soaked in mother liquor supplemented with 20-30% glycerol and frozen in liquid nitrogen prior to data collection.
Diffraction data were collected at the XRD1 workstation at the Elettra synchrotron, Trieste, and processed using the HKL2000 package [38]. Determination of the space group was nontrivial. The data were first processed in the P21 space group due to the higher symmetry, with an acceptable Rmerge of 0.132 and data completeness of 96.7%. The structure was determined by molecular replacement using AMoRe [39] with cathepsin B [13] and stefin A [27] as search models. The crystals are extremely dense, containing only 28% solvent, which corresponds to a Matthews coefficient (VM) of 1.70 Å³/Da [40]. It was surprising that such tightly packed crystals diffracted only to 2.6 Å, because an analysis of 10,471 crystal forms of proteins deposited in the PDB in 2002 [41] showed that more tightly packed crystals (lower VM) tend to diffract to higher resolution.
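The relation between the Matthews coefficient and the solvent content quoted above can be checked with a short sketch. It uses Matthews' widely used approximation that the protein fraction of the crystal volume is about 1.23/VM, and the general triclinic cell volume formula with the cell dimensions listed in Table 2. The function names are hypothetical, and the 1.23 factor assumes a typical protein partial specific volume rather than a value measured for this complex.

```python
import math

def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume (Å^3) from cell edges (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

def solvent_fraction(vm):
    """Matthews' approximation: protein volume fraction is about 1.23 / VM."""
    return 1.0 - 1.23 / vm

# Cell dimensions of the P1 cell as listed in Table 2.
volume = cell_volume(62.0, 31.0, 70.9, 90.0, 104.5, 90.0)

# VM as reported in the text (Å^3/Da).
vm = 1.70
solvent = solvent_fraction(vm)

# Protein mass per unit cell implied by VM = V / M (Da); in P1 the unit cell
# equals the asymmetric unit.
implied_mass = volume / vm

print(f"cell volume: {volume:.0f} Å^3")
print(f"solvent content: {100 * solvent:.0f} %")
print(f"implied protein mass per cell: {implied_mass / 1000:.1f} kDa")
```

With VM = 1.70 the approximation gives a solvent content of about 28%, in agreement with the value stated in the text.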
Since we were unable to position the occluding loop residues consistently within the electron density maps, we decided to reprocess the diffraction data in the lower symmetry space group, P1. These data had a lower Rmerge of 0.084 and slightly lower completeness (92.4%). The lower completeness of the P1 data set is a consequence of highly anisotropic diffraction, which forced us to discard part of the collected data to maintain reasonable merging statistics. The anisotropy was a consequence of the shape of the crystals, which were thin plates diffracting poorly in the direction perpendicular to the beam. The P1 space group data resulted in an improved electron density map for the occluding loop residues and were used for further refinement and model building. The structure was refined using Refmac [42] and MAIN [43].
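For readers unfamiliar with the merging statistic used to compare the P21 and P1 processing, the sketch below computes Rmerge from its common textbook definition, the sum of absolute deviations of individual measurements from the mean intensity of their reflection divided by the sum of all measured intensities. The exact handling inside HKL2000 is not described in the text, so skipping singly measured reflections is an assumption of this sketch, and the intensities are hypothetical toy values.

```python
def r_merge(observations):
    """Rmerge = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i.

    observations: dict mapping a unique reflection (h, k, l) to the list of
    intensities measured for it and its symmetry equivalents.
    Reflections measured only once are skipped (an assumption of this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            continue
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("no multiply measured reflections")
    return numerator / denominator

# Hypothetical toy data set with three unique reflections.
toy_data = {
    (1, 0, 0): [100.0, 110.0],
    (0, 1, 0): [50.0, 40.0, 60.0],
    (0, 0, 1): [30.0],
}
r = r_merge(toy_data)
print(f"Rmerge = {r:.3f}")
```

Merging in a higher symmetry than the crystal actually has treats non-equivalent reflections as equivalent, which inflates this statistic. That is one way to read the drop from 0.132 in P21 to 0.084 in P1.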
Data collection and refinement statistics are summarized in Table 2. The coordinates and structure factors were deposited in the PDB (ID 3K9M).
Distance d (Table 1) between stefin A and different enzymes is the average distance between all CA atoms in stefin A and the CA atoms of the reactive site cysteine and histidine residues.
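The definition of d given above can be sketched as the mean over all pairs formed by one inhibitor CA atom and one catalytic-residue CA atom. This is one reading of the definition, not necessarily the exact procedure used to produce Table 1. The function name and the toy coordinates are hypothetical.

```python
import math
from itertools import product
from statistics import mean

def average_ca_distance(inhibitor_ca, catalytic_ca):
    """Mean distance (Å) over all (inhibitor CA, catalytic CA) atom pairs.

    inhibitor_ca: list of (x, y, z) for every CA atom of the inhibitor.
    catalytic_ca: list of (x, y, z) for the CA atoms of the catalytic Cys and His.
    """
    return mean(math.dist(p, q) for p, q in product(inhibitor_ca, catalytic_ca))

# Hypothetical toy coordinates: a two-residue "inhibitor" above two catalytic CA atoms.
stefin_ca = [(0.0, 0.0, 10.0), (0.0, 0.0, 20.0)]
catalytic_ca = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]

d = average_ca_distance(stefin_ca, catalytic_ca)
print(f"d = {d:.2f} Å")
```

Because d averages over the whole inhibitor, a smaller value indicates that the inhibitor as a whole sits closer to the catalytic residues, which is how the depth of binding is compared between complexes.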
TABLES
Table 1: Average distances between CA atoms of the stefins and catalytic residues of cysteine proteases.
Complex                     d (Å)
Papain - stefin B           23.93
Cathepsin H - stefin A      23.36 ± 0.23
Cathepsin B - stefin A      23.34 ± 0.15
Table 2: Data collection and refinement statistics for the complex of cathepsin B with stefin A. Numbers in parentheses are for the highest resolution shell. No intensity cutoffs were applied.
Data collection
PDB ID                            3K9M
Space group                       P1
Cell dimensions
  a, b, c (Å)                     62.0, 31.0, 70.9
  α, β, γ (°)                     90.0, 104.5, 90.0
Resolution (Å)                    68.6 - 2.51
Rmerge (%)                        8.4 (20.6)
I/σI                              9.5 (2.6)
Completeness (%)                  92.1 (66.7)
Redundancy                        2.6 (2.2)
Refinement
Resolution (Å)                    40.5 - 2.61
No. of reflections (work/free)    24360 / 713
Rwork/Rfree                       19.8 / 25.0
B factor (Å²)                     42.0
No. of atoms
  Protein                         5454
  Water                           127
r.m.s. deviation
  Bond lengths (Å)                0.013
  Bond angles (°)                 1.71
FIGURE LEGENDS
Figure 1: Structure of the cathepsin B - stefin A complex. A) A view along the active site cleft. B) A view perpendicular to the active site cleft. Cathepsin B is shown in gray and stefin A in green. The catalytic cysteine is shown in yellow. The wedge-shaped structure of stefin A fills the active site cleft along the whole length and displaces the occluding loop (the "lasso" is shown in red).
Figure 2: The extent of the occluding loop displacement in the unliganded and liganded structures. The occluding loop (red) is shown on the surface of the papain-like part of the structure (gray). A) Unliganded cathepsin B (PDB code 1HUC) [11]. B) The proenzyme, with the propeptide in dark blue (PDB code 3PBH) [21]. C) A complex with stefin A, with stefin A in green. D) A complex with chagasin (shown in cyan) (PDB code 3CBJ) [24].
Figure 3: The extent of the occluding loop displacement - superimposed. The papain-like part of cathepsin B is shown as a gray surface with the catalytic cysteine part shown in yellow, while the S1, S1' and S2' binding sites are shown in green and cyan. The occluding loops from various cathepsin B structures (proenzyme, complex with stefin A, complex with chagasin) are shown in dark blue, red, and cyan, respectively. The occluding loop residues H110 and H111 from the unliganded cathepsin B are shown in orange. Spheres represent the position of the CA atom of N113, to indicate the extent of movement of the occluding loop.
Figure 4: Flexibility of stefin structures. Papain surface (PDB code 1STF) [26] is shown in gray with the part of the reactive cysteine residue shown in yellow. Four structures of stefin A from the complex with cathepsin H are shown in cyan (PDB code 1NB3) [27]. The two structures of stefin A from the complex with cathepsin B are shown in red. The stefin B structure from the complex with papain is shown in green. Six stefin A molecules were moved onto the scaffold of papain using transformation parameters obtained from the superimpositions of their enzymatic partners on the papain structure.