Download of SCOP2 data
The SCOP2 REST web service is a convenient way to retrieve data from SCOP2. It provides programming-language-agnostic access to the SCOP2 ontology and SCOP2 domains. Data are requested with simple HTTP requests and returned as JSON. A variety of programmatic and bioinformatics-relevant formats may be supported in the future.
Each data type resides on its own REST endpoint; all of them are listed in the Endpoints section. A request is made to an endpoint as a parameterised HTTP request adhering to a predefined URI schema. Parameters are either required or optional. In the endpoint documentation, optional parameters are shown inside square brackets, e.g. [p=1]. All other parameters are required and must be included in the URL. All parameters can be passed as CGI-style parameters as part of the HTTP request.
for node - provides all the information about a SCOP2 node, including its parents, children and domains, e.g.
for domain - provides the information about a SCOP2 domain, including its segment data and its SCOP2 node, e.g.
At the moment only the JSON format is supported, but we'll add others depending on user requests.
More information about endpoints and access can be found here.
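As a minimal sketch of how such a parameterised request might be assembled in Python, the helper below builds a URL with CGI-style parameters and decodes the JSON response. The base URL and the parameter name id are placeholders chosen for illustration, not the documented SCOP2 schema; the endpoint names node and domain and the optional parameter p are taken from the text above. Consult the Endpoints section for the actual URI schema.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder base URL: an assumption for illustration, not the real service address.
BASE_URL = "https://example.org/scop2/rest"


def build_request_url(endpoint, required, optional=None):
    """Return the URL for an endpoint with CGI-style query parameters."""
    params = dict(required)
    # Optional parameters are only sent when a value was actually supplied.
    for key, value in (optional or {}).items():
        if value is not None:
            params[key] = value
    return f"{BASE_URL}/{endpoint}?{urlencode(params)}"


def fetch_json(url, timeout=30):
    """Send the HTTP request and decode the JSON body of the response."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# "id" is a hypothetical parameter name; "p" mirrors the [p=1] example above.
url = build_request_url("node", {"id": 1000000}, {"p": 1})
print(url)
```

Keeping URL construction separate from the network call makes it possible to inspect or log the request before it is sent with fetch_json(url).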
Five parseable files are available for download that contain the classification of the representative entries in SCOP2. In all parseable files the columns are tab delimited. A brief description of the files is provided below.
- scop2_graph_nodes : Lists the binary relations between SCOP2 nodes e.g.
- scop2_nodes_names : Lists the names of the SCOP2 nodes e.g.
1000000 All alpha proteins
1000001 All beta proteins
1000002 Alpha and beta proteins (a/b)
- domains2nodes : Lists the binary relations between SCOP2 nodes and representative domains e.g.
- domain_segments_pdb : Lists the SCOP2 representative structural domain segment information. Serial indicates the sequential order of the segment in a multi-segment domain. Different segments of the same domain are listed on different lines. The begin and end boundaries of the segments are given as PDB residue IDs.
DomID Serial PDB Chain Begin End
8000045 1 2RHC A 5 261
8000048 1 1ULU B 1 257
8000049 1 2Q45 A 7 265
- domain_segments_seq : Lists the SCOP2 representative sequence domain segment information. Serial indicates the sequential order of the segment in a multi-segment domain. Different segments of the same domain are listed on different lines. The begin and end boundaries of the segments are given as sequence string positions. The ExtDB field gives the external identifier of the reference sequence. The entire representative sequence is listed; use the begin and end boundaries to extract the sequence segment of the domain.
DomID Serial ExtDB Begin End Sequence
8000045 1 P16544 5 261 MATQDSEVALVTGATSGIGLEIARRLGKEGLRVFVCARGEE.....
8000048 1 Q5SLI9 1 257 MLTVDLSGKKALVMGVTNQRSLGFAIAAKLKEAGA.......
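The tab-delimited layout described above lends itself to simple parsing. The following Python sketch reads rows in the domain_segments_seq layout and cuts each segment out of the full representative sequence. It assumes that the begin and end positions are 1-based and inclusive, and that a header row starting with DomID may be present; neither is confirmed by the description above. The sample rows are toy data invented for the example, not real SCOP2 entries.

```python
import io
from collections import defaultdict


def read_segments(handle):
    """Group (serial, ext_db, begin, end, sequence) rows by domain ID."""
    domains = defaultdict(list)
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines and the header row, if one is present.
        if len(fields) != 6 or fields[0] == "DomID":
            continue
        dom_id, serial, ext_db, begin, end, sequence = fields
        domains[dom_id].append((int(serial), ext_db, int(begin), int(end), sequence))
    return domains


def domain_segments(segments):
    """Return the segment sequences of one domain, ordered by Serial."""
    result = []
    for _serial, _ext_db, begin, end, sequence in sorted(segments):
        # Assumption: positions are 1-based and inclusive.
        result.append(sequence[begin - 1:end])
    return result


# Toy two-segment domain (hypothetical IDs and sequence).
sample = (
    "DomID\tSerial\tExtDB\tBegin\tEnd\tSequence\n"
    "9999999\t2\tX00000\t7\t9\tABCDEFGHIJ\n"
    "9999999\t1\tX00000\t2\t4\tABCDEFGHIJ\n"
)
domains = read_segments(io.StringIO(sample))
print(domain_segments(domains["9999999"]))  # ['BCD', 'GHI']
```

For a downloaded file, pass an open file handle to read_segments in place of the io.StringIO object. The segments are returned as a list rather than joined, since how multi-segment domains should be combined depends on the intended use.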
The sequence libraries of SCOP2 representative domains in FASTA format can be downloaded from here.