Download of SCOP2 data

1. SCOP2 REST web-service

1.1. Making a REST request

1.2. Example request

1.3. Content Type

2. SCOP2 MySQL

3. SCOP2 Parseable files

4. SCOP2 Sequence libraries

 

1. SCOP2 REST web-service

The SCOP2 REST web-service is a convenient way to retrieve data from SCOP2. It provides easy, programming-language-agnostic access to the SCOP2 ontology and SCOP2 domains. Data are requested with simple HTTP requests and returned as JSON. A variety of programmatic and bioinformatics-relevant formats may be supported in the future.

1.1. Making a REST request

Each datatype resides on a REST endpoint; these are all listed in the Endpoints section. A request is made to an endpoint as a parameterised HTTP request adhering to a predefined URI schema. Parameters can be required or optional. In the documentation of an endpoint, optional parameters are shown inside square brackets, e.g. [p = 1]. All other parameters are required and must be included in the URL. All parameters can be passed as CGI-style parameters as part of the HTTP request.

1.2. Example request

For a node: returns all the information about a SCOP2 node, including its parents, children, domains etc., e.g.
http://scop2.mrc-lmb.cam.ac.uk/graph/restapi/term?id=SF:3000133

For a domain: returns the information about a SCOP2 domain, including its segment data and its SCOP2 node, e.g.
http://scop2.mrc-lmb.cam.ac.uk/graph/restapi/domain?id=CF-8004382-2BCGG
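
As an illustration, the two example requests above could be issued from Python roughly as follows. This is a minimal sketch using only the standard library. The helper names build_url and fetch_json are hypothetical and not part of SCOP2, and nothing is assumed about the layout of the returned JSON beyond it being valid JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the example requests in this section.
BASE_URL = "http://scop2.mrc-lmb.cam.ac.uk/graph/restapi"


def build_url(endpoint, **params):
    """Build a request URL with CGI-style parameters (hypothetical helper)."""
    # safe=":" keeps identifiers such as "SF:3000133" readable in the URL.
    query = urlencode(params, safe=":")
    return f"{BASE_URL}/{endpoint}?{query}"


def fetch_json(endpoint, **params):
    """Send the HTTP request and decode the JSON response (hypothetical helper)."""
    with urlopen(build_url(endpoint, **params), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With these helpers, fetch_json("term", id="SF:3000133") would request the node example and fetch_json("domain", id="CF-8004382-2BCGG") the domain example. Both calls need network access to the SCOP2 server.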

1.3. Content Type

At the moment only JSON format is supported, but we'll add others depending on user requests.

More information about endpoints and access can be found here.

2. SCOP2 MySQL

A MySQL dump of the database tables is available for download here. For help on how to set up MySQL databases and how to import MySQL dumps, please consult the MySQL website.

3. SCOP2 Parseable files

Five parseable files containing the classification of the representative entries in SCOP2 are available for download. In all parseable files the columns are tab-delimited. A brief description of each file is provided below.

- scop2_graph_nodes : Lists the binary relations between SCOP2 nodes, e.g.

ParentID ChildID
1000000  2000000
1000000  2000002
1000000  2000006

- scop2_nodes_names : Lists the names of the SCOP2 nodes, e.g.

NodeID  Name
1000000 All alpha proteins
1000001 All beta proteins
1000002 Alpha and beta proteins (a/b)

- domains2nodes : Lists the binary relations between SCOP2 nodes and representative domains, e.g.

DomID   NodeID
8000045 6000041
8000048 6000044
8000049 6000045
8000055 6000051

- domain_segments_pdb : Lists segment information for the SCOP2 representative structural domains. Serial indicates the sequential order of the segment in a multi-segment domain. Different segments of the same domain are listed on different lines. The begin and end boundaries of the segments are given as PDB residue ids.

DomID   Serial  PDB     Chain   Begin   End
8000045 1       2RHC    A       5       261
8000048 1       1ULU    B       1       257
8000049 1       2Q45    A       7       265

- domain_segments_seq : Lists segment information for the SCOP2 representative sequence domains. Serial indicates the sequential order of the segment in a multi-segment domain. Different segments of the same domain are listed on different lines. The begin and end boundaries of the segments are given as positions in the sequence string. The ExtDB field gives the external identifier of the reference sequence. The entire representative sequence is listed in the Sequence field; use the begin and end boundaries to extract the sequence segment of the domain.

DomID   Serial  ExtDB   Begin   End     Sequence
8000045 1       P16544  5       261     MATQDSEVALVTGATSGIGLEIARRLGKEGLRVFVCARGEE.....
8000048 1       Q5SLI9  1       257     MLTVDLSGKKALVMGVTNQRSLGFAIAAKLKEAGA.......
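
The parseable files described above can be read with a few lines of Python. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, header lines are skipped by assuming that every data row starts with a numeric identifier, and Begin and End in domain_segments_seq are assumed to be 1-based and inclusive. It builds a parent-to-children map from scop2_graph_nodes and cuts the segment sequences of each domain out of the full representative sequence.

```python
from collections import defaultdict


def parse_rows(text):
    """Split tab-delimited text into rows of stripped fields."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        # Assumption: data rows start with a numeric identifier, so blank
        # lines and column-header lines are skipped.
        if fields[0].isdigit():
            rows.append(fields)
    return rows


def children_by_parent(graph_text):
    """Map each ParentID to the list of its ChildIDs (scop2_graph_nodes)."""
    children = defaultdict(list)
    for row in parse_rows(graph_text):
        parent_id, child_id = row[:2]
        children[parent_id].append(child_id)
    return dict(children)


def domain_segments(segments_text):
    """Map each DomID to its segment sequences, ordered by Serial
    (domain_segments_seq)."""
    segments = defaultdict(list)
    for row in parse_rows(segments_text):
        dom_id, serial, _ext_db, begin, end, sequence = row[:6]
        # Assumption: Begin and End are 1-based, inclusive positions.
        piece = sequence[int(begin) - 1:int(end)]
        segments[dom_id].append((int(serial), piece))
    return {
        dom_id: [piece for _, piece in sorted(parts, key=lambda part: part[0])]
        for dom_id, parts in segments.items()
    }
```

Both functions take the content of a file as a string, for example the result of open(path).read(), where path is wherever the downloaded file was saved. Identifiers are kept as strings so that they can be matched across files without conversion.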

4. SCOP2 Sequence libraries

The sequence libraries of SCOP2 representative domains in FASTA format can be downloaded from here.