Fetching PubMed Literature Data from NCBI with PHP

Uploaded by: xiao****1972 | Document ID: 84979652 | Uploaded: 2019-03-06 | Format: DOCX | Pages: 6 | Size: 31.15 KB

Fetching PubMed literature data from NCBI

The PubMed database holds a large amount of literature-related information, but scraping it directly runs into many problems and difficulties. With PubMed's own tools, the data can be retrieved freely. The tools and their parameters are documented at http://www.ncbi.nlm.nih.gov/books/NBK25499/. What follows is that documentation's description of EFetch.

EFetch

Base URL

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi

Functions

- Returns formatted data records for a list of input UIDs
- Returns formatted data records for a set of UIDs stored on the Entrez History server

Required Parameters

db
Database from which to retrieve records. The value must be a valid Entrez database name (default = pubmed). Currently EFetch does not support all Entrez databases. Please see Table 1 in Chapter 2 for a list of available databases.

Required Parameter: Used only when input is from a UID list

id
UID list. Either a single UID or a comma-delimited list of UIDs may be provided. All of the UIDs must be from the database specified by db. There is no set maximum for the number of UIDs that can be passed to EFetch, but if more than about 200 UIDs are to be provided, the request should be made using the HTTP POST method.

efetch.fcgi?db=protein&id=15718680,157427902,119703751

Required Parameters: Used only when input is from the Entrez History server

query_key
Query key. This integer specifies which of the UID lists attached to the given Web Environment will be used as input to EFetch. Query keys are obtained from the output of previous ESearch, EPost or ELink calls. The query_key parameter must be used in conjunction with WebEnv.

WebEnv
Web Environment. This parameter specifies the Web Environment that contains the UID list to be provided as input to EFetch. Usually this WebEnv value is obtained from the output of a previous ESearch, EPost or ELink call. The WebEnv parameter must be used in conjunction with query_key.

efetch.fcgi?db=protein&query_key=&WebEnv=

Optional Parameters: Retrieval

retmode
Retrieval mode. This parameter specifies the data format of the records returned, such as plain text, HTML or XML. See Table 1 for a full list of allowed values for each database.

Table 1: Valid values of &retmode and &rettype for EFetch (null = empty string)

rettype
Retrieval type. This parameter specifies the record view returned, such as Abstract or MEDLINE from PubMed, or GenPept or FASTA from protein. Please see Table 1 for a full list of allowed values for each database.

retstart
Sequential index of the first record to be retrieved (default=0, corresponding to the first record of the entire set). This parameter can be used in conjunction with retmax to download an arbitrary subset of records from the input set.

retmax
Total number of records from the input set to be retrieved, up to a maximum of 10,000. Optionally, for a large set the value of retstart can be iterated while holding retmax constant, thereby downloading the entire set in batches of size retmax.

Optional Parameters: Sequence Databases

strand
Strand of DNA to retrieve. Available values are 1 for the plus strand and 2 for the minus strand.

seq_start
First sequence base to retrieve. The value should be the integer coordinate of the first desired base, with 1 representing the first base of the sequence.

seq_stop
Last sequence base to retrieve. The value should be the integer coordinate of the last desired base, with 1 representing the first base of the sequence.

complexity
Data content to return. Many sequence records are part of a larger data structure or blob, and the complexity parameter determines how much of that blob to return. For example, an mRNA may be stored together with its protein product. The available values are as follows:

Value of complexity    Data returned for each requested GI
0                      entire blob
1                      bioseq
2                      minimal bioseq-set
3                      minimal nuc-prot
4                      minimal pub-set
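The parameter rules above (a comma-delimited id list, POST for more than about 200 UIDs, and retstart/retmax batching against the History server) can be sketched in code. The following is a minimal illustration in Python, not an official client: the function names, the POST_THRESHOLD constant, and the default batch size are hypothetical, and the https form of the base URL is an assumption (the text gives the http form). The requests are only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: https variant of the base URL quoted in the text.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# The text says "more than about 200 UIDs" should use POST;
# the exact cutoff used here is our own choice.
POST_THRESHOLD = 200

# The text states that retmax may be at most 10,000.
RETMAX_LIMIT = 10000


def build_efetch_request(db, ids, retmode=None, rettype=None):
    """Build (but do not send) an EFetch request for a list of UIDs."""
    params = {"db": db, "id": ",".join(str(uid) for uid in ids)}
    if retmode is not None:
        params["retmode"] = retmode
    if rettype is not None:
        params["rettype"] = rettype
    # Keep commas literal so the id list looks like the documented form.
    query = urlencode(params, safe=",")
    if len(ids) > POST_THRESHOLD:
        # Large lists go in the request body via HTTP POST.
        return Request(EFETCH_URL, data=query.encode("ascii"), method="POST")
    return Request(EFETCH_URL + "?" + query, method="GET")


def history_batch_urls(db, query_key, webenv, total, retmax=500):
    """Yield one EFetch URL per batch of a History-server result set.

    retstart is advanced while retmax is held constant, as described
    for the retstart and retmax parameters.
    """
    if not 0 < retmax <= RETMAX_LIMIT:
        raise ValueError("retmax must be between 1 and 10,000")
    for retstart in range(0, total, retmax):
        params = {
            "db": db,
            "query_key": query_key,
            "WebEnv": webenv,
            "retstart": retstart,
            "retmax": retmax,
        }
        yield EFETCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    req = build_efetch_request("protein", [15718680, 157427902, 119703751])
    print(req.get_method(), req.full_url)
    # "EXAMPLE_WEBENV" is a placeholder, not a real Web Environment string.
    for url in history_batch_urls("pubmed", 1, "EXAMPLE_WEBENV", total=1200):
        print(url)
```

A real query_key and WebEnv would come from a previous ESearch, EPost or ELink call, as the parameter descriptions note; a built request could then be passed to urllib.request.urlopen to send it.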

Examples

PubMed

Fetch PMIDs 17284678 and 9997 as text abstracts:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=17284678,9997&retmode=text&rettype=abstract

Fetch PMIDs in XML:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=11748933,11700088&retmode=xml

PubMed Central

Fetch XML for PubMed Central ID 212403:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=212403

Nucleotide/Nuccore

Fetch the first 100 bases of the plus strand of GI 21614549 in FASTA format:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=21614549&strand=1&seq_start=1&seq_stop=100&rettype=fasta&retmode=text

Fetch the first 100 bases of the minus strand of GI 21614549 in FASTA format:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=21614549&strand=2&seq_start=1&seq_stop=100&rettype=fasta&retmode=text

Fetch the nuc-prot object for GI 21614549:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=21614549&complexity=3

Fetch the full ASN.1 record for GI 5:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=5

Fetch FASTA for GI 5:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils
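To show what might be done with the XML returned by a PubMed request such as the "Fetch PMIDs in XML" example, here is a small Python sketch that pulls the PMID, title, and abstract out of each record. It rests on several assumptions not stated in the text: the element names (PubmedArticle, MedlineCitation, PMID, Article, ArticleTitle, Abstract, AbstractText) reflect the commonly seen PubMed XML layout and should be checked against a real response, the https URL is assumed, the function names are hypothetical, and the SAMPLE document is hand-written with made-up values rather than real PubMed data.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Assumption: https variant of the base URL quoted in the text.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fetch_pubmed_xml(pmids):
    """Download PubMed records as XML bytes (performs a network call)."""
    id_list = ",".join(str(pmid) for pmid in pmids)
    url = f"{EFETCH_URL}?db=pubmed&id={id_list}&retmode=xml"
    with urlopen(url, timeout=30) as response:
        return response.read()


def parse_articles(xml_data):
    """Extract PMID, title, and abstract from PubMed XML (str or bytes).

    The element paths below are assumptions about the response layout.
    """
    root = ET.fromstring(xml_data)
    records = []
    for article in root.iter("PubmedArticle"):
        title_el = article.find("MedlineCitation/Article/ArticleTitle")
        # itertext() also collects text nested in inline markup.
        title = "".join(title_el.itertext()) if title_el is not None else None
        abstract_parts = [
            "".join(part.itertext())
            for part in article.iterfind(
                "MedlineCitation/Article/Abstract/AbstractText"
            )
        ]
        records.append(
            {
                "pmid": article.findtext("MedlineCitation/PMID"),
                "title": title,
                "abstract": " ".join(abstract_parts),
            }
        )
    return records


# Hand-written sample with made-up values, used instead of a live request.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>12345</PMID>
      <Article>
        <ArticleTitle>A made-up title</ArticleTitle>
        <Abstract>
          <AbstractText>First part.</AbstractText>
          <AbstractText>Second part.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""

if __name__ == "__main__":
    for record in parse_articles(SAMPLE):
        print(record["pmid"], record["title"], record["abstract"])
```

For live data, parse_articles(fetch_pubmed_xml([11748933, 11700088])) would apply the same parsing to the PMIDs used in the XML example above, provided the response follows the assumed layout.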
