asrch

Usage

asrch is a command-line tool used to search an optional on-site installation of Solr and extract data either in Solr response format or complete CLAIMS Direct XML. It is installed as part of the CLAIMS Direct repository. Please see the Client Tools Installation Instructions for more information about how to install this tool.

asrch [Options ...] query

  --url=s            search URL (excluding /select)
                     (default=http://solr.alexandria.com:8080/alexandria-index/alexandria)
  --raw              output raw Solr XML
  --count            output total documents found
  --maxrows=i        maximum documents to output
                     this argument is ignored when using --table
  --output=file      specify output file
  --dtdpublic=pi     Public Identifier for DTD
  --dtdsystem=si     System Identifier for DTD

  Output Options
  --------
  --archive          archive result set documents into predictable path
                     directory structure (Alexandria XML only)
  --archiveroot=dir  root directory to place result set (default=.)
  --wrapper=s        wrap multiple documents in wrapper-named element
                     default=patent-documents
  --pretty           indent output

  SOLR Options
  --------
  --solropt=s@       Solr options. e.g.,
                     --solropt=sort=f1,f2,f3 --solropt=rows=30
                     See: http://wiki.apache.org/solr/CommonQueryParameters

  DWH Options
  --------
  --pgdbname         as defined in /etc/alexandria.xml (default=alexandria)
  --dbfunc           extract UDF (default=xml.f_patent_document_s)
  --table=s          If specified, a table of UCIDs/publication_ids is created
                     -- could later be used for indexing
  --truncate         truncate --table if it currently exists

  --help             print this usage and exit

Detailed Description of the Parameters

Connectivity

Parameter

Description

pgdbname

The database entry, as configured in /etc/alexandria.xml, that points to the on-site CLAIMS Direct PostgreSQL instance. The default value is alexandria, which is pre-configured in /etc/alexandria.xml.

url

The URL of the CLAIMS Direct Solr instance, excluding the trailing /select.

Output Options

The following parameters control where and how results are output.

Parameter

Description

output

Write results to the named file. By default, output goes to stdout.

archive

Archive results in a predictable path structure. See aext.

archiveroot

The root directory of the archive. See aext.

wrapper

Name of the top-level XML element used to wrap multiple documents. The usage text lists the default as patent-documents.

pretty

Indent the output XML.

count

Only output the count of documents.

maxrows

Maximum number of documents to output. If using the --table option, this parameter is ignored.

table

If specified, a table of UCIDs/publication_ids is created.

raw

Output the raw Solr response XML instead of CLAIMS Direct XML.

Solr Options

Parameter

Description

solropt

Raw Solr query parameters. This parameter can be used multiple times, e.g.,


--solropt='sort=pd desc' --solropt='fq=pnctry:us'
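When the same Solr parameters are reused across searches, the repeated --solropt options can be kept in a small shell function. This is a minimal sketch, not part of the CLAIMS Direct tools. The function name asrch_us_newest is hypothetical, and the two Solr parameters are simply the ones from the example above.

```shell
# Hypothetical wrapper (not shipped with asrch): search US publications,
# newest first. Each Solr parameter gets its own --solropt option, and the
# single quotes keep 'sort=pd desc' together as one argument.
# Everything passed to the function (--url, --count, the query, ...) is
# forwarded to asrch unchanged, with the query last as the usage requires.
asrch_us_newest() {
  asrch \
    --solropt='sort=pd desc' \
    --solropt='fq=pnctry:us' \
    "$@"
}

# Example call (needs asrch installed and a reachable Solr instance):
#   asrch_us_newest --count 'loadid:261358'
```

Quoting each --solropt value matters because Solr parameters such as sort often contain spaces, which the shell would otherwise split into separate arguments.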

Examples

Search and Count Results

asrch --count \
  --url=http://SOLR-INSTANCE-URL/alexandria-v2.1/alexandria \
  'loadid:261358'
-> executing search ... (found 4613; done in 0.095)
4613
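In a script, the count usually needs to be separated from the progress message. The sketch below assumes the output has the shape shown above, with the count alone on the final line. The helper name count_from_output is hypothetical. Whether asrch writes the progress line to stdout or stderr is not stated here, so the helper reads only the last line and checks that it is numeric.

```shell
# Hypothetical helper: read the output of `asrch --count` on stdin and
# print the final line if it is a non-negative integer.
# Returns 1 (and prints nothing) if the final line is empty or not numeric.
count_from_output() {
  last=$(tail -n 1)
  case "$last" in
    ''|*[!0-9]*) return 1 ;;
    *) printf '%s\n' "$last" ;;
  esac
}

# Example call (needs asrch installed and a reachable Solr instance):
#   total=$(asrch --count 'loadid:261358' | count_from_output)
```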

Output Select Fields in Solr XML

The following example searches Solr and returns the results in XML format.

You can return Solr results in a variety of formats using the query parameter wt. For a detailed list of output format options, see https://cwiki.apache.org/confluence/display/solr/Response+Writers.

asrch --raw \
  --url=http://SOLR-INSTANCE-URL/alexandria-v2.1/alexandria \
  --solropt='wt=xml' \
  --solropt='fl=ucid,score' \
  --solropt='rows=1' \
  --solropt='shards.info=false' \
  'loadid:261358'
-> executing search ... 200 OK

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <bool name="zkConnected">true</bool>
    <int name="status">0</int>
    <int name="QTime">14</int>
    <lst name="params">
      <str name="q">loadid:261358</str>
      <str name="qt">premium</str>
      <str name="echoParams">all</str>
      <str name="indent">true</str>
      <str name="fl">ucid,score</str>
      <str name="shards.info">false</str>
      <str name="sort">pd desc</str>
      <str name="rows">1</str>
      <str name="wt">xml</str>
    </lst>
  </lst>
  <result name="response" numFound="4613" start="0" maxScore="9.676081">
    <doc>
      <str name="ucid">JP-2013257331-A</str>
      <float name="score">9.617687</float></doc>
  </result>
</response>
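The raw response carries the total hit count in the numFound attribute of the result element. A minimal sketch of pulling that value out with sed follows. The helper name solr_num_found is hypothetical, and the pattern assumes the attribute layout shown in the response above. For anything beyond a quick check, a real XML parser would be more robust.

```shell
# Hypothetical helper: read a raw Solr XML response on stdin and print the
# numFound attribute of the first matching <result> element.
# Prints nothing if no such attribute is found.
solr_num_found() {
  sed -n 's/.*<result[^>]*numFound="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Example call (needs asrch installed and a reachable Solr instance):
#   asrch --raw --solropt='rows=1' 'loadid:261358' | solr_num_found
```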