Using uws-client’s Python interface for scripting

The uws-client (available on PyPI) provides a Python script and library for interacting with the UWS interface of CosmoSim (and any other UWS-enabled service) from the command line. This is very useful for writing scripts, e.g. if you need to submit thousands of jobs to retrieve the data for your research.

We have already collected some basic shell scripts (bash) for CosmoSim, but this time I want to show you how to do it all in Python, using the uws-client as a library. You can also look up the necessary functions yourself: they are defined in uws/UWS/client.py and used in uws/cli/main.py.

Install the latest release of the uws-client:

pip install uws-client

Start a new interactive Python (2.7) console:

python

Now, within the Python environment, you need to import the UWS module:

from uws import UWS

Define your username and password and then use them to create a client-instance:

myusername = 'xxxx'
mypassword = 'xxxx'
url = 'https://www.cosmosim.org/uws/query/'
c = UWS.client.Client(url, myusername, mypassword)
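Typing the password into the console is fine for a quick test, but for a script that you keep around or share, you may prefer not to store it in the file. Here is a minimal sketch of one way to do that. The environment variable names COSMOSIM_USER and COSMOSIM_PASSWORD are made up for this example; the uws-client does not read them by itself.

```python
import getpass
import os


def read_credentials(env=None):
    """Return (username, password), prompting for the password if unset.

    COSMOSIM_USER and COSMOSIM_PASSWORD are hypothetical variable names
    chosen for this sketch; uws-client knows nothing about them.
    """
    if env is None:
        env = os.environ
    username = env.get('COSMOSIM_USER')
    password = env.get('COSMOSIM_PASSWORD')
    if not username:
        raise ValueError('COSMOSIM_USER is not set')
    if not password:
        # Fall back to an interactive prompt that does not echo the input.
        password = getpass.getpass('CosmoSim password: ')
    return username, password
```

With this in place, `myusername, mypassword = read_credentials()` can replace the two hard-coded assignments, and the client is created exactly as before.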

For creating a new job, first define the necessary parameters, i.e. for CosmoSim, set the query string and choose a query queue:

parameters = {'query': 'SELECT x,y,z FROM MDR1.FOF LIMIT 10', 'queue': 'long'}

Now create a job with these parameters:

job = c.new_job(parameters)

The returned job object will contain all the attributes of the job, like job_id, phase and parameters. You can check the parameters like this:

job.parameters[0].id, job.parameters[0].value
job.parameters[1].id, job.parameters[1].value
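Indexing the parameters one by one gets tedious quickly. A small helper (my own, not part of the uws-client) can collect them into a dictionary, assuming each entry of job.parameters has the id and value attributes used above:

```python
def parameters_as_dict(job):
    """Map each parameter id to its value for the given job object."""
    # dict() over a generator keeps this compatible with old Python versions.
    return dict((p.id, p.value) for p in job.parameters)
```

Calling `parameters_as_dict(job)` should then give you something you can look up by name, e.g. with the keys 'query' and 'queue' for the job created above.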

Currently, the job is still in the PENDING phase, which can be checked using the ‘get_phase’ function:

c.get_phase(job.job_id)

As long as a job is in this phase, its parameters can still be adjusted: define the new parameters and pass them to the set_parameters_job function:

parameters = {'query': 'SELECT Mvir FROM MDR1.BDMV LIMIT 10', 'queue': 'long'}
job = c.set_parameters_job(job.job_id, parameters)

Start the job now:

job = c.run_job(job.job_id)
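If you need to send many queries, the two steps of creating and starting a job can be wrapped in a loop. The following is only a sketch: submit_queries is a helper name I made up, and it assumes a client object offering new_job and run_job as used above.

```python
def submit_queries(client, queries, queue='long'):
    """Create and start one job per query string; return the job ids."""
    job_ids = []
    for query in queries:
        # Same two calls as in the single-job walkthrough.
        job = client.new_job({'query': query, 'queue': queue})
        job = client.run_job(job.job_id)
        job_ids.append(job.job_id)
    return job_ids
```

For example, `submit_queries(c, ['SELECT x,y,z FROM MDR1.FOF LIMIT 10'])` would start one job and return a list with its job_id. Keeping the ids around makes it easy to check on the jobs later.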

Check the current phase:

job.phase

This will show ‘QUEUED’, because the job has been submitted to the queueing system. Note that job.phase only reflects the state at the time of the last request.
To get the current phase, send another request to the server:

c.get_phase(job.job_id)

By the way, there is a print-function defined for the job object, so you can get a nice job detail view using this:

job = c.get_job(job.job_id)
print job

You can check the phase of the job regularly to find out when it has completed, or you can pass the WAIT parameter with your get_job request and provide a wait time in seconds. The request will then return only when the specified wait time is over or the job phase has changed:

job = c.get_job(job.job_id, wait='30')

If you add a phase as well, the request will only wait if the job is currently in the specified phase. Please note that only “active” job phases (those where a phase change is still possible, i.e. PENDING, QUEUED or EXECUTING) make sense here.

job = c.get_job(job.job_id, wait='30', phase='QUEUED')
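In a script you usually want to repeat such blocking requests until the job has left the active phases. Here is a rough sketch of such a loop. It assumes that job.phase is a plain string such as 'QUEUED', as shown above, and that the client accepts the wait and phase keywords; wait_until_finished and max_requests are names I invented for the example.

```python
ACTIVE_PHASES = ('PENDING', 'QUEUED', 'EXECUTING')


def wait_until_finished(client, job_id, wait='30', max_requests=120):
    """Return the job once it is no longer in an active phase.

    max_requests is a safety limit for this sketch, so that a stuck job
    cannot keep the script waiting forever.
    """
    job = client.get_job(job_id)
    for _ in range(max_requests):
        if job.phase not in ACTIVE_PHASES:
            return job
        # The server holds this request until the phase changes or `wait`
        # seconds have passed, so the loop does not flood the service.
        job = client.get_job(job_id, wait=wait, phase=job.phase)
    raise RuntimeError('job %s is still in phase %s' % (job_id, job.phase))
```

A call like `job = wait_until_finished(c, job.job_id)` would then return the job in whatever final phase it reached, so it is still worth checking job.phase afterwards before downloading anything.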

When your job has completed, you probably want to download the results.
First check which results are available, either by printing the job with print job or by inspecting the job.results object:

len(job.results)
job.results[0].id

For CosmoSim (and other Daiquiri-instances), there is one result for each format, each with a different id. The first result usually has id ‘csv’, which we will be using in this example.
In order to download this result, do:

url = str(job.results[0].reference)
c.connection.download_file(url, myusername, mypassword, 'results.csv')
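Relying on the csv result being at index 0 works here, but looking the result up by its id is a bit more robust. A minimal sketch, assuming each entry of job.results has the id and reference attributes used above (find_result is my own helper name):

```python
def find_result(job, result_id='csv'):
    """Return the download URL of the result with this id, or None."""
    for result in job.results:
        if result.id == result_id:
            # str() mirrors the conversion used in the download example.
            return str(result.reference)
    return None
```

The returned URL can be passed to c.connection.download_file just like before. If the function returns None, the job has no result with that id, for example because it did not complete successfully.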

You have probably wondered already how to get an overview of all the jobs that you have created. Of course there is a function for getting the complete job list as well!

jobs = c.get_job_list({})

You can get the job_ids from the job list using:

jobid = jobs.job_reference[0].id

This jobid can be used to query further details for the job in an extra request as before:

job = c.get_job(jobid)
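The same two calls can be combined to fetch the details for every job in the list. This is just a sketch with a helper name of my own (fetch_job_details); it sends one extra request per job, so for a long job list it will take a while.

```python
def fetch_job_details(client, filters=None):
    """Return a dict mapping each job id in the job list to its job object."""
    jobs = client.get_job_list(filters or {})
    details = {}
    for ref in jobs.job_reference:
        # One request per job, as in the single-job example.
        details[ref.id] = client.get_job(ref.id)
    return details
```

Calling `fetch_job_details(c)` would give you all your jobs keyed by job id.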

You can also use print for the job list to get all jobs with basic information printed on screen:

print jobs

Note that you can also use the new UWS 1.1 job list filtering capabilities here, i.e. you can filter by creation time (jobs created AFTER a certain timestamp), select only the last xx jobs or filter by phase:

jobs = c.get_job_list({'after': '2016-11-09'})
jobs = c.get_job_list({'last': '10'})
jobs = c.get_job_list({'phases': ['PENDING', 'ERROR']})

Please try it out and report back any issues that you may encounter. Have fun!
