102.2. Catalog queries with TAP#
102.2. Catalog queries with TAP¶
For the Rubin Science Platform at data.lsst.cloud.
Data Release: Data Preview 1
Container Size: large
LSST Science Pipelines version: r29.2.0
Last verified to run: 2025-11-10
Repository: github.com/lsst/tutorial-notebooks
DOI: 10.11578/rubin/dc.20250909.20
Learning objective: How to execute complex ADQL TAP queries on catalogs.
LSST data products: Object table
Packages: lsst.rsp.get_tap_service
Credit: Originally developed by the Rubin Community Science team. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.
Get Support: Everyone is encouraged to ask questions or raise issues in the Support Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
TAP provides standardized access to catalog data for discovery, search, and retrieval. Full documentation for TAP is provided by the International Virtual Observatory Alliance (IVOA).
ADQL (Astronomy Data Query Language) is similar to SQL (Structured Query Langage). The documentation for ADQL includes more information about syntax and keywords. Not all ADQL functionality is supported yet in the preview-era RSP.
This tutorial demonstrates options for manipulating and constraining columns and formatting the retrieved data sets with ADQL.
Related tutorials: The other 100-level tutorials in this series demonstrate the TAP basics. The 200-level tutorials describe the contents of the catalog data and have example queries for each.
import numpy as np
from lsst.rsp import get_tap_service
1.2. Define parameters¶
Instantiate the TAP service.
service = get_tap_service("tap")
assert service is not None
2. Manipulate columns¶
Start with a basic query that would execute a cone search on the Object table for objects within 0.01 degrees of coordinates near the center of the ECDFS field, RA, Dec = 53, -28 degrees.
query = """SELECT coord_ra, coord_dec
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1"""
print(query)
SELECT coord_ra, coord_dec
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
2.1. Rename columns¶
Columns can be renamed using AS in the query.
Rename coord_ra as ra, and coord_dec as dec.
query = """SELECT coord_ra AS ra, coord_dec AS dec
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1"""
print(query)
SELECT coord_ra AS ra, coord_dec AS dec
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
Execute the query.
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
Job phase is COMPLETED
Show that the columns were renamed.
results.colnames
['ra', 'dec']
job.delete()
del query, results
2.2. Apply ADQL functions¶
A variety of mathematical, trigonometrical, and geometrical functions can be applied to columns, with the results stored as columns in the returned array. See the ADQL documentation for the full list.
Use the same query as above but convert the coordinates from degrees to radians,
and rename the columns ra_radians and dec_radians.
Also retrieve the 2D sky distance between each object and the search coordinates as distance.
query = """SELECT RADIANS(coord_ra) AS ra_rad,
RADIANS(coord_dec) AS dec_rad,
DISTANCE(POINT('ICRS', coord_ra, coord_dec),
POINT('ICRS', 53, -28)) AS distance
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1"""
print(query)
SELECT RADIANS(coord_ra) AS ra_rad,
RADIANS(coord_dec) AS dec_rad,
DISTANCE(POINT('ICRS', coord_ra, coord_dec),
POINT('ICRS', 53, -28)) AS distance
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
Job phase is COMPLETED
Display only the first row of the table.
results[0]
| ra_rad | dec_rad | distance |
|---|---|---|
| float64 | float64 | float64 |
| 0.9250236455912983 | -0.488863082199624 | 0.009791465987706714 |
job.delete()
del query, results
2.3. Convert fluxes to magnitudes¶
Photometric measurements for LSST objects and sources are stored in the catalogs as fluxes with units of nanoJanskys (nJy).
The conversion from nJy to AB magnitude is $m_{\rm AB} = -2.5 \log(f_{\rm nJy}) + 31.4$.
Warning: Difference-image fluxes and forced photometry fluxes can be negative; take care with magnitude conversions.
Functions exist to return fluxes (and errors) as AB magnitudes:
scisql_nanojanskyToAbMagscisql_nanojanskyToAbMagSigma
The Object catalog has magnitude columns corresponding to the PSF and cModel fluxes, but other columns (and other tables) have only fluxes, not magnitudes.
The query below returns the coordinates, the g- and r-band cModel magnitudes, and converts the Sersic fluxes to AB magnitudes and retrieves them.
query = """SELECT coord_ra, coord_dec,
g_cModelMag, g_cModelMagErr,
r_cModelMag, r_cModelMagErr,
scisql_nanojanskyToAbMag(g_sersicFlux) AS g_sersicMag,
scisql_nanojanskyToAbMagSigma(g_sersicFlux, g_sersicFluxErr) AS g_sersicMagErr,
scisql_nanojanskyToAbMag(r_sersicFlux) AS r_sersicMag,
scisql_nanojanskyToAbMagSigma(r_sersicFlux, r_sersicFluxErr) AS r_sersicMagErr
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1"""
print(query)
SELECT coord_ra, coord_dec,
g_cModelMag, g_cModelMagErr,
r_cModelMag, r_cModelMagErr,
scisql_nanojanskyToAbMag(g_sersicFlux) AS g_sersicMag,
scisql_nanojanskyToAbMagSigma(g_sersicFlux, g_sersicFluxErr) AS g_sersicMagErr,
scisql_nanojanskyToAbMag(r_sersicFlux) AS r_sersicMag,
scisql_nanojanskyToAbMagSigma(r_sersicFlux, r_sersicFluxErr) AS r_sersicMagErr
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
Job phase is COMPLETED
results
| coord_ra | coord_dec | g_cModelMag | g_cModelMagErr | r_cModelMag | r_cModelMagErr | g_sersicMag | g_sersicMagErr | r_sersicMag | r_sersicMagErr |
|---|---|---|---|---|---|---|---|---|---|
| deg | deg | mag | mag | mag | mag | ||||
| float64 | float64 | float32 | float32 | float32 | float32 | float64 | float64 | float64 | float64 |
| 52.999950842186635 | -28.009791369795494 | 25.3058 | 0.0408128 | 24.8951 | 0.037891 | 25.289549452182214 | 0.040237818711488176 | 24.892949378203916 | 0.03766112794964982 |
| 53.000647668093336 | -28.00041790185584 | 25.0412 | 0.0298784 | 24.4576 | 0.0226485 | 24.866171833290277 | 0.0292662574085082 | 24.292082739141648 | 0.022401841938212753 |
| 52.99865025242912 | -28.007343580438622 | 24.9923 | 0.0309844 | 24.7067 | 0.0317504 | 24.90763455079954 | 0.030462383351975278 | 24.630331766822504 | 0.03144153073250518 |
| 53.00440835952543 | -28.00706527664946 | 25.615 | 0.0891581 | 24.0388 | 0.021853 | 25.824036556658854 | 0.08788197304022516 | 23.975996228411077 | 0.021711488954269238 |
| 52.99728049236805 | -28.004763783217346 | 24.1276 | 0.0187086 | 23.7754 | 0.020053 | 23.905462678258765 | 0.01842487813911017 | 23.66474104374684 | 0.019896250251999065 |
| 52.998611316150786 | -28.007966170045258 | 24.6015 | 0.0201114 | 24.1979 | 0.0186312 | 24.604628401562163 | 0.019900164058944833 | 24.217741365529697 | 0.01851201132941137 |
| 52.99610446568829 | -28.00392847688764 | 25.7216 | 0.0553019 | 24.699 | 0.028772 | 25.51758445383836 | 0.05332594940368145 | 24.509594032041257 | 0.028202493409365273 |
| 52.994574524381925 | -27.997372314676916 | 24.6329 | 0.0226044 | 23.8767 | 0.0155296 | 24.588047465995317 | 0.022427202238376947 | 23.902507096802537 | 0.015430592488508811 |
| 52.998335112826375 | -28.003816346850325 | 24.2705 | 0.0164768 | 24.1018 | 0.0186653 | 24.28554955290755 | 0.016401610324314392 | 24.121507128684378 | 0.018583906770695945 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 53.00766969633569 | -27.993399513118437 | 20.4858 | 0.000620706 | 19.3215 | 0.000333081 | 20.488225674135386 | 0.0005964093735726239 | 19.327588879321173 | 0.000313522792252015 |
| 53.00642044578477 | -27.994667920805927 | 24.3085 | 0.0337899 | 24.1698 | 0.0395691 | 23.392352928047327 | 0.0332268011886009 | 23.19645761026413 | 0.03703635296876196 |
| 52.996802053288405 | -27.9912332351428 | 28.5065 | 0.677682 | 32.7069 | 42.3778 | 28.228616805619353 | 0.7439242460744452 | 34.46578974634167 | 299.92762698447825 |
| 53.0031279692871 | -27.99123590474861 | 29.5039 | 1.75463 | 29.4607 | 3.12418 | 31.701939724362447 | 25.793940395325283 | 31.092177473081613 | 19.884466632324333 |
| 53.003139507715005 | -27.992142682076214 | 29.3874 | 1.50775 | 28.666 | 1.00012 | 27.92504600267281 | 0.6554646047782284 | 27.728928845537737 | 0.7193995074034013 |
| 53.00136705577273 | -27.990401541457896 | 26.7056 | 0.150559 | 26.6686 | 0.180899 | 26.253211444495825 | 0.15207798087462815 | 26.144998117227367 | 0.18148483107217858 |
| 53.001939897066684 | -27.990603538058334 | 25.7765 | 0.07207 | 25.7781 | 0.0942976 | 25.506527466972184 | 0.06893346762410889 | 25.482099994870694 | 0.08844322213745942 |
| 53.00036920322047 | -27.991722047885638 | 27.4609 | 0.267638 | 26.5169 | 0.143943 | 26.846490281567256 | 0.23935535511200468 | 26.0771889088126 | 0.15246200938368937 |
| 53.00070612505027 | -27.99198924376921 | 26.9852 | 0.227693 | 26.5283 | 0.182682 | 26.740069361434774 | 0.244807886464271 | 26.109630148347108 | 0.17823406636097816 |
job.delete()
del query, results
2.4. Derive new columns¶
Retrieve "derived columns" that are the difference of two columns, in this case the cModel magnitudes of adjacent filters to obtain colors.
query = """SELECT coord_ra, coord_dec,
u_cModelMag - g_cModelMag AS ug_clr,
g_cModelMag - r_cModelMag AS gr_clr,
r_cModelMag - i_cModelMag AS ri_clr,
i_cModelMag - z_cModelMag AS iz_clr,
z_cModelMag - y_cModelMag AS zy_clr
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1"""
print(query)
SELECT coord_ra, coord_dec,
u_cModelMag - g_cModelMag AS ug_clr,
g_cModelMag - r_cModelMag AS gr_clr,
r_cModelMag - i_cModelMag AS ri_clr,
i_cModelMag - z_cModelMag AS iz_clr,
z_cModelMag - y_cModelMag AS zy_clr
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
len(results)
Job phase is COMPLETED
141
print(results.colnames)
['coord_ra', 'coord_dec', 'ug_clr', 'gr_clr', 'ri_clr', 'iz_clr', 'zy_clr']
job.delete()
del query, results
3. Constrain column values¶
3.1. Apply relational operators¶
In the queries above, only the = operator is used,
but others are available, e.g., !=, <, >, <=, and >=.
Include query constraints on the magnitudes to return only objects that are brighter than 25th mag in all filters.
query = """SELECT coord_ra, coord_dec,
u_cModelMag - g_cModelMag AS ug_clr,
g_cModelMag - r_cModelMag AS gr_clr,
r_cModelMag - i_cModelMag AS ri_clr,
i_cModelMag - z_cModelMag AS iz_clr,
z_cModelMag - y_cModelMag AS zy_clr
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
AND u_cModelMag < 25 AND g_cModelMag < 25
AND r_cModelMag < 25 AND i_cModelMag < 25
AND z_cModelMag < 25 AND y_cModelMag < 25"""
print(query)
SELECT coord_ra, coord_dec,
u_cModelMag - g_cModelMag AS ug_clr,
g_cModelMag - r_cModelMag AS gr_clr,
r_cModelMag - i_cModelMag AS ri_clr,
i_cModelMag - z_cModelMag AS iz_clr,
z_cModelMag - y_cModelMag AS zy_clr
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
AND u_cModelMag < 25 AND g_cModelMag < 25
AND r_cModelMag < 25 AND i_cModelMag < 25
AND z_cModelMag < 25 AND y_cModelMag < 25
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
Job phase is COMPLETED
As an example, print the minimum and maximum values of the derived columns.
print(np.min(results['ug_clr']), np.max(results['ug_clr']))
print(np.min(results['gr_clr']), np.max(results['gr_clr']))
print(np.min(results['ri_clr']), np.max(results['ri_clr']))
print(np.min(results['iz_clr']), np.max(results['iz_clr']))
print(np.min(results['zy_clr']), np.max(results['zy_clr']))
-0.22574234008789062 2.437074661254883 0.03248786926269531 1.5774612426757812 -0.13392066955566406 0.775299072265625 -0.0517425537109375 0.5262928009033203 -0.6113967895507812 0.5244102478027344
job.delete()
del query, results
3.2. Apply operators to functions¶
Only the original column names can be used in a constraint.
For example, this attempt to create a column named g_sersicMag and then also apply a constraint to it will return an error.
SELECT coord_ra, coord_dec,
scisql_nanojanskyToAbMag(g_sersicFlux) AS g_sersicMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
AND g_sersicMag < 25
The correct way to make this query is to use the original column name in the WHERE statement.
query = """SELECT coord_ra, coord_dec,
scisql_nanojanskyToAbMag(g_sersicFlux) AS g_sersicMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
AND scisql_nanojanskyToAbMag(g_sersicFlux) < 25"""
print(query)
SELECT coord_ra, coord_dec,
scisql_nanojanskyToAbMag(g_sersicFlux) AS g_sersicMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
AND scisql_nanojanskyToAbMag(g_sersicFlux) < 25
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
print(len(results))
Job phase is COMPLETED 44
job.delete()
del query, results
3.3. Use WHERE ... BETWEEN ... AND¶
A constraint that one column's value is between two other columns' values can be applied with a statement like WHERE a BETWEEN b AND c.
Order matters!
WHERE a BETWEEN b AND cwill return all rows for which $b < a < c$.WHERE a BETWEEN c AND bwill return all rows for which $c < a < b$.
For example, return all objects within the cone search region for which the PSF magnitudes are $g < r < i$. In other words, objects that are brighter in g-band than r-band, and brighter in r-band than i-band (i.e., blue objects).
query = """SELECT objectId, g_psfMag, r_psfMag, i_psfMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
AND r_psfMag BETWEEN g_psfMag AND i_psfMag"""
print(query)
SELECT objectId, g_psfMag, r_psfMag, i_psfMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.01)) = 1
AND r_psfMag BETWEEN g_psfMag AND i_psfMag
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
print(len(results))
Job phase is COMPLETED 3
for i in range(len(results)):
print(results['objectId'][i],
results['g_psfMag'][i],
results['r_psfMag'][i],
results['i_psfMag'][i])
611255141361789816 26.2232 26.2289 26.2984 611255141361790690 26.64 26.7169 27.3359 611255141361791996 28.4808 31.817 33.0504
job.delete()
del query, results
3.4. Use WHERE ... IN ()¶
The constraint option of WHERE <col1> IN (value1, value2, value3) is a simpler way of doing, e.g., WHERE <col1> = value1 OR <col1> = value2 OR <col1 = value3.
A scenario for using WHERE ... IN () is when the list of objectId are known. A list of up to ~50,000 identifiers can be passed.
Define id_list as a string, formatted as a tuple, of the objectId of the three blue objects identified in the previous section.
id_array = np.asarray((611255141361789816, 611255141361790690, 611255141361791996),
dtype='long')
id_list = "(" + ", ".join(str(value) for value in id_array) + ")"
print(id_list)
(611255141361789816, 611255141361790690, 611255141361791996)
Create a query to return the coordinates of the three objects.
query = """SELECT objectId, coord_ra, coord_dec
FROM dp1.Object
WHERE objectId IN {}""".format(id_list)
print(query)
SELECT objectId, coord_ra, coord_dec
FROM dp1.Object
WHERE objectId IN (611255141361789816, 611255141361790690, 611255141361791996)
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
Job phase is COMPLETED
Display the results.
results
| objectId | coord_ra | coord_dec |
|---|---|---|
| deg | deg | |
| int64 | float64 | float64 |
| 611255141361789816 | 53.005323357996005 | -28.00468291953882 |
| 611255141361790690 | 52.99438633549297 | -27.99473439509563 |
| 611255141361791996 | 52.996802053288405 | -27.9912332351428 |
job.delete()
del query, results
4. Manipulate the results¶
4.1. Use caution with TOP and ORDER BY¶
Warning: The TAP service first applies
WHEREconstraints, thenORDER BY, and thenTOP. If the query is not well constrained, and many more rows meet theWHEREconstraints than theNreturned byTOP, then they all are sorted, first, before only the topNare returned. This is not an efficient use of the shared TAP service.
It is recommended to only combine TOP and ORDER BY if the query's WHERE statements significantly cut down the number of objects that would need to be sorted.
4.2. Limit rows¶
Very small spatial regions are the recommended way to restrict the number of rows returned, but TOP will also work.
To only retrieve a subset of the rows which meet the query constraints, add TOP N after SELECT to return only the top N number of rows. Including maxrec in the job definition will also work, but use of TOP is recommended over maxrec.
query = """SELECT TOP 10 coord_ra, coord_dec
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.1)) = 1"""
print(query)
SELECT TOP 10 coord_ra, coord_dec
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.1)) = 1
Execute the query.
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Fetch the results and confirm that only 10 rows are returned.
assert job.phase == 'COMPLETED'
results = job.fetch_result()
assert len(results) == 10
print(len(results))
10
Clean up.
job.delete()
del query, results
4.3. Sort table¶
4.3.1. Sort with astropy¶
Retrieve the query results as an astropy table and use the sort method.
Create a query that is similar to the one in Section 4.1, but does not use TOP and only returns rows with g_cModelMag brighter than 20th mag.
query = """SELECT coord_ra, coord_dec, g_cModelMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.1)) = 1
AND g_cModelMag < 20"""
print(query)
SELECT coord_ra, coord_dec, g_cModelMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.1)) = 1
AND g_cModelMag < 20
Submit the query job.
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Retrieve the results as an astropy formatted table with to_table(), in order to use the sort method.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
Sort the astropy table by g_cModelMag in descending order (i.e., use reverse=True; the default is False, for ascending order).
It is also possible to sort by multiple columns with, e.g., results.sort(['column_1', 'column_2']).
results.sort('g_cModelMag', reverse=True)
Display the results, now sorted by $g$-band magnitude, with the faintest object in the first row.
results
| coord_ra | coord_dec | g_cModelMag |
|---|---|---|
| deg | deg | mag |
| float64 | float64 | float32 |
| 52.917598505724484 | -28.036759972159413 | 19.9929 |
| 53.04775482029748 | -27.975392984903166 | 19.9504 |
| 53.03584306939495 | -27.928161395037968 | 19.9287 |
| 53.03903906240283 | -27.959458794774264 | 19.9 |
| 52.96663897778455 | -27.963653398451722 | 19.874 |
| 53.00918641402325 | -28.044180804131248 | 19.7971 |
| 53.01829138799236 | -28.075509632285595 | 19.7953 |
| 53.07382332409231 | -28.06505463389963 | 19.752 |
| 53.09918248237757 | -27.979397996279193 | 19.6067 |
| ... | ... | ... |
| 52.91466707323275 | -28.027825200964536 | 17.1495 |
| 52.97177339646181 | -28.049185382350746 | 17.018 |
| 53.01835078638047 | -28.085549256632248 | 17.0006 |
| 53.069949903324094 | -28.039568580067808 | 16.9427 |
| 53.09326022159075 | -28.018392693608014 | 16.6915 |
| 53.08453522823213 | -28.026264129464558 | 16.2093 |
| 52.92140884571618 | -28.052060061799228 | 16.1101 |
| 52.956019116141974 | -27.952755305471864 | 15.8976 |
| 52.92022761953359 | -28.010189784283277 | 15.6904 |
Clean up.
job.delete()
del query, results
4.3.2. Sort with ORDER BY¶
Add an ORDER BY statement to the TAP query to sort the results in ascending (ASC) or descending (DESC) order.
Warning: Adding
ORDER BYstatements to TAP queries requires more processing than unordered queries, and consumes more of the shared resource that is the TAP service. Sorting withastropyis recommended for this reason (Section 4.3.1).
Order the results by descending g_cModelMag values.
query = """SELECT coord_ra, coord_dec, g_cModelMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.1)) = 1
AND g_cModelMag < 20
ORDER BY g_cModelMag DESC"""
print(query)
SELECT coord_ra, coord_dec, g_cModelMag
FROM dp1.Object
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),
CIRCLE('ICRS', 53, -28, 0.1)) = 1
AND g_cModelMag < 20
ORDER BY g_cModelMag DESC
Submit the query job.
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Return the results.
assert job.phase == 'COMPLETED'
results = job.fetch_result()
Display the results, which are already sorted by $g$-band magnitude from faint to bright.
results
<DALResultsTable length=51>
coord_ra coord_dec g_cModelMag
deg deg mag
float64 float64 float32
------------------ ------------------- -----------
52.917598505724484 -28.036759972159413 19.9929
53.04775482029748 -27.975392984903166 19.9504
53.03584306939495 -27.928161395037968 19.9287
53.03903906240283 -27.959458794774264 19.9
52.96663897778455 -27.963653398451722 19.874
53.00918641402325 -28.044180804131248 19.7971
53.01829138799236 -28.075509632285595 19.7953
53.07382332409231 -28.06505463389963 19.752
53.09918248237757 -27.979397996279193 19.6067
... ... ...
52.91466707323275 -28.027825200964536 17.1495
52.97177339646181 -28.049185382350746 17.018
53.01835078638047 -28.085549256632248 17.0006
53.069949903324094 -28.039568580067808 16.9427
53.09326022159075 -28.018392693608014 16.6915
53.08453522823213 -28.026264129464558 16.2093
52.92140884571618 -28.052060061799228 16.1101
52.956019116141974 -27.952755305471864 15.8976
52.92022761953359 -28.010189784283277 15.6904
Clean up.
job.delete()
del query, results
5. Aggregate rows¶
The SQL and ADQL aggregate functions are: COUNT, SUM, AVG, MIN, and MAX.
Use these to return the statistics of the column values for a query, instead of the values themselves.
Define a query to return the number (COUNT) of $r$-band images obtained for the ECDFS field, and also the Modified Julian Date (MJD) of the first (MIN) and last (MAX) image.
query = """SELECT COUNT(expMidptMJD) AS n_visits,
MIN(expMidptMJD) AS min_mjd, MAX(expMidptMJD) AS max_mjd
FROM dp1.Visit
WHERE CONTAINS(POINT('ICRS', ra, dec),
CIRCLE('ICRS', 53.13, -28.10, 3)) = 1
AND band = 'r'"""
print(query)
SELECT COUNT(expMidptMJD) AS n_visits,
MIN(expMidptMJD) AS min_mjd, MAX(expMidptMJD) AS max_mjd
FROM dp1.Visit
WHERE CONTAINS(POINT('ICRS', ra, dec),
CIRCLE('ICRS', 53.13, -28.10, 3)) = 1
AND band = 'r'
Submit the query job.
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Return the results.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
Display the results.
results
| n_visits | min_mjd | max_mjd |
|---|---|---|
| int64 | float64 | float64 |
| 237 | 60623.25932903928 | 60655.24643617473 |
Clean up.
job.delete()
del query, results
5.1. Use GROUP BY¶
To apply the ADQL / SQL aggregate functions to groups of rows instead of the full table, use GROUP BY.
Define a query to return the visit, the MJD, the band, and the number of detected sources per visit by joining the Visit and Source tables, and using GROUP BY on the visit. Constrain the query to only $r$-band images obtained within a $\sim30$ minute window between MJD 60623.250 and 60623.270.
query = """SELECT v.visit, v.expMidptMJD, v.band,
COUNT(s.sourceId) AS n_sources
FROM dp1.Visit AS v
JOIN dp1.Source AS s ON s.visit = v.visit
WHERE CONTAINS(POINT('ICRS', v.ra, v.dec),
CIRCLE('ICRS', 53.13, -28.10, 3)) = 1
AND v.band = 'r'
AND v.expMidptMJD > 60623.250
AND v.expMidptMJD < 60623.270
GROUP BY v.visit"""
print(query)
SELECT v.visit, v.expMidptMJD, v.band,
COUNT(s.sourceId) AS n_sources
FROM dp1.Visit AS v
JOIN dp1.Source AS s ON s.visit = v.visit
WHERE CONTAINS(POINT('ICRS', v.ra, v.dec),
CIRCLE('ICRS', 53.13, -28.10, 3)) = 1
AND v.band = 'r'
AND v.expMidptMJD > 60623.250
AND v.expMidptMJD < 60623.270
GROUP BY v.visit
Submit the job.
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Retrieve the results.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
Display the results.
results
| visit | expMidptMJD | band | n_sources |
|---|---|---|---|
| d | |||
| int64 | float64 | object | int64 |
| 2024110800246 | 60623.25932903928 | r | 27708 |
| 2024110800247 | 60623.25989537033 | r | 27446 |
| 2024110800250 | 60623.262127748836 | r | 27834 |
| 2024110800251 | 60623.2626942534 | r | 27474 |
| 2024110800254 | 60623.2648977835 | r | 28241 |
| 2024110800255 | 60623.265465752265 | r | 27057 |
| 2024110800258 | 60623.2676935763 | r | 26917 |
| 2024110800259 | 60623.268262389945 | r | 27709 |
Clean up.
job.delete()
del query, results