306.3. Cross match queries¶
306.3. Cross-match by coordinate¶
For the Rubin Science Platform at data.lsst.cloud.
Data Release: Data Preview 1
Container Size: large
LSST Science Pipelines version: Release r29.1.1
Last verified to run: 2025-07-16
Repository: github.com/lsst/tutorial-notebooks
Learning objective: An efficient way to execute queries to check if lightcurves exist for a set of coordinates.
LSST data products: DiaObject
, ForcedSourceOnDiaObject
Packages: lsst.rsp
,lsst.afw
Credit: Originally developed by the Rubin Community Science team. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.
Get Support: Everyone is encouraged to ask questions or raise issues in the Support Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
This tutorial demonstrates how to make efficient queries for the common question of "does LSST have light curves at these sky coordinates?"
Avoid: querying the forced photometry tables by coordinate.
It is recommended that the fluxes in the forced photometry table, ForcedSourceOnDiaObject
, be used to generate light curves.
However, queries by coordinate on the ForcedSourceOnDiaObject
table are very inefficient and should be avoided.
This is because the ForcedSourceOnDiaObject
table contains forced flux measurements made on every difference and direct image, at the location of every DiaObject
.
It is much more efficient to do the spatial cross-match with DiaObject
once and then retrieve forced fluxes for successful matches based on their IDs, rather than spatially cross-matching to the tens or hundreds (and in the future, thousands) of visits.
The same advice applies to the ForcedSource
and Object
tables.
Avoid: putting asynchronous queries into for
loops.
Although it is possible loop over a set of RA and Dec, and make a spatial cross-match query for each coordinate, this should be avoided. In the future users will be rate-limited on TAP queries and unable to have many queries executing simultaneously.
Correct: upload a table of coordinates and cross-match using a TAP table join.
This notebook demonstrates how to upload a table of coordinates and use the TAP service to cross-match
to the DiaObject
table and then retrieve light curves for successful cross-matches.
Related tutorials: The 200-level tutorials on the DiaObject
and ForcedSourceOnDiaObject
catalogs. The 100-level tutorials on catalog access with TAP and ADQL query statements.
1.1. Import packages¶
Import modules from matplotlib
, numpy
, and astropy
.
From the lsst
package, import the module get_tap_service
for accessing the Table Access Protocol (TAP) service.
import matplotlib.pyplot as plt
import numpy as np
from astropy.table import Table
from lsst.rsp import get_tap_service
from lsst.utils.plotting import (get_multiband_plot_colors,
get_multiband_plot_symbols)
1.2. Define parameters and functions¶
Create an instance of the TAP service, and assert that it exists.
service = get_tap_service("tap")
assert service is not None
Define filter names, plot markers, linestyles, and colors for plotting
filter_names = ['u', 'g', 'r', 'i', 'z', 'y']
filter_colors = get_multiband_plot_colors()
filter_symbols = get_multiband_plot_symbols()
2. Load a coordinate list¶
This pre-made demo list includes the DiaObjects
identified in another tutorial in this series as being potential supernovae,
and stars which may or may not be variable, all in the Extended Chandra Deep Field South (ECDFS).
The central coordinates of ECDFS are RA, Dec = 53.13, -28.10 degrees.
ra_cen = 53.13
dec_cen = -28.10
Read in the targets as an Astropy
Table
.
path = '/rubin/cst_repos/tutorial-notebooks-data/data/'
user_table = Table.read(path + 'dp1_306_3_targets.dat', format='ascii.csv')
assert len(user_table) == 32
Option to view the table.
# user_table
Visualize the target coordinates.
Targets have been "named" with integers that represent their origin:
- 100s: known
DiaObjects
- 200s: known
Objects
(random stars) - 300s: random coordinates
fig = plt.figure(figsize=(4, 4))
plt.plot(ra_cen, dec_cen, '*', ms=10, mew=0, color='black', label='ECDFS center')
tx1 = np.where(user_table['name'] < 200)[0]
tx2 = np.where((user_table['name'] >= 200) & (user_table['name'] < 300))[0]
tx3 = np.where(user_table['name'] >= 300)[0]
plt.plot(user_table['ra'][tx1], user_table['dec'][tx1], 'o', ms=5, mew=0, alpha=0.7,
color='grey', label='DiaObject')
plt.plot(user_table['ra'][tx2], user_table['dec'][tx2], 's', ms=5, mew=0, alpha=0.7,
color='dodgerblue', label='Object')
plt.plot(user_table['ra'][tx3], user_table['dec'][tx3], '^', ms=6, mew=0, alpha=0.7,
color='darkorange', label='random')
plt.legend(loc='best', handletextpad=0)
plt.gca().invert_xaxis()
plt.xlabel('RA')
plt.ylabel('Dec')
plt.show()
Figure 1: The sky locations of the coordinates for this demonstration.
3. Query in two steps¶
3.1. Cross-match to diaObject¶
Upload the table to the TAP service and join it with the DiaObject
table.
Use a spatial cross-match radius of 0.5 arcsec (0.00014 deg), and impose the constraint that only matched DiaObjects
that were detected more than 10 times are returned (nDiaSources
> 10).
Also return the DISTANCE()
, the radial distance in degrees, between the target and the DiaObject
.
In this scenario, all matches to the DiaObject
table are exact matches to known DiaObjects
, so
all distances will be 0.0, but in other scenarios it would be informative.
query = """
SELECT diao.diaObjectId, diao.ra, diao.dec, diao.nDiaSources,
ut.name AS ut_name, ut.ra AS ut_ra, ut.dec AS ut_dec,
DISTANCE(POINT('ICRS', diao.ra, diao.dec), POINT('ICRS', ut.ra, ut.dec)) AS distance
FROM dp1.DiaObject AS diao
JOIN TAP_UPLOAD.ut AS ut
WHERE CONTAINS(POINT('ICRS', diao.ra, diao.dec),
CIRCLE('ICRS', ut.ra, ut.dec, 0.00014))=1
AND diao.nDiaSources > 10
"""
job = service.submit_job(query, uploads={"ut": user_table})
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
assert job.phase == 'COMPLETED'
dia_objects = job.fetch_result().to_table()
Show the table of the 13 matched DiaObjects
.
assert len(dia_objects) == 13
print(len(dia_objects))
dia_objects
13
diaObjectId | ra | dec | nDiaSources | ut_name | ut_ra | ut_dec | distance |
---|---|---|---|---|---|---|---|
deg | deg | ||||||
int64 | float64 | float64 | int64 | int64 | float64 | float64 | float64 |
611255278800732178 | 52.69695529429179 | -27.988292690356108 | 78 | 108 | 52.69695529429179 | -27.988292690356108 | 0.0 |
609788942606139423 | 53.057543091666545 | -28.470666917705937 | 30 | 107 | 53.057543091666545 | -28.470666917705937 | 0.0 |
609789423642476567 | 53.59070920538382 | -28.337607282878942 | 38 | 109 | 53.59070920538382 | -28.337607282878942 | 0.0 |
609788805167185927 | 53.38674314387356 | -28.372341661331774 | 167 | 104 | 53.38674314387356 | -28.372341661331774 | 0.0 |
611256447031836758 | 53.11255514455679 | -27.68482288916528 | 183 | 103 | 53.11255514455679 | -27.68482288916528 | 0.0 |
611255759837069429 | 53.13130494487106 | -27.743080378923775 | 21 | 110 | 53.13130494487106 | -27.743080378923775 | 0.0 |
611255759837069401 | 53.124767650110215 | -27.739814591611168 | 253 | 102 | 53.124767650110215 | -27.739814591611168 | 0.0 |
611255003922825298 | 53.290560409754 | -27.937289443173917 | 52 | 113 | 53.290560409754 | -27.937289443173917 | 0.0 |
611255003922825270 | 53.30261321636857 | -27.93106467624108 | 37 | 112 | 53.30261321636857 | -27.93106467624108 | 0.0 |
611255141361779352 | 53.04016935490686 | -27.993923912026723 | 35 | 101 | 53.04016935490686 | -27.993923912026723 | 0.0 |
611255141361779062 | 53.007957021858076 | -27.902674310446752 | 36 | 106 | 53.007957021858076 | -27.902674310446752 | 0.0 |
611255072642302059 | 53.2226933114704 | -27.889567374680585 | 37 | 105 | 53.2226933114704 | -27.889567374680585 | 0.0 |
611255072642301958 | 53.097942739683255 | -27.986920270662733 | 109 | 111 | 53.097942739683255 | -27.986920270662733 | 0.0 |
The table above shows that only known DiaObjects
(with a ut_name
starting with the number 1) were successfully cross matched, as expected.
Confirm that no targets were matched with multiple DiaObjects
.
values, counts = np.unique(dia_objects['ut_name'], return_counts=True)
tx = np.where(counts > 1)[0]
print('Number of multiply-matched targets: ', len(tx))
Number of multiply-matched targets: 0
3.2. Retrieve forced photometry¶
Retrieve forced photometry by diaObjectId
using a WHERE ... IN ()
statement.
dia_object_id_list = "(" + ", ".join(str(value) for value in dia_objects['diaObjectId']) + ")"
print(dia_object_id_list)
(611255278800732178, 609788942606139423, 609789423642476567, 609788805167185927, 611256447031836758, 611255759837069429, 611255759837069401, 611255003922825298, 611255003922825270, 611255141361779352, 611255141361779062, 611255072642302059, 611255072642301958)
query = """SELECT fsodo.coord_ra, fsodo.coord_dec,
fsodo.diaObjectId, fsodo.visit, fsodo.band,
fsodo.psfDiffFlux, fsodo.psfDiffFluxErr,
vis.expMidptMJD
FROM dp1.ForcedSourceOnDiaObject as fsodo
JOIN dp1.Visit as vis ON vis.visit = fsodo.visit
WHERE diaObjectId IN {}
ORDER BY diaObjectId ASC
""".format(dia_object_id_list)
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
assert job.phase == 'COMPLETED'
fsodo_table = job.fetch_result().to_table()
assert len(fsodo_table) == 7034
print(len(fsodo_table))
7034
Option to display the table of forced photometry.
# fsodo_table
See Section 5 for a demonstration of how to plot light curves using the retrieved forced photometry.
del dia_objects, fsodo_table
4. Query in one step (table join)¶
Combine the two steps of Section 3 into a single query with table joins.
This query does the spatial cross-match to the targets on the DiaObject
table,
and joins with the ForcedSourceOnDiaObject
and Visit
table to retrieve light curves for successful matches.
query = """
SELECT diao.diaObjectId, diao.ra, diao.dec, diao.nDiaSources,
ut.name AS ut_name, ut.ra AS ut_ra, ut.dec AS ut_dec,
fsodo.diaObjectId, fsodo.visit, fsodo.band,
fsodo.psfDiffFlux, fsodo.psfDiffFluxErr,
v.expMidptMJD
FROM dp1.DiaObject AS diao
JOIN TAP_UPLOAD.ut AS ut
JOIN dp1.ForcedSourceOnDiaObject AS fsodo ON fsodo.diaObjectId = diao.diaObjectId
JOIN dp1.Visit AS v ON v.visit = fsodo.visit
WHERE CONTAINS(POINT('ICRS', diao.ra, diao.dec),
CIRCLE('ICRS', ut.ra, ut.dec, 0.00014))=1
AND diao.nDiaSources > 10
"""
print(query)
SELECT diao.diaObjectId, diao.ra, diao.dec, diao.nDiaSources, ut.name AS ut_name, ut.ra AS ut_ra, ut.dec AS ut_dec, fsodo.diaObjectId, fsodo.visit, fsodo.band, fsodo.psfDiffFlux, fsodo.psfDiffFluxErr, v.expMidptMJD FROM dp1.DiaObject AS diao JOIN TAP_UPLOAD.ut AS ut JOIN dp1.ForcedSourceOnDiaObject AS fsodo ON fsodo.diaObjectId = diao.diaObjectId JOIN dp1.Visit AS v ON v.visit = fsodo.visit WHERE CONTAINS(POINT('ICRS', diao.ra, diao.dec), CIRCLE('ICRS', ut.ra, ut.dec, 0.00014))=1 AND diao.nDiaSources > 10
job = service.submit_job(query, uploads={"ut": user_table})
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
assert job.phase == 'COMPLETED'
fsodo_table_2 = job.fetch_result().to_table()
assert len(fsodo_table_2) == 7034
print(len(fsodo_table_2))
7034
The fsodo_table_2
table contains the same data as the fsodo_table
from Section 3.2.
5. Plot light curves¶
How to plot light curves is also demonstrated in other tutorials.
First get the identifiers of the unique DiaObjects
for which forced photometry was returned.
unique_dia_obj_id = np.unique(fsodo_table_2['diaObjectId'])
assert len(unique_dia_obj_id) == 13
print(len(unique_dia_obj_id))
13
fig, ax = plt.subplots(3, 2, figsize=(8, 6), sharey=False, sharex=False)
x = 0
for i in range(3):
for j in range(2):
for f, filt in enumerate(filter_names):
dx = np.where(fsodo_table_2['diaObjectId'] == unique_dia_obj_id[x])
lc = fsodo_table_2[dx]
fx = np.where(lc['band'] == filt)[0]
if (i == 0) & (j == 0):
ax[i, j].plot(lc['expMidptMJD'][fx], lc['psfDiffFlux'][fx],
filter_symbols[filt], ms=5, mew=0, alpha=0.5,
color=filter_colors[filt], label=filt)
else:
ax[i, j].plot(lc['expMidptMJD'][fx], lc['psfDiffFlux'][fx],
filter_symbols[filt], ms=5, mew=0, alpha=0.5,
color=filter_colors[filt])
del dx, fx
ax[i, j].set_title(unique_dia_obj_id[x])
del lc
if (i == 0) & (j == 0):
ax[i, j].legend(loc='best', ncol=3, handletextpad=0)
if i == 2:
ax[i, j].xaxis.set_label_text('MJD (days)')
if j == 0:
ax[i, j].yaxis.set_label_text('Diff Flux (nJy)')
x += 1
plt.tight_layout()
plt.show()
Figure 2: The first six light curves.
del fsodo_table_2