201.3. ForcedSource table¶
Data Release: Data Preview 1
Container Size: large
LSST Science Pipelines version: r29.1.1
Last verified to run: 2025-06-21
Repository: github.com/lsst/tutorial-notebooks
Learning objective: To understand the contents of the ForcedSource table and how to access it.
LSST data products: ForcedSource
Packages: lsst.rsp, lsst.daf.butler
Credit: Originally developed by the Rubin Community Science team. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.
Get Support: Everyone is encouraged to ask questions or raise issues in the Support Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
The ForcedSource table contains forced PSF measurements in individual visit_images and difference_images at the sky coordinates of all Objects.
Objects are the union set of all sources detected in the deep_coadd images or in the individual visit images (even those detected only once in one filter).
It is recommended to use the Forced Source and Difference Image Analysis (DIA) tables for light curves.
- TAP table name: dp1.ForcedSource
- Butler table name: object_forced_source
- columns: 28
- rows: 268,796,943
Related tutorials: The TAP and Butler data access services are demonstrated in the 100-level "How to" tutorials. There are 200-level tutorials on visit_images, difference_images, and the Object table.
1.1. Import packages¶
Import standard python packages re, numpy, matplotlib, and astropy.
From the lsst package, import modules for the TAP service, the Butler, and plotting.
import re
import numpy as np
import matplotlib.pyplot as plt
from lsst.rsp import get_tap_service
from lsst.daf.butler import Butler
from lsst.utils.plotting import (get_multiband_plot_colors,
get_multiband_plot_symbols,
get_multiband_plot_linestyles)
1.2. Define parameters and functions¶
Create an instance of the TAP service, and assert that it exists.
service = get_tap_service("tap")
assert service is not None
Create an instance of the Rubin data Butler, and assert that it exists.
butler = Butler('dp1', collections="LSSTComCam/DP1")
assert butler is not None
Define the colors and symbols to represent the LSST filters in plots.
filter_names = ['u', 'g', 'r', 'i', 'z', 'y']
filter_colors = get_multiband_plot_colors()
filter_symbols = get_multiband_plot_symbols()
filter_linestyles = get_multiband_plot_linestyles()
2. Schema (columns)¶
To browse the table schema visit the Rubin schema browser, or use the TAP service via the Portal Aspect or as demonstrated in Section 2.1.
2.1. Retrieve table schema¶
To retrieve the table schema, define a query for the schema columns of the ForcedSource table and run the query job.
query = "SELECT column_name, datatype, description, unit " \
"FROM tap_schema.columns " \
"WHERE table_name = 'dp1.ForcedSource'"
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Retrieve the query results and display them as an astropy table with the to_table() method.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
results
column_name | datatype | description | unit |
---|---|---|---|
str64 | str64 | str512 | str64 |
band | char | Abstract filter that is not associated with a particular instrument | |
coord_dec | double | Fiducial ICRS Declination of centroid used for database indexing | deg |
coord_ra | double | Fiducial ICRS Right Ascension of centroid used for database indexing | deg |
detector | long | Id of the detector where this source was measured. Datatype short instead of byte because of DB concerns about unsigned bytes. | |
diff_PixelFlags_nodataCenter | boolean | Source center is outside usable region on image difference (masked NO_DATA) | |
invalidPsfFlag | boolean | Source has an invalid PSF. | |
objectId | long | Unique Object ID. Primary Key of the Object Table | |
parentObjectId | long | Unique ObjectId of the parent of the ObjectId in context of the deblender. | |
patch | long | Skymap patch ID | |
pixelFlags_bad | boolean | Bad pixel in the Source footprint | |
pixelFlags_cr | boolean | Cosmic ray in the Source footprint | |
pixelFlags_crCenter | boolean | Cosmic ray in the Source center | |
pixelFlags_edge | boolean | Source is on the edge of an exposure region (masked EDGE) | |
pixelFlags_interpolated | boolean | Interpolated pixel in the Source footprint | |
pixelFlags_interpolatedCenter | boolean | Interpolated pixel in the Source center | |
pixelFlags_nodata | boolean | Source is outside usable exposure region (masked NO_DATA) | |
pixelFlags_saturated | boolean | Saturated pixel in the Source footprint | |
pixelFlags_saturatedCenter | boolean | Saturated pixel in the Source center | |
pixelFlags_suspect | boolean | Sources footprint includes suspect pixels | |
pixelFlags_suspectCenter | boolean | Sources center is close to suspect pixels | |
psfDiffFlux | float | Flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfDiffFlux_flag | boolean | Failure to derive linear least-squares fit of psf model forced on the image difference | |
psfDiffFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfFlux | float | Flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
psfFlux_flag | boolean | Failure to derive linear least-squares fit of psf model forced on the calexp | |
psfFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
tract | long | Skymap tract ID | |
visit | long | Id of the visit where this source was measured. |
The table displayed above has been truncated.
Option to print every column name as a list.
# for col in results['column_name']:
# print(col)
Option to use the regular expressions package re to search for column names that contain the string stored in temp.
# temp = 'Err'
# temp = 'Flux'
# temp = 'psf'
# for col in results['column_name']:
# if re.search(temp, col):
# print(col)
Delete the job, but not the results.
del query
job.delete()
2.2.2. Coordinates¶
The sky coordinates in decimal degrees for the associated Object are held fixed during the forced measurements, and have no quoted uncertainty in the ForcedSource table.
coord_ra, coord_dec
2.2.3. Observation¶
The band, visit, and detector in which the ForcedSource was measured.
band, visit, detector
2.2.4. Photometry¶
A forced fit of the Point Spread Function (PSF) at the object's coordinates in each individual visit_image and difference_image.
psfFlux, psfFluxErr
psfDiffFlux, psfDiffFluxErr
Forced fluxes can be negative.
Forced PSF photometry on an image in which the background at the object's coordinates is slightly oversubtracted can yield a negative flux. Photometry on difference images can be negative as well, e.g., when the source is fainter than in the template.
It is not recommended to convert forced fluxes to magnitudes, but for positive fluxes, a flux $f$ in nanojanskys corresponds to an AB magnitude $m$ of:
$m = -2.5\log_{10}(f) + 31.4$
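The conversion above can be sketched as a small helper function. This is illustrative only (the function name and the first-order error-propagation step are not part of the LSST Science Pipelines):

```python
import numpy as np

def nanojansky_to_abmag(flux, flux_err=None):
    """Convert a positive flux in nJy to AB magnitude: m = -2.5*log10(f) + 31.4."""
    flux = np.asarray(flux, dtype=float)
    mag = -2.5 * np.log10(flux) + 31.4
    if flux_err is None:
        return mag
    # First-order error propagation: dm = (2.5 / ln 10) * (df / f)
    mag_err = 2.5 / np.log(10) * np.asarray(flux_err, dtype=float) / flux
    return mag, mag_err

m, me = nanojansky_to_abmag(1.0e5, 1.0e3)
print(m)  # 18.9 (to within floating-point rounding)
```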
2.2.5. Flags¶
A variety of flags indicating whether problem pixels (e.g., saturated or affected by cosmic rays) contributed to the source's measurements.
pixelFlags_*
2.3. Descriptions and units¶
For a subset of the key columns, show the table of their descriptions and units.
col_list = set(['objectId', 'coord_ra', 'coord_dec', 'band', 'visit', 'detector',
'psfFlux', 'psfFluxErr', 'psfDiffFlux', 'psfDiffFluxErr'])
tx = [i for i, item in enumerate(results['column_name']) if item in col_list]
results[tx]
column_name | datatype | description | unit |
---|---|---|---|
str64 | str64 | str512 | str64 |
band | char | Abstract filter that is not associated with a particular instrument | |
coord_dec | double | Fiducial ICRS Declination of centroid used for database indexing | deg |
coord_ra | double | Fiducial ICRS Right Ascension of centroid used for database indexing | deg |
detector | long | Id of the detector where this source was measured. Datatype short instead of byte because of DB concerns about unsigned bytes. | |
objectId | long | Unique Object ID. Primary Key of the Object Table | |
psfDiffFlux | float | Flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfDiffFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfFlux | float | Flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
psfFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
visit | long | Id of the visit where this source was measured. |
Clean up.
del col_list, tx, results
3. Data access¶
The ForcedSource
table is available via the TAP service and the Butler.
Recommended access method: TAP.
3.1. Advisory: avoid full-table queries¶
Avoid full-table queries. Always include a constraint on the coordinates, objectId, or visit.
The ForcedSource table is a large, inclusive union set of forced measurements made in all the visit and difference images at the locations of all Objects.
Furthermore, skipping spatial constraints is not a good habit to form, because future data release ForcedSource tables will contain trillions of rows.
3.2. TAP (Table Access Protocol)¶
The ForcedSource table is stored in Qserv and accessible via the TAP service using ADQL queries.
Include spatial constraints:
Qserv stores catalog data sharded by coordinate (RA, Dec), so ADQL queries that include coordinate constraints do not require a whole-catalog search and are typically faster (and can be much faster) than queries that only constrain other columns.
Use an ADQL cone or polygon search for faster queries (do not use WHERE ... BETWEEN statements to set boundaries on RA and Dec).
3.2.1. Demo query¶
Define a query to return the ten "key columns" from Section 2.3.
Use the coordinates of a variable star discovered in LSSTComCam data (Carlin et al. 2025, in prep.):
- RA = 94.9226329830
- Dec = -25.2318482104
Impose spatial constraints: search within a 2 arcsecond radius of the star's coordinates.
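The radius passed to CIRCLE() must be in decimal degrees; the 0.00056 used in the query is simply 2 arcseconds converted to degrees, as a quick check shows:

```python
# Convert a 2-arcsecond search radius to decimal degrees for ADQL CIRCLE()
radius_arcsec = 2.0
radius_deg = radius_arcsec / 3600.0
print(round(radius_deg, 5))  # 0.00056
```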
query = "SELECT objectId, coord_ra, coord_dec, band, visit, detector, " \
"psfFlux, psfFluxErr, psfDiffFlux, psfDiffFluxErr " \
"FROM dp1.ForcedSource " \
"WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec), " \
"CIRCLE('ICRS', 94.9226329830, -25.2318482104, 0.00056)) = 1 " \
"ORDER BY objectId ASC "
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Fetch the results as an astropy table.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
print(len(results))
219
Option to display the table.
# results
Show that all of the ForcedSource photometry that was returned is associated with only one Object.
values, counts = np.unique(results['objectId'], return_counts=True)
assert len(values) == 1
for value, count in zip(values, counts):
print(value, count)
del values, counts
614435753623041404 219
As an example, calculate and print the minimum, mean, maximum, and standard deviation of the PSF flux in the visit_images, per filter.
for filt in filter_names:
fx = np.where(results['band'] == filt)[0]
print('%s %3i %8.1f %8.1f %8.1f %8.1f' %
(filt, len(fx),
np.min(results['psfFlux'][fx]), np.mean(results['psfFlux'][fx]),
np.max(results['psfFlux'][fx]), np.std(results['psfFlux'][fx])))
u 20 43299.7 49598.4 74063.9 7876.5
g 64 363.7 109666.4 162934.0 29583.2
r 68 485.2 114955.3 147708.0 18085.7
i 17 111610.0 122977.4 133661.0 8460.0
z 42 -361.3 100899.1 133390.0 24700.9
y 8 111474.0 114524.8 116633.0 1416.7
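Per-band statistics like these can feed a simple variability check. Below is a minimal sketch (not a pipeline product) of a reduced chi-square test about the inverse-variance weighted mean flux, run here on synthetic data standing in for the psfFlux and psfFluxErr columns:

```python
import numpy as np

def reduced_chi2(flux, flux_err):
    """Reduced chi-square of fluxes about their inverse-variance weighted mean.
    Values well above 1 suggest variability beyond the quoted uncertainties."""
    flux = np.asarray(flux, dtype=float)
    err = np.asarray(flux_err, dtype=float)
    weights = 1.0 / err**2
    wmean = np.sum(weights * flux) / np.sum(weights)
    chi2 = np.sum(((flux - wmean) / err) ** 2)
    return chi2 / (len(flux) - 1)

# Synthetic stand-in for one band: a constant source with noise at the quoted error level
rng = np.random.default_rng(42)
errs = np.full(50, 100.0)
fluxes = 1.0e5 + rng.normal(0.0, 100.0, size=50)
print(reduced_chi2(fluxes, errs))  # near 1 for a non-variable source
```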
Clean up.
job.delete()
del query, results
3.2.2. Joinable tables¶
The ForcedSource table can be joined to:
- the Visit table on column visit
- the Object table on column objectId
The Visit table contains information about the observation, such as the date and time.
The Object table contains measurements made on the deep_coadd image.
To plot a light curve, join the ForcedSource and Visit tables.
To the query used in Section 3.2.1, add a table join to the Visit table and retrieve the column expMidptMJD.
Change the WHERE statement to select all rows of the ForcedSource catalog for the objectId identified in the previous section.
This query would work the same way with the CONTAINS(POINT(), CIRCLE()) statement instead.
query = "SELECT fs.objectId, fs.coord_ra, fs.coord_dec, fs.band, fs.visit, fs.detector, " \
"fs.psfFlux, fs.psfFluxErr, fs.psfDiffFlux, fs.psfDiffFluxErr, v.expMidptMJD, " \
"o.u_psfFlux, o.g_psfFlux, o.r_psfFlux, o.i_psfFlux, o.z_psfFlux, o.y_psfFlux " \
"FROM dp1.ForcedSource AS fs " \
"JOIN dp1.Visit AS v ON fs.visit = v.visit " \
"JOIN dp1.Object AS o ON o.objectId = fs.objectId " \
"WHERE o.objectId = 614435753623041404 " \
"ORDER BY v.expMidptMJD ASC "
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Fetch the results as an astropy table.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
print(len(results))
219
Option to display the results table.
# results
Plot the light curve for this object, as measured in the visit_images and difference_images.
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 4))
for filt in filter_names:
fx = np.where(results['band'] == filt)[0]
if len(fx) > 0:
ax1.axhline(results[filt+'_psfFlux'][fx[0]], color=filter_colors[filt],
ls=filter_linestyles[filt], alpha=0.2, label=filt)
ax1.plot(results['expMidptMJD'][fx],
results['psfFlux'][fx],
filter_symbols[filt], ms=5, mew=0, alpha=0.7,
color=filter_colors[filt])
ax2.plot(results['expMidptMJD'][fx],
results['psfDiffFlux'][fx],
filter_symbols[filt], ms=5, mew=0, alpha=0.7,
color=filter_colors[filt], label=filt)
ax1.set_xlabel('MJD')
ax1.set_ylabel('PSF Flux')
ax1.legend(loc='lower left', ncol=2)
ax2.set_xlabel('MJD')
ax2.set_ylabel('PSF Diff Flux')
ax2.legend(loc='lower left', ncol=2)
plt.tight_layout()
plt.show()
Figure 1: At left, the PSF forced photometry light curve on the visit_images for the object. Horizontal lines mark the object's flux in the deep_coadd image. At right, the PSF forced photometry light curve on the difference_images for the object.
Clean up.
job.delete()
del query, results
3.4. Butler¶
TAP is the recommended way to access the ForcedSource table, but the Butler can also be used.
Show that the dimensions for the object_forced_source table are the skymap's tract and patch, and that both are required.
butler.get_dataset_type('object_forced_source')
DatasetType('object_forced_source', {skymap, tract, patch}, ArrowAstropy)
butler.get_dataset_type('object_forced_source').dimensions.required
{skymap, tract, patch}
3.4.1. Demo query¶
Retrieve all dataset_refs for object_forced_source tables for patches that overlap the coordinates that were also used in Section 3.2.
ra = 94.9226329830
dec = -25.2318482104
refs = butler.query_datasets("object_forced_source",
where="patch.region OVERLAPS POINT(:ra, :dec)",
bind={"ra": ra, "dec": dec})
Print the number of dataset references returned.
print(len(refs))
3
Print the dataIds of the refs.
for ref in refs:
print(ref.dataId)
{skymap: 'lsst_cells_v1', tract: 5306, patch: 97}
{skymap: 'lsst_cells_v1', tract: 5525, patch: 0}
{skymap: 'lsst_cells_v1', tract: 5526, patch: 9}
Define the columns to retrieve.
col_list = ['objectId', 'coord_ra', 'coord_dec',
'band', 'psfFlux', 'psfFluxErr']
Get the data from the Butler for the third patch in the list of returned dataset refs.
results = butler.get(refs[2],
parameters={'columns': col_list})
print(len(results))
357619
Option to display the results.
# results
Without a small spatial constraint on the region, as in Section 3.2.1, data for many more unique objects are returned.
values = np.unique(results['objectId'])
print(len(values))
del values
1830
Check that the variable star object is in the results table.
tx = np.where(results['objectId'] == 614435753623041404)[0]
print(len(tx))
219
Print the same table of psfFlux statistics as in Section 3.2.1.
for filt in filter_names:
fx = np.where(results['band'][tx] == filt)[0]
print('%s %3i %8.1f %8.1f %8.1f %8.1f' %
(filt, len(fx),
np.min(results['psfFlux'][tx[fx]]),
np.mean(results['psfFlux'][tx[fx]]),
np.max(results['psfFlux'][tx[fx]]),
np.std(results['psfFlux'][tx[fx]])))
u 20 43299.7 49598.4 74063.9 7876.5
g 64 363.7 109666.3 162934.2 29583.2
r 68 485.2 114955.3 147707.7 18085.7
i 17 111610.2 122977.3 133661.2 8460.0
z 42 -361.3 100899.1 133390.0 24700.9
y 8 111474.4 114524.8 116633.3 1416.7
del ra, dec, refs, col_list
del results, tx, fx