303.3. Color selections¶
For the Rubin Science Platform at data.lsst.cloud.
Data Release: Data Preview 1
Container Size: large
LSST Science Pipelines version: r29.2.0
Last verified to run: 2026-06-12
Repository: github.com/lsst/tutorial-notebooks
Learning objective: Explore high redshift galaxy color selection with DP1.
LSST data products: Object table, deep_coadd images
Packages: lsst.rsp.get_tap_service, pyvo, astroquery
Credit: This notebook benefited from discussions with Dan Taranu, John Franklin Crenshaw, and the Rubin LSST photometric redshift commissioning team.
Get Support: Everyone is encouraged to ask questions or raise issues in the Support Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
Color selections have served for decades as a key method of identifying different populations of galaxies. At redshifts z > 2.5 in particular, the Lyman break is a strong spectral feature that can be exploited to identify high-redshift galaxies in ground-based imaging. Lyman-break selections make use of the fact that intergalactic hydrogen in the foreground absorbs all the light of background galaxies that is emitted blueward of the Lyman limit (912 Å) or, at very high redshifts, blueward of Lyman-alpha (1216 Å). More details about why this works can be found at this blog-post.
Thus, galaxies at high redshifts can be identified by their very red colors between filters that bracket the Lyman break (they are named Lyman break galaxies, or LBGs, after this selection method). Because intergalactic hydrogen absorbs the light blueward of the break, the flux of true z ~ 3 galaxies drops in the LSST u-band as the Lyman break redshifts into it, and the flux of z ~ 4 galaxies drops in the LSST g-band.
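As a rough worked check of the redshifts quoted above, the observed wavelength of a rest-frame feature is the rest wavelength multiplied by (1 + z). The sketch below applies this to the Lyman limit and Lyman-alpha. The band edges in `APPROX_BANDS` are rounded values assumed here for illustration only; they are not taken from this notebook or from the official LSST throughput curves, and the helper names are hypothetical.

```python
# Rest-frame wavelengths in Angstroms.
LYMAN_LIMIT = 912.0
LYMAN_ALPHA = 1216.0

# Approximate band edges in Angstroms (assumed, rounded values for illustration).
APPROX_BANDS = {
    "u": (3200, 4000),
    "g": (4000, 5500),
    "r": (5500, 6900),
    "i": (6900, 8200),
}


def observed_wavelength(rest_wavelength, redshift):
    """Return the observed wavelength of a rest-frame feature."""
    return rest_wavelength * (1.0 + redshift)


def band_containing(wavelength):
    """Return the approximate band that contains a wavelength, or None."""
    for name, (blue_edge, red_edge) in APPROX_BANDS.items():
        if blue_edge <= wavelength < red_edge:
            return name
    return None


for z in (3.0, 4.0):
    limit = observed_wavelength(LYMAN_LIMIT, z)
    alpha = observed_wavelength(LYMAN_ALPHA, z)
    print(f"z = {z}: Lyman limit at {limit:.0f} A ({band_containing(limit)}), "
          f"Lyman-alpha at {alpha:.0f} A ({band_containing(alpha)})")
```

With these assumed edges, the Lyman limit sits in the u-band at z = 3 and in the g-band at z = 4, which is why the two populations are referred to as u-band and g-band dropouts.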
This notebook demonstrates the Lyman break color selection on DP1 data for galaxies at z ~ 4 as an example (referred to as g-band dropouts), and provides some validation metrics for its performance. It makes use of color selections defined and used in the literature with data from facilities with similar filter sets (bandpass shapes and effective wavelengths): the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS), as presented in Hildebrandt et al., 2009 and van der Burg et al., 2010, and Subaru Hyper Suprime-Cam (HSC), with which a similar selection was performed in Ono et al. 2018.
Related tutorials: See also the 303-series tutorials on galaxy science, specifically the 303.1 tutorial that overviews the various galaxy photometry measurements produced by the LSST pipelines, and their science use cases.
1.1. Import packages¶
Import common scientific analysis packages numpy, matplotlib, and astropy.
Import LSST Science Pipelines package utilities for remote data access lsst.rsp.
Import pyvo packages for working with the virtual observatory cutout service.
Import astroquery to allow access to publicly available external science products (spectroscopic redshift catalogs).
import io
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
from astropy.coordinates import SkyCoord
import astropy.units as u
from astropy.visualization import ZScaleInterval, LinearStretch, ImageNormalize
from lsst.rsp import get_tap_service
from lsst.rsp.utils import get_pyvo_auth
from pyvo.dal.adhoc import SodaQuery, DatalinkResults
from astroquery.vizier import Vizier
1.2. Define parameters and functions¶
Create an instance of the TAP service, and store the authorization as session.
tap_service = get_tap_service("tap")
session = get_pyvo_auth()
2. Query and photometric validation¶
Set the target field to be the Extended Chandra Deep Field South (ECDFS), where many spectroscopic redshifts are available.
target_ra = 53.125
target_dec = -28.1
Define a query to retrieve galaxies from the Object table by requiring the i-band extendedness flag to be 1, and exclude objects whose Sersic or cModel measurements are flagged. Search within a 0.5 degree radius of the field center to cover the full field. To ensure good detections, require $S/N > 10$ in the $i$-band.
query = (
"SELECT obj.objectId, obj.coord_ra, obj.coord_dec, "
"obj.u_sersicFlux, obj.u_sersicFluxErr, "
"obj.g_sersicFlux, obj.g_sersicFluxErr, "
"obj.r_sersicFlux, obj.r_sersicFluxErr, "
"obj.i_sersicFlux, obj.i_sersicFluxErr, "
"obj.z_sersicFlux, obj.z_sersicFluxErr "
"FROM dp1.Object AS obj "
"WHERE (obj.i_sersicFlux / obj.i_sersicFluxErr > 10) "
"AND (obj.i_extendedness = 1) "
"AND (obj.sersic_no_data_flag = 0) "
"AND (obj.i_cModel_flag = 0) "
"AND CONTAINS(POINT('ICRS', obj.coord_ra, obj.coord_dec), "
f"CIRCLE('ICRS', {target_ra}, {target_dec}, 0.5)) = 1"
)
Run the query.
job = tap_service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
assert job.phase == 'COMPLETED'
Job phase is COMPLETED
Retrieve the query results and save as an astropy table.
tab = job.fetch_result().to_table()
print(f"Retrieved {len(tab)} galaxies from Rubin imaging.")
Retrieved 113859 galaxies from Rubin imaging.
Galaxy colors may be inaccurate for galaxies whose S/N is very low in one or more filters. To ensure that the dropout color in particular is robust, replace the flux with the flux error wherever the S/N is less than 1. For a galaxy that is undetected in the dropout band, this yields a 1-sigma lower limit on the color. Convert the fluxes from nJy to AB magnitudes using $m_{AB} = -2.5 \log_{10}(f_{nJy}) + 31.4$.
filters = ['u', 'g', 'r', 'i']
with np.errstate(divide='ignore', invalid='ignore'):
    for filt in filters:
        flux_robust = np.where(tab[f'{filt}_sersicFlux']
                               < tab[f'{filt}_sersicFluxErr'],
                               tab[f'{filt}_sersicFluxErr'],
                               tab[f'{filt}_sersicFlux'])
        tab[f'{filt}_mag_robust'] = -2.50 * np.log10(flux_robust) + 31.4
Store these robust colors as new columns in the table.
tab['g_minus_r'] = tab['g_mag_robust'] - tab['r_mag_robust']
tab['r_minus_i'] = tab['r_mag_robust'] - tab['i_mag_robust']
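To make the error floor concrete, the following is a minimal scalar sketch of the same conversion applied to one source at a time. The function name `robust_ab_mag` and the flux values are hypothetical and chosen only for illustration; the sketch assumes a positive flux error.

```python
import math


def robust_ab_mag(flux_njy, flux_err_njy):
    """AB magnitude from a flux in nJy, flooring the flux at its 1-sigma error."""
    # If S/N < 1 (including negative fluxes), use the error in place of the flux.
    flux_robust = max(flux_njy, flux_err_njy)
    return -2.5 * math.log10(flux_robust) + 31.4


# The 31.4 zero point follows from the AB reference flux of 3631 Jy, in nJy.
zero_point = 2.5 * math.log10(3631e9)

# Hypothetical (flux, error) pairs in nJy for a source undetected in g.
g_mag = robust_ab_mag(-5.0, 10.0)   # floored at the 10 nJy error
r_mag = robust_ab_mag(100.0, 10.0)  # S/N = 10 in r
print(f"zero point = {zero_point:.2f}")
print(f"g = {g_mag:.2f}, r = {r_mag:.2f}, g - r > {g_mag - r_mag:.2f}")
```

Because the g-band flux is replaced by its error, the resulting g - r value is a lower limit: the true color is at least this red.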
2.1. LBG selection¶
Now, perform the color selection to identify the high-redshift galaxy candidates. Use a typical Lyman break selection for a camera with similar filter properties to the LSST (e.g. Ono et al. 2018). The selection requires very red g-r colors (to identify galaxies with strong Lyman breaks), and relatively blue r-i colors (typical of star-forming galaxies with little dust, and which excludes stars).
Use the selection criteria for z~4 LBGs defined in Ono et al. 2018, which are:
- g - r > 1.0
- r - i < 1.0
- g - r > 1.5 * (r - i) + 0.8
Also add a criterion requiring that the source be undetected in the u-band (S/N < 3), because at z > 3 the u-band probes light blueward of the Lyman limit, which should be absorbed by the intergalactic medium.
is_z4_lbg = ((tab['g_minus_r'] > 1.0)
& (tab['u_sersicFlux'] / tab['u_sersicFluxErr'] < 3)
& (tab['r_minus_i'] < 1.0)
& (tab['g_minus_r'] > 1.5 * tab['r_minus_i'] + 0.8))
print(f"Identified {np.sum(is_z4_lbg)} photometric LBG candidates.")
Identified 1838 photometric LBG candidates.
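As an illustration of how the four criteria act together, the sketch below applies them to single sources rather than to table columns. The function name `is_g_dropout` and the example colors are hypothetical; only the thresholds come from the selection above.

```python
def is_g_dropout(g_minus_r, r_minus_i, u_snr):
    """Apply the z~4 color criteria and the u-band non-detection cut to one source."""
    return (g_minus_r > 1.0
            and r_minus_i < 1.0
            and g_minus_r > 1.5 * r_minus_i + 0.8
            and u_snr < 3)


# Hypothetical sources: (g - r, r - i, u-band S/N).
examples = {
    "strong break, blue continuum": (1.8, 0.2, 0.5),
    "break too weak": (0.7, 0.1, 0.4),
    "red in both colors": (1.6, 0.8, 0.3),
    "detected in u": (1.8, 0.2, 6.0),
}
for label, values in examples.items():
    print(f"{label}: {is_g_dropout(*values)}")
```

Only the first source passes. The third one satisfies both the g - r and r - i cuts individually but lies below the diagonal boundary, which is the criterion that rejects objects that are red in both colors.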
2.2. Validation data¶
This section will fetch an archival spectroscopic redshift catalog to use as a validation dataset. Use data from the VANDELS spectroscopic redshift catalog (a deep VLT/VIMOS spectroscopic survey of the Cosmic Assembly Near-infrared Deep Extragalactic Survey or CANDELS). This survey was very deep and targeted a large number of high redshift galaxies making it a good single-catalog choice for validation.
2.2.1. Fetch spec-z catalog¶
Use the astroquery package's VizieR service to retrieve the public catalog. Information on the catalog and its contents can be found in Garilli et al. 2021. The catalog ID to search in VizieR can be found on adsabs.harvard.edu in connection to the data release paper. VANDELS took data in two extragalactic deep fields. Pull the table of data obtained in the ECDFS as vandels_cdfs, which is the first table in the return (index 0).
vandels_survey = Vizier(columns=['**'], catalog="J/A+A/647/A150")
vandels_survey.ROW_LIMIT = -1
all_tables = vandels_survey.get_catalogs("J/A+A/647/A150")
vandels_cdfs = all_tables[0]
Select LBGs at high redshift using a loose window with 3.7 ≤ z ≤ 4.5, which lies within the typical selection window for this same g-band dropout selection as characterized by Ono et al. 2018. Ensure that only high-quality measurements are used by requiring a good or excellent spectroscopic redshift flag (q_zsp of 3 or 4).
is_vandels_z4 = ((vandels_cdfs['zsp'] >= 3.7)
& (vandels_cdfs['zsp'] <= 4.5)
& (vandels_cdfs['q_zsp'] >= 3))
vandels_truth = vandels_cdfs[is_vandels_z4]
print(f"Found {len(vandels_truth)} high-confidence VANDELS sources at z~4")
Found 133 high-confidence VANDELS sources at z~4
2.2.2. Cross-match¶
Match the Rubin objects to the objects at the same coordinates in the VANDELS catalog, to see which true high-redshift galaxies passed the LBG selection and which failed.
coord_v = SkyCoord(ra=vandels_truth['RAJ2000'], dec=vandels_truth['DEJ2000'], unit='deg')
coord_r = SkyCoord(ra=tab['coord_ra'], dec=tab['coord_dec'], unit='deg')
idx_rubin, d2d, _ = coord_v.match_to_catalog_sky(coord_r)
match_mask = d2d < 1.0 * u.arcsec
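For each VANDELS source, the match returns the nearest Rubin object and its on-sky separation, and the 1 arcsecond cut keeps only plausible counterparts. The sketch below shows what that separation means using the haversine formula on hypothetical coordinates. It is an illustration only, not a description of how astropy computes the separation internally, and the function name is an assumption.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """On-sky separation in arcseconds between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


# Hypothetical pair near the field center, offset by 0.5 arcsec in declination.
sep = angular_separation_arcsec(53.125, -28.1, 53.125, -28.1 + 0.5 / 3600.0)
print(f"Dec offset: separation = {sep:.3f} arcsec, match = {sep < 1.0}")

# The same 0.5 arcsec offset applied to RA gives a smaller on-sky separation,
# because RA offsets shrink by cos(dec).
sep_ra = angular_separation_arcsec(53.125, -28.1, 53.125 + 0.5 / 3600.0, -28.1)
print(f"RA offset: separation = {sep_ra:.3f} arcsec, match = {sep_ra < 1.0}")
```

The second case is the reason to compare on-sky separations rather than raw coordinate differences: at this declination a given offset in RA corresponds to a smaller angle on the sky.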
Below, isolate the matched sources in the two datasets. Then define success and failure masks based on whether the matched objects passed the LBG color selection.
matched_rubin_indices = idx_rubin[match_mask]
vandels_in_rubin = tab[matched_rubin_indices]
v_matched_truth = vandels_truth[match_mask]
lbg_success_mask = is_z4_lbg[matched_rubin_indices]
failed_lbg = ~lbg_success_mask
print(f"VANDELS z~4 sources detected in Rubin catalog: {len(matched_rubin_indices)}")
print(f"VANDELS sources successfully recovered by LBG selection: {np.sum(lbg_success_mask)}")
print(f"VANDELS sources missed by LBG selection: {np.sum(failed_lbg)}")
VANDELS z~4 sources detected in Rubin catalog: 51
VANDELS sources successfully recovered by LBG selection: 27
VANDELS sources missed by LBG selection: 24
2.3. Plot color-color diagrams¶
First, store some shorthand parameters for the colors in the selection, and for parameters to color-code the galaxies by.
v_g_minus_r = vandels_in_rubin['g_minus_r']
v_r_minus_i = vandels_in_rubin['r_minus_i']
v_i_mag = vandels_in_rubin['i_mag_robust']
v_z_spec = v_matched_truth['zsp']
v_gerr = vandels_in_rubin['g_sersicFluxErr']
Define the selection window from Ono et al. 2018 to plot in the color-color diagram.
intersect_x = (1.0 - 0.8) / 1.5
x_diag = np.linspace(intersect_x, 1.0, 50)
y_diag = 1.5 * x_diag + 0.8
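The corners of the selection window follow from intersecting the diagonal boundary with the two straight cuts. The short check below works through that arithmetic; the function name `diagonal` is introduced here only for illustration.

```python
def diagonal(r_minus_i):
    """Diagonal boundary of the selection: g - r = 1.5 * (r - i) + 0.8."""
    return 1.5 * r_minus_i + 0.8


# r - i where the diagonal meets the horizontal cut g - r = 1.0.
x_low = (1.0 - 0.8) / 1.5
# g - r where the diagonal meets the vertical cut r - i = 1.0.
y_high = diagonal(1.0)

print(f"lower corner: (r-i, g-r) = ({x_low:.3f}, {diagonal(x_low):.1f})")
print(f"upper corner: (r-i, g-r) = (1.0, {y_high:.1f})")
```

The upper corner at g - r = 2.3 is where the vertical dashed segment begins in the plotting code that follows.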
Define a figure for the color-color diagram. Plot all the Rubin galaxies from the query in Section 2 (gray), all Rubin galaxies that satisfy the LBG color selection (blue), and the true spectroscopically confirmed high-redshift galaxies as bold colored symbols. Stars indicate the true high-redshift galaxies that meet the color selection and plus signs are the true high-redshift galaxies that do not.
2.3.1. Magnitude dependence¶
Color-code the galaxies by their i-band magnitude, to see if spectroscopically confirmed galaxies that are missed by the LBG selection might be faint, suggesting photometric scatter could contribute to missed galaxies.
fig, ax = plt.subplots(figsize=(9, 6))
vmin, vmax = np.nanmin(v_i_mag), np.nanmax(v_i_mag)
ax.scatter(tab['r_minus_i'], tab['g_minus_r'], s=2, color='lightgray',
alpha=0.3, label='Rubin Parent Sample')
ax.scatter(tab['r_minus_i'][is_z4_lbg], tab['g_minus_r'][is_z4_lbg],
s=15, color='dodgerblue', alpha=0.1, label='Rubin LBG Candidates')
sc = ax.scatter(v_r_minus_i[failed_lbg], v_g_minus_r[failed_lbg],
s=120, c=v_i_mag[failed_lbg], cmap='viridis', vmin=vmin, vmax=vmax,
marker='P', edgecolor='k', linewidth=0.5, zorder=4,
label='VANDELS Truth (Missed)')
sc = ax.scatter(v_r_minus_i[lbg_success_mask], v_g_minus_r[lbg_success_mask],
s=250, c=v_i_mag[lbg_success_mask], cmap='viridis', vmin=vmin, vmax=vmax,
marker='*', edgecolor='k', linewidth=0.8, zorder=5,
label='Confirmed LBG Candidates')
cbar = plt.colorbar(sc, ax=ax)
cbar.set_label('i-band [Sersic Mag]', fontsize=12, fontweight='bold')
cbar.ax.invert_yaxis()
ax.plot([-1.5, intersect_x], [1.0, 1.0], color='black', linestyle='--', lw=1.5)
ax.plot(x_diag, y_diag, color='black', linestyle='--', lw=1.5)
ax.plot([1.0, 1.0], [2.3, 4.0], color='black', linestyle='--', lw=1.5)
ax.set_xlim(-1.0, 2.0)
ax.set_ylim(-1.0, 4.0)
ax.set_xlabel('$(r - i)$ [AB Mag]', fontsize=14)
ax.set_ylabel('$(g - r)$ [AB Mag]', fontsize=14)
ax.set_title('Lyman Break Selection (g-r > 1)', fontsize=16, fontweight='bold')
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, labels, loc='upper left', framealpha=1)
ax.grid(True, linestyle=':', alpha=0.6)
plt.tight_layout()
plt.show()
Figure 1: A color-color diagram showing g - r vs r - i colors for all Rubin galaxies (gray). The Lyman-break color selection window is shown in black dashed lines. Blue points indicate Rubin galaxies that meet the color criteria for being at high redshift. Stars and pluses indicate Rubin galaxies that are spectroscopically confirmed by VANDELS to lie at z > 3.5 and that do and do not meet the color selection, respectively. Confirmed galaxies are color-coded by i-band magnitude, and their successful selection does not show any trend with magnitude (which might be expected if photometric scatter were responsible for the missed galaxies).
2.3.2. Imaging depth dependence¶
Color-code the galaxies by their g-band flux error, a proxy for the depth of the g-band imaging, to see if spectroscopically confirmed galaxies might be missed by the LBG selection because the g-band imaging was not deep enough to provide a robust lower limit on the dropout color.
v_gerr = vandels_in_rubin['g_sersicFluxErr']
fig, ax = plt.subplots(figsize=(9, 6))
vmin, vmax = np.nanmin(v_gerr), 20
ax.scatter(tab['r_minus_i'], tab['g_minus_r'], s=2, color='lightgray',
alpha=0.3, label='Rubin Parent Sample')
ax.scatter(tab['r_minus_i'][is_z4_lbg], tab['g_minus_r'][is_z4_lbg],
s=15, color='dodgerblue', alpha=0.1, label='Rubin LBG Candidates')
sc = ax.scatter(v_r_minus_i[failed_lbg], v_g_minus_r[failed_lbg],
s=120, c=v_gerr[failed_lbg], cmap='viridis', vmin=vmin, vmax=vmax,
marker='P', edgecolor='k', linewidth=0.5, zorder=4,
label='VANDELS Truth (Missed)')
sc = ax.scatter(v_r_minus_i[lbg_success_mask], v_g_minus_r[lbg_success_mask],
s=250, c=v_gerr[lbg_success_mask], cmap='viridis', vmin=vmin, vmax=vmax,
marker='*', edgecolor='k', linewidth=0.8, zorder=5,
label='Confirmed LBG Candidates')
cbar = plt.colorbar(sc, ax=ax)
cbar.set_label('g-band flux error [nJy]', fontsize=12, fontweight='bold')
ax.plot([-1.5, intersect_x], [1.0, 1.0], color='black', linestyle='--', lw=1.5)
ax.plot(x_diag, y_diag, color='black', linestyle='--', lw=1.5)
ax.plot([1.0, 1.0], [2.3, 4.0], color='black', linestyle='--', lw=1.5)
ax.set_xlim(-1.0, 2.0)
ax.set_ylim(-1.0, 4.0)
ax.set_xlabel('$(r - i)$ [AB Mag]', fontsize=14)
ax.set_ylabel('$(g - r)$ [AB Mag]', fontsize=14)
ax.set_title('Lyman Break Selection (g-r > 1)', fontsize=16, fontweight='bold')
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, labels, loc='upper left', framealpha=1)
ax.grid(True, linestyle=':', alpha=0.6)
plt.tight_layout()
plt.show()
Figure 2: Like Figure 1, except that confirmed galaxies are color coded by g-band flux uncertainty as a proxy for variations in the depth or integration time in the g-band filter imaging. Their unsuccessful selection does not show any trend with image depth (which might be expected if galaxies in shallower imaging have smaller lower limits to the g-r color).
2.3.3. Redshift dependence¶
Since there is no obvious trend with magnitude or imaging depth, color-code the spectroscopically confirmed high-redshift galaxies by their spectroscopic redshift. At the higher-redshift end of the selection window, the g-band probes blueward of the Lyman limit (912 Å) and the g-band flux is completely undetected. At the lower-redshift end, the g-band probes wavelengths between the Lyman limit (912 Å) and Lyman-alpha (1216 Å), where the flux is only partially absorbed by the Lyman-alpha forest and the amount of transmitted flux can vary. The plot below shows whether the bluer galaxies are at lower redshifts.
vmin, vmax = np.nanmin(v_z_spec), np.nanmax(v_z_spec)
fig, ax = plt.subplots(figsize=(9, 6))
ax.scatter(tab['r_minus_i'], tab['g_minus_r'], s=2, color='lightgray', alpha=0.3,
label='Parent Sample (Rubin)')
ax.scatter(tab['r_minus_i'][is_z4_lbg], tab['g_minus_r'][is_z4_lbg],
s=15, color='dodgerblue', alpha=0.1, label='Rubin LBG Candidates')
sc = ax.scatter(v_r_minus_i[failed_lbg], v_g_minus_r[failed_lbg],
s=120, c=v_z_spec[failed_lbg], cmap='plasma', vmin=vmin, vmax=vmax,
marker='P', edgecolor='k', linewidth=0.5, zorder=4,
label='VANDELS Truth (Missed)')
sc = ax.scatter(v_r_minus_i[lbg_success_mask], v_g_minus_r[lbg_success_mask],
s=250, c=v_z_spec[lbg_success_mask], cmap='plasma', vmin=vmin, vmax=vmax,
marker='*', edgecolor='k', linewidth=0.8, zorder=5,
label='Confirmed LBG Candidates')
cbar = plt.colorbar(sc, ax=ax)
cbar.set_label('Spectroscopic Redshift (VANDELS z_spec)', fontsize=12, fontweight='bold')
ax.plot([-1.5, intersect_x], [1.0, 1.0], color='black', linestyle='--', lw=1.5)
ax.plot(x_diag, y_diag, color='black', linestyle='--', lw=1.5)
ax.plot([1.0, 1.0], [2.3, 4.0], color='black', linestyle='--', lw=1.5)
ax.set_xlim(-1.0, 2.0)
ax.set_ylim(-1.0, 4.0)
ax.set_xlabel('$(r - i)$ [AB Mag]', fontsize=14)
ax.set_ylabel('$(g - r)$ [AB Mag]', fontsize=14)
ax.set_title('Validation of Lyman Break Selection', fontsize=16, fontweight='bold')
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, labels, loc='upper left', framealpha=1)
ax.grid(True, linestyle=':', alpha=0.6)
plt.tight_layout()
plt.show()
Figure 3: Like Figure 1, except that confirmed galaxies are color coded by spectroscopic redshift, and their successful selection does not show any trend with redshift.
2.3.4. Interpretation¶
The color selection identifies a large number of true high-redshift galaxies, but a number of confirmed galaxies fall outside the selection window. The color selection is not designed to be 100% inclusive of high-redshift galaxies; instead, it should select a relatively pure sample with few interlopers. The galaxies that fall outside the selection box do not show a trend with any single property tested in Sections 2.3.1-2.3.3. Their colors may instead result from a combination of photometric scatter, variation in the depth of the g-band imaging, and variation in the strength of the g-r color across the redshift selection window. Other effects also contribute: dust attenuation in the galaxy reddens the rest-frame UV probed by the r-i color and also dims the observed r-band flux (making the g-r color less robustly constrained). A further possible effect in DP1 is the known issue of a small red-wavelength photon leak in the g-band, which might contaminate the dropout filter with redder light from the galaxy. This effect is exacerbated by the higher operating temperature of LSSTComCam during commissioning and is expected to be resolved in DP2 LSSTCam data.
This exemplifies how photometric selection is statistical in nature. While the physics of the Lyman break creates the strong spectral feature that can be exploited for a simple selection, the photometric noise and intrinsic galaxy variations will scatter the galaxy colors in and out of the selection box.
3. Visual inspection¶
Finally, generate image cutouts to visually inspect both the missed and confirmed subsets of spectroscopically confirmed galaxies. This validation is necessary to confirm the fidelity of the sample.
missed_v_truth = v_matched_truth[failed_lbg]
missed_r_data = vandels_in_rubin[failed_lbg]
confirmed_v_truth = v_matched_truth[lbg_success_mask]
confirmed_r_data = vandels_in_rubin[lbg_success_mask]
Define a function to generate image cutouts in 4 filters for visual inspection.
def plot_cutouts(r_data, v_data, n_max=10, title_label="Candidate"):
    """Plot u, g, r, i deep coadd cutouts for up to n_max matched sources."""
    n_plot = min(len(r_data), n_max)
    if n_plot == 0:
        print(f"No sources to plot for {title_label}.")
        return
    filters = ['u', 'g', 'r', 'i']
    fig, axes = plt.subplots(n_plot, len(filters), figsize=(12, 3.5 * n_plot))
    # With a single row, wrap the 1D axes array so axes[row][col] still works.
    if n_plot == 1:
        axes = [axes]
    print(f"Requesting cutouts for {n_plot} sources: {title_label}...")
    for row_idx in range(n_plot):
        ra = r_data['coord_ra'][row_idx]
        dec = r_data['coord_dec'][row_idx]
        obj_id = r_data['objectId'][row_idx]
        z_spec = v_data['zsp'][row_idx]
        g_r = r_data['g_minus_r'][row_idx]
        r_i = r_data['r_minus_i'][row_idx]
        # Find the deep coadd images in all bands that contain this position.
        cutout_query = f"""
        SELECT lsst_band, access_url
        FROM ivoa.ObsCore
        WHERE dataproduct_subtype = 'lsst.deep_coadd'
        AND obs_collection = 'LSST.DP1'
        AND CONTAINS(POINT('ICRS', {ra}, {dec}), s_region) = 1
        """
        job = tap_service.submit_job(cutout_query)
        job.run()
        job.wait()
        coadds = job.fetch_result().to_table()
        for col_idx, f in enumerate(filters):
            ax = axes[row_idx][col_idx]
            band_match = coadds[coadds['lsst_band'] == f]
            if len(band_match) > 0:
                try:
                    # Request a cutout with a 10 arcsec radius from the cutout service.
                    datalink_url = band_match['access_url'][0]
                    dl_result = DatalinkResults.from_result_url(datalink_url, session=session)
                    sq = SodaQuery.from_resource(dl_result,
                                                 dl_result.get_adhocservice_by_id("cutout-sync"),
                                                 session=session)
                    sq.circle = (ra * u.deg, dec * u.deg, 10.0 / 3600.0 * u.deg)
                    cutout_bytes = sq.execute_stream().read()
                    hdul = fits.open(io.BytesIO(cutout_bytes))
                    img_array = hdul[1].data
                    norm = ImageNormalize(img_array, interval=ZScaleInterval(),
                                          stretch=LinearStretch())
                    ax.imshow(img_array, origin='lower', cmap='gray', norm=norm)
                    # Mark the cutout center, where the target is located.
                    cy, cx = img_array.shape[0]//2, img_array.shape[1]//2
                    ax.add_patch(plt.Circle((cx, cy), radius=6, color='cyan',
                                            fill=False, lw=1.5, alpha=0.7))
                except Exception:
                    ax.text(0.5, 0.5, 'SODA Error', ha='center', va='center',
                            transform=ax.transAxes, color='red')
            else:
                ax.text(0.5, 0.5, 'No Coverage', ha='center', va='center',
                        transform=ax.transAxes, color='red')
            ax.set_xticks([])
            ax.set_yticks([])
            if row_idx == 0:
                ax.set_title(f"{f}-band", fontsize=15, fontweight='bold')
            if col_idx == 0:
                label_str = f"z_sp: {z_spec: .2f}\nID: {obj_id}\ng-r: {g_r: .2f}\nr-i: {r_i: .2f}"
                ax.set_ylabel(label_str, rotation=0, labelpad=70, ha='center', fontweight='bold')
    plt.tight_layout()
    plt.show()
Now plot the 4-filter cutouts for a few (5) real high redshift galaxies confirmed by VANDELS that were identified correctly in the Rubin color selection.
print(f"Total confirmed VANDELS LBGs available to plot: {len(confirmed_r_data)}")
plot_cutouts(confirmed_r_data, confirmed_v_truth, n_max=5,
title_label="VANDELS high-z confirmations identified by Rubin color selection")
Total confirmed VANDELS LBGs available to plot: 27
Requesting cutouts for 5 sources: VANDELS high-z confirmations identified by Rubin color selection...
Figure 4: Deep coadd image cutouts of a few confirmed high-redshift galaxies that are selected as LBGs from the Rubin imaging. As expected for galaxies at z > 3.5, the galaxies are not detected in the u-band, are significantly fainter in the g-band, and are detected in the r- and i-bands.
Optionally, uncomment the lines below to also generate 4-filter cutouts of the true high-redshift galaxies confirmed by VANDELS that the Rubin color selection missed. They are a mix of galaxies with too-blue g-r colors and too-red r-i colors.
# plot_cutouts(missed_r_data, missed_v_truth, n_max=5,
# title_label="VANDELS high-z confirmations missed by Rubin color selection")
4. Interloper fraction¶
Below, select all Rubin objects at any redshift with a secure VANDELS spectroscopic redshift measurement (high quality only, q_zsp of 3 or 4; see Garilli et al. 2021). Use that full sample across all redshifts to investigate the interloper fraction of the Rubin color selection (this exercise is only informative above the magnitude limit of the VANDELS sample, and does not apply to fainter galaxies).
First, cross-match all galaxies for which VANDELS secured a robust redshift measurement with the Rubin objects from the query in Section 2.
is_vandels_hq = vandels_cdfs['q_zsp'] >= 3
vandels_all_hq = vandels_cdfs[is_vandels_hq]
coord_v_all = SkyCoord(ra=vandels_all_hq['RAJ2000'], dec=vandels_all_hq['DEJ2000'], unit='deg')
coord_r = SkyCoord(ra=tab['coord_ra'], dec=tab['coord_dec'], unit='deg')
idx_rubin_all, d2d_all, _ = coord_v_all.match_to_catalog_sky(coord_r)
match_mask_all = d2d_all < 1.0 * u.arcsec
For simplicity, make new arrays to hold the matched datasets.
matched_rubin_all = tab[idx_rubin_all[match_mask_all]]
matched_v_all = vandels_all_hq[match_mask_all]
z_spec_all = matched_v_all['zsp']
i_mag_all = matched_rubin_all['i_mag_robust']
obj_ids_all = matched_rubin_all['objectId']
Now make a histogram of VANDELS galaxies with spectroscopic redshifts that met the Rubin LBG color selection, and investigate the interloper fraction among confirmed galaxies.
First, perform the color selection again on the parent sample of all VANDELS high-quality confirmations.
is_selected_lbg = (
(matched_rubin_all['g_minus_r'] > 1.0)
& (matched_rubin_all['u_sersicFlux'] / matched_rubin_all['u_sersicFluxErr'] < 3)
& (matched_rubin_all['r_minus_i'] < 1.0)
& (matched_rubin_all['g_minus_r'] > 1.5 * matched_rubin_all['r_minus_i'] + 0.8))
selected_spec_z = matched_v_all['zsp'][is_selected_lbg]
Next, identify low-redshift interlopers that met the color selection criteria, and compare. Use the selection window of 3.2 < z < 4.5 from Ono et al. 2018, and count selected galaxies with z < 3.2 as interlopers.
failed_spec_z = selected_spec_z[selected_spec_z < 3.2]
print("Among the VANDELS spectroscopic sample:")
print(f"{len(selected_spec_z)} out of {len(matched_v_all['zsp'])} are color selected")
print(f"{len(failed_spec_z)} LBG candidates are confirmed low redshift interlopers, ")
print(f"indicating a {len(failed_spec_z)/len(matched_v_all['zsp'])*100: .2f}% interloper fraction")
Among the VANDELS spectroscopic sample:
69 out of 469 are color selected
5 LBG candidates are confirmed low redshift interlopers,
indicating a 1.07% interloper fraction
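The percentage printed above divides the number of interlopers by all matched spectroscopic sources. A contamination rate is often quoted relative to the selected sample instead. This notebook does not state a preferred definition, so the sketch below simply evaluates both from the printed counts. Note that only selected sources below z = 3.2 are counted as interlopers here; any selected sources above z = 4.5 are not.

```python
# Counts copied from the printed output above.
n_matched = 469      # high-quality VANDELS sources matched to Rubin objects
n_selected = 69      # matched sources that pass the LBG color selection
n_interlopers = 5    # selected sources with z_spec < 3.2

frac_of_matched = 100.0 * n_interlopers / n_matched
frac_of_selected = 100.0 * n_interlopers / n_selected

print(f"{frac_of_matched:.2f}% of all matched spectroscopic sources")
print(f"{frac_of_selected:.2f}% of the color-selected candidates")
```

Relative to the 69 color-selected candidates the fraction is about 7%, which is the number to use when asking what share of a selected sample is expected to be at low redshift.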
Next, plot the histogram along with the redshift selection window that was characterized by Ono et al. 2018. That study found that the HSC filter set and their color selection criteria (both very similar to Rubin's filters and the color selection used in this notebook) produce a selection window at 3.2 < z < 4.5.
fig, ax = plt.subplots(figsize=(10, 6))
z_bins = np.arange(0, 6.6, 0.2)
ax.hist(selected_spec_z, bins=z_bins, color='dodgerblue', edgecolor='black',
alpha=0.8, zorder=3)
ax.axvspan(3.2, 4.5, color='mediumseagreen', alpha=0.2,
label='Selection Window ($3.2 < z < 4.5$)')
ax.axvline(3.2, color='darkgreen', linestyle='--', linewidth=1.5, zorder=4)
ax.axvline(4.5, color='darkgreen', linestyle='--', linewidth=1.5, zorder=4)
ax.set_xlim(0, 6.5)
ax.set_xlabel('Spectroscopic Redshift (VANDELS z_spec)', fontsize=14)
ax.set_ylabel('Number of Selected Candidates', fontsize=14)
ax.set_title('True Redshift Distribution of LBG Candidates (g-dropout)',
fontsize=16, fontweight='bold')
ax.legend(loc='upper right', fontsize=12, framealpha=1)
ax.grid(axis='y', linestyle=':', alpha=0.7, zorder=0)
plt.tight_layout()
plt.show()
Figure 5: Histogram of the redshifts of all spectroscopically confirmed galaxies that enter the Lyman-break color selection, indicating a very low interloper fraction. The selection window with Rubin data matches the one characterized by Ono et al. 2018 (green) using HSC data and the same color criteria (3.2 < z < 4.5).