The XMM-Newton Serendipitous Source Catalogue: 2XMM

User Guide to the Catalogue


Release 1.1, 20 August 2008. Associated with Catalogue version 1.0 (released 22 August 2007).

Prepared by the XMM-Newton Survey Science Centre Consortium.

This User Guide refers directly to the full FITS and plain-text formats of the catalogue. It provides a detailed account of the production and contents of the catalogue. Users interested in the main properties of the catalogue will find the summary and sections 1 & 5 of most immediate interest.

Summary

2XMM is a catalogue of serendipitous X-ray sources from the European Space Agency's (ESA) XMM-Newton observatory, and has been created by the XMM-Newton Survey Science Centre (SSC) on behalf of ESA. The pre-release catalogue, 2XMMp, made public in July 2006, was essentially a subset of this full 2XMM catalogue.

The catalogue contains source detections drawn from 3491 XMM-Newton EPIC observations made between 2000 February 3 and 2007 March 31. All datasets included were publicly available by 2007 May 01, but not all public observations are included in this catalogue. The total area of the catalogue fields is ~ 560 deg², but taking account of the substantial overlaps between observations, the net sky area covered independently is ~ 360 deg².

The processing used to generate the catalogue is based on the pipeline developed for the re-processing of all XMM observations. The new pipeline includes a number of significant improvements over the previous data processing system (as used by the SSC in routine processing of XMM-Newton data on behalf of ESA). These improvements include a more sensitive source detection scheme using exposures of all cameras, the detection and parameterization of extended sources and the extraction of spectra and time series for the brightest sources.

The 2XMM catalogue contains 246897 X-ray source detections above the processing likelihood threshold of 6. The 246897 X-ray source detections relate to 191870 unique X-ray sources, that is, a significant fraction of sources (27522) have more than one detection in the catalogue.

As part of extensive quality evaluation for the catalogue, each field has been visually screened. Regions where there were obvious deficiencies in the automatic processing were identified, and all sources within those regions were flagged. There are 199359 out of 246897 detections which have not received such a flag (and can thus be considered to be 'clean').

The present catalogue also distinguishes between extended emission and point-like detections. Parameters of detections of extended sources are only reliable up to the maximum extent measure of 80 arcseconds. There are 20837 detections of extended emission, of which 3836 are 'clean' (i.e., have not received a manual flag).

For 38320 detections, spectra and time series were automatically extracted during processing, and a χ²-variability test was applied. 2307 detections in the catalogue are considered variable, with a probability of 10⁻⁵ or less under the null hypothesis that the source is constant.

The median flux (in the total photon-energy band 0.2 - 12 keV) of the catalogue detections is ~ 2.5 × 10⁻¹⁴ erg/cm²/s; in the soft energy band (0.2 - 2 keV) the median flux is ~ 5.8 × 10⁻¹⁵ erg/cm²/s, and in the hard band (2 - 12 keV) it is ~ 1.4 × 10⁻¹⁴ erg/cm²/s. About 20% of the detections have fluxes below 1 × 10⁻¹⁴ erg/cm²/s. The positional accuracy of the catalogue detections is generally < 5 arcseconds (99% confidence radius). The flux values from the three EPIC cameras are overall in agreement to ~ 10% for most energy bands.

1. Introduction

Pointed observations with the XMM-Newton Observatory detect significant numbers of previously unknown 'serendipitous' X-ray sources in addition to the proposed target. Combining the data from many observations thus yields a serendipitous source catalogue which, by virtue of the large field of view of XMM-Newton and its high sensitivity, represents a significant resource. The serendipitous source catalogue enhances our knowledge of the X-ray sky and has the potential for advancing our understanding of the nature of various Galactic and extragalactic source populations.

The 2XMM catalogue is ~ 6 times larger than the 1XMM catalogue released in 2003 and over 50% larger than the pre-release version, 2XMMp. (The difference between 2XMM and 2XMMp arises from both a longer observation baseline (~ 1 year) and the more inclusive screening regime adopted for 2XMM; the screening for 2XMMp was more cautious.)

The 2XMM catalogue is the largest X-ray source catalogue ever produced, containing almost twice as many discrete sources as either the ROSAT survey or pointed catalogues. 2XMM complements deeper Chandra and XMM-Newton small area surveys, probing a large sky area at the flux limit where the bulk of the objects that contribute to the X-ray background lie. The 2XMM catalogue provides a rich resource for generating large, well-defined samples for specific studies, utilizing the fact that X-ray selection is a highly efficient (arguably the most efficient) way of selecting certain types of object, notably active galaxies (AGN), clusters of galaxies, interacting compact binaries and active stellar coronae. The large sky area covered by the serendipitous survey, or equivalently the large size of the catalogue, also means that 2XMM is a superb resource for exploring the variety of the X-ray source population and identifying rare source types.

The production of this catalogue has been undertaken by the XMM-Newton Survey Science Centre (SSC) consortium in fulfillment of one of its major responsibilities within the XMM-Newton project. The catalogue production process has been designed to exploit fully the capabilities of the XMM-Newton EPIC cameras and to ensure the integrity and quality of the resultant catalogue through rigorous screening of the data.

2. Data selection

2.1 Selection of observations

The selection of XMM-Newton observations for processing in the 2XMM catalogue pipeline is based on the desire to re-process all available observations with the latest available data processing system and calibration data. All observations that have a public release date prior to 2007 May 01 are eligible for inclusion. Table 2.1 gives the list of the final 3491 observations which are included in the catalogue, while Table 2.2 gives a list of observations which were public by 2007 May 01 and have EPIC images suitable for source detection but which could not be included in the catalogue. A short comment on the reason for exclusion is given as well.

It should be noted that the field of view (FOV) of an XMM observation (the combined three cameras) has a radius ~ 17 arcminutes, and that contiguous multi-FOV spatial coverage is rare.

2.2 Selection of exposures

Most XMM-Newton observations comprise a single exposure by each of the three EPIC cameras (a significant number of observations have multiple exposures and/or do not include exposures with one or more of the three cameras). For each observation, exposures were selected for each of the three EPIC cameras for processing using the following criteria:

(i) An exposure must have > 1000 seconds duration.
(ii) The exposure must have been taken through a scientifically useful filter. In practice this rejected all exposures for which the filter position was closed, calibration or undefined. The possible filters used in the observations selected for the catalogue are Medium, Thick, Thin1, Thin2 (PN only), and Open. For a detailed description of the filters see Sec. 3.3.6, 'EPIC filters and effective area', in the XMM-Newton Users Handbook.
(iii) The exposure must have been taken in a mode which could usefully be processed by the detection stage. PN small window modes were rejected since the effective FOV in these modes is small, making the background fitting stage of the source detection problematic. For the MOS nearly all modes, including those modes in which the area of the central CCD was windowed or missing (e.g., timing modes, here 'Fast Uncompressed') or modified ('Refreshed Frame Store' mode), were included. The observing modes used in the observations selected for the catalogue are given in Table 2.3. For a detailed description of the modes see Sec. 3.3.2, 'Science modes of the EPIC cameras', in the XMM-Newton Users Handbook.
(iv) Background filtering (see Sec. 3.1.1 c)) must have been successfully applied. Cases where the sum of all Good Time Intervals (hereafter GTIs) was less than 1000 s were rejected as unusable. Without background filtering the source detection is typically of limited value due to the much higher net background.
(v) After background filtering has been applied, each of the five images of an exposure (energy bands 1 - 5) must have at least one pixel per image with more than one event.

Where more than one exposure with a particular camera met the above selection conditions, all exposures with the same filter and data mode were merged, and the exposure group with the maximum good exposure time was then chosen for the source detection. The flow chart below visualizes the selection procedure.


Figure 2.1: Flow-chart of the 2XMM exposure selection.
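
The selection logic of Sec. 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline code: the Exposure record, its field names and the two boolean fields standing in for criteria (iii) and (v) are hypothetical, and the mode labels in the usage note are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass

# Filters listed in criterion (ii) as scientifically useful.
USEFUL_FILTERS = {"Medium", "Thick", "Thin1", "Thin2", "Open"}
MIN_TIME_S = 1000.0  # limit used in criteria (i) and (iv)


@dataclass
class Exposure:  # hypothetical record, not a pipeline data structure
    camera: str           # "PN", "M1" or "M2"
    filter_name: str
    mode: str
    duration_s: float     # total exposure duration
    good_time_s: float    # sum of GTIs after background filtering
    mode_usable: bool = True        # stands in for criterion (iii)
    bands_have_events: bool = True  # stands in for criterion (v)


def passes_selection(exp):
    """Apply criteria (i) to (v) to a single exposure."""
    return (
        exp.duration_s > MIN_TIME_S
        and exp.filter_name in USEFUL_FILTERS
        and exp.mode_usable
        and exp.good_time_s >= MIN_TIME_S
        and exp.bands_have_events
    )


def select_exposure_groups(exposures):
    """Merge accepted exposures by (camera, filter, mode) and keep, for
    each camera, the group with the largest summed good exposure time."""
    groups = defaultdict(float)
    for exp in exposures:
        if passes_selection(exp):
            groups[(exp.camera, exp.filter_name, exp.mode)] += exp.good_time_s
    best = {}
    for (camera, filt, mode), good_time in groups.items():
        if camera not in best or good_time > best[camera][2]:
            best[camera] = (filt, mode, good_time)
    return best
```

For example, two accepted PN exposures with the same filter and mode and 3000 s and 2500 s of good time form one group of 5500 s, which is preferred over a single PN exposure in another filter with 5000 s of good time.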

3. Data processing

The observations were processed with a pipeline configuration that improves on the one previously used for the routine production processing of observations. This new processing pipeline was used to re-process all observations available to date and became the new routine pipeline once the re-processing was concluded.

After creation of the pipeline data products (Sec. 3.1), the catalogue file was constructed to contain key columns from the source lists plus additional columns which include observation-level meta-data for each source as well as further processing and analyses (Sec. 3.2). Additional products for each source were created to facilitate data access (Sec. 3.3).

Throughout the documentation references to catalogue columns are marked as links to their description. The prefix 'ca_' in a column name indicates a wildcard for any of the three EPIC cameras, i.e., 'ca' is to be replaced by 'PN', 'M1' or 'M2'; it can also stand for 'EP' (that is, EPIC) where applicable. Some of the column names also include an energy band identifier ('_b_', where b = 1,2,3,4,5,6,7,8,9) which is typically not explicitly indicated. The four hardness ratios are identified by the designator 'n' (n = 1,2,3,4).
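
As an illustration of this naming convention, the following Python sketch expands a wildcard column name into explicit names. The helper expand_column is hypothetical, and it does not check which camera and band combinations actually exist for a given column.

```python
CAMERAS = ("PN", "M1", "M2")   # 'EP' can be passed explicitly where applicable
BANDS = tuple(range(1, 10))    # energy band identifiers 1..9
HR_INDICES = (1, 2, 3, 4)      # hardness ratio designators


def expand_column(template, cameras=CAMERAS, bands=BANDS):
    """Expand the wildcards 'ca_', '_b_' and a trailing 'HRn' in a column name."""
    prefixes = cameras if template.startswith("ca_") else (None,)
    names = []
    for cam in prefixes:
        # Replace the leading 'ca' by the camera code, keeping the underscore.
        name = template if cam is None else cam + template[2:]
        if "_b_" in name:
            names.extend(name.replace("_b_", "_%d_" % b) for b in bands)
        elif name.endswith("HRn"):
            names.extend(name[:-1] + str(n) for n in HR_INDICES)
        else:
            names.append(name)
    return names
```

With this sketch, expand_column("ca_b_RATE", bands=(1, 2)) starts with "PN_1_RATE" and "PN_2_RATE", and expand_column("ca_FLAG", cameras=("EP",)) gives the single name "EP_FLAG".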

The flow chart below gives an overview of the processing steps from the XMM-Newton observation to the final observation source list as described in this section. The 'exposure group selection' is described in Sec. 2.2.


Figure 3.1: Flow-chart of the 2XMM processing steps.

3.1 Pipeline processing

The following sections describe the individual steps taken within the pipeline processing chain leading to images on which source detection could be performed (Sec. 3.1.1), the source detection procedure (Sec. 3.1.2), the rectification of the positions (Sec. 3.1.3), and the extraction of source-specific products (Sec. 3.1.4). Note that these sections are applicable to all observations processed with this pipeline, independent of whether they have been used in the catalogue or not.

3.1.1 Event calibration and filtering

a) The processing of a MOS exposure

1. A first pass constructs a high energy background lightcurve selecting events with the (XMMEA_22 = REJECT_BY_GATTI & XMMEA_EM = GOOD_MOS_EVENTS) flags set.

2. GTIs are made by filtering the high energy background lightcurve using the MOS flare threshold (2 counts/arcmin²/kilosecond). These GTIs are used to define the time regions in which bad pixel searching occurs.

3. All GTIs with a duration of less than 100 seconds are excluded.

4. The SAS task embadpixfind is used to locate dark pixels in each MOS CCD (using events which have not been filtered through the flare GTIs).

5. If no flare GTIs were made the SAS task embadpixfind is used to locate bright pixels.

6. If flare GTIs do exist the events are filtered through the flare GTIs and then embadpixfind is used to locate bright pixels.

7. The SAS task badpix is run on each CCD event file in order to add a bad pixel extension.

8. The intervals in the global GTI file are aligned with the event list and merged with the CCD GTIs.

9. Attitude correction is applied to the individual events to convert raw CCD pixel coordinates, through camera coordinates, to celestial coordinates.

10. Raw event pulse height values are converted to rectified event energies.

11. Unwanted events are filtered out before lists are merged.

12. The per-CCD event lists are merged into one per camera.

13. Filter the good imaging events into final event lists.

14. Copy the Calibration Index File (CIF) into a separate extension in the event list.

15. Make a second pass high energy background lightcurve selecting events as before, but now with the bad pixels excluded.

16. Create flare GTIs using the MOS flare threshold.

17. Filter the event files through GTIs into final event files.

b) The processing of a PN exposure

1. The SAS task badpixfind is run to create a mask of non-source pixels to be used in generating a high energy background lightcurve.

2. badpixfind is run on each CCD to locate bright and dead pixels.

3. Attitude correction is applied to the individual events to convert raw CCD pixel coordinates, through camera coordinates, to celestial coordinates.

4. Raw event pulse height values are converted to rectified event energies.

5. Filter events by selecting events with the (XMMEA_EP = PN_GOOD_EVENTS) flags set.

6. Filter the CCD event files on the HK GTIs and merge into one.

7. Copy the CIF into a separate extension in the event list.

8. Make a high energy background lightcurve using events with energies between 7 keV and 15 keV and excluding bad pixels by using the previously created pixel mask.

9. Create flare GTIs using the PN flare threshold (10 counts/arcmin²/kilosecond) for use in later processing stages.

c) Background filtering

The MOS high energy background lightcurves were produced from GATTI-flagged (essentially events with energies above 14 keV), single-pixel events from the outer CCDs. After binning the lightcurve, the GTIs were selected by imposing a rate threshold of 2 counts/arcmin²/ksec.

The PN high energy background lightcurves were produced in the 7.0 - 15 keV energy range. After binning the lightcurve, the GTIs were selected by imposing a rate threshold of 10 counts/arcmin²/ksec.
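
The threshold selection can be illustrated with a short Python sketch. It is not the SAS implementation: the function name, the assumption of contiguous equal-width bins, and the choice to accept bins with a rate at or below the threshold are all made up for this example. The 100-second minimum duration follows the rule quoted in Sec. 3.1.1 a).

```python
def flare_gtis(bin_starts, rates, bin_width, threshold, min_duration=100.0):
    """Return [(start, stop), ...] for runs of bins with rate <= threshold.

    bin_starts : start times of contiguous, equal-width bins (seconds)
    rates      : background rate per bin, in the same units as threshold
    Intervals shorter than min_duration seconds are dropped.
    """
    gtis = []
    start = None
    for t, rate in zip(bin_starts, rates):
        if rate <= threshold:
            if start is None:
                start = t          # a quiet interval begins
        elif start is not None:
            gtis.append((start, t))  # a flaring bin closes the interval
            start = None
    if start is not None:
        gtis.append((start, bin_starts[-1] + bin_width))
    return [(a, b) for a, b in gtis if b - a >= min_duration]
```

With 50-second bins starting at 0, 50, ..., 300 s, rates of 1, 1, 1, 5, 1, 5, 1 and a threshold of 2, the quiet runs are 0 - 150 s, 200 - 250 s and 300 - 350 s; only the first is at least 100 seconds long and is kept.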

d) Image creation

1. Exclude from the flare GTIs all intervals with duration less than 100 seconds.

2. For each energy band make a counts image. The images are 600 × 600 pixels with 4-arcsecond pixel sides. The images are tangent plane projections of celestial coordinates. Note that the energy bands have changed slightly with respect to previous processing: the old band 2 is split into two bands now (0.5 keV - 1.0 keV and 1.0 keV - 2.0 keV), while the old bands 4 and 5 have been merged into a single band. The definitions of all the new energy bands are given in Table 3.1 below.


Table 3.1:  Energy bands used in 2XMM processing
Basic energy bands:
  1 = 0.2 -  0.5 keV
  2 = 0.5 -  1.0 keV   (formerly part of band 2)
  3 = 1.0 -  2.0 keV   (formerly part of band 2)
  4 = 2.0 -  4.5 keV   (formerly band 3)
  5 = 4.5 - 12.0 keV   (formerly bands 4 and 5)
Broad energy bands:
  6 = 0.2 -  2.0 keV   soft band, no images made
  7 = 2.0 - 12.0 keV   hard band, no images made
  8 = 0.2 - 12.0 keV   total band
  9 = 0.5 -  4.5 keV   XID band
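
For scripting purposes the band definitions of Table 3.1 can be held in a small lookup table, as in the Python sketch below. The helper names are hypothetical, and the treatment of the band edges (lower edge inclusive, upper edge exclusive) is an assumption of this example, not something the table specifies.

```python
# Band number -> (lower, upper) energy in keV, copied from Table 3.1.
BANDS_KEV = {
    1: (0.2, 0.5), 2: (0.5, 1.0), 3: (1.0, 2.0), 4: (2.0, 4.5), 5: (4.5, 12.0),
    6: (0.2, 2.0), 7: (2.0, 12.0), 8: (0.2, 12.0), 9: (0.5, 4.5),
}
BASIC_BANDS = (1, 2, 3, 4, 5)


def basic_band(energy_kev):
    """Return the basic band (1..5) containing an energy, or None."""
    for band in BASIC_BANDS:
        low, high = BANDS_KEV[band]
        if low <= energy_kev < high:   # edge convention is an assumption
            return band
    return None


def basic_bands_in(broad_band):
    """List the basic bands that lie inside a broad band (6..9)."""
    low, high = BANDS_KEV[broad_band]
    return [b for b in BASIC_BANDS
            if BANDS_KEV[b][0] >= low and BANDS_KEV[b][1] <= high]
```

Here basic_bands_in(9) returns [2, 3, 4], i.e. the XID band spans basic bands 2 to 4, and basic_bands_in(8) returns all five basic bands.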

3. Event selection for PN images is PATTERN <= 4 and RAWY > 12 with events ON_OFFSET_COLUMN excluded. Band 1 images have the additional stricter requirement PATTERN = 0, while band 8 images have PATTERN = 0 below 0.5 keV. Band 1 - 5 images also have events OUT_OF_FOV excluded.

4. For MOS band 1 - 5 and 8 images no PATTERN selection is made beyond the 0 - 25 selection made in the event lists. Events OUT_OF_FOV are excluded for band 1 - 5 images.

5. Make exposure images corresponding to the band 1 - 5 count images.

3.1.2 Source detection and parametrization

Source detection is performed simultaneously on images in the energy bands 1 - 5 and from the three EPIC cameras. For observations having multiple exposures from the same camera, exposures are merged by filter and observing mode, and for each camera the merged ('added') exposure with the longest integration time is used for source detection (cf. Sec. 2.2).

In the fitting routine source parameters are determined individually for the three cameras [PN,M1,M2] in the energy bands 1 - 5 as well as the XID band (cf. Table 3.1). These parameters are then combined to obtain camera dependent parameters in band 8 as well as all-EPIC parameters. On the other hand, the fitted source position is constrained to be the same in all bands and cameras.

a) Creation of multiband exposure maps

Exposure maps hold the effective exposure time for each detector point (see Sec. 3.1.1). They are created by the SAS task eexpmap for each EPIC camera and energy bands 1 - 5 using the latest calibration information on the mirror vignetting, quantum efficiency and filter transmission. The exposure maps (see ca_EXP) are corrected for bad pixels, bad columns and CCD gaps as well as being multiplied by an out-of-time factor (oot_factor):

oot_factor = 0.9411   for PN PrimeFullFrame modes,
             0.97815  for PN ExtendedFullFrame modes,
             1.0      for all other PN and M1/M2 modes.

See SSC-LUX-SP-0004.pdf, Sec. 6.5.2.6, for more details.

b) Creation of detection masks

The SAS task emask is used to create a detection mask for each camera. Detection masks define the area of the detectors which is suitable for source detection. Only the areas of the detector where the exposure is at least 50% of the maximum exposure have been used for source detection (based on unvignetted exposure maps).
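
The 50% criterion amounts to a simple threshold on the exposure map. The Python sketch below illustrates the idea on a small nested list; it is not the emask implementation, and the function name and the 'at least' comparison at exactly 50% are assumptions of this example.

```python
def detection_mask(exposure_map, fraction=0.5):
    """Return a 0/1 mask that is 1 where exposure >= fraction * maximum.

    exposure_map is a list of rows holding (unvignetted) exposure times.
    """
    peak = max(max(row) for row in exposure_map)
    if peak <= 0:
        # No exposure anywhere: nothing is usable for detection.
        return [[0] * len(row) for row in exposure_map]
    limit = fraction * peak
    return [[1 if value >= limit else 0 for value in row]
            for row in exposure_map]
```

For an exposure map [[0, 400, 1000], [500, 499, 800]] the maximum is 1000 s, so pixels with at least 500 s are kept and the mask is [[0, 0, 1], [1, 0, 1]].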

c) Sliding box source detection (local mode)

The SAS task eboxdetect is used to create a preliminary source list. It has two operation modes: local and map detection. At this stage eboxdetect is run in local mode and performs a sliding box cell detection (box size 5 × 5 pixels) on the detector areas defined by the detection masks. Eboxdetect in local mode uses a local background that is determined in a frame around the search box. All sources with a 0.2 - 12 keV EPIC detection likelihood above 5 are included in the output source list.

d) Creation of multiband background maps

The SAS task esplinemap is used to create background maps for each camera and energy bands 1 - 5. Using a cut-out radius dependent on source brightness, esplinemap blanks out the areas of the images where sources were detected by eboxdetect in local mode. Then esplinemap performs 12 × 12 nodes spline fits on the resulting source-free images to calculate a smoothed background map for the entire images (ca_BG).

e) Sliding box source detection (map mode)

A second pass of eboxdetect is carried out in map mode. It creates a new source list using this time the background maps generated by esplinemap, increasing thereby the source detection sensitivity as compared to the local detection step. The box size is again set to 5 × 5 pixels. All sources with a 0.2 - 12 keV EPIC detection likelihood above 5 are included in the eboxdetect map mode source list.

f) Source parameter estimation by maximum likelihood fitting

The sources detected by eboxdetect in map mode are passed on to the SAS task emldetect. Emldetect does not perform source detection; instead it calculates PN/M1/M2 source parameters in the bands 1 - 5 by fitting the instrumental point spread function (PSF) convolved with a source extent model to the distribution of counts of the sources detected by eboxdetect (in map mode) simultaneously in the bands 1 - 5 and the three cameras. The extent model used is a beta-model profile; see SSC-AIP-TN-003.pdf for more details on the extended source detection. Free parameters of the fits are source positions, extent (ep_EXTENT), and count rates (ca_RATE). Positions and extent are constrained to be the same in all energy bands and for all cameras, while count rates are obtained from the best fit value for each camera and energy band. Detection likelihoods (ca_DET_ML) and extent likelihoods (ca_EXTENT_ML) are derived as well.

In a second loop, emldetect attempts to fit two PSFs to sources detected as extended, and for those detections where the split improved the likelihood of the fit the (point) source parameters were re-calculated.

Emldetect uses the multiband exposure maps (Sec. 3.1.2 a)) to correct the count rates for vignetting and losses due to inter-chip gaps and bad pixels/columns as well as for losses in the PN due to events arriving during readout times (out of time events):

count_rate = source_counts / exp_map .
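
As a rough numerical illustration, the Python sketch below applies the out-of-time factor of Sec. 3.1.2 a) to an exposure value and forms the count rate. The function names are hypothetical, the exact-string match on the mode names is an assumption, and the input exposure is taken to already include vignetting, chip-gap and bad-pixel corrections.

```python
# Out-of-time factors quoted in Sec. 3.1.2 a); any other mode uses 1.0.
PN_OOT_FACTOR = {"PrimeFullFrame": 0.9411, "ExtendedFullFrame": 0.97815}


def oot_factor(camera, mode):
    """Out-of-time factor for a camera ('PN', 'M1', 'M2') and mode name."""
    if camera == "PN":
        return PN_OOT_FACTOR.get(mode, 1.0)
    return 1.0


def corrected_count_rate(source_counts, exposure_s, camera, mode):
    """count_rate = source_counts / exp_map, with exp_map scaled by oot_factor.

    exposure_s is the exposure-map value at the source before the
    out-of-time scaling (an assumption of this sketch).
    """
    exp_map = exposure_s * oot_factor(camera, mode)
    if exp_map <= 0:
        raise ValueError("exposure must be positive")
    return source_counts / exp_map
```

For instance, 941.1 counts in a 10000 s PN PrimeFullFrame exposure correspond to an effective exposure of 9411 s and hence to a rate of about 0.1 counts/s.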

Emldetect derives four camera-specific X-ray colours known as hardness ratios (HR), which are obtained for each camera by combining corrected count rates from different energy bands. Each hardness ratio, HRn, is obtained as

HRn = (RATE_b - RATE_a) / (RATE_b + RATE_a),

where RATE_a and RATE_b are the corrected count rates in energy bands a and b (see ca_b_RATE). Energy bands 1  & 2 are used to obtain ca_HR1, 2 & 3 for ca_HR2, 3 & 4 for ca_HR3, and 4 & 5 for ca_HR4.
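
The band pairing can be written down compactly. The following Python sketch computes a hardness ratio from a dictionary of corrected count rates; the function name is hypothetical and returning None for a zero total rate is a choice made for this example only.

```python
# HRn -> (band a, band b), following the pairing given in the text.
HR_BANDS = {1: (1, 2), 2: (2, 3), 3: (3, 4), 4: (4, 5)}


def hardness_ratio(n, rates):
    """Return HRn = (RATE_b - RATE_a) / (RATE_b + RATE_a).

    rates maps band number (1..5) to the corrected count rate of one camera.
    """
    a, b = HR_BANDS[n]
    total = rates[b] + rates[a]
    if total == 0:
        return None  # undefined when both bands have zero rate
    return (rates[b] - rates[a]) / total
```

A source with rates 0.1 and 0.3 counts/s in bands 1 and 2 has HR1 close to 0.5; a source with no counts in band 5 but some in band 4 has HR4 = -1.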

Count rates and therefore hardness ratios are camera dependent. In addition they depend on the blocking filter used for the observation, especially the HR1. This needs to be taken into account when comparing hardness ratios for different sources. Note that a large fraction of the hardness ratios are calculated from marginal or non-detections in at least one of the energy bands. Consequently, individual hardness ratios should only be deemed reliable if the source was above the detection likelihood threshold in both energy bands, else they have to be treated as upper limits.

Emldetect calculates observed source fluxes (ca_FLUX) in bands 1 - 5 in units of [erg/s/cm²] from the count rates (ca_RATE) in those bands, using the following expression:

Flux = Rate / ECF ,

where the ECF is an energy conversion factor (to 'convert' count rates to fluxes). ECFs have been calculated using the most recent calibration matrices for MOS, and v6.7 response matrices (RMFs) for the PN. They have been calculated assuming a spectral model of an absorbed power-law with absorbing column density N_H = 3.0 × 10²⁰ cm⁻² and continuum spectral slope Gamma = 1.7 (see CAL-TN-0023-v2.0.pdf).

ECFs for each camera, energy band, and filter in units of [10¹¹ cts cm²/erg] are given in Table 3.2. EPIC-PN Thin1 and Thin2 filters have the same transmission.


Table 3.2:  Energy conversion factors (ECFs) used in 2XMM processing
Camera  Band  Open      Thin      Medium    Thick
PN      1     16.1784   8.95403   7.82028   4.71096
PN      2     10.0418   8.09027   7.83782   6.02015
PN      3     6.17030   5.88255   5.78272   5.00419
PN      4     1.95859   1.92805   1.90529   1.80647
PN      5     0.555924  0.555226  0.554529  0.547205
PN      9     5.07412   4.53836   4.43953   3.74772
MOS-1   1     3.15223   1.80399   1.60150   1.06500
MOS-1   2     2.27921   1.88017   1.82853   1.48465
MOS-1   3     2.14933   2.05034   2.01594   1.79446
MOS-1   4     0.757786  0.746128  0.737800  0.707822
MOS-1   5     0.143619  0.143340  0.143131  0.141213
MOS-1   9     1.54600   1.42040   1.39361   1.23264
MOS-2   1     3.17622   1.81179   1.60670   1.06620
MOS-2   2     2.28390   1.88369   1.83088   1.48818
MOS-2   3     2.15017   2.05117   2.01594   1.79530
MOS-2   4     0.761672  0.750569  0.741687  0.711708
MOS-2   5     0.151083  0.150769  0.150560  0.148537
MOS-2   9     1.54912   1.42326   1.39647   1.23524

Note that all count rates (ca_RATE) and fluxes (ca_FLUX) correspond to the flux in the entire PSF and do not need any further corrections for PSF losses.
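
The conversion from count rate to flux is a single division once the ECF has been looked up. The Python sketch below shows this for a few entries copied from Table 3.2; the dictionary layout and function name are hypothetical, and only three table entries are included for brevity.

```python
# (camera, band, filter) -> ECF in units of 1e11 cts cm^2 / erg,
# a small excerpt of Table 3.2 (PN Thin1 and Thin2 both use 'Thin').
ECF = {
    ("PN", 1, "Thin"): 8.95403,
    ("MOS-1", 3, "Thick"): 1.79446,
    ("MOS-2", 5, "Medium"): 0.150560,
}


def flux_from_rate(rate, camera, band, filter_name):
    """Flux = Rate / ECF, returned in erg/cm^2/s for a rate in counts/s."""
    ecf = ECF[(camera, band, filter_name)] * 1e11  # cts cm^2 / erg
    return rate / ecf
```

As an example, a PN band 1 count rate of 0.01 counts/s through the Thin filter corresponds to a flux of roughly 1.1 × 10⁻¹⁴ erg/cm²/s.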

Band 8 source parameters are derived from the combination of parameters from bands 1 - 5. For details on how each parameter was obtained see the column descriptions for the source parameters.

Detection likelihood values (ca_DET_ML) as calculated by emldetect are based on the likelihood ratio described by Cash (1979) and are defined as DET_ML = - ln(P), where P is the probability of the detection occurring by chance. To allow comparisons of source detection runs with different source parameters, the detection likelihoods in emldetect are given in the form of 'equivalent' detection likelihoods, i.e., they are corrected for the number of free fit parameters. All sources (as detected by eboxdetect map mode) with 0.2 - 12 keV EPIC detection likelihoods greater than 6 as determined by emldetect are included in the output source list.
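
The relation DET_ML = - ln(P) is easily inverted. The short Python sketch below converts between the two quantities; the function names are hypothetical, and the result should be read only as the nominal chance probability implied by the definition, since the catalogue values are 'equivalent' likelihoods.

```python
import math


def chance_probability(det_ml):
    """Nominal probability P of a chance detection for a given DET_ML."""
    return math.exp(-det_ml)


def det_ml_from_probability(p):
    """Detection likelihood DET_ML = -ln(P) for a chance probability P."""
    return -math.log(p)
```

The catalogue threshold DET_ML = 6 thus corresponds to a nominal chance probability of about 2.5 × 10⁻³ per detection.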

g) Fitting XID band source parameters

Band 9 source parameters are for the XID band (0.5 - 4.5 keV). Instead of combining parameters from bands 2, 3 and 4, which will produce overall larger source parameter errors, the SAS task emldetect is run a second time using merged images, exposure maps and background maps from bands 2 - 4. Source positions are kept fixed at the values determined previously, and a likelihood threshold of zero is used to ensure that band 9 parameters are obtained for all sources detected in the first run of emldetect. The output source list contains only band 9 parameters with errors determined directly from the merged images.

h) Automated quality warning flags

One of the improvements over the previous processing pipelines is the setting of automatic flags by the SAS task dpssflag; based on the available information in the emldetect source list it writes a string of twelve different flags back into the source list (ca_FLAG) to indicate various conditions (note that only nine of these are used in the pipeline processing). Because the decision tree had to be simple these flags should be understood mainly as a warning. In particular, sources with a low coverage on the detector, sources in problematic areas (near a bright source or within an extended source) as well as sources near artefacts like the known bright MOS-1 corner or the occasionally bright low gain columns of the PN are flagged. In addition, an attempt was made to identify spurious extended sources which can often be found near bright sources, within complicated extended emission, or generally in areas where the background changes considerably on a small spatial scale and the spline maps can not adapt well enough. The nine flag positions have been assigned the meanings given in Table 3.3a (note that flags 1, 8, and 9 are camera dependent):


Table 3.3a:  Flag Keys
Flag | Meaning | Definition
1 | Low detector coverage | ca_MASKFRAC < 0.5
2 | Near other source | R ≤ 65 * SQRT (EP_RATE); R(min) = 10", R(max) = 400"
3 | Within extended emission | R ≤ 3 * EP_EXTENT; R(max) = 200"
4 | Possible spurious extended source near bright source | Flag 2 is set and EP_CTS(min) = 1000 for the causing source
5 | Possible spurious extended source within extended emission | R ≤ 160" and fraction of rate with respect to the causing source is 0.4
6 | Possible spurious extended source due to unusually large single-band DET_ML | Fraction of ca_b_DET_ML with respect to the sum of all ≥ 0.9
7 | Possible spurious extended source | At least one of the flags 4, 5, 6 is set
8 | On bright MOS-1 corner or bright low gain PN column |
9 | Near bright MOS-1 corner | R ≤ CUTRAD = 60" of a bright pixel in the corner

The default value of every flag is F for False. When a flag was set it means it has been changed to T for True.

The task dpssflag sets all flags except the camera-specific ones (that is, it sets flags 2, 3, 4, 5, 6 and 7) on the summary row (EPIC band 8); these are then propagated back to the individual cameras and bands.
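
A flag string of this kind is straightforward to decode. The Python sketch below parses a string of 'T'/'F' characters and evaluates the two radius rules for flags 2 and 3 from Table 3.3a; the function names are hypothetical, and the units (EP_RATE in counts/s, radii and extent in arcseconds) are assumptions of this example.

```python
import math


def parse_flags(flag_string):
    """Map flag number (starting at 1) to True/False from a 'T'/'F' string."""
    return {i + 1: ch == "T" for i, ch in enumerate(flag_string)}


def near_source_radius(ep_rate):
    """Flag 2 radius: 65 * sqrt(EP_RATE), limited to the range 10..400."""
    return min(400.0, max(10.0, 65.0 * math.sqrt(ep_rate)))


def extended_emission_radius(ep_extent):
    """Flag 3 radius: 3 * EP_EXTENT, limited to at most 200."""
    return min(200.0, 3.0 * ep_extent)
```

For a twelve-character string such as "FTFFFFFFFFFF", parse_flags reports only flag 2 as set. A neighbouring source with EP_RATE = 1 would have a flag 2 radius of 65, while very faint and very bright sources are limited to 10 and 400 respectively.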

i) Merging of emldetect source lists

The emldetect source list for the bands 1 - 5 is merged with the emldetect XID source list into a common list by the SAS task srcmatch. The output table consists of a single row per detection with parameters from both input lists in different columns. The task srcmatch also calculates band 1 - 5 EPIC fluxes (EP_FLUX), EPIC hardness ratios (EP_HRn), and their respective errors.

The task srcmatch introduces flag columns which are later populated by the pipeline, e.g., for sources where source-specific products (Sec. 3.1.4) have been made (TSERIES and SPECTRA).

3.1.3 Position rectification

The SAS task eposcorr correlates the X-ray positions from an observation (as determined by the fitting routine of emldetect) with catalogued optical positions and minimizes the positional offsets by applying a translation and rotation to the X-ray positions. For the catalogue pipeline the srcmatch source lists were correlated with the USNO B1.0 optical catalogue. The correlation allows offsets in RA/DEC of up to 10 arcseconds while all optical sources more than 15 arcseconds from an X-ray source are removed prior to correlation.

The SAS task evalcorr evaluates the quality of the position rectification of eposcorr. For 2XMM the following empirically determined condition was used to accept the refined astrometric solution:

POSCOROK is set to True by evalcorr if

 LIK_HOOD > 9.0 + ( 2.0 * LIK_NULL ) ,

where LIK_HOOD and LIK_NULL are determined by the SAS task eposcorr. LIK_NULL is the likelihood calculated for purely coincidental X-ray/optical matches in a given observation, i.e., if there were no true counterparts.

If POSCOROK is set to True the columns RA and DEC give the corrected X-ray positions calculated by eposcorr. If the refined astrometric solution was not accepted the columns RA and DEC are the same as the uncorrected values RA_UNC and DEC_UNC (as determined by emldetect). POSCOROK also determines the value of the parameter SYSERRCC, which is the nominal value of the systematic 1-sigma position error for XMM-Newton fields.

In the pipeline products as well as in 2XMMp, the value of SYSERRCC was estimated from the width of the distributions of position shifts found in eposcorr runs. It is 0.5 arcseconds for all detections in a field for which an acceptable astrometric correction using eposcorr was determined (that is, POSCOROK is True). For fields for which no acceptable astrometric correction using eposcorr was determined (that is, POSCOROK is False), the value of SYSERRCC is 1.5 arcseconds.
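
The acceptance rule and its consequences for the position columns and SYSERRCC can be summarised in a few lines of Python. This is a sketch of the logic described above with hypothetical function names, not the evalcorr code.

```python
def poscorok(lik_hood, lik_null):
    """Acceptance condition: LIK_HOOD > 9.0 + 2.0 * LIK_NULL."""
    return lik_hood > 9.0 + 2.0 * lik_null


def syserrcc(ok):
    """Nominal systematic 1-sigma position error in arcseconds."""
    return 0.5 if ok else 1.5


def adopted_position(ok, corrected, uncorrected):
    """Return the (RA, DEC) pair that ends up in the columns RA and DEC."""
    return corrected if ok else uncorrected
```

For example, LIK_HOOD = 30 with LIK_NULL = 10 gives 30 > 29, so the correction is accepted and SYSERRCC is 0.5 arcseconds; LIK_HOOD = 29 with the same LIK_NULL fails the strict inequality and SYSERRCC is 1.5 arcseconds.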

For the 2XMM catalogue, a re-analysis of the astrometric properties has led to a new determination of the systematic 1-sigma position error, reflected in the new (catalogue-only) parameter SYSERR, see Sec. 3.2.2 for details.

3.1.4 Source-specific products

The new pipeline automatically extracts time series and spectra for the brighter detections (EPIC counts ≥ 500; where the detection was only observed with one or two cameras, the equivalent EPIC counts were calculated using the PN to MOS count ratio 3.5 : 1). All exposures that passed the filtering criteria (i) - (v) in Sec. 2.2 were used for extraction. Source-specific products were made when the following camera-specific conditions were met: (i) ca_MASKFRAC ≥ 0.5, and (ii) ca_DET_ML ≥ 15. Detection flags (see Sec. 3.1.2 h)) were not taken into account.

The source counts were extracted from a circular aperture with radius 28", and the background counts were extracted from an annulus around the detection position with r (min) = 60" and r (max) = 180". PATTERN selection is the same as for image creation (PN: PATTERN <= 4, MOS: PATTERN <= 12). Event FLAG selection was done according to the recommendations: FLAG = 0 for the PN, XMMEA_EM for MOS time series, and XMMEA_SM for MOS spectra. The energy range for the extraction of all products is 0.2 - 12.0 keV.

While time series are filtered only by instrumental GTIs (see Sec. 3.1.1 a) and  b)), spectra are also filtered for the flare background (see Sec. 3.1.1 c)). The variability tests, however, exclude times where the flare background is high.

The bin size for the time series was chosen in such a way that the PN bins contain at least 18 counts and the MOS bins at least 5 counts as derived from the source lists. Note that these are background subtracted according to the background maps determined in the source detection process, see Sec. 3.1.2 d). The minimum bin size is 10 seconds, and all other bin sizes are rounded up to an integral multiple of 10.
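
The binning rule can be illustrated as follows. The Python sketch assumes that the bin size is derived from a single background-subtracted source count rate per camera; the function name and this simplification are assumptions of the example, not a description of the pipeline code.

```python
import math

# Minimum number of counts per bin for each camera, as quoted in the text.
MIN_COUNTS_PER_BIN = {"PN": 18, "M1": 5, "M2": 5}


def time_series_bin_size(camera, source_rate):
    """Return the bin size in seconds for a source rate in counts/s.

    The size is rounded up to an integral multiple of 10 s and is
    never smaller than 10 s.
    """
    if source_rate <= 0:
        raise ValueError("source rate must be positive")
    needed = MIN_COUNTS_PER_BIN[camera] / source_rate  # seconds for the minimum counts
    return max(10, 10 * math.ceil(needed / 10.0))
```

A PN source with 0.5 counts/s needs 36 s to collect 18 counts, which is rounded up to a 40-second bin; a MOS source with 5 counts/s would need only 1 s and therefore gets the minimum bin size of 10 seconds.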

To test for variability a χ2-test (suitable for binned data) was used with Pearson's approximation for Poissonian data. Times with high background flaring were excluded from the test. The SAS task ekstest writes four keywords into the header of the time series file, namely CHI2PROB for the probability, CHISQUAR for the χ2-statistic, N_POINTS for the number of bins used in the test, and AVRATE for the mean rate over the bins used in the test.

The spectral products for each selected detection in 2XMM are (i) a grouped source spectrum (20 counts/bin where energies below 0.35 keV as well as energies in the PN around the copper line at 8.05 keV are set to 'bad'), (ii) a background spectrum, (iii) a source ARF (auxiliary response file), and (iv) a spectral plot made using XSPEC. A keyword in the header of the source spectrum file indicates the name of the canned RMF (response matrix file) that can be used with this detection.

The time series products for each selected detection in 2XMM are (i) a time series file containing the source minus background and background arrays (corrected for exposure, cosmic rays, and dead time) as well as the keywords regarding the variability, and (ii) a plot of the time series and the background made by the SAS task elcplot.

The available products are identified by their observation ID (OBS_ID), exposure ID and the observation-specific source number SRC_NUM in the hexadecimal system. Further details and a discussion of the limitations of an automatic extraction can be found in SSC-LUX-RE-0155.pdf.

3.2 Catalogue creation

Most of the catalogue columns are derived from information in the lists output by the srcmatch task (see Sec. 3.1.2 i)); some further information has been extracted from the emldetect source lists (see Sec. 3.1.2 f)). Additional columns, derived from other products and obtained by further processing, are explained in this section.

3.2.1 Meta data

The catalogue includes meta data derived from keywords in the source list files to help characterize the detections. These are the observation ID (OBS_ID); revolution number (REVOLUT); the beginning and end of the observation in Modified Julian Date format (MJD_START and MJD_STOP); filter (ca_FILTER) and submode (ca_SUBMODE); note that the latter two apply to all exposures in a merged set, see Sec. 2.2 .

3.2.2 Positional errors

A detailed analysis of the 2XMM catalogue has been used to refine the value of the systematic 1-sigma error for XMM-Newton sources. This is reflected in the SYSERR parameter which replaces the nominal SYSERRCC value used in 2XMMp (see Sec. 3.1.3 ). The analysis is based on a comparison between the 2XMM X-ray and SDSS optical positions for a sample of ~ 1000 broad emission line quasars (the Sloan DR5 Quasar Catalog) which is expected to have negligible contamination by chance positional matches (the SDSS positions are known to better than 100 milliarcseconds). This analysis demonstrates that the statistical properties of this sample, which is believed to be representative of the whole catalogue, can be well described with an additional systematic positional error component with a fixed value SYSERR = 0.35 arcseconds for all fields for which an acceptable astrometric correction was determined. For those fields with no acceptable astrometric correction the appropriate value is SYSERR = 1.0 arcseconds. For comparison, the nominal values (i.e., SYSERRCC) used prior to this new analysis were 0.5 and 1.5 arcseconds, respectively.

The two positional errors determined during the processing of the data, RADEC_ERR (determined whilst fitting the detection, see Sec. 3.1.2 f)) and SYSERR (the systematic error of the XMM-Newton fields), have been combined to a single error, POSERR as:

POSERR = SQRT ( RADEC_ERR^2 + SYSERR^2 ).

This error is used for the determination of mean positions of the unique sources (see Sec. 3.2.3 a)).
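As a small worked example of the quadrature sum, the following sketch combines the two error terms using the SYSERR values quoted in this section. The function names are illustrative only.

```python
import math

def syserr(poscorok):
    """Systematic error in arcseconds: 0.35 if the eposcorr correction
    was acceptable (POSCOROK is True), otherwise 1.0."""
    return 0.35 if poscorok else 1.0

def poserr(radec_err, poscorok):
    """Combined 1-sigma position error: sqrt(RADEC_ERR^2 + SYSERR^2)."""
    return math.hypot(radec_err, syserr(poscorok))
```

For a detection with RADEC_ERR = 1.2 arcseconds in a field with an acceptable astrometric correction, this gives POSERR = sqrt(1.44 + 0.1225) = 1.25 arcseconds.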

3.2.3 Identifications

Every row in the catalogue is a detection and has received a running number (DETID). Several detections can refer to the same physical source in the sky (observed at different times); these are identified with a unique source ID (SRCID, see the description in subsection a) below). Every detection is also identified by its observation-specific (decimal) source number SRC_NUM which, in the hexadecimal system, is used together with the observation and exposure ID to identify source-specific products via their file name.

a) Unique source number

Many parts of the sky were observed more than once, either because an interesting object was a target more than once, or because two or more fields happened to overlap. It was therefore desirable to identify all cases in which the same source was responsible for two or more detections, i.e., separate rows in the catalogue. All detections for which this appears to be true have been given the same SRCID number.

The matching to find unique sources was performed on the basis of coincidence of celestial coordinates within certain limits, using the combined positional error, POSERR (see Sec. 3.2.2 ). Because in a few cases RADEC_ERR values were rather large (up to 18 arcseconds for point sources, see Fig. 5.11 in Sec. 5.3.1 ) an upper limit to matching distance of 7 arcseconds was also applied.

All possible pairs of detections from different observations are considered and the great-circle distance between them, GCDIST, computed. Two detections a and b are considered to be matched if (using SQL notation):

GCDIST < LEAST (0.9 * a.DIST_NN, 0.9 * b.DIST_NN,  7.0, 3.0 * (a.POSERR + b.POSERR)) .

The DIST_NN value for each detection records the distance to its nearest neighbour in that observation, which in a few cases was less than 7 arcseconds, generally because a detection which initially appeared to be an extended object was split into two. The 0.9 * DIST_NN part of the formula was therefore used to ensure that close pairs of detections did not cross-match incorrectly. Note that there are a few exceptions to the condition of preventing cross-matching on the same observation (see Sec. 6 for details).
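The matching criterion can be sketched outside the database as follows. This is not the Postgres implementation used for the catalogue: the haversine formula is one possible way to compute the great-circle distance, and representing detections as dictionaries keyed by column name is an assumption made for illustration.

```python
import math

def gcdist_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle distance in arcseconds (haversine formula).
    All inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0

def is_match(a, b):
    """Apply GCDIST < LEAST(0.9*a.DIST_NN, 0.9*b.DIST_NN, 7.0,
    3.0*(a.POSERR + b.POSERR)) to two detections from different
    observations; all distances and errors are in arcseconds."""
    limit = min(0.9 * a["DIST_NN"], 0.9 * b["DIST_NN"],
                7.0, 3.0 * (a["POSERR"] + b["POSERR"]))
    return gcdist_arcsec(a["RA"], a["DEC"], b["RA"], b["DEC"]) < limit
```

Two detections separated by 2 arcseconds, each with POSERR = 1 arcsecond and DIST_NN = 30 arcseconds, match because the limit is LEAST(27, 27, 7, 6) = 6 arcseconds. If one of them has a neighbour at 2 arcseconds, the limit drops to 1.8 arcseconds and they do not match.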

The matching was performed efficiently within a Postgres database using R-tree indexing.

Since the matching algorithm is unavoidably affected by limitations such as the coordinate precisions, it is likely that a few cases exist in which two distinct objects have been assigned the same SRCID number, or a few detections have distinct SRCID numbers but are actually part of the same source.

b) IAU identification

An IAU identification, IAUNAME, has been assigned to each unique source (SRCID) based upon the IAU registered classification 2XMM. The form of these names is "2XMM Jhhmmss.sSddmmss" where hhmmss.s is taken from the eposcorr corrected and averaged right ascension coordinate given in the column SC_RA and Sddmmss is the eposcorr corrected and averaged declination taken from the column SC_DEC. The correct nomenclature for references to detections in the catalogue is the IAUNAME followed by a colon and the detection identification number DETID (with six digits), that is: "2XMM Jhhmmss.sSddmmss:detid".
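The name format can be illustrated with the sketch below. Two points are assumptions of this example rather than statements from this guide: the coordinates are truncated (not rounded) to the quoted precision, following the usual IAU convention, and DETID is zero-padded to six digits. The function names are hypothetical.

```python
import math

def iau_name(sc_ra_deg, sc_dec_deg):
    """Build '2XMM Jhhmmss.sSddmmss' from SC_RA and SC_DEC (degrees).
    Coordinates are truncated, which is an assumption of this sketch."""
    # Right ascension in tenths of a second of time: deg / 15 * 3600 * 10.
    tenths = math.floor(sc_ra_deg * 2400.0)
    hh, rem = divmod(tenths, 36000)
    mm, rem = divmod(rem, 600)
    ss, tenth = divmod(rem, 10)
    # Declination in whole arcseconds, sign handled separately.
    sign = "-" if sc_dec_deg < 0 else "+"
    arcsec = math.floor(abs(sc_dec_deg) * 3600.0)
    dd, rem = divmod(arcsec, 3600)
    dm, ds = divmod(rem, 60)
    return (f"2XMM J{hh:02d}{mm:02d}{ss:02d}.{tenth:d}"
            f"{sign}{dd:02d}{dm:02d}{ds:02d}")

def detection_name(iauname, detid):
    """IAUNAME followed by a colon and DETID (zero-padding assumed)."""
    return f"{iauname}:{detid:06d}"
```

For SC_RA = 10.0 degrees and SC_DEC = -5.5 degrees this yields "2XMM J004000.0-053000".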

3.2.4 Combined parameters for unique sources

Several source parameters were averaged or otherwise combined to characterize a unique source in the catalogue (N_DETECTIONS indicates the number of detections found for the unique source). All columns referring to the parameters of a unique source have the prefix 'SC'.

Weighted means (inversely with the estimated variance) and their errors are given for coordinates (SC_RA, SC_DEC, SC_POSERR) as well as the flux in each band (SC_EP_FLUX, SC_EP_FLUX_ERR) and hardness ratios (SC_HRn, SC_HRn_ERR). Note that the error on a weighted mean is calculated as

mean_err = SQRT( 1.0 / SUM( 1 / err_i^2 ) ).
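A minimal sketch of the inverse-variance weighted mean and its error is given below. The function name is illustrative; the sketch does not handle the wrap-around of right ascension at 0/360 degrees, and how the catalogue processing treated such cases is not described here.

```python
import math

def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its 1-sigma error.

    mean     = SUM(v_i / err_i^2) / SUM(1 / err_i^2)
    mean_err = SQRT(1 / SUM(1 / err_i^2))
    """
    weights = [1.0 / (e * e) for e in errors]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    return mean, math.sqrt(1.0 / wsum)
```

For two measurements 2.0 ± 1.0 and 4.0 ± 2.0 the weights are 1 and 0.25, so the weighted mean is 3.0 / 1.25 = 2.4 with an error of sqrt(0.8) ≈ 0.89.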

The detection likelihood of a unique source, SC_DET_ML, is the maximum of the detection likelihoods of all its detections, while SC_EXT_ML, the likelihood of an extended source, is the average of the extent likelihoods of all its detections. The maximum of all summary flags was determined (SC_SUM_FLAG). A variability flag is set to True if it is set in at least one of the detections (SC_VAR_FLAG), and the respective (minimum) χ2-probability (SC_CHI2PROB) is listed.

3.2.5 Cross-matching with 1XMM and 2XMMp detections

Most of the 2400 and 585 observations used for the 2XMMp and 1XMM Serendipitous Source Catalogues, respectively, are also used in the present catalogue (cf. the selection of observations in Sec. 2.1). The most likely counterparts in the respective catalogues were found by cross-matching the 1XMM and 2XMMp detections with the 2XMM unique source positions (using SC_RA and SC_DEC) with a simple limit of 3 arcseconds in distance; only the closest match is given. Their names (MATCH_2XMMP and MATCH_1XMM) and the distance between the two detections (SEP_2XMMP and SEP_1XMM) are given in the catalogue as well as the unique source number SRCID_2XMMP for 2XMMp.

3.2.6 Visual screening

For the pre-release catalogue 2XMMp, a relatively simple visual screening was used to exclude entire observations from the catalogue where there appeared to be a significant likelihood of the automatic source detection producing spurious results. For 2XMM a more sensitive and detailed visual examination of each field was carried out, in such a way that only specific regions, where spurious detections are known to occur, were excluded. Such spurious detections are usually caused by an insufficient background determination (Sec. 3.1.2 d) ), that is, where a 12 × 12-node spline map is not sufficiently detailed. Problems of this kind can be caused by bright point sources, extended emission, and any kind of 'sharp edges' caused by bright segments, RGA scattered light spike, insufficiently determined OOT events (due to pileup), and edges of noisy CCDs. All detections within such regions have received a 'manual' flag 11 (ca_FLAG) independent of whether they are considered to be spurious or not.

Often a very bright point source is at the centre (and the cause) of such a region. To distinguish these (since they are deemed to be little affected by the unreliable background subtraction) an additional 'manual' flag 12 was set to indicate that this source can be safely used as a 'real' detection with reliable parameters. The parameters of all other detections that have received only flag 11 should be regarded as suspicious.

Note that the parameters of extended emission detections are directly affected by the presence of any spurious detections, and as a consequence no extended detection has received flag 12. In addition, it is possible that an extended detection consists of multiple point sources since the SAS task emldetect attempts to split an extended detection into a maximum of two detections (Sec. 3.1.2 f) ). Such detections have not explicitly been flagged.

Table 3.3b is the continuation of Table 3.3a and summarizes the meanings of the manual flags and their positions in the flag column ca_FLAG (note that flag 10 was not used):


Table 3.3b:  Manual Flag Keys
10   Not set
11   Within region where spurious detections occur
12   Bright point source in region where spurious detections occur

The regions used for the manual flagging are available as a mask for each observation (see App. A.1) where the value 0 denotes the area used for flagging. Note that in the case of only a single detection being identified as probably spurious a circular aperture with r = 10 arcseconds was used.

About half of all observations in the catalogue are little affected by the background subtraction problem; it is hence useful to classify each observation with respect to the area affected by bad background and the presence of spurious detections. The observation class (OBS_CLASS) is based on the fraction of area covered by the flag mask as compared to the total detection mask for that observation. It replaces the column OBSFLAG used in the 2XMMp catalogue which was defined based on the number of spurious detections in the field.

Six classes of observations were identified. They are listed in Table 3.4 together with the fraction of observations affected, the fractional area, and a comment on the approximate size of the excluded region (note that the shape is arbitrary and may consist of several patches).

Table 3.4:  Observation Class Keys
 OBS_CLASS   Observations   Flagged area         Comment
 0           38%            0%                   no region has been identified for flagging
 1           12%            0% < area < 0.1%     corresponds to < ~3 single detections
 2           10%            0.1% <= area < 1%    corresponds to a circular area of radius 40" - 60"
 3           25%            1% <= area < 10%     corresponds to a circular area of radius 60" - 200"
 4           10%            10% <= area < 100%   corresponds to a circular area of radius > 200"
 5           5%             100%                 the whole field is flagged
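The class boundaries of Table 3.4 can be expressed as a short function. This is a sketch only: the argument is assumed to be the ratio of the flag-mask area to the detection-mask area (between 0 and 1), and the function name is hypothetical.

```python
def obs_class(flagged_fraction):
    """Map the flagged area fraction (0.0 to 1.0) to OBS_CLASS (0 to 5)."""
    if flagged_fraction <= 0.0:
        return 0          # no region flagged
    if flagged_fraction < 0.001:
        return 1          # 0% < area < 0.1%
    if flagged_fraction < 0.01:
        return 2          # 0.1% <= area < 1%
    if flagged_fraction < 0.1:
        return 3          # 1% <= area < 10%
    if flagged_fraction < 1.0:
        return 4          # 10% <= area < 100%
    return 5              # whole field flagged
```

An observation with 5% of its detection mask flagged would therefore fall into class 3.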

In addition, 44 observations were identified as containing regions of high spatial density of sources. In such regions the source detection fails to detect some sources and multiple sources are detected as extended. A list of these observations is given in Table 3.5 . Note that these observations can have any of the given observation classes.

3.2.7 The summary flag

The column SUM_FLAG provides an overall quality indication of a detection, as a single integer value, based on the flags set automatically (Sec. 3.1.2 h)) and manually (Sec. 3.2.6 ). It is defined as follows:

Table 3.6:  Summary Flag Keys
0 : Summary flag 0 is given if none of the flags [1-12] for the three cameras [PN,M1,M2] are set to True, i.e., there are no negative indications for this detection.
1 : Summary flag 1 is given if any of the warning flags [1,2,3,9] for any of the cameras [PN,M1,M2] is True, i.e., the source parameters are considered to possibly have some problems.
2 : Summary flag 2 is given if any of the 'spurious detection' flags [7,8] for any of the cameras [PN,M1,M2] is True (note that flag 7 is set to True if any of the flags for possible spurious extended detection [4,5,6] is set to True), but the manual flag [11] is False, i.e., the detection is likely to be spurious.
3 : Summary flag 3 is given if the manual flag [11] is True but none of the automatic 'spurious detection' flags [7,8] for any of the cameras [PN,M1,M2] is True, i.e., the detection lies in a region where spurious detections occur.
4 : Summary flag 4 is given if the manual flag [11] is True and any of the automatic 'spurious detection' flags [7,8] for any of the cameras [PN,M1,M2] is True, i.e., the detection lies in a region where spurious detections occur and is flagged as likely spurious.

Note that the summary flag does not take into account flag 12 which indicates a bright point source with probably reliable parameters within an area that has received flag 11.
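The rules of Table 3.6 can be sketched as a function. The precedence used here (the highest applicable value wins when, for example, both warning and 'spurious detection' flags are set) is an assumption of this example, as are the function name and the representation of flags as sets of flag numbers per camera.

```python
WARNING_FLAGS = (1, 2, 3, 9)
SPURIOUS_FLAGS = (7, 8)   # flag 7 already summarises flags 4, 5 and 6
MANUAL_FLAG = 11          # flag 12 is deliberately ignored

def sum_flag(camera_flags):
    """Derive the summary flag from per-camera flags.

    camera_flags maps a camera name ('PN', 'M1', 'M2') to the set of
    flag numbers (1 to 12) that are True for that camera.
    """
    set_flags = set().union(*camera_flags.values())
    manual = MANUAL_FLAG in set_flags
    spurious = any(f in set_flags for f in SPURIOUS_FLAGS)
    warning = any(f in set_flags for f in WARNING_FLAGS)
    if manual:
        return 4 if spurious else 3
    if spurious:
        return 2
    if warning:
        return 1
    return 0
```

A detection with flag 8 set for PN and flags 11 and 12 set for M1 would receive summary flag 4 under these rules.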

3.2.8 Variability information

A variability flag VAR_FLAG was set to True for a detection if at least one of the time series for this detection (derived from all appropriate exposures) has a χ2-probability  ≤ 1E-5 as determined by the SAS task ekstest (see CHI2PROB and Sec. 3.1.4). If the flag was set, then the camera and exposure ID with the lowest χ2-probability are given as well (VAR_INST_ID and VAR_EXP_ID).

Note that no assessment of potential variability has been made between observations for those sources detected more than once.
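The per-detection variability flag described above can be sketched as follows. The input is assumed to be a list of (camera, exposure ID, CHI2PROB) tuples, one per time series of the detection; this layout and the function name are illustrative.

```python
def variability_summary(time_series, threshold=1e-5):
    """Return (VAR_FLAG, VAR_INST_ID, VAR_EXP_ID) for one detection.

    time_series : list of (camera, exposure_id, chi2prob) tuples
    The flag is True if the lowest chi2-probability is <= threshold;
    the camera and exposure of that time series are then reported.
    """
    if not time_series:
        return False, None, None
    inst, exp_id, prob = min(time_series, key=lambda t: t[2])
    if prob <= threshold:
        return True, inst, exp_id
    return False, None, None
```

For example, time series with probabilities 0.3 (PN), 2E-7 (M1) and 4E-6 (M2) result in a True flag attributed to the M1 exposure.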

3.3 Additional processing

To facilitate access to information for each detection, catalogue-specific products were made to accompany the catalogue. They are described in this section.

3.3.1 Thumbnails

Thumbnail images have been made of every detection in the catalogue. A maximum of 12 small and one large thumbnail image (called location image) is available per source. The small thumbnail images were made in the bands 6, 7, and 8 (see Table 3.1) for each of the (sometimes merged) exposures used in the source detection from the set M1, M2 and PN as well as the all-EPIC mosaiced image. The thumbnails are stored as PNG files.

The small and large thumbnail images are 8 and 24 arcminutes across in size, respectively. The images are not smoothed. The band 1 - 5 image data are taken from the FITS images which form part of the catalogue product set, and thus embody the same X-ray event selections as these images. The merged images are derived by adding the images from the individual bands and/or cameras. The brightness scaling of the thumbnail images is linear, but pixel brightness is truncated at a given saturation value. The 'heat' colour map is used, and the images are scaled so that the pixel range from 0 to the saturation limit spans the colour map. The value of the saturation is calculated for optimum display of the source at the centre of the field. Green cross-hairs are overlaid over the centre of the image to display the source position.

The legend at the head of each image gives the following:

3.3.2 Summary html pages

Summary html pages for a quick overview have been created for every detection in the catalogue. These pages give a selection of parameters from the catalogue. Parameters that are not explicitly given in the catalogue are identified by curly brackets and are explained here. The parameter names are linked to their column descriptions. In addition, click-on images of the thumbnails, time series, and spectra are shown.

The summary pages are organized as follows:

4. Catalogue content and organisation

This section summarises the organization of the catalogue and gives details of all the columns. Known problems with parameters presented in the catalogue or with products associated with it are listed in Sec. 6.

There are 297 columns in the catalogue; they are grouped together and explained in the links below.

For each observation there are up to three cameras with one or more exposures which were merged when the filter and submodes were the same (Sec. 2.2). The data in each exposure are accumulated in several distinct energy bands (Table 3.1). Consequently, the source parameters can refer to some or all of these levels: on the observation level there are the final mean parameters of the source (prefix 'EP'), on the camera level the data for each of the three cameras (where available) are given (prefix 'PN', 'M1', or 'M2'), and on the energy band level the energy-dependent details of the source parameters are given (indicated by a 'b' in the column name where b = 1,2,3,4,5,8,9). Finally, on a meta-level, some parameters of sources that were detected more than once (prefix 'SC') were combined, see Sec. 3.2.4.

The column name is given in capital letters, the FITS data format in brackets and the unit in square brackets. If the column originates from a SAS task, the name of the task is given to the right hand side and a link is set to the online SAS 6.9 package documentation (see App. A.3 for more details). A description of the column and possible cross-references follow.

Entries with NULL are given when no detection was made with the respective camera (that is, ca_MASKFRAC < 0.15).

Details of the columns

Part 1: 9 columns: Identification of the source
This includes cross matches with the 1XMM and 2XMMp catalogues.
Part 2: 11 columns: Details of the observation and exposures
Part 3: 9 columns: Coordinates
The external equatorial and Galactic coordinates and the internal equatorial coordinates as derived from the SAS tasks eposcorr and emldetect are given together with the error estimates.
Part 4: 223 columns: Source parameters
The parameters of the source detection as derived from the SAS tasks emldetect and srcmatch are given here.
Part 5: 7 columns: Detection flags
This part lists the flags to qualify the detections. The summary flag, which gives an overall assessment for the detection, is followed by particular flags for each camera. Two further flags indicate whether at least one time series or at least one spectrum, respectively, exists for this source.
Part 6: 7 columns: Source variability
This part gives variability information for those detections for which time series were extracted.
Part 7: 31 columns: Unique source parameters
This part lists the source parameters for the unique sources across all observations (using the prefix 'SC'); these are coordinates, fluxes, hardness ratios, likelihoods, a variability and a summary flag. The number of detections is given also.

5. Catalogue Properties

Some of the more important properties of the 2XMM Serendipitous Source Catalogue are discussed in this section. A comprehensive discussion, however, goes beyond the scope of this user guide and will be a major part of the catalogue paper (Watson et al. 2008).

5.1 Overview

The catalogue contains source detections drawn from 3491 XMM-Newton EPIC observations made between 2000 February 3 and 2007 March 31 and which were publicly available by 2007 May 01; they are selected according to the criteria described in Sec. 2. Net exposure times in these observations range from < 1000 up to ~ 130000 seconds. Figure 5.1 shows the distribution of fields with net exposure time, Fig. 5.2 shows the distribution of fields on the sky, and Fig. 5.3 shows the distribution of fields with Galactic latitude.

The total sky area of the 3491 XMM-Newton observations is ~ 560 deg^2 which translates to ~ 360 deg^2 when corrected for field overlaps. Figure 5.4 shows the sky area as a function of net exposure time including the maximum coverage of the 1XMM and 2XMMp catalogues. A set of sensitivity maps, one for each EPIC instrument in each of the 5 standard energy bands, has been computed according to the empirical method for estimating the minimum detectable flux of an XMM survey as given in Appendix A of Carrera et al. (2007). They correspond to a maximum likelihood detection threshold of 10.0 for the 3491 observations of the 2XMM survey. Survey data products of types DETMSK, BKGMAP, and EXPMAP were used to create count-rate sensitivity maps. These count rates were then converted to fluxes using the energy conversion factors given in Table 3.2. To allow for the fact that some parts of the sky were observed more than once, a mapping to a celestial grid of HEALPix pixels (using NSIDE=8192) was used to find the lowest detectable flux for each point in the survey, i.e., the flux limit at that point from the deepest available observation. The resulting graphs of the sky area covered at each flux level or higher are shown in Fig. 5.5.
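The step of keeping the deepest flux limit per HEALPix pixel and accumulating the sky area can be sketched as below. The sketch assumes that the HEALPix pixel index of each sensitivity-map sample has already been computed elsewhere (for example with a HEALPix library); the function names and the input layout are hypothetical.

```python
import math

NSIDE = 8192
NPIX = 12 * NSIDE ** 2                       # number of HEALPix pixels
FULL_SKY_DEG2 = 4 * math.pi * (180 / math.pi) ** 2
PIXEL_AREA_DEG2 = FULL_SKY_DEG2 / NPIX       # HEALPix pixels have equal area

def deepest_limits(samples):
    """Keep the lowest flux limit per pixel.

    samples : iterable of (pixel_index, flux_limit) pairs, one per
              observation covering that pixel.
    """
    best = {}
    for pix, limit in samples:
        if pix not in best or limit < best[pix]:
            best[pix] = limit
    return best

def area_at_flux(best, flux):
    """Sky area (deg^2) in which a source of the given flux is detectable,
    i.e. where the deepest flux limit is at or below that flux."""
    n = sum(1 for limit in best.values() if limit <= flux)
    return n * PIXEL_AREA_DEG2
```

Evaluating area_at_flux over a grid of flux values gives a curve of the kind described above.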

The catalogue contains 246897 X-ray detections with total-band (0.2 - 12 keV) likelihood values ≥ 6. These relate to 191870 unique X-ray sources (Sec. 3.2.3 a)), of which 27522 were observed more than once (up to 31 times in total). Of the 246897 X-ray detections 20837 are classified as extended. Table 5.1 shows the number of detections and unique sources per camera and energy band (cf. Table 3.1) split into point sources and extended sources; a cut of likelihood values (ca_b_DET_ML) > 10 has been applied in all cases.


Table 5.1:  Numbers of detections with DET_ML > 10 in the 2XMM catalogue
 Camera   Energy band   Point src   Ext'd src   Unique point src   Unique ext'd src 
PN 1 38074 4319 30811 3843
PN 2 63248 7457 50639 6714
PN 3 68197 6217 55035 5555
PN 4 37511 3604 30702 3167
PN 5 11144 1586 8682 1337
M1 1 20841 3392 15887 2958
M1 2 40965 6734 30998 5892
M1 3 52569 6754 40062 5882
M1 4 34230 4452 26710 3858
M1 5 7818 1825 5776 1547
M2 1 20626 3485 15718 3012
M2 2 42488 7045 32055 6149
M2 3 56060 6997 42624 6107
M2 4 36760 4703 28538 4080
M2 5 8546 2008 6265 1716

Sources with extended emission vary considerably in size and form. The fitting task emldetect (cf. Sec. 3.1.2 f)) allows the fitting of a circular shape with 6" < r < 80", where r is the extent parameter EP_EXTENT. The frequency distribution is shown in Fig. 5.6. Note that the tail at 80 arcseconds represents detections that have reached the extent limit and are in fact larger.

5.2 Statistical properties

5.2.1 Counts and fluxes

Figure 5.7 shows the M1/M2 and PN distributions of the total band net counts (ca_CTS) for 2XMM detections. The median of the distributions is ~ 50 PN net counts and ~ 30 M1/M2 net counts. About 35% of the detections have more than 100 PN counts, which is sufficient for basic X-ray spectral analyses. The fraction of detections with at least 100 M1 or M2 counts is ~ 25%.

2XMM detections cover a very broad range of X-ray fluxes from 10^-16 erg/cm^2/s to ~ 10^-9 erg/cm^2/s. The total-band (0.2 - 12 keV) median flux of the catalogue detections is ~ 2.5 × 10^-14 erg/cm^2/s, while ~ 20% of the detections have fluxes below 1 × 10^-14 erg/cm^2/s, cf. Fig. 5.8 which shows the distribution of detections with flux for point sources (top panel) and for extended sources (bottom panel). The median fluxes in the soft (0.2 - 2 keV) and the hard energy bands (2 - 12 keV) are ~ 5.8 × 10^-15 erg/cm^2/s and ~ 1.4 × 10^-14 erg/cm^2/s, respectively.

Note that the new source detection used for 2XMM is more sensitive than the source detection method used in 1XMM: this probably accounts for the shift  to lower median flux values in 2XMM.

5.2.2 Hardness ratios

To first approximation, broad-band emission properties of sources can be derived from their X-ray colours. The 2XMM catalogue contains 4 different X-ray colours or hardness ratios (ca_HRn) covering the energy interval from 0.2 keV to 12 keV. Colour-colour distributions are shown in Fig. 5.9 for detections with the PN camera at high (|b| > 20 degrees; left) and low (|b| < 20 degrees; right) Galactic latitudes. Source populations have been divided into low and high Galactic latitude samples since different populations of sources dominate the X-ray sky in each case. Only objects with the best X-ray colour quality were used for the plots, i.e., detections where the errors (90% confidence) are lower than 0.2. In addition, since X-ray colours are camera dependent, distributions using PN data only are shown. Note that the X-ray colour HR1 is filter dependent.

5.3 Evaluation of source parameters

5.3.1 Astrometry

The positional uncertainty of the detections is known to be a function of a number of different parameters including off-axis angle and source counts. As an example, Figure 5.10 shows the statistical position error, RADEC_ERR, as a function of source counts (ca_CTS): the statistical error strongly decreases with increasing source counts. The correlation is most pronounced for point sources (blue), while for extended sources (green) there is larger scatter in the observed distribution of values. For a given brightness extended sources have typical position errors larger than those for point sources, as expected, since it is more difficult to constrain the positions for sources which are intrinsically extended.

There are two components to the over-all positional uncertainty of 2XMM source detections: the statistical error associated with the position determination carried out by emldetect, reflected in the parameter RADEC_ERR, and the systematic error component for each XMM field which takes into account any residual errors in the position determination and correction process, e.g. in the eposcorr rectification (Sec. 3.1.3 ). As described in Sec. 3.2.2, an analysis of the 2XMM - SDSS optical position separations of a sample of ~ 1000 broad emission line quasars has led to a new robust determination of SYSERR which is incorporated in the catalogue.

This analysis demonstrates that the observed distribution of the normalised position separations of a sample of ~ 1000 broad emission line quasars closely follows the expected statistical distribution, see Figure 5.11. On this basis, one can be confident that the quoted total position errors, represented by POSERR for individual detections or by SC_POSERR for unique sources, are a good representation of the true errors. This analysis also confirms that the 2XMM positions have no residual systematic shifts.

For most of the 2XMM catalogue sources (excluding extended sources) the total positional uncertainty is in the range from 0.35 - 3 arcseconds, with the systematic component (SYSERR) dominating for detection likelihoods > ~ 100 where the statistical error becomes small. The average 1-sigma position error for the whole catalogue is ~ 1.5 arcseconds. This means that for the vast majority of the 2XMM point sources the true position will lie within 5 arcseconds (< ~3σ) of the catalogue location, entirely consistent with the rule-of-thumb assumptions made in, for example, XMM identification programs.

5.3.2 Flux cross-calibration

A statistical analysis has been carried out to investigate the flux cross-calibration between PN, M1, and M2 as a function of different parameters such as time, offaxis angle, and energy bands. Only 2XMM detections with optimal signal-to-noise were selected and distributions of the difference in flux between cameras were obtained. An example of the results of this study is shown in Fig. 5.12 for the comparison of PN - MOS fluxes in the energy band 3 (1 - 2 keV).

It was found that the cross-calibration between M1 and M2 cameras is better than 5% for all energy bands, while it is better than 10% for PN - MOS cameras for the energy bands 2, 3, and 4. For the 2XMM hardest energy band, band 5, the cross-calibration between PN - MOS was found to be ~ 15%. The results of this analysis will be presented in Mateos et al. 2008. Note that new MOS QE CCFs (quantum efficiency calibration configuration files) are available from 2007 August 20 which decrease the discrepancy between PN and MOS below 2 keV (that is, for bands 1, 2, and 3).

5.3.3 False detection rates

In an attempt to investigate the expected false detection rate, realistic Monte-Carlo simulations of the 2XMM catalogue source detection and parameterisation process were carried out. The simulations represent typical high-latitude fields without bright sources or extended X-ray emission apart from the unresolved cosmic X-ray background. The distribution of X-ray point sources, with uniform spectral shape, was drawn from a representative extragalactic log N - log S relationship (e.g., Hasinger et al., 2001). The source spectrum was assumed to be the same as used in the determination of the ECFs (Sec. 3.1.2 f)), i.e., a power law characterised by Gamma = 1.7 with a Galactic column density N_H = 3.0 × 10^20 cm^-2. Finally, a particle background component was added to the images. The simulation creates images in the five standard energy bands using the appropriate calibration information (i.e., energy- and position-dependent PSFs, vignetting, detection efficiency, etc.). The simulated images were then processed in the same way as the observed images (Sec. 3.1.2) and the detections were compared against the list of known input sources using a statistical matching procedure from which numbers of false detections in each simulated field could be ascertained.

The simulations have been carried out for three different exposure times: a nominal exposure time (12 ks for MOS and 8 ks for PN), corresponding to around 70% of the median exposure, as well as three and ten times higher exposure values. The resulting numbers of false detections per field as a function of the minimum detection likelihood are shown in Fig. 5.22. In addition, the expected false detection number for the number of 'beams' (i.e., independent detection cells) per field is shown. The exact determination of the number of 'beams' depends on the search box size (Sec. 3.1.2 c)) and the degradation and change of shape of the PSF with the offaxis angle and is not straightforward to calculate. For the purpose of comparison with the simulations, it has been estimated to be 5000.

For the nominal exposure times (assumed to be typical for the observations included in the catalogue) the number of false detections is ~ [1, 0.3, 0.1] per EPIC field at detection likelihood thresholds (EP_DET_ML) of [6, 8, 10] respectively. These values increase to ~ [4, 2, 1.5] for the longest exposure time.

To summarise, one can say that

A more detailed analysis can be found in the catalogue paper (Watson et al. 2008) as well as in a dedicated paper about the simulations (Sakano et al., in preparation).

5.3.4 Quality flags and observation class

The 2XMM catalogue includes a number of quality flags, some of which were set automatically (Sec. 3.1.2 h)) and others manually (Sec. 3.2.6). In addition, observations were divided into classes according to the area over which manual flags were set (Sec. 3.2.6). The flags (EP_FLAG and SUM_FLAG) and observation classes (OBS_CLASS) provide two ways to filter the detections in the catalogue to achieve a cleaner sample with regard to spurious detections and unreliable parameters, e.g., those caused by problems in the background maps and artefacts in the images.
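A minimal sketch of such a filter is given below. It works on a small list of made-up records; only the column names SUM_FLAG, OBS_CLASS and EP_DET_ML come from the catalogue. The 'id' key, the example values, the default thresholds and the assumption that lower values of SUM_FLAG and OBS_CLASS indicate fewer problems are illustrative choices, and suitable cuts depend on the science case.

```python
# Hypothetical detection records; the values are invented for illustration.
detections = [
    {"id": 1, "SUM_FLAG": 0, "OBS_CLASS": 0, "EP_DET_ML": 25.3},
    {"id": 2, "SUM_FLAG": 4, "OBS_CLASS": 3, "EP_DET_ML": 120.0},
    {"id": 3, "SUM_FLAG": 0, "OBS_CLASS": 1, "EP_DET_ML": 6.4},
    {"id": 4, "SUM_FLAG": 1, "OBS_CLASS": 0, "EP_DET_ML": 15.1},
]


def select_clean(rows, max_sum_flag=0, max_obs_class=2, min_det_ml=6.0):
    """Keep rows passing all three cuts (the thresholds are assumptions)."""
    return [
        row for row in rows
        if row["SUM_FLAG"] <= max_sum_flag
        and row["OBS_CLASS"] <= max_obs_class
        and row["EP_DET_ML"] >= min_det_ml
    ]


clean = select_clean(detections)
stricter = select_clean(detections, min_det_ml=10.0)
print([row["id"] for row in clean], [row["id"] for row in stricter])
```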

While Table 3.4 shows the distribution of fields with observation class, Fig. 5.13 shows the distribution of detections with observation class for manual and automated flags. The manual flag selection indicates a strong dependence on observation class, as expected, since the size of the flagged region roughly correlates with the number of detections within (that is, the spatial density within the regions is usually higher due to the spurious detections). For comparison, Fig. 5.14 shows the same for detections of extended sources only. It is obvious that most of the spurious sources are detected as extended.

Many of the spurious detections caused by problems in the background maps and artefacts in the images are relatively bright and have a high detection likelihood (EP_DET_ML), contrary to the false detection rate which increases with decreasing detection likelihoods (cf. Sec. 5.3.3). Figure 5.15 shows the distribution of detections with a maximum detection likelihood > 15 with and without manual flag settings.

For a comprehensive discussion of the automated quality flags for extended source detection see Sec. 2 in SSC-AIP-TN-003.pdf.

5.3.5 Variability parameters

a) Evaluation of the variability test

Evaluation of the frequency distributions of χ^2 probability (ca_CHI2PROB) from the timeseries analysis (cf. Sec. 3.1.4) indicates no significant problems and shows good agreement between the measurements and predictions for the regions of the distributions dominated by random noise.

Figures 5.16 and 5.17 show the frequency distribution of ca_CHI2PROB for PN, M1 and M2. Note that the distributions are almost constant per unit interval of probability (Fig. 5.16) down to low probabilities and all have similar shapes. Obviously, a non-variable set of timeseries would have a constant frequency per unit interval of probability across the whole probability range 0.0 - 1.0.

Figures 5.18 and 5.19 show the observed frequency distribution of EP_CHI2PROB compared with a simulated distribution for a non-variable set of timeseries. As there are many detections with less than the full set of [PN, M1, M2] timeseries, it was necessary to reproduce this incompleteness in the simulation. The numbers of detections with 3, 2, 1, or 0 ca_CHI2PROB values are 14917, 11330, 11917, and 156, respectively. The simulation was conducted by generating three columns representing PN, M1, and M2, with each cell containing a uniform random number in the range 0.0 - 1.0. For each cell, a check was performed to see if there was a valid ca_CHI2PROB value for the associated real camera data; if not, the random value was set to NULL (so that the correct 'run' of valid values was mimicked in the simulations). These values simulate the expected distribution of PN_CHI2PROB, M1_CHI2PROB, and M2_CHI2PROB for the case of no real variability (see, e.g., Figs. 5.20 and 5.21). As expected, the resulting distributions are 'flat', as discussed above. A fourth column was then computed with the minimum simulated CHI2PROB, i.e., a simulated set of EP_CHI2PROB = min(PN_CHI2PROB, M1_CHI2PROB, M2_CHI2PROB), taken over all available values for each detection.
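The procedure described above can be sketched as follows, using only the Python standard library. The counts of detections with 3, 2, 1 or 0 valid values are taken from the text. The choice of which cameras are missing for a given detection is randomised here, which is an assumption: the original simulation used the actual pattern of valid values in the catalogue. The function and variable names are hypothetical.

```python
import random

# Numbers of detections with 3, 2, 1 or 0 valid ca_CHI2PROB values (from the text).
COUNTS = {3: 14917, 2: 11330, 1: 11917, 0: 156}


def simulate_ep_chi2prob(counts, rng):
    """Simulate EP_CHI2PROB for non-variable sources; None stands for NULL."""
    ep_sim = []
    for n_valid, n_det in counts.items():
        for _ in range(n_det):
            # ASSUMPTION: which of the cameras (PN, M1, M2) are valid is random.
            valid = [True] * n_valid + [False] * (3 - n_valid)
            rng.shuffle(valid)
            # Uniform random probability per camera, NULL where no real value exists.
            cams = [rng.random() if ok else None for ok in valid]
            available = [p for p in cams if p is not None]
            # EP value: minimum over the available cameras, or NULL if there are none.
            ep_sim.append(min(available) if available else None)
    return ep_sim


ep_sim = simulate_ep_chi2prob(COUNTS, random.Random(1))
non_null = [p for p in ep_sim if p is not None]
frac_below_01 = sum(p < 0.1 for p in non_null) / len(non_null)
print(len(ep_sim), len(non_null), round(frac_below_01, 3))
```

Note that, unlike the per-camera distributions, the simulated EP_CHI2PROB distribution is not flat: the minimum of n independent uniform values falls below p with probability 1 - (1 - p)^n, so low values are over-represented whenever more than one camera contributes.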

b) A note on use of the screening flags in the context of the time series and the variability information

Here we attempt to offer some advice on selecting 'good quality' timeseries and variability information, or rather more strictly, avoiding timeseries with likely problems.

Of the 38320 timeseries, 13859 (36%) have SUM_FLAG = 0, and a similar fraction (750 / 2307 = 33%) applies to those timeseries flagged as 'variable'.

Whatever levels of SUM_FLAG are used in a specific analysis project, it is strongly advised to check the timeseries visually (using, e.g., the online graphical products) before final acceptance. Using SUM_FLAG = 0 will certainly greatly reduce the number of 'poor' or 'bad' timeseries, but may not completely eliminate them. If the SUM_FLAG is 3 or 4, it is worth checking whether flag 12 in EP_FLAG is set to T, in which case the source is very likely real (see Sec. 3.2.6 for details).
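The check on flag 12 can be sketched as below. The sketch assumes that EP_FLAG is available as a string with one 'T' or 'F' character per flag, so that flag n sits at the n-th character; this layout, the function names and the example strings are assumptions made for illustration.

```python
def flag_is_set(ep_flag, flag_number):
    """True if the 1-based flag is 'T'. ASSUMES one 'T'/'F' character per flag."""
    return ep_flag[flag_number - 1] == "T"


def likely_real_despite_sum_flag(sum_flag, ep_flag):
    """True for SUM_FLAG 3 or 4 with flag 12 set, the case described in the text."""
    return sum_flag in (3, 4) and flag_is_set(ep_flag, 12)


# Hypothetical flag strings, 12 characters long.
print(likely_real_despite_sum_flag(4, "FFFFFFFFFFTT"))
print(likely_real_despite_sum_flag(4, "FFFFFFFFFFTF"))
```

A positive result here is only a hint; visual inspection of the timeseries is still advisable.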

6. Known problems and issues

References:

Carrera, F.J., Ebrero, J., et al., 2007, The XMM-Newton serendipitous survey. III. The AXIS X-ray source counts and angular clustering, A&A, 469, 27

Cash, W., 1979, Parameter estimation in astronomy through application of the likelihood ratio, ApJ, 228, 939

Hasinger, G., Altieri, B., Arnaud, M., et al., 2001, A&A, 365, L45

Mateos, S., et al., 2008, A&A, in press

Watson, M., et al., 2008, The XMM-Newton serendipitous survey. V. The Second XMM-Newton Serendipitous Source Catalogue, A&A, submitted

Document revision history

Release No.   Release Date     Comments
1.0           22 August 2007   First release
1.1           20 August 2008   Added column obs_class in Table 2.1; added problem issue about source matching on same observation; added text to false detection rate (Sec. 5.3.3)

Appendices

A.1 Catalogue data-products description

The catalogue was produced using the standard SSC pipeline that re-processed all available observations since the beginning of the mission (see http://xmmssc-www.star.le.ac.uk and follow the link 'Pipeline Processing'). Thus all the products associated with the catalogue follow the standard specification, as described in the Data Files Handbook and the SSC Products Specification available here. Products for the OM or RGS associated with the catalogue are available through the ESA database XSA.

In addition to the standard pipeline products, a FITS flagging mask (Sec. 3.2.6) and three graphical product types were made during the catalogue pipeline processing: the thumbnail and source location images (Sec. 3.3.1), and the source summary HTML pages (Sec. 3.3.2).

The flagging masks are FITS images where the value 0 stands for the region where spurious detections occur (detections within this region have received flag 11), and the value 1 stands for regions without any identified problems. The masks are named C<obsid>EPX000FLGMSK0000.FIT.
The thumbnail images are graphical products made from the FITS (merged) images. They are named C<obsid><exp>SRCIMG<band><srcnum>.PNG.
The location image is a graphical product made from the FITS EPIC (merged) images; it is named C<obsid>EPX000SRCIMW8<srcnum>.GIF.
The source summary page is an HTML page which includes several graphical products, some extracts from the catalogue, and links to external catalogues (e.g., the SIMBAD database); it is named C<obsid>EPX000SRCSUM0<srcnum>.HTM. Additional parameters used on these pages but not given in the catalogue are explained here.

These files were not made using the PCMS, and for this reason they have filenames starting with C instead of P (the other parts of the filename follow the pipeline product standard).
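As an illustration of the naming scheme, the sketch below fills the filename templates quoted above. The templates are copied from the text; the function name and the example field values are hypothetical, and the exact formatting of <obsid>, <exp>, <band> and <srcnum> (number of digits, numbering base) is not specified here, so the fields are passed in as ready-formatted strings.

```python
# Filename templates as quoted in the text above.
TEMPLATES = {
    "flag_mask": "C{obsid}EPX000FLGMSK0000.FIT",
    "thumbnail": "C{obsid}{exp}SRCIMG{band}{srcnum}.PNG",
    "location": "C{obsid}EPX000SRCIMW8{srcnum}.GIF",
    "summary": "C{obsid}EPX000SRCSUM0{srcnum}.HTM",
}


def product_filename(kind, obsid, srcnum="", exp="", band=""):
    """Fill one template; all fields must already be formatted strings."""
    return TEMPLATES[kind].format(obsid=obsid, srcnum=srcnum, exp=exp, band=band)


# Hypothetical field values, for illustration only.
mask_name = product_filename("flag_mask", obsid="0123456789")
summary_name = product_filename("summary", obsid="0123456789", srcnum="0001")
print(mask_name)
print(summary_name)
```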

A.2 List of observations used to construct the catalogue

List of observations ('fields').

A.3 Catalogue pipeline processing details

The processing was conducted during October/November 2006 using the calibration files (CCFs) that were available on 2006 July 02 plus three additional CCFs (EMOS2_ADUCONV_0042.CCF, EMOS2_CTI_0030.CCF, RGS1_CTI_0009.CCF). The pipeline configuration can be found in Table A.1, where the SAS task version used for making the catalogue (SAS 6.9) is listed together with the most similar task version that was publicly available at the time of processing (SAS 7.0).

A.4 Target classification

The 2XMM catalogue is a catalogue of serendipitous sources. The observations from which it has been compiled, however, are pointed observations with one or more targets. To avoid a selection bias, an attempt has been made to identify the target or targets and to classify the field with regard to the kind of target (that is, whether one or more detections can be taken to be the target). The result is presented in Table A.2. It is worth noting that any attempt to identify and classify a target is subjective, and only the investigators of that observation know all the details. The table therefore lists, next to the identified source number and field classification, the coordinates from the proposal and from the XSA as well as SIMBAD coordinates where the target name could be resolved. A more detailed description of the work is given in SSC-LUX-TN-0065.pdf. The columns in the table and the options are explained at the end of the table.