New approaches for faint source detection in INTEGRAL's hard X-ray survey
INTEGRAL has a rich history of producing catalogs of high-energy
sources, owing to the wide field of view of the main instruments and
good resolution provided by their coded masks. The catalog from the
first 1000 orbits of INTEGRAL, 'cat1000', had a survey team of 9 expert
astronomers working for nearly 2.5 years to produce the final catalog of
939 astronomical sources. This catalog did not search in maps on
timescales shorter than the ~3-day orbital period of INTEGRAL because of
limitations of the techniques and tools available at the time. A way to
search on shorter observation timescales could have revealed fainter
sources and transient events, but appeared too costly in human time and
computational power. Coupled with the need to rapidly exploit the
ever-expanding INTEGRAL data set, this motivated the development of a new
approach for source detection using deep learning and Bayesian reasoning.
The approach uses a convolutional neural network (CNN) to detect sources
in INTEGRAL/ISGRI science window (ScW) images, then employs a Bayesian
merging algorithm to produce a final list of unique sources. CNNs are
most commonly applied to image processing, where they are among the
leading techniques for recognition and classification tasks. They can
outperform humans in image classification because they pick out
underlying patterns and structures that domain experts may not be aware
of. The CNN was trained on thousands of small labelled windows, some with
sources and some without, which enabled it to learn to determine with
high accuracy whether a source is present. The
CNN utilises five energy bands simultaneously, which speeds up the
source detection process and produces a more reliable and flux-sensitive
detection list. Once trained, the CNN searched 67000 ScWs from the first
1000 orbits in one day.
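
As an illustration of the training input described above, the following minimal sketch cuts a five-band image cube into small labelled windows. It is not the survey pipeline: the window size, stride, image size, synthetic noise, and the helper names `cut_windows` and `label_windows` are all assumptions made for the example.

```python
import numpy as np

def cut_windows(cube, size=16, stride=8):
    """Cut a (bands, height, width) image cube into overlapping square windows.

    Returns an array of shape (n_windows, bands, size, size) and the
    (row, column) corner of each window. Size and stride are hypothetical.
    """
    n_bands, height, width = cube.shape
    windows, corners = [], []
    for row in range(0, height - size + 1, stride):
        for col in range(0, width - size + 1, stride):
            # Every window keeps all energy bands, stacked like image channels.
            windows.append(cube[:, row:row + size, col:col + size])
            corners.append((row, col))
    return np.stack(windows), corners

def label_windows(corners, source_pixels, size=16):
    """Label a window 1 if any known source pixel falls inside it, else 0."""
    labels = []
    for row, col in corners:
        inside = any(row <= r < row + size and col <= c < col + size
                     for r, c in source_pixels)
        labels.append(int(inside))
    return np.array(labels)

# Synthetic stand-in for one ScW: five energy bands of noise plus one source.
rng = np.random.default_rng(0)
cube = rng.normal(size=(5, 64, 64))
source_pixels = [(30, 41)]   # hypothetical source position (row, col)
cube[:, 30, 41] += 20.0      # the same source injected into every band

windows, corners = cut_windows(cube)
labels = label_windows(corners, source_pixels)
print(windows.shape, int(labels.sum()))
```

Keeping the five bands together in each window means a classifier sees the source in all bands at once, rather than band by band.
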
Over the course of its mission, INTEGRAL has visited many parts of the
sky numerous times, so most sources are detected in multiple ScWs, and a
technique is needed to distil a list of unique sources from this larger
list of detections. Previous approaches to merging suffered from human
bias, and there were concerns over the robustness of the method. A
Bayesian reasoning algorithm was therefore implemented that removes
biases by not starting from a reference catalog and is independent of
the order in which detections are presented to it, unlike the previous
methods.
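
The idea of order-independent merging without a reference catalog can be sketched as follows. This is an illustrative toy and not the algorithm used in the survey. It assumes a small flat patch of sky, Gaussian position errors, a uniform density for unrelated detections, and a fixed prior and threshold. The names `same_source_probability` and `merge_detections` are hypothetical.

```python
import math
import random

def same_source_probability(a, b, prior=0.5, field_area=100.0):
    """Posterior probability that detections a and b are one source.

    Each detection is (x_deg, y_deg, sigma_deg) on a small flat patch.
    The prior and field_area (square degrees) are assumed values.
    """
    var = a[2] ** 2 + b[2] ** 2
    dist2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    # Same source: the offset follows a 2D Gaussian with the combined variance.
    like_same = math.exp(-dist2 / (2.0 * var)) / (2.0 * math.pi * var)
    # Different sources: the second detection could lie anywhere in the patch.
    like_diff = 1.0 / field_area
    return prior * like_same / (prior * like_same + (1.0 - prior) * like_diff)

def merge_detections(detections, threshold=0.5):
    """Group detections into sources; the grouping ignores input order."""
    parent = list(range(len(detections)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair whose posterior exceeds the threshold. All pairs are
    # tested, so the resulting groups are the same for any ordering.
    for i in range(len(detections)):
        for j in range(i + 1, len(detections)):
            if same_source_probability(detections[i], detections[j]) > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, det in enumerate(detections):
        groups.setdefault(find(i), []).append(det)

    sources = []
    for members in groups.values():
        # Inverse-variance weighted mean position of the group.
        weights = [1.0 / m[2] ** 2 for m in members]
        total = sum(weights)
        x = sum(w * m[0] for w, m in zip(weights, members)) / total
        y = sum(w * m[1] for w, m in zip(weights, members)) / total
        sources.append((x, y, len(members)))
    return sorted(sources)

# Made-up detections: three of one source, two of another.
detections = [(10.00, 5.00, 0.05), (10.02, 5.01, 0.05), (9.99, 4.99, 0.05),
              (12.00, 7.00, 0.05), (12.01, 7.02, 0.05)]
sources = merge_detections(detections)

shuffled = detections[:]
random.Random(1).shuffle(shuffled)
sources_shuffled = merge_detections(shuffled)
```

Because every pair is compared and linked symmetrically, `sources` and `sources_shuffled` contain the same groups, and their positions agree to within floating-point rounding.
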
The combination of the CNN and Bayesian matching produces a very
accurate merged list of detections, with very few detections needing to
be checked manually. By comparison, the old method took 9 people 2.5
years to manually check each source for inclusion in the catalog.
Searching at the ScW level allowed the detection of sources that have
outbursts on shorter timescales than those probed by previous studies.
This approach
also helped to generate a clearer picture of the emission from the
Galactic centre region, as detection at ScW level is easier to do than
with stacked images because sources are not all 'on' at the same time.
The left image shows all detections of one of the sources in the Galactic
centre region (magenta points, size scaled by detection
significance), as well as the locations of other nearby sources (with
the INTEGRAL resolution shown as the grey circle). The newly developed
source detection and merging method is reliable and scalable, removes the
need for continuous human intervention, and eliminates some of the human
subjectivity that previously existed. It will be an ideal tool to aid in
the generation of future ISGRI catalogs.