N231-045 TITLE: Multi-Spectral, Multi-Sensor Image Fusion
OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Artificial Intelligence (AI)/Machine Learning (ML); General Warfighting Requirements (GWR)
The technology within this topic is restricted under the International Traffic in Arms Regulation (ITAR), 22 CFR Parts 120-130, which controls the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulation (EAR), 15 CFR Parts 730-774, which controls dual use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with the Announcement. Offerors are advised foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.
OBJECTIVE: Develop video image fusion processing algorithms that produce a video stream with quality exceeding that produced by individual sensors operating separately in different visible and infrared bands.
DESCRIPTION: Electro-optic and infrared (EO/IR) video imaging sensors (cameras) are widely used for situational awareness, surveillance, and targeting. The Navy is deploying such cameras in multiple spectral bands and in differing formats to cover everything from narrow field of view (NFOV) up to very wide field of view (WFOV). Multiple spectral bands are useful because different bands, for example the near infrared (NIR) and the mid-wave infrared (MWIR) bands, "see" targets differently, especially under different lighting and environmental conditions. While each camera has its own strengths and weaknesses, taken collectively and properly interpreted, the combined video data can reveal far more than any single imaging sensor can individually. Consequently, the camera systems are integrated to provide coordinated and optimized coverage to meet the various mission requirements.
The effectiveness of the combined imaging sensor system depends on how well the copious amount of video image data that the various cameras produce is processed and evaluated in real time, either by human operators or by automated methods. Even with the most judicious use and coordination of these sensors, the amount of video image data produced is far in excess of what a single human operator can absorb and process. Automated aids can considerably reduce the burden on the human operator. However, there are still many situations where there is no substitute for a clear picture delivered in real time, without the operator needing to flip between bands and between NFOV and WFOV cameras to assimilate the best view. Efficient algorithms for fusing imagery taken across multiple wavelength bands in the highly complex maritime environment simply do not exist. While available technologies address some aspects of the problem, for example automated image interpretation (facial recognition, crop monitoring with satellite imagery, etc.), no commercial application approaches the requirements for real-time, multi-spectral, multi-sensor image fusion presented by modern naval operations.
The Navy needs an innovative video image fusion technology, realized and demonstrated as a coherent set of image processing algorithms, that ingests imagery from multiple sensors operating in different bands to produce an output video stream that exceeds the quality of the imagery obtained from any of the individual sensors taken separately. At a minimum, the content captured by each sensor should be aggregated in the output video without loss of detail or resolution. However, the goal is to produce output that exceeds the quality (resolution, contrast, noise, etc.) of the individual sensors. That is, algorithms that selectively combine and "blend" regions of image data taken from the individual sensors represent the minimum acceptable solution. Algorithms that smartly fuse video image data to reduce clutter, improve target resolution, increase apparent dynamic range, mitigate the effects of adverse environmental conditions, reveal additional target information, and improve the capture of dim, fast-moving targets (for example, targets traveling at Mach 3 at the resolution limit of the sensor) are of particular interest.
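For illustration only, the sketch below shows one conventional "blend"-style approach of the minimum-acceptable kind described above: Laplacian-pyramid fusion of two co-registered, equally sized, single-band 8-bit grayscale frames using OpenCV and NumPy. The choose-max detail rule, pyramid depth, and input assumptions are choices made for this example, not requirements of the topic.

```python
# Minimal sketch of two-band image fusion via Laplacian pyramid blending.
# Illustrative only: band choices, weighting rule, and pyramid depth are
# assumptions for this example, not topic requirements.
import cv2
import numpy as np

def laplacian_pyramid(img, levels=4):
    """Build a Laplacian pyramid: a list of detail layers plus the final residual."""
    pyramid = []
    current = img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)   # detail layer at this scale
        current = down
    pyramid.append(current)            # low-frequency residual
    return pyramid

def fuse_two_bands(band_a, band_b, levels=4):
    """Fuse two co-registered, equally sized single-band 8-bit frames.

    Detail layers keep whichever band has the stronger local response
    (a simple choose-max rule); the low-frequency residuals are averaged.
    """
    pyr_a = laplacian_pyramid(band_a, levels)
    pyr_b = laplacian_pyramid(band_b, levels)
    fused = []
    for la, lb in zip(pyr_a[:-1], pyr_b[:-1]):
        mask = np.abs(la) >= np.abs(lb)
        fused.append(np.where(mask, la, lb))
    fused.append(0.5 * (pyr_a[-1] + pyr_b[-1]))
    # Collapse the fused pyramid back into a single frame.
    out = fused[-1]
    for layer in reversed(fused[:-1]):
        out = cv2.pyrUp(out, dstsize=(layer.shape[1], layer.shape[0])) + layer
    return np.clip(out, 0, 255).astype(np.uint8)
```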
It should be assumed that the fused video stream will be viewed directly by weapon system operators as well as further processed through additional image processing systems. Therefore, the solution should be optimized with both purposes in mind, and the resulting fused video output must be available in real time. Of particular interest is whether the fused video aids or inhibits the performance of automatic target detection, tracking, recognition, and identification algorithms. While the goal of this effort is not to develop detection, tracking, recognition, and identification algorithms, the solution must be compatible with these functions. Therefore, the solution should clearly show that the fused video will enhance these functions or clearly show that these functions must be applied prior to fusion of the input video streams to be effective. In order to deploy to a tactical system, the solution must be computationally efficient; the processor load presented by the algorithms is a key metric that must be addressed, minimized, and verified in demonstration of the solution. In addition, the technology should be fundamentally extensible to multiple sensors operating across the visible through long-wave infrared (LWIR) bands.
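As a hedged illustration of how the processor-load metric might be quantified during demonstration, the sketch below times a fusion routine on each frame pair and reports mean and 95th-percentile latency. The function name, the latency statistics chosen, and the millisecond reporting are assumptions for this example only.

```python
# Minimal sketch of instrumenting a fusion step to report per-frame latency,
# one possible way to quantify the processor-load metric called out above.
import time
import numpy as np

def timed_fusion(frames_a, frames_b, fuse_fn):
    """Run fuse_fn over paired frames; return (fused frames, latency stats in ms)."""
    latencies_ms = []
    fused_frames = []
    for fa, fb in zip(frames_a, frames_b):
        start = time.perf_counter()
        fused_frames.append(fuse_fn(fa, fb))
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    stats = {"mean_ms": float(np.mean(latencies_ms)),
             "p95_ms": float(np.percentile(latencies_ms, 95))}
    return fused_frames, stats
```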
The goal is to fuse imagery from sensors that have overlapping fields of view. The goal is not to stitch the output of sensors with adjacent fields of view. Solutions should not assume that the input video is identical in FOV, resolution, dynamic range, or frame rate. Furthermore, frame capture between the sensors should not be assumed synchronous. However, solutions should anticipate that sufficient video metadata is available from each sensor to align the video inputs temporally and, to a high degree, spatially. The solution should be agnostic to sensor format, frame rate, resolution, etc., and accept non-compressed Class 0 motion imagery as well as compressed inputs. Imagery and metadata input will be compliant with (and therefore the solution must be compliant with) MIL-STD-2500C National Imagery Transmission Format Standard, Motion Imagery Standards Profile (MISP), Motion Imagery Standards Board (MISB) Standard (ST) 1606, MISB ST 1608, MISB ST 1801, MISB ST 0902, and MISB ST 1402.
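The sketch below illustrates, under stated assumptions, one way the available metadata could be used to pair frames temporally and warp one band into the reference band's pixel grid before fusion. The microsecond timestamp field, the 20 ms pairing tolerance, and the use of a single per-pair homography are illustrative assumptions, not requirements of this topic.

```python
# Minimal sketch of aligning two asynchronous video streams prior to fusion,
# assuming each decoded frame carries a MISP-style precision timestamp and
# that a homography relating the two sensors is available from pointing
# metadata or feature matching. Field names and tolerances are assumptions.
import bisect
import cv2

def nearest_frame(timestamps_us, t_us, tolerance_us=20000):
    """Return the index of the frame whose timestamp is closest to t_us, or None."""
    i = bisect.bisect_left(timestamps_us, t_us)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(timestamps_us)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(timestamps_us[j] - t_us))
    return best if abs(timestamps_us[best] - t_us) <= tolerance_us else None

def register_to_reference(frame, homography, ref_shape):
    """Warp a secondary-band frame into the reference band's pixel grid."""
    h, w = ref_shape[:2]
    return cv2.warpPerspective(frame, homography, (w, h))
```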
The solution should be demonstrated on video generated from at least two sensors operating in different bands. The bands, formats, and native resolutions chosen are at the discretion of the proposer. Demonstration need not include operation with actual sensors; demonstration with collected data is acceptable. However, the Government will not provide collected data during development of the solution, nor will it provide tactical or developmental hardware during the effort, so the solution should include the means of demonstrating the fusion algorithms on surrogate processors and displays.
Work produced in Phase II may become classified. Note: The prospective contractor(s) must be U.S. owned and operated with no foreign influence as defined by DoD 5220.22-M, National Industrial Security Program Operating Manual, unless acceptable mitigating procedures can be, and have been, implemented and approved by the Defense Counterintelligence and Security Agency (DCSA), formerly the Defense Security Service (DSS). The selected contractor must be able to acquire and maintain a secret level facility clearance and Personnel Security Clearances in order to perform on advanced phases of this contract, as set forth by DSS and NAVSEA, and to gain access to classified information pertaining to the national defense of the United States and its allies; this will be an inherent requirement. The selected company will be required to safeguard classified material IAW DoD 5220.22-M during the advanced phases of this contract.
All DoD Information Systems (IS) and Platform Information Technology (PIT) systems will be categorized in accordance with Committee on National Security Systems Instruction (CNSSI) 1253, implemented using a corresponding set of security controls from National Institute of Standards and Technology (NIST) Special Publication (SP) 800-53, and evaluated using assessment procedures from NIST SP 800-53A and DoD-specific Knowledge Service (KS) and Information Assurance Technical Authority (IATA) standards and tools.
The Contractor shall support the Assessment and Authorization (A&A) of the system. The Contractor shall support the government's efforts to obtain an Authorization to Operate (ATO) in accordance with DoDI 8500.01 Cybersecurity, DoDI 8510.01 Risk Management Framework (RMF) for DoD Information Technology (IT), NIST SP 800-53, NAVSEA 9400.2-M (October 2016), and business rules set by the NAVSEA Echelon II and the Functional Authorizing Official (FAO). The Contractor shall design the tool to the RMF security controls they propose as necessary to obtain A&A. The Contractor shall provide technical support and design material for RMF assessment and authorization in accordance with NAVSEA Instruction 9400.2-M by delivering objective quality evidence (OQE) and documentation to support assessment and authorization package development.
Contractor Information Systems Security Requirements. The Contractor shall implement the security requirements set forth in the clause entitled DFARS 252.204-7012, "Safeguarding Covered Defense Information and Cyber Incident Reporting," and National Institute of Standards and Technology (NIST) Special Publication 800-171.
PHASE I: Develop a concept for a video fusion system that meets the objectives stated in the Description. Demonstrate the feasibility of the concept in meeting the Navy's need. Analyze the effect on image quality and predict the benefits to target detection, tracking, and identification. Feasibility shall be demonstrated by a combination of analysis, modeling, and simulation. The Phase I Option, if exercised, will include the initial design specifications and capabilities description to build a prototype solution in Phase II.
PHASE II: Develop and demonstrate a prototype sensor fusion system (suite of coded algorithms) based on the concept, analysis, architecture, and specifications resulting from Phase I. Demonstration of the multi-spectral, multi-sensor fusion system shall be accomplished through test of a prototype in a laboratory environment using real-time or collected imagery data. At the conclusion of Phase II, prototype software shall be delivered to NSWC Crane along with complete test data, sample image files (both input and output), installation and operation instructions, and any auxiliary software and special hardware necessary to run the prototype.
It is probable that the work under this effort will be classified under Phase II (see Description section for details).
PHASE III DUAL USE APPLICATIONS: Support the Navy in transitioning the technology for Government use. Develop tactical code specific to Navy sensor systems, processing hardware, and existing software interfaces. Establish software configuration baselines, produce support documentation, and assist the Government in the integration of the multi-spectral, multi-sensor fusion algorithms into existing and future imaging sensor systems.
The technology resulting from this effort is anticipated to have broad military application. In addition, there are law enforcement and security applications. Scientific applications include processing of satellite and aerial imagery, medical imagery, and imaging of natural events such as complex weather phenomena.
REFERENCES:
1. Li, Jinjiang, et al. "Multispectral image fusion using fractional-order differential and guided filtering." IEEE Photonics Journal 11.6 (Dec. 2019): 19 pages. https://ieeexplore.ieee.org/document/8848440
2. Han, Xiyu, et al. "An adaptive two-scale image fusion of visible and infrared images." IEEE Access 7 (2019): 56341-56352. https://ieeexplore.ieee.org/document/8698903
KEYWORDS: Video Imaging; Imaging Sensors; Image Fusion; Image Processing; Automatic Target Detection; Target Resolution.
** TOPIC NOTICE **
The Navy Topic above is an "unofficial" copy from the Navy Topics in the DoD 23.1 SBIR BAA. Please see the official DoD Topic website at www.defensesbirsttr.mil/SBIR-STTR/Opportunities/#announcements for any updates. The DoD issued its Navy 23.1 SBIR Topics pre-release on January 11, 2023; the BAA opens to receive proposals on February 8, 2023, and closes March 8, 2023 (12:00 PM ET).

Direct Contact with Topic Authors: During the pre-release period (January 11, 2023 through February 7, 2023), proposing firms have an opportunity to directly contact the Technical Point of Contact (TPOC) to ask technical questions about the specific BAA topic. Once DoD begins accepting proposals on February 8, 2023, no further direct contact between proposers and topic authors is allowed unless the Topic Author is responding to a question submitted during the pre-release period.

SITIS Q&A System: After the pre-release period, and until February 22, 2023 (12:00 PM ET), proposers may submit written questions through SITIS (SBIR/STTR Interactive Topic Information System) at www.dodsbirsttr.mil/topics-app/; log in and follow the instructions. In SITIS, the questioner and respondent remain anonymous, but all questions and answers are posted for general viewing.

Topics Search Engine: Visit the DoD Topic Search Tool at www.dodsbirsttr.mil/topics-app/ to find topics by keyword across all DoD Components participating in this BAA.
2/9/23
Q. What is considered "higher quality" in terms of this RFP?
- All images fused at the resolution of the highest-resolution sensor?
- Multi-look video super-resolution?
- Algorithmic contrast/detail enhancement?
- Something else?
A. We are asking the vendors to provide fused images that contain higher quality video than is available from the native sensors. The higher quality video would be beneficial for operators to interpret the imagery and for algorithms to provide automated detection, recognition, and identification. Examples of metrics to measure quality in video include resolution, dynamic range, contrast, detection range, reduction in detection false alarm rate, etc.
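For illustration, the sketch below computes two of the quality metrics named in the answer above (contrast and dynamic range) on a single grayscale frame, whether native or fused. The specific metric definitions (RMS contrast and a max/min ratio in decibels) are common choices assumed for this example, not prescribed by the topic.

```python
# Illustrative sketch of two simple video-quality metrics mentioned in the
# answer above; metric definitions here are assumptions, not requirements.
import numpy as np

def rms_contrast(img):
    """Root-mean-square contrast of an 8-bit grayscale frame normalized to [0, 1]."""
    x = img.astype(np.float32) / 255.0
    return float(x.std())

def dynamic_range_db(img, eps=1.0):
    """Ratio of brightest to darkest response in the frame, in decibels."""
    x = img.astype(np.float32)
    return float(20.0 * np.log10((x.max() + eps) / (x.min() + eps)))
```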