SpectroLive: optical spectra (autofluorescence and diffuse reflectance) acquired on human skin carcinomas
Spectroscopic data were collected from 131 patients with suspected or confirmed skin carcinomas during the SpectroLive clinical trial (NCT02956265, the National Clinical Trial identifier assigned by ClinicalTrials.gov).
Spectra were acquired in vivo, before local anesthesia and the surgical resection performed to obtain a diagnosis by anatomopathology. They were acquired at three types of sites located within the surgical spindle: on the suspected lesion itself (L sites), on the surgical margins defined by the surgeon (perilesional, or PL, sites), and on the edges of the surgical spindle, which are clinically considered non-lesional and later confirmed as such by anatomopathology (NL sites).
One spectroscopic measurement set consists of 24 spectra: for each of 6 excitation light sources switched on sequentially (365, 385, 395, 405 and 415 nm for autofluorescence spectra, and a white broadband source for diffuse reflectance spectra), 4 spectra were acquired simultaneously, one per distance (0.4, 0.6, 0.8 and 1 mm) between the excitation and emission optical fibers.
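As an illustration of this layout, the following minimal Python sketch enumerates the 24 (source, distance) combinations of one measurement set. The wavelengths and distances come from the description above; the source labels (`AF_365nm`, `DR_white`, ...) and the function name are hypothetical and are not the naming used in the database.

```python
from itertools import product

# Values taken from the dataset description.
DISTANCES_MM = (0.4, 0.6, 0.8, 1.0)
AF_EXCITATIONS_NM = (365, 385, 395, 405, 415)

# Hypothetical labels: 5 autofluorescence (AF) sources + 1 diffuse reflectance (DR) source.
SOURCES = [f"AF_{wl}nm" for wl in AF_EXCITATIONS_NM] + ["DR_white"]


def measurement_set_index():
    """Return one (source, distance_mm) key per spectrum in a measurement set."""
    # Sources are used sequentially; for each one, the 4 distances are read together.
    return [(src, dist) for src, dist in product(SOURCES, DISTANCES_MM)]


keys = measurement_set_index()
print(len(keys))  # 6 sources x 4 distances = 24
```

Indexing spectra by such a pair makes it straightforward to select, for example, all diffuse reflectance spectra or all spectra acquired at a given fiber distance.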
A preprocessing pipeline was developed to correct the raw data acquired on 4 different optical chains, each made of one optical fiber bundle, a high-pass optical filter and a spectrometer. Both the raw data and the programs used to preprocess them are provided in the database available online. Feature extraction and machine learning methods (SVM, k-NN, LDA, etc.) were then developed to test the ability of the spectroscopic data to provide diagnostic assistance for the surgical guidance of skin carcinoma resection.
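To give a concrete idea of one of the classifier families mentioned above, here is a minimal k-NN sketch in plain Python. It is not the pipeline provided with the database: the feature vectors are synthetic stand-ins generated for the example, the class names `class_a` and `class_b` are placeholders, and the function `knn_predict` is a hypothetical helper.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Synthetic data (NOT from the dataset): two well-separated 2-D clusters
# standing in for feature vectors extracted from preprocessed spectra.
rng = random.Random(0)
X = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(20)]
X += [[rng.gauss(1, 0.1), rng.gauss(1, 0.1)] for _ in range(20)]
y = ["class_a"] * 20 + ["class_b"] * 20

print(knn_predict(X, y, [0.05, -0.02]))
print(knn_predict(X, y, [0.95, 1.03]))
```

With real spectra, the synthetic vectors would be replaced by features computed from the preprocessed data, and performance would have to be assessed on patients not used for training.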
Several strategies were tested to evaluate the ability of optical spectroscopy to discriminate between histological classes, e.g. four classes (healthy skin versus actinic keratoses versus in situ carcinomas versus invasive carcinomas) or two classes (invasive versus in situ carcinomas). This dataset can be useful to research teams developing machine learning methods for automated data classification, whose results can provide diagnostic help to the dermatologists or surgeons in charge of the surgical resection of skin carcinomas. The data can be used to train or test such methods.
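The two example classification strategies can be expressed as different views of the same histological labels. The sketch below assumes hypothetical label strings (`healthy`, `actinic_keratosis`, `in_situ`, `invasive`); the actual label encoding used in the database may differ.

```python
# Hypothetical label strings for the four histological classes.
FOUR_CLASS = {"healthy": 0, "actinic_keratosis": 1, "in_situ": 2, "invasive": 3}
TWO_CLASS = {"in_situ": 0, "invasive": 1}


def four_class_view(labels):
    """Encode every sample with one of the four histological classes."""
    return [(i, FOUR_CLASS[lab]) for i, lab in enumerate(labels)]


def two_class_view(labels):
    """Keep only carcinoma samples and encode in situ versus invasive."""
    return [(i, TWO_CLASS[lab]) for i, lab in enumerate(labels) if lab in TWO_CLASS]


labels = ["healthy", "in_situ", "invasive", "actinic_keratosis"]
print(four_class_view(labels))
print(two_class_view(labels))
```

Each view returns (sample index, class code) pairs, so the same feature table can be reused across strategies by selecting rows with the returned indices.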