Table 1 Characteristics of the six microarray datasets used

From: Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data

| Dataset | No. of samples | Classes (no. of samples) | No. of genes | Original ref. | Website |
| --- | --- | --- | --- | --- | --- |
| Two-class lung cancer | 181 | MPM (31), adenocarcinoma (150) | 12533 | [8] | http://www.chestsurg.org |
| Colon | 62 | normal (22), tumor (40) | 2000 | [9] | http://microarray.princeton.edu/oncology/affydata/index.html |
| Prostate | 102 | normal (50), tumor (52) | 6033 | [10] | http://microarray.princeton.edu/oncology/affydata/index.html |
| Multi-class lung cancer | 68 (66) a | adenocarcinoma (37), combined (1), normal (5), small cell (4), squamous cell (10), fetal (1), large cell (4), lymph node (6) | 3171 | [11, 12] | http://www-genome.wi.mit.edu/mpr/lung/ |
| SRBCT | 88 (83) b | Burkitt lymphoma (29), Ewing sarcoma (11), neuroblastoma (18), rhabdomyosarcoma (25), non-SRBCTs (5) | 2308 | [13] | http://research.nhgri.nih.gov/microarray/Supplement/ |
| Brain | 42 (38) c | medulloblastomas (10), CNS AT/RTs (5), rhabdoid renal and extrarenal rhabdoid tumours (5), supratentorial PNETs (8), non-embryonal brain tumours (malignant glioma) (10), normal human cerebella (4) | 5597 | [14] | http://research.nhgri.nih.gov/microarray/Supplement/ |

Note: Some samples were removed to keep an adequate number of samples in each class. The number in parentheses after the total is the sample size actually used.
a. One combined and one fetal cancer sample were removed, so the sample size used is 66.
b. Five non-SRBCT samples were removed, so the sample size used is 83.
c. Four normal tissue samples were removed, so the sample size used is 38.
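
The sample-size arithmetic in the footnotes can be checked with a short sketch. The class counts are transcribed from Table 1; the dictionary layout and the helper name `sample_sizes` are assumptions made for illustration and are not part of the original study.

```python
# Class counts transcribed from Table 1. The data layout is an
# illustrative assumption, not something defined by the paper.
datasets = {
    "Multi-class lung cancer": {
        "classes": {
            "adenocarcinoma": 37, "combined": 1, "normal": 5,
            "small cell": 4, "squamous cell": 10, "fetal": 1,
            "large cell": 4, "lymph node": 6,
        },
        "removed": ["combined", "fetal"],  # footnote a
    },
    "SRBCT": {
        "classes": {
            "Burkitt lymphoma": 29, "Ewing sarcoma": 11,
            "neuroblastoma": 18, "rhabdomyosarcoma": 25,
            "non-SRBCTs": 5,
        },
        "removed": ["non-SRBCTs"],  # footnote b
    },
    "Brain": {
        "classes": {
            "medulloblastomas": 10, "CNS AT/RTs": 5,
            "rhabdoid renal and extrarenal rhabdoid tumours": 5,
            "supratentorial PNETs": 8,
            "non-embryonal brain tumours (malignant glioma)": 10,
            "normal human cerebella": 4,
        },
        "removed": ["normal human cerebella"],  # footnote c
    },
}


def sample_sizes(entry):
    """Return (total samples, samples kept after dropping removed classes)."""
    total = sum(entry["classes"].values())
    dropped = sum(entry["classes"][name] for name in entry["removed"])
    return total, total - dropped


sizes = {name: sample_sizes(entry) for name, entry in datasets.items()}
for name, (total, kept) in sizes.items():
    print(f"{name}: {total} samples in total, {kept} used")
```

Each pair returned by `sample_sizes` should match the "total (used)" entry in the table's sample-count column for that dataset.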