Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach

Shuo Liu, Jinshu Zeng, Huizhou Gong, Hongqin Yang, Jia Zhai, Yi Cao, Junxiu Liu, Yuling Luo, Yuhua Li, Liam Maguire, Xuemei Ding

Research output: Contribution to journal, Article

3 Citations (Scopus)

Abstract

Background

Breast cancer is the most prevalent cancer in women in most countries of the world. Many computer-aided diagnostic methods have been proposed, but there are few studies on quantitative discovery of probabilistic dependencies among breast cancer data features and identification of the contribution of each feature to breast cancer diagnosis.

Methods

This study aims to fill this void by utilizing a Bayesian network (BN) modelling approach. A K2 learning algorithm and statistical computation methods are used to construct the BN structure and assess the resulting BN model. The data were drawn from a clinical ultrasound dataset collected at a local hospital in China and a fine-needle aspiration cytology (FNAC) dataset from the UCI Machine Learning Repository.
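
As a rough illustration of the K2 structure-learning step named above, the sketch below scores candidate parent sets with the log Cooper-Herskovits (K2) metric and adds parents greedily along a fixed node ordering. It is a minimal sketch under stated assumptions, not the authors' implementation: the variable names (`diagnosis`, `feature_a`, `feature_b`), the toy records, the node ordering and the `max_parents` limit are all hypothetical.

```python
from collections import Counter
from math import lgamma


def k2_log_score(data, child, parents, arity):
    """Log Cooper-Herskovits (K2) score of `child` given a parent set."""
    r = arity[child]
    # N_ijk: rows with parent configuration j and child state k
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    # N_ij: rows with parent configuration j
    config = Counter(tuple(row[p] for p in parents) for row in data)
    score = 0.0
    for n_ij in config.values():
        score += lgamma(r) - lgamma(n_ij + r)  # log[(r-1)! / (N_ij + r - 1)!]
    for n_ijk in joint.values():
        score += lgamma(n_ijk + 1)  # log[N_ijk!]
    return score


def k2_search(data, order, arity, max_parents=2):
    """Greedy K2 search: each node may only take parents that precede it in `order`."""
    parents = {node: [] for node in order}
    for idx, node in enumerate(order):
        best = k2_log_score(data, node, parents[node], arity)
        candidates = set(order[:idx])
        while candidates and len(parents[node]) < max_parents:
            scored = [
                (k2_log_score(data, node, parents[node] + [c], arity), c)
                for c in sorted(candidates)
            ]
            top_score, top = max(scored)
            if top_score <= best:
                break  # no candidate improves the score; stop adding parents
            parents[node].append(top)
            candidates.remove(top)
            best = top_score
    return parents


# Hypothetical toy records: feature_a copies diagnosis, feature_b is unrelated.
toy_data = [
    {"diagnosis": d, "feature_a": d, "feature_b": b}
    for d in (0, 1)
    for b in (0, 1)
    for _ in range(4)
]
toy_arity = {"diagnosis": 2, "feature_a": 2, "feature_b": 2}
learned = k2_search(toy_data, ["diagnosis", "feature_a", "feature_b"], toy_arity)
print(learned)
```

On this toy data the search links `feature_a` to `diagnosis` and leaves `feature_b` without parents, because adding a parent to an unrelated variable lowers its score. K2 depends on the node ordering supplied to it, so a different ordering can yield a different structure.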

Results

Our study suggests that, for the ultrasound data, cell shape is the most significant feature for breast cancer diagnosis, and the resistance index shows a strong probabilistic dependency on blood signals. For the FNAC data, bare nuclei are the most important feature for discriminating malignant from benign breast tumours, and uniformity of cell size and uniformity of cell shape are tightly interdependent.

Contributions

The BN modelling approach can support clinicians in making diagnostic decisions based on the significant features identified by the model, especially when some other features are missing for specific patients. The approach is also applicable to other healthcare data analytics and data modelling for disease diagnosis.
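
To make the point about missing features concrete, the sketch below computes a posterior over the diagnosis in a small discrete Bayesian network by summing out any feature that was not observed. It is an illustrative sketch only: the network structure, the variable names and every probability in `CPT` are made-up assumptions, not values from the study.

```python
from itertools import product

# Hypothetical network: diagnosis -> feature_a, diagnosis -> feature_b.
PARENTS = {"diagnosis": [], "feature_a": ["diagnosis"], "feature_b": ["diagnosis"]}
STATES = {"diagnosis": ["benign", "malignant"], "feature_a": [0, 1], "feature_b": [0, 1]}
# CPT[node][parent values][state] = probability (made-up numbers).
CPT = {
    "diagnosis": {(): {"benign": 0.5, "malignant": 0.5}},
    "feature_a": {("benign",): {0: 0.75, 1: 0.25}, ("malignant",): {0: 0.25, 1: 0.75}},
    "feature_b": {("benign",): {0: 0.5, 1: 0.5}, ("malignant",): {0: 0.25, 1: 0.75}},
}


def joint_probability(assignment):
    """Product of each node's conditional probability under a full assignment."""
    p = 1.0
    for node, parents in PARENTS.items():
        key = tuple(assignment[q] for q in parents)
        p *= CPT[node][key][assignment[node]]
    return p


def posterior(query, evidence):
    """P(query | evidence) by enumeration, marginalising unobserved variables."""
    hidden = [n for n in PARENTS if n != query and n not in evidence]
    weights = {}
    for state in STATES[query]:
        total = 0.0
        for values in product(*(STATES[h] for h in hidden)):
            assignment = {**evidence, query: state, **dict(zip(hidden, values))}
            total += joint_probability(assignment)
        weights[state] = total
    norm = sum(weights.values())
    return {state: w / norm for state, w in weights.items()}


# feature_b is missing in the first query and observed in the second.
partial = posterior("diagnosis", {"feature_a": 1})
full = posterior("diagnosis", {"feature_a": 1, "feature_b": 1})
print(partial["malignant"], full["malignant"])
```

With these made-up tables the network still returns a usable posterior when `feature_b` is unobserved, and observing it sharpens the estimate. Exact enumeration is chosen here for clarity; it grows exponentially with the number of hidden variables, so larger networks would need a more efficient inference method.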

Language: English
Pages: 168-175
Number of pages: 8
Journal: Computers in Biology and Medicine
Volume: 92
Early online date: 21 Nov 2017
DOI: 10.1016/j.compbiomed.2017.11.014
Publication status: Published - 1 Jan 2018

Fingerprint

Bayesian networks
Cytology
Breast Neoplasms
Cell Shape
Fine Needle Biopsy
Chemical analysis
Needles
Cell Biology
Ultrasonics
Cell Size
Learning algorithms
Data structures
Learning systems
Tumors
Neoplasms
Decision Making
Blood
Learning
Delivery of Health Care
Datasets

Keywords

  • Clinical decision support
  • Data modelling
  • Bayesian networks
  • Quantitative analysis
  • Diagnostic contribution
  • Breast cancer diagnosis

Cite this

Liu, Shuo; Zeng, Jinshu; Gong, Huizhou; Yang, Hongqin; Zhai, Jia; Cao, Yi; Liu, Junxiu; Luo, Yuling; Li, Yuhua; Maguire, Liam; Ding, Xuemei. Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach. In: Computers in Biology and Medicine. 2018; Vol. 92. pp. 168-175.
@article{8b265420df3149a88e38d38784a24ed6,
title = "Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach",
abstract = "Background: Breast cancer is the most prevalent cancer in women in most countries of the world. Many computer-aided diagnostic methods have been proposed, but there are few studies on quantitative discovery of probabilistic dependencies among breast cancer data features and identification of the contribution of each feature to breast cancer diagnosis. Methods: This study aims to fill this void by utilizing a Bayesian network (BN) modelling approach. A K2 learning algorithm and statistical computation methods are used to construct BN structure and assess the obtained BN model. The data used in this study were collected from a clinical ultrasound dataset derived from a Chinese local hospital and a fine-needle aspiration cytology (FNAC) dataset from UCI machine learning repository. Results: Our study suggested that, in terms of ultrasound data, cell shape is the most significant feature for breast cancer diagnosis, and the resistance index presents a strong probabilistic dependency on blood signals. With respect to FNAC data, bare nuclei are the most important discriminating feature of malignant and benign breast tumours, and uniformity of both cell size and cell shape are tightly interdependent. Contributions: The BN modelling approach can support clinicians in making diagnostic decisions based on the significant features identified by the model, especially when some other features are missing for specific patients. The approach is also applicable to other healthcare data analytics and data modelling for disease diagnosis.",
keywords = "Clinical decision support, Data modelling, Bayesian networks, Quantitative analysis, Diagnostic contribution, Breast cancer diagnosis",
author = "Shuo Liu and Jinshu Zeng and Huizhou Gong and Hongqin Yang and Jia Zhai and Yi Cao and Junxiu Liu and Yuling Luo and Yuhua Li and Liam Maguire and Xuemei Ding",
note = "Compliant at Salford University - evidence uploaded to other files",
year = "2018",
month = "1",
day = "1",
doi = "10.1016/j.compbiomed.2017.11.014",
language = "English",
volume = "92",
pages = "168--175",
journal = "Computers in Biology and Medicine",
issn = "0010-4825",
publisher = "Elsevier",

}

Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach. / Liu, Shuo; Zeng, Jinshu; Gong, Huizhou; Yang, Hongqin; Zhai, Jia; Cao, Yi; Liu, Junxiu; Luo, Yuling; Li, Yuhua; Maguire, Liam; Ding, Xuemei.

In: Computers in Biology and Medicine, Vol. 92, 01.01.2018, p. 168-175.

TY - JOUR

T1 - Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach

AU - Liu, Shuo

AU - Zeng, Jinshu

AU - Gong, Huizhou

AU - Yang, Hongqin

AU - Zhai, Jia

AU - Cao, Yi

AU - Liu, Junxiu

AU - Luo, Yuling

AU - Li, Yuhua

AU - Maguire, Liam

AU - Ding, Xuemei

N1 - Compliant at Salford University - evidence uploaded to other files

PY - 2018/1/1

Y1 - 2018/1/1

AB - Background: Breast cancer is the most prevalent cancer in women in most countries of the world. Many computer-aided diagnostic methods have been proposed, but there are few studies on quantitative discovery of probabilistic dependencies among breast cancer data features and identification of the contribution of each feature to breast cancer diagnosis. Methods: This study aims to fill this void by utilizing a Bayesian network (BN) modelling approach. A K2 learning algorithm and statistical computation methods are used to construct BN structure and assess the obtained BN model. The data used in this study were collected from a clinical ultrasound dataset derived from a Chinese local hospital and a fine-needle aspiration cytology (FNAC) dataset from UCI machine learning repository. Results: Our study suggested that, in terms of ultrasound data, cell shape is the most significant feature for breast cancer diagnosis, and the resistance index presents a strong probabilistic dependency on blood signals. With respect to FNAC data, bare nuclei are the most important discriminating feature of malignant and benign breast tumours, and uniformity of both cell size and cell shape are tightly interdependent. Contributions: The BN modelling approach can support clinicians in making diagnostic decisions based on the significant features identified by the model, especially when some other features are missing for specific patients. The approach is also applicable to other healthcare data analytics and data modelling for disease diagnosis.

KW - Clinical decision support

KW - Data modelling

KW - Bayesian networks

KW - Quantitative analysis

KW - Diagnostic contribution

KW - Breast cancer diagnosis

U2 - 10.1016/j.compbiomed.2017.11.014

DO - 10.1016/j.compbiomed.2017.11.014

M3 - Article

VL - 92

SP - 168

EP - 175

JO - Computers in Biology and Medicine

T2 - Computers in Biology and Medicine

JF - Computers in Biology and Medicine

SN - 0010-4825

ER -